US20040132086A1 - Progesterone receptor-regulated gene expression and methods related thereto - Google Patents

Progesterone receptor-regulated gene expression and methods related thereto

Publication number
US20040132086A1
Authority
US
United States
Prior art keywords
gene
progesterone receptor
expression
genes
chosen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/776,827
Inventor
Kathryn Horwitz
Jennifer Richer
Original Assignee
University of Colorado
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Colorado
Priority to US10/776,827
Assigned to HORWITZ, KATHRYN B., RICHER, JENNIFER. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: REGENTS OF THE UNIVERSITY OF COLORADO, THE
Publication of US20040132086A1
Assigned to NIH - DIETR. CONFIRMATORY LICENSE (SEE DOCUMENT FOR DETAILS). Assignors: THE REGENTS OF THE UNIVERSITY OF COLORADO A BODY CORPORATE
Assigned to NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF HEALTH AND HUMAN SERVICES (DHHS), U.S. GOVERNMENT. CONFIRMATORY LICENSE (SEE DOCUMENT FOR DETAILS). Assignors: UNIVERSITY OF COLORADO
Current legal status: Abandoned

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53: Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/566: Immunoassay; Biospecific binding assay; Materials therefor using specific carrier or receptor proteins as ligand binding reagents where possible specific carrier or receptor proteins are classified with their target compounds
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/118: Prognosis of disease development
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/136: Screening for pharmacological compounds
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/158: Expression markers
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2333/00: Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/435: Assays involving biological materials from specific organisms or of a specific nature from animals; from humans
    • G01N2333/705: Assays involving receptors, cell surface antigens or cell surface determinants
    • G01N2333/72: Assays involving receptors, cell surface antigens or cell surface determinants for hormones
    • G01N2333/723: Steroid/thyroid hormone superfamily, e.g. GR, EcR, androgen receptor, oestrogen receptor
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2500/00: Screening for compounds of potential therapeutic value
    • G01N2500/04: Screening involving studying the effect of compounds C directly on molecule A (e.g. C are potential ligands for a receptor A, or potential substrates for an enzyme A)

Definitions

  • This invention generally relates to expression profiles of genes that are regulated by progesterone receptors, and particularly by progesterone receptor isoforms PR-A and PR-B, and to the use of such genes in methods for identifying progesterone receptor agonist and antagonist ligands, including progesterone receptor isoform-specific ligands and tissue-specific ligands.
  • This invention also relates to methods for determining the profile of genes regulated by progesterone receptors in a tissue sample.
  • Pluralities of polynucleotides transcribed from genes that are regulated by progesterone receptors are disclosed, as are pluralities of antibodies that selectively bind to proteins encoded by such genes.
  • Progesterone is a natural reproductive hormone that targets the breast, uterus, ovaries, brain, bone, blood vessels, immune system, etc.
  • Progestational agents are widely used for oral contraception, menopausal hormone replacement therapy, and cancer treatments.
  • Antiprogestins, which are synthetic ligands that antagonize the actions of progesterone, are in clinical trials for contraception, for induction of labor, and to treat endometriosis, breast cancers and meningiomas.
  • The actions of progesterone are varied and tissue-specific. Even in the normal breast it can have diverse effects: depending on the physiological state of the woman, progesterone can be proliferative, antiproliferative, or differentiative.
  • Progesterone promotes the development of breast cancers and accelerates the growth of established breast cancers.
  • Progestins, which are synthetic progestational agents, increase the risk of breast cancer. Paradoxically, they are protective in the uterus and prevent endometrial cancers.
  • Progesterone, synthetic progestins, and antiprogestins all initially work through the same molecular pathway. These are low molecular weight, lipid soluble “ligands”. They enter target cells passively, and pass into the nucleus where they bind to progesterone receptors (PRs). Ligand binding activates the PR proteins, which then dimerize, bind to DNA at the promoters of progesterone target genes, and either up- or down-regulate transcription of these genes. There are two natural isoforms of PR, the A- and B-receptors, also referred to herein as PR-A and PR-B, respectively.
  • The isoforms are derived from two distinct promoters in the single PR gene and are translated from separate translation initiation start sites.
  • PR-B receptors are 933 amino acids in length, which is 164 amino acids longer at the N-terminus than PR-A, and contain a unique transcriptional activation function, AF-3 (Sartorius et al., Mol. Endocrinol. 8, 1347-1360 (1994)). Downstream of the additional 164 amino acids of PR-B, the two PRs have the identical primary amino acid content. However, despite this close amino acid composition, the two receptors have dramatically different abilities to activate transcription of progestin-responsive promoters in experimental model systems (Sartorius et al., Mol. Endocrinol.
  • Progestin agonist-liganded PR-B are stronger transactivators than PR-A, although there are cell-type and promoter-dependent exceptions.
  • The antiprogestin RU486 has mixed agonist/antagonist activity on PR-B but not PR-A.
  • PR-A can dominantly inhibit PR-B and other members of the steroid receptor family, including estrogen receptors (ERs).
  • PR-A are more likely to be transcriptional repressors than PR-B.
  • There are stage-specific and region-specific variations in the PR-A:PR-B ratio in the developing rat brain (Kato et al., J Steroid Biochem Mol Biol 47, 173-82 (1993)), and studies in primates show that PR-B predominates in the estrogen-treated hypothalamus, while expression of the PR-A isoform predominates in the pituitary (Baez et al., J Biol Chem 262, 6582-8 (1987); Bethea et al., Endocrinology 139, 677-87 (1998)).
  • Progesterone is both proliferative and differentiative [reviewed in (Clarke et al., Endocr. Rev. 11, 266-301 (1990))].
  • Breast epithelium mitoses increase during the menstrual cycle and peak in the late luteal phase, coincident with high circulating levels of progesterone.
  • Progesterone induces lobular-alveolar outgrowth during each menstrual cycle and during pregnancy induces further lobular-alveolar development in preparation for the terminal differentiative event of lactation.
  • PR null mice exhibit incomplete mammary gland ductal branching and failure of lobulo-alveolar development, as well as failure to ovulate and to exhibit sexual behavior (Lydon et al., Genes Develop. 9, 2266-2278 (1995)).
  • PR are also direct targets of second-line progestin therapies in patients whose tumors have developed antiestrogen resistance (Kimmick et al., Cancer Treat Res 94, 231-54 (1998); Howell et al., Recent Results Cancer Res 152, 227-44 (1998)).
  • The PR-A to PR-B ratio was measured in 202 PR-positive human breast tumors (Graham et al., Cancer Res. 55, 5063-5068 (1995)). The majority had PR-A to PR-B ratios greater than one, and 33% had 3.7 times or more PR-A than PR-B. The functional significance of this is unknown.
  • Prior to the present invention, few, if any, endogenous genes differentially regulated by PR-A vs. PR-B were known in breast cancers or any other tissues. An excess of PR-A enhances SOX4 mRNA levels in breast cancer cells. Whether PR-B also regulates this gene is unknown. SOX4 induces DNA bending. PR-A enhances expression of the mouse multiple drug resistance (mdr) 1b gene, important for development of drug resistance in tumors. Whether this gene is regulated endogenously only by PR-A is unknown. To the present inventors' knowledge, no data on PR-B specific gene regulation in breast cancers (or any tissues) have been published prior to the present invention. Although certain of the genes listed in Table 8 below were previously known to be progesterone regulated, the PR isoform specificity of this regulation was not known.
  • One embodiment of the present invention relates to a method to identify agonist ligands of progesterone receptors.
  • the method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand, wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b) indicates that the putative agonist ligand is a progesterone receptor agonist.
  • detection of upregulation of expression of at least one gene chosen from a gene in Table 1, or detection of downregulation of at least one gene chosen from a gene in Table 2, in the presence of the putative agonist ligand indicates that the putative agonist ligand is a selective agonist of PR-A.
  • detection of upregulation of expression of at least one gene chosen from a gene in Table 3, or detection of downregulation of at least one gene chosen from a gene in Table 4, in the presence of the putative agonist ligand indicates that the putative agonist ligand is a selective agonist of PR-B.
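  • By way of illustration only, the comparison in step (c) and the Table-based reading of isoform selectivity described above might be sketched as follows. This is a minimal sketch under stated assumptions: the gene identifiers are hypothetical placeholders and do not reproduce the contents of Tables 1-4, and the two-fold threshold and the function name `agonist_selectivity` are assumptions introduced for the example.

```python
# Hypothetical placeholders: these sets stand in for Tables 1-4 and do NOT
# reproduce the genes actually listed in those tables.
TABLE_1_PRA_UP = {"geneA1", "geneA2"}    # selectively upregulated by PR-A
TABLE_2_PRA_DOWN = {"geneA3"}            # selectively downregulated by PR-A
TABLE_3_PRB_UP = {"geneB1", "geneB2"}    # selectively upregulated by PR-B
TABLE_4_PRB_DOWN = {"geneB3"}            # selectively downregulated by PR-B

FOLD_THRESHOLD = 2.0  # assumed cutoff; the source does not fix a value here


def agonist_selectivity(expr_with, expr_without, threshold=FOLD_THRESHOLD):
    """Compare expression with and without the putative agonist ligand.

    Both arguments map gene name -> expression level. Returns the set of
    isoforms ("PR-A", "PR-B") whose table genes moved in the direction
    associated with activation of that isoform.
    """
    isoforms = set()
    for gene, baseline in expr_without.items():
        if gene not in expr_with or baseline <= 0:
            continue  # cannot form a ratio for this gene
        ratio = expr_with[gene] / baseline
        up = ratio >= threshold
        down = ratio <= 1.0 / threshold
        if (up and gene in TABLE_1_PRA_UP) or (down and gene in TABLE_2_PRA_DOWN):
            isoforms.add("PR-A")
        if (up and gene in TABLE_3_PRB_UP) or (down and gene in TABLE_4_PRB_DOWN):
            isoforms.add("PR-B")
    return isoforms
```

    A result containing only "PR-A" would correspond to the selective PR-A agonist case, and only "PR-B" to the selective PR-B agonist case; an empty set means no table gene crossed the assumed threshold.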
  • Another embodiment of the present invention relates to a method to identify antagonists of progesterone receptors.
  • This method includes the steps of: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand, wherein detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b), indicates that the putative antagonist ligand is a put
  • detection of inhibition of expression or downregulated expression of at least one gene chosen from a gene in Table 1 in the presence of the putative antagonist ligand as compared to the expression of the at least one gene in the presence of the compound that activates the progesterone receptor indicates that the putative antagonist ligand is a selective antagonist of PR-A.
  • detection of inhibition of expression or downregulation of expression of at least one gene chosen from a gene in Table 3 in the presence of the putative antagonist ligand as compared to the expression of the at least one gene in the presence of the compound that activates the progesterone receptor indicates that the putative antagonist ligand is a selective antagonist of PR-B.
  • the at least one gene is selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a
  • step (b) includes detecting expression of: 11-beta-hydroxysteroid dehydrogenase type 2, tissue factor gene, PCI gene (plasminogen activator inhibitor 3), MAD-3 IκB-alpha, Niemann-Pick C disease (NPC1), platelet-type phosphofructokinase, phenylethanolamine n-methyltransferase (PNMT), transforming growth factor-beta 3 (TGF-beta3), Monocyte Chemotactic Protein 1, delta sleep inducing peptide (related to TSC-22), and estrogen receptor-related protein (hERRa1).
  • step (b) includes detecting expression of: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
  • step (b) includes detecting expression of phenylethanolamine n-methyltransferase (PNMT) adrenal medulla.
  • step (b) includes detecting expression of proteasome-like subunit MECL-1.
  • step (b) includes detecting expression of: growth arrest-specific protein and tissue factor gene.
  • the progesterone receptor can be PR-A, PR-B or both PR-A and PR-B.
  • the step (b) of detecting comprises detecting expression of at least five genes from any one or more of the Tables 1-7. In another aspect, the step (b) of detecting comprises detecting expression of at least ten genes from any one or more of the Tables 1-7. In yet another aspect, the step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of the Tables 1-7.
  • the progesterone receptor is expressed by a cell.
  • the progesterone receptor is endogenously expressed by the cell or recombinantly expressed by the cell.
  • the cell is part of a tissue from a test animal.
  • the step of contacting is performed by administration of the putative agonist ligand to the test animal or to the tissue of the test animal.
  • expression of the at least one gene is detected by measuring amounts of transcripts of the at least one gene before and after contact of the progesterone receptor with the putative agonist ligand.
  • expression of the at least one gene is detected by detecting hybridization of at least a portion of the at least one gene or a transcript thereof to a nucleic acid molecule comprising a portion of the at least one gene or a transcript thereof in a nucleic acid array.
  • expression of the at least one gene is detected by measuring expression of a reporter gene that is operatively linked to at least the regulatory region of the at least one gene.
  • expression of the at least one gene is detected by detecting the production of a protein encoded by the at least one gene.
  • the putative agonist ligand is a product of rational drug design.
  • Yet another embodiment of the present invention relates to a method to identify isoform-specific agonists of progesterone receptors.
  • This method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand, wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b)(i) but not (b)(ii), indicates that the putative agonist ligand is
  • Another embodiment of the present invention relates to a method to identify isoform-specific antagonists of progesterone receptors.
  • This method includes the steps of: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand, wherein, in the presence of the putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as
  • the progesterone receptor can include PR-A, PR-B, or both PR-A and PR-B.
  • the at least one gene is selected from the group consisting of: (i) at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and, (ii) at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4.
  • the step (b) of detecting comprises detecting expression of at least five genes from any one or more of the Tables 1-4.
  • the step (b) of detecting comprises detecting expression of at least ten genes from any one or more of the Tables 1-4. In yet another aspect, the step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of the Tables 1-4.
  • Another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor.
  • This embodiment includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative agonist ligand under conditions wherein, in the absence of the putative agonist agonist
  • Yet another embodiment relates to a method to identify a tissue-specific antagonist of a progesterone receptor.
  • This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A
  • Another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor.
  • This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first tissue type but not a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by the first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) detecting expression of the at least one gene from (a); (d) comparing the expression of the at least one gene in the presence and in the absence of the putative
  • Yet another embodiment of the present invention relates to a method to identify a tissue-specific antagonist of a progesterone receptor.
  • This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first but not in a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) detecting expression of the at least one gene from (a); and, (d) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist
  • the first tissue type is breast.
  • the at least one gene is selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, where
  • the second tissue type is selected from the group consisting of breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth.
  • the first tissue type is a non-malignant tissue and wherein the second tissue type is a malignant tissue from the same tissue source as the first tissue type.
  • a preferred tissue source is breast tissue.
  • the first tissue type is a normal tissue and wherein the second tissue type is a non-malignant, abnormal tissue.
  • the expression profile of genes regulated by a progesterone receptor in the first or second tissue type can be provided by a method comprising: (a) providing a first cell of a selected tissue type that expresses a progesterone receptor A (PR-A) and not a progesterone receptor B (PR-B) and a second cell of the same tissue type that expresses PR-B and not PR-A; (b) stimulating the progesterone receptors in (a) by contacting the first and second cells with a progesterone receptor stimulatory ligand; (c) detecting expression of genes by the first and second cells in the presence of the stimulatory ligand and in the absence of the stimulatory ligand, wherein a difference in the expression of a gene in the presence of the stimulatory ligand as compared to in the absence of the stimulatory ligand, indicates that the gene is regulated by the progesterone
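  • As a rough illustration of steps (a) through (c) above, the sketch below compares expression with and without a stimulatory ligand in a PR-A-only cell and a PR-B-only cell and sorts each gene into an isoform category. It is a hedged example only: the two-fold threshold, the category labels, and the function names `direction` and `categorize` are assumptions made for this sketch, not values or procedures stated in the source.

```python
def direction(with_ligand, without_ligand, threshold=2.0):
    """Return 'up', 'down', or None for one gene in one cell type.

    The fold-change threshold is an assumed cutoff for illustration.
    """
    if with_ligand <= 0 or without_ligand <= 0:
        return None
    ratio = with_ligand / without_ligand
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return None


def categorize(pra_cell, prb_cell, threshold=2.0):
    """Sort genes by how each isoform regulates them.

    pra_cell and prb_cell map gene -> (level with ligand, level without ligand)
    for the PR-A-only cell and the PR-B-only cell respectively.
    Genes that do not change in either cell are left out of the result.
    """
    profile = {}
    for gene in sorted(set(pra_cell) & set(prb_cell)):
        a = direction(*pra_cell[gene], threshold)
        b = direction(*prb_cell[gene], threshold)
        if a and not b:
            profile[gene] = f"PR-A only, {a}"
        elif b and not a:
            profile[gene] = f"PR-B only, {b}"
        elif a and b and a == b:
            profile[gene] = f"both isoforms, {a}"
        elif a and b:
            profile[gene] = "reciprocal"
    return profile
```

    The four labels loosely mirror the categories used elsewhere in this description (selective regulation by one isoform, regulation by both, and reciprocal regulation); the category covering regulation altered by co-expression of the other isoform would need a third cell type and is not modeled here.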
  • Another embodiment of the present invention relates to a method to determine the profile of genes regulated by progesterone receptors in a breast tumor sample.
  • This method includes the steps of: (a) obtaining from a patient a breast tumor sample; (b) detecting expression of at least one gene in the breast tumor sample that is regulated by a progesterone receptor when the progesterone receptor is activated; and, (c) producing a profile of genes for the tumor sample that are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B.
  • the at least one gene is selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 9; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 10; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 11; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 12; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 13; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 14; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 15.
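  • Step (c) of the tumor-profiling method above, producing a profile grouped by isoform selectivity, could look roughly like the following. This is a sketch under assumptions: the `GENE_CATEGORY` lookup is a hypothetical stand-in for Tables 9-15 with placeholder gene names, and the `detection_floor` parameter is an invented cutoff for calling a gene "detected".

```python
from collections import defaultdict

# Hypothetical lookup standing in for Tables 9-15; the gene names are
# placeholders and are NOT taken from those tables.
GENE_CATEGORY = {
    "gene1": "PR-A selective",
    "gene2": "PR-A selective",
    "gene3": "PR-B selective",
    "gene4": "both PR-A and PR-B",
}


def tumor_profile(expression, detection_floor=1.0):
    """Group detected PR-regulated genes from one tumor sample by category.

    expression maps gene -> measured level. Genes absent from the lookup,
    or measured below the assumed detection floor, are ignored.
    """
    profile = defaultdict(list)
    for gene, level in sorted(expression.items()):
        category = GENE_CATEGORY.get(gene)
        if category is not None and level >= detection_floor:
            profile[category].append(gene)
    return dict(profile)
```

    For example, a sample in which only `gene1` and `gene4` are detected would yield one entry under "PR-A selective" and one under "both PR-A and PR-B".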
  • Yet another embodiment of the present invention relates to a plurality of polynucleotides for the detection of the expression of genes regulated by progesterone receptors in breast tissue.
  • the plurality of polynucleotides consists of polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that are regulated by progesterone receptors.
  • the plurality of polynucleotides also comprises polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes selected from the group consisting of: (a) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (b) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (c) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (d) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (e) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (f) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (g) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the
  • the polynucleotide probes are immobilized on a substrate.
  • the polynucleotide probes are hybridizable array elements in a microarray.
  • the polynucleotide probes are conjugated to detectable markers.
  • the plurality of polynucleotides further comprises at least one polynucleotide probe that is complementary to RNA transcripts, or nucleotides derived therefrom, of at least one gene chosen from the genes in Table 8.
  • Another embodiment of the present invention relates to a plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated by progesterone receptors in breast tissue.
  • the plurality of antibodies, or antigen binding fragments thereof consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated by progesterone receptors.
  • the plurality of antibodies, or antigen binding fragments thereof also comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes selected from the group consisting of: (a) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (b) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (c) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (d) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (e) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (f) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (g) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or
  • the plurality of antibodies, or antigen binding fragments thereof further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by a gene chosen from the genes in Table 8.
  • Another embodiment of the present invention relates to a method to identify genes that are regulated by a progesterone receptor in two or more tissue types.
  • This method includes the steps of: (a) activating a progesterone receptor in two or more tissue types that express the progesterone receptor; (b) detecting expression of at least one gene in the two or more tissue types, the at least one gene being chosen from a gene in any one or more of Tables 1-7; and, (c) identifying genes that are regulated by the progesterone receptor in each of the two or more tissue types.
  • This method can further include the step of detecting whether the genes are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B.
  • Another embodiment of the present invention relates to a method to regulate the expression of a gene selected from the group consisting of any one or more of the genes in Tables 1-7.
  • the method includes administering to a cell that expresses a progesterone receptor a compound selected from the group consisting of: progesterone, a progestin, and an antiprogestin, wherein the compound is effective to regulate the expression of the gene.
  • the gene is selected from the group consisting of: growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
  • the cell that expresses a progesterone receptor is in the breast tissue of a patient that has, or is at risk of developing, breast cancer.
  • the present invention generally relates to the identification of a large number of genes that are regulated by progesterone receptors, and particularly, to the identification of how these genes are regulated by the progesterone receptor isoforms, PR-A and PR-B.
  • progesterone receptors both progestin-like agonists and anti-progestin-like antagonists
  • these genes can be used to profile individuals that have been diagnosed with breast cancer to enhance the ability of the clinician to develop a prognosis and treatment protocols for the individual patient.
  • the genes can also be used to profile the progesterone receptor regulated gene expression in tissue types other than breast tissue. Moreover, given the knowledge of these genes, one can produce novel combinations of polynucleotides and/or antibodies and/or peptides for use in progestational drug screening assays or expression profiling of patient samples.
  • the present inventors have generated model systems, unique to the present inventors' laboratory, to study PRs in breast cancer cells.
  • PRs are induced by estradiol.
  • progestin actions in the background of an estrogenized system.
  • Most cells that express PR contain both PR-A and PR-B. This makes it impossible to dissect out the effects of each PR isoform independently.
  • the T47Dco breast cancer cells are unique to the present inventors' laboratory.
  • the present inventors have now used these three new cell lines to analyze progesterone-responsive gene regulation via PR-B or PR-A (with PR-negative T47D-Y cells serving as a control) using Affymetrix™ microarray HFL6800 gene expression chips and Atlas™ Human cDNA Expression Arrays.
  • In addition to confirming the regulation of the few known progesterone-responsive genes, the present inventors have identified many genes not previously known to be regulated by PR. Importantly, the results described herein now allow discrimination of genes that are regulated uniquely by PR-B from genes that are uniquely regulated by PR-A. It was found that PR-B regulates more genes than PR-A in response to progesterone, but that a number of genes are uniquely regulated by PR-A.
  • Of the more than 6000 human genes screened, the present inventors have identified multiple genes, the expression of which is regulated by progesterone receptors.
  • the genes can be grouped into categories based on the regulation of expression of the genes by the progesterone receptor isoforms, PR-A and PR-B.
  • the genes have been grouped into the following main categories: (1) Genes that are selectively (i.e., exclusively or uniquely) upregulated by PR-A (Tables 1 and 9); (2) genes that are selectively downregulated by PR-A (Tables 2 and 10); (3) genes that are selectively upregulated by PR-B (Tables 3 and 11); (4) genes that are selectively downregulated by PR-B (Tables 4 and 12); (5) genes that are upregulated or downregulated in the same direction by both PR-A and PR-B (Tables 5 and 13); (6) genes that are reciprocally regulated by PR-A and PR-B (Tables 6 and 14); and (7) genes that are regulated by one of the isoforms, wherein such regulation is altered when the other isoform is present (e.g., the expression of the gene is either up- or downregulated in the presence of both receptors relative to the expression level of the gene in the presence of only one receptor) (Tables 7 and 15).
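  • As an illustration only, the seven categories above can be sketched as a small Python classification. The function name `categorize`, the direction labels, and the rule that category 7 takes precedence over the others are assumptions made for this sketch; the text does not specify how a direction of regulation is scored or whether the categories are mutually exclusive.

```python
from typing import Optional

# Hypothetical direction labels for how a gene responds to each isoform alone.
VALID = {"up", "down", "none"}


def categorize(pr_a: str, pr_b: str, altered_when_both: bool = False) -> Optional[int]:
    """Return the category number (1-7) for a gene, or None if unregulated.

    pr_a / pr_b: direction of regulation by PR-A alone and PR-B alone.
    altered_when_both: True if regulation by one isoform changes when the
    other isoform is also present (category 7). Giving this flag precedence
    is an assumption of this sketch.
    """
    if pr_a not in VALID or pr_b not in VALID:
        raise ValueError("directions must be 'up', 'down', or 'none'")
    if pr_a == "none" and pr_b == "none":
        return None
    if altered_when_both:
        return 7
    if pr_b == "none":
        return 1 if pr_a == "up" else 2   # selectively regulated by PR-A
    if pr_a == "none":
        return 3 if pr_b == "up" else 4   # selectively regulated by PR-B
    if pr_a == pr_b:
        return 5                          # same direction by both isoforms
    return 6                              # reciprocal regulation


print(categorize("up", "none"))   # category 1
print(categorize("up", "down"))   # category 6
```

  • Under these assumptions, categories 1 through 7 correspond to Tables 1-7 (and Tables 9-15) respectively.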
  • Tables 1-7 include all genes that were newly discovered to be regulated by progesterone receptors by the present inventors.
  • Tables 9-15 include all of the genes from Tables 1-7, respectively, and additionally include the genes that were identified by the present inventors that had previously been identified to be regulated generally by progesterone. This particular subset of genes (i.e., previously known to be regulated by progesterone) is also set forth separately in Table 8.
  • Table 16 is a list of genes identified in the present invention which were previously known to be involved in breast cancer or in the development of mammary tissue.
  • Table 17 is a list that categorizes the genes shown to be regulated by progesterone by the present inventors into functional categories based on GeneCard information as well as extensive literature reviews of each gene product.
  • Table 18 (See Example 1) shows the cumulative results of the gene array analysis with regard to the PR-B-expressing cells described in the Examples.
  • Table 19 shows the cumulative results of the gene array analysis with regard to the PR-A-expressing cells described in the Examples.
  • the genes identified as being regulated by progesterone receptors by the present inventors can be used as endpoints or markers in a method to identify ligands that regulate progesterone receptor activity.
  • the biological activity or biological action of a protein such as a progesterone receptor refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions).
  • a progesterone receptor that is of interest herein includes the effect of the receptor, particularly when the receptor is activated, on the expression of the downstream genes identified in the present invention.
  • a “downstream gene” or “endpoint gene” is any gene, the expression of which is regulated (up or down) by a progesterone receptor (PR-A and/or PR-B).
  • the expression of the gene is typically regulated by the progesterone receptor when it is activated, although the expression of the gene may be regulated by the progesterone receptor in the absence of a stimulatory compound (i.e., the regulation may be ligand independent, or constitutive).
  • Pharmaceutical companies are keenly interested in screening their vast libraries of chemical compounds for ones that bind to (ligands), and either activate or inhibit, progesterone receptors. Selected sets of one, two, three, or more of the genes (up to the number equivalent to all of the genes) of this invention can be used as endpoints for rapid-throughput screening of ligands that specifically and selectively influence the activity of PR-A and/or PR-B.
  • the ligands can be either agonists or antagonists of the progesterone receptor.
  • PR agonist ligand refers to any compound that interacts with a PR and elicits an observable response. More particularly, a PR agonist can include, but is not limited to, steroidal or non-steroidal compounds; a protein, peptide, or nucleic acid that selectively binds to and activates or increases the activation of a progesterone receptor; and most commonly includes progesterone, progesterone analogs, and any suitable product of drug design (e.g., a mimetic of progesterone, or a synthetic progestin) which is characterized by its ability to agonize (e.g., stimulate, induce, increase, enhance) the biological activity of a naturally occurring progesterone receptor in a manner similar to the natural agonist, progesterone (e.g., by interaction/binding with and/or direct or indirect activation of a progesterone receptor).
  • progestin as used herein is generally intended to include progesterone as well as any progesterone analog, such as a synthetic progestin.
  • a suitable agonist typically does not include an antibody or antigen binding fragment thereof, but to the extent that an antibody that selectively binds to and activates or increases the activation of a progesterone receptor can be designed and implemented as an agonist, such a compound is also contemplated.
  • the effect of the action of a given PR agonist on the expression of a downstream gene may be the downregulation of the gene or the suppression of the expression of a gene (e.g., when both isoforms of PR are present).
  • the action of the agonist on a PR may have undesirable consequences in one tissue type and beneficial consequences in another tissue type.
  • the term agonist is intended to refer to the ability of the ligand to act on a progesterone receptor in a manner that is substantially similar to the action of the natural PR ligand, progesterone, on the progesterone receptor (described in more detail below).
  • a PR agonist is identified under conditions wherein, in the absence of the agonist, the PR is not activated, or at least is believed not to be in the presence of a compound that is known to activate the receptor, such as the natural ligand progesterone or a known progestin.
  • PR antagonist ligand refers to any compound which inhibits the effect of a PR agonist, as described above. More particularly, a PR antagonist is capable of associating with a progesterone receptor such that the biological activity of the receptor is decreased (e.g., reduced, inhibited, blocked, reversed, altered) in a manner that is antagonistic (e.g., against, a reversal of, contrary to) to the action of the natural agonist, progesterone, on the receptor.
  • Such a compound can include, but is not limited to, steroidal or non-steroidal compounds; a protein, peptide, or nucleic acid that selectively binds to and blocks access to the receptor by a natural or synthetic agonist ligand or reduces or inhibits the activity of a progesterone receptor; or a product of drug design that blocks the receptor or alters the biological activity of the receptor (e.g., an antiprogestin, which antagonizes the actions of progesterone).
  • the action of a given PR antagonist on a given downstream gene via a PR may be to actually upregulate the gene.
  • the action of the antagonist on a PR may have undesirable consequences in one tissue type and beneficial consequences in another tissue type.
  • the term antagonist is intended to refer to the ability of the ligand to act on a progesterone receptor in a manner that is antagonistic to the action of the natural PR ligand, progesterone, or a synthetic PR agonist, on the progesterone receptor.
  • an antagonist is identified under control conditions wherein, in the absence of the antagonist, the progesterone receptor is stimulated, such as by the natural ligand, progesterone, or by any suitable progestin.
  • a PR antagonist can be identified by its ability to alter the regulation of downstream genes by the receptor in the absence of a known stimulator of the receptor.
  • ligand-independent regulators of progesterone receptor function can be identified by detecting effects on genes that are constitutively regulated by PR in the ligand-unactivated state.
  • agonists and antagonist ligands can include any regulatory ligand or compound that has the above-mentioned characteristics with regard to regulation of a progesterone receptor.
  • agonists and antagonists can include steroidal and non-steroidal compounds, proteins and peptides, nucleic acid molecules, antibodies, and/or mimetics (e.g., products of drug design or combinatorial chemistry).
  • Natural sex steroid hormone agonists are low molecular weight ringed cyclopentanophenanthrene compounds that in mammals include progesterone, estrogens and androgens. Steroid agonists can be extracted from a variety of natural sources, including the ovaries and testes. With the aim of enhancing the properties of natural steroid compounds, researchers have modified the cyclopentanophenanthrene structures and/or altered the substituent side-chains to generate semi-synthetic and synthetic steroidal and non-steroidal compounds. Non-steroidal compounds lack the classical cyclopentanophenanthrene structure. Nevertheless, all of these compounds (natural, semi-synthetic and synthetic, steroidal and non-steroidal) bind to their respective nuclear receptors. Modified compounds can be either agonists or antagonists.
  • Progesterone is the natural “progestin” produced by the ovaries and adrenal glands of mammals. Semi-synthetic or synthetic analogs that have progesterone-like effects, can be either steroidal or non-steroidal. They are also included in the generic category called “progestins.” Natural, semi-synthetic or synthetic progestins bind to intracellular, usually intranuclear, progesterone receptors. Such progestins can be either “agonists” or “antagonists” (antiprogestins). Both agonists and antagonists can have variable levels of activity of the receptors. An agonist can be strong or weak with many levels in between. An antagonist can also be strong or weak. Some antagonists may have “mixed” agonist/antagonist properties. The present invention can screen for all of these types of progestins.
  • Ligands of progesterone receptors can also include proteins and peptides, and nucleic acids and fragments thereof. Any compound that binds a receptor can be classified as a “ligand” of the receptor. If the ligand influences the activity of the progesterone receptor, the present invention can be used to screen for such ligand(s).
  • An isolated protein is a protein (including a peptide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include purified proteins, partially purified proteins, recombinantly produced proteins, and synthetically produced proteins, for example. As such, “isolated” does not reflect the extent to which the protein has been purified.
  • An isolated protein useful as an antagonist or agonist according to the present invention can be isolated from its natural source, produced recombinantly or produced synthetically. Smaller peptides useful as regulatory ligands are typically produced synthetically by methods well known to those of skill in the art. Regulatory ligands of the present invention can also include an antibody or antigen binding fragment that selectively binds to a progesterone receptor.
  • the phrase “selectively binds to” refers to the ability of an antibody, antigen binding fragment or other binding partner (protein, peptide, nucleic acid) to preferentially bind to specified proteins. More specifically, the phrase “selectively binds” refers to the specific binding of one protein to another, wherein the level of binding, as measured by any standard assay, is statistically significantly higher than the background control for the assay.
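  • As a rough illustration of "statistically significantly higher than the background control", the following Python sketch compares replicate binding readings against background readings with a one-sided exact permutation test. The choice of test, the 0.05 cutoff, the function names, and the example readings are all assumptions of this sketch; the text only requires "any standard assay".

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(signal, background):
    """One-sided exact permutation test on the difference of means.

    Returns the fraction of all regroupings of the pooled readings whose
    mean difference is at least as large as the observed one.
    """
    pooled = list(signal) + list(background)
    n_signal = len(signal)
    observed = mean(signal) - mean(background)
    positions = range(len(pooled))
    as_extreme = 0
    total = 0
    for chosen in combinations(positions, n_signal):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in positions if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


def binds_selectively(signal, background, alpha=0.05):
    """True if binding signal is significantly above background (assumed alpha)."""
    return permutation_p_value(signal, background) < alpha


# Hypothetical replicate readings from a binding assay (arbitrary units).
binding = [5.1, 4.9, 5.3, 5.0]
control = [1.0, 1.2, 0.9, 1.1]
print(binds_selectively(binding, control))
```

  • Exhaustive enumeration is only practical for small replicate counts; for larger data sets a standard parametric or sampled permutation test would be the usual substitute.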
  • Agonists and antagonists that are products of drug design can be produced using various methods known in the art.
  • Various methods of drug design, useful to design mimetics or other regulatory compounds useful in the present invention, are disclosed in Maulik et al., 1997, Molecular Biotechnology: Therapeutic Applications and Strategies, Wiley-Liss, Inc., which is incorporated herein by reference in its entirety.
  • a PR agonist or antagonist can be obtained, for example, from molecular diversity strategies (a combination of related strategies allowing the rapid construction of large, chemically diverse molecule libraries), libraries of natural or synthetic compounds, in particular from chemical or combinatorial libraries (i.e., libraries of compounds that differ in sequence or size but that have similar building blocks) or by rational, directed or random drug design. See, for example, Maulik et al., supra.
  • In a molecular diversity strategy, large compound libraries are synthesized, for example, from peptides, oligonucleotides, natural or synthetic steroidal compounds, carbohydrates and/or natural or synthetic organic and non-steroidal molecules, using biological, enzymatic and/or chemical approaches.
  • the critical parameters in developing a molecular diversity strategy include subunit diversity, molecular size, and library diversity.
  • the general goal of screening such libraries is to utilize sequential application of combinatorial selection to obtain high-affinity ligands for a desired target, and then to optimize the lead molecules by either random or directed design strategies. Methods of molecular diversity are described in detail in Maulik, et al., ibid.
  • the term “mimetic” is used to refer to any natural or synthetic steroidal compound, peptide, oligonucleotide, carbohydrate and/or natural or synthetic organic and non-steroidal molecule that is able to mimic the biological action of a naturally occurring or known synthetic progestin.
  • One embodiment of the present invention relates to a method to identify agonist ligands of progesterone receptors.
  • This method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand, wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b) indicates that the putative agonist ligand is a progesterone receptor agonist.
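  • The comparison in step (c) of the agonist method can be sketched in Python as follows. This is a minimal illustration, not the inventors' scoring procedure: the function names, the two-fold cutoff, and the example expression values are assumptions, since the text does not state a numerical threshold.

```python
def regulated_as_expected(expected, with_ligand, without_ligand, min_fold=2.0):
    """Check whether a gene changed in the direction seen on PR activation.

    expected: "up" or "down", the known direction when the receptor is activated.
    with_ligand / without_ligand: expression levels in the presence and absence
    of the putative agonist. min_fold is an assumed cutoff.
    """
    if without_ligand <= 0 or with_ligand < 0:
        raise ValueError("expression levels must be positive")
    fold = with_ligand / without_ligand
    if expected == "up":
        return fold >= min_fold
    if expected == "down":
        return fold <= 1.0 / min_fold
    raise ValueError("expected must be 'up' or 'down'")


def looks_like_agonist(readouts, min_fold=2.0):
    """True if at least one gene is regulated in the activation-associated manner.

    readouts: iterable of (expected_direction, with_ligand, without_ligand).
    """
    return any(
        regulated_as_expected(direction, treated, untreated, min_fold)
        for direction, treated, untreated in readouts
    )


# Hypothetical readouts for two downstream genes (arbitrary expression units).
print(looks_like_agonist([("up", 6.0, 2.0), ("down", 1.9, 2.0)]))
```

  • The "at least one gene" rule mirrors the wording of the method; a stricter screen could require agreement across several downstream genes.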
  • the gene can include any one or more of any of the following genes: (i) one or more of the genes that are selectively upregulated by PR-A chosen from a gene in Table 1; (ii) one or more of the genes that are selectively downregulated by PR-A chosen from a gene in Table 2; (iii) one or more of the genes that are selectively upregulated by PR-B chosen from a gene in Table 3; (iv) one or more of the genes that are selectively downregulated by PR-B chosen from a gene in Table 4; (v) one or more of the genes that are upregulated or downregulated in the same direction by both PR-A and PR-B chosen from a gene in Table 5; (vi) one or more of the genes that are reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) one or more of the genes that are regulated by one of either PR-A or PR-B, wherein the regulation of the gene is altered when the other of the PR-A or PR-B is present, such
  • Another embodiment of the present invention relates to a method to identify antagonists of progesterone receptor.
  • This method includes the steps of: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated (i.e., before, simultaneously with or after the contact of the receptor with the putative regulatory ligand); (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand.
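  • One way to picture the comparison in step (c) of the antagonist method is to measure how far a candidate moves gene expression back from the stimulated level toward the unstimulated level. The Python sketch below is illustrative only; the function names, the 0.5 reversal cutoff, and the example values are assumptions not found in the text.

```python
def fraction_reversed(baseline, stimulated, with_candidate):
    """Fraction of the stimulator-induced change undone by the candidate.

    baseline: expression with the receptor unstimulated.
    stimulated: expression with a stimulator (e.g., progesterone) alone.
    with_candidate: expression with stimulator plus putative antagonist.
    Works for genes that activation drives up or down, because the sign of
    the induced change cancels out.
    """
    induced = stimulated - baseline
    if induced == 0:
        raise ValueError("gene did not respond to the stimulator")
    return (stimulated - with_candidate) / induced


def looks_like_antagonist(baseline, stimulated, with_candidate, min_reversal=0.5):
    """True if the candidate reverses at least min_reversal of the response."""
    return fraction_reversed(baseline, stimulated, with_candidate) >= min_reversal


# Hypothetical values for an upregulated gene and a downregulated gene.
print(fraction_reversed(1.0, 5.0, 2.0))    # upregulated gene pushed back down
print(fraction_reversed(4.0, 1.0, 3.25))   # downregulated gene pushed back up
```

  • A value near 0 means the candidate had no effect on the stimulated response, and a value near 1 means the response was fully reversed.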
  • the gene(s) to be detected in step (b) are chosen from one or more of the following genes: (i) one or more of the genes that are selectively upregulated by PR-A chosen from a gene in Table 1; (ii) one or more of the genes that are selectively downregulated by PR-A chosen from a gene in Table 2; (iii) one or more of the genes that are selectively upregulated by PR-B chosen from a gene in Table 3; (iv) one or more of the genes that are selectively downregulated by PR-B chosen from a gene in Table 4; (v) one or more of the genes that are upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (vi) one or more of the genes that are reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) one or more of the genes that are regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B
  • the term “putative regulatory compound” or “putative regulatory ligand” refers to compounds having an unknown regulatory activity, at least with respect to the ability of such compounds to regulate progesterone receptors as described herein.
  • the method can be a cell-based assay, or non-cell-based assay.
  • the progesterone receptor is expressed by a cell (i.e., a cell-based assay).
  • the progesterone receptor is in a cell lysate, is in isolated cell nuclei, or is purified or produced free of cells.
  • the progesterone receptor can be a PR-A, a PR-B, or a combination of PR-A and PR-B.
  • One advantage of the present invention is that, given the knowledge of the isoform regulation of the various downstream genes disclosed herein, one can screen for ligands of the progesterone receptor, including screening for isoform specific ligands, using cells that express both receptors. Prior to the present invention, it was impossible to distinguish between the effects of one isoform or the other, because most cells express both isoforms.
  • the conditions under which a receptor according to the present invention is contacted with a putative regulatory ligand, such as by mixing, are conditions in which the receptor is not stimulated (activated) if essentially no regulatory ligand is present.
  • such conditions include normal culture conditions in the absence of a known stimulatory compound (a stimulatory compound being, for example, the natural ligand for the receptor (progesterone), a stimulatory peptide, or other equivalent stimulus, such as a synthetic progestin).
  • the putative regulatory ligand is then contacted with the receptor.
  • the step of detecting is designed to indicate whether the putative regulatory ligand alters the biological activity of the receptor as compared to in the absence of the putative regulatory ligand (i.e., the background level), as determined by the effects of the contact between the ligand and the receptor on the expression of downstream genes as described herein.
  • the conditions under which a progesterone receptor according to the present invention is contacted with a putative regulatory ligand, such as by mixing, are conditions in which the receptor is normally stimulated (activated) if essentially no regulatory ligand is present.
  • Such conditions can include, for example, contact of said receptor with a stimulator molecule (a stimulatory compound being, e.g., the natural ligand for the receptor (progesterone), a stimulatory peptide, or other equivalent stimulus, such as a synthetic progestin) which binds to the receptor and causes the receptor to become activated.
  • the putative regulatory ligand can be contacted with the receptor prior to, or simultaneously with, the contact of the receptor with the stimulatory compound (e.g., to determine whether the putative regulatory ligand blocks or otherwise inhibits the stimulation of the progesterone receptor by the stimulatory compound), or after contact of the receptor with the stimulatory compound (e.g., to determine whether the putative regulatory ligand downregulates, or reduces the activation of the receptor).
  • the present methods involve contacting the progesterone receptor with the ligand being tested for a sufficient time to allow for interaction, activation or inhibition of the receptor by the ligand.
  • the period of contact with the ligand being tested can be varied depending on the result being measured, and can be determined by one of skill in the art. For example, for binding assays, a shorter contact time with the compound being tested is typically suitable than when activation is assessed, and particularly when the expression of downstream genes is assessed.
  • the methods of the present invention detect the expression of downstream genes and therefore, the time of incubation is dependent upon the time required to achieve expression of the downstream genes.
  • Such a time period is typically at least 2 hours, and more preferably at least 4 hours, and more preferably at least 6 hours, although the time can be extended, if necessary to detect expression of a selected downstream gene.
  • the term “contact period” refers to the time period during which the progesterone receptor is in contact with the ligand being tested.
  • the term “incubation period” refers to the entire time during which the cells expressing the receptor, for example, are allowed to grow prior to evaluation, or the time during which genes affected by activation of the progesterone receptor are allowed to be expressed, and such time period can be inclusive of the contact period.
  • the incubation period includes all of the contact period and may include a further time period during which the compound being tested is not present, or is no longer being supplied to the receptor, but during which gene expression is continuing (in the case of a cell based assay) prior to scoring.
  • the incubation time for growth of cells can vary but is sufficient to allow for the binding of the progesterone receptor, the activation or inhibition of the receptor, and the effect on the expression of the downstream genes regulated by the receptor. It will be recognized that shorter incubation times are preferable because compounds can be more rapidly screened.
  • a cell-based assay is conducted under conditions which are effective to screen for regulatory compounds useful in the method of the present invention.
  • Effective conditions include, but are not limited to, appropriate media, temperature, pH and oxygen conditions that permit the growth of the cell that expresses the receptor.
  • An appropriate, or effective, medium refers to any medium in which a cell that naturally or recombinantly expresses a progesterone receptor, when cultured, is capable of cell growth and expression of the progesterone receptor.
  • Such a medium is typically a solid or liquid medium comprising growth factors and assimilable carbon, nitrogen and phosphate sources, as well as appropriate salts, minerals, metals and other nutrients, such as vitamins.
  • Culturing is carried out at a temperature, pH and oxygen content appropriate for the cell. Such culturing conditions are within the expertise of one of ordinary skill in the art. Exemplary cells expressing progesterone receptors are described in the Examples, and in detail in Sartorius et al., Cancer Res. 54, 3668-3877 (1994).
  • Cells that are useful in the cell-based assays of the present invention include any cell that expresses a progesterone receptor of the isoform A, isoform B, or a combination of PR-A and PR-B.
  • Such cells include cells that naturally express progesterone receptors, or cells that express progesterone receptors by recombinant technology.
  • Such cells preferably include, but are not limited to mammalian cells, which can originate from the breast or any other tissue.
  • tissues containing cells that are known to express the progesterone receptor naturally include, but are not limited to, breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth.
  • Cells suitable for use in a cell-based assay include normal or malignant cells, as well as cells that are not malignant, but which are abnormal, such as cells from a non-malignant tissue that is otherwise diseased (e.g., tissues from endometriosis and leiomyoma of the uterus, fibrocystic disease of the breast, polycystic ovary).
  • Other suitable cells are cells that express PR-A, PR-B, or both isoforms, as a result of recombinant technology. Such cells were used to discover the PR downstream genes of the present invention and are described in detail in Sartorius et al., Cancer Res. 54, 3868-3877 (1994).
  • Suitable cells are cells that express a PR-A and/or a PR-B transgene (i.e., cells isolated from a transgenic animal), or cells that have a germline deletion of one of the PR isoforms, but not the other (i.e., cells from a PR-A or PR-B knockout animal).
  • the method includes the step of detecting the expression of at least one, and preferably more than one, of the downstream genes that have now been shown to be regulated by progesterone receptors by the present inventors.
  • expression when used in connection with detecting the expression of a downstream gene of the present invention, can refer to detecting transcription of the gene and/or to detecting translation of the gene.
  • To detect expression of a downstream gene refers to the act of actively determining whether a gene is expressed or not. This can include determining whether the gene expression is upregulated as compared to a control, downregulated as compared to a control, or unchanged as compared to a control. Therefore, the step of detecting expression does not require that expression of the gene actually is upregulated or downregulated, but rather, can also include detecting that the expression of the gene has not changed (i.e., detecting no expression of the gene or no change in expression of the gene).
  • the present method includes the step of detecting the expression of at least one gene that is regulated by a progesterone receptor when the receptor is activated, as set forth in detail above.
  • the step of detecting includes detecting the expression of at least 2 genes, and preferably at least 3 genes, and more preferably at least 4 genes, and more preferably at least 5 genes, and more preferably at least 6 genes, and more preferably at least 7 genes, and more preferably at least 8 genes, and more preferably at least 9 genes, and more preferably at least 10 genes, and more preferably at least 11 genes, and more preferably at least 12 genes, and more preferably at least 13 genes, and more preferably at least 14 genes, and more preferably at least 15 genes, and so on, in increments of one, up to detecting expression of all of the downstream genes disclosed herein. Analysis of a number of genes greater than 1 can be accomplished simultaneously, sequentially, or cumulatively.
  • the gene(s) to be detected are preferably selected from the genes described in any one or more of Tables 1-7. These tables disclose genes that are regulated by progesterone receptors, and particularly, these tables disclose the manner in which the genes are regulated by the PR isoforms when the progesterone receptor is activated (i.e., by a stimulator of the receptor).
  • the genes to be detected can include one or more of: (1) genes that are selectively (i.e., exclusively or uniquely) upregulated by PR-A (Table 1); (2) genes that are selectively downregulated by PR-A (Table 2); (3) genes that are selectively upregulated by PR-B (Table 3); (4) genes that are selectively downregulated by PR-B (Table 4); (5) genes that are upregulated or downregulated in the same direction by both PR-A and PR-B (Table 5); (6) genes that are reciprocally regulated by PR-A and PR-B (Table 6); and (7) genes that are regulated by one of the PR isoforms, wherein such regulation is altered when the other PR isoform is present (e.g., the expression of the gene is either up- or downregulated in the presence of both receptors relative to the expression level of the gene in the presence of only one receptor) (Table 7).
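  • The first six categories above can be sketched as a simple decision rule. This is only an illustrative sketch, not the inventors' procedure: it assumes signed fold changes measured in cells expressing only PR-A or only PR-B, and the 2-fold cutoff and the function name `classify_gene` are hypothetical. Table 7 is not covered, because it requires a measurement in cells expressing both isoforms.

```python
from typing import Optional


def classify_gene(fold_a: float, fold_b: float, threshold: float = 2.0) -> Optional[str]:
    """Assign a gene to a regulation category (Tables 1-6).

    fold_a, fold_b: signed fold change on progesterone treatment in cells
    expressing only PR-A or only PR-B (positive = up, negative = down).
    threshold: hypothetical cutoff for calling a gene regulated.
    """
    up_a, down_a = fold_a >= threshold, fold_a <= -threshold
    up_b, down_b = fold_b >= threshold, fold_b <= -threshold
    regulated_a = up_a or down_a
    regulated_b = up_b or down_b

    if regulated_a and regulated_b:
        # Both isoforms regulate the gene: same or opposite direction.
        if up_a == up_b:
            return "Table 5: same direction by PR-A and PR-B"
        return "Table 6: reciprocally regulated"
    if up_a:
        return "Table 1: upregulated by PR-A only"
    if down_a:
        return "Table 2: downregulated by PR-A only"
    if up_b:
        return "Table 3: upregulated by PR-B only"
    if down_b:
        return "Table 4: downregulated by PR-B only"
    return None  # not regulated by either isoform at this cutoff
```

  • For example, under these assumptions a gene with a 3-fold increase in PR-A cells and no change in PR-B cells falls in the Table 1 category.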
  • the method further includes the additional detection of the expression of one or more genes that were previously known to be regulated by progesterone, but for which the PR isoform regulation
  • genes to be detected in any given method can include any one or more of the genes in any one or more of the Tables, and can include the detection of any combination of two or more of the genes in any one or more of the Tables. It is not mandatory that a given assay be restricted to the detection of all of the various genes in a single table, or to one gene in each table.
  • With regard to Tables 1-7, it is believed that these tables encompass genes that have been identified by the present inventors to be regulated by progesterone receptors, but which have not previously been described as being regulated by progesterone.
  • the removal of such gene from these tables and the placement of such gene into Table 8 is explicitly contemplated.
  • This rationale also applies to the genes of Table 16, which are believed to include all of those genes identified by the inventors that were previously known to be involved in breast cancer or mammary development. It is expressly contemplated that other genes from Tables 1-7 or 9-15 can be added to Table 16, if required for accuracy.
  • Tables 9-15 include all of the genes identified by the present inventors as being regulated by progesterone receptors (organized by isoform regulation, as for Tables 1-7), and, as discussed previously herein, include genes that were previously known to be regulated by progesterone.
  • Given the knowledge of the genes regulated by progesterone receptors according to the present invention, one of skill in the art will be able to select one or more genes to detect in a method of the present invention, and the selection of the one or more genes can be determined by different factors. For example, certain subsets of the genes are useful for detecting genes regulated by PR-A exclusively (i.e., genes in Tables 1, 2, 9 and 10). Other subsets of genes are useful for detecting genes regulated by PR-B exclusively (i.e., genes in Tables 3, 4, 11 and 12).
  • One of skill in the art may wish to detect genes disclosed herein that are related to a particular function, to a particular tissue-type, or that are associated (or likely to be associated) with a particular disease or condition.
  • One of skill in the art may also wish to select genes on the basis of the change in expression level in the presence of progesterone (i.e., and therefore activation of
  • the method of the present invention includes detecting genes of the present invention that are related by function.
  • Table 17 provides a listing of the various genes identified by the present inventors, categorized by function. Therefore, one could screen functional sets of genes to make a specific determination about a given cell or tissue that expresses a progesterone receptor, or to identify a ligand that has an action that might be correlated with a functional gene. For example, one could use subsets of the disclosed genes to screen a tumor for the likelihood that it will metastasize by screening the genes in the “cell adhesion or cytoskeletal interaction” group of Table 17. Other uses for screening functional groups will be apparent to those of skill in the art.
  • a particular disease such as breast cancer.
  • genes for detection that are particularly highly regulated by progesterone receptors in that they display the largest increases or decreases in expression levels in the presence of progesterone as compared to in the absence of progesterone.
  • the detection of such genes can be advantageous because the endpoint may be more clear and require less quantitation.
  • the relative expression levels of the genes identified in the present invention are listed in the tables. In these tables, the fold increase or decrease in expression of the gene upon treatment of the progesterone receptor with progesterone for 6 hours is indicated.
  • fold increase or decrease was made with respect to the background level of expression of the gene, which in some cases, was undetectable (i.e., the gene was not detected at all in the absence of progesterone, but was detected in the presence of progesterone). Therefore, in one embodiment, one of skill in the art might choose to detect genes that exhibited a fold increase above background of at least 2. In another embodiment, one of skill in the art might choose to detect genes that exhibited a fold increase or decrease above background of at least 3, and in another embodiment at least 4, and in another embodiment at least 5, and in another embodiment at least 6, and in another embodiment at least 7, and in another embodiment at least 8, and in another embodiment at least 9, and in another embodiment at least 10 or higher fold changes. It is noted that fold increases or decreases are not typically compared from one gene to another, but with reference to the background level for that particular gene.
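  • The fold-change selection described above can be illustrated with a short sketch. It assumes raw signal intensities per gene with and without progesterone; the `floor` value used when a gene is undetectable, the function names, and the gene labels in the test data are hypothetical and not taken from the tables.

```python
from typing import Dict, Tuple


def fold_change(treated: float, background: float, floor: float = 1.0) -> float:
    """Signed fold change of treated signal relative to the gene's own background.

    Signals below `floor` (a hypothetical detection limit) are clamped to it,
    so a gene that is undetectable without progesterone still yields a
    finite fold increase.
    """
    t = max(treated, floor)
    b = max(background, floor)
    return t / b if t >= b else -(b / t)


def select_genes(signals: Dict[str, Tuple[float, float]], min_fold: float = 2.0) -> Dict[str, float]:
    """Keep genes whose fold increase or decrease is at least `min_fold`.

    signals maps gene name -> (treated signal, background signal).
    """
    selected = {}
    for gene, (treated, background) in signals.items():
        fc = fold_change(treated, background)
        if abs(fc) >= min_fold:
            selected[gene] = fc
    return selected
```

  • Each fold change is computed against the background of the same gene, consistent with the note that fold changes are not compared from one gene to another.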
  • the step of detecting can include the detection of one or more reporter genes that are linked to promoters of one or more downstream genes according to the present invention.
  • the transcriptional read-out can use one, two or more promoters of any of the genes of this invention, linked to any of several reporter constructs, which are introduced into cells by any of several established transfection or infection methods, including, but not limited to, calcium phosphate transfection, transformation, electroporation, microinjection, lipofection, adsorption, infection (e.g., by a viral vector) and protoplast fusion.
  • the cells can be naturally PR-positive (containing both PRs), or they can stably or transiently express either one or both of the two PR-isoforms.
  • the cells can be exposed to the test ligands (i.e., the putative regulatory ligands) for different times and/or concentrations, and transcription of the PR-responsive promoter(s) of the downstream genes disclosed in this invention can be quantified.
  • cells expressing a PR as described above are exposed to the unknown test ligands at various concentrations and for various periods of time.
  • The transcriptional read-out can be expression of one, two or more of the genes of this invention, which are endogenously regulated in the cells. Expression of their transcripts and/or proteins is measured by any of a variety of methods known in the art, several of which are exemplified in the Examples section.
  • RNA expression methods include but are not limited to: extraction of cellular mRNA and northern blotting using labeled probes that hybridize to transcripts encoding all or part of one or more of the genes of this invention; amplification of mRNA expressed from one or more of the genes of this invention using gene-specific primers and reverse transcriptase-polymerase chain reaction, followed by quantitative detection of the product by any of a variety of means; extraction of total RNA from the cells, which is then labeled and used to probe cDNAs or oligonucleotides encoding all or part of the PR-responsive genes of this invention, arrayed on any of a variety of surfaces.
  • Methods to measure protein expression levels of selected genes of this invention include, but are not limited to: western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to DNA binding, ligand binding, or interaction with other protein partners.
  • Nucleic acid arrays are particularly useful for detecting the expression of the downstream genes of the present invention.
  • the production and application of high-density arrays in gene expression monitoring have been disclosed previously in, for example, WO 97/10365; WO 92/10588; U.S. Pat. No. 6,040,138; U.S. Pat. No. 5,445,934; or WO95/35505, all of which are incorporated herein by reference in their entireties.
  • For arrays, see also Hacia et al. (1996) Nature Genetics 14:441-447; Lockhart et al. (1996) Nature Biotechnol. 14:1675-1680; and DeRisi et al. (1996) Nature Genetics 14:457-460.
  • an oligonucleotide, a cDNA, or genomic DNA that is a portion of a known gene occupies a known location on a substrate.
  • a nucleic acid target sample is hybridized with an array of such oligonucleotides and then the amount of target nucleic acids hybridized to each probe in the array is quantified.
  • One preferred quantifying method is to use a confocal microscope and fluorescent labels.
  • The Affymetrix GeneChip™ Array system (Affymetrix, Santa Clara, Calif.) and the Atlas™ Human cDNA Expression Array system are particularly suitable for quantifying the hybridization; however, it will be apparent to those of skill in the art that any similar systems or other effectively equivalent detection methods can also be used.
  • Suitable nucleic acid samples for screening on an array contain transcripts of interest or nucleic acids derived from the transcripts of interest (i.e., transcripts derived from the PR-regulated genes of the present invention).
  • a nucleic acid derived from a transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template.
  • a cDNA reverse transcribed from a transcript, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc. are all derived from the transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample.
  • suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.
  • the nucleic acids for screening are obtained from a homogenate of cells or tissues or other biological samples.
  • such sample is a total RNA preparation of a biological sample. More preferably in some embodiments, such a nucleic acid sample is the total mRNA isolated from a biological sample.
  • Biological samples may be of any biological tissue or fluid or cells from any organism. Frequently the sample will be a “clinical sample” which is a sample derived from a patient, such as a breast tumor sample from a patient. Typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom.
  • Biological samples may also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes.
  • PCR polymerase chain reaction
  • LCR ligase chain reaction
  • nucleic acid hybridization simply involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing.
  • Hybridization conditions refer to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., is incorporated by reference herein in its entirety (see specifically, pages 9.31-9.62).
  • hybrid duplexes e.g., DNA:DNA, RNA:RNA, or RNA:DNA
  • specificity of hybridization is reduced at lower stringency.
  • higher stringency e.g., higher temperature or lower salt
  • High stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides).
  • One of skill in the art can use the formulae in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284 (incorporated herein by reference in its entirety) to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10° C. less than for DNA:RNA hybrids.
  • Stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6× SSC (0.9 M Na+) at a temperature of between about 20° C. and about 35° C., more preferably, between about 28° C. and about 40° C., and even more preferably, between about 35° C. and about 45° C.
  • Stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6× SSC (0.9 M Na+) at a temperature of between about 30° C. and about 45° C., more preferably, between about 38° C. and about 50° C., and even more preferably, between about 45° C. and about 55° C.
  • Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62.
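  • As an illustration of the kind of calculation referred to above, the following sketch uses the widely quoted empirical melting-temperature formula for DNA:DNA hybrids attributed to Meinkoth and Wahl, together with the common rule of thumb that Tm falls by about 1° C. for each 1% of mismatch. The function name, the parameter names, and the example probe (50% GC, 500 bp) are assumptions made for illustration only.

```python
import math


def tm_dna_dna(na_molar: float, gc_percent: float, length_bp: int,
               formamide_percent: float = 0.0, mismatch_percent: float = 0.0) -> float:
    """Approximate melting temperature (deg C) of a DNA:DNA hybrid.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L,
    then lowered by about 1 deg C per 1% mismatch (rule-of-thumb assumption).
    """
    tm = (81.5
          + 16.6 * math.log10(na_molar)
          + 0.41 * gc_percent
          - 0.61 * formamide_percent
          - 500.0 / length_bp)
    return tm - 1.0 * mismatch_percent


# Hypothetical probe: 50% GC, 500 bp, in 6x SSC (0.9 M Na+), no formamide.
perfect_match_tm = tm_dna_dna(0.9, 50.0, 500)
# Same probe allowing about 10% mismatch (about 90% identity).
ten_percent_mismatch_tm = tm_dna_dna(0.9, 50.0, 500, mismatch_percent=10.0)
```

  • Under these assumptions, permitting about 10% mismatch lowers the calculated Tm by about 10° C. relative to a perfectly matched hybrid.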
  • the hybridized nucleic acids are detected by detecting one or more labels attached to the sample nucleic acids.
  • the labels may be incorporated by any of a number of means well known to those of skill in the art.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas Red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads.
  • Radiolabels may be detected using photographic film or scintillation counters.
  • Fluorescent markers may be detected using a photodetector to detect emitted light.
  • Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label.
  • the term “quantifying” or “quantitating” when used in the context of quantifying transcription levels of a gene can refer to absolute or to relative quantification. Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more target nucleic acids and referencing the hybridization intensity of unknowns with the known target nucleic acids (e.g. through generation of a standard curve). Alternatively, relative quantification can be accomplished by comparison of hybridization signals between two or more genes, or between two or more treatments to quantify the changes in hybridization intensity and, by implication, transcription level.
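  • The two modes of quantification described above can be sketched as follows. This is a minimal sketch assuming a linear relationship between hybridization intensity and target concentration over the range of the standards; the function names and the numbers used in the usage note are hypothetical.

```python
from typing import Sequence, Tuple


def fit_standard_curve(concentrations: Sequence[float],
                       intensities: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line (slope, intercept) through known standards."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def absolute_quantity(intensity: float, slope: float, intercept: float) -> float:
    """Absolute quantification: read an unknown off the standard curve."""
    return (intensity - intercept) / slope


def relative_change(intensity_treated: float, intensity_control: float) -> float:
    """Relative quantification: ratio of signals between two treatments."""
    return intensity_treated / intensity_control
```

  • For instance, with hypothetical standards at concentrations 1, 2, 4 and 8 giving intensities 110, 210, 410 and 810, an unknown with intensity 510 would be read off the fitted line as a concentration of about 5.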
  • in vitro cell based assays may be designed to screen for compounds that affect the regulation of genes by a progesterone receptor at either the transcriptional or translational level.
  • One, two or more promoters of the genes of this invention can be used to screen unknown ligands for their ability to selectively regulate transcription in vitro via PR-A or PR-B.
  • Promoters of the selected genes can be linked to any of several reporters (including but not limited to chloramphenicol acetyl transferase, or luciferase) that measure transcriptional read-out.
  • the promoters can be tested as pure DNA, or as DNA bound to chromatin proteins.
  • Ligands at different concentrations and under different assay conditions can be screened for their ability to either up- or down-regulate transcription of the selected genes, under the control of either PR-A, PR-B or both.
  • cells expressing progesterone receptors or cell lysates comprising progesterone receptors are contacted with a putative regulatory ligand for a time sufficient to act on the receptor.
  • the cells or cell lysates contain one, two or more promoters of the selected genes that are linked to any of several reporters, and the transcription or translation of the reporter genes is measured.
  • Appropriate cells are preferably prepared from any cell type that naturally expresses the progesterone receptor or that recombinantly expresses the progesterone receptor, thereby ensuring that the cells contain the transcription factors required for transcription.
  • the screen can be used to identify ligands that modulate the expression of the reporter construct. In such screens, the level of reporter gene expression is determined in the presence of the test ligand and compared to the level of expression in the absence of the test ligand, or the test ligand is compared to a known ligand, such as progesterone.
  • the step of detecting can include detecting the expression of one or more downstream genes of the invention in intact animals or tissues obtained from such animals.
  • Mammalian (e.g., mouse, rat, monkey) or non-mammalian (e.g., chicken) species that express PRs in their tissues and elaborate progesterone can be the test animals.
  • the unknown test ligand is introduced into intact or castrated animals by any of a variety of oral, intravenous, intramuscular, subdermal or other routes, for a variety of treatment times or concentrations.
  • the tissues to be surveyed can be either normal or malignant progesterone targets (including but not limited to the mammary glands, mammary cancers, uterus, or endometrial cancers).
  • the presence and quantity of endogenous mRNA or protein expression of one, two or more of the genes of this invention can be measured in those progesterone target tissues.
  • the gene markers can be measured in tissues that are fresh, frozen, fixed or otherwise preserved. They can be measured in cytoplasmic or nuclear organ-, tissue- or cell-extracts; or in cell membranes including but not limited to plasma, cytoplasmic, mitochondrial, golgi or nuclear membranes; in the nuclear matrix; or in cellular organelles and their extracts including but not limited to ribosomes, nuclei, nucleoli, mitochondria, or golgi.
  • Assays for endogenous expression of mRNAs or proteins encoded by the genes of this invention can be performed as described above.
  • transgenic animals can be generated for ligand screening.
  • Animals can be genetically manipulated to express the promoters of one, two or more of the genes of this invention linked to one or more reporters such as β-galactosidase (detectable with the substrate X-gal).
  • Expression of galactosidase can be measured colorimetrically in normal or malignant progesterone target organs, or tissues containing PRs, or in organs or tissues during development.
  • Ligands that activate through either PR-A or PR-B can be identified by their ability to regulate the appropriate selective gene promoter.
  • the method of the present invention includes a step of comparing the results of detecting the expression of the one or more downstream genes in the presence and in the absence of the putative regulatory ligand, in order to determine whether any observed change in expression is due to the presence of the putative regulatory compound.
  • the step of comparing further includes comparing the expression of the one or more downstream genes detected in the presence of the ligand to the manner of expression of the genes that is associated with the activation of the progesterone receptor when the receptor is activated (described in detail below).
  • the present inventors have identified the expression profile of multiple genes that are regulated by PR, including the manner in which the genes are regulated (i.e., by which PR isoform, and in which direction by such isoform).
  • a putative test ligand is determined to be a regulator of PR if the expression of the gene or genes detected after contact of the PR with the ligand is statistically significantly altered (i.e., up or down) from the expression detected in the profile of a PR that has been activated by progesterone, or an equivalent agonist.
  • the expression profiles for the genes in Tables 1-19 were determined by evaluating PR that had been activated by progesterone after 6 hours.
  • A PR agonist is identified by detecting an expression profile in the presence of the agonist that, at a minimum, regulates the expression of the gene in the same direction (i.e., upregulation or downregulation) as it is regulated by an activated progesterone receptor (e.g., the manner of expression of the gene as indicated in Tables 1-7, 9-15 or 18 and 19).
  • Detection of the regulation of the expression of the gene in the “manner” associated with the activation of the PR refers to the detection of the upregulation of a gene that has now been shown by the present inventors to be selectively upregulated by PR-A (genes in Tables 1 and 9) when the receptor is in the presence of the putative agonist, as compared to in the absence of the putative agonist.
  • an agonist is identified when the expression of a gene from Tables 2 or 10 is detected to be downregulated in the presence of the putative agonist as compared to in the absence of the agonist.
  • the agonist regulates the expression of the gene in the same direction and to at least about 10%, and more preferably at least 20%, and more preferably at least 25%, and more preferably at least 30%, and more preferably at least 35%, and more preferably at least 40%, and more preferably at least 45%, and more preferably at least 50%, and preferably at least 55%, and more preferably at least 60%, and more preferably at least 65%, and more preferably at least 70%, and more preferably at least 75%, and more preferably at least 80%, and more preferably at least 85%, and more preferably at least 90%, and more preferably at least 95%, of the level of expression that is induced by a progesterone receptor that has been activated by progesterone.
  • an agonist regulates the expression of the gene in the same direction and to a level of expression that is substantially equal to or greater than the level of expression that is induced by a progesterone receptor that has been activated by progesterone.
  • the level of expression is determined with reference to the expression of the gene in the absence of the putative regulatory compound, or in the absence of progesterone, in the case of the control. The level of expression is then compared to the level of expression of the control, or the level of expression that is expected from the control.
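  • The agonist criterion described above (same direction, and at least a stated percentage of the progesterone-induced level) can be sketched as below. The sketch assumes three expression measurements for one gene: untreated baseline, progesterone-treated control, and test-ligand-treated. The function names are hypothetical, and the default cutoff of 10% is simply the lowest of the percentages recited above.

```python
def percent_of_progesterone_response(ligand: float, progesterone: float,
                                     baseline: float) -> float:
    """Ligand-induced change as a percentage of the progesterone-induced change.

    All values are expression levels of one gene. A negative result means the
    ligand moved expression in the direction opposite to progesterone.
    """
    reference = progesterone - baseline
    if reference == 0:
        raise ValueError("gene is not regulated by progesterone in this control")
    return 100.0 * (ligand - baseline) / reference


def is_agonist(ligand: float, progesterone: float, baseline: float,
               min_percent: float = 10.0) -> bool:
    """True if the ligand regulates the gene in the same direction as
    progesterone and to at least `min_percent` of the progesterone response."""
    return percent_of_progesterone_response(ligand, progesterone, baseline) >= min_percent
```

  • Because the change is divided by the signed progesterone response, the same calculation works for genes that progesterone upregulates and for genes that it downregulates.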
  • A PR antagonist is identified by detecting an expression profile in the presence of the antagonist that, at a minimum, regulates the expression of the gene in the direction opposite (e.g., upregulation instead of downregulation) to that in which the gene is regulated by an activated progesterone receptor (e.g., the manner of expression of the gene as indicated in Tables 1-7, 9-15 or 18 and 19), or causes a statistically significant reduction in the expression level of the gene as compared to the expression level of the gene when it is activated by progesterone, or prevents the regulation of the gene as compared to the regulation of the gene when the receptor is activated by progesterone.
  • The putative antagonists are screened against a PR that is activated, so in the absence of the putative antagonist, the expression profile of the genes should be substantially the same as the expression profile set forth in Tables 1-7, 9-15 and 18-19. Therefore, any statistically significant decrease (inhibition) in the expression level of the gene or a reversal of the direction of expression of the gene in the presence of the putative antagonist as compared to in the absence of the antagonist, indicates that the putative ligand is an antagonist.
  • the antagonist inhibits the expression of the detected gene by at least 5%, and more preferably at least 10%, and more preferably at least 15%, and more preferably at least 20%, and more preferably at least 25%, and more preferably at least 30%, and more preferably at least 40%, and more preferably at least 50%, and more preferably at least 60%, and more preferably at least 70%, and more preferably at least 80%, and more preferably at least 90%, as compared to the level of expression that is induced by the activated progesterone receptor in the absence of the putative antagonist.
  • an antagonist regulates the expression of the gene in the opposite direction (i.e., reverses the expression) as compared to the expression of the gene induced by the activated progesterone receptor in the absence of the putative antagonist.
  • A test ligand having 10% of the activity of an agonist can be an antagonist of that agonist. This may depend on the technique being used for detection as well as on the number of genes which are being tested. One of skill in the art can readily determine the criteria for selection of suitable antagonists.
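  • The antagonist criteria described above can be illustrated in the same way. This sketch assumes that, in addition to the activated control and the antagonist-treated sample, an unactivated baseline measurement is available so that inhibition and reversal can be told apart; that baseline, the function names, and the category labels are assumptions. The default cutoff of 5% is the lowest inhibition percentage recited above.

```python
def percent_inhibition(activated: float, with_antagonist: float,
                       baseline: float) -> float:
    """Percentage of the progesterone-induced change that the ligand removes.

    0 = no effect, 100 = expression returned to the unactivated baseline,
    above 100 = expression pushed past baseline (direction reversed).
    """
    induced = activated - baseline
    if induced == 0:
        raise ValueError("gene is not regulated in the activated control")
    return 100.0 * (1.0 - (with_antagonist - baseline) / induced)


def classify_antagonism(activated: float, with_antagonist: float,
                        baseline: float, min_inhibition: float = 5.0) -> str:
    """Label a ligand's effect on one gene of an activated PR."""
    inhibition = percent_inhibition(activated, with_antagonist, baseline)
    if inhibition > 100.0:
        return "reversal"
    if inhibition >= min_inhibition:
        return "inhibition"
    return "no antagonism detected"
```

  • In practice a statistical test on replicate measurements would be needed before any call is made; the fixed cutoff here only stands in for that step.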
  • one of skill in the art can, for the first time, identify isoform-specific regulators of progesterone receptors. Therefore, one embodiment of the present invention relates to a method to identify isoform-specific agonists of progesterone receptors.
  • This method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is selectively regulated by the progesterone receptor when the progesterone receptor is activated, and (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand.
  • the at least one gene is selected from the group consisting of: (i) at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and, (ii) at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4.
  • Detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (i) but not (ii), indicates that the putative agonist ligand is a PR-A-specific agonist
  • detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (ii) but not (i) indicates that the putative agonist ligand is a PR-B-specific agonist.
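  • The "(i) but not (ii)" logic of the isoform-specific agonist method can be sketched as a small decision function. It assumes that each detected gene has already been scored as "up", "down" or "unchanged" in the presence of the putative agonist; the panel contents, gene labels and function names are placeholders rather than genes from the tables.

```python
from typing import Dict


def panel_hit(observed: Dict[str, str], expected: Dict[str, str]) -> bool:
    """True if at least one panel gene moves in the direction expected
    for the activated receptor ("up" or "down")."""
    return any(observed.get(gene) == direction
               for gene, direction in expected.items())


def call_agonist_specificity(observed: Dict[str, str],
                             pr_a_panel: Dict[str, str],
                             pr_b_panel: Dict[str, str]) -> str:
    """Apply the '(i) but not (ii)' and '(ii) but not (i)' rules.

    pr_a_panel: genes regulated exclusively by PR-A (Tables 1 and 2).
    pr_b_panel: genes regulated exclusively by PR-B (Tables 3 and 4).
    """
    hit_a = panel_hit(observed, pr_a_panel)
    hit_b = panel_hit(observed, pr_b_panel)
    if hit_a and not hit_b:
        return "PR-A-specific agonist"
    if hit_b and not hit_a:
        return "PR-B-specific agonist"
    if hit_a and hit_b:
        return "agonist, not isoform-specific"
    return "no agonist activity detected"
```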
  • Another embodiment of the present invention relates to a method to identify isoform-specific antagonists of progesterone receptors, comprising: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand.
  • the at least one gene is selected from the group consisting of: (i) at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and, (ii) at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4.
  • detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (i) but not (ii), indicates that the putative antagonist ligand is a PR-A-specific antagonist
  • detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (ii) but not (i) indicates that the putative antagonist ligand is a PR-B-specific antagonist.
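  • A minimal Python sketch of the "inhibition or reversal" comparison follows. It assumes each marker gene has two fold-change measurements relative to untreated cells: one with progestin alone and one with progestin plus the putative antagonist. The log-scale measure, the 50% inhibition cutoff, and the gene identifiers are illustrative assumptions, not values from the specification.

```python
import math
from typing import Dict, Tuple


def fraction_inhibited(fc_progestin: float, fc_with_ligand: float) -> float:
    """Fraction of the progestin response (on a log2 scale) removed by the ligand.

    0.0 means no inhibition, 1.0 means full inhibition, and values above 1.0
    mean the regulation was reversed. Works for up- and downregulated genes.
    """
    base = math.log2(fc_progestin)
    if base == 0.0:
        raise ValueError("gene shows no progestin response to inhibit")
    return 1.0 - math.log2(fc_with_ligand) / base


def classify_antagonist(
    pr_a_markers: Dict[str, Tuple[float, float]],  # gene -> (progestin, progestin + ligand)
    pr_b_markers: Dict[str, Tuple[float, float]],
    min_inhibition: float = 0.5,  # assumed cutoff
) -> str:
    def blocked(markers: Dict[str, Tuple[float, float]]) -> bool:
        return any(
            fraction_inhibited(fc_p, fc_l) >= min_inhibition
            for fc_p, fc_l in markers.values()
        )

    a_blocked = blocked(pr_a_markers)
    b_blocked = blocked(pr_b_markers)
    if a_blocked and not b_blocked:
        return "PR-A-specific antagonist"
    if b_blocked and not a_blocked:
        return "PR-B-specific antagonist"
    if a_blocked and b_blocked:
        return "antagonist, not isoform-specific"
    return "no antagonist activity detected"
```

  • Working on a log scale lets the same formula handle upregulated and downregulated marker genes; other measures of inhibition would serve equally well.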
  • Given the knowledge of the genes regulated exclusively by progesterone receptor isoforms according to the present invention, one of skill in the art will be able to select one or more genes to detect in a method of the present invention, and the selection of the one or more genes can be determined by different factors. For example, one of skill in the art may wish to further select genes to be detected on the basis of the function of the gene or gene product, on the basis of tissue-type in which a PR is expressed, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone. Such embodiments have generally been described above.
  • Antiprogestins that selectively inhibit progestin effects on only one of the two PRs would be highly desirable, but do not exist at present. Such antagonist ligands would be useful not only for breast cancer treatment, but also to treat a variety of reproductive disorders, and for contraception. Antagonists that can inhibit only PR-A without affecting PR-B (and vice versa) would be highly desirable.
  • the current invention allows for rapid and direct screening for such ligands. For example, the invention identifies clusters of genes that are upregulated only by PR-A or PR-B in the presence of the agonist, progesterone.
  • genes of this invention that are exclusively regulated by PR-A or PR-B would first be activated by progesterone or another progestin.
  • Putative antiprogestins would be screened and selected on the basis of their ability to reverse or inhibit the effects of the agonist, progesterone, by comparing the expression profiles of the genes in the presence of the putative antiprogestin to the expression profile of the genes as a result of activation of the receptor with a progestin.
  • Isoform-specific agonists of PRs can be similarly selected by choosing ligands on the basis of their ability to mimic the effects of the agonist, progesterone, on the PR isoforms.
  • If a gene in Table 1 is detected (i.e., a gene that is known to be upregulated selectively (i.e., exclusively, uniquely) by PR-A) when the PR to be tested (at least PR-A or a combination of PR-A and PR-B) is in the presence of a putative regulatory ligand, and the expression of that gene is determined to be in the manner associated with activation of the progesterone receptor (i.e., the gene is upregulated), then it can be concluded that the putative regulatory compound is a PR-A-specific agonist, because the present inventors have shown that the gene is exclusively upregulated by PR-A.
  • If a gene in Table 4 is detected (i.e., a gene that is known to be downregulated selectively (i.e., exclusively, uniquely) by PR-B) when the PR to be tested (at least PR-B or a combination of PR-A and PR-B) is in the presence of a putative regulatory ligand, and the expression of that gene is determined to be in the manner associated with activation of the progesterone receptor (i.e., the gene is downregulated), then it can be concluded that the putative regulatory compound is a PR-B-specific agonist, because the present inventors have shown that this particular gene is exclusively downregulated by PR-B.
  • the putative regulatory compound is a PR-B-specific antagonist, because the present inventors have shown that this particular gene is exclusively downregulated by PR-B.
  • Agonists and antagonists of progesterone receptors identified by the above methods or any other suitable method are useful in a variety of therapeutic methods as described herein.
  • Yet another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor.
  • This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative agonist ligand under conditions wherein, in the absence of the putative agonist
  • another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor, such method comprising: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first tissue type but not a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by the first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) detecting expression of the at least one gene from (a); (d) comparing the expression of the at least one gene in the presence and in the absence of the putative
  • Another embodiment of the present invention relates to a method to identify a tissue-specific antagonist of a progesterone receptor.
  • This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone
  • another embodiment of the present invention relates to a method to identify a tissue-specific antagonist of a progesterone receptor, such method including the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first but not in a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) detecting expression of the at least one gene from (a); and, (d) comparing the expression of the at least one gene in the presence and in the absence of the put
  • the first tissue type is breast, and at least one gene is selected from the group consisting of any one or more of the genes in Tables 1-7.
  • the first or second tissue type can be any tissue type, including any cell type, that expresses a progesterone receptor.
  • tissues that are known to express progesterone receptors include, but are not limited to, breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth.
  • the first tissue type is a non-malignant tissue and wherein the second tissue type is a malignant tissue from the same tissue source as the first tissue type.
  • a preferred tissue source for screening for regulators of malignant tissue but not non-malignant tissue is breast tissue.
  • the first tissue type is a normal tissue and wherein the second tissue type is a non-malignant, abnormal tissue.
  • tissue include, but are not limited to, tissues from endometriosis and leiomyoma of the uterus, fibrocystic disease of the breast, or polycystic ovary.
  • the method includes the detection of any one or more of the following genes: 11-beta-hydroxysteroid dehydrogenase type 2, tissue factor gene, PCI gene (plasminogen activator inhibitor 3), MAD-3 IκB-alpha, Niemann-Pick C disease (NPC1), platelet-type phosphofructokinase, phenylethanolamine n-methyltransferase (PNMT), transforming growth factor-beta 3 (TGF-beta3), Monocyte Chemotactic Protein 1, delta sleep inducing peptide (related to TSC-22), estrogen receptor-related protein (hERRa1).
  • the method includes the detection of any one or more of the following genes: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
  • the method includes the detection of phenylethanolamine n-methyltransferase (PNMT) adrenal medulla.
  • the method includes the detection of proteasome-like subunit MECL-1. This gene is of particular interest when one of the tissue types is thymus tissue.
  • the expression profile of genes regulated by a progesterone receptor in the first or second tissue type is provided by a method comprising:
  • the present invention defines genes that are regulated by PR-A vs. PR-B in breast cancer cells. It is believed that many, if not most of these genes, will also be regulated by progesterone receptors in other tissues. Similar data can be generated for other tissues, including the uterus, bone, cardiovascular tissues, etc., or malignant vs. normal tissues. Progestin regulated genes in other tissues, which differ from the genes in breast cancer cells of this invention, can be identified, and be used to screen for ligands that regulate candidate genes only in the desired tissue. For example, using the appropriate gene clusters, one could identify a ligand that activates PR-A in the uterus but not the breast. Similarly one could screen out ligands that have undesirable organ or tissue effects.
  • ligands that are inadvertently bioactive in the liver could be discarded.
  • With the tissue-specific methods described above, it is also possible to screen for antagonists that block the actions of progestins in one organ or tissue and through one PR isoform, but not in another organ or tissue and through the other PR isoform.
  • For example, if PR-A are “good” receptors in the uterus but not the breast, a selective “antiprogestin-A” might be found that is only inhibitory in the breast.
  • Another method of the present invention relates to a method to identify genes that are regulated by a progesterone receptor in two or more tissue types.
  • the method includes the steps of: (a) activating a progesterone receptor in two or more tissue types that express the progesterone receptor; (b) detecting expression of at least one gene in the two or more tissue types, the at least one gene being chosen from a gene in any one or more of Tables 1-7, and, (c) identifying genes that are regulated by the progesterone receptor in each of the two or more tissue types.
  • the method further includes detecting whether the genes are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B.
  • This method can generally be used to provide a profile of genes in a tissue type other than breast. Such a profile can then be used in a method for the identification of tissue-specific progesterone receptor ligands as described above, or in a method of determining a profile of genes for a given tissue sample as described below.
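  • Step (c) of the method above, together with the tissue-specific screening it enables, reduces to set operations once each tissue has a list of PR-regulated genes. The sketch below assumes such lists have already been produced; the tissue names and gene identifiers are placeholders, not data from the specification.

```python
from typing import Dict, Set


def shared_regulated_genes(regulated: Dict[str, Set[str]]) -> Set[str]:
    """Genes regulated by the progesterone receptor in every tissue supplied."""
    if len(regulated) < 2:
        raise ValueError("need two or more tissue types to compare")
    return set.intersection(*regulated.values())


def tissue_restricted_genes(regulated: Dict[str, Set[str]], tissue: str) -> Set[str]:
    """Genes regulated in `tissue` but in none of the other tissues supplied."""
    others: Set[str] = set()
    for name, genes in regulated.items():
        if name != tissue:
            others |= genes
    return regulated[tissue] - others
```

  • The shared set is what a profile of common PR targets would report; the restricted set gives candidate markers for screening ligands that act in only one tissue.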
  • Yet another embodiment of the present invention relates to a method to determine the profile of genes regulated by progesterone receptors in a tissue sample.
  • the sample is a breast tumor sample.
  • This method includes the steps of: (a) obtaining from a patient a breast tumor sample; (b) detecting expression of at least one gene in the breast tumor sample that is regulated by a progesterone receptor when the progesterone receptor is activated; and, (c) producing a profile of genes for the tumor sample that are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B.
  • the gene(s) to be profiled are selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 9; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 10; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 11; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 12; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 13; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 14; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 15.
  • PRs are routinely measured in all breast cancers when the disease is first diagnosed. Presence of PRs, especially if the levels are high, informs the oncologist that the tumor is likely to be “hormone-dependent” and will respond to endocrine treatments. This spares the woman from much harsher treatments involving chemotherapies. Additionally, the number of PRs allows the oncologist to predict how aggressive the tumor is likely to be. High PR levels in her tumor indicate that a woman's prognosis is good. Thus measurement of total PR levels plays a key role in the management of breast cancers.
  • PR-A and PR-B are present in PR-positive breast cancers.
  • the PR-A:PR-B ratio varies widely from tumor to tumor, and some tumors express only one or the other isoform. However, the clinical consequences of this heterogeneity are unknown. Because the transcriptional effects of the two PRs are believed to be so different, fluctuations in their ratio are expected to critically influence the biology of the tumors. However, at present, how that biology is affected is unknown. Whether in fact, PR-A are “bad” and PR-B are “good” in breast cancers, is also unknown.
  • total PRs are routinely measured in all primary breast cancers as a guide to therapy. Their presence and levels are used to predict whether the tumor is likely to respond to hormone treatments, and to estimate disease prognosis. Tumors that lack PRs have less than 10% chance of responding to hormone treatments; tumors that contain PRs have on average a 70% chance of responding to hormone treatments depending on the receptor levels. These numbers are statistical only, and therefore are not specifically informative for any individual patient.
  • the present invention has led to the development of assays that profile the tumor of an individual patient for “good” and “bad” surrogate markers of PR-A and PR-B. Thus it is now possible to measure not only the presence of PRs in a tumor, but the function of the PRs in that tumor.
  • one or more of the genes set forth in Tables 9-15 are selected to be screened in a tissue sample from a patient.
  • the tissue sample is a breast tumor sample.
  • the expression of the genes in the tissue sample can be detected using techniques described above for the various other methods of the present invention. For example, transcript expression levels of the selected genes can be measured in the tumor of a patient, by any of a number of known methods.
  • methods include but are not limited to: northern blotting; reverse transcriptase-polymerase chain reaction and detection of the product; use of labeled mRNA from the tumor to probe cDNAs or oligonucleotides encoding all or part of the PR-responsive genes of interest, arrayed on any of a variety of surfaces, as described above.
  • methods include but are not limited to: western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to DNA binding, ligand binding, or interaction with other protein partners, as described above.
  • each gene marker can be measured in primary tumors, metastatic tumors, locally recurring tumors, ductal carcinomas in situ, or other tumors of breast cell origin.
  • the markers can be measured in solid tumors that are fresh, frozen, fixed or otherwise preserved. They can be measured in cytoplasmic or nuclear tumor extracts; or in tumor membranes including but not limited to plasma, mitochondrial, golgi or nuclear membranes; in the nuclear matrix; or in tumor cell organelles and their extracts including but not limited to ribosomes, nuclei, mitochondria, golgi.
  • a profile of individual gene markers can be generated by one or more of the methods described above.
  • a profile of the genes regulated by progesterone receptors in a tissue sample refers to a reporting of the expression level of a given gene from Tables 9-15 that, based on the knowledge of the regulation of the genes provided by Tables 9-15, includes a classification of the gene with regard to how the gene is regulated by the PR isoforms. For example, if the gene, estrogen receptor-related protein, is identified as being expressed by a tumor sample, the profile for the tumor will include the reporting of the expression of at least one gene that is exclusively regulated by PR-A.
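  • Such a profile can be pictured as measured expression levels grouped by regulation class. The following Python sketch assumes a lookup from gene to class has been built from the tables; the gene identifiers, class labels, and detection limit shown are hypothetical placeholders.

```python
from typing import Dict


def build_profile(
    expression: Dict[str, float],      # gene id -> measured expression level
    gene_category: Dict[str, str],     # gene id -> regulation class (from the tables)
    detection_limit: float = 0.0,      # assumed cutoff for "expressed"
) -> Dict[str, Dict[str, float]]:
    """Group detected genes by how the PR isoforms regulate them."""
    profile: Dict[str, Dict[str, float]] = {}
    for gene, level in expression.items():
        category = gene_category.get(gene)
        # Skip genes that are not in the tables or are not detectably expressed.
        if category is None or level <= detection_limit:
            continue
        profile.setdefault(category, {})[gene] = level
    return profile
```

  • A tumor whose profile contains entries only under a PR-A class, for instance, would be reported as expressing genes exclusively regulated by PR-A.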
  • the data can be reported as raw data, and/or statistically analyzed by any of a variety of methods, and/or combined with any other prognostic marker(s) including but not limited to ER, % S-phase, other proliferation markers, markers of ER expression, tumor suppressor genes, etc.
  • Given the knowledge of the genes regulated by progesterone receptor isoforms according to the present invention, one of skill in the art will be able to select one or more genes to detect in this method of the present invention, and the selection of the one or more genes can be determined by different factors. For example, one of skill in the art may wish to further select genes to be detected on the basis of the function of the gene or gene product, on the basis of PR isoform-specificity, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone. Such embodiments have generally been described above.
  • the method preferably includes the detection of any one or more of the following genes: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
  • the method preferably includes the detection of any one or more of the following genes: growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
  • the profile of genes provided as a result of the screening of the tissue can be used by the patient or physician for decision-making regarding the usefulness of endocrine therapies in general (i.e. oophorectomy, antiestrogens or other SERMs, aromatase inhibitors, or others), or progestational therapy in particular (high dose progestins, antiprogestins or others).
  • the profile can be used to estimate how the disease is likely to respond and progress in any individual patient.
  • Clinical trials can be developed to correlate the relationship between PR-A vs. PR-B regulated genes, and the biological behavior of the tumor.
  • the gene clusters of this invention can be measured or quantified in normal breast or other normal tissues, either frozen or preserved, or in tissue or organelle extracts as described above, either alone or together with other markers (for example BRCA1), and used for genetic counseling.
  • breast tumors that overexpress PR-B or PR-A represent phenotypically different tumor subsets.
  • breast tumors that are identified as “PR-B rich” based on their expression of PR-B specific genes can be further assessed in terms of usual clinical parameters—tumor staging, pathological staging, size, nodal status, metastasis, responsiveness to hormonal and chemotherapies—and compared to parallel tumors that are “PR-A rich”.
  • the present inventors predict that PR-B rich tumors may be larger and more aggressive than PR-A rich tumors.
  • PR-B strongly and uniquely upregulate two important genes that support angiogenesis: L13720, growth arrest-specific protein (gas6), is increased 23.1-fold; M27436, tissue factor gene, is increased 18.1-fold. Increased angiogenesis promotes tumor growth by increasing the tumor's blood (and nutrient) supply. This is one example of the hypotheses that can be raised and tested, based on the new information revealed by this invention.
  • the profiling of genes can be extended to other tissue types and/or other genes.
  • tissue types for the presence or absence of the genes regulated by PR in breast tissue, and/or to perform a de novo screening assay for the identification of genes regulated by PR in another tissue, to develop gene expression profiles for use in screening for tissue specific ligands.
  • One of skill in the art can now look to see if a given gene that is regulated by PR in breast is also regulated by PR in another tissue type.
  • the 4 breast cancer cell lines described in Example 1 can be used to screen other gene arrays, including arrays of expressed tag sequences, to discover additional novel, PR-A vs. PR-B regulated genes.
  • the procedure used to produce these cells can be extended to cells from other tissue sources (e.g., the uterus), and new PR-A and PR-B regulated genes can be identified for these tissue sources.
  • Additional applications of the present invention include screening for genes that are regulated by PRs in a ligand-independent manner.
  • the extension of the gene profiles to other tissue types will allow for the development of a variety of diagnostic assays in other tissues and for diseases related to such other tissues, as well as the identification of additional targets for therapeutic strategies.
  • Another embodiment of the present invention relates to a plurality of polynucleotides for the detection of the expression of genes regulated by progesterone receptors in breast tissue.
  • the plurality of polynucleotides consists of polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that are regulated by progesterone receptors, and is therefore distinguished from previously known nucleic acid arrays and primer sets.
  • the plurality of polynucleotides within the above-limitation includes at least one or more, but is not limited to one or more, polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes identified by the present inventors.
  • Such genes are selected from: (a) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (b) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (c) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (d) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (e) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (f) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (g) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7.
  • additional genes that are not regulated by progesterone receptors can be added to the plurality of polynucleotides.
  • Such genes would not be random genes, or large groups of unselected human genes, as are commercially available now, but rather, would be specifically selected to complement the sets of progesterone receptor-regulated genes identified by the present invention.
  • one of skill in the art may wish to add to the above-described plurality of genes one or more genes that are of relevance because they are expressed by a particular tissue of interest (e.g., breast tissue), are associated with a particular disease or condition of interest (e.g., breast cancer), or are associated with a particular cell, tissue or body function (e.g., angiogenesis).
  • the plurality of polynucleotides further comprises at least one polynucleotide probe that is complementary to RNA transcripts, or nucleotides derived therefrom, of at least one gene chosen from the genes in Table 8.
  • the plurality of polynucleotides comprises polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of particular subsets of the genes disclosed in the present invention.
  • one of skill in the art may wish to design pluralities of polynucleotides on the basis of the function of the gene or gene product, on the basis of a tissue-type that expresses a PR, on the basis of PR isoform specificity, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone.
  • a plurality of polynucleotides refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of polynucleotides, including at least 100, 500, 1000, 10^4, 10^5, or at least 10^6 or more polynucleotides.
  • an isolated polynucleotide, or an isolated nucleic acid molecule is a nucleic acid molecule that has been removed from its natural milieu (i.e., that has been subject to human manipulation), its natural milieu being the genome or chromosome in which the nucleic acid molecule is found in nature.
  • isolated does not necessarily reflect the extent to which the nucleic acid molecule has been purified, but indicates that the molecule does not include an entire genome or an entire chromosome in which the nucleic acid molecule is found in nature.
  • the polynucleotides useful in the plurality of polynucleotides of the present invention are typically a portion of a gene of the present invention that is suitable for use as a hybridization probe or PCR primer for the identification of a full-length gene (or portion thereof) in a given sample (e.g., a cell sample).
  • An isolated nucleic acid molecule can include a gene or a portion of a gene (e.g., the regulatory region or promoter), for example, to produce a reporter construct according to the present invention.
  • An isolated nucleic acid molecule that includes a gene is not a fragment of a chromosome that includes such gene, but rather includes the coding region and regulatory regions associated with the gene, but no additional genes naturally found on the same chromosome.
  • An isolated nucleic acid molecule can also include a specified nucleic acid sequence flanked by (i.e., at the 5′ and/or the 3′ end of the sequence) additional nucleic acids that do not normally flank the specified nucleic acid sequence in nature (i.e., heterologous sequences).
  • An isolated nucleic acid molecule can include DNA, RNA (e.g., mRNA), or derivatives of either DNA or RNA (e.g., cDNA).
  • Although the phrase “nucleic acid molecule” primarily refers to the physical nucleic acid molecule and the phrase “nucleic acid sequence” primarily refers to the sequence of nucleotides on the nucleic acid molecule, the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein.
  • an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis.
  • the polynucleotide is an oligonucleotide probe
  • the probe preferably ranges from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length.
  • the polynucleotide probes are conjugated to detectable markers.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas Red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads.
  • the polynucleotide probes are immobilized on a substrate.
  • the polynucleotide probes are hybridizable array elements in a microarray or high density array.
  • Nucleic acid arrays are well known in the art and are described for use in comparing expression levels of particular genes of interest, for example, in U.S. Pat. No. 6,177,248, which is incorporated herein by reference in its entirety. Nucleic acid arrays are suitable for quantifying small variations in expression levels of a gene in the presence of a large population of heterogeneous nucleic acids. Knowing the identity of the downstream genes of the present invention, nucleic acid arrays can be fabricated either by de novo synthesis on a substrate or by spotting or transporting nucleic acid sequences onto specific locations of the substrate.
  • Nucleic acids are purified and/or isolated from biological materials, such as a bacterial plasmid containing a cloned segment of sequence of interest. It is noted that all of the genes identified by the present invention have been previously sequenced, at least in part, such that oligonucleotides suitable for the identification of such nucleic acids can be produced. The database accession number for each of the genes identified by the present inventors is provided in the tables of the invention. Suitable nucleic acids are also produced by amplification of template, such as by polymerase chain reaction or in vitro transcription.
  • oligonucleotide arrays are particularly preferred for this aspect of the invention. Oligonucleotide arrays have numerous advantages, as opposed to other methods, such as efficiency of production, reduced intra- and inter array variability, increased information content and high signal-to-noise ratio.
  • An array will typically include a number of probes that specifically hybridize to the sequences of interest.
  • the array will include one or more control probes.
  • the high-density array chip includes “test probes.” Test probes could be oligonucleotides that range from about 5 to about 45 or 5 to about 500 nucleotides, more preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 40 nucleotides in length. In other particularly preferred embodiments the probes are 20 or 25 nucleotides in length. In other preferred embodiments, test probes are double- or single-stranded DNA sequences.
  • DNA sequences are isolated or cloned from natural sources or amplified from natural sources using natural nucleic acids as templates, or produced synthetically. These probes have sequences complementary to particular subsequences of the genes whose expression they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.
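  • As a rough illustration of how test probes complementary to subsequences of a target gene might be enumerated, the sketch below tiles a sequence into fixed-length antisense probes. The function names, the `step` parameter, and the default length of 25 nucleotides (one of the preferred probe lengths mentioned above) are illustrative assumptions and are not taken from the disclosure.

```python
# Minimal sketch (assumed helper names): derive antisense probes that are
# complementary to successive windows of a target DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target, length=25, step=1):
    """Return probes complementary to each `length`-nt window of `target`.

    `step` controls how far the window advances between probes.
    """
    target = target.upper()
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        reverse_complement(target[start:start + length])
        for start in range(0, len(target) - length + 1, step)
    ]
```

A target shorter than the requested probe length simply yields an empty list, which makes it easy to skip sequences that cannot support a probe of that size.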
  • Another embodiment of the present invention relates to a plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated by progesterone receptors in breast tissue.
  • the plurality of antibodies, or antigen binding fragments thereof consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated by progesterone receptors.
  • the plurality of antibodies, or antigen binding fragments thereof comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes selected from the group consisting of: (a) genes that are selectively upregulated by PR-A chosen from genes in Table 1; (b) genes that are selectively downregulated by PR-A chosen from genes in Table 2; (c) genes that are selectively upregulated by PR-B chosen from genes in Table 3; (d) genes that are selectively downregulated by PR-B chosen from genes in Table 4; (e) genes that are upregulated or downregulated by both PR-A and PR-B chosen from genes in Table 5; (f) genes that are reciprocally regulated by PR-A and PR-B chosen from genes in Table 6; and, (g) genes that are regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from genes in Table 7.
  • the plurality of antibodies, or antigen binding fragments thereof further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by a gene chosen from the genes in Table 8.
  • the plurality of antibodies, or antigen binding fragments thereof further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by one or more of a particular subset of the genes disclosed in the present invention.
  • one of skill in the art may wish to design pluralities of antibodies on the basis of the function of the gene product, on the basis of tissue-type, on the basis of PR isoform specificity, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone.
  • Such embodiments have generally been described above.
  • a plurality of antibodies, or antigen binding fragments thereof refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of antibodies, or antigen binding fragments thereof, including at least 100, 500, or at least 1000 antibodies, or antigen binding fragments thereof.
  • the phrase “selectively binds to” refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins (e.g., a protein encoded by a PR regulated gene according to the present invention).
  • the phrase “selectively binds” with regard to antibodies and antigen binding fragments thereof, has been defined previously herein.
  • An antigen binding fragment is referred to as an Fab, an Fab′, or an F(ab′)2 fragment.
  • a fragment lacking the ability to bind to antigen is referred to as an Fc fragment.
  • An Fab fragment comprises one arm of an immunoglobulin molecule containing a L chain (VL + CL domains) paired with the VH region and a portion of the CH region (CH1 domain).
  • An Fab′ fragment corresponds to an Fab fragment with part of the hinge region attached to the CH1 domain.
  • An F(ab′)2 fragment corresponds to two Fab′ fragments that are normally covalently linked to each other through a disulfide bond, typically in the hinge regions.
  • Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees.
  • Whole antibodies of the present invention can be polyclonal or monoclonal.
  • functional equivalents of whole antibodies such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab′, or F(ab) 2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention.
  • a suitable experimental animal such as, for example, but not limited to, a rabbit, a sheep, a hamster, a guinea pig, a mouse, a rat, or a chicken, is exposed to an antigen against which an antibody is desired.
  • an animal is immunized with an effective amount of antigen that is injected into the animal.
  • An effective amount of antigen refers to an amount needed to induce antibody production by the animal.
  • the animal's immune system is then allowed to respond over a pre-determined period of time. The immunization process can be repeated until the immune system is found to be producing antibodies to the antigen.
  • serum is collected from the animal that contains the desired antibodies (or in the case of a chicken, antibody can be collected from the eggs). Such serum is useful as a reagent.
  • Polyclonal antibodies can be further purified from the serum (or eggs) by, for example, treating the serum with ammonium sulfate.
  • Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein ( Nature 256:495-497, 1975). For example, B lymphocytes are recovered from the spleen (or any suitable tissue) of an immunized animal and then fused with myeloma cells to obtain a population of hybridoma cells capable of continual growth in suitable culture medium. Hybridomas producing the desired antibody are selected by testing the ability of the antibody produced by the hybridoma to bind to the desired antigen.
  • PR-regulated genes of this invention can serve as targets for therapeutic strategies.
  • neutralizing antibodies could be directed against one of the protein products of a selected gene, expressed on the surface of a tumor cell.
  • One embodiment of this aspect of the invention relates to a method to regulate the expression of a gene selected from the group consisting of any one or more of the genes in Tables 1-7.
  • the method includes administering to a cell that expresses a progesterone receptor a compound selected from the group consisting of: progesterone, a progestin, and an antiprogestin, wherein the compound is effective to regulate the expression of the gene(s) in Tables 1-7.
  • the gene is selected from the group consisting of genes that are listed in Table 16 (known to be involved in breast cancer or mammary gland development), but not in Table 8 (known to be regulated by progesterone).
  • Such genes include, e.g., growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
  • the cell that expresses a progesterone receptor is in the breast tissue of a patient that has, or is at risk of developing, breast cancer.
  • these genes can serve as targets for the development of other therapeutic methods.
  • a composition, and particularly a therapeutic composition, of the present invention generally includes the therapeutic compound (e.g., the progesterone receptor regulatory ligand) and a carrier, and preferably, a pharmaceutically acceptable carrier.
  • a “pharmaceutically acceptable carrier” includes pharmaceutically acceptable excipients and/or pharmaceutically acceptable delivery vehicles, which are suitable for use in administration of the composition to a suitable in vitro, ex vivo or in vivo site.
  • a suitable in vitro, in vivo or ex vivo site is preferably a cell that expresses a progesterone receptor.
  • a suitable site for delivery is a site of inflammation, a site of a tumor, a site of a transplanted graft, or a site of any other disease or condition in which progesterone receptor regulation, or modulation of genes regulated by a PR, can be beneficial, particularly given the knowledge of the genes regulated by PR according to the invention.
  • Preferred pharmaceutically acceptable carriers are capable of maintaining a steroidal or non-steroidal compound, a protein, a peptide, nucleic acid molecule or mimetic (drug) according to the present invention in a form that, upon arrival of the steroidal or non-steroidal compound, protein, peptide, nucleic acid molecule or mimetic at the cell target in a culture or in patient, the steroidal or non-steroidal compound, protein, peptide, nucleic acid molecule or mimetic is capable of interacting with its target (e.g., a naturally occurring PR or a nucleic acid or protein product of a PR-regulated gene).
  • Suitable excipients of the present invention include excipients or formularies that transport or help transport, but do not specifically target a composition to a cell (also referred to herein as non-targeting carriers).
  • examples of pharmaceutically acceptable excipients include, but are not limited to water, phosphate buffered saline, Ringer's solution, dextrose solution, serum-containing solutions, Hank's solution, other aqueous physiologically balanced solutions, oils, esters and glycols.
  • Aqueous carriers can contain suitable auxiliary substances required to approximate the physiological conditions of the recipient, for example, by enhancing chemical stability and isotonicity.
  • Suitable auxiliary substances include, for example, sodium acetate, sodium chloride, sodium lactate, potassium chloride, calcium chloride, and other substances used to produce phosphate buffer, Tris buffer, and bicarbonate buffer.
  • Auxiliary substances can also include preservatives, such as thimerosal, m- or o-cresol, formalin and benzyl alcohol.
  • Compositions of the present invention can be sterilized by conventional methods and/or lyophilized.
  • a controlled release formulation that is capable of slowly releasing a composition of the present invention into a patient or culture.
  • a controlled release formulation comprises a compound of the present invention (e.g., a protein (including homologues), a drug, an antibody, a nucleic acid molecule, or a mimetic) in a controlled release vehicle.
  • Suitable controlled release vehicles include, but are not limited to, biocompatible polymers, other polymeric matrices, capsules, microcapsules, microparticles, bolus preparations, osmotic pumps, diffusion devices, liposomes, lipospheres, and transdermal delivery systems.
  • Other carriers of the present invention include liquids that, upon administration to a patient, form a solid or a gel in situ. Preferred carriers are also biodegradable (i.e., bioerodible).
  • suitable delivery vehicles include, but are not limited to liposomes, viral vectors or other delivery vehicles, including ribozymes.
  • Natural lipid-containing delivery vehicles include cells and cellular membranes.
  • Artificial lipid-containing delivery vehicles include liposomes and micelles.
  • a delivery vehicle of the present invention can be modified to target to a particular site in a patient, thereby targeting and making use of a compound of the present invention at that site.
  • Suitable modifications include manipulating the chemical formula of the lipid portion of the delivery vehicle and/or introducing into the vehicle a targeting agent capable of specifically targeting a delivery vehicle to a preferred site, for example, a preferred cell type.
  • Other suitable delivery vehicles include gold particles, poly-L-lysine/DNA-molecular conjugates, and artificial chromosomes.
  • a pharmaceutically acceptable carrier which is capable of targeting is herein referred to as a “delivery vehicle.”
  • Delivery vehicles of the present invention are capable of delivering a composition of the present invention to a target site in a patient.
  • a “target site” refers to a site in a patient to which one desires to deliver a composition.
  • a target site can be any cell which is targeted by direct injection or delivery using liposomes, viral vectors or other delivery vehicles, including ribozymes and antibodies.
  • Examples of delivery vehicles include, but are not limited to, artificial and natural lipid-containing delivery vehicles, viral vectors, and ribozymes. Natural lipid-containing delivery vehicles include cells and cellular membranes. Artificial lipid-containing delivery vehicles include liposomes and micelles.
  • a delivery vehicle of the present invention can be modified to target to a particular site in a subject, thereby targeting and making use of a compound of the present invention at that site.
  • Suitable modifications include manipulating the chemical formula of the lipid portion of the delivery vehicle and/or introducing into the vehicle a compound capable of specifically targeting a delivery vehicle to a preferred site, for example, a preferred cell type.
  • targeting refers to causing a delivery vehicle to bind to a particular cell by the interaction of the compound in the vehicle with a molecule on the surface of the cell.
  • Suitable targeting compounds include ligands capable of selectively (i.e., specifically) binding another molecule at a particular site. Examples of such ligands include antibodies, antigens, receptors and receptor ligands.
  • Manipulating the chemical formula of the lipid portion of the delivery vehicle can modulate the extracellular or intracellular targeting of the delivery vehicle.
  • a chemical can be added to the lipid formula of a liposome that alters the charge of the lipid bilayer of the liposome so that the liposome fuses with particular cells having particular charge characteristics.
  • One preferred delivery vehicle of the present invention is a liposome.
  • a liposome is capable of remaining stable in an animal for a sufficient amount of time to deliver a nucleic acid molecule (e.g., an anti-sense nucleic acid molecule that hybridizes to a nucleic acid sequence in a gene for which inhibition is desired) to a preferred site in the animal.
  • a liposome according to the present invention comprises a lipid composition that is capable of delivering a nucleic acid molecule described in the present invention to a particular, or selected, site in a patient.
  • a liposome according to the present invention comprises a lipid composition that is capable of fusing with the plasma membrane of the targeted cell to deliver a nucleic acid molecule into a cell.
  • Suitable liposomes for use with the present invention include any liposome.
  • Preferred liposomes of the present invention include those liposomes commonly used in, for example, gene delivery methods known to those of skill in the art. More preferred liposomes comprise liposomes having a polycationic lipid composition and/or liposomes having a cholesterol backbone conjugated to polyethylene glycol. Complexing a liposome with a nucleic acid molecule of the present invention can be achieved using methods standard in the art.
  • a liposome delivery vehicle is preferably capable of remaining stable in a patient for a sufficient amount of time to deliver a nucleic acid molecule or other compound of the present invention to a preferred site in the patient (i.e., a target cell).
  • a liposome delivery vehicle of the present invention is preferably stable in the patient into which it has been administered for at least about 30 minutes, more preferably for at least about 1 hour and even more preferably for at least about 24 hours.
  • a preferred liposome delivery vehicle of the present invention is from about 0.01 microns to about 1 micron in size.
  • a preferred delivery vehicle comprises a viral vector.
  • a viral vector includes an isolated nucleic acid molecule useful in the present invention, in which the nucleic acid molecules are packaged in a viral coat that allows entrance of DNA into a cell.
  • a number of viral vectors can be used, including, but not limited to, those based on alphaviruses, poxviruses, adenoviruses, herpesviruses, lentiviruses, adeno-associated viruses and retroviruses.
  • a composition which includes an agonist or antagonist of a progesterone receptor can be delivered to a cell culture or patient by any suitable method. Selection of such a method will vary with the type of compound being administered or delivered (i.e., steroidal or non-steroidal compound, protein, peptide, nucleic acid molecule, or mimetic), the mode of delivery (i.e., in vitro, in vivo, ex vivo) and the goal to be achieved by administration/delivery of the compound or composition.
  • an effective administration protocol i.e., administering a composition in an effective manner
  • suitable dose parameters and modes of administration that result in delivery of a composition to a desired site (i.e., to a desired cell) and/or in the desired regulatory event (e.g., regulation of the PR receptor biological activity or of the biological activity of a gene that is regulated by PR).
  • Administration routes include in vivo, in vitro and ex vivo routes.
  • In vivo routes include, but are not limited to, oral, nasal, intratracheal injection, inhaled, transdermal, rectal, and parenteral routes.
  • Preferred parenteral routes can include, but are not limited to, subcutaneous, intradermal, intravenous, intramuscular and intraperitoneal routes.
  • Intravenous, intraperitoneal, intradermal, subcutaneous and intramuscular administrations can be performed using methods standard in the art.
  • Aerosol (inhalation) delivery can also be performed using methods standard in the art (see, for example, Stribling et al., Proc. Natl. Acad. Sci.
  • Oral delivery can be performed by complexing a therapeutic composition of the present invention to a carrier capable of withstanding degradation by digestive enzymes in the gut of an animal.
  • examples of such carriers include plastic capsules or tablets, such as those known in the art.
  • Direct injection techniques are particularly useful for suppressing graft rejection by, for example, injecting the composition into the transplanted tissue, or for site-specific administration of a compound, such as at the site of a tumor.
  • Ex vivo refers to performing part of the regulatory step outside of the patient, such as by transfecting a population of cells removed from a patient with a recombinant molecule comprising a nucleic acid sequence encoding a protein according to the present invention under conditions such that the recombinant molecule is subsequently expressed by the transfected cell, and returning the transfected cells to the patient.
  • In vitro and ex vivo routes of administration of a composition to a culture of host cells can be accomplished by a method including, but not limited to, transfection, transformation, electroporation, microinjection, lipofection, adsorption, protoplast fusion, use of protein carrying agents, use of ion carrying agents, use of detergents for cell permeabilization, and simply mixing (e.g., combining) a compound in culture with a target cell.
  • a therapeutic compound including agonists and antagonists of progesterone receptors, as well as compositions comprising such compounds, can be administered to any organism, and particularly, to any member of the Vertebrate class, Mammalia, including, without limitation, primates, rodents, livestock and domestic pets.
  • Livestock include mammals to be consumed or that produce useful products (e.g., sheep for wool production).
  • Preferred mammals to protect include humans.
  • a therapeutic benefit is not necessarily a cure for a particular disease or condition, but rather, preferably encompasses a result which can include alleviation of the disease or condition, elimination of the disease or condition, reduction of a symptom associated with the disease or condition, prevention or alleviation of a secondary disease or condition resulting from the occurrence of a primary disease or condition, and/or prevention of the disease or condition.
  • the phrase “protected from a disease” refers to reducing the symptoms of the disease; reducing the occurrence of the disease, and/or reducing the severity of the disease.
  • Protecting a patient can refer to the ability of a composition of the present invention, when administered to a patient, to prevent a disease from occurring and/or to cure or to alleviate disease symptoms, signs or causes.
  • to protect a patient from a disease includes both preventing disease occurrence (prophylactic treatment) and treating a patient that has a disease (therapeutic treatment) to reduce the symptoms of the disease.
  • a beneficial effect can easily be assessed by one of ordinary skill in the art and/or by a trained clinician who is treating the patient.
  • the term, “disease” refers to any deviation from the normal health of a mammal and includes a state when disease symptoms are present, as well as conditions in which a deviation (e.g., infection, gene mutation, genetic defect, etc.) has occurred, but symptoms are not yet manifested.
  • Example 1 The following example describes the identification of genes regulated by progesterone receptors.
  • the stock medium consists of Eagle's Minimum Essential Medium with Earle's salts (MEM), containing L-glutamine (292 mg/liter) buffered with sodium bicarbonate (2.2 g/liter), insulin (6 ng/ml) and 5% fetal bovine serum (Hyclone, Logan, Utah) with G418.
  • T47D-YA and T47D-YB breast cancer cells were grown to mid-confluence in Minimal Essential Medium containing 5% Fetal Calf Serum, then treated either with 10 nM progesterone dissolved in ethanol for 6 or 12 hours, or with ethanol alone. This yielded 4 treatment types.
  • hybridization was detected by autoradiography and phosphoimaging on a Molecular Dynamics PhosphoImager™ (Molecular Dynamics, Sunnyvale, Calif.). Data were analyzed using Atlas™ Image 1.0, and normalized to signals from control housekeeping genes on the same filter. For selected genes, progesterone inducibility and PR-isoform specificity were confirmed by northern blotting, reverse transcriptase-polymerase chain reaction (RT-PCR), and/or western blotting.
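  • The normalization to housekeeping genes on the same filter can be sketched as follows. The text does not describe how the analysis software performs this step, so the scaling by the mean housekeeping signal, the function name, and the gene identifiers used in any call are assumptions made purely for illustration.

```python
# Minimal sketch (assumed approach): scale every signal on a filter by the
# mean signal of the housekeeping genes measured on that same filter.
def normalize_to_housekeeping(signals, housekeeping):
    """Return a dict of signals divided by the mean housekeeping signal.

    signals: dict mapping gene name -> raw hybridization signal
    housekeeping: iterable of gene names used as controls
    """
    reference = [signals[gene] for gene in housekeeping]
    if not reference:
        raise ValueError("at least one housekeeping gene is required")
    mean_reference = sum(reference) / len(reference)
    if mean_reference <= 0:
        raise ValueError("housekeeping signal must be positive")
    return {gene: value / mean_reference for gene, value in signals.items()}
```

Because each filter is scaled by its own controls, normalized values from the progesterone-treated and vehicle-treated filters become directly comparable.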
  • Affymetrix GeneChip™ Array: T47D-Y, T47D-YA and T47D-YB breast cancer cells were grown to mid-confluence in Minimal Essential Medium containing 5% Fetal Calf Serum, then treated either with 10 nM progesterone dissolved in ethanol for 6 hours, or with ethanol alone. This yielded 6 treatment types.
  • First strand cDNA was synthesized from 2 μg of polyA+ RNA using SSII Reverse Transcriptase, the T7dT 24mer, and other components of the Superscript Choice system (Gibco BRL Life Technologies, Gaithersburg, Md.).
  • the DNA was purified by phenol/chloroform extraction and precipitation, and resuspended in 12 μl DEPC-treated RNase-free water. 5 μl were used in an in vitro transcription reaction using the EnZo BioArray™ High Yield Transcript Labeling Kit (Affymetrix, Inc., Santa Clara, Calif.), to synthesize RNA transcripts and incorporate biotin labeled ribonucleotides. Unincorporated nucleotides were removed with RNeasy affinity columns (Qiagen, Valencia, Calif.).
  • Hybridizations and subsequent washes were done in the GeneChip Hybridization Oven and Fluidics Station 400. After overnight hybridization, the solutions were removed, the chips were washed and stained with streptavidin-phycoerythrin. DNA chips were read at a resolution of 6 μm with a Hewlett-Packard GeneArray Scanner.
  • Each gene on the chip is represented by perfectly matched (PM) and mismatched (MM) oligonucleotides from 16-20 regions of each gene.
  • the mismatched probes act as specificity controls, which allow direct subtraction of background and cross-hybridization signals.
  • the number of instances in which the PM hybridization signal is larger than the MM signal is computed along with the average of the logarithm of the PM:MM ratio (after background subtraction) for each probe set.
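  • The two probe-set statistics described above (the count of probe pairs with PM greater than MM, and the average log PM:MM ratio after background subtraction) can be sketched as follows. The function name, the use of a single scalar background value, the base-10 logarithm, and the decision to skip pairs whose background-subtracted signal is not positive are assumptions, not details given in the text.

```python
import math


# Minimal sketch (assumed details): summarize one probe set from its
# perfect-match (PM) and mismatch (MM) intensities.
def probe_set_summary(pm, mm, background=0.0):
    """Return (pairs where PM > MM, mean log10(PM/MM) after background)."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM lists must have the same length")
    positive_pairs = 0
    log_ratios = []
    for pm_signal, mm_signal in zip(pm, mm):
        pm_adj = pm_signal - background
        mm_adj = mm_signal - background
        if pm_adj > mm_adj:
            positive_pairs += 1
        # A log ratio is only defined when both adjusted signals are positive.
        if pm_adj > 0 and mm_adj > 0:
            log_ratios.append(math.log10(pm_adj / mm_adj))
    average = sum(log_ratios) / len(log_ratios) if log_ratios else float("nan")
    return positive_pairs, average
```

A probe set in which most pairs show PM above MM and the average log ratio is clearly positive would be a natural candidate for a "present" call, although the actual decision rule used by the analysis software is not specified here.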
  • the first level of analysis, including the “present” or “absent” call and pairwise comparisons, was done using GeneChip 3.1 Expression Analysis Program™ (Affymetrix, Inc., Santa Clara, Calif.).
  • a second level of analysis to identify clusters of genes regulated by progesterone via PR-A, PR-B or both was performed using GeneSpring™ version 3.0 (Silicon Genetics, San Carlos, Calif.).
  • the present inventors used customized software capable of comparing multiple experimental pairwise comparisons (minus versus plus progesterone) and multiple control comparisons (all minus hormone samples and all plus hormone samples) to compare fold change minus versus plus hormone as compared to the fold change between controls. This served as a measure of the variability between samples.
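  • The customized software is not described in detail, so the following is only one plausible reading of the comparison: a gene is flagged when every experimental fold change (minus versus plus hormone) exceeds the largest fold change seen between samples of the same treatment. The function names, the `margin` parameter, and the symmetric definition of fold change are assumptions.

```python
from itertools import combinations


# Minimal sketch (assumed logic): compare experimental fold changes with the
# fold changes observed among control (same-treatment) samples.
def symmetric_fold(a, b):
    """Fold change between two positive signals, always >= 1."""
    return max(a / b, b / a)


def exceeds_control_variability(minus, plus, margin=1.0):
    """Return True if all minus-vs-plus fold changes exceed control noise.

    minus, plus: lists of positive signals for one gene, without and with
    progesterone. Control noise is the largest fold change between any two
    samples that received the same treatment.
    """
    experimental = [symmetric_fold(m, p) for m in minus for p in plus]
    controls = [
        symmetric_fold(a, b)
        for group in (minus, plus)
        for a, b in combinations(group, 2)
    ]
    noise = max(controls, default=1.0)
    return min(experimental) > noise * margin
```

Setting `margin` above 1.0 makes the filter stricter, which may be useful when only a few replicate samples are available to estimate the variability.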
  • k-means clustering was performed using GeneSpring™ version 3.2.12 (Silicon Genetics, San Carlos, Calif.) to identify patterns of gene regulation in PR-A, PR-B, or PR-negative cells treated with or without progesterone.
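  • For readers unfamiliar with k-means clustering, the sketch below shows the basic algorithm applied to per-gene expression profiles. It is a generic textbook implementation, not the GeneSpring procedure; the deterministic initialization from the first `k` profiles, the squared Euclidean distance, and all names are assumptions chosen to keep the example small and reproducible.

```python
# Minimal sketch (generic k-means, not the commercial implementation):
# group expression profiles by repeatedly assigning each profile to its
# nearest centroid and recomputing the centroids.
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=100):
    """Return (cluster index per profile, list of centroids).

    profiles: list of equal-length numeric lists, one per gene
    k: number of clusters; the first k profiles seed the centroids
    """
    if not 0 < k <= len(profiles):
        raise ValueError("k must be between 1 and the number of profiles")
    centroids = [list(p) for p in profiles[:k]]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_assignment == assignment:
            break  # converged: no profile changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

Each profile could hold, for example, a gene's log fold change in PR-A cells and in PR-B cells, so that genes regulated mainly through one isoform fall into the same cluster.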
  • Selected genes, i.e., ones that were substantially regulated or are of particular biological interest, have been confirmed by northern blotting and/or RT-PCR, and/or by western blotting. Additionally, the promoters of several genes of interest have been cloned, linked upstream of a luciferase reporter, and tested for their ability to be transcriptionally regulated by PR-A vs. PR-B after transfection into HeLa cervicocarcinoma cells, followed by progesterone treatment of the cells. In the examples tested, regulation by PR-A vs. PR-B using the synthetic promoter/reporter constructs mimicked the regulation of the endogenous genes in the breast cancer cells, supporting the use of these approaches for drug discovery.
  • RT-PCR amplifications of target sequences were performed with co-amplification of an internal control sequence (β2MG or GAPDH) using: β2MG forward primer: 5′-ATCCAGCGTACTCCAAAGATTC-3′ (SEQ ID NO:1); β2MG reverse primer: 5′-TCCTTGCTGAAAGACAAGTCTG-3′ (SEQ ID NO:2); resulting in a product of 178 bp.
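  • The primer pair above can be inspected with a few lines of code. The sketch below computes primer length and GC fraction and shows how an expected product size could be derived from a template sequence. The two primer sequences are taken from the text; the helper names are assumptions, and any template passed to `amplicon_length` must be supplied by the user, since no template sequence is given here.

```python
# Minimal sketch (assumed helper names): basic checks on a PCR primer pair.
B2MG_FORWARD = "ATCCAGCGTACTCCAAAGATTC"  # SEQ ID NO:1, from the text
B2MG_REVERSE = "TCCTTGCTGAAAGACAAGTCTG"  # SEQ ID NO:2, from the text

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def amplicon_length(template, forward, reverse):
    """Length of the product spanned by the primers on `template`.

    The forward primer must match the template strand as given; the reverse
    primer must match its complementary strand. Returns None if either
    primer site is missing or the sites are in the wrong order.
    """
    template = template.upper()
    start = template.find(forward.upper())
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + 1 if start >= 0 else 0)
    if start < 0 or end < 0:
        return None
    return end + len(reverse_site) - start
```

Running `amplicon_length` against the actual β2MG transcript sequence would be one way to check the stated product size; that sequence is not reproduced in this document.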
  • GAPDH primers yielded a product of 485 bp.
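As a quick sanity check on the two β2MG primer sequences given above, the sketch below computes their length and GC content. Only the two sequences come from the text; the helper name and constant names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

B2MG_FORWARD = "ATCCAGCGTACTCCAAAGATTC"  # SEQ ID NO:1
B2MG_REVERSE = "TCCTTGCTGAAAGACAAGTCTG"  # SEQ ID NO:2

for name, seq in (("forward", B2MG_FORWARD), ("reverse", B2MG_REVERSE)):
    print(name, len(seq), round(gc_fraction(seq), 2))
```

Both primers are 22 nucleotides long with 10 G/C bases each, so they are closely matched in length and base composition.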
  • GAPDH, Integrin α6, and bcl-x cDNA primer sequences were obtained from Clontech. Total RNA was prepared from T47DY-A or -B cells as described above.
  • RNA was mixed with 0.4 µM random hexamers and heated to 65° C. for five min. (Perkin Elmer).
  • 1× PCR buffer (5 mM MgCl2), 20 U RNAse inhibitor, 4 mM dNTPs, and 125 U MMLV reverse transcriptase were added and tubes were incubated at 42° C. for 1 hour.
  • Five µl of the cDNA synthesis reactions were added to 1× PCR buffer, 1.8 mM MgCl2, 100 mM dNTP blend, and 60 pmoles of specific primers were incubated with 5 U AmpliTaq DNA polymerase at 94° C. for 30 s, 65° C. for 45 s, and 68° C.
  • PCR reagents were purchased from Perkin Elmer, Foster City, Calif. Five µl of samples were resolved on a 2% agarose gel, and Southern blots were performed in 0.4M. Blots were prehybridized in Rapid-hyb (Amersham) for 1 h at 65° C. cDNA probes were generated by RT-PCR and radioactively labeled using MegaPrime DNA labeling system (Amersham) and 32P-α dCTP. Blots were probed for 2 h to overnight at 65° C.
  • Blots were washed and exposed to autoradiography film or phosphoimaging screen and then quantified using ImageQuant, Molecular Dynamics. In some cases the RT-PCR products could be visualized on an ethidium bromide stained gel when amplified in the linear range of production and in these cases Southern blotting and hybridizing with a labeled probe was unnecessary and products were instead directly quantitated. In some cases Northern blot analysis was used to detect transcripts.
  • RNA was electrophoresed in a formaldehyde agarose gel, transferred to a Hybond nylon membrane (Amersham), and hybridized sequentially with cDNA inserts for specific genes, generated by random priming of PCR products (produced as above) with 32P dCTP using the Mega-Prime DNA Labeling Kit (Amersham). Membranes were then probed with fragments of housekeeping genes (either β2MG or GAPDH).
  • HeLa cells plated at 4 × 10⁵ cells per 10 cm dish in MEM supplemented with 5% fetal bovine serum were then transiently transfected with 100 ng of HPR1 (PR-B in pSG5) or HPR2 (PR-A in pSG5) and 1.2 µg of the integrin α6 promoter (−740) in pGL3-Basic vector plasmid (gift from Dr. Sohei Kitazawa, Kobe University School of Medicine, Department of Pathology), 1.2 µg of β-galactosidase expression plasmid pCH110, and 5.5 µg BSM treated with 10 mM progesterone or ethanol vehicle for 24 hours.
  • Protein extracts were equalized to 150 µg by Bradford assay (Bio-Rad), resolved by SDS-PAGE, and transferred to nitrocellulose. Equivalent protein loading was confirmed by Ponceau S staining. Following incubation with the appropriate antibodies, and HRP-conjugated secondary antibodies, protein bands were detected by enhanced chemiluminescence (Amersham, Arlington Heights, Ill.).
  • any one gene can be viewed individually and standard error bars generated from replicate experiments are shown for gene expression levels in cell lines containing either PR-A, PR-B, or no PR, with or without progesterone treatment.
  • a cluster of genes was shown to be upregulated by progesterone in both PR-A and PR-B containing cells, but not in the PR-negative cell line. While most of these genes were upregulated by progesterone treatment more strongly via PR-B, some, such as S100P calcium binding protein, and Grb10 are upregulated equally well via PR-A and PR-B. Upregulation of IkappaBalpha via both receptors was confirmed at the protein level as early as 6 hours, and remained elevated for up to 48 hours in the presence of progesterone (data not shown).
  • Ezrin, identified as being progesterone regulated using Atlas™ Human cDNA Expression Arrays probed with RNA from T47D-YA and YB cells left untreated or treated with progesterone for 12 hrs, was confirmed by northern blot analysis to be equally well upregulated by both PR-A and PR-B at 12, 24 and 48 hrs (data not shown).
  • progesterone regulated genes require PR-B as illustrated by Tables 3, 4, 11, 12 and 18. Two examples are Stat5a and C/EBP beta. Their differential upregulation only by PR-B was confirmed by immunoblot at several time-points after progesterone treatment (data not shown). In contrast, the same western blot probed for two control proteins, p21 and cyclin D1, previously reported to be progesterone regulated (Musgrove et al., Mol. Cell. Biol. 13, 3577-3587 (1993); Musgrove et al., Mol. Endocrinol.
  • tissue factor a cell surface glycoprotein
  • Tissue factor was previously known to be regulated by progesterone in the endometrium (Krikun et al., Mol Endocrinol 14, 393-400 (2000); Lockwood et al., J Clin Endocrinol Metab 85, 297-301 (2000); Krikun et al., J Clin Endocrinol Metab 83, 926-30 (1998)), but not in the breast or in breast cancers.
  • the HEF1 gene is highly related to BCAR1/p130Cas, which has been found to be upregulated in tamoxifen resistant tumors (van der Flier et al., Int J Cancer 89, 465-8 (2000); van der Flier et al., J Natl Cancer Inst 92, 120-7 (2000)).
  • the present invention provides the rationale for measuring the expression levels of these genes in breast cancers. It may be that tumors that overexpress these genes are good candidates for suppressive therapy with progesterone antagonists.
  • bullous pemphigoid antigen a protein associated with hemidesmosomes
  • Such desmosomes are important in maintaining the normal differentiated architecture of the breast.
  • the present inventors have found that bullous pemphigoid antigen is downregulated by progesterone through both PR isoforms. This down regulation may be harmful, and/or it may disrupt important cell-cell interactions. It is possible that antiprogestin therapy would prevent this downregulation.
  • progesterone regulated genes are involved in particular functional pathways. Groups of temporally regulated genes are often involved in the same pathway. For example, it was previously known that progesterone regulates genes involved in the steroid biosynthesis and trafficking pathways (Watari et al., Exp Cell Res 259, 247-56 (2000); Darnel et al., J Steroid Biochem Mol Biol 70:203-10 (1999); Arcuri et al., Endocrinology 137:595-600 (1996)), and the present investigators identify a cluster of such genes. However, less is known about the role of progesterone in regulating signaling pathways controlled by growth factors and cytokines.
  • the present inventors' data demonstrate for the first time, that progesterone plays an important role in regulating many genes involved in these signaling pathways.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Immunology (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Urology & Nephrology (AREA)
  • Microbiology (AREA)
  • Hematology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Cell Biology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Medicinal Chemistry (AREA)
  • Food Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Disclosed are expression profiles of genes that are regulated by progesterone receptors, and particularly by progesterone receptor isoforms PR-A and PR-B. Methods for using such genes to identify progesterone receptor agonist and antagonist ligands are described. Also described are methods for identifying isoform-specific progesterone receptor ligands, for identifying tissue-specific progesterone receptor ligands, and for determining the profile of genes regulated by progesterone receptors in a breast tumor sample. In addition, pluralities of polynucleotides from genes that are regulated by progesterone receptors are disclosed, as are pluralities of antibodies that selectively bind to proteins encoded by such genes.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of priority under 35 U.S.C. § 119(e) from U.S. Provisional Application Serial No. 60/214,870, filed Jun. 28, 2000, entitled “Surrogate Gene Markers for Two Different Progesterone Receptor Isoforms in Breast Cancer, and Their Use to Screen for Isoform-Selective Progestational Ligands”. The entire disclosure of U.S. Provisional Application Serial No. 60/214,780 is incorporated herein by reference.[0001]
  • FIELD OF THE INVENTION
  • This invention generally relates to expression profiles of genes that are regulated by progesterone receptors, and particularly by progesterone receptor isoforms PR-A and PR-B, and to the use of such genes in methods for identifying progesterone receptor agonist and antagonist ligands, including progesterone receptor isoform-specific ligands and tissue-specific ligands. This invention also relates to methods for determining the profile of genes regulated by progesterone receptors in a tissue sample. In addition, pluralities of polynucleotides transcribed from genes that are regulated by progesterone receptors are disclosed, as are pluralities of antibodies that selectively bind to proteins encoded by such genes. [0002]
  • BACKGROUND OF THE INVENTION
  • Progesterone is a natural reproductive hormone that targets the breast, uterus, ovaries, brain, bone, blood vessels, immune system, etc. Progestational agents are widely used for oral contraception, menopausal hormone replacement therapy, and cancer treatments. Antiprogestins, which are synthetic ligands that antagonize the actions of progesterone, are in clinical trials for contraception, for induction of labor, and to treat endometriosis, breast cancers and meningiomas. The actions of progesterone are varied and tissue-specific. Even in the normal breast it can have diverse effects: depending on the physiological state of the woman, progesterone can be proliferative, antiproliferative, or differentiative. Additionally, progesterone promotes the development of breast cancers and accelerates the growth of established breast cancers. For example, when used for hormone replacement therapy at menopause, progestins, which are synthetic progestational agents, increase the risk of breast cancer. Paradoxically, they are protective in the uterus and prevent endometrial cancers. [0003]
  • Progesterone, synthetic progestins, and antiprogestins all initially work through the same molecular pathway. These are low molecular weight, lipid soluble “ligands”. They enter target cells passively, and pass into the nucleus where they bind to progesterone receptors (PRs). Ligand binding activates the PR proteins, which then dimerize, bind to DNA at the promoters of progesterone target genes, and either up- or down-regulate transcription of these genes. There are two natural isoforms of PR, the A- and B-receptors, also referred to herein as PR-A and PR-B, respectively. The isoforms are derived from two distinct promoters in the single PR gene and are translated from separate translation initiation start sites. PR-B receptors are 933 amino acids in length, which is 164 amino acids longer at the N-terminus than PR-A, and contain a unique transcriptional activation function, AF-3 (Sartorius et al., Mol. Endocrinol. 8, 1347-1360 (1994)). Downstream of the additional 164 amino acids of PR-B, the two PRs have identical primary amino acid sequences. However, despite this close similarity, the two receptors have dramatically different abilities to activate transcription of progestin-responsive promoters in experimental model systems (Sartorius et al., Mol. Endocrinol. 8, 1347-1360 (1994); Meyer et al., J. Biol. Chem. 267, 10882-10887 (1992); Vegeto et al., Mol. Endocrinol. 7, 1244-1255 (1993); Tung et al., Mol. Endocrinol. 7, 1256-1265 (1993); Sartorius et al., J. Biol. Chem. 268, 9262-9266 (1993)). Progestin agonist-liganded PR-B are stronger transactivators than PR-A, although there are cell-type and promoter-dependent exceptions. The antiprogestin RU486 has mixed agonist/antagonist activity on PR-B but not PR-A. Instead, agonist or antagonist-liganded PR-A can dominantly inhibit PR-B and other members of the steroid receptor family, including estrogen receptors (ERs). Thus, PR-A are more likely to be transcriptional repressors than PR-B (Hovland et al., J Biol Chem 273, 5455-60 (1998); Vegeto et al., Mol. Endocrinol. 7, 1244-1255 (1993); McDonnell et al., J. Biol. Chem. 269, 11945-11949 (1994)). [0004]
  • Indirect data suggest that the two PR isoforms have physiologically different functions. They are unequally expressed in different tissues and physiological states. For instance, increasing ratios of PR-A to PR-B in the chick oviduct in late winter, or in aged, nonlaying hens, resulted in measurable decreases in PR functional activity (Boyd-Leinen et al., Endocrinology 111, 30-36 (1982); Spelsberg et al., Endocrinology 107, 1234-44 (1980)). There are stage-specific and region-specific variations in the PR-A:PR-B ratio in the developing rat brain (Kato et al., J Steroid Biochem Mol Biol 47, 173-82 (1993)) and studies in primates show that PR-B predominates in the estrogen treated hypothalamus, while expression of the PR-A isoform predominates in the pituitary (Baez et al., J Biol Chem 262, 6582-8 (1987); Bethea et al., Endocrinology 139, 677-87 (1998)). In the human endometrium, absolute levels and the ratio of PR-A to PR-B vary extensively during the menstrual cycle (Mote et al., Hum Reprod 15 Suppl 3, 48-56 (2000); Mote et al., J Clin Endocrinol Metab 84, 2963-71 (1999); Mangal et al., J Steroid Biochem Mol Biol 63, 195-202 (1997); Feil et al., Endocrinology 123, 2506-2513 (1988)). In addition, uncontrolled, mis- or over-expressed PR-B levels are associated with a highly malignant phenotype in endometrial, cervical and ovarian cancers (Farr et al., Mamm. Genome 4, 577-584 (1993); Fujimoto et al., J Steroid Biochem Mol Biol 62, 449-54 (1997)). [0005]
  • In the normal breast, progesterone is both proliferative and differentiative (reviewed in Clarke et al., Endocr. Rev. 11, 266-301 (1990)). Breast epithelium mitoses increase during the menstrual cycle and peak in the late luteal phase, coincident with high circulating levels of progesterone. Progesterone induces lobular-alveolar outgrowth during each menstrual cycle and during pregnancy induces further lobular-alveolar development in preparation for the terminal differentiative event of lactation. PR null mice exhibit incomplete mammary gland ductal branching and failure of lobulo-alveolar development, as well as failure to ovulate and to exhibit sexual behavior (Lydon et al., Genes Develop. 9, 2266-2278 (1995)). [0006]
  • Little is known about cyclic changes in PR-A and PR-B in the normal human breast. [0007]
  • However, in the mouse mammary gland, evidence supports a critical and unique role for each of the two PR isoforms. It has been reported that a 3:1 overexpression of PR-A over PR-B results in extensive mammary gland epithelial cell hyperplasia, excessive ductal branching, and a disorganized basement membrane; all features associated with neoplasia (Shyamala et al., Proc Natl Acad Sci U S A 95, 696-701 (1998)). In contrast, when PR-B is overexpressed, ductal growth prematurely arrests and inappropriate lobulo-alveolar formation is observed (Shyamala et al., Proc Natl Acad Sci U S A 97, 3044-9 (2000)). However, when the PR-A isoform was selectively knocked out, leaving only PR-B, the mammary gland appeared to develop normally in response to estradiol and progesterone. In contrast, decidualization of the endometrium and the normal antiproliferative effect of progesterone in the uterus were absent (Mulac-Jericevic et al., Science 289, 1751-4 (2000)). Such data indicate that PR-A and PR-B have different tissue-specific effects. [0008]
  • In human breast cancers the presence of PR in estrogen receptor (ER) positive tumors indicates that responsiveness to endocrine therapies is likely, while absence of PR is associated with hormone resistance; thus, PR are routinely measured in breast cancers as a guide to treatment (Horwitz et al., Recent Prog. Horm. Res. 41, 249-316 (1985); Horwitz et al., J Biol Chem 253, 8185-91 (1978); McGuire, Semin. Oncol. 5, 428-433 (1978)). PR are also direct targets of second-line progestin therapies in patients whose tumors have developed antiestrogen resistance (Kimmick et al., Cancer Treat Res 94, 231-54 (1998); Howell et al., Recent Results Cancer Res 152, 227-44 (1998)). Nothing is known, however, about the role of PR-A vs. PR-B in breast cancers. The PR-A to PR-B ratio was measured in 202 PR-positive human breast tumors (Graham et al., Cancer Res. 55, 5063-5068 (1995)). The majority had PR-A to PR-B ratios greater than one, and 33% had 3.7 times or more PR-A than PR-B. The functional significance of this is unknown. In breast cancer cell lines, overexpression of PR-A results in marked changes in morphology and loss of adherent properties (McGowan et al., Mol Endocrinol 13, 1657-71 (1999)). Thus, overexpression of PR-A, as seen in many breast tumors, may lead to suppression of PR-B, and may be associated with poor prognosis. However, there are no clinical data to support this conjecture. [0009]
  • Prior to the present invention, few, if any, endogenous genes differentially regulated by PR-A vs. PR-B were known in breast cancers or any other tissues. An excess of PR-A enhances the expression of SOX4 mRNA levels in breast cancer cells. Whether PR-B also regulates this gene is unknown. SOX4 induces DNA bending. PR-A enhance expression of the mouse multiple drug resistance (mdr) 1b gene, important for development of drug resistance in tumors. Whether this gene is regulated endogenously only by PR-A is unknown. To the present inventors' knowledge, no data on PR-B specific gene regulation in breast cancers (or any tissues) has been published prior to the present invention. Although certain of the genes listed in Table 8 below were previously known to be progesterone regulated, the PR isoform specificity of this regulation was not known. [0010]
  • Knowledge of the unique sets of genes that are selectively regulated by each PR isoform would serve as a surrogate marker for the presence and function of PR-A vs. PR-B in various tissue types and in various disease states. Furthermore, knowledge of such genes and their promoters, would serve as a tool for screening PR ligands, and particularly, PR-A vs. PR-B selective ligands. However, defining which sets of genes are uniquely regulated by one or the other PR isoform in breast cancers was impossible in progesterone target tissues because both PR-A and PR-B receptors are simultaneously present in those tissues, and are simultaneously activated by progesterone treatment. [0011]
  • SUMMARY OF THE INVENTION
  • One embodiment of the present invention relates to a method to identify agonist ligands of progesterone receptors. The method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand, wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b) indicates that the putative agonist ligand is a progesterone receptor agonist. [0012]
  • In one aspect, detection of upregulation of expression of at least one gene chosen from a gene in Table 1, or detection of downregulation of at least one gene chosen from a gene in Table 2, in the presence of the putative agonist ligand, indicates that the putative agonist ligand is a selective agonist of PR-A. In another aspect, detection of upregulation of expression of at least one gene chosen from a gene in Table 3, or detection of downregulation of at least one gene chosen from a gene in Table 4, in the presence of the putative agonist ligand, indicates that the putative agonist ligand is a selective agonist of PR-B. [0013]
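The decision rule in the two aspects above can be sketched in a few lines of Python. This is only an illustration: the function name, the "up"/"down"/"none" encoding, and the gene names are hypothetical placeholders (the actual gene lists are those of Tables 1-4), and treating a ligand that scores on both isoforms' tables as "not isoform-selective" is an assumption of the sketch rather than language from the text.

```python
def classify_agonist(changes, table1, table2, table3, table4):
    """Classify a putative agonist from observed expression changes.

    changes maps gene name -> "up", "down", or "none" (ligand vs. no ligand).
    table1 / table3: genes selectively upregulated via PR-A / PR-B.
    table2 / table4: genes selectively downregulated via PR-A / PR-B.
    """
    def hit(genes, direction):
        return any(changes.get(g) == direction for g in genes)

    pr_a = hit(table1, "up") or hit(table2, "down")
    pr_b = hit(table3, "up") or hit(table4, "down")
    if pr_a and pr_b:
        return "agonist, not isoform-selective"  # assumed label
    if pr_a:
        return "selective agonist of PR-A"
    if pr_b:
        return "selective agonist of PR-B"
    return "no agonist activity detected"

# Placeholder gene names standing in for entries of Tables 1-4.
T1, T2, T3, T4 = {"geneA1"}, {"geneA2"}, {"geneB1"}, {"geneB2"}
print(classify_agonist({"geneA1": "up", "geneB1": "none"}, T1, T2, T3, T4))
```

In practice the "up" and "down" calls would come from a measured comparison of expression in the presence and absence of the ligand, as in step (c).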
  • Another embodiment of the present invention relates to a method to identify antagonists of progesterone receptors. This method includes the steps of: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand, wherein detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b), indicates that the putative antagonist ligand is a progesterone receptor antagonist. The progesterone receptor can be activated by contacting the receptor with a compound that activates the receptor, the step of contacting being performed prior to, simultaneously with, or after the step of contacting of (a). [0014]
  • In one aspect of this embodiment, detection of inhibition of expression or downregulated expression of at least one gene chosen from a gene in Table 1 in the presence of the putative antagonist ligand as compared to the expression of the at least one gene in the presence of the compound that activates the progesterone receptor, or detection of inhibition of expression or upregulation of expression of at least one gene chosen from a gene in Table 2 in the presence of the putative antagonist ligand as compared to the expression of the at least one gene in the presence of the compound that activates the progesterone receptor, indicates that the putative antagonist ligand is a selective antagonist of PR-A. In another aspect, detection of inhibition of expression or downregulation of expression of at least one gene chosen from a gene in Table 3 in the presence of the putative antagonist ligand as compared to the expression of the at least one gene in the presence of the compound that activates the progesterone receptor, or detection of inhibition of expression or upregulation of expression of at least one gene chosen from a gene in Table 4, in the presence of the putative antagonist ligand as compared to the expression of the at least one gene in the presence of the compound that activates the progesterone receptor, indicates that the putative antagonist ligand is a selective antagonist of PR-B. [0015]
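One way to quantify the "inhibition or reversal" described above is to express the ligand's effect as a fraction of the change produced by the activating compound alone. The sketch below is an assumption-laden illustration, not a method stated in the text: the function name and the example expression values are hypothetical.

```python
def reversal_fraction(baseline, activated, with_ligand):
    """Fraction of the activator-induced change that the putative antagonist removes.

    baseline:    expression with no activator and no ligand
    activated:   expression with the activating compound alone
    with_ligand: expression with the activating compound plus the ligand
    Returns 0.0 for no effect and 1.0 for a full return to baseline;
    the same formula covers upregulated and downregulated genes.
    """
    induced = activated - baseline
    if induced == 0:
        raise ValueError("gene is not regulated by the activator")
    return (activated - with_ligand) / induced

# Hypothetical values: a gene induced 4-fold, and a gene repressed 4-fold.
print(round(reversal_fraction(1.0, 4.0, 1.5), 2))
print(round(reversal_fraction(4.0, 1.0, 3.4), 2))
```

A gene from a table of upregulated genes and a gene from a table of downregulated genes can then be scored on the same scale when comparing a ligand's effect on PR-A-regulated and PR-B-regulated genes.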
  • In each of the above-described methods, the at least one gene is selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7. In one embodiment, the method further includes a step of detecting expression of at least one gene chosen from the genes in Table 8. [0016]
  • In one aspect, step (b) includes detecting expression of: 11-beta-hydroxysteroid dehydrogenase type 2, tissue factor gene, PCI gene (plasminogen activator inhibitor 3), MAD-3 Ikβ-alpha, Niemann-Pick C disease (NPC1), platelet-type phosphofructokinase, phenylethanolamine n-methyltransferase (PNMT), transforming growth factor-beta 3 (TGF-beta3), Monocyte Chemotactic Protein 1, delta sleep inducing peptide (related to TSC-22), and estrogen receptor-related protein (hERRa1). In another aspect, step (b) includes detecting expression of: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family). In another aspect, step (b) includes detecting expression of phenylethanolamine n-methyltransferase (PNMT) adrenal medulla. In another aspect, step (b) includes detecting expression of proteasome-like subunit MECL-1. In another aspect, step (b) includes detecting expression of: growth arrest-specific protein and tissue factor gene. [0017]
  • In each of the above-described methods, the progesterone receptor can be PR-A, PR-B or both PR-A and PR-B. [0018]
  • In one aspect of the above-described methods, the step (b) of detecting comprises detecting expression of at least five genes from any one or more of the Tables 1-7. In another aspect, the step (b) of detecting comprises detecting expression of at least ten genes from any one or more of the Tables 1-7. In yet another aspect, the step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of the Tables 1-7. [0019]
  • In one aspect of the above-described methods, the progesterone receptor is expressed by a cell. In this aspect, the progesterone receptor is endogenously expressed by the cell or recombinantly expressed by the cell. In one embodiment, the cell is part of a tissue from a test animal. In this embodiment, the step of contacting is performed by administration of the putative agonist ligand to the test animal or to the tissue of the test animal. [0020]
  • In another aspect of the above-described methods, expression of the at least one gene is detected by measuring amounts of transcripts of the at least one gene before and after contact of the progesterone receptor with the putative agonist ligand. In one aspect, expression of the at least one gene is detected by detecting hybridization of at least a portion of the at least one gene or a transcript thereof to a nucleic acid molecule comprising a portion of the at least one gene or a transcript thereof in a nucleic acid array. In another aspect, expression of the at least one gene is detected by measuring expression of a reporter gene that is operatively linked to at least the regulatory region of the at least one gene. In another aspect, expression of the at least one gene is detected by detecting the production of a protein encoded by the at least one gene. [0021]
  • In yet another aspect of the above-described methods, the putative agonist ligand is a product of rational drug design. [0022]
  • Yet another embodiment of the present invention relates to a method to identify isoform-specific agonists of progesterone receptors. This method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand, wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b)(i) but not (b)(ii), indicates that the putative agonist ligand is a PR-A-specific agonist, and wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b)(ii) but not (b)(i), indicates that the putative agonist ligand is a PR-B-specific agonist. [0023]
  • Another embodiment of the present invention relates to a method to identify isoform-specific antagonists of progesterone receptors. This method includes the steps of: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand, wherein, in the presence of the putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b)(i) but not (b)(ii), indicates that the putative antagonist ligand is a PR-A-specific antagonist, and wherein, in the presence of the putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b)(ii) but not (b)(i), indicates that the putative antagonist ligand is a PR-B-specific antagonist. [0024]
  • In each of the above-described methods of identifying an isoform-specific regulator of progesterone receptors, the progesterone receptor can include PR-A, PR-B, or both PR-A and PR-B. The at least one gene is selected from the group consisting of: (i) at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and, (ii) at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4. In one aspect, the step (b) of detecting comprises detecting expression of at least five genes from any one or more of the Tables 1-4. In another aspect, the step (b) of detecting comprises detecting expression of at least ten genes from any one or more of the Tables 1-4. In yet another aspect, the step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of the Tables 1-4. [0025]
  • Another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor. This embodiment includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative agonist ligand under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated, wherein the progesterone receptor is the same isoform as the progesterone receptor contacted in (b); (d) detecting expression of the at least one gene from (a); (e) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand in each of the first and second tissue types, wherein detection of regulation of the expression of the at least one gene in one of the first or second tissue types in the manner associated with activation of the progesterone receptor as set forth in the expression profile of (a), and detection of inhibition of regulation or no regulation of the at least one gene in the other of the first or second tissue types, as compared to the expression of the at least one gene associated with activation of the progesterone receptor as set forth in the expression profile of (a), indicates that the putative agonist ligand is a tissue-specific progesterone receptor agonist. [0026]
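  • The comparison in step (e) of the tissue-specific agonist method can be sketched in code. The following Python sketch is illustrative only: the function names, the use of a with/without-ligand expression ratio, and the twofold cutoff are assumptions and are not specified in this disclosure.

```python
def direction(with_ligand, without_ligand, min_fold=2.0):
    """Return +1 (upregulated), -1 (downregulated) or 0 (no regulation).

    min_fold is an assumed cutoff; the disclosure does not fix a value.
    Both expression values are assumed to be positive.
    """
    ratio = with_ligand / without_ligand
    if ratio >= min_fold:
        return 1
    if ratio <= 1.0 / min_fold:
        return -1
    return 0


def is_tissue_specific_agonist(expected, tissue_1, tissue_2, min_fold=2.0):
    """expected: +1 or -1, the direction recorded in the profile of step (a).

    tissue_1, tissue_2: (with_ligand, without_ligand) expression pairs.
    True when exactly one tissue shows the regulation set forth in the profile.
    """
    hit_1 = direction(*tissue_1, min_fold) == expected
    hit_2 = direction(*tissue_2, min_fold) == expected
    return hit_1 != hit_2
```

  • In this sketch a ligand that regulates the gene in both tissue types, or in neither, is not reported as tissue-specific.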
  • Yet another embodiment relates to a method to identify a tissue-specific antagonist of a progesterone receptor. This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (d) detecting expression of the at least one gene from (a); and, (e) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand in each of the first and second tissue types, wherein detection of regulation of the expression of the at least one gene in one of the first or second tissue types in the manner associated with activation of the progesterone receptor as set forth in the expression profile of (a) in the presence of the putative antagonist ligand, and detection of inhibition or reversal of regulation of expression of the at least one gene in the other of the first or second tissue types in the presence of the putative antagonist ligand, as compared to the expression of the at least one gene associated with activation of the progesterone receptor as set forth in the expression profile of (a), indicates that the putative antagonist ligand is a tissue-specific progesterone receptor antagonist. [0027]
  • Another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor. This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first tissue type but not a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by the first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) detecting expression of the at least one gene from (a); (d) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand in the first tissue type, wherein detection of regulation of the expression of the at least one gene in the first tissue type in the manner associated with activation of the progesterone receptor as set forth in the expression profile of (a) indicates that the putative agonist ligand is a tissue-specific progesterone receptor agonist for the first tissue type. [0028]
  • Yet another embodiment of the present invention relates to a method to identify a tissue-specific antagonist of a progesterone receptor. This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first but not in a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) detecting expression of the at least one gene from (a); and, (d) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand in the first tissue type, wherein detection of inhibition or reversal of regulation of expression of the at least one gene in the first tissue type in the presence of the putative antagonist ligand, as compared to the expression of the at least one gene associated with activation of the progesterone receptor as set forth in the expression profile of (a), indicates that the putative antagonist ligand is a tissue-specific progesterone receptor antagonist of the first tissue type. [0029]
  • In each of the above-described methods to identify a tissue-specific regulator of a progesterone receptor, in one aspect, the first tissue type is breast, and wherein the at least one gene is selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7. In one aspect, the second tissue type is selected from the group consisting of breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth. In one aspect, the first tissue type is a non-malignant tissue and wherein the second tissue type is a malignant tissue from the same tissue source as the first tissue type. A preferred tissue source is breast tissue. In another aspect, the first tissue type is a normal tissue and wherein the second tissue type is a non-malignant, abnormal tissue. [0030]
  • In each of the above-described methods for identifying a tissue-specific regulator of a progesterone receptor, the expression profile of genes regulated by a progesterone receptor in the first or second tissue type can be provided by a method comprising: (a) providing a first cell of a selected tissue type that expresses a progesterone receptor A (PR-A) and not a progesterone receptor B (PR-B) and a second cell of the same tissue type that expresses PR-B and not PR-A; (b) stimulating the progesterone receptors in (a) by contacting the first and second cells with a progesterone receptor stimulatory ligand; (c) detecting expression of genes by the first and second cells in the presence of the stimulatory ligand and in the absence of the stimulatory ligand, wherein a difference in the expression of a gene in the presence of the stimulatory ligand as compared to in the absence of the stimulatory ligand, indicates that the gene is regulated by the progesterone receptor in the selected tissue type. [0031]
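  • The profile-building steps (a) to (c) above can be sketched as follows. This Python sketch is a minimal illustration under stated assumptions: the function name, the gene names `geneX` and `geneY`, the expression values, and the twofold cutoff are all hypothetical and do not come from this disclosure.

```python
def build_profile(untreated, treated, min_fold=2.0):
    """Map each regulated gene to "up" or "down".

    untreated, treated: dicts of gene name -> positive expression level,
    measured without and with the stimulatory ligand.
    min_fold is an assumed cutoff, not a value from the disclosure.
    """
    profile = {}
    for gene, base in untreated.items():
        if gene not in treated:
            continue  # skip genes not measured in both conditions
        ratio = treated[gene] / base
        if ratio >= min_fold:
            profile[gene] = "up"
        elif ratio <= 1.0 / min_fold:
            profile[gene] = "down"
    return profile


# Made-up expression levels for a PR-A-only cell and a PR-B-only cell.
pra_profile = build_profile({"geneX": 10.0, "geneY": 40.0},
                            {"geneX": 35.0, "geneY": 42.0})
prb_profile = build_profile({"geneX": 12.0, "geneY": 40.0},
                            {"geneX": 11.0, "geneY": 8.0})
```

  • Genes whose expression changes by less than the assumed cutoff are left out of the profile, which corresponds to no detected difference between the two conditions.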
  • Another embodiment of the present invention relates to a method to determine the profile of genes regulated by progesterone receptors in a breast tumor sample. This method includes the steps of: (a) obtaining from a patient a breast tumor sample; (b) detecting expression of at least one gene in the breast tumor sample that is regulated by a progesterone receptor when the progesterone receptor is activated; and, (c) producing a profile of genes for the tumor sample that are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B. The at least one gene is selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 9; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 10; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 11; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 12; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 13; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 14; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 15. [0032]
  • Yet another embodiment of the present invention relates to a plurality of polynucleotides for the detection of the expression of genes regulated by progesterone receptors in breast tissue. The plurality of polynucleotides consists of polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that are regulated by progesterone receptors. The plurality of polynucleotides also comprises polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes selected from the group consisting of: (a) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (b) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (c) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (d) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (e) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (f) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (g) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7. [0033]
  • In one aspect, the polynucleotide probes are immobilized on a substrate. In another aspect, the polynucleotide probes are hybridizable array elements in a microarray. In another aspect, the polynucleotide probes are conjugated to detectable markers. In yet another aspect, the plurality of polynucleotides further comprises at least one polynucleotide probe that is complementary to RNA transcripts, or nucleotides derived therefrom, of at least one gene chosen from the genes in Table 8. [0034]
  • Another embodiment of the present invention relates to a plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated by progesterone receptors in breast tissue. The plurality of antibodies, or antigen binding fragments thereof, consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated by progesterone receptors. The plurality of antibodies, or antigen binding fragments thereof, also comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes selected from the group consisting of: (a) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (b) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (c) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (d) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (e) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (f) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (g) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7. [0035]
  • In one aspect, the plurality of antibodies, or antigen binding fragments thereof, further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by a gene chosen from the genes in Table 8. [0036]
  • Another embodiment of the present invention relates to a method to identify genes that are regulated by a progesterone receptor in two or more tissue types. This method includes the steps of: (a) activating a progesterone receptor in two or more tissue types that express the progesterone receptor; (b) detecting expression of at least one gene in the two or more tissue types, the at least one gene being chosen from a gene in any one or more of Tables 1-7; and, (c) identifying genes that are regulated by the progesterone receptor in each of the two or more tissue types. This method can further include the step of detecting whether the genes are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B. [0037]
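  • Step (c) of this method amounts to finding the genes that appear in the regulated set of every tissue type examined. A minimal Python sketch follows; the function name, the per-tissue dictionary layout, and the gene names `g1` to `g3` are hypothetical.

```python
def shared_regulated_genes(profiles):
    """profiles: dict of tissue name -> {gene: "up" or "down"}.

    Returns {gene: {tissue: direction}} for genes regulated in every tissue.
    """
    if not profiles:
        return {}
    # Genes present in the regulated set of each tissue type.
    common = set.intersection(*(set(p) for p in profiles.values()))
    return {g: {t: p[g] for t, p in profiles.items()} for g in sorted(common)}


example = shared_regulated_genes({
    "breast": {"g1": "up", "g2": "down"},
    "uterus": {"g1": "up", "g3": "up"},
})
```

  • Keeping the direction per tissue type, rather than only the gene name, preserves the case in which a gene is regulated in every tissue but in different directions.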
  • Another embodiment of the present invention relates to a method to regulate the expression of a gene selected from the group consisting of any one or more of the genes in Tables 1-7. The method includes administering to a cell that expresses a progesterone receptor a compound selected from the group consisting of: progesterone, a progestin, and an antiprogestin, wherein the compound is effective to regulate the expression of the gene. In one embodiment, the gene is selected from the group consisting of: growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family). In another embodiment, the cell that expresses a progesterone receptor is in the breast tissue of a patient that has, or is at risk of developing, breast cancer. [0038]
  • DETAILED DESCRIPTION OF THE INVENTION
  • The present invention generally relates to the identification of a large number of genes that are regulated by progesterone receptors, and particularly, to the identification of how these genes are regulated by the progesterone receptor isoforms, PR-A and PR-B. Using the gene expression profiles disclosed herein, one can identify novel ligands of progesterone receptors (both progestin-like agonists and anti-progestin-like antagonists) that regulate progesterone receptors, including in an isoform-specific and/or tissue specific manner. In addition, these genes can be used to profile individuals that have been diagnosed with breast cancer to enhance the ability of the clinician to develop a prognosis and treatment protocols for the individual patient. The genes can also be used to profile the progesterone receptor regulated gene expression in tissue types other than breast tissue. Moreover, given the knowledge of these genes, one can produce novel combinations of polynucleotides and/or antibodies and/or peptides for use in progestational drug screening assays or expression profiling of patient samples. [0039]
  • The present inventors have generated model systems to study PRs in breast cancer cells that are unique to the present inventors' laboratory. In most target tissues, including the breast and uterus, PRs are induced by estradiol. Thus, one can only study progestin actions in the background of an estrogenized system. This makes it virtually impossible to dissect out responses that are due to progesterone from those that are due to estrogens. Furthermore, all these target tissues contain both PR-A and PR-B. This makes it impossible to dissect out the effects of each PR isoform independently. The T47Dco breast cancer cells are unique to the present inventors' laboratory. They have retained PR and express both PR-A and PR-B at equal levels (Horwitz et al., Cell 28, 633-42 (1982)). However, the PRs in these cells are constitutively regulated without estrogens. In order to study differential gene regulation by the two PR isoforms independently, the present inventors constructed a model system in which a PR-negative subline (termed T47D-Y) was derived from T47Dco breast cancer cells. T47D-Y cells were then engineered to stably express either PR-B (termed T47D-YB) or PR-A (termed T47D-YA) at equal levels to each other and to the parental T47Dco cells (Sartorius et al., Cancer Res. 54, 3868-3877 (1994)). The present inventors have now used these three new cell lines to analyze progesterone-responsive gene regulation via PR-B or PR-A (with PR negative T47D-Y cells serving as a control) using Affymetrix™ microarray HFL6800 gene expression chips and Atlas™ Human cDNA Expression Arrays. In addition to confirming the regulation of the few known progesterone-responsive genes, the present inventors have identified many genes not previously known to be regulated by PR. Importantly, the results described herein now allow discrimination of genes that are regulated uniquely by PR-B from genes that are uniquely regulated by PR-A. It was found that PR-B regulates more genes than PR-A in response to progesterone, but that a number of genes are uniquely regulated by PR-A. Many of these results have been confirmed by northern blot analysis or RT-PCR of the gene transcripts, or by western blot analyses of the protein products. The data presented herein demonstrate that the two PR isoforms do indeed have unique roles in gene regulation in breast cancer cells. Lastly, the present inventors have observed that the expression levels of a subset of genes are modified by the presence of PR in a ligand-independent fashion. [0040]
  • Genes Regulated by Progesterone Receptors: [0041]
  • Of the more than 6000 human genes screened, the present inventors have identified multiple genes, the expression of which is regulated by progesterone receptors. The genes can be grouped into categories based on the regulation of expression of the genes by the progesterone receptor isoforms, PR-A and PR-B. More particularly, the genes have been grouped into the following main categories: (1) Genes that are selectively (i.e., exclusively or uniquely) upregulated by PR-A (Tables 1 and 9); (2) genes that are selectively downregulated by PR-A (Tables 2 and 10); (3) genes that are selectively upregulated by PR-B (Tables 3 and 11); (4) genes that are selectively downregulated by PR-B (Tables 4 and 12); (5) genes that are upregulated or downregulated in the same direction by both PR-A and PR-B (Tables 5 and 13); (6) genes that are reciprocally regulated by PR-A and PR-B (Tables 6 and 14); and (7) genes that are regulated by one of the isoforms, wherein such regulation is altered when the other isoform is present (e.g., the expression of the gene is either up- or downregulated in the presence of both receptors relative to the expression level of the gene in the presence of only one receptor) (Tables 7 and 15). In this last category, the gene is characterized in that the regulation of expression of the gene by one isoform is altered or suppressed by the presence of the other isoform. It is noted that genes in this last category can also fall within one of the other 6 categories. Tables 1-7 include all genes that were newly discovered to be regulated by progesterone receptors by the present inventors. Tables 9-15 include all of the genes from Tables 1-7, respectively, and additionally include the genes that were identified by the present inventors that had previously been identified to be regulated generally by progesterone. This particular subset of genes (i.e., previously known to be regulated by progesterone) is also set forth separately in Table 8. 
It is noted that even though the genes in Table 8 were known to be regulated by progesterone, the isoform specificity of these genes was not previously known. Therefore, the identification of the PR isoform regulation of the genes in Table 8 is novel. Other categories of the genes identified in the present invention are as follows: Table 16 is a list of genes identified in the present invention which were previously known to be involved in breast cancer or in the development of mammary tissue. Table 17 is a list that categorizes the genes shown to be regulated by progesterone by the present inventors into functional categories based on GeneCard information as well as extensive literature reviews of each gene product. Table 18 (See Example 1) shows the cumulative results of the gene array analysis with regard to the PR-B-expressing cells described in the Examples. Table 19 (See Example 1) shows the cumulative results of the gene array analysis with regard to the PR-A-expressing cells described in the Examples. [0042]
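  • The first six categories described above follow directly from the direction of regulation observed with each isoform alone. The mapping can be sketched in Python as below; the function name and the "up"/"down"/None encoding are assumptions made for illustration. Category (7) is not covered, because it additionally requires a measurement from cells expressing both isoforms.

```python
def categorize(pra, prb):
    """Return the category number (1-6) for a gene, or None if unregulated.

    pra, prb: "up", "down" or None, the direction of regulation observed
    in cells expressing only PR-A or only PR-B, respectively.
    """
    if pra is None and prb is None:
        return None
    if prb is None:
        return 1 if pra == "up" else 2   # selectively regulated by PR-A
    if pra is None:
        return 3 if prb == "up" else 4   # selectively regulated by PR-B
    return 5 if pra == prb else 6        # same direction vs. reciprocal
```

  • Under the grouping described above, categories 1 to 6 correspond to Tables 1 to 6 for the newly identified genes and to Tables 9 to 14 for the full gene lists.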
  • Accordingly, in one embodiment of the present invention, the genes identified as being regulated by progesterone receptors by the present inventors can be used as endpoints or markers in a method to identify ligands that regulate progesterone receptor activity. According to the present invention, in general, the biological activity or biological action of a protein such as a progesterone receptor refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions). In particular, the biological activity of a progesterone receptor that is of interest herein includes the effect of the receptor, particularly when the receptor is activated, on the expression of the downstream genes identified in the present invention. According to the present invention, a “downstream gene” or “endpoint gene” is any gene, the expression of which is regulated (up or down) by a progesterone receptor (PR-A and/or PR-B). The expression of the gene is typically regulated by the progesterone receptor when it is activated, although the expression of the gene may be regulated by the progesterone receptor in the absence of a stimulatory compound (i.e., the regulation may be ligand independent, or constitutive). Pharmaceutical companies are keenly interested in screening their vast libraries of chemical compounds for ones that bind to (ligands), and either activate or inhibit, progesterone receptors. Selected sets of one, two, three, or more of the genes (up to the number equivalent to all of the genes) of this invention can be used as end-points for rapid through-put screening of ligands that specifically and selectively influence the activity of PR-A and/or PR-B. The ligands can be either agonists or antagonists of the progesterone receptor. [0043]
  • As used herein, the phrase “PR agonist ligand” or “PR agonist” refers to any compound that interacts with a PR and elicits an observable response. More particularly, a PR agonist can include, but is not limited to, steroidal or non-steroidal compounds; a protein, peptide, or nucleic acid that selectively binds to and activates or increases the activation of a progesterone receptor; and most commonly includes progesterone, progesterone analogs, and any suitable product of drug design (e.g., a mimetic of progesterone, or a synthetic progestin) which is characterized by its ability to agonize (e.g., stimulate, induce, increase, enhance) the biological activity of a naturally occurring progesterone receptor in a manner similar to the natural agonist, progesterone (e.g., by interaction/binding with and/or direct or indirect activation of a progesterone receptor). It is noted that the term “progestin” as used herein is generally intended to include progesterone as well as any progesterone analog, such as a synthetic progestin. Since the progesterone receptor is an intracellular receptor, a suitable agonist typically does not include an antibody or antigen binding fragment thereof, but to the extent that an antibody that selectively binds to and activates or increases the activation of a progesterone receptor can be designed and implemented as an agonist, such a compound is also contemplated. It is noted that the effect of the action of a given PR agonist on the expression of a downstream gene may be the downregulation of the gene or the suppression of the expression of a gene (e.g., when both isoforms of PR are present). Moreover, the action of the agonist on a PR may have undesirable consequences in one tissue type and beneficial consequences in another tissue type. 
However, the term agonist is intended to refer to the ability of the ligand to act on a progesterone receptor in a manner that is substantially similar to the action of the natural PR ligand, progesterone, on the progesterone receptor (described in more detail below). Typically, a PR agonist is identified under conditions wherein, in the absence of the agonist, the PR receptor is not activated, or is at least believed not to be in the presence of a compound that is known to activate the receptor, such as the natural ligand progesterone or a known progestin. [0044]
  • The phrase, “PR antagonist ligand” or “PR antagonist” refers to any compound which inhibits the effect of a PR agonist, as described above. More particularly, a PR antagonist is capable of associating with a progesterone receptor such that the biological activity of the receptor is decreased (e.g., reduced, inhibited, blocked, reversed, altered) in a manner that is antagonistic (e.g., against, a reversal of, contrary to) to the action of the natural agonist, progesterone, on the receptor. Such a compound can include, but is not limited to, steroidal or non-steroidal compounds; a protein, peptide, or nucleic acid that selectively binds to and blocks access to the receptor by a natural or synthetic agonist ligand or reduces or inhibits the activity of a progesterone receptor; or a product of drug design that blocks the receptor or alters the biological activity of the receptor (e.g., an antiprogestin, which antagonizes the actions of progesterone). Again, since the progesterone receptor is an intracellular receptor, antibody antagonists are typically not practical, although if appropriate and feasible, their use is contemplated herein. It is noted that the action of a given PR antagonist on a given downstream gene via a PR may be to actually upregulate the gene. Moreover, the action of the antagonist on a PR may have undesirable consequences in one tissue type and beneficial consequences in another tissue type. However, the term antagonist is intended to refer to the ability of the ligand to act on a progesterone receptor in a manner that is antagonistic to the action of the natural PR ligand, progesterone, or a synthetic PR agonist, on the progesterone receptor. Typically, an antagonist is identified under control conditions wherein, in the absence of the antagonist, the progesterone receptor is stimulated, such as by the natural ligand, progesterone, or by any suitable progestin. 
In one embodiment, a PR antagonist can be identified by its ability to alter the regulation of downstream genes by the receptor in the absence of a known stimulator of the receptor. In this embodiment, ligand-independent regulators of progesterone receptor function can be identified by detecting effects on genes that are constitutively regulated by PR in the ligand-unactivated state. [0045]
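  • The antagonist readout described above (inhibition or reversal of agonist-driven regulation) can be sketched as a three-condition comparison. This Python sketch is a coarse illustration: the function name, the returned labels, and the twofold cutoff are assumptions, and a partial reduction that stays above the cutoff is reported as "unaffected".

```python
def antagonist_effect(baseline, agonist_only, agonist_plus_test, min_fold=2.0):
    """Classify how a test compound changes agonist-driven regulation of a gene.

    baseline: expression with no ligand; agonist_only: with a known agonist;
    agonist_plus_test: with the agonist and the putative antagonist.
    All values are assumed positive; min_fold is an assumed cutoff.
    Returns "not regulated", "unaffected", "inhibited" or "reversed".
    """
    def direction(value):
        ratio = value / baseline
        if ratio >= min_fold:
            return 1
        if ratio <= 1.0 / min_fold:
            return -1
        return 0

    expected = direction(agonist_only)
    if expected == 0:
        return "not regulated"  # gene is uninformative in this assay
    observed = direction(agonist_plus_test)
    if observed == expected:
        return "unaffected"
    return "inhibited" if observed == 0 else "reversed"
```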
  • According to the present invention, agonists and antagonist ligands can include any regulatory ligand or compound that has the above-mentioned characteristics with regard to regulation of a progesterone receptor. For example, agonists and antagonists can include steroidal and non-steroidal compounds, proteins and peptides, nucleic acid molecules, antibodies, and/or mimetics (e.g., products of drug design or combinatorial chemistry). [0046]
  • Natural sex steroid hormone agonists are low molecular weight ringed cyclopentanophenanthrene compounds that in mammals include progesterone, estrogens and androgens. Steroid agonists can be extracted from a variety of natural sources, including the ovaries and testes. With the aim of enhancing the properties of natural steroid compounds, researchers have modified the cyclopentanophenanthrene structures and/or altered the substituent side-chains to generate semi-synthetic and synthetic steroidal and non-steroidal compounds. Non-steroidal compounds lack the classical cyclopentanophenanthrene structure. Nevertheless, all of these compounds (natural, semi-synthetic and synthetic, steroidal and non-steroidal) bind to their respective nuclear receptors. Modified compounds can be either agonists or antagonists. [0047]
  • Progesterone is the natural “progestin” produced by the ovaries and adrenal glands of mammals. Semi-synthetic or synthetic analogs that have progesterone-like effects can be either steroidal or non-steroidal. They are also included in the generic category called “progestins.” Natural, semi-synthetic or synthetic progestins bind to intracellular, usually intranuclear, progesterone receptors. Such progestins can be either “agonists” or “antagonists” (antiprogestins). Both agonists and antagonists can have variable levels of activity on the receptors. An agonist can be strong or weak with many levels in between. An antagonist can also be strong or weak. Some antagonists may have “mixed” agonist/antagonist properties. The present invention can screen for all of these types of progestins. [0048]
  • Other compounds in addition to steroidal and non-steroidal compounds can bind progesterone receptors. These include proteins and peptides, and nucleic acids and fragments thereof. Any compound that binds a receptor can be classified as a “ligand” of the receptor. If the ligand influences the activity of the progesterone receptor, the present invention can be used to screen for such ligand(s). [0049]
  • An isolated protein, according to the present invention, is a protein (including a peptide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include purified proteins, partially purified proteins, recombinantly produced proteins, and synthetically produced proteins, for example. As such, “isolated” does not reflect the extent to which the protein has been purified. An isolated protein useful as an antagonist or agonist according to the present invention can be isolated from its natural source, produced recombinantly or produced synthetically. Smaller peptides useful as regulatory ligands are typically produced synthetically by methods well known to those of skill in the art. Regulatory ligands of the present invention can also include an antibody or antigen binding fragment that selectively binds to a progesterone receptor. [0050]
  • According to the present invention, the phrase “selectively binds to” refers to the ability of an antibody, antigen binding fragment or other binding partner (protein, peptide, nucleic acid) to preferentially bind to specified proteins. More specifically, the phrase “selectively binds” refers to the specific binding of one protein to another, wherein the level of binding, as measured by any standard assay, is statistically significantly higher than the background control for the assay. [0051]
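  • The definition above requires binding that is statistically significantly higher than the background control but does not name a test. As one possible illustration only, the following Python sketch uses an exact one-sided permutation test on the difference of means; the choice of test, the function name, and the replicate values in the usage note are assumptions and are not part of this disclosure.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(signal, background):
    """One-sided exact permutation test on the difference of means.

    Returns the fraction of relabelings whose mean difference is at least
    as large as the observed (signal minus background) difference.
    Intended for small replicate counts, since every split is enumerated.
    """
    pooled = list(signal) + list(background)
    n = len(signal)
    observed = mean(signal) - mean(background)
    count = 0
    total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        group = [pooled[i] for i in idx]
        rest = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if mean(group) - mean(rest) >= observed - 1e-12:
            count += 1
        total += 1
    return count / total
```

  • For example, with four hypothetical binding replicates [10, 11, 12, 13] and four background replicates [1, 2, 3, 4], only the observed labeling out of 70 possible splits reaches the observed difference, so the p-value is 1/70.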
  • Agonists and antagonists that are products of drug design can be produced using various methods known in the art. Various methods of drug design useful for designing mimetics or other regulatory compounds for the present invention are disclosed in Maulik et al., 1997, Molecular Biotechnology: Therapeutic Applications and Strategies, Wiley-Liss, Inc., which is incorporated herein by reference in its entirety. A PR agonist or antagonist can be obtained, for example, from molecular diversity strategies (a combination of related strategies allowing the rapid construction of large, chemically diverse molecule libraries), libraries of natural or synthetic compounds, in particular from chemical or combinatorial libraries (i.e., libraries of compounds that differ in sequence or size but that have similar building blocks) or by rational, directed or random drug design. See for example, Maulik et al., supra. [0052]
  • In a molecular diversity strategy, large compound libraries are synthesized, for example, from peptides, oligonucleotides, natural or synthetic steroidal compounds, carbohydrates and/or natural or synthetic organic and non-steroidal molecules, using biological, enzymatic and/or chemical approaches. The critical parameters in developing a molecular diversity strategy include subunit diversity, molecular size, and library diversity. The general goal of screening such libraries is to utilize sequential application of combinatorial selection to obtain high-affinity ligands for a desired target, and then to optimize the lead molecules by either random or directed design strategies. Methods of molecular diversity are described in detail in Maulik, et al., ibid. [0053]
  • As used herein, the term “mimetic” is used to refer to any natural or synthetic steroidal compound, peptide, oligonucleotide, carbohydrate and/or natural or synthetic organic and non-steroidal molecule that is able to mimic the biological action of a naturally occurring or known synthetic progestin. [0054]
  • Methods and Products of the Present Invention: [0055]
  • One embodiment of the present invention relates to a method to identify agonist ligands of progesterone receptors. This method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand, wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b) indicates that the putative agonist ligand is a progesterone receptor agonist. The gene can include any one or more of any of the following genes: (i) one or more of the genes that are selectively upregulated by PR-A chosen from a gene in Table 1; (ii) one or more of the genes that are selectively downregulated by PR-A chosen from a gene in Table 2; (iii) one or more of the genes that are selectively upregulated by PR-B chosen from a gene in Table 3; (iv) one or more of the genes that are selectively downregulated by PR-B chosen from a gene in Table 4; (v) one or more of the genes that are upregulated or downregulated in the same direction by both PR-A and PR-B chosen from a gene in Table 5; (vi) one or more of the genes that are reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) one or more of the genes that are regulated by one of either PR-A or PR-B, wherein the regulation of the gene is altered when the other of the PR-A or PR-B is present, such a gene being chosen from a gene in Table 7. [0056]
  • Another embodiment of the present invention relates to a method to identify antagonists of progesterone receptor. This method includes the steps of: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated (i.e., before, simultaneously with or after the contact of the receptor with the putative regulatory ligand); (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and, (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand. Detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (b), indicates that the putative antagonist ligand is a progesterone receptor antagonist. 
The gene(s) to be detected in step (b) are chosen from one or more of the following genes: (i) one or more of the genes that are selectively upregulated by PR-A chosen from a gene in Table 1; (ii) one or more of the genes that are selectively downregulated by PR-A chosen from a gene in Table 2; (iii) one or more of the genes that are selectively upregulated by PR-B chosen from a gene in Table 3; (iv) one or more of the genes that are selectively downregulated by PR-B chosen from a gene in Table 4; (v) one or more of the genes that are upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (vi) one or more of the genes that are reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (vii) one or more of the genes that are regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7. In one embodiment, the progesterone receptor is activated by contacting the receptor with a compound that activates the receptor, the step of contacting being performed prior to, simultaneously with, or after the step of contacting of (a). [0057]
  • The steps of the method of the present invention will now be described in some detail for these embodiments of the invention; however, this discussion generally applies to other methods of identifying various ligands of progesterone receptors as described below. [0058]
  • As used herein, the term “putative regulatory compound” or “putative regulatory ligand” refers to compounds having an unknown regulatory activity, at least with respect to the ability of such compounds to regulate progesterone receptors as described herein. [0059]
  • In the method of identifying a regulatory ligand (i.e., an agonist or an antagonist) according to the present invention, the method can be a cell-based assay, or non-cell-based assay. In one embodiment, the progesterone receptor is expressed by a cell (i.e., a cell-based assay). In another embodiment the progesterone receptor is in a cell lysate, is in isolated cell nuclei, or is purified or produced free of cells. The progesterone receptor can be a PR-A, a PR-B, or a combination of PR-A and PR-B. One advantage of the present invention is that, given the knowledge of the isoform regulation of the various downstream genes disclosed herein, one can screen for ligands of the progesterone receptor, including screening for isoform specific ligands, using cells that express both receptors. Prior to the present invention, it was impossible to distinguish between the effects of one isoform or the other, because most cells express both isoforms. [0060]
  • In one embodiment, the conditions under which a receptor according to the present invention is contacted with a putative regulatory ligand, such as by mixing, are conditions in which the receptor is not stimulated (activated) if essentially no regulatory ligand is present. For example, such conditions include normal culture conditions in the absence of a known stimulatory compound (a stimulatory compound being, for example, the natural ligand for the receptor (progesterone), a stimulatory peptide, or other equivalent stimulus, such as a synthetic progestin). The putative regulatory ligand is then contacted with the receptor. In this embodiment, the step of detecting is designed to indicate whether the putative regulatory ligand alters the biological activity of the receptor as compared to the activity in the absence of the putative regulatory ligand (i.e., the background level), as determined by the effects of the contact between the ligand and the receptor on the expression of downstream genes as described herein. [0061]
  • In an alternate embodiment, the conditions under which a progesterone receptor according to the present invention is contacted with a putative regulatory ligand, such as by mixing, are conditions in which the receptor is normally stimulated (activated) if essentially no regulatory ligand is present. Such conditions can include, for example, contact of said receptor with a stimulator molecule (a stimulatory compound being, e.g., the natural ligand for the receptor (progesterone), a stimulatory peptide, or other equivalent stimulus, such as a synthetic progestin) which binds to the receptor and causes the receptor to become activated. In this embodiment, the putative regulatory ligand can be contacted with the receptor prior to, or simultaneously with, the contact of the receptor with the stimulatory compound (e.g., to determine whether the putative regulatory ligand blocks or otherwise inhibits the stimulation of the progesterone receptor by the stimulatory compound), or after contact of the receptor with the stimulatory compound (e.g., to determine whether the putative regulatory ligand downregulates, or reduces the activation of the receptor). [0062]
  • The present methods involve contacting the progesterone receptor with the ligand being tested for a sufficient time to allow for interaction, activation or inhibition of the receptor by the ligand. The period of contact with the ligand being tested can be varied depending on the result being measured, and can be determined by one of skill in the art. For example, for binding assays, a shorter time of contact with the compound being tested is typically suitable, than when activation is assessed, and particularly, when the expression of downstream genes is assessed. The methods of the present invention detect the expression of downstream genes and therefore, the time of incubation is dependent upon the time required to achieve expression of the downstream genes. Such a time period is typically at least 2 hours, and more preferably at least 4 hours, and more preferably at least 6 hours, although the time can be extended, if necessary to detect expression of a selected downstream gene. As used herein, the term “contact period” refers to the time period during which the progesterone receptor is in contact with the ligand being tested. The term “incubation period” refers to the entire time during which the cells expressing the receptor, for example, are allowed to grow prior to evaluation, or the time during which genes affected by activation of the progesterone receptor are allowed to be expressed, and such time period can be inclusive of the contact period. Thus, the incubation period includes all of the contact period and may include a further time period during which the compound being tested is not present, or is no longer being supplied to the receptor, but during which gene expression is continuing (in the case of a cell based assay) prior to scoring. 
The incubation time for growth of cells can vary but is sufficient to allow for the binding of the progesterone receptor, the activation or inhibition of the receptor, and the effect on the expression of the downstream genes regulated by the receptor. It will be recognized that shorter incubation times are preferable because compounds can be more rapidly screened. [0063]
  • In accordance with the present invention, a cell-based assay is conducted under conditions which are effective to screen for regulatory compounds useful in the method of the present invention. Effective conditions include, but are not limited to, appropriate media, temperature, pH and oxygen conditions that permit the growth of the cell that expresses the receptor. An appropriate, or effective, medium refers to any medium in which a cell that naturally or recombinantly expresses a progesterone receptor, when cultured, is capable of cell growth and expression of the progesterone receptor. Such a medium is typically a solid or liquid medium comprising growth factors and assimilable carbon, nitrogen and phosphate sources, as well as appropriate salts, minerals, metals and other nutrients, such as vitamins. Culturing is carried out at a temperature, pH and oxygen content appropriate for the cell. Such culturing conditions are within the expertise of one of ordinary skill in the art. Exemplary cells expressing progesterone receptors are described in the Examples, and in detail in Sartorius et al., Cancer Res. 54, 3868-3877 (1994). [0064]
  • Cells that are useful in the cell-based assays of the present invention include any cell that expresses a progesterone receptor of the isoform A, isoform B, or a combination of PR-A and PR-B. Such cells include cells that naturally express progesterone receptors, or cells that express progesterone receptors by recombinant technology. Such cells preferably include, but are not limited to, mammalian cells, which can originate from the breast or any other tissue. For example, tissues containing cells that are known to express the progesterone receptor naturally include, but are not limited to, breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth. Cells suitable for use in a cell-based assay include normal or malignant cells, as well as cells that are not malignant, but which are abnormal, such as cells from a non-malignant tissue that is otherwise diseased (e.g., tissues from endometriosis and leiomyoma of the uterus, fibrocystic disease of the breast, polycystic ovary). Other suitable cells are cells that express PR-A, PR-B, or both isoforms, as a result of recombinant technology. Such cells were used to discover the PR downstream genes of the present invention and are described in detail in Sartorius et al., Cancer Res. 54, 3868-3877 (1994). Other suitable cells are cells that express a PR-A and/or a PR-B transgene (i.e., cells isolated from a transgenic animal), or cells that have a germline deletion of one of the PR isoforms, but not the other (i.e., cells from a PR-A or PR-B knockout animal). [0065]
  • According to the present invention, the method includes the step of detecting the expression of at least one, and preferably more than one, of the downstream genes that have now been shown to be regulated by progesterone receptors by the present inventors. As used herein, the term “expression”, when used in connection with detecting the expression of a downstream gene of the present invention, can refer to detecting transcription of the gene and/or to detecting translation of the gene. To detect expression of a downstream gene refers to the act of actively determining whether a gene is expressed or not. This can include determining whether the gene expression is upregulated as compared to a control, downregulated as compared to a control, or unchanged as compared to a control. Therefore, the step of detecting expression does not require that expression of the gene actually is upregulated or downregulated, but rather, can also include detecting that the expression of the gene has not changed (i.e., detecting no expression of the gene or no change in expression of the gene). [0066]
  • The present method includes the step of detecting the expression of at least one gene that is regulated by a progesterone receptor when the receptor is activated, as set forth in detail above. In a preferred embodiment, the step of detecting includes detecting the expression of at least 2 genes, and preferably at least 3 genes, and more preferably at least 4 genes, and more preferably at least 5 genes, and more preferably at least 6 genes, and more preferably at least 7 genes, and more preferably at least 8 genes, and more preferably at least 9 genes, and more preferably at least 10 genes, and more preferably at least 11 genes, and more preferably at least 12 genes, and more preferably at least 13 genes, and more preferably at least 14 genes, and more preferably at least 15 genes, and so on, in increments of one, up to detecting expression of all of the downstream genes disclosed herein. Analysis of a number of genes greater than 1 can be accomplished simultaneously, sequentially, or cumulatively. [0067]
  • In the method of identifying an agonist or an antagonist of a progesterone receptor of the present invention, the gene(s) to be detected are preferably selected from the genes described in any one or more of Tables 1-7. These tables disclose genes that are regulated by progesterone receptors, and particularly, these tables disclose the manner in which the genes are regulated by the PR isoforms when the progesterone receptor is activated (i.e., by a stimulator of the receptor). The genes to be detected can include one or more of: (1) genes that are selectively (i.e., exclusively or uniquely) upregulated by PR-A (Table 1); (2) genes that are selectively downregulated by PR-A (Table 2); (3) genes that are selectively upregulated by PR-B (Table 3); (4) genes that are selectively downregulated by PR-B (Table 4); (5) genes that are upregulated or downregulated in the same direction by both PR-A and PR-B (Table 5); (6) genes that are reciprocally regulated by PR-A and PR-B (Table 6); and (7) genes that are regulated by one of the PR isoforms, wherein such regulation is altered when the other PR isoform is present (e.g., the expression of the gene is either up- or downregulated in the presence of both receptors relative to the expression level of the gene in the presence of only one receptor) (Table 7). In one embodiment, the method further includes the additional detection of the expression of one or more genes that were previously known to be regulated by progesterone, but for which the PR isoform regulation was not known until the present invention. Such genes are disclosed in Table 8. [0068]
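  • As an illustration only, the relationship among categories (1) through (6) above can be sketched in Python as a lookup from the direction in which each isoform alone regulates a gene. The function name, the +1/-1/0 direction encoding, and the category labels are assumptions made for this sketch and do not come from the tables themselves. Category (7) is not modeled, because it requires expression levels measured in the presence of both isoforms.

```python
# Direction encoding (an assumption of this sketch): +1 = up, -1 = down, 0 = unchanged.
# Keys are (direction with PR-A alone, direction with PR-B alone).
CATEGORIES = {
    (1, 0): "selectively upregulated by PR-A (Table 1)",
    (-1, 0): "selectively downregulated by PR-A (Table 2)",
    (0, 1): "selectively upregulated by PR-B (Table 3)",
    (0, -1): "selectively downregulated by PR-B (Table 4)",
    (1, 1): "regulated in the same direction by both isoforms (Table 5)",
    (-1, -1): "regulated in the same direction by both isoforms (Table 5)",
    (1, -1): "reciprocally regulated by PR-A and PR-B (Table 6)",
    (-1, 1): "reciprocally regulated by PR-A and PR-B (Table 6)",
    (0, 0): "not regulated by either isoform",
}


def classify_gene(pr_a_direction, pr_b_direction):
    """Return the regulatory category for a gene from its per-isoform directions."""
    key = (pr_a_direction, pr_b_direction)
    if key not in CATEGORIES:
        raise ValueError("directions must be -1, 0 or +1")
    return CATEGORIES[key]
```

  • For example, under these assumptions a gene that is upregulated in cells expressing only PR-A and unchanged in cells expressing only PR-B would be assigned by `classify_gene(1, 0)` to the Table 1 category.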
  • It is to be understood that the organization of various genes into the present tables is for purposes of clarity and identification of various genes on the basis of the manner in which the gene is regulated by a progesterone receptor isoform. The selection of genes to be detected in any given method can include any one or more of the genes in any one or more of the Tables, and can include the detection of any combination of two or more of the genes in any one or more of the Tables. It is not mandatory that a given assay be restricted to the detection of all of the various genes in a single table, or to one gene in each table. In addition, with regard to Tables 1-7, it is believed that these tables encompass genes that have been identified by the present inventors to be regulated by progesterone receptors, but which have not previously been described as being regulated by progesterone. However, in the event that one or more of the genes in Tables 1-7 is found to have previously been known to be regulated by progesterone, the removal of such gene from these tables and the placement of such gene into Table 8, is explicitly contemplated. This rationale also applies to the genes of Table 16, which are believed to include all of those genes identified by the inventors that were previously known to be involved in breast cancer or mammary development. It is expressly contemplated that other genes from Tables 1-7 or 9-15 can be added to Table 16, if required for accuracy. Tables 9-15 include all of the genes identified by the present inventors as being regulated by progesterone receptors (organized by isoform regulation, as for Tables 1-7), and, as discussed previously herein, include genes that were previously known to be regulated by progesterone. [0069]
  • Given the knowledge of the genes regulated by progesterone receptors according to the present invention, one of skill in the art will be able to select one or more genes to detect in a method of the present invention, and the selection of the one or more genes can be determined by different factors. For example, certain subsets of the genes are useful for detecting genes regulated by PR-A exclusively (i.e., genes in Tables 1, 2, 9 and 10). Other subsets of genes are useful for detecting genes regulated by PR-B exclusively (i.e., genes in Tables 3, 4, 11 and 12). One of skill in the art may wish to detect genes disclosed herein that are related to a particular function, to a particular tissue-type, or that are associated (or likely to be associated) with a particular disease or condition. One of skill in the art may also wish to select genes on the basis of the change in expression level in the presence of progesterone (i.e., and therefore activation of the PR) as compared to in the absence of progesterone. [0070]
  • In one aspect of the methods of the present invention, the method of the present invention includes detecting genes of the present invention that are related by function. For example, Table 17 provides a listing of the various genes identified by the present inventors, categorized by function. Therefore, one could screen functional sets of genes to make a specific determination about a given cell or tissue that expresses a progesterone receptor, or to identify a ligand that has an action that might be correlated with a functional gene. For example, one could use subsets of the disclosed genes to screen a tumor for the likelihood that it will metastasize by screening the genes in the “cell adhesion or cytoskeletal interaction” group of Table 17. Other uses for screening functional groups will be apparent to those of skill in the art. [0071]
  • In another aspect, one could detect genes that are of interest in a particular tissue type. Examples of such genes are disclosed below in the discussion regarding the identification of tissue-specific ligands of progesterone receptors. [0072]
  • In another aspect, one could detect those genes that are associated with a particular disease, such as breast cancer. An exemplary grouping of genes that are regulated by progesterone receptors (as disclosed herein) and that were previously known to be involved in breast cancer or mammary gland development, are shown in Table 16. In one embodiment, one may be interested in detecting those genes listed in Table 16 which are not also listed in Table 8. [0073]
  • In another aspect, it may be desirable to select those genes for detection that are particularly highly regulated by progesterone receptors in that they display the largest increases or decreases in expression levels in the presence of progesterone as compared to in the absence of progesterone. The detection of such genes can be advantageous because the endpoint may be more clear and require less quantitation. The relative expression levels of the genes identified in the present invention are listed in the tables. In these tables, the fold increase or decrease in expression of the gene upon treatment of the progesterone receptor with progesterone for 6 hours is indicated. The fold increase or decrease was made with respect to the background level of expression of the gene, which in some cases, was undetectable (i.e., the gene was not detected at all in the absence of progesterone, but was detected in the presence of progesterone). Therefore, in one embodiment, one of skill in the art might choose to detect genes that exhibited a fold increase above background of at least 2. In another embodiment, one of skill in the art might choose to detect genes that exhibited a fold increase or decrease above background of at least 3, and in another embodiment at least 4, and in another embodiment at least 5, and in another embodiment at least 6, and in another embodiment at least 7, and in another embodiment at least 8, and in another embodiment at least 9, and in another embodiment at least 10 or higher fold changes. It is noted that fold increases or decreases are not typically compared from one gene to another, but with reference to the background level for that particular gene. [0074]
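  • The fold-change selection described above can be sketched as follows. This is a minimal Python illustration, not the procedure used to generate the tables: the function names, the signed fold-change convention (positive for increases, negative for decreases), the use of infinity for an undetectable background, and the gene names in the usage note are all assumptions.

```python
import math


def fold_change(treated, background):
    """Signed fold change of a gene relative to its own background level.

    Positive values are increases, negative values are decreases.
    An undetectable (zero) background with detectable treated expression
    is reported as +infinity; the reverse case as -infinity.
    """
    if treated < 0 or background < 0:
        raise ValueError("expression levels must be non-negative")
    if treated == background:
        return 1.0  # no change
    if background == 0:
        return math.inf
    if treated == 0:
        return -math.inf
    if treated > background:
        return treated / background
    return -(background / treated)


def select_genes(levels, min_fold=2.0):
    """Return, in sorted order, genes whose fold change magnitude meets min_fold.

    `levels` maps a gene name to (treated level, background level).
    """
    return sorted(
        gene
        for gene, (treated, background) in levels.items()
        if abs(fold_change(treated, background)) >= min_fold
    )
```

  • With hypothetical levels such as `{"gene_a": (8.0, 2.0), "gene_b": (1.0, 3.0), "gene_c": (5.0, 0.0), "gene_d": (2.2, 2.0)}`, a threshold of 3 would retain `gene_a`, `gene_b` and `gene_c`. Each gene is compared only with its own background, consistent with the note above that fold changes are not compared from one gene to another.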
  • In order to determine whether a putative regulatory compound is an agonist or antagonist of PR as defined herein, it is necessary to know how a given gene is regulated by the PR so that one can compare the results in the presence and absence of the putative regulatory ligand to the gene expression profile produced by an activated receptor. This allows the investigator to thereby detect whether the contact of the receptor with the putative ligand results in a profile of gene expression that is substantially similar to the profile of gene expression of an activated PR (i.e., agonist action), or whether contact of the receptor with the putative ligand results in a profile of gene expression that is an inhibition, or reversal, of the profile of gene expression of an activated PR (i.e., antagonist action). [0075]
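  • The comparison described above can be sketched in Python. The sketch assumes that the expression profile of an activated PR is summarized as a direction (+1 up, -1 down) for each gene, and that the observed effect of the putative ligand is summarized the same way (0 meaning no change). The function names, the 0.8 agreement threshold, and the gene names in the usage note are hypothetical and serve only to illustrate the logic.

```python
def agreement(reference, observed):
    """Fraction of reference genes whose observed direction matches the reference."""
    if not reference:
        raise ValueError("reference profile must contain at least one gene")
    matches = sum(
        1 for gene, direction in reference.items()
        if observed.get(gene, 0) == direction
    )
    return matches / len(reference)


def call_ligand(reference, observed, assay, threshold=0.8):
    """Call a putative ligand from its effect on PR-regulated genes.

    assay="agonist": the receptor is not otherwise activated, and the ligand
    is called an agonist if it reproduces the activated-PR profile.
    assay="antagonist": the receptor is activated by a known stimulator, and
    the ligand is called an antagonist if adding it reverses that profile.
    """
    if assay == "agonist":
        if agreement(reference, observed) >= threshold:
            return "agonist"
        return "no agonist activity detected"
    if assay == "antagonist":
        reversed_reference = {gene: -d for gene, d in reference.items()}
        if agreement(reversed_reference, observed) >= threshold:
            return "antagonist"
        return "no antagonist activity detected"
    raise ValueError("assay must be 'agonist' or 'antagonist'")
```

  • For example, with a hypothetical reference profile `{"g1": 1, "g2": -1, "g3": 1}`, an observed profile identical to the reference in the agonist assay yields "agonist", while an observed profile of `{"g1": -1, "g2": 1, "g3": -1}` in the antagonist assay yields "antagonist". A practical assay would also need to treat partial inhibition, which this direction-only sketch does not capture.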
  • In one aspect of the method of the present invention, the step of detecting can include the detection of one or more reporter genes that are linked to promoters of one or more downstream genes according to the present invention. In this embodiment, the transcriptional read-out can use one, two or more promoters of any of the genes of this invention, linked to any of several reporter constructs, which are introduced into cells by any of several established transfection or infection methods, including, but not limited to, calcium phosphate transfection, transformation, electroporation, microinjection, lipofection, adsorption, infection (e.g., by a viral vector) and protoplast fusion. The cells can be naturally PR-positive (containing both PRs), or they can stably or transiently express either one or both of the two PR-isoforms. The cells can be exposed to the test ligands (i.e., the putative regulatory ligands) for different times and/or concentrations, and transcription of the PR-responsive promoter(s) of the downstream genes disclosed in this invention can be quantified. [0076]
  • In another aspect of this method of the present invention, cells expressing a PR as described above are exposed to the unknown test ligands at various concentrations and for various periods of time. The transcriptional read-out can be expression of one, two or more of the genes of this invention, which are endogenously regulated in the cells. Expression of their transcripts and/or proteins is measured by any of a variety of methods known in the art, several of which are exemplified in the Examples section. For RNA expression, methods include but are not limited to: extraction of cellular mRNA and northern blotting using labeled probes that hybridize to transcripts encoding all or part of one or more of the genes of this invention; amplification of mRNA expressed from one or more of the genes of this invention using gene-specific primers and reverse transcriptase-polymerase chain reaction, followed by quantitative detection of the product by any of a variety of means; and extraction of total RNA from the cells, which is then labeled and used to probe cDNAs or oligonucleotides encoding all or part of the PR-responsive genes of this invention, arrayed on any of a variety of surfaces. [0077]
  • Methods to measure protein expression levels of selected genes of this invention, include, but are not limited to: western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to DNA binding, ligand binding, or interaction with other protein partners. [0078]
  • Nucleic acid arrays are particularly useful for detecting the expression of the downstream genes of the present invention. The production and application of high-density arrays in gene expression monitoring have been disclosed previously in, for example, WO 97/10365; WO 92/10588; U.S. Pat. No. 6,040,138; U.S. Pat. No. 5,445,934; or WO 95/35505, all of which are incorporated herein by reference in their entireties. Also for examples of arrays, see Hacia et al. (1996) Nature Genetics 14:441-447; Lockhart et al. (1996) Nature Biotechnol. 14:1675-1680; and De Risi et al. (1996) Nature Genetics 14:457-460. In general, in an array, an oligonucleotide, a cDNA, or genomic DNA, that is a portion of a known gene occupies a known location on a substrate. A nucleic acid target sample is hybridized with an array of such oligonucleotides and then the amount of target nucleic acids hybridized to each probe in the array is quantified. One preferred quantifying method is to use a confocal microscope and fluorescent labels. The Affymetrix GeneChip™ Array system (Affymetrix, Santa Clara, Calif.) and the Atlas™ Human cDNA Expression Array system are particularly suitable for quantifying the hybridization; however, it will be apparent to those of skill in the art that any similar systems or other effectively equivalent detection methods can also be used. The Examples section describes the use of these two different array systems. In a particularly preferred embodiment, one can use the knowledge of the genes described herein to design novel arrays of polynucleotides, cDNAs or genomic DNAs for screening methods described herein. Such novel pluralities of polynucleotides are contemplated to be a part of the present invention and are described in detail below. [0079]
  • Suitable nucleic acid samples for screening on an array contain transcripts of interest or nucleic acids derived from the transcripts of interest (i.e., transcripts derived from the PR-regulated genes of the present invention). As used herein, a nucleic acid derived from a transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from a transcript, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like. [0080]
  • Preferably, the nucleic acids for screening are obtained from a homogenate of cells or tissues or other biological samples. Preferably, such sample is a total RNA preparation of a biological sample. More preferably in some embodiments, such a nucleic acid sample is the total mRNA isolated from a biological sample. Biological samples may be of any biological tissue or fluid or cells from any organism. Frequently the sample will be a “clinical sample” which is a sample derived from a patient, such as a breast tumor sample from a patient. Typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes. [0081]
  • In one embodiment, it is desirable to amplify the nucleic acid sample prior to hybridization. One of skill in the art will appreciate that whatever amplification method is used, if a quantitative result is desired, care must be taken to use a method that maintains or controls for the relative frequencies of the amplified nucleic acids to achieve quantitative amplification. Methods of “quantitative” amplification are well known to those of skill in the art. For example, quantitative PCR involves simultaneously co-amplifying a known quantity of a control sequence using the same primers. This provides an internal standard that may be used to calibrate the PCR reaction. The high-density array may then include probes specific to the internal standard for quantification of the amplified nucleic acid. Other suitable amplification methods include, but are not limited to, polymerase chain reaction (PCR) (Innis, et al., PCR Protocols. A guide to Methods and Application. Academic Press, Inc. San Diego, (1990)), ligase chain reaction (LCR) (see Wu and Wallace, Genomics, 4: 560 (1989), Landegren, et al., Science, 241: 1077 (1988) and Barringer, et al., Gene, 89: 117 (1990)), transcription amplification (Kwoh, et al., Proc. Natl. Acad. Sci. USA, 86: 1173 (1989)), and self-sustained sequence replication (Guatelli, et al., Proc. Natl. Acad. Sci. USA, 87: 1874 (1990)). [0082]
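  • The internal-standard calibration mentioned above can be illustrated with a short Python sketch. It assumes that the target and the co-amplified control sequence amplify with the same efficiency, so that the ratio of their signals reflects the ratio of their starting quantities; the function name and parameters are hypothetical, and real assays typically need further corrections.

```python
def estimate_target_quantity(target_signal, control_signal, control_quantity):
    """Estimate the starting quantity of a target from a co-amplified control.

    Assumes equal amplification efficiency for target and control, so the
    signal ratio equals the ratio of starting quantities.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    if target_signal < 0 or control_quantity < 0:
        raise ValueError("signals and quantities must be non-negative")
    return control_quantity * target_signal / control_signal
```

  • For example, if 10 units of control were added and the target signal is three times the control signal, `estimate_target_quantity(300.0, 100.0, 10.0)` gives 30 units under these assumptions.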
  • Nucleic acid hybridization simply involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. As used herein, hybridization conditions refer to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., is incorporated by reference herein in its entirety (see specifically, pages 9.31-9.62). In addition, formulae to calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mismatch of nucleotides are disclosed, for example, in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284; Meinkoth et al., ibid., is incorporated by reference herein in its entirety. Nucleic acids that do not form hybrid duplexes are washed away from the hybridized nucleic acids and the hybridized nucleic acids can then be detected, typically through detection of an attached detectable label. It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization requires fewer mismatches. [0083]
  • High stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides). One of skill in the art can use the formulae in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284 (incorporated herein by reference in its entirety) to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10° C. less than for DNA:RNA hybrids. In particular embodiments, stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6×SSC (0.9 M Na+) at a temperature of between about 20° C. and about 35° C., more preferably, between about 28° C. and about 40° C., and even more preferably, between about 35° C. and about 45° C. In particular embodiments, stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6×SSC (0.9 M Na+) at a temperature of between about 30° C. and about 45° C., more preferably, between about 38° C. and about 50° C., and even more preferably, between about 45° C. and about 55° C. These values are based on calculations of a melting temperature for molecules larger than about 100 nucleotides, 0% formamide and a G+C content of about 40%. Alternatively, Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62. [0084]
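  • By way of illustration only, a melting-temperature calculation of the kind referred to above can be sketched in Python. The sketch assumes the commonly cited form of the Meinkoth and Wahl equation for DNA:DNA hybrids longer than about 100 nucleotides, and assumes a rule of thumb of roughly 1° C. of Tm lost per 1% mismatch; the function names and the 500-nucleotide probe length are hypothetical. The 10° C. offset for DNA:RNA hybrids follows the statement above. The sketch does not attempt to reproduce the particular temperature ranges recited above.

```python
import math


def tm_dna_dna(na_molar, gc_percent, length_nt, formamide_percent=0.0):
    """Estimate Tm in deg C for a DNA:DNA hybrid (assumed Meinkoth-Wahl form)."""
    return (81.5
            + 16.6 * math.log10(na_molar)   # monovalent cation term
            + 0.41 * gc_percent             # G+C content term
            - 0.61 * formamide_percent      # formamide lowers Tm
            - 500.0 / length_nt)            # duplex length correction


def tm_allowing_mismatch(tm, mismatch_percent, drop_per_percent=1.0):
    """Assumed rule of thumb: Tm falls about 1 deg C per 1% mismatch."""
    return tm - drop_per_percent * mismatch_percent


# Conditions named in the text: 6xSSC (0.9 M Na+), 40% G+C, 0% formamide.
tm_dd = tm_dna_dna(na_molar=0.9, gc_percent=40.0, length_nt=500)
tm_dr = tm_dd + 10.0  # DNA:RNA hybrids, per the 10 deg C offset stated above
tm_90_identity = tm_allowing_mismatch(tm_dd, mismatch_percent=10.0)

print(round(tm_dd, 1), round(tm_dr, 1), round(tm_90_identity, 1))
```

Hybridization and wash temperatures are ordinarily chosen some margin below the mismatch-adjusted Tm; the size of that margin is a design choice that this sketch leaves open.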
  • The hybridized nucleic acids are detected by detecting one or more labels attached to the sample nucleic acids. The labels may be incorporated by any of a number of means well known to those of skill in the art. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Means of detecting such labels are well known to those of skill in the art. Thus, for example, radiolabels may be detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label. [0085]
  • The term “quantifying” or “quantitating” when used in the context of quantifying transcription levels of a gene can refer to absolute or to relative quantification. Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more target nucleic acids and referencing the hybridization intensity of unknowns with the known target nucleic acids (e.g. through generation of a standard curve). Alternatively, relative quantification can be accomplished by comparison of hybridization signals between two or more genes, or between two or more treatments to quantify the changes in hybridization intensity and, by implication, transcription level. [0086]
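  • The two modes of quantification described above can be sketched as follows. This is a minimal illustration under stated assumptions: the standard curve is assumed to be linear over the range used, the numerical values are invented for the example, and the function names are hypothetical.

```python
def fit_standard_curve(concentrations, intensities):
    """Least-squares line: intensity = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def absolute_quantity(intensity, slope, intercept):
    """Read an unknown off the standard curve (absolute quantification)."""
    return (intensity - intercept) / slope


def relative_change(treated_intensity, control_intensity):
    """Fold change between two treatments (relative quantification)."""
    return treated_intensity / control_intensity


# Invented example data: known target concentrations and their signals.
standard_conc = [1.0, 2.0, 4.0, 8.0]
standard_signal = [110.0, 210.0, 410.0, 810.0]

slope, intercept = fit_standard_curve(standard_conc, standard_signal)
unknown_conc = absolute_quantity(510.0, slope, intercept)
fold = relative_change(treated_intensity=600.0, control_intensity=200.0)

print(unknown_conc, fold)
```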
  • In one aspect of the present method, in vitro cell based assays may be designed to screen for compounds that affect the regulation of genes by a progesterone receptor at either the transcriptional or translational level. One, two or more promoters of the genes of this invention can be used to screen unknown ligands for their ability to selectively regulate transcription in vitro via PR-A or PR-B. Promoters of the selected genes can be linked to any of several reporters (including but not limited to chloramphenicol acetyl transferase, or luciferase) that measure transcriptional read-out. The promoters can be tested as pure DNA, or as DNA bound to chromatin proteins. Ligands at different concentrations and under different assay conditions can be screened for their ability to either up- or down-regulate transcription of the selected genes, under the control of either PR-A, PR-B or both. In this embodiment, cells expressing progesterone receptors or cell lysates comprising progesterone receptors are contacted with a putative regulatory ligand for a time sufficient to act on the receptor. The cells or cell lysates contain one, two or more promoters of the selected genes that are linked to any of several reporters, and the transcription or translation of the reporter genes is measured. Appropriate cells are preferably prepared from any cell type that naturally expresses the progesterone receptor or that recombinantly expresses the progesterone receptor, thereby ensuring that the cells contain the transcription factors required for transcription. The screen can be used to identify ligands that modulate the expression of the reporter construct. In such screens, the level of reporter gene expression is determined in the presence of the test ligand and compared to the level of expression in the absence of the test ligand, or the test ligand is compared to a known ligand, such as progesterone. [0087]
  • In one aspect of the present method, the step of detecting can include detecting the expression of one or more downstream genes of the invention in intact animals or tissues obtained from such animals. Mammalian (e.g., mouse, rat, monkey) or non-mammalian (e.g., chicken) species that express PRs in their tissues and elaborate progesterone can be the test animals. The unknown test ligand is introduced into intact or castrated animals by any of a variety of oral, intravenous, intramuscular, subdermal or other routes, for a variety of treatment times or concentrations. The tissues to be surveyed can be either normal or malignant progesterone targets (including but not limited to the mammary glands, mammary cancers, uterus, or endometrial cancers). The presence and quantity of endogenous mRNA or protein expression of one, two or more of the genes of this invention can be measured in those progesterone target tissues. The gene markers can be measured in tissues that are fresh, frozen, fixed or otherwise preserved. They can be measured in cytoplasmic or nuclear organ-, tissue- or cell-extracts; or in cell membranes including but not limited to plasma, cytoplasmic, mitochondrial, golgi or nuclear membranes; in the nuclear matrix; or in cellular organelles and their extracts including but not limited to ribosomes, nuclei, nucleoli, mitochondria, or golgi. Assays for endogenous expression of mRNAs or proteins encoded by the genes of this invention can be performed as described above. Alternatively, intact transgenic animals can be generated for ligand screening. Animals can be genetically manipulated to express the promoters of one, two or more of the genes of this invention linked to one or more reporters such as β-galactosidase, which can be detected with X-gal. After treatment of the animals with the test unknown ligands, expression of galactosidase can be measured colorimetrically in normal or malignant progesterone target organs, or tissues containing PRs, or in organs or tissues during development. 
Ligands that activate through either PR-A or PR-B can be identified by their ability to regulate the appropriate selective gene promoter. [0088]
  • The method of the present invention includes a step of comparing the results of detecting the expression of the one or more downstream genes in the presence and in the absence of the putative regulatory ligand, in order to determine whether any observed change in expression is due to the presence of the putative regulatory compound. The step of comparing further includes comparing the expression of the one or more downstream genes detected in the presence of the ligand to the manner of expression of the genes that is associated with the activation of the progesterone receptor when the receptor is activated (described in detail below). As discussed above, the present inventors have identified the expression profile of multiple genes that are regulated by PR, including the manner in which the genes are regulated (i.e., by which PR isoform, and in which direction by such isoform). Therefore, one can determine whether the contact of the receptor with the putative ligand results in a profile of gene expression that is substantially similar to the profile of gene expression of an activated PR (i.e., agonist action), or whether contact of the receptor with the putative ligand results in a profile of gene expression that is an inhibition, or reversal, of the profile of gene expression of an activated PR (i.e., antagonist action). According to the present invention, a putative test ligand is determined to be a regulator of PR if the expression of the gene or genes detected after contact of the PR with the ligand is statistically significantly altered (i.e., up or down) from the expression detected in the profile of a PR that has been activated by progesterone, or an equivalent agonist. The expression profiles for the genes in Tables 1-19 were determined by evaluating PR that had been activated by progesterone after 6 hours. [0089]
  • A PR agonist is identified by detecting an expression profile in the presence of the agonist that, at a minimum, regulates the expression of the gene in the same direction (i.e., upregulation or downregulation) as it is regulated by an activated progesterone receptor (e.g., the manner of expression of the gene as indicated in Tables 1-7, 9-15 or 18 and 19). More specifically, and by way of example, detection of the regulation of the expression of the gene in the “manner” associated with the activation of the PR (i.e., the natural activation of the PR), at a minimum, refers to the detection of the upregulation of a gene that has now been shown by the present inventors to be selectively upregulated by PR-A (genes in Tables 1 and 9) when the receptor is in the presence of the putative agonist, as compared to in the absence of the putative agonist. Similarly, an agonist is identified when the expression of a gene from Tables 2 or 10 is detected to be downregulated in the presence of the putative agonist as compared to in the absence of the agonist. Such downregulation also indicates that, at a minimum, the agonist regulated the PR-A isoform. In a preferred embodiment, the agonist regulates the expression of the gene in the same direction and to at least about 10%, and more preferably at least 20%, and more preferably at least 25%, and more preferably at least 30%, and more preferably at least 35%, and more preferably at least 40%, and more preferably at least 45%, and more preferably at least 50%, and more preferably at least 55%, and more preferably at least 60%, and more preferably at least 65%, and more preferably at least 70%, and more preferably at least 75%, and more preferably at least 80%, and more preferably at least 85%, and more preferably at least 90%, and more preferably at least 95%, of the level of expression that is induced by a progesterone receptor that has been activated by progesterone. 
In a particularly preferred embodiment, an agonist regulates the expression of the gene in the same direction and to a level of expression that is substantially equal to or greater than the level of expression that is induced by a progesterone receptor that has been activated by progesterone. The level of expression is determined with reference to the expression of the gene in the absence of the putative regulatory compound, or in the absence of progesterone, in the case of the control. The level of expression is then compared to the level of expression of the control, or the level of expression that is expected from the control. [0090]
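  • The agonist comparison described above can be sketched as follows. This is a minimal illustration, assuming that the "level of expression" is expressed as the change from the untreated baseline and that the ligand response is reported as a fraction of the progesterone response; the function names, the 10% default threshold taken from the lowest value recited above, and the numerical values are illustrative assumptions. No test of statistical significance is included.

```python
def fraction_of_progesterone_response(baseline, with_progesterone, with_ligand):
    """Ligand-induced change as a fraction of the progesterone-induced change.

    Positive values mean the ligand moves expression in the same direction
    as progesterone; negative values mean the opposite direction.
    """
    reference_change = with_progesterone - baseline
    if reference_change == 0:
        raise ValueError("gene shows no progesterone response in these data")
    return (with_ligand - baseline) / reference_change


def is_agonist_like(baseline, with_progesterone, with_ligand, min_fraction=0.10):
    """Same direction and at least min_fraction of the progesterone response."""
    fraction = fraction_of_progesterone_response(
        baseline, with_progesterone, with_ligand)
    return fraction >= min_fraction


# Invented expression values (arbitrary units).
up_gene = is_agonist_like(baseline=100.0, with_progesterone=500.0, with_ligand=300.0)
down_gene = is_agonist_like(baseline=100.0, with_progesterone=20.0, with_ligand=60.0)
opposite = is_agonist_like(baseline=100.0, with_progesterone=500.0, with_ligand=50.0)

print(up_gene, down_gene, opposite)
```

Because the fraction is signed, the same function handles genes that progesterone upregulates and genes that it downregulates.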
  • A PR antagonist is identified by detecting an expression profile in the presence of the antagonist that, at a minimum, regulates the expression of the gene in the opposite direction (i.e., upregulation instead of downregulation, or vice versa) from that in which the gene is regulated by an activated progesterone receptor (e.g., the manner of expression of the gene as indicated in Tables 1-7, 9-15 or 18 and 19), or causes a statistically significant reduction in the expression level of the gene as compared to the expression level of the gene when it is activated by progesterone, or prevents the regulation of the gene as compared to the regulation of the gene when the receptor is activated by progesterone. In the antagonist screening embodiments, the putative antagonists are screened against a PR that is activated, and so in the absence of the putative antagonist, the expression profile of the genes should be substantially the same as the expression profile set forth in Tables 1-7, 9-15 and 18-19. Therefore, any statistically significant decrease (inhibition) in the expression level of the gene or a reversal of the direction of expression of the gene in the presence of the putative antagonist as compared to in the absence of the antagonist, indicates that the putative ligand is an antagonist. In a preferred embodiment, the antagonist inhibits the expression of the detected gene by at least 5%, and more preferably at least 10%, and more preferably at least 15%, and more preferably at least 20%, and more preferably at least 25%, and more preferably at least 30%, and more preferably at least 40%, and more preferably at least 50%, and more preferably at least 60%, and more preferably at least 70%, and more preferably at least 80%, and more preferably at least 90%, as compared to the level of expression that is induced by the activated progesterone receptor in the absence of the putative antagonist. 
In one embodiment, an antagonist regulates the expression of the gene in the opposite direction (i.e., reverses the expression) as compared to the expression of the gene induced by the activated progesterone receptor in the absence of the putative antagonist. [0091]
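  • The antagonist comparison described above can be sketched in the same way. This is a minimal illustration, assuming that inhibition is computed from the change relative to an untreated baseline, with the receptor activated by progesterone in both the presence and the absence of the putative antagonist; the function names, the category labels, and the numerical values are illustrative assumptions, and the 5% default is the lowest threshold recited above. Statistical significance would need to be assessed separately.

```python
def percent_inhibition(baseline, activated, activated_plus_ligand):
    """Percent of the progesterone-induced change removed by the ligand.

    Values above 100 mean expression moved past baseline, i.e. a reversal.
    """
    induced_change = activated - baseline
    if induced_change == 0:
        raise ValueError("gene shows no progesterone response in these data")
    remaining_change = activated_plus_ligand - baseline
    return 100.0 * (1.0 - remaining_change / induced_change)


def classify_antagonist(percent, min_percent=5.0):
    """Map a percent inhibition onto the outcomes discussed in the text."""
    if percent > 100.0:
        return "reversal"
    if percent >= min_percent:
        return "inhibition"
    return "no antagonism detected"


# Invented expression values (arbitrary units) for a progesterone-upregulated gene.
partial = classify_antagonist(percent_inhibition(100.0, 500.0, 300.0))
reversed_ = classify_antagonist(percent_inhibition(100.0, 500.0, 60.0))
inactive = classify_antagonist(percent_inhibition(100.0, 500.0, 490.0))

print(partial, reversed_, inactive)
```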
  • It will be appreciated by those of skill in the art that differences between the expression of genes regulated by the putative ligand (via the PR) and the expression of genes regulated by the natural ligand (via the PR) may be small or large. Some small differences may be very reproducible and therefore the ligand identified by the method can be useful. For other purposes, large differences may be desirable for ease of detection of the regulatory activity. It will be therefore appreciated that the exact boundary between what is called an agonist and what is called an antagonist can shift, depending on the goal of the screening assay. For some assays it may be useful to set threshold levels of change. For other purposes the putative antagonist ligand may simply have a lower level of activity than an agonist ligand (e.g. a test ligand having 10% of the activity of an agonist can be an antagonist of that agonist). This may depend on the technique being used for detection as well as on the number of genes which are being tested. One of skill in the art can readily determine the criteria for selection of suitable antagonists. [0092]
  • Given the knowledge of the gene expression profiles of the present invention as set forth in Tables 1-7, 9-15 and 18-19, one of skill in the art can, for the first time, identify isoform-specific regulators of progesterone receptors. Therefore, one embodiment of the present invention relates to a method to identify isoform-specific agonists of progesterone receptors. This method includes the steps of: (a) contacting a progesterone receptor with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein in the absence of the putative agonist ligand, the progesterone receptor is not activated; (b) detecting expression of at least one gene that is selectively regulated by the progesterone receptor when the progesterone receptor is activated, and (c) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand. In this embodiment, the at least one gene is selected from the group consisting of: (i) at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and, (ii) at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4. Detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (i) but not (ii), indicates that the putative agonist ligand is a PR-A-specific agonist, and wherein detection of regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (ii) but not (i), indicates that the putative agonist ligand is a PR-B-specific agonist. [0093]
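  • The decision logic of steps (a) through (c) above can be sketched as follows. This is a minimal illustration only: the gene names are hypothetical stand-ins for entries of Tables 1-4, the assignment of an upregulated direction to Table 3 is an assumption made for the example, and the observed directions are assumed to have already been called from a comparison of expression in the presence and in the absence of the putative agonist ligand.

```python
# Hypothetical stand-in gene names; real entries would come from Tables 1-4.
EXPECTED_PROFILE = {
    "GENE_T1": ("PR-A", "up"),    # stand-in for a Table 1 gene
    "GENE_T2": ("PR-A", "down"),  # stand-in for a Table 2 gene
    "GENE_T3": ("PR-B", "up"),    # stand-in for a Table 3 gene (direction assumed)
    "GENE_T4": ("PR-B", "down"),  # stand-in for a Table 4 gene
}


def classify_isoform_agonist(observed, expected=EXPECTED_PROFILE):
    """Classify a ligand from observed directions of isoform-exclusive genes.

    observed maps gene name -> "up", "down", or "none".
    """
    isoforms_mimicked = set()
    for gene, direction in observed.items():
        if gene not in expected:
            continue  # gene is not in the isoform-exclusive profile
        isoform, expected_direction = expected[gene]
        if direction == expected_direction:
            isoforms_mimicked.add(isoform)
    if isoforms_mimicked == {"PR-A"}:
        return "PR-A-specific agonist"
    if isoforms_mimicked == {"PR-B"}:
        return "PR-B-specific agonist"
    if isoforms_mimicked == {"PR-A", "PR-B"}:
        return "agonist of both isoforms"
    return "no agonist activity detected"


result_a = classify_isoform_agonist({"GENE_T1": "up", "GENE_T4": "none"})
result_b = classify_isoform_agonist({"GENE_T1": "none", "GENE_T4": "down"})
result_both = classify_isoform_agonist({"GENE_T2": "down", "GENE_T3": "up"})

print(result_a, result_b, result_both)
```

Testing at least one gene from each isoform's set in the same experiment is what allows the "(i) but not (ii)" conclusion to be drawn.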
  • Another embodiment of the present invention relates to a method to identify isoform-specific antagonists of progesterone receptors, comprising: (a) contacting a progesterone receptor with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (b) detecting expression of at least one gene that is regulated by the progesterone receptor when the progesterone receptor is activated; and (c) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand. In this embodiment, the at least one gene is selected from the group consisting of: (i) at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and, (ii) at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4. In the presence of the putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (i) but not (ii), indicates that the putative antagonist ligand is a PR-A-specific antagonist, and wherein, in the presence of the putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of the at least one gene as compared to the regulation of the expression of the at least one gene in the manner associated with activation of the progesterone receptor as set forth in (ii) but not (i), indicates that the putative antagonist ligand is a PR-B-specific antagonist. [0094]
  • Given the knowledge of the genes regulated exclusively by progesterone receptor isoforms according to the present invention, one of skill in the art will be able to select one or more genes to detect in a method of the present invention, and the selection of the one or more genes can be determined by different factors. For example, one of skill in the art may wish to further select genes to be detected on the basis of the function of the gene or gene product, on the basis of tissue-type in which a PR is expressed, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone. Such embodiments have generally been described above. [0095]
  • Antiprogestins that selectively inhibit progestin effects on only one of the two PRs, would be highly desirable, but do not exist at present. Such antagonist ligands would be useful not only for breast cancer treatment, but to treat a variety of reproductive disorders, and for contraception. Antagonists that can inhibit only PR-A without affecting PR-B (and vice-versa) would be highly desirable. The current invention allows for rapid and direct screening for such ligands. For example, the invention identifies clusters of genes that are upregulated only by PR-A or PR-B in the presence of the agonist, progesterone. These gene clusters are perfect targets for antiprogestin (antagonist) and progestin (agonist) screening by the cell-free in vitro, intact cell in vitro, or whole animal endogenous or transgenic methods described above. For the embodiment related to antagonists, a selected cluster of one, two or more of the genes of this invention that are exclusively regulated by PR-A or PR-B would first be activated by progesterone or another progestin. Putative antiprogestins would be screened and selected on the basis of their ability to reverse or inhibit the effects of the agonist, progesterone, by comparing the expression profiles of the genes in the presence of the putative antiprogestin to the expression profile of the genes as a result of activation of the receptor with a progestin. Isoform-specific agonists of PRs can be similarly selected by choosing ligands on the basis of their ability to mimic the effects of the agonist, progesterone, on the PR isoforms. [0096]
  • These two embodiments of the present invention take advantage of the knowledge provided herein of the isoform-specific regulation of genes by progesterone receptors. Prior to the present invention, such assays were impossible, because the specific regulation of a gene by one PR isoform, and not the other, was not known. By way of example, if a gene in Table 1 is detected (i.e., a gene that is known to be upregulated selectively (i.e., exclusively, uniquely) by PR-A) when the PR to be tested (at least PR-A or a combination of PR-A and PR-B) is in the presence of a putative regulatory ligand, and the expression of that gene is determined to be in the manner associated with activation of the progesterone receptor (i.e., the gene is upregulated), then it can be concluded that the putative regulatory compound is a PR-A-specific agonist, because the present inventors have shown that the gene is exclusively upregulated by PR-A. Similarly, if a gene in Table 4 is detected (i.e., a gene that is known to be downregulated selectively (i.e., exclusively, uniquely) by PR-B) when the PR to be tested (at least PR-B or a combination of PR-A and PR-B) is in the presence of a putative regulatory ligand, and the expression of that gene is determined to be in the manner associated with activation of the progesterone receptor (i.e., the gene is downregulated), then it can be concluded that the putative regulatory compound is a PR-B-specific agonist, because the present inventors have shown that this particular gene is exclusively downregulated by PR-B. 
For a putative antagonist, if the same gene in Table 4 is detected when the PR to be tested is or will be activated and is in the presence of the putative antagonist, and the expression of that gene is determined to be inhibited or reversed (i.e., the gene is upregulated or is statistically significantly less downregulated) as compared to the expression of the gene in the manner associated with activation of the progesterone receptor, then it can be concluded that the putative regulatory compound is a PR-B-specific antagonist, because the present inventors have shown that this particular gene is exclusively downregulated by PR-B. [0097]
  • The particular details relating to the contacting, detecting and comparing steps of the above-described methods for identification of PR isoform-specific ligands are substantially the same as those described above for the broader methods of identifying PR regulatory ligands and will not be repeated here. [0098]
  • Agonists and antagonists of progesterone receptors identified by the above methods or any other suitable method are useful in a variety of therapeutic methods as described herein. [0099]
  • Yet another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor. This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative agonist ligand under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated, wherein the progesterone receptor is the same isoform as the progesterone receptor contacted in (b); (d) detecting expression of the at least one gene from (a); and, (e) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand in each of the first and second tissue types. 
Detection of regulation of the expression of the at least one gene in one of the first or second tissue types in the manner associated with activation of the progesterone receptor as set forth in the expression profile of (a), and detection of inhibition of regulation or no regulation of the at least one gene in the other of the first or second tissue types, as compared to the expression of the at least one gene associated with activation of the progesterone receptor as set forth in the expression profile of (a), indicates that the putative agonist ligand is a tissue-specific progesterone receptor agonist. [0100]
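  • The comparison in step (e) above can be sketched as follows. This is a minimal illustration, assuming that the direction of regulation in each tissue type has already been called from expression measured in the presence and in the absence of the putative agonist ligand; the function name and the "first"/"second" labels are hypothetical.

```python
def tissue_specific_agonist(expected_direction, observed_first, observed_second):
    """Return which tissue the ligand is specific for, or None.

    expected_direction is the direction ("up" or "down") in which the gene is
    regulated by the activated receptor in both tissue types; each observed
    value is "up", "down", or "none" for the ligand-treated tissue.
    """
    in_first = observed_first == expected_direction
    in_second = observed_second == expected_direction
    if in_first and not in_second:
        return "first"
    if in_second and not in_first:
        return "second"
    return None  # regulated in both tissues or in neither


specific = tissue_specific_agonist("up", observed_first="up", observed_second="none")
non_specific = tissue_specific_agonist("up", observed_first="up", observed_second="up")
inactive = tissue_specific_agonist("down", observed_first="none", observed_second="up")

print(specific, non_specific, inactive)
```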
  • Similarly, another embodiment of the present invention relates to a method to identify a tissue-specific agonist of a progesterone receptor, such method comprising: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first tissue type but not a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by the first tissue type with a putative agonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative agonist ligand, the progesterone receptor is not activated; (c) detecting expression of the at least one gene from (a); (d) comparing the expression of the at least one gene in the presence and in the absence of the putative agonist ligand in the first tissue type, wherein detection of regulation of the expression of the at least one gene in the first tissue type in the manner associated with activation of the progesterone receptor as set forth in the expression profile of (a) indicates that the putative agonist ligand is a tissue-specific progesterone receptor agonist for the first tissue type. In this embodiment, it is desirable to include additional controls or the detection of multiple genes that confirm that the regulation of the PR by the putative regulatory ligand is tissue-specific. [0101]
  • Another embodiment of the present invention relates to a method to identify a tissue-specific antagonist of a progesterone receptor. This method includes the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) contacting a progesterone receptor expressed by a second tissue type with the putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (d) detecting expression of the at least one gene from (a); and, (e) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand in each of the first and second tissue types, wherein detection of regulation of the expression of the at least one gene in one of the first or second tissue types in the manner associated with activation of the progesterone receptor as set forth in the expression profile of (a) in the presence of the putative antagonist ligand, and detection of inhibition or reversal of regulation of expression of the at least one gene in the other of the first or second tissue types in the presence of the putative antagonist ligand, as compared to the expression of the at least one gene associated with activation 
of the progesterone receptor as set forth in the expression profile of (a), indicates that the putative antagonist ligand is a tissue-specific progesterone receptor antagonist. [0102]
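The comparison in step (e) can be pictured as a simple decision rule. The following Python sketch is illustrative only and is not the claimed method: the function names, the representation of expression as a log2 fold change relative to the unactivated state, and the threshold of 1.0 are all assumptions introduced here.

```python
from typing import Optional

# Hypothetical cutoff: |log2 fold change| at or above this counts as regulation.
THRESHOLD = 1.0


def direction(log2_fc: float, threshold: float = THRESHOLD) -> int:
    """Return +1 (upregulated), -1 (downregulated) or 0 (no regulation)."""
    if log2_fc >= threshold:
        return 1
    if log2_fc <= -threshold:
        return -1
    return 0


def is_antagonized(profile_dir: int, log2_fc_with_ligand: float) -> bool:
    """True if the regulation recorded in the expression profile of (a)
    is inhibited (no regulation) or reversed in the presence of the ligand."""
    return direction(log2_fc_with_ligand) != profile_dir


def tissue_specific_antagonist(
    profile_dir: int, fc_first: float, fc_second: float
) -> Optional[str]:
    """Return 'first' or 'second' for the tissue type in which the putative
    ligand acts as an antagonist, or None if it is not tissue-specific.

    profile_dir: +1 or -1, the direction of regulation upon PR activation.
    fc_first, fc_second: log2 fold change of the gene in each tissue type
    under activating conditions with the putative antagonist ligand present.
    """
    if profile_dir not in (1, -1):
        raise ValueError("profile_dir must be +1 or -1")
    first = is_antagonized(profile_dir, fc_first)
    second = is_antagonized(profile_dir, fc_second)
    if first and not second:
        return "first"
    if second and not first:
        return "second"
    return None
```

In practice the rule would be applied across several genes of the profile, since a single gene gives weak evidence of tissue specificity.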
  • Similarly, another embodiment of the present invention relates to a method to identify a tissue-specific antagonist of a progesterone receptor, such method including the steps of: (a) providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first but not in a second tissue type when the progesterone receptor is activated, wherein the at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7; (b) contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein the progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of the putative antagonist ligand, the progesterone receptor is activated; (c) detecting expression of the at least one gene from (a); and, (d) comparing the expression of the at least one gene in the presence and in the absence of the putative antagonist ligand in the first tissue type, wherein detection of inhibition or reversal of regulation of expression of the at least one gene in the first tissue type in the presence of the putative antagonist ligand, as compared to the expression of the at least one gene associated with activation of the progesterone receptor as set forth in the expression profile of (a), indicates that the putative antagonist ligand is a tissue-specific progesterone receptor antagonist of the first tissue type. In this embodiment, it is desirable to include additional controls or the detection of multiple genes that confirm that the regulation of the PR by the putative regulatory ligand is tissue-specific. [0103]
  • In one aspect of any of the above-described embodiments for identifying a tissue-specific regulator of PR activity, the first tissue type is breast, and at least one gene is selected from the group consisting of any one or more of the genes in Tables 1-7. In general, the first or second tissue type can be any tissue type, including any cell type, that expresses a progesterone receptor. For example, tissues that are known to express progesterone receptors include, but are not limited to, breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth. [0104]
  • In another aspect, the first tissue type is a non-malignant tissue and the second tissue type is a malignant tissue from the same tissue source as the first tissue type. A preferred tissue source for screening for regulators of malignant tissue but not non-malignant tissue is breast tissue. [0105]
  • In another aspect, the first tissue type is a normal tissue and the second tissue type is a non-malignant, abnormal tissue. Such tissues include, but are not limited to, tissues from endometriosis and leiomyoma of the uterus, fibrocystic disease of the breast, or polycystic ovary. [0106]
  • In one aspect of the tissue-specific methods of the present invention, the method includes the detection of any one or more of the following genes: 11-beta-hydroxysteroid dehydrogenase type 2, tissue factor gene, PCI gene (plasminogen activator inhibitor 3), MAD-3 IκB-alpha, Niemann-Pick C disease (NPC1), platelet-type phosphofructokinase, phenylethanolamine n-methyltransferase (PNMT), transforming growth factor-beta 3 (TGF-beta3), Monocyte Chemotactic Protein 1, delta sleep inducing peptide (related to TSC-22), and estrogen receptor-related protein (hERRa1). These genes are of particular interest when one of the tissue types is the endometrium. [0107]
  • In another aspect of the tissue-specific methods of the present invention, the method includes the detection of any one or more of the following genes: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family). These genes are of particular interest when one of the tissue types is the breast. [0108]
  • In another aspect of the tissue-specific methods of the present invention, the method includes the detection of phenylethanolamine n-methyltransferase (PNMT) adrenal medulla. This gene is of particular interest when one of the tissue types is brain tissue. [0109]
  • In another aspect of the tissue-specific methods of the present invention, the method includes the detection of proteasome-like subunit MECL-1. This gene is of particular interest when one of the tissue types is thymus tissue. [0110]
  • In yet another aspect of these methods, the expression profile of genes regulated by a progesterone receptor in the first or second tissue type is provided by a method comprising: [0111]
  • (a) providing a first cell of a selected tissue type that expresses a progesterone receptor A (PR-A) and not a progesterone receptor B (PR-B) and a second cell of the same tissue type that expresses PR-B and not PR-A; (b) stimulating the progesterone receptors in (a) by contacting the first and second cells with a progesterone receptor stimulatory ligand; and (c) detecting expression of genes by the first and second cells in the presence of the stimulatory ligand and in the absence of the stimulatory ligand, wherein a difference in the expression of a gene in the presence of the stimulatory ligand as compared to in the absence of the stimulatory ligand indicates that the gene is regulated by the progesterone receptor in the selected tissue type. [0112]
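Steps (a) through (c) amount to comparing each gene's expression with and without the stimulatory ligand in the PR-A-only cell and in the PR-B-only cell. The Python sketch below shows one way such a comparison could be scored. It is a sketch under stated assumptions: the 2-fold cutoff, the function names, and the use of raw signal ratios are hypothetical choices made here, not values taken from this disclosure.

```python
from typing import Optional, Tuple

# Hypothetical cutoff: a 2-fold change in either direction counts as regulation.
FOLD_CUTOFF = 2.0


def regulated(with_ligand: float, without_ligand: float,
              cutoff: float = FOLD_CUTOFF) -> Optional[str]:
    """Return 'up', 'down', or None by comparing expression signals measured
    in the presence and in the absence of the stimulatory ligand."""
    if with_ligand <= 0 or without_ligand <= 0:
        raise ValueError("expression signals must be positive")
    ratio = with_ligand / without_ligand
    if ratio >= cutoff:
        return "up"
    if ratio <= 1.0 / cutoff:
        return "down"
    return None


def classify(pra_cell: Tuple[float, float], prb_cell: Tuple[float, float]) -> str:
    """Classify one gene from (with_ligand, without_ligand) signals measured in
    the PR-A-only cell and in the PR-B-only cell."""
    a = regulated(*pra_cell)
    b = regulated(*prb_cell)
    if a and b:
        return "both" if a == b else "reciprocal"
    if a:
        return "PR-A selective"
    if b:
        return "PR-B selective"
    return "not regulated"
```

For example, `classify((400, 100), (110, 100))` reports a gene that responds only in the PR-A-only cell.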
  • The present invention defines genes that are regulated by PR-A vs. PR-B in breast cancer cells. It is believed that many, if not most of these genes, will also be regulated by progesterone receptors in other tissues. Similar data can be generated for other tissues, including the uterus, bone, cardiovascular tissues, etc., or malignant vs. normal tissues. Progestin regulated genes in other tissues, which differ from the genes in breast cancer cells of this invention, can be identified, and be used to screen for ligands that regulate candidate genes only in the desired tissue. For example, using the appropriate gene clusters, one could identify a ligand that activates PR-A in the uterus but not the breast. Similarly one could screen out ligands that have undesirable organ or tissue effects. For example, ligands that are inadvertently bioactive in the liver, where they might induce liver toxicity, could be discarded. Alternatively, when a gene is regulated in both tissue types, one can screen for ligands that regulate the expression of the gene in one tissue type, but not the other tissue type. For example, by using tissue specific methods described above, it is also possible to screen for antagonists that block the actions of progestins in one organ or tissue and through one PR isoform, but not another organ or tissue and the other PR isoform. For example, if PR-A are “good” receptors in the uterus but not the breast, a selective “antiprogestin-A” might be found that is only inhibitory in the breast. [0113]
  • Given the guidance provided herein, it is within the ability of those of skill in the art to screen other tissue types for the presence or absence of the genes regulated by PR in breast tissue, and/or to perform a de novo screening assay for the identification of genes regulated by PR in another tissue, to develop gene expression profiles for use in screening for tissue specific ligands. One of skill in the art can now look to see if a given gene that is regulated by PR in breast is also regulated by PR in another tissue type, thereby providing a gene profile for use in the tissue-specific ligand identification methods of the present invention. [0114]
  • The particular details relating to the contacting, detecting and comparing steps of the above-described methods for identification of PR isoform-specific ligands are substantially the same as those described above for the broader methods of identifying PR regulatory ligands and will not be repeated here. [0115]
  • Another method of the present invention relates to a method to identify genes that are regulated by a progesterone receptor in two or more tissue types. The method includes the steps of: (a) activating a progesterone receptor in two or more tissue types that express the progesterone receptor; (b) detecting expression of at least one gene in the two or more tissue types, the at least one gene being chosen from a gene in any one or more of Tables 1-7, and, (c) identifying genes that are regulated by the progesterone receptor in each of the two or more tissue types. In one embodiment, the method further includes detecting whether the genes are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B. This method can generally be used to provide a profile of genes in a tissue type other than breast. Such a profile can then be used in a method for the identification of tissue-specific progesterone receptor ligands as described above, or in a method of determining a profile of genes for a given tissue sample as described below. [0116]
  • Yet another embodiment of the present invention relates to a method to determine the profile of genes regulated by progesterone receptors in a tissue sample. In a preferred embodiment, the sample is a breast tumor sample. This method includes the steps of: (a) obtaining from a patient a breast tumor sample; (b) detecting expression of at least one gene in the breast tumor sample that is regulated by a progesterone receptor when the progesterone receptor is activated; and, (c) producing a profile of genes for the tumor sample that are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B. In this embodiment, the gene(s) to be profiled are selected from the group consisting of: (i) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 9; (ii) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 10; (iii) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 11; (iv) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 12; (v) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 13; (vi) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 14; and, (vii) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 15. [0117]
  • Because of their physiological importance in the breast, PRs are routinely measured in all breast cancers when the disease is first diagnosed. Presence of PRs, especially if the levels are high, informs the oncologist that the tumor is likely to be “hormone-dependent” and will respond to endocrine treatments. This spares the woman from much harsher treatments involving chemotherapies. Additionally, the number of PRs allows the oncologist to predict how aggressive the tumor is likely to be. High PR levels in her tumor indicate that a woman's prognosis is good. Thus measurement of total PR levels plays a key role in the management of breast cancers. [0118]
  • Both PR-A and PR-B are present in PR-positive breast cancers. The PR-A:PR-B ratio varies widely from tumor to tumor, and some tumors express only one or the other isoform. However, the clinical consequences of this heterogeneity are unknown. Because the transcriptional effects of the two PRs are believed to be so different, fluctuations in their ratio are expected to critically influence the biology of the tumors. However, at present, how that biology is affected is unknown. Whether in fact, PR-A are “bad” and PR-B are “good” in breast cancers, is also unknown. Since most breast cancer cell lines lose their PRs, and both isoforms are co-expressed in cell lines that retain their PRs, one way to determine the biological consequences of varying A:B ratios is to define the endogenous genes that each of the two PRs regulates independently. Knowledge of the unique sets of genes that are selectively regulated by each PR isoform as disclosed herein allows the genes to serve as surrogate markers for the presence and function of PR-A vs. PR-B. Furthermore, knowledge of such genes and their promoters, allows the genes to serve as a tool for screening PR-A vs. PR-B selective ligands. However, prior to the present invention, defining which sets of genes were uniquely regulated by one or the other PR in breast cancers was impossible because both receptors are simultaneously activated by progesterone treatment. The present invention has provided a solution to this problem. [0119]
  • As discussed above, total PRs are routinely measured in all primary breast cancers as a guide to therapy. Their presence and levels are used to predict whether the tumor is likely to respond to hormone treatments, and to estimate disease prognosis. Tumors that lack PRs have less than 10% chance of responding to hormone treatments; tumors that contain PRs have on average a 70% chance of responding to hormone treatments depending on the receptor levels. These numbers are statistical only, and therefore are not specifically informative for any individual patient. The present invention has led to the development of assays that profile the tumor of an individual patient for “good” and “bad” surrogate markers of PR-A and PR-B. Thus it is now possible to measure not only the presence of PRs in a tumor, but the function of the PRs in that tumor. [0120]
  • In this embodiment, one or more of the genes set forth in Tables 9-15 are selected to be screened in a tissue sample from a patient. Preferably, the tissue sample is a breast tumor sample. The expression of the genes in the tissue sample can be detected using techniques described above for the various other methods of the present invention. For example, transcript expression levels of the selected genes can be measured in the tumor of a patient, by any of a number of known methods. For mRNA expression, methods include but are not limited to: northern blotting; reverse transcriptase-polymerase chain reaction and detection of the product; use of labeled mRNA from the tumor to probe cDNAs or oligonucleotides encoding all or part of the PR-responsive genes of interest, arrayed on any of a variety of surfaces, as described above. For detection of protein expression levels of the selected genes, methods include but are not limited to: western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to DNA binding, ligand binding, or interaction with other protein partners, as described above. The presence and quantity of each gene marker can be measured in primary tumors, metastatic tumors, locally recurring tumors, ductal carcinomas in situ, or other tumors of breast cell origin. The markers can be measured in solid tumors that are fresh, frozen, fixed or otherwise preserved. They can be measured in cytoplasmic or nuclear tumor extracts; or in tumor membranes including but not limited to plasma, mitochondrial, golgi or nuclear membranes; in the nuclear matrix; or in tumor cell organelles and their extracts including but not limited to ribosomes, nuclei, mitochondria, golgi. [0121]
  • A profile of individual gene markers, including a matrix of two or more markers, can be generated by one or more of the methods described above. According to the present invention, a profile of the genes regulated by progesterone receptors in a tissue sample refers to a reporting of the expression level of a given gene from Tables 9-15 that, based on the knowledge of the regulation of the genes provided by Tables 9-15, includes a classification of the gene with regard to how the gene is regulated by the PR isoforms. For example, if the gene, estrogen receptor-related protein, is identified as being expressed by a tumor sample, the profile for the tumor will include the reporting of the expression of at least one gene that is exclusively regulated by PR-A. The data can be reported as raw data, and/or statistically analyzed by any of a variety of methods, and/or combined with any other prognostic marker(s) including but not limited to ER, % S-phase, other proliferation markers, markers of ER expression, tumor suppressor genes, etc. Prior to the present invention, one of skill in the art would not have known to screen breast tumors for the genes in Tables 1-7, 9-10 or 18-19 (excepting genes in Table 16), and one of skill in the art would not have been able to classify any of these genes on the basis of the PR isoform regulation. [0122]
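As a rough illustration of such a profile, the Python sketch below pairs each measured expression level with the regulatory class of the table in which the gene appears. The table-to-class mapping follows the description of Tables 9-15 given above. The function name, the record layout, and the gene identifiers used in the usage note are placeholders introduced here and are not entries from the tables.

```python
from typing import Dict, List

# Regulatory class for each table, paraphrased from the description of Tables 9-15.
TABLE_CATEGORY: Dict[int, str] = {
    9: "selectively upregulated by PR-A",
    10: "selectively downregulated by PR-A",
    11: "selectively upregulated by PR-B",
    12: "selectively downregulated by PR-B",
    13: "upregulated or downregulated by both PR-A and PR-B",
    14: "reciprocally regulated by PR-A and PR-B",
    15: "regulation by one isoform altered when the other is co-expressed",
}


def build_profile(expression: Dict[str, float],
                  gene_to_table: Dict[str, int]) -> List[dict]:
    """Report the expression level of each detected gene together with its
    PR isoform classification; genes absent from the mapping are skipped."""
    profile = []
    for gene, level in sorted(expression.items()):
        table = gene_to_table.get(gene)
        if table is None:
            continue  # not a gene from Tables 9-15
        profile.append({
            "gene": gene,
            "level": level,
            "table": table,
            "category": TABLE_CATEGORY[table],
        })
    return profile
```

With a hypothetical mapping such as `{"GENE_A": 9, "GENE_B": 12}`, calling `build_profile({"GENE_B": 3.5, "GENE_A": 1.2, "GENE_Z": 9.9}, ...)` yields two records, one per mapped gene, each carrying its isoform classification.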
  • Given the knowledge of the genes regulated by progesterone receptor isoforms according to the present invention, one of skill in the art will be able to select one or more genes to detect in this method of the present invention, and the selection of the one or more genes can be determined by different factors. For example, one of skill in the art may wish to further select genes to be detected on the basis of the function of the gene or gene product, on the basis of PR isoform-specificity, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone. Such embodiments have generally been described above. [0123]
  • In one aspect of this method of the present invention, the method preferably includes the detection of any one or more of the following genes: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family). These genes are of particular interest when one of the tissue types is the breast. [0124]
  • In another aspect of this method of the present invention, the method preferably includes the detection of any one or more of the following genes: growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family). [0125]
  • The profile of genes provided as a result of the screening of the tissue can be used by the patient or physician for decision-making regarding the usefulness of endocrine therapies in general (i.e. oophorectomy, antiestrogens or other SERMs, aromatase inhibitors, or others), or progestational therapy in particular (high dose progestins, antiprogestins or others). The profile can be used to estimate how the disease is likely to respond and progress in any individual patient. Clinical trials can be developed to correlate the relationship between PR-A vs. PR-B regulated genes, and the biological behavior of the tumor. [0126]
  • In addition, if it is determined that one PR isoform is harmful, and the other beneficial, the gene clusters of this invention can be measured or quantified in normal breast or other normal tissues, either frozen or preserved, or in tissue or organelle extracts as described above, either alone or together with other markers (for example BRCA1), and used for genetic counseling. [0127]
  • In addition, one of the key questions that the present invention can address is whether breast tumors that overexpress PR-B or PR-A represent phenotypically different tumor subsets. For example, breast tumors that are identified as “PR-B rich” based on their expression of PR-B specific genes can be further assessed in terms of usual clinical parameters (tumor staging, pathological staging, size, nodal status, metastasis, responsiveness to hormonal and chemotherapies) and compared to parallel tumors that are “PR-A rich”. Without being bound by theory, the present inventors predict that PR-B rich tumors may be larger and more aggressive than PR-A rich tumors. One reason for this is that this invention demonstrates that PR-B strongly and uniquely upregulate two important genes that support angiogenesis: L13720, growth arrest-specific protein (gas6), is increased 23.1 fold; M27436, tissue factor gene, is increased 18.1 fold. Increased angiogenesis promotes tumor growth by increasing the tumor's blood (and nutrient) supply. This is one example of the hypotheses that can be raised and tested, based on the new information revealed by this invention. [0128]
  • In one aspect of this embodiment of the invention, the profiling of genes can be extended to other tissue types and/or other genes. For example, as discussed above, using the guidance provided herein, it is within the ability of those of skill in the art to screen other tissue types for the presence or absence of the genes regulated by PR in breast tissue, and/or to perform a de novo screening assay for the identification of genes regulated by PR in another tissue, to develop gene expression profiles for use in screening for tissue specific ligands. One of skill in the art can now look to see if a given gene that is regulated by PR in breast is also regulated by PR in another tissue type. Moreover, the 4 breast cancer cell lines described in Example 1 can be used to screen other gene arrays, including arrays of expressed sequence tags, to discover additional novel PR-A vs. PR-B regulated genes. The procedure used to produce these cells can be extended to cells from other tissue sources (e.g., the uterus), and new PR-A and PR-B regulated genes can be identified for these tissue sources. Additional applications of the present invention include screening for genes that are regulated by PRs in a ligand-independent manner. The extension of the gene profiles to other tissue types will allow for the development of a variety of diagnostic assays in other tissues and for diseases related to such other tissues, as well as the identification of additional targets for therapeutic strategies. [0129]
  • Another embodiment of the present invention relates to a plurality of polynucleotides for the detection of the expression of genes regulated by progesterone receptors in breast tissue. The plurality of polynucleotides consists of polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that are regulated by progesterone receptors, and is therefore distinguished from previously known nucleic acid arrays and primer sets. The plurality of polynucleotides within the above-limitation includes at least one or more, but is not limited to one or more, polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes identified by the present inventors. Such genes are selected from: (a) at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1; (b) at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2; (c) at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3; (d) at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4; (e) at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5; (f) at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and, (g) at least one gene that is regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7. [0130]
  • In one embodiment, it is contemplated that additional genes that are not regulated by progesterone receptors can be added to the plurality of polynucleotides. Such genes would not be random genes, or large groups of unselected human genes, as are commercially available now, but rather, would be specifically selected to complement the sets of progesterone receptor-regulated genes identified by the present invention. For example, one of skill in the art may wish to add to the above-described plurality of genes one or more genes that are of relevance because they are expressed by a particular tissue of interest (e.g., breast tissue), are associated with a particular disease or condition of interest (e.g., breast cancer), or are associated with a particular cell, tissue or body function (e.g., angiogenesis). The development of additional pluralities of polynucleotides (and antibodies, as disclosed below), which include both the above-described plurality and such additional selected polynucleotides, is explicitly contemplated by the present invention. [0131]
  • In one embodiment, the plurality of polynucleotides further comprises at least one polynucleotide probe that is complementary to RNA transcripts, or nucleotides derived therefrom, of at least one gene chosen from the genes in Table 8. In another embodiment, the plurality of polynucleotides comprises polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of particular subsets of the genes disclosed in the present invention. For example, one of skill in the art may wish to design pluralities of polynucleotides on the basis of the function of the gene or gene product, on the basis of a tissue-type that expresses a PR, on the basis of PR isoform specificity, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone. Such embodiments have generally been described above. [0132]
  • According to the present invention, a plurality of polynucleotides refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of polynucleotides, including at least 100, 500, 1000, 10⁴, 10⁵, or at least 10⁶ or more polynucleotides. [0133]
  • In accordance with the present invention, an isolated polynucleotide, or an isolated nucleic acid molecule, is a nucleic acid molecule that has been removed from its natural milieu (i.e., that has been subject to human manipulation), its natural milieu being the genome or chromosome in which the nucleic acid molecule is found in nature. As such, “isolated” does not necessarily reflect the extent to which the nucleic acid molecule has been purified, but indicates that the molecule does not include an entire genome or an entire chromosome in which the nucleic acid molecule is found in nature. The polynucleotides useful in the plurality of polynucleotides of the present invention are typically a portion of a gene of the present invention that is suitable for use as a hybridization probe or PCR primer for the identification of a full-length gene (or portion thereof) in a given sample (e.g., a cell sample). An isolated nucleic acid molecule can include a gene or a portion of a gene (e.g., the regulatory region or promoter), for example, to produce a reporter construct according to the present invention. An isolated nucleic acid molecule that includes a gene is not a fragment of a chromosome that includes such gene, but rather includes the coding region and regulatory regions associated with the gene, but no additional genes naturally found on the same chromosome. An isolated nucleic acid molecule can also include a specified nucleic acid sequence flanked by (i.e., at the 5′ and/or the 3′ end of the sequence) additional nucleic acids that do not normally flank the specified nucleic acid sequence in nature (i.e., heterologous sequences). An isolated nucleic acid molecule can include DNA, RNA (e.g., mRNA), or derivatives of either DNA or RNA (e.g., cDNA). 
Although the phrase “nucleic acid molecule” primarily refers to the physical nucleic acid molecule and the phrase “nucleic acid sequence” primarily refers to the sequence of nucleotides on the nucleic acid molecule, the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein. Preferably, an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis. If the polynucleotide is an oligonucleotide probe, the probe preferably ranges from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length. [0134]
  • In one embodiment, the polynucleotide probes are conjugated to detectable markers. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Preferably, the polynucleotide probes are immobilized on a substrate. [0135]
  • In one embodiment, the polynucleotide probes are hybridizable array elements in a microarray or high density array. Nucleic acid arrays are well known in the art and are described for use in comparing expression levels of particular genes of interest, for example, in U.S. Pat. No. 6,177,248, which is incorporated herein by reference in its entirety. Nucleic acid arrays are suitable for quantifying small variations in expression levels of a gene in the presence of a large population of heterogeneous nucleic acids. Knowing the identity of the downstream genes of the present invention, nucleic acid arrays can be fabricated either by de novo synthesis on a substrate or by spotting or transporting nucleic acid sequences onto specific locations of substrate. Nucleic acids are purified and/or isolated from biological materials, such as a bacterial plasmid containing a cloned segment of sequence of interest. It is noted that all of the genes identified by the present invention have been previously sequenced, at least in part, such that oligonucleotides suitable for the identification of such nucleic acids can be produced. The database accession number for each of the genes identified by the present inventors is provided in the tables of the invention. Suitable nucleic acids are also produced by amplification of template, such as by polymerase chain reaction or in vitro transcription. [0136]
  • Synthesized oligonucleotide arrays are particularly preferred for this aspect of the invention. Compared with other methods, oligonucleotide arrays have numerous advantages, such as efficiency of production, reduced intra- and inter-array variability, increased information content and high signal-to-noise ratio. [0137]
  • One of skill in the art will appreciate that an enormous number of array designs are suitable for the practice of this invention. An array will typically include a number of probes that specifically hybridize to the sequences of interest. In addition, in a preferred embodiment, the array will include one or more control probes. The high-density array chip includes “test probes.” Test probes could be oligonucleotides that range from about 5 to about 45 or 5 to about 500 nucleotides, more preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 40 nucleotides in length. In other particularly preferred embodiments the probes are 20 or 25 nucleotides in length. In other preferred embodiments, test probes are double- or single-stranded DNA sequences. DNA sequences are isolated or cloned from natural sources or amplified from natural sources using natural nucleic acids as templates, or produced synthetically. These probes have sequences complementary to particular subsequences of the genes whose expression they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect. [0138]
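  • As an illustration only, the relationship between a target sequence and test probes complementary to its subsequences can be sketched as follows. The function names, the 5-nucleotide step between windows, and the example sequence are assumptions introduced for this sketch and are not taken from the specification; only the 25-nucleotide probe length and the 5 to 500 nucleotide range come from the text above.

```python
# Minimal sketch: derive fixed-length probes complementary to windows of a target.
# All names and the tiling step are hypothetical choices for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target: str, length: int = 25, step: int = 5) -> list[str]:
    """Return probes complementary to successive windows of the target.

    Each probe is the reverse complement of a `length`-nucleotide window,
    so it can hybridize to that subsequence of the target strand.
    """
    if not 5 <= length <= 500:
        raise ValueError("probe length outside the 5-500 nucleotide range")
    if step < 1:
        raise ValueError("step must be at least 1")
    target = target.upper()
    return [
        reverse_complement(target[i:i + length])
        for i in range(0, len(target) - length + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical 40-nucleotide target, not a sequence from the tables.
    example_target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAGC"
    for probe in tile_probes(example_target, length=25, step=5):
        print(probe)
```

  • A real array design would also screen candidate probes for cross-hybridization and melting temperature, which this sketch does not attempt.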
  • Another embodiment of the present invention relates to a plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated by progesterone receptors in breast tissue. The plurality of antibodies, or antigen binding fragments thereof, consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated by progesterone receptors. In addition, the plurality of antibodies, or antigen binding fragments thereof, comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes selected from the group consisting of: (a) genes that are selectively upregulated by PR-A chosen from genes in Table 1; (b) genes that are selectively downregulated by PR-A chosen from genes in Table 2; (c) genes that are selectively upregulated by PR-B chosen from genes in Table 3; (d) genes that are selectively downregulated by PR-B chosen from genes in Table 4; (e) genes that are upregulated or downregulated by both PR-A and PR-B chosen from genes in Table 5; (f) genes that are reciprocally regulated by PR-A and PR-B chosen from genes in Table 6; and, (g) genes that are regulated by one of the PR-A or the PR-B, wherein regulation of the gene is altered when the other of the PR-A or PR-B is expressed by the same cell, chosen from genes in Table 7. [0139]
  • In one aspect, the plurality of antibodies, or antigen binding fragments thereof, further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by a gene chosen from the genes in Table 8. [0140]
  • The plurality of antibodies, or antigen binding fragments thereof, further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by one or more of a particular subset of the genes disclosed in the present invention. For example, one of skill in the art may wish to design pluralities of antibodies on the basis of the function of the gene product, on the basis of tissue-type, on the basis of PR isoform specificity, on the basis of association with a particular condition or disease, or on the basis of the change in the level of expression of the gene when in the presence of progesterone. Such embodiments have generally been described above. [0141]
  • According to the present invention, a plurality of antibodies, or antigen binding fragments thereof, refers to at least 2, and more preferably at least 3, and more preferably at least 4, and more preferably at least 5, and more preferably at least 6, and more preferably at least 7, and more preferably at least 8, and more preferably at least 9, and more preferably at least 10, and so on, in increments of one, up to any suitable number of antibodies, or antigen binding fragments thereof, including at least 100, 500, or at least 1000 antibodies, or antigen binding fragments thereof. [0142]
  • According to the present invention, the phrase “selectively binds to” refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins (e.g., a protein encoded by a PR regulated gene according to the present invention). The phrase “selectively binds” with regard to antibodies and antigen binding fragments thereof, has been defined previously herein. [0143]
  • Limited digestion of an immunoglobulin with a protease may produce two fragments. An antigen binding fragment is referred to as an Fab, an Fab′, or an F(ab′)2 fragment. A fragment lacking the ability to bind to antigen is referred to as an Fc fragment. An Fab fragment comprises one arm of an immunoglobulin molecule containing a L chain (VL+CL domains) paired with the VH region and a portion of the CH region (CH1 domain). An Fab′ fragment corresponds to an Fab fragment with part of the hinge region attached to the CH1 domain. An F(ab′)2 fragment corresponds to two Fab′ fragments that are normally covalently linked to each other through a disulfide bond, typically in the hinge regions. [0144]
  • Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees. Whole antibodies of the present invention can be polyclonal or monoclonal. Alternatively, functional equivalents of whole antibodies, such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab′, or F(ab′)2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention. [0145]
  • Generally, in the production of an antibody, a suitable experimental animal, such as, for example, but not limited to, a rabbit, a sheep, a hamster, a guinea pig, a mouse, a rat, or a chicken, is exposed to an antigen against which an antibody is desired. Typically, an animal is immunized with an effective amount of antigen that is injected into the animal. An effective amount of antigen refers to an amount needed to induce antibody production by the animal. The animal's immune system is then allowed to respond over a pre-determined period of time. The immunization process can be repeated until the immune system is found to be producing antibodies to the antigen. In order to obtain polyclonal antibodies specific for the antigen, serum is collected from the animal that contains the desired antibodies (or in the case of a chicken, antibody can be collected from the eggs). Such serum is useful as a reagent. Polyclonal antibodies can be further purified from the serum (or eggs) by, for example, treating the serum with ammonium sulfate. [0146]
  • Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975). For example, B lymphocytes are recovered from the spleen (or any suitable tissue) of an immunized animal and then fused with myeloma cells to obtain a population of hybridoma cells capable of continual growth in suitable culture medium. Hybridomas producing the desired antibody are selected by testing the ability of the antibody produced by the hybridoma to bind to the desired antigen. [0147]
  • Finally, PR-regulated genes of this invention, or their RNA or protein products, can serve as targets for therapeutic strategies. For example, neutralizing antibodies could be directed against one of the protein products of a selected gene, expressed on the surface of a tumor cell. [0148]
  • One embodiment of this aspect of the invention relates to a method to regulate the expression of a gene selected from the group consisting of any one or more of the genes in Tables 1-7. The method includes administering to a cell that expresses a progesterone receptor a compound selected from the group consisting of: progesterone, a progestin, and an antiprogestin, wherein the compound is effective to regulate the expression of the gene(s) in Tables 1-7. In a preferred embodiment, the gene is selected from the group consisting of genes that are listed in Table 16 (known to be involved in breast cancer or mammary gland development), but not in Table 8 (known to be regulated by progesterone). Such genes include, e.g., growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family). In this aspect of the invention, the cell that expresses a progesterone receptor is in the breast tissue of a patient that has, or is at risk of developing, breast cancer. In addition to administering a progestin to the cell, these genes can serve as targets for the development of other therapeutic methods. [0149]
  • Once a suitable therapeutic compound, including a progesterone receptor agonist or antagonist, is identified using the methods and genes of the present invention, a composition can be formulated. A composition, and particularly a therapeutic composition, of the present invention generally includes the therapeutic compound (e.g., the progesterone receptor regulatory ligand) and a carrier, and preferably, a pharmaceutically acceptable carrier. According to the present invention, a “pharmaceutically acceptable carrier” includes pharmaceutically acceptable excipients and/or pharmaceutically acceptable delivery vehicles, which are suitable for use in administration of the composition to a suitable in vitro, ex vivo or in vivo site. A suitable in vitro, in vivo or ex vivo site is preferably a cell that expresses a progesterone receptor. In some embodiments, a suitable site for delivery is a site of inflammation, a site of a tumor, a site of a transplanted graft, or a site of any other disease or condition in which progesterone receptor regulation, or modulation of genes regulated by a PR, can be beneficial, particularly given the knowledge of the genes regulated by PR according to the invention. Preferred pharmaceutically acceptable carriers are capable of maintaining a steroidal or non-steroidal compound, a protein, a peptide, nucleic acid molecule or mimetic (drug) according to the present invention in a form that, upon arrival of the steroidal or non-steroidal compound, protein, peptide, nucleic acid molecule or mimetic at the cell target in a culture or in a patient, the steroidal or non-steroidal compound, protein, peptide, nucleic acid molecule or mimetic is capable of interacting with its target (e.g., a naturally occurring PR or a nucleic acid or protein product of a PR-regulated gene). [0150]
  • Suitable excipients of the present invention include excipients or formularies that transport or help transport, but do not specifically target a composition to a cell (also referred to herein as non-targeting carriers). Examples of pharmaceutically acceptable excipients include, but are not limited to water, phosphate buffered saline, Ringer's solution, dextrose solution, serum-containing solutions, Hank's solution, other aqueous physiologically balanced solutions, oils, esters and glycols. Aqueous carriers can contain suitable auxiliary substances required to approximate the physiological conditions of the recipient, for example, by enhancing chemical stability and isotonicity. [0151]
  • Suitable auxiliary substances include, for example, sodium acetate, sodium chloride, sodium lactate, potassium chloride, calcium chloride, and other substances used to produce phosphate buffer, Tris buffer, and bicarbonate buffer. Auxiliary substances can also include preservatives, such as thimerosal, m- or o-cresol, formalin and benzyl alcohol. Compositions of the present invention can be sterilized by conventional methods and/or lyophilized. [0152]
  • One type of pharmaceutically acceptable carrier includes a controlled release formulation that is capable of slowly releasing a composition of the present invention into a patient or culture. As used herein, a controlled release formulation comprises a compound of the present invention (e.g., a protein (including homologues), a drug, an antibody, a nucleic acid molecule, or a mimetic) in a controlled release vehicle. Suitable controlled release vehicles include, but are not limited to, biocompatible polymers, other polymeric matrices, capsules, microcapsules, microparticles, bolus preparations, osmotic pumps, diffusion devices, liposomes, lipospheres, and transdermal delivery systems. Other carriers of the present invention include liquids that, upon administration to a patient, form a solid or a gel in situ. Preferred carriers are also biodegradable (i.e., bioerodible). When the compound is a recombinant nucleic acid molecule, suitable delivery vehicles include, but are not limited to liposomes, viral vectors or other delivery vehicles, including ribozymes. Natural lipid-containing delivery vehicles include cells and cellular membranes. Artificial lipid-containing delivery vehicles include liposomes and micelles. A delivery vehicle of the present invention can be modified to target to a particular site in a patient, thereby targeting and making use of a compound of the present invention at that site. Suitable modifications include manipulating the chemical formula of the lipid portion of the delivery vehicle and/or introducing into the vehicle a targeting agent capable of specifically targeting a delivery vehicle to a preferred site, for example, a preferred cell type. Other suitable delivery vehicles include gold particles, poly-L-lysine/DNA-molecular conjugates, and artificial chromosomes. [0153]
  • A pharmaceutically acceptable carrier which is capable of targeting is herein referred to as a “delivery vehicle.” Delivery vehicles of the present invention are capable of delivering a composition of the present invention to a target site in a patient. A “target site” refers to a site in a patient to which one desires to deliver a composition. For example, a target site can be any cell which is targeted by direct injection or delivery using liposomes, viral vectors or other delivery vehicles, including ribozymes and antibodies. Examples of delivery vehicles include, but are not limited to, artificial and natural lipid-containing delivery vehicles, viral vectors, and ribozymes. Natural lipid-containing delivery vehicles include cells and cellular membranes. Artificial lipid-containing delivery vehicles include liposomes and micelles. A delivery vehicle of the present invention can be modified to target to a particular site in a subject, thereby targeting and making use of a compound of the present invention at that site. Suitable modifications include manipulating the chemical formula of the lipid portion of the delivery vehicle and/or introducing into the vehicle a compound capable of specifically targeting a delivery vehicle to a preferred site, for example, a preferred cell type. Specifically, targeting refers to causing a delivery vehicle to bind to a particular cell by the interaction of the compound in the vehicle to a molecule on the surface of the cell. Suitable targeting compounds include ligands capable of selectively (i.e., specifically) binding another molecule at a particular site. Examples of such ligands include antibodies, antigens, receptors and receptor ligands. Manipulating the chemical formula of the lipid portion of the delivery vehicle can modulate the extracellular or intracellular targeting of the delivery vehicle. 
For example, a chemical can be added to the lipid formula of a liposome that alters the charge of the lipid bilayer of the liposome so that the liposome fuses with particular cells having particular charge characteristics. [0154]
  • One preferred delivery vehicle of the present invention is a liposome. A liposome is capable of remaining stable in an animal for a sufficient amount of time to deliver a nucleic acid molecule (e.g., an anti-sense nucleic acid molecule that hybridizes to a nucleic acid sequence in a gene for which inhibition is desired) to a preferred site in the animal. A liposome, according to the present invention, comprises a lipid composition that is capable of delivering a nucleic acid molecule described in the present invention to a particular, or selected, site in a patient. A liposome according to the present invention comprises a lipid composition that is capable of fusing with the plasma membrane of the targeted cell to deliver a nucleic acid molecule into a cell. Suitable liposomes for use with the present invention include any liposome. Preferred liposomes of the present invention include those liposomes commonly used in, for example, gene delivery methods known to those of skill in the art. More preferred liposomes comprise liposomes having a polycationic lipid composition and/or liposomes having a cholesterol backbone conjugated to polyethylene glycol. Complexing a liposome with a nucleic acid molecule of the present invention can be achieved using methods standard in the art. [0155]
  • A liposome delivery vehicle is preferably capable of remaining stable in a patient for a sufficient amount of time to deliver a nucleic acid molecule or other compound of the present invention to a preferred site in the patient (i.e., a target cell). A liposome delivery vehicle of the present invention is preferably stable in the patient into which it has been administered for at least about 30 minutes, more preferably for at least about 1 hour and even more preferably for at least about 24 hours. A preferred liposome delivery vehicle of the present invention is from about 0.01 microns to about 1 micron in size. [0156]
  • Another preferred delivery vehicle comprises a viral vector. A viral vector includes an isolated nucleic acid molecule useful in the present invention, in which the nucleic acid molecules are packaged in a viral coat that allows entrance of DNA into a cell. A number of viral vectors can be used, including, but not limited to, those based on alphaviruses, poxviruses, adenoviruses, herpesviruses, lentiviruses, adeno-associated viruses and retroviruses. [0157]
  • A composition which includes an agonist or antagonist of a progesterone receptor can be delivered to a cell culture or patient by any suitable method. Selection of such a method will vary with the type of compound being administered or delivered (i.e., steroidal or non-steroidal compound, protein, peptide, nucleic acid molecule, or mimetic), the mode of delivery (i.e., in vitro, in vivo, ex vivo) and the goal to be achieved by administration/delivery of the compound or composition. According to the present invention, an effective administration protocol (i.e., administering a composition in an effective manner) comprises suitable dose parameters and modes of administration that result in delivery of a composition to a desired site (i.e., to a desired cell) and/or in the desired regulatory event (e.g., regulation of the PR receptor biological activity or of the biological activity of a gene that is regulated by PR). [0158]
  • Administration routes include in vivo, in vitro and ex vivo routes. In vivo routes include, but are not limited to, oral, nasal, intratracheal injection, inhaled, transdermal, rectal, and parenteral routes. Preferred parenteral routes can include, but are not limited to, subcutaneous, intradermal, intravenous, intramuscular and intraperitoneal routes. Intravenous, intraperitoneal, intradermal, subcutaneous and intramuscular administrations can be performed using methods standard in the art. Aerosol (inhalation) delivery can also be performed using methods standard in the art (see, for example, Stribling et al., Proc. Natl. Acad. Sci. USA 89:11277-11281, 1992, which is incorporated herein by reference in its entirety). Oral delivery can be performed by complexing a therapeutic composition of the present invention to a carrier capable of withstanding degradation by digestive enzymes in the gut of an animal. Examples of such carriers include plastic capsules or tablets, such as those known in the art. Direct injection techniques are particularly useful for suppressing graft rejection by, for example, injecting the composition into the transplanted tissue, or for site-specific administration of a compound, such as at the site of a tumor. Ex vivo refers to performing part of the regulatory step outside of the patient, such as by transfecting a population of cells removed from a patient with a recombinant molecule comprising a nucleic acid sequence encoding a protein according to the present invention under conditions such that the recombinant molecule is subsequently expressed by the transfected cell, and returning the transfected cells to the patient. 
In vitro and ex vivo routes of administration of a composition to a culture of host cells can be accomplished by a method including, but not limited to, transfection, transformation, electroporation, microinjection, lipofection, adsorption, protoplast fusion, use of protein carrying agents, use of ion carrying agents, use of detergents for cell permeabilization, and simply mixing (e.g., combining) a compound in culture with a target cell. [0159]
  • In the method of the present invention, a therapeutic compound, including agonists and antagonists of progesterone receptors, as well as compositions comprising such compounds, can be administered to any organism, and particularly, to any member of the Vertebrate class, Mammalia, including, without limitation, primates, rodents, livestock and domestic pets. Livestock include mammals to be consumed or that produce useful products (e.g., sheep for wool production). Preferred mammals to protect include humans. Typically, it is desirable to modulate (e.g., regulate (up or down)) progesterone receptor biological activity or the biological activity of a gene regulated by a PR, to obtain a therapeutic benefit in a patient. A therapeutic benefit is not necessarily a cure for a particular disease or condition, but rather, preferably encompasses a result which can include alleviation of the disease or condition, elimination of the disease or condition, reduction of a symptom associated with the disease or condition, prevention or alleviation of a secondary disease or condition resulting from the occurrence of a primary disease or condition, and/or prevention of the disease or condition. As used herein, the phrase “protected from a disease” refers to reducing the symptoms of the disease; reducing the occurrence of the disease, and/or reducing the severity of the disease. Protecting a patient can refer to the ability of a composition of the present invention, when administered to a patient, to prevent a disease from occurring and/or to cure or to alleviate disease symptoms, signs or causes. As such, to protect a patient from a disease includes both preventing disease occurrence (prophylactic treatment) and treating a patient that has a disease (therapeutic treatment) to reduce the symptoms of the disease. A beneficial effect can easily be assessed by one of ordinary skill in the art and/or by a trained clinician who is treating the patient. 
The term, “disease” refers to any deviation from the normal health of a mammal and includes a state when disease symptoms are present, as well as conditions in which a deviation (e.g., infection, gene mutation, genetic defect, etc.) has occurred, but symptoms are not yet manifested. [0160]
  • The following examples are provided for the purpose of illustration and are not intended to limit the scope of the present invention. [0161]
  • EXAMPLES
  • Example 1
  • The following example describes the identification of genes regulated by progesterone receptors.
  • Materials and Methods [0162]
  • Cell Culture [0163]
  • The wild-type PR-positive T47Dco breast cancer cell line and its clonal derivatives T47D-Y, T47D-YA and T47D-YB have been described (Horwitz et al., Cell 28, 633-42 (1982); Sartorius et al., Cancer Res. 54, 3868-3877 (1994)). Briefly, cells are routinely cultured in 75 cm² plastic flasks and incubated in 5% CO2 at 37° C. in a humidified environment. The stock medium consists of Eagle's Minimum Essential Medium with Earle's salts (MEM), containing L-glutamine (292 mg/liter) buffered with sodium bicarbonate (2.2 g/liter), insulin (6 ng/ml) and 5% fetal bovine serum (Hyclone, Logan, Utah) with G418. [0164]
  • Arrays [0165]
  • Atlas™ Human cDNA Expression Array. T47D-YA and T47D-YB breast cancer cells were grown to mid-confluence in Minimal Essential Medium containing 5% Fetal Calf Serum, then treated with either 10 nM progesterone dissolved in ethanol for 6 or 12 hours, or ethanol alone. This yielded 4 treatment types. Total RNA was prepared from the 4 sets of cells using guanidinium isothiocyanate, polyA+ RNA was purified with the Oligotex mRNA Kit (Qiagen, Valencia, Calif.), and ³²P-labeled cDNA was synthesized from 1 μg of each sample using SuperScriptII reverse transcriptase (Gibco BRL Life Technologies, Gaithersburg, Md.). Labeled probes were separately hybridized to Atlas™ Human cDNA Expression Arrays (Clontech, Palo Alto, Calif.) consisting of nylon membranes onto which 588 cDNA fragments encoding known proteins were spotted in duplicate. After a high stringency wash, hybridization was detected by autoradiography and phosphoimaging on a Molecular Dynamics PhosphoImager™ (Molecular Dynamics, Sunnyvale, Calif.). Data were analyzed using Atlas™ Image 1.0, and normalized to signals from control housekeeping genes on the same filter. For selected genes, progesterone inducibility and PR-isoform specificity were confirmed by northern blotting, reverse transcriptase-polymerase chain reaction (RT-PCR), and/or western blotting. [0166]
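  • The normalization to housekeeping genes mentioned above can be illustrated with a short sketch. The algorithm actually used by Atlas™ Image 1.0 is not described in this document, so the approach below (dividing each signal by the mean housekeeping signal on the same filter) is an assumption, as are the function name, gene labels and signal values.

```python
# Minimal sketch, assuming normalization by the mean housekeeping signal.
# This is not the documented Atlas Image algorithm; names and values are hypothetical.
from statistics import mean


def normalize_to_housekeeping(
    signals: dict[str, float], housekeeping: list[str]
) -> dict[str, float]:
    """Divide every non-housekeeping signal by the mean housekeeping signal."""
    reference = mean(signals[gene] for gene in housekeeping)
    if reference <= 0:
        raise ValueError("housekeeping reference signal must be positive")
    return {
        gene: value / reference
        for gene, value in signals.items()
        if gene not in housekeeping
    }


if __name__ == "__main__":
    # Hypothetical spot intensities from one filter.
    filter_signals = {"GAPDH": 100.0, "ACTB": 300.0, "gene_x": 400.0}
    print(normalize_to_housekeeping(filter_signals, ["GAPDH", "ACTB"]))
```

  • Normalizing each filter separately in this way makes signals from the progesterone-treated and ethanol-treated filters comparable despite differences in labeling efficiency or exposure.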
  • Affymetrix GeneChip™ Array. T47D-Y, T47D-YA and T47D-YB breast cancer cells were grown to mid-confluence in Minimal Essential Medium containing 5% Fetal Calf Serum, then treated with either 10 nM progesterone dissolved in ethanol for 6 hours, or ethanol alone. This yielded 6 treatment types. Total RNA and polyA+ RNA were prepared from the 6 sets, as described above. First strand cDNA was synthesized from 2 μg of polyA+ RNA using SSII Reverse Transcriptase, the T7dT 24mer, and other components of the Superscript Choice system (Gibco BRL Life Technologies, Gaithersburg, Md.). Following second strand synthesis, the DNA was purified by phenol/chloroform extraction and precipitation, and resuspended in 12 μl DEPC-treated RNase-free water. Five μl were used in an in vitro transcription reaction using the Enzo BioArray™ High Yield Transcript Labeling Kit (Affymetrix, Inc., Santa Clara, Calif.), to synthesize RNA transcripts and incorporate biotin-labeled ribonucleotides. Unincorporated nucleotides were removed with RNeasy affinity columns (Qiagen, Valencia, Calif.). Purified, biotinylated cRNAs were quantified, and 20 μg were subjected to a fragmentation reaction by incubation at 94° C. for 35 min (Affymetrix™ protocol 700218) to randomly generate fragments ranging from 35 to 200 bases. HuGeneFL Array™ chips consisting of 5,600 full-length human genes from UniGene, GenBank and TIGR databases were used for hybridization. Thirty μl of fragmented cRNA were added to a hybridization mixture (100 mM MES, 1 M NaCl, 20 mM EDTA, and 0.01% Tween 20) and control oligonucleotide B2 and control cRNA cocktail, as described in the Affymetrix™ protocol. Hybridizations and subsequent washes were done in the GeneChip Hybridization Oven and Fluidics Station 400. After overnight hybridization, the solutions were removed, and the chips were washed and stained with streptavidin-phycoerythrin. DNA chips were read at a resolution of 6 μm with a Hewlett-Packard GeneArray Scanner. [0167]
  • Each gene on the chip is represented by perfectly matched (PM) and mismatched (MM) oligonucleotides from 16-20 regions of each gene. The mismatched probes act as specificity controls, which allow direct subtraction of background and cross-hybridization signals. The number of instances in which the PM hybridization signal is larger than the MM signal is computed along with the average of the logarithm of the PM:MM ratio (after background subtraction) for each probe set. These values were used to arrive at a matrix-based decision concerning the presence or absence of an RNA transcript. Detailed protocols for data analyses of Affymetrix microarrays and extensive documentation of the sensitivity and quantitative aspects of the method have been described. Briefly, the first level of analysis, including the “present” or “absent” call and pairwise comparisons, was done using GeneChip 3.1 Expression Analysis Program™ (Affymetrix, Inc., Santa Clara, Calif.). A second level of analysis to identify clusters of genes regulated by progesterone via PR-A, PR-B or both was performed using GeneSpring™ version 3.0 (Silicon Genetics, San Carlos, Calif.). The present inventors used customized software capable of comparing multiple experimental pairwise comparisons (minus versus plus progesterone) and multiple control comparisons (all minus hormone samples and all plus hormone samples) to compare fold change minus versus plus hormone as compared to the fold change between controls. This served as a measure of the variability between samples. As a third level of analysis, k-means clustering was performed using GeneSpring™ version 3.2.12 (Silicon Genetics, San Carlos, Calif.) to identify patterns of gene regulation in PR-A, PR-B, or PR-negative cells treated with or without progesterone. [0168]
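  • The two per-probe-set quantities described above (the count of probe pairs with PM greater than MM, and the average logarithm of the PM:MM ratio after background subtraction) can be sketched as follows. The function name, the base-10 logarithm, the handling of non-positive values and the example intensities are assumptions made for illustration; the thresholds of the decision matrix used by the GeneChip software are not given in this document and are therefore not modeled.

```python
# Minimal sketch of the two probe-set statistics named in the text.
# Names, log base and example values are hypothetical.
import math


def probe_set_stats(
    pm: list[float], mm: list[float], background: float = 0.0
) -> tuple[int, float]:
    """Return (number of pairs with PM > MM, mean log10(PM/MM)).

    Background is subtracted from both signals first. Pairs in which either
    corrected signal is not positive are skipped for the log ratio, because
    the logarithm is undefined there.
    """
    if len(pm) != len(mm):
        raise ValueError("PM and MM lists must have the same length")
    pairs = [(p - background, m - background) for p, m in zip(pm, mm)]
    n_pm_greater = sum(1 for p, m in pairs if p > m)
    log_ratios = [math.log10(p / m) for p, m in pairs if p > 0 and m > 0]
    mean_log_ratio = (
        sum(log_ratios) / len(log_ratios) if log_ratios else float("nan")
    )
    return n_pm_greater, mean_log_ratio


if __name__ == "__main__":
    # Hypothetical intensities for a three-pair probe set.
    count, avg = probe_set_stats([110.0, 1010.0, 20.0], [20.0, 110.0, 20.0], background=10.0)
    print(count, avg)
```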
  • Selected genes, i.e., ones that were substantially regulated or are of particular biological interest, have been confirmed by northern and/or RT-PCR, and/or by western blotting. Additionally, the promoters of several genes of interest have been cloned, linked upstream of a luciferase reporter, and tested for their ability to be transcriptionally regulated by PR-A vs. PR-B after transfection into HeLa cervicocarcinoma cells, followed by progesterone treatment of the cells. In the examples tested, regulation by PR-A vs. PR-B using the synthetic promoter/reporter constructs, mimicked the regulation of the endogenous genes in the breast cancer cells, supporting the use of these approaches for drug discovery. [0169]
  • RT-PCR and Northern Blot Analysis [0170]
  • RT-PCR amplifications of target sequences were performed with co-amplification of an internal control sequence (β2MG or GAPDH) using: β2MG forward primer: 5′-ATCCAGCGTACTCCAAAGATTC-3′ (SEQ ID NO:1); β2MG reverse primer: 5′-TCCTTGCTGAAAGACAAGTCTG-3′ (SEQ ID NO:2); resulting in a product of 178 bp. GAPDH primers yielded a product of 485 bp. GAPDH, Integrin α6, and bcl-x cDNA primer sequences were obtained from Clontech. Total RNA was prepared from T47DY-A or -B cells as described above. One μg of RNA was mixed with 0.4 μM random hexamers and heated to 65° C. for five min. (Perkin Elmer). 1×PCR buffer (5 mM MgCl2), 20 U RNAse inhibitor, 4 mM dNTPs, and 125 U MMLV reverse transcriptase were added and tubes were incubated at 42° C. for 1 hour. Five μl of the cDNA synthesis reactions were added to 1×PCR buffer, 1.8 mM MgCl2, 100 mM dNTP blend, and 60 pmoles of specific primers, and incubated with 5 U AmpliTaq DNA polymerase at 94° C. for 30 s, 65° C. for 45 s, and 68° C. for 1 min for 16-18 cycles (cycle number was chosen to be in the linear range of amplification for each product). All PCR reagents were purchased from Perkin Elmer, Foster City, Calif. Five μl of samples were resolved on a 2% agarose gel, and Southern blots were performed in 0.4M. Blots were prehybridized in Rapid-hyb (Amersham) for 1 h at 65° C. cDNA probes were generated by RT-PCR and radioactively labeled using the MegaPrime DNA labeling system (Amersham) and ³²P-αdCTP. Blots were probed for 2 h to overnight at 65° C. Blots were washed and exposed to autoradiography film or a phosphoimaging screen and then quantified using ImageQuant (Molecular Dynamics). In some cases the RT-PCR products could be visualized on an ethidium bromide stained gel when amplified in the linear range of production; in these cases Southern blotting and hybridizing with a labeled probe was unnecessary and products were instead directly quantitated. In some cases Northern blot analysis was used to detect transcripts. 
In these cases 25 μg of total RNA was electrophoresed in a formaldehyde agarose gel, transferred to a Hybond nylon membrane (Amersham), and hybridized sequentially with cDNA inserts for specific genes, generated by random priming of PCR products (generated as above) with ³²P-dCTP using the Mega-Prime DNA Labeling Kit (Amersham). Membranes were then probed with fragments of housekeeping genes (either β2MG or GAPDH). [0171]
  • Transcriptional Assay: [0172]
  • HeLa cells plated at 4×10^5 cells per 10 cm dish in MEM supplemented with 5% fetal bovine serum were transiently transfected with 100 ng of HPR1 (PR-B in pSG5) or HPR2 (PR-A in pSG5), 1.2 μg of the integrin α6 promoter (−740) in pGL3-Basic vector plasmid (gift from Dr. Sohei Kitazawa, Kobe University School of Medicine, Department of Pathology), 1.2 μg of β-galactosidase expression plasmid pCH110, and 5.5 μg BSM, and were then treated with 10 nM progesterone or ethanol vehicle for 24 hours. [0173]
  • Immunoblots: [0174]
  • For time course treatments with progesterone, cells were plated at 2 million cells per large plate in MEM with supplements described above and were treated with 10 nM progesterone (Sigma). Cells were harvested in RIPA buffer (10 mM sodium phosphate, pH 7.0, 150 mM NaCl, 2 mM EDTA, 1% deoxycholic acid, 1% Nonidet P-40, 0.1% SDS, 0.1% β-mercaptoethanol, 1 mM PMSF, 50 mM sodium fluoride, 200 μM Na3VO4, and one Complete Protease Inhibitor Mixture tablet (Boehringer Mannheim GmbH, Germany) per 50 ml of RIPA buffer), made fresh for each use. Protein extracts were equalized to 150 μg by Bradford assay (Bio-Rad), resolved by SDS-PAGE, and transferred to nitrocellulose. Equivalent protein loading was confirmed by Ponceau S staining. Following incubation with the appropriate primary antibodies and HRP-conjugated secondary antibodies, protein bands were detected by enhanced chemiluminescence (Amersham, Arlington Heights, Ill.). [0175]
  • Results [0176]
  • Gene expression data from Affymetrix HuGeneFL Array™ chips were analyzed using Microarray Suite 4.0 Expression Analysis Program (Affymetrix™). Experimental data from independent triplicate experiments for T47D-YA and T47D-YB cells and duplicate T47D-Y cells treated with or without 10 nM progesterone were analyzed and pairwise comparisons were performed to identify genes that had increased or decreased with addition of hormone. These data were imported into Microsoft Excel and custom formulas were written to identify genes that had repeatedly increased or decreased with hormone in three out of three experiments by at least 1.8 fold, but did not vary more than two fold between control groups. Genes that met these criteria and were up- or downregulated by progesterone in PR-B containing cells are shown in Table 18, while those up- or downregulated by progesterone in PR-A containing cells are shown in Table 19. In both tables fold increases and decreases (negative numbers) upon treatment with progesterone for 6 hrs are indicated. Genes that were below detectable levels and called absent in one sample, but were detectable and called present in the other, are denoted with a tilde beside the fold change. The fold changes indicated with a tilde cannot be compared to those that are not marked with a tilde (indicating they were present in both minus and plus hormone samples) as the fold change was calculated by setting the undetectable gene to background level. Genes in bold in Table 18 are uniquely regulated by progesterone only via PR-B, while those in bold in Table 19 are uniquely regulated by PR-A; those not bolded were regulated in both PR-B and PR-A containing cells. Only genes that were regulated in 3 out of 3 experiments are shown and average fold inductions are given. Genes marked with an asterisk were identified from Atlas™ Human cDNA Expression Arrays (Clontech, Palo Alto, Calif.) 
and those marked by an & symbol were identified as being progesterone regulated using both Atlas™ Human cDNA Expression Arrays and Affymetrix HuGeneFL Array™ chips (Affymetrix, Inc., Santa Clara, Calif.); all others were identified using Affymetrix HuGeneFL Array™ chips (Affymetrix, Inc., Santa Clara, Calif.). The present inventors have categorized genes regulated by progesterone in this study into functional categories based on GeneCard information as well as extensive literature reviews of each gene product (Table 17). Ten of the genes found to be regulated by progesterone in the present study have previously been reported by other groups to be progesterone responsive in either breast cancer cells or other hormone responsive cell types or tissues (Table 8). However, the PR-A and/or PR-B isoform specificity of these genes was unknown prior to the present invention. The independent identification of genes that have previously been reported to be progesterone-regulated serves as an internal control and also demonstrates the sensitivity of this assay, as even genes induced by progesterone as little as 1.9 fold were detected on the arrays. Additionally, 8 of the genes found to be regulated by progesterone in the present study have previously been reported to be involved in either breast cancer or mammary gland development (Table 16). [0177]
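  • The selection criteria described above can be sketched in a few lines of Python. This is an illustrative sketch only: the original analysis used custom Microsoft Excel formulas that are not reproduced in this document, so the function names, argument layout, and example values are hypothetical. In particular, reading "did not vary more than two fold between control groups" as a maximum-to-minimum ratio of the untreated controls is an assumption.

```python
def signed_fold_change(minus, plus, minus_present=True, plus_present=True,
                       background=1.0):
    """Return (fold, approximate) for one gene in one experiment.

    minus / plus: positive expression levels without and with hormone.
    A sample called absent is set to the background level, and the result
    is flagged as approximate (shown with a tilde in the tables).
    Decreases are reported as negative numbers, as in the tables.
    """
    approximate = not (minus_present and plus_present)
    if not minus_present:
        minus = background
    if not plus_present:
        plus = background
    fold = plus / minus if plus >= minus else -(minus / plus)
    return fold, approximate


def classify_gene(fold_changes, control_levels, min_fold=1.8,
                  max_control_ratio=2.0):
    """Return 'up', 'down', or None for one gene.

    fold_changes: signed fold change from each replicate experiment.
    control_levels: untreated control level from each replicate
        (assumed interpretation of the "control groups" criterion).
    """
    # Reject genes whose untreated controls differ by more than two fold.
    if max(control_levels) > max_control_ratio * min(control_levels):
        return None
    # Require the same direction and at least min_fold in every experiment.
    if all(fc >= min_fold for fc in fold_changes):
        return "up"
    if all(fc <= -min_fold for fc in fold_changes):
        return "down"
    return None


# Hypothetical values: a gene called absent without hormone.
fold, approx = signed_fold_change(5.0, 230.0, minus_present=False,
                                  background=10.0)
print(("~" if approx else "") + f"{fold:.1f}")                 # prints "~23.0"
print(classify_gene([2.1, 1.9, 2.4], [100.0, 120.0, 150.0]))   # prints "up"
```

  • Flagging the approximate values separately keeps tilde-marked fold changes from being compared directly with fold changes computed from two detectable signals.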
  • The average differences indicating relative intensities obtained from triplicate experiments from T47D-YA and T47D-YB cell lines and duplicate experiments in the PR-negative T47D-Y cells were entered into GeneSpring™ 3.2.12 (Silicon Genetics, San Carlos, Calif.). To normalize for variation among chips each gene intensity value was normalized to 1 (intensity of gene A on chip X divided by the median of all intensities measured on chip X). To identify patterns of gene expression among cell lines and hormone treatments, k-means clustering was performed. Clustergrams of various patterns of gene regulation were generated. Within these clusters, any one gene can be viewed individually and standard error bars generated from replicate experiments are shown for gene expression levels in cell lines containing either PR-A, PR-B, or no PR, with or without progesterone treatment. A cluster of genes was shown to be upregulated by progesterone in both PR-A and PR-B containing cells, but not in the PR-negative cell line. While most of these genes were upregulated by progesterone treatment more strongly via PR-B, some, such as S100P calcium binding protein, and Grb10 are upregulated equally well via PR-A and PR-B. Upregulation of IkappaBalpha via both receptors was confirmed at the protein level as early as 6 hours, and remained elevated for up to 48 hours in the presence of progesterone (data not shown). Additionally, the gene encoding Ezrin, identified as being progesterone regulated using Atlas™ Human cDNA Expression Arrays probed with RNA from T47D-YA and YB cells left untreated or treated with progesterone for 12 hrs was confirmed to be equally well upregulated by both PR-A and PR-B at 12, 24 and 48 hrs by northern blot analysis (data not shown). [0178]
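  • The per-chip normalization described above (each intensity divided by the median of all intensities on the same chip) can be illustrated with a short Python sketch. The function name and the example intensities are hypothetical, and GeneSpring's own implementation, including its k-means clustering, is not reproduced here.

```python
from statistics import median


def normalize_chip(intensities):
    """Divide every gene intensity on one chip by that chip's median.

    intensities: dict mapping gene name to raw intensity on a single chip.
    After normalization the median gene on the chip has the value 1.
    """
    chip_median = median(intensities.values())
    if chip_median == 0:
        raise ValueError("chip median is zero; cannot normalize")
    return {gene: value / chip_median for gene, value in intensities.items()}


# Hypothetical intensities for three genes on one chip.
chip_x = {"geneA": 200.0, "geneB": 100.0, "geneC": 50.0}
print(normalize_chip(chip_x))
# prints {'geneA': 2.0, 'geneB': 1.0, 'geneC': 0.5}
```

  • Normalizing each chip to its own median makes intensities from different chips comparable before patterns across cell lines and treatments are clustered.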
  • The present inventors have demonstrated that although some genes (and their protein products) are regulated by progesterone through both PR isoforms, many genes are uniquely regulated by either PR-A or PR-B. In the T47D breast cancer cell lines used for the present invention, many more genes were regulated by progesterone through PR-B than through PR-A. However, it remains to be determined whether this situation is reversed in other types of cells or tissues, the endometrium for instance. Data from knock-out mice show that PR-A, but not PR-B, plays an important role in opposing the proliferative effect of estrogen on the endometrium. This is one example of tissue and PR isoform specificity (Mulac-Jericevic et al., Science 289, 1751-4 (2000)). [0179]
  • Many progesterone regulated genes require PR-B as illustrated by Tables 3, 4, 11, 12 and 18. Two examples are Stat5a and C/EBP beta. Their differential upregulation only by PR-B was confirmed by immunoblot at several time-points after progesterone treatment (data not shown). In contrast, the same western blot probed for two control proteins, p21 and cyclin D1, previously reported to be progesterone regulated (Musgrove et al., Mol. Cell. Biol. 13, 3577-3587 (1993); Musgrove et al., Mol. Endocrinol. 11, 54-66 (1997); Groshong et al., Mol Endocrinol 11, 1593-607 (1997)), showed them to be equally well regulated by either PR-A or PR-B. The gene encoding tissue factor is also uniquely regulated by PR-B. This too was confirmed by RT-PCR. Similarly, RT-PCR confirmed that integrin alpha 6 is uniquely regulated by PR-B at 6, 12, and 24 hours after progesterone treatment. To demonstrate the differential regulation of this gene by PR-B in a different cell line and by different methods, the present inventors transfected the integrin alpha 6 promoter linked to luciferase into progesterone treated PR-negative HeLa cells that were cotransfected with either PR-B or PR-A. Transcription from the integrin alpha 6 promoter was induced 4.4 fold by PR-B, but was not regulated at all by PR-A, or in cells lacking PR (not shown). [0180]
  • Fewer genes were uniquely regulated by PR-A (Table 19) and they tended to be expressed at relatively low levels. The gene encoding the docking protein enhancer of filamentation was significantly upregulated only by PR-A. The gene encoding the estrogen related receptor (ERR), which can heterodimerize with ERα and ERβ, is also PR-A dependent. The preferential upregulation of ERR by PR-A was confirmed by RT-PCR at both 6 and 12 hrs of progesterone treatment. The gene encoding the anti-apoptotic protein Bcl-XL is also uniquely regulated by PR-A, as confirmed by RT-PCR (not shown). [0181]
  • In general, fewer genes were downregulated by progesterone treatment than were upregulated (Tables 18 and 19). Analysis of pairwise comparisons using MicroArray Suite 4.0 Expression Analysis Program™ was used to demonstrate the statistical significance of the downregulation (in 3 out of 3 experiments). Similarly, gene filtering using GeneSpring™ generated a clustergram of downregulated genes (data not shown) confirming the accuracy of the assignments. Of the downregulated genes, three were downregulated by both PR-A and PR-B; eleven were uniquely downregulated by PR-B; and two were uniquely downregulated by PR-A. Downregulation of three of these genes, monocyte chemotactic protein, bullous pemphigoid antigen, and transforming growth factor-beta 3 (TGF-beta 3) was confirmed by RT-PCR (data not shown). [0182]
  • Several genes that were identified by the present inventors as being regulated by progesterone were previously known to be important in breast cancers. Based on the present invention they may now be targeted for specific progestin therapies. (1) For instance, S100P calcium-binding protein overexpression is associated with immortalization of human breast epithelial cells in vitro and with early stages of breast cancer development in vivo (Guerreiro da Silva et al., Int J Oncol 16, 231-40 (2000)). (2) The gene encoding tissue factor, a cell surface glycoprotein, is associated with metastasis in breast and other types of cancers (Ueno et al., Br J Cancer 83, 164-70 (2000); Lwaleed et al., J Pathol 187(3):291-4 (1999)). Tissue factor was previously known to be regulated by progesterone in the endometrium (Krikun et al., Mol Endocrinol 14, 393-400 (2000); Lockwood et al., J Clin Endocrinol Metab 85, 297-301 (2000); Krikun et al., J Clin Endocrinol Metab 83, 926-30 (1998)), but not in the breast or in breast cancers. (3) The gene encoding Gas6, a ligand for the Axl receptor tyrosine kinase (RTK) and other members of the RTK family, was recently reported to be mitogenic in breast cancer cells (Goruppi et al., Mol Cell Biol 21, 902-915 (2001)) and it promotes angiogenesis (Fridell et al., J Biol Chem 273, 7123-6 (1998)). (4) The HEF1 gene is highly related to BCAR1/p130Cas, which has been found to be upregulated in tamoxifen resistant tumors (van der Flier et al., Int J Cancer 89, 465-8 (2000); van der Flier et al., J Natl Cancer Inst 92, 120-7 (2000)). The present invention provides the rationale for measuring the expression levels of these genes in breast cancers. It may be that tumors that overexpress these genes are good candidates for suppressive therapy with progesterone antagonists. [0183]
  • Additionally, the inventors now demonstrate the progesterone regulation of several genes previously known to be preferentially expressed in normal breast epithelium compared to breast cancers. For instance, the gene encoding bullous pemphigoid antigen, a protein associated with hemidesmosomes, is overexpressed 12-fold in normal breast cells compared to breast tumors (Nacht et al., Cancer Res 59, 5464-70 (1999)). Such desmosomes are important in maintaining the normal differentiated architecture of the breast. The present inventors have found that bullous pemphigoid antigen is downregulated by progesterone through both PR isoforms. This downregulation may be harmful, and/or it may disrupt important cell-cell interactions. It is possible that antiprogestin therapy would prevent this downregulation. [0184]
  • Some of the genes that were discovered by the present inventors to be progesterone regulated are involved in particular functional pathways. Groups of temporally regulated genes are often involved in the same pathway. For example, it was previously known that progesterone regulates genes involved in the steroid biosynthesis and trafficking pathways (Watari et al., Exp Cell Res 259, 247-56 (2000); Darnel et al., J Steroid Biochem Mol Biol 70:203-10 (1999); Arcuri et al., Endocrinology 137:595-600 (1996)), and the present investigators identify a cluster of such genes. However, less is known about the role of progesterone in regulating signaling pathways controlled by growth factors and cytokines. The present inventors' data demonstrate for the first time that progesterone plays an important role in regulating many genes involved in these signaling pathways. In addition, the present inventors demonstrate that progesterone regulates expression of genes for proteins previously known to interact with PR. Examples are FKBP54 (Kester et al., J Biol Chem 272, 16637-43 (1997)), Stat5 (Richer et al., J Biol Chem 273, 31317-26 (1998)), IκBα and cytoplasmic dynein light chain 1 (Crepieux et al., Mol Cell Biol 17:7375-85 (1997)). [0185]
    TABLE 1
    Genes selectively upregulated by PR-A
    Accession No. Fold Increase Gene Name
    L43821 4.7 enhancer of filamentation (HEF1)
    L38487 2.3 estrogen receptor-related protein (hERRa1)
  • [0186]
    TABLE 2
    Genes selectively downregulated by PR-A
    Accession No. Fold Decrease Gene Name
    U44103 −2.9 small GTP binding protein Rab9
  • [0187]
    TABLE 3
    Genes selectively upregulated by PR-B.
    Accession No. Fold Increase Gene Name
    L13720 ˜23.1 growth arrest-specific protein (gas6)
    M27436 ˜18.1 tissue factor gene
    D79990 10.2 KIAA0168 Ras association (RalGDS/AF-6) domain family 2 (RASSF2)
    U01120 ˜9.8 glucose-6-phosphatase
    D25539 ˜8 KIAA0040 gene
    U37546 ˜7.2 IAP homolog C (MIHC)
    D87953 6.8 RTP, DRG1, CAP43
    M76180 ˜6.5 aromatic amino acid decarboxylase (ddc)
    M77140 ˜6 pro-galanin
    D50840 ˜5.6 ceramide glucosyltransferase
    HG2743-HT2846 ˜5.1 Caldesmon 1 Non-Muscle
    U76421 ˜4.7 dsRNA adenosine deaminase DRADA2b
    U40572 4.6 beta2-syntrophin (SNT B2)
    S69189 ˜4.5 peroxisomal acyl-coenzyme A oxidase
    U44754 4.4 PSE-binding factor PTF gamma subunit
    U02081 4.1 guanine nucleotide regulatory protein (NET1) oncogene1
    D16227 ˜4 BDP-1 (member of the recoverin family)
    D17793 ˜4 3-alpha hydroxysteroid dehydrogenase type IIb
    U83461 3.7 putative copper uptake protein (hCTR2)
    M23254 3.6 Ca2+-activated neutral protease (CANP)
    D15050 3.6 transcription factor AREB6
    HG2167-HT2237 ˜3.5 Protein Kinase Ht31, Camp-Dependent
    D10040 3.5 long-chain acyl-CoA synthetase
    D31887 3.5 KIAA0062 gene
    X60673 3.4 adenylate kinase 3
    U45878 ˜3.3 inhibitor of apoptosis protein 1
    L09229 3.3 long-chain acyl-coenzyme A synthetase (FACL1)
    U09646 3.2 carnitine palmitoyltransferase II precursor (CPT1)
    D31716 3.2 GC box binding protein
    M37400 3.1 cytosolic aspartate aminotransferase
    X59834 3.1 glutamine synthase
    D78335 3.1 uridine monophosphate kinase (UMPK)
    U41387 3 RNA helicase II/Gu
    U07919 3 aldehyde dehydrogenase 6
    M69013 2.9 guanine nucleotide-binding regulatory protein (G-y-alpha)1
    HG2530-HT2626 2.9 Adenylyl Cyclase-Associated Protein 2
    U79288 2.8 clone 23682
    D10704 2.6 choline kinase
    Y08134 2.6 ASM-like phosphodiesterase 3b
    U33632 2.6 two P-domain K+ channel TWIK-1
    M21154 2.5 S-adenosylmethionine decarboxylase
    U77949 2.5 Cdc6-related protein (HsCDC6)
    M95767 ˜2.5 di-N-acetylchitobiase
    D83781 2.5 KIAA0197 gene
    X98534 2.5 vasodilator-stimulated phosphoprotein (VASP)
    X53586 2.5 Integrin α 6*
    D80001 2.4 KIAA0179 gene
    L18960 2.4 protein synthesis factor (eIF-4C)
    D23673 2.3 insulin receptor substrate-1 (IRS-1)
    J02888 2.3 quinone oxidoreductase (NQO2)
    D63487 2.3 KIAA0153 gene
    U14603 2.3 protein-tyrosine phosphatase (HU-PP-1)
    L41887 2.3 splicing factor, arginine/serine-rich 7 (SFRS7)
    M92287 2.2 cyclin D3 (CCND3)
    X61123 2.2 BTG1
    M95929 2.1 homeobox protein (PHOX1)
    U32944 2.1 cytoplasmic dynein light chain 1 (hdlc1)
    D79994 2.1 KIAA0172 gene (similar to ankyrin)
    D89377 2 MSX-2
    U90878 2 LIM domain protein CLP-36
    U97105 2 N2A3 dihydropyrimidinase related protein-2
    L40379 2 thyroid receptor interactor (TRIP10)
    J05459 1.9 glutathione transferase M3 (GSTM3)
    L42542 1.8 RLIP76 (ralA binding protein 1)
    D42047 1.7 KIAA0089 similar to glycerol-3-phosphate dehydrogenase 1
    M84349 1.7 transmembrane protein (CD59)
    D43950 1.6 KIAA0098 T-COMPLEX PROTEIN 1 (TCP-1-EPSILON)
    M15796 1.6 proliferating cell nuclear antigen (PCNA)
  • [0188]
    TABLE 4
    Genes selectively downregulated by PR-B
    Accession No. Fold Decrease Gene Name
    U07225 ˜−4.3 P2U nucleotide receptor
    M27492 ˜−3.4 interleukin 1 receptor mRNA
    Y08682 −3.1 carnitine palmitoyltransferase I type I
    U29091 ˜−2.9 selenium-binding protein (hSBP)
    X79683 −2.6 beta2 laminin.
    AB000220 −2.6 semaphorin E1
    HG2197-HT2267 ˜−2.5 Collagen, Type Vii, Alpha 1
    U65011 ˜−2.5 preferentially expressed antigen of melanoma (PRAME)
    M18391 ˜−2.3 tyrosine kinase receptor (eph)
    X71874 −1.9 proteasome-like subunit MECL-1
  • [0189]
    TABLE 5
    Genes up or downregulated by both PR-A and PR-B
    Accession No. Fold Gene Name
    X51521 ˜22.6 Ezrin*
    U70663 ˜7.5 zinc finger transcription factor EZF
    U16799 6.1 Na, K-ATPase beta-1 subunit
    X65614 3.6 calcium-binding protein S100P
    D86962 2.9 Grb10
    S81914 2.6 IEX-1 = radiation-inducible immediate-early
    U00115 2.4 bcl-6
    M69225 ˜−3.5 bullous pemphigoid antigen (plakin family)
    U90907 −3.2 clone 23907
    M92357 −2.1 tumor necrosis factor alpha-induced protein 2 (B94)
  • [0190]
    TABLE 6
    Gene that is reciprocally regulated (upregulated by PR-B,
    downregulated by PR-A)
    Accession No. Fold Gene Name
    X53586 2.5 Integrin α 6*
  • [0191]
    TABLE 7
    Group of genes for which the expression level is
    different depending on which isoform is present.
    Accession No. Fold Gene Name
    L13720 ˜23.1 growth arrest-specific protein (gas6)
    D79990 10.2 KIAA0168 Ras association (RalGDS/AF-6) domain family 2 (RASSF2)
    U01120 ˜9.8 glucose-6-phosphatase
    U37546 ˜7.2 IAP homolog C (MIHC)
    D87953 6.8 RTP, DRG1, CAP43
    M76180 ˜6.5 aromatic amino acid decarboxylase (ddc)
    M77140 ˜6 pro-galanin
    D50840 ˜5.6 ceramide glucosyltransferase
    HG2743-HT2846 ˜5.1 Caldesmon 1 Non-Muscle
    U76421 ˜4.7 dsRNA adenosine deaminase DRADA2b
    U40572 4.6 beta2-syntrophin (SNT B2)
    S69189 ˜4.5 peroxisomal acyl-coenzyme A oxidase
    U44754 4.4 PSE-binding factor PTF gamma subunit
    U02081 4.1 guanine nucleotide regulatory protein (NET1) oncogene
    D16227 ˜4 BDP-1 (member of the recoverin family)
    D17793 ˜4 3-alpha hydroxysteroid dehydrogenase type IIb
    U83461 3.7 putative copper uptake protein (hCTR2)
    M23254 3.6 Ca2+-activated neutral protease (CANP)
    D15050 3.6 transcription factor AREB6
    HG2167-HT2237 ˜3.5 Protein Kinase Ht31, Camp-Dependent
    D10040 3.5 long-chain acyl-CoA synthetase
    D31887 3.5 KIAA0062 gene
    X60673 3.4 adenylate kinase 3
    U45878 ˜3.3 inhibitor of apoptosis protein 1
    L09229 3.3 long-chain acyl-coenzyme A synthetase (FACL1)
    U09646 3.2 carnitine palmitoyltransferase II precursor (CPT1)
    D31716 3.2 GC box binding protein
    M37400 3.1 cytosolic aspartate aminotransferase
    X59834 3.1 glutamine synthase
    D78335 3.1 uridine monophosphate kinase (UMPK)
    U41387 3 RNA helicase II/Gu
    U07919 3 aldehyde dehydrogenase 6
    M69013 2.9 guanine nucleotide-binding regulatory protein (G-y-alpha)
    HG2530-HT2626 2.9 Adenylyl Cyclase-Associated Protein 2
    U79288 2.8 clone 23682
    D10704 2.6 choline kinase
    Y08134 2.6 ASM-like phosphodiesterase 3b
    U33632 2.6 two P-domain K+ channel TWIK-1
    M21154 2.5 S-adenosylmethionine decarboxylase
    U77949 2.5 Cdc6-related protein (HsCDC6)
    M95767 ˜2.5 di-N-acetylchitobiase
    D83781 2.5 KIAA0197 gene
    X98534 2.5 vasodilator-stimulated phosphoprotein (VASP)
    D80001 2.4 KIAA0179 gene
    L18960 2.4 protein synthesis factor (eIF-4C)
    D23673 2.3 insulin receptor substrate-1 (IRS-1)
    J02888 2.3 quinone oxidoreductase (NQO2)
    D63487 2.3 KIAA0153 gene
    U14603 2.3 protein-tyrosine phosphatase (HU-PP-1)
    L41887 2.3 splicing factor, arginine/serine-rich 7 (SFRS7)
    M92287 2.2 cyclin D3 (CCND3)
    X61123 2.2 BTG1
    M95929 2.1 homeobox protein (PHOX1)
    U32944 2.1 cytoplasmic dynein light chain 1 (hdlc1)
    D79994 2.1 KIAA0172 gene (similar to ankyrin)
    D89377 2 MSX-2
    U90878 2 LIM domain protein CLP-36
    U97105 2 N2A3 dihydropyrimidinase related protein-2
    L40379 2 thyroid receptor interactor (TRIP10)
    J05459 1.9 glutathione transferase M3 (GSTM3)
    L42542 1.8 RLIP76 (ralA binding protein 1)
    D42047 1.7 KIAA0089 similar to glycerol-3-phosphate dehydrogenase 1
    M84349 1.7 transmembrane protein (CD59)
    D43950 1.6 KIAA0098 T-COMPLEX PROTEIN 1 (TCP-1-EPSILON)
    M15796 1.6 proliferating cell nuclear antigen (PCNA)
    U07225 ˜−4.3 P2U nucleotide receptor
    M27492 ˜−3.4 interleukin 1 receptor mRNA
    Y08682 −3.1 carnitine palmitoyltransferase I type I
    U29091 ˜−2.9 selenium-binding protein (hSBP)
    X79683 −2.6 beta2 laminin.
    AB000220 −2.6 semaphorin E
    HG2197-HT2267 ˜−2.5 Collagen, Type Vii, Alpha 1
    U65011 ˜−2.5 preferentially expressed antigen of melanoma (PRAME)
    M18391 ˜−2.3 tyrosine kinase receptor (eph)
    X71874 −1.9 proteasome-like subunit MECL-1
    L43821 4.7 enhancer of filamentation (HEF1)
    L38487 2.3 estrogen receptor-related protein (hERRa1)
    D25539 ˜8 KIAA0040 gene
  • [0192]
    TABLE 8
    Genes encoding products previously reported to be regulated by progesterone
    Accession no. Gene Name Cell or tissue type Isoform
    U26726 11-beta-hydroxysteroid dehydrogenase type 2 endometrial stromal cells, endometrial cancer cells Both1
    M27436 tissue factor gene endometrium PR-B only2
    U42031 progesterone receptor-associated FKBP54 breast cancer cells Both3
    M68516 PCI gene (plasminogen activator inhibitor) endometrial stromal cells PR-B only4
    U43185 Stat5A breast cancer cells PR-B only5
    X52730 phenylethanolamine n-methyltransferase (PNMT) adrenal medulla PR-B only6
    M69043 MAD-3 encoding IkB-alpha macrophage cells and endometrium Both7
    AF002020 Niemann-Pick C disease (NPC1) granulosa cells PR-B only8
    D00017 lipocortin II (calpactin I) endometrial cancer cells PR-B only9
    D25328 platelet-type phosphofructokinase breast cancer cells, intestinal epithelium, granulosa cells PR-B only10
    M80254 cyclophilin isoform (hCyP3) liver PR-B only11
    HG4069-HT4339_s_at Monocyte Chemotactic Protein 1 endometrial cells and breast cancer cells PR-A only12
    Z50781 delta sleep inducing peptide (related to TSC-22) breast cancer cells PR-A only13
  • [0193]
    TABLE 9
    Genes selectively upregulated by PR-A
    Accession No. Fold Increase Gene Name
    L43821 4.7 enhancer of filamentation (HEF1)
    Z23115 3.2 Bcl-x*
    Z50781 2.5 delta sleep inducing peptide (highly related to TSC-22)
    L38487 2.3 estrogen receptor-related protein (hERRa1)
  • [0194]
    TABLE 10
    Genes selectively downregulated by PR-A
    Accession No. Fold Decrease Gene Name
    HG4069-HT4339 ˜−7.4 Monocyte Chemotactic Protein 1
    U44103 −2.8 small GTP binding protein Rab9
  • [0195]
    TABLE 11
    Genes selectively upregulated by PR-B
    Accession No. Fold Increase Gene Name
    L13720 ˜23.1 growth arrest-specific protein (gas6)
    M27436 ˜18.1 tissue factor gene
    D79990 10.2 KIAA0168 Ras association (RalGDS/AF-6) domain family 2 (RASSF2)
    U01120 ˜9.8 glucose-6-phosphatase
    D25539 ˜8 KIAA0040 gene
    U37546 ˜7.2 IAP homolog C (MIHC)
    D87953 6.8 RTP, DRG1, CAP43
    M76180 ˜6.5 aromatic amino acid decarboxylase (ddc)
    M83667 6.4 NF-IL6 (C/EBPbeta)
    M68516 ˜6.2 PCI gene (plasminogen activator inhibitor 3)
    U43185 ˜6.1 Stat5A
    M77140 ˜6 pro-galanin
    D50840 ˜5.6 ceramide glucosyltransferase
    HG2743-HT2846 ˜5.1 Caldesmon 1 Non-Muscle
    U76421 ˜4.7 dsRNA adenosine deaminase DRADA2b
    U40572 4.6 beta2-syntrophin (SNT B2)
    S69189 ˜4.5 peroxisomal acyl-coenzyme A oxidase
    U44754 4.4 PSE-binding factor PTF gamma subunit
    X52730 4.4 phenylethanolamine n-methyltransferase (PNMT)
    U02081 4.1 guanine nucleotide regulatory protein (NET1) oncogene1
    D16227 ˜4 BDP-1 (member of the recoverin family)
    D17793 ˜4 3-alpha hydroxysteroid dehydrogenase type IIb
    U83461 3.7 putative copper uptake protein (hCTR2)
    M23254 3.6 Ca2+-activated neutral protease (CANP)
    D15050 3.6 transcription factor AREB6
    HG2167-HT2237 ˜3.5 Protein Kinase Ht31, Camp-Dependent
    D10040 3.5 long-chain acyl-CoA synthetase
    D31887 3.5 KIAA0062 gene
    X60673 3.4 adenylate kinase 3
    U45878 ˜3.3 inhibitor of apoptosis protein 1
    L09229 3.3 long-chain acyl-coenzyme A synthetase (FACL1)
    U09646 3.2 carnitine palmitoyltransferase II precursor (CPT1)
    D31716 3.2 GC box binding protein
    M37400 3.1 cytosolic aspartate aminotransferase
    X59834 3.1 glutamine synthase
    D78335 3.1 uridine monophosphate kinase (UMPK)
    U41387 3 RNA helicase II/Gu
    U07919 3 aldehyde dehydrogenase 6
    M69013 2.9 guanine nucleotide-binding regulatory protein (G-y-alpha)1
    HG2530-HT2626 2.9 Adenylyl Cyclase-Associated Protein 2
    U79288 2.8 clone 23682
    D10704 2.6 choline kinase
    Y08134 2.6 ASM-like phosphodiesterase 3b
    U33632 2.6 two P-domain K+ channel TWIK-1
    M21154 2.5 S-adenosylmethionine decarboxylase
    U77949 2.5 Cdc6-related protein (HsCDC6)
    M95767 ˜2.5 di-N-acetylchitobiase
    D83781 2.5 KIAA0197 gene
    X98534 2.5 vasodilator-stimulated phosphoprotein (VASP)
    X53586 2.5 Integrin α 6*
    D80001 2.4 KIAA0179 gene
    L18960 2.4 protein synthesis factor (eIF-4C)
    D23673 2.3 insulin receptor substrate-1 (IRS-1)
    J02888 2.3 quinone oxidoreductase (NQO2)
    D63487 2.3 KIAA0153 gene
    U14603 2.3 protein-tyrosine phosphatase (HU-PP-1)
    L41887 2.3 splicing factor, arginine/serine-rich 7 (SFRS7)
    M92287 2.2 cyclin D3 (CCND3)
    X61123 2.2 BTG1
    AF002020 2.1 Niemann-Pick C disease (NPC1)
    M95929 2.1 homeobox protein (PHOX1)
    U32944 2.1 cytoplasmic dynein light chain 1 (hdlc1)
    D79994 2.1 KIAA0172 gene (similar to ankyrin)
    D89377 2 MSX-2
    U90878 2 LIM domain protein CLP-36
    U97105 2 N2A3 dihydropyrimidinase related protein-2
    L40379 2 thyroid receptor interactor (TRIP10)
    D00017 1.9 lipocortin II
    J05459 1.9 glutathione transferase M3 (GSTM3)
    D25328 1.9 platelet-type phosphofructokinase
    M80254 1.9 cyclophilin isoform (hCyP3)
    L42542 1.8 RLIP76 (ralA binding protein 1)
    D42047 1.7 KIAA0089 similar to glycerol-3-phosphate dehydrogenase 1
    M84349 1.7 transmembrane protein (CD59)
    D43950 1.6 KIAA0098 T-COMPLEX PROTEIN 1 (TCP-1-EPSILON)
    M15796 1.6 proliferating cell nuclear antigen (PCNA)
  • [0196]
    TABLE 12
    Genes selectively downregulated by PR-B
    Accession No. Fold Decrease Gene Name
    U07225 ˜−4.3 P2U nucleotide receptor
    M27492 ˜−3.4 interleukin 1 receptor mRNA
    Y08682 −3.1 carnitine palmitoyltransferase I type I
    U29091 ˜−2.9 selenium-binding protein (hSBP)
    X79683 −2.6 beta2 laminin.
    AB000220 −2.6 semaphorin E1
    HG2197-HT2267 ˜−2.5 Collagen, Type Vii, Alpha 1
    U65011 ˜−2.5 preferentially expressed antigen of melanoma (PRAME)
    M18391 ˜−2.3 tyrosine kinase receptor (eph)
    X71874 −1.9 proteasome-like subunit MECL-1
  • [0197]
    TABLE 13
    Genes up or downregulated by
    progesterone via both PR-A and PR-B
    Accession No. Fold Gene Name
    U26726 ˜22.6 11-beta-hydroxysteroid dehydrogenase type 2
    X51521 12.7 Ezrin*
    U42031 9.4 progesterone receptor-associated FKBP541
    U70663 ˜7.5 zinc finger transcription factor EZF
    U16799 6.1 Na, K-ATPase beta-1 subunit
    M69043 4.2 MAD-3 (IkB-alpha)
    X65614 3.6 calcium-binding protein S100P
    D86962 2.9 Grb10
    S81914 2.6 IEX-1 = radiation-inducible immediate-early
    U00115 2.4 bcl-6
    M69225 ˜−3.5 bullous pemphigoid antigen (plakin family)
    U90907 −3.2 clone 23907
    J03241 ˜−3 transforming growth factor-beta 3 (TGF-beta3)
    M92357 −2.1 tumor necrosis factor alpha-induced protein 2 (B94)
  • [0198]
    TABLE 14
    Gene that is reciprocally regulated
    (upregulated by PR-B, downregulated by PR-A)
    Accession No. Fold Gene Name
    X53586 2.5 Integrin α 6*
  • [0199]
    TABLE 15
    Group of genes for which the expression level
    is different depending on which isoform is present.
    Accession No. Fold Gene Name
    L13720 ˜23.1 growth arrest-specific protein (gas6)
    M27436 ˜18.1 tissue factor gene
    D79990 10.2 KIAA0168 Ras association (RalGDS/AF-6) domain family 2 (RASSF2)
    U01120 ˜9.8 glucose-6-phosphatase
    U37546 ˜7.2 IAP homolog C (MIHC)
    D87953 6.8 RTP, DRG1, CAP43
    M76180 ˜6.5 aromatic amino acid decarboxylase (ddc)
    M77140 ˜6 pro-galanin
    D50840 ˜5.6 ceramide glucosyltransferase
    HG2743-HT2846 ˜5.1 Caldesmon 1 Non-Muscle
    U76421 ˜4.7 dsRNA adenosine deaminase DRADA2b
    U40572 4.6 beta2-syntrophin (SNT B2)
    S69189 ˜4.5 peroxisomal acyl-coenzyme A oxidase
    U44754 4.4 PSE-binding factor PTF gamma subunit
    U02081 4.1 guanine nucleotide regulatory protein (NET1) oncogene
    D16227 ˜4 BDP-1 (member of the recoverin family)
    D17793 ˜4 3-alpha hydroxysteroid dehydrogenase type IIb
    U83461 3.7 putative copper uptake protein (hCTR2)
    M23254 3.6 Ca2+-activated neutral protease (CANP)
    D15050 3.6 transcription factor AREB6
    HG2167-HT2237 ˜3.5 Protein Kinase Ht31, Camp-Dependent
    D10040 3.5 long-chain acyl-CoA synthetase
    D31887 3.5 KIAA0062 gene
    X60673 3.4 adenylate kinase 3
    U45878 ˜3.3 inhibitor of apoptosis protein 1
    L09229 3.3 long-chain acyl-coenzyme A synthetase (FACL1)
    U09646 3.2 carnitine palmitoyltransferase II precursor (CPT1)
    D31716 3.2 GC box binding protein
    M37400 3.1 cytosolic aspartate aminotransferase
    X59834 3.1 glutamine synthase
    D78335 3.1 uridine monophosphate kinase (UMPK)
    U41387 3 RNA helicase II/Gu
    U07919 3 aldehyde dehydrogenase 6
    M69013 2.9 guanine nucleotide-binding regulatory protein (G-y-alpha)
    HG2530-HT2626 2.9 Adenylyl Cyclase-Associated Protein 2
    U79288 2.8 clone 23682
    D10704 2.6 choline kinase
    Y08134 2.6 ASM-like phosphodiesterase 3b
    U33632 2.6 two P-domain K+ channel TWIK-1
    M21154 2.5 S-adenosylmethionine decarboxylase
    U77949 2.5 Cdc6-related protein (HsCDC6)
    M95767 ˜2.5 di-N-acetylchitobiase
    D83781 2.5 KIAA0197 gene
    X98534 2.5 vasodilator-stimulated phosphoprotein (VASP)
    D80001 2.4 KIAA0179 gene
    L18960 2.4 protein synthesis factor (eIF-4C)
    D23673 2.3 insulin receptor substrate-1 (IRS-1)
    J02888 2.3 quinone oxidoreductase (NQO2)
    D63487 2.3 KIAA0153 gene
    U14603 2.3 protein-tyrosine phosphatase (HU-PP-1)
    L41887 2.3 splicing factor, arginine/serine-rich 7 (SFRS7)
    M92287 2.2 cyclin D3 (CCND3)
    X61123 2.2 BTG1
    M95929 2.1 homeobox protein (PHOX1)
    U32944 2.1 cytoplasmic dynein light chain 1 (hdlc1)
    D79994 2.1 KIAA0172 gene (similar to ankyrin)
    D89377 2 MSX-2
    U90878 2 LIM domain protein CLP-36
    U97105 2 N2A3 dihydropyrimidinase related protein-2
    L40379 2 thyroid receptor interactor (TRIP10)
    J05459 1.9 glutathione transferase M3 (GSTM3)
    L42542 1.8 RLIP76 (ralA binding protein 1)
    D42047 1.7 KIAA0089 similar to glycerol-3-phosphate dehydrogenase 1
    M84349 1.7 transmembrane protein (CD59)
    D43950 1.6 KIAA0098 T-COMPLEX PROTEIN 1 (TCP-1-EPSILON)
    M15796 1.6 proliferating cell nuclear antigen (PCNA)
    U07225 ˜−4.3 P2U nucleotide receptor
    M27492 ˜−3.4 interleukin 1 receptor mRNA
    Y08682 −3.1 carnitine palmitoyltransferase I type I
    U29091 ˜−2.9 selenium-binding protein (hSBP)
    X79683 −2.6 beta2 laminin
    AB000220 −2.6 semaphorin E
    HG2197-HT2267 ˜−2.5 Collagen, Type VII, Alpha 1
    U65011 ˜−2.5 preferentially expressed antigen of melanoma (PRAME)
    M18391 ˜−2.3 tyrosine kinase receptor (eph)
    X71874 −1.9 proteasome-like subunit MECL-1
    L43821 4.7 enhancer of filamentation (HEF1)
    L38487 2.3 estrogen receptor-related protein (hERRa1)
    D25539 ˜8 KIAA0040 gene
    HG4069-HT4339 ˜−7.4 Monocyte Chemotactic Protein 1
  • [0200]
    TABLE 16
    Genes encoding products involved in breast
    cancer or mammary gland development*.
    Accession no. Fold Gene Name
    L13720 ˜23.1 growth arrest-specific protein (gas6)
    M27436 ˜18.1 tissue factor gene
    M83667 6.4 NF-IL6-beta (C/EBPbeta)*
    M68516 ˜6.2 PCI gene (plasminogen activator inhibitor)
    U43185 ˜6.1 Stat5A*
    X65614 3.6 calcium-binding protein S100P
    X53586 2.5 Integrin α 6*
    D89377 2 MSX-2*
    D00017 1.9 lipocortin II (calpactin I)
    U29091 ˜−2.9 selenium-binding protein (hSBP)
    M69225 ˜−3.5 bullous pemphigoid antigen (plakin family)
  • REFERENCES
  • [0201] 1. Goruppi et al., Mol Cell Biol, 21:902-915 (2001)
  • [0202] 2. Ueno et al., Br J Cancer, 83:164-70 (2000); Lwaleed et al., J Pathol, 187:291-4 (1999); Lwaleed et al., J Pathol, 188(1):3-8 (1999)
  • [0203] 3. Seagroves et al., Mol Endocrinol, 14(3):359-68 (2000); Robinson et al., Genes Dev, 12(12):1907-16 (1998); Seagroves et al., Genes Dev, 12(12):1917-1928 (1998)
  • [0204] 4. Nelson et al., J Natl Cancer Inst, 92(11):866-8 (2000)
  • [0205] 5. Liu et al., Genes Dev, 11(2):179-86 (1997); Watson et al., Br J Cancer, 71(4):840-844 (1995)
  • [0206] 6. Guerreiro de Silva et al., Int J Oncol, 16:231-40 (2000)
  • [0207] 7. Wewer et al., Am J Pathol, 151(5):1191-8 (1997); Tagliabue et al., Eur J Cancer, 34(12):1982-3 (1998)
  • [0208] 8. Phippard et al., Development, 122(9):2729-37 (1996); Friedmann et al., Dev Biol, 177(1):347-55 (1996)
  • [0209] 9. Mai et al., Biochim Biophys Acta, 1477(1-2):215-30 (2000)
  • [0210] 10. Vinceti et al., Tumori, 86(2):105-18 (2000); Jiang et al., Mol Carcinog, 26(4):213-25 (1999)
  • [0211] 11. Nacht et al., Cancer Res, 59:5464-70 (1999)
    TABLE 17
    Genes regulated by progesterone organized by primary function of gene product.
    Accession no. Fold Gene Name Regulation Pattern
    Transcription factors
    U70663 ˜7.5 zinc finger transcription factor EZF Up by Both
    M83667 6.4 NF-IL6 (C/EBPbeta) Up by PR-B
    U43185 ˜6.1 Stat5A Up by PR-B
    D15050 3.6 transcription factor AREB6 Up by PR-B
    D31716 3.2 GC box binding protein Up by PR-B
    U00115 2.4 bcl-6 Up by Both
    U44754 4.4 PSE-binding factor PTF gamma subunit Up by PR-B
    M95929 2.1 homeobox protein (PHOX1) Up by PR-B
    S81914 2.6 IEX-1 = radiation-inducible DIF2 Up by Both
    D89377 2 MSX-2 Up by PR-B
    Z50781 2.5 delta sleep inducing peptide (highly related to TSC-22) Up by PR-A
    L38487 2.3 estrogen receptor-related protein (hERRa1) Up by PR-A
    Cell adhesion or cytoskeleton interaction
    HG2743-HT2846 ˜5.1 Caldesmon 1 Non-Muscle Up by PR-B
    L43821 4.7 enhancer of filamentation (HEF1) Up by PR-A
    U40572 4.6 beta2-syntrophin (SNT B2) Up by PR-B
    X98534 2.5 vasodilator-stimulated phosphoprotein (VASP) Up by PR-B
    U32944 2.1 cytoplasmic dynein light chain 1 (hdlc1) Up by PR-B
    U90878 2 LIM domain protein CLP-36 Up by PR-B
    X79683 −2.6 beta2 laminin Down by PR-B
    Calcium binding proteins
    D16227 ˜4 BDP-1 (member of the recoverin family) Up by PR-B
    X65614 3.6 calcium-binding protein S100P Up by Both
    D00017 1.9 lipocortin II (calpactin I) Up by PR-B
    Cholesterol or steroid metabolism and trafficking
    U26726 ˜22.6 11-beta-hydroxysteroid dehydrogenase type 2 Up by Both
    D17793 ˜4 3-alpha hydroxysteroid dehydrogenase type IIb Up by PR-B
    AF002020 2.1 Niemann-Pick C disease (NPC1) Up by PR-B
    Fatty acid/lipid metabolism
    M76180 ˜6.5 aromatic amino acid decarboxylase (ddc) Up by PR-B
    D50840 ˜5.6 ceramide glucosyltransferase (phospholipid synthesis) Up by PR-B
    S69189 ˜4.5 peroxisomal acyl-coenzyme A oxidase Up by PR-B
    X52730 4.4 phenylethanolamine n-methyltransferase (PNMT) Up by PR-B
    L09229 3.3 long-chain acyl-coenzyme A synthetase (FACL1) Up by PR-B
    U09646 3.2 carnitine palmitoyltransferase II precursor (CPT1) Up by PR-B
    X59834 3.1 glutamine synthase Up by PR-B
    D78335 3.1 uridine monophosphate kinase (UMPK) Up by PR-B
    Y08134 2.6 ASM-like phosphodiesterase 3b Up by PR-B
    J02888 2.3 quinone oxidoreductase (NQO2) Up by PR-B
    Y08682 −3.1 carnitine palmitoyltransferase I type I Down by PR-B
    Nucleotide or amino acid metabolism
    M37400 3.1 cytosolic aspartate aminotransferase (amino acid metabolism) Up by PR-B
    U97105 2 N2A3 dihydropyrimidinase related protein-2 Up by PR-B
    U07225 ˜−4.3 P2U nucleotide receptor Down by PR-B
    General metabolic/synthetic
    U01120 ˜9.8 glucose-6-phosphatase (gluconeogenesis) Up by PR-B
    U07919 3 aldehyde dehydrogenase 6 (alcohol metabolism) Up by PR-B
    M21154 2.5 S-adenosylmethionine decarboxylase (polyamine biosynthesis) Up by PR-B
    M95767 ˜2.5 di-N-acetylchitobiase (glycoprotein synthesis) Up by PR-B
    D42047 1.7 KIAA0089 gene (similar to glycerol-3-phosphate dehydrogenase 1) Up by PR-B
    J05459 1.9 glutathione transferase M3 (GSTM3) Up by PR-B
    D25328 1.9 platelet-type phosphofructokinase Up by PR-B
    U29091 ˜−2.9 selenium-binding protein (hSBP) Down by PR-B
    DNA-replication/transcription/translation and protein processing
    U76421 ˜4.7 dsRNA adenosine deaminase DRADA2b Up by PR-B
    U41387 3 RNA helicase II/Gu Up by PR-B
    L18960 2.4 protein synthesis factor (eIF-4C) Up by PR-B
    L41887 2.3 splicing factor, arginine/serine-rich 7 (SFRS7) Up by PR-B
    U77949 2.5 Cdc6-related protein (HsCDC6) Up by PR-B
    X71874 −1.9 proteasome-like subunit MECL-1 Down by PR-B
    Secreted molecules
    L13720 ˜23.1 growth arrest-specific protein (gas6) Up by PR-B
    M27436 ˜18.1 tissue factor gene Up by PR-B
    M68516 ˜6.2 PCI gene (plasminogen activator inhibitor 3) Up by PR-B
    M77140 ˜6 pro-galanin Up by PR-B
    M23254 3.6 Ca2+-activated neutral protease (CANP) Up by PR-B
    AB000220 −2.6 semaphorin E Down by PR-B
    Signal transduction
    D79990 10.2 KIAA0168 Ras association (RalGDS/AF-6) domain family 2 (RASSF2)
    M69043 4.2 MAD-3 encoding IkB-alpha Up by Both
    U02081 4.1 guanine nucleotide regulatory protein (NET1) oncogene Up by PR-B
    HG2167-HT2237 ˜3.5 Protein Kinase Ht31, cAMP-Dependent Up by PR-B
    X60673 3.4 adenylate kinase 3 Up by PR-B
    HG2530-HT2626 2.9 Adenylyl Cyclase-Associated Protein 2 Up by PR-B
    D86962 2.9 Grb10 Up by Both
    M69013 2.9 guanine nucleotide-binding regulatory protein (G-y-alpha) Up by PR-B
    D10704 2.6 choline kinase Up by PR-B
    U14603 2.3 protein-tyrosine phosphatase (HU-PP-1) Up by PR-B
    L40379 2 thyroid receptor interactor (TRIP10) Up by PR-B
    M18391 ˜−2.3 tyrosine kinase receptor (eph) Down by PR-B
    U44103 −2.8 small GTP binding protein Rab9 Down by PR-A
    Cytokines/Cytokine Receptors and Chemokines
    M27492 ˜−3.4 interleukin 1 receptor mRNA Down by PR-B
    J03241 ˜−3 transforming growth factor-beta 3 (TGF-beta3) Down by Both
    HG4069-HT4339 ˜−7.4 Monocyte Chemotactic Protein 1 Down by PR-A
    Membrane bound molecules
    U16799 6.1 Na,K-ATPase beta-1 subunit Up by Both
    U83461 3.7 putative copper uptake protein (hCTR2) Up by PR-B
    U33632 2.6 two P-domain K+ channel TWIK-1 Up by PR-B
    M84349 1.7 transmembrane protein (CD59) Up by PR-B
    M69225 ˜−3.5 bullous pemphigoid antigen (plakin family) Down by Both
    U65011 ˜−2.5 preferentially expressed antigen of melanoma (PRAME) Down by PR-B
    Chaperones/Protein folding
    U42031 9.4 progesterone receptor-associated FKBP54 Up by Both
    M80254 1.9 cyclophilin isoform (hCyP3) Up by PR-B
    Apoptosis
    U37546 ˜7.2 IAP homolog C (binds TNF receptor-associated factors) Up by PR-B
    U45878 ˜3.3 inhibitor of apoptosis protein 1 mRNA Up by PR-B
    Cell cycle
    D87953 6.8 RTP Up by PR-B
    M92287 2.2 cyclin D3 (CCND3) Up by PR-B
    M15796 1.6 proliferating cell nuclear antigen (PCNA) Up by PR-B
    X61123 2.2 BTG1 Up by PR-B
    Unknown Function
    D25539 ˜8 KIAA0040 gene Up by PR-B
    D31887 3.5 KIAA0062 gene Up by PR-B
    U79288 2.8 clone 23682 Up by PR-B
    D83781 2.5 KIAA0197 gene Up by PR-B
    D80001 2.4 KIAA0179 gene Up by PR-B
    D63487 2.3 KIAA0153 gene Up by PR-B
    D79994 2.1 KIAA0172 gene (similar to ankyrin) Up by PR-B
    M92357 −2.1 tumor necrosis factor, alpha-induced protein 2 B94 Down by PR-B
    U90907 −2.1 clone 23907 (similar to mouse p55PIK) Down by Both
  • [0212]
    TABLE 18
    Transcripts regulated in T47D-YB cells after 6 hrs progesterone treatment
    Accession no. Fold Gene Name
    Fold Increase
    L13720 ˜23.1 growth arrest-specific protein (gas6)
    U26726 ˜22.6 11-beta-hydroxysteroid dehydrogenase type 2
    M27436 ˜18.1 tissue factor gene
    D79990 10.2 KIAA0168 Ras association (RalGDS/AF-6) domain family 2 (RASSF2)
    U01120 ˜9.8 glucose-6-phosphatase
    U42031 9.4 progesterone receptor-associated FKBP54*
    D25539 ˜8 KIAA0040 gene
    U70663 ˜7.5 zinc finger transcription factor EZF
    U37546 ˜7.2 IAP homolog C (MIHC)
    D87953 6.8 RTP, DRG1, CAP43
    M76180 ˜6.5 aromatic amino acid decarboxylase (ddc)
    M83667 6.4 NF-IL6 (C/EBPbeta)
    M68516 ˜6.2 PCI gene (plasminogen activator inhibitor 3)
    U43185 ˜6.1 Stat5A
    U16799 6.1 Na, K-ATPase beta-1 subunit
    M77140 ˜6 pro-galanin
    D50840 ˜5.6 ceramide glucosyltransferase
    HG2743-HT2846 ˜5.1 Caldesmon 1 Non-Muscle
    U76421 ˜4.7 dsRNA adenosine deaminase DRADA2b
    U40572 4.6 beta2-syntrophin (SNT B2)
    S69189 ˜4.5 peroxisomal acyl-coenzyme A oxidase
    U44754 4.4 PSE-binding factor PTF gamma subunit
    X52730 4.4 phenylethanolamine n-methyltransferase (PNMT)
    M69043 4.2 MAD-3 (IkB-alpha)
    U02081 4.1 guanine nucleotide regulatory protein (NET1) oncogene*
    D16227 ˜4 BDP-1 (member of the recoverin family)
    D17793 ˜4 3-alpha hydroxysteroid dehydrogenase type IIb
    U83461 3.7 putative copper uptake protein (hCTR2)
    X65614 3.6 calcium-binding protein S100P
    M23254 3.6 Ca2+-activated neutral protease (CANP)
    D15050 3.6 transcription factor AREB6
    HG2167-HT2237 ˜3.5 Protein Kinase Ht31, cAMP-Dependent
    D10040 3.5 long-chain acyl-CoA synthetase
    D31887 3.5 KIAA0062 gene
    X60673 3.4 adenylate kinase 3
    U45878 ˜3.3 inhibitor of apoptosis protein 1
    L09229 3.3 long-chain acyl-coenzyme A synthetase (FACL1)
    U09646 3.2 carnitine palmitoyltransferase II precursor (CPT1)
    D31716 3.2 GC box binding protein
    M37400 3.1 cytosolic aspartate aminotransferase
    X59834 3.1 glutamine synthase
    D78335 3.1 uridine monophosphate kinase (UMPK)
    U41387 3 RNA helicase II/Gu
    U07919 3 aldehyde dehydrogenase 6
    D86962 2.9 Grb10
    M69013 2.9 guanine nucleotide-binding regulatory protein (G-y-alpha)*
    HG2530-HT2626 2.9 Adenylyl Cyclase-Associated Protein 2
    U79288 2.8 clone 23682
    D10704 2.6 choline kinase
    Y08134 2.6 ASM-like phosphodiesterase 3b
    U33632 2.6 two P-domain K+ channel TWIK-1
    S81914 2.6 IEX-1 = radiation-inducible immediate-early
    M21154 2.5 S-adenosylmethionine decarboxylase
    U77949 2.5 Cdc6-related protein (HsCDC6)
    M95767 ˜2.5 di-N-acetylchitobiase
    D83781 2.5 KIAA0197 gene
    X98534 2.5 vasodilator-stimulated phosphoprotein (VASP)
    D80001 2.4 KIAA0179 gene
    L18960 2.4 protein synthesis factor (eIF-4C)
    U00115 2.4 bcl-6
    J02888 2.3 quinone oxidoreductase (NQO2)
    D63487 2.3 KIAA0153 gene
    U14603 2.3 protein-tyrosine phosphatase (HU-PP-1)
    L41887 2.3 splicing factor, arginine/serine-rich 7 (SFRS7)
    M92287 2.2 cyclin D3 (CCND3)
    X61123 2.2 BTG1
    AF002020 2.1 Niemann-Pick C disease (NPC1)
    M95929 2.1 homeobox protein (PHOX1)
    U32944 2.1 cytoplasmic dynein light chain 1 (hdlc1)
    D79994 2.1 KIAA0172 gene (similar to ankyrin)
    D89377 2 MSX-2
    U90878 2 LIM domain protein CLP-36
    U97105 2 N2A3 dihydropyrimidinase related protein-2
    L40379 2 thyroid receptor interactor (TRIP10)
    D00017 1.9 lipocortin II
    J05459 1.9 glutathione transferase M3 (GSTM3)
    D25328 1.9 platelet-type phosphofructokinase
    M80254 1.9 cyclophilin isoform (hCyP3)
    L42542 1.8 RLIP76 (ralA binding protein 1)
    D42047 1.7 KIAA0089 similar to glycerol-3-phosphate dehydrogenase 1
    M84349 1.7 transmembrane protein (CD59)
    D43950 1.6 KIAA0098 T-COMPLEX PROTEIN 1 (TCP-1-EPSILON)
    M15796 1.6 proliferating cell nuclear antigen (PCNA)
    Fold Decrease
    U07225 ˜−4.3 P2U nucleotide receptor
    M69225 ˜−3.5 bullous pemphigoid antigen (plakin family)
    M27492 ˜−3.4 interleukin 1 receptor mRNA
    U90907 −3.2 clone 23907
    Y08682 −3.1 carnitine palmitoyltransferase I type I
    J03241 ˜−3 transforming growth factor-beta 3 (TGF-beta3)
    U29091 ˜−2.9 selenium-binding protein (hSBP)
    X79683 −2.6 beta2 laminin
    AB000220 −2.6 semaphorin E*
    HG2197-HT2267 ˜−2.5 Collagen, Type VII, Alpha 1
    U65011 ˜−2.5 preferentially expressed antigen of melanoma (PRAME)
    M18391 ˜−2.3 tyrosine kinase receptor (eph)
    M92357 −2.1 tumor necrosis factor, alpha-induced protein 2 B94
    X71874 −1.9 proteasome-like subunit MECL-1
  • [0213]
    TABLE 19
    Transcripts regulated in T47D-YA cells after 6 hrs progesterone treatment
    Accession no. Fold Gene Name
    Fold Increase
    U26726 6.5 11-beta-hydroxysteroid dehydrogenase type 2
    L43821 4.7 enhancer of filamentation (HEF1)
    U70663 ˜7.5 zinc finger transcription factor EZF
    U16799 3.9 Na, K-ATPase beta-1 subunit
    U42031 3.3 progesterone receptor-associated FKBP54
    Z50781 2.5 delta sleep inducing peptide (highly related to TSC-22)
    L38487 2.3 estrogen receptor-related protein (hERRa1)
    U00115 2.3 bcl-6
    X65614 2.2 calcium-binding protein S100P
    S81914 2.1 IEX-1 = radiation-inducible immediate-early
    M69043 2.0 MAD-3 mRNA (IkB-alpha)
    D86962 2.0 Grb10
    Fold Decrease
    HG4069-HT4339 ˜−7.4 Monocyte Chemotactic Protein 1
    M69225 ˜−4.3 bullous pemphigoid antigen (BPAG1)
    J03241 −3.3 transforming growth factor-beta 3 (TGF-beta3)
    M92357 −3.0 tumor necrosis factor, alpha-induced protein 2 B94
    U44103 −2.8 small GTP binding protein Rab9
    U90907 −2.1 clone 23907
  • While various embodiments of the present invention have been described in detail, it is apparent that modifications and adaptations of those embodiments will occur to those skilled in the art. It is to be expressly understood, however, that such modifications and adaptations are within the scope of the present invention, as set forth in the following claims. [0214]
  • 1 108 1 22 DNA Artificial Sequence Primer 1 atccagcgta ctccaaagat tc 22 2 22 DNA Artificial Sequence Primer 2 tccttgctga aagacaagtc tg 22 3 3817 DNA Homo sapiens 3 tgaattcgtg agagacttga gggaggcgct gcgactgaca agcggctctg cccgggacct 60 tctcgctttc atctagcgct gcactcaatg gaggggcggg caccgcagtg cttaatgctg 120 tcttaactag tgtaggaaaa cggctcaacc caccgctgcc gaaatgaagt ataagaatct 180 tatggcaagg gccttatatg acaatgtccc agagtgtgcc gaggaactgg cctttcgcaa 240 gggagacatc ctgaccgtca tagagcagaa cacaggggga ctggaaggat ggtggctgtg 300 ctcgttacac ggtcggcaag gcattgtccc aggcaaccgg gtgaagcttc tgattggtcc 360 catgcaggag actgcctcca gtcacgagca gcctgcctct ggactgatgc agcagacctt 420 tggccaacag aagctctatc aagtgccaaa cccacaggct gctccccgag acaccatcta 480 ccaagtgcca ccttcctacc aaaatcaggg aatttaccaa gtccccactg gccacggcac 540 ccaagaacaa gaggtatatc aggtgccacc atcagtgcag agaagcattg ggggaaccag 600 tgggccccac gtgggtaaaa aggtgataac ccccgtgagg acaggccatg gctacgtata 660 cgagtaccca tccagatacc aaaaggatgt ctatgatatc cctccttctc ataccactca 720 aggggtatac gacatccctc cctcatcagc aaaaggccct gtgttttcag ttccagtggg 780 agagataaaa cctcaagggg tgtatgacat cccgcctaca aaaggggtat atgccattcc 840 gccctctgct tgccgggatg aagcagggct tagggaaaaa gactatgact tcccccctcc 900 catgagacaa gctggaaggc cggacctcag accggagggg gtttatgaca ttcctccaac 960 ctgcaccaag ccagcaggga aggaccttca tgtaaaatac aactgtgaca ttccaggagc 1020 tgcagaaccg gtggctcgaa ggcaccagag cctgtccccg aatcacccac ccccgcaact 1080 cggacagtca gtgggctctc agaacgacgc atatgatgtc ccccgaggcg ttcagtttct 1140 tgagccacca gcagaaacca gtgagaaagc aaacccccag gaaagggatg gtgtttatga 1200 tgtccctctg cataacccgc cagatgctaa aggctctcgg gacttggtgg atgggatcaa 1260 ccgattgtct ttctccagta caggcagcac ccggagtaac atgtccacgt cttccacctc 1320 ctccaaggag tcctcactgt cagcctcccc agctcaggac aaaaggctct tcctggatcc 1380 agacacagct attgagagac ttcagcggct ccagcaggcc cttgagatgg gtgtctccag 1440 cctaatggca ctggtcacta ccgactggcg gtgttacgga tatatggaaa gacacatcaa 1500 tgaaatacgc acagcagtgg acaaggtgga gctgttcctg aaggagtacc tccactttgt 1560 
caagggagct gttgcaaatg ctgcctgcct cccggaactc atcctccaca acaagatgaa 1620 gcgggagctg caacgagtcg aagactccca ccagatcctg agtcaaacca gccatgactt 1680 aaatgagtgc agctggtccc tgaatatctt ggccatcaac aagccccaga acaagtgtga 1740 cgatctggac cggtttgtga tggtggcaaa gacggtgccc gatgacgcca agcagctcac 1800 cacaaccatc aacaccaacg cagaggccct cttcagaccc ggccctggca gcttgcatct 1860 gaagaatggg ccggagagca tcatgaactc aacggagtac ccacacggtg gctcccaggg 1920 acagctgctg catcctggtg accacaaggc ccaggcccac aacaaggcac tgcccccagg 1980 cctgagcaag gagcaggccc ctgactgtag cagcagtgat ggttctgaga ggagctggat 2040 ggatgactac gattacgtcc acctacaggg taaggaggag tttgagaggc aacagaaaga 2100 gctattggaa aaagagaata tcatgaaaca gaacaagatg cagctggaac atcatcagct 2160 gagccagttc cagctgttgg aacaagagat tacaaagccc gtggagaatg acatctcgaa 2220 gtggaagccc tctcagagcc tacccaccac aaacagtggc gtgagtgctc aggatcggca 2280 gttgctgtgc ttctactatg accaatgtga gacccatttc atttcccttc tcaacgccat 2340 tgacgcactc ttcagttgtg tcagctcagc ccagcccccg cgaatcttcg tggcacacag 2400 caagtttgtc atcctcagtg cacacaaact ggtgttcatt ggagacacgc tgacacggca 2460 ggtgactgcc caggacattc gcaacaaagt catgaactcc agcaaccagc tctgcgagca 2520 gctcaagact atagtcatgg caaccaagat ggccgccctc cattacccca gcaccacggc 2580 cctgcaggaa atggtgcacc aagtgacaga cctttctaga aatgcccagc tgttcaagcg 2640 ctctttgctg gagatggcaa cgttctgaga agaaaaaaaa gaggaagggg actgcgttaa 2700 cggttactaa ggaaaactgg aaatactgtc tggtttttgt aaatgttatc tatttttgta 2760 gataatttta tataaaaatg aaatatttta acattttatg ggtcagacaa ctttcagaaa 2820 ttcagggagc tggagaggga aatctttttt tcccccctga gtgttcttat gtatacacag 2880 aagtatctga gacataaact gtacagaaaa cttgtccacg tccttttgta tgcccatgta 2940 ttcatgtttt tgtttgtaga tgtttgtctg atgcatttca ttaaaaaaaa aaccatgaat 3000 tacgaagcac cttagtaagc accttctaat gctgcatttt ttttgttgtt gttaaaaaca 3060 tccagctggt tataatattg ttctccacgt ccttgtgatg attctgagcc tggcactggg 3120 aatctgggaa gcatagttta tttgcaagtg ttcaccttcc aaatcatgag gcatagcatg 3180 acttattctt gttttgaaaa ctcttttcaa aactgaccat cttaaacaca tgatggccaa 3240 gtgccacaaa 
gccctcttgc ggagacattt acgaatatat atgtggatcc aagtctcgat 3300 agttaggcgt tggagggaag agagaccaga gagtttagag gccaggacca cagttaggat 3360 tgggttgttt caatactgag agacagctac aataaaagga gagcaattgc ctccctgggg 3420 ctgttcaatc ttctgcattt gtgagtggtt cagtcatgag gttttccaaa agatgttttt 3480 agagttgtaa aaaccatatt tgcagcaaag atttacaaag gcgtatcaga ctatgattgt 3540 tcaccaaaat aggggaatgg tttgatccgc cagttgcaag tagaggcctt tctgactctt 3600 aatattcact ttggtgctac tacccccatt acctgaggaa ctggccaggt ccttgatcat 3660 ggaactatag agctaccaga catatcctgc tctctaaggg aatttattgc tatcttgcac 3720 cttctttaaa actcaaaaaa catatgcaga cctgacactc aagagtggct agctacacag 3780 agtccatcta atttttgcaa cttccccccc cgaattc 3817 4 2218 DNA Homo sapiens 4 tcctacaagc agccggcggc gccgccgagt gaggggacgc ggcgcggtgg ggcggcgcgg 60 cccgaggagg cggcggagga ggggccgccc gcggcccccg gctcactccg gcactccggg 120 ccgctcggcc cccatgcctg cccgaccgcg ctgccggagc cccaggtgac cagcgccatg 180 tccagccagg tggtgggcat tgagcctctc tacatcaagg cagagccggc cagccctgac 240 agtccaaagg gttcctcgga gacagagacc gagcctcctg tggccctggc ccctggtcca 300 gctcccactc gctgcctccc aggccacaag gaagaggagg atggggaggg ggctgggcct 360 ggcgagcagg gcggtgggaa gctggtgctc agctccctgc ccaagcgcct ctgcctggtc 420 tgtggggacg tggcctccgg ctaccactat ggtgtggcat cctgtgaggc ctgcaaagcc 480 ttcttcaaga ggaccatcca ggggagcatc gagtacagct gtccggcctc caacgagtgt 540 gagatcacca agcggagacg caaggcctgc caggcctgcc gcttcaccaa gtgcctgcgg 600 gtgggcatgc tcaaggaggg agtgcgcctg gaccgcgtcc ggggtgggcg gcagaagtac 660 aagcggcggc cggaggtgga cccactgccc ttcccgggcc ccttccctgc tgggcccctg 720 gcagtcgctg gaggcccccg gaagacagcc ccagtgaatg cactggtgtc tcatctgctg 780 gtggttgagc ctgagaagct ctatgccatg cctgaccccg caggccctga tgggcacctc 840 ccagccgtgg ctaccctctg tgacctcttt gaccgagaga ttgtggtcac catcagctgg 900 gccaagagca tcccaggctt ctcatcgctg tcgctgtctg accagatgtc agtactgcag 960 agcgtgtgga tggaggtgct ggtgctgggt gtggcccagc gctcactgcc actgcaggat 1020 gagctggcct tcgctgagga cttagtcctg gatgaagagg gggcacgggc agctggcctg 1080 ggggaactgg gggctgccct gctgcaacta 
gtgcggcggc tgcaggccct gcggctggag 1140 cgagaggagt atgttctact aaaggccttg gcccttgcca attcagactc tgtgcacatc 1200 gaagatgccg aggctgtgga gcagctgcga gaagctctgc acgaggccct gctggagtat 1260 gaagccggcc gggctggccc cggagggggt gctgagcggc ggcgggcggg caggctgctg 1320 ctcacgctac cgctcctccg ccagacagcg ggcaaagtgc tggcccattt ctatggggtg 1380 aagctggagg gcaaggtgcc catgcacaag ctgttcttgg agatgctcga ggccatgatg 1440 gactgaggca aggggtggga ctggtggggg ttctggcagg acctgcctag catggggtca 1500 gccccaaggg ctggggcgga gctggggtct gggcagtgcc acagcctgct ggcagggcca 1560 gggcaatgcc atcagcccct gggaacaggc cccacgccct ctcctccccc tcctaggggg 1620 tgtcagaagc tgggaacgtg tgtccaggct ctgggcacag tgctgcccct tgcaagccat 1680 aacgtgcccc cagagtgtag ggggccttgc ggaagccata gggggctgca cgggatgcgt 1740 gggaggcaga aacctatctc agggagggaa ggggatggag gccagagtct cccagtgggt 1800 gatgcttttg ctgctgctta atcctacccc ctcttcaaag cagagtggga cttggagagc 1860 aaaggcccat gcccccttcg ctcctcctct catcatttgc attgggcatt agtgtccccc 1920 cttgaagcaa taactccaag cagactccag cccctggacc cctggggtgg ccagggcttc 1980 cccatcagct cccaacgagc ctcctcaggg ggtaggagag cactgcctct atgccctgca 2040 gagcaataac actatattta tttttgggtt tggccaggga ggcgcaggga catggggcaa 2100 gccagggccc agagcccttg gctgtacaga gactctattt taatgtatat ttgctgcaaa 2160 gagaaaccgc ttttggtttt aaacctttaa tgagaaaaaa atatataata ccgagctc 2218 5 606 DNA Homo sapiens 5 atggcaggaa aatcttcact ttttaaagta attctccttg gagatggtgg agttgggaag 60 agttcactta tgaacagata tgtaactaat aagtttgata cccagctctt ccatacaata 120 ggtgtggaat ttttaaataa agatttggaa gtggatggac attttgttac catgcagatt 180 tgggacacgg caggtcagga gcgattccga agcctgagga caccatttta cagaggttct 240 gactgctgcc tgcttacttt tagtgtcgat gattcacaaa gcttccagaa cttaagtaac 300 tggaagaaag aattcatata ttatgcagat gtgaaagagc ctgagagctt tccttttgtg 360 attctgggta acaagattga cataagcgaa cggcaggtgt ctacagaaga agcccaagct 420 tggtgcaggg acaacggcga ctatccttat tttgaaacaa gtgcaaaaga tgccacaaat 480 gtggcagcag cctttgagga agcggttcga agagttcttg ctaccgagga taggtcagat 540 catttgattc agacagacac 
agtcaatctt caccgaaagc ccaagcctag ctcatcttgc 600 tgttga 606 6 2461 DNA Homo sapiens 6 ccgcagccgc cgccgccgcc gccgccgcga tgtgaccttc agggccgcca ggacgggatg 60 accggagcct ccgccccgcg gcgcccgctc gcctcggcct cccgggcgct ctgaccgcgc 120 gtccccggcc cgccatggcc ccttcgctct cgcccgggcc cgccgccctg cgccgcgcgc 180 cgcagctgct gctgctgctg ctggccgcgg agtgcgcgct tgccgcgctg ttgccggcgc 240 gcgaggccac gcagttcctg cggcccaggc agcgccgcgc ctttcaggtc ttcgaggagg 300 ccaagcaggg ccacctggag agggagtgcg tggaggagct gtgcagccgc gaggaggcgc 360 gggaggtgtt cgagaacgac cccgagacgg attattttta cccaagatac ttagactgca 420 tcaacaagta tgggtctccg tacaccaaaa actcaggctt cgccacctgc gtgcaaaacc 480 tgcctgacca gtgcacgccc aacccctgcg ataggaaggg gacccaagcc tgccaggacc 540 tcatgggcaa cttcttctgc ctgtgtaaag ctggctgggg gggccggctc tgcgacaaag 600 atgtcaacga atgcagccag gagaacgggg gctgcctcca gatctgccac aacaagccgg 660 gtagcttcca ctgttcctgc cacagcggct tcgagctctc ctctgatggc aggacctgcc 720 aagacataga cgagtgcgca gactcggagg cctgcgggga ggcgcgctgc aagaacctgc 780 ccggctccta ctcctgcctc tgtgacgagg gctttgcgta cagctcccag gagaaggctt 840 gccgagatgt ggacgagtgt ctgcagggcc gctgtgagca ggtctgcgtg aactccccag 900 ggagctacac ctgccactgt gacgggcgtg ggggcctcaa gctgtcccag gacatggaca 960 cctgtgagga catcttgccg tgcgtgccct tcagcgtggc caagagtgtg aagtccttgt 1020 acctgggccg gatgttcagt gggacccccg tgatccgact gcgcttcaag aggctgcagc 1080 ccaccaggct ggtagctgag tttgacttcc ggacctttga ccccgagggc atcctcctct 1140 ttgccggagg ccaccaggac agcacctgga tcgtgctggc cctgagagcc ggccggctgg 1200 agctgcagct gcgctacaac ggtgtcggcc gtgtcaccag cagcggcccg gtcatcaacc 1260 atggcatgtg gcagacaatc tctgttgagg agctggcgcg gaatctggtc atcaaggtca 1320 acagggatgc tgtcatgaaa atcgcggtgg ccggggactt gttccaaccg gagcgaggac 1380 tgtatcatct gaacctgacc gtgggaggta ttcccttcca tgagaaggac ctcgtgcagc 1440 ctataaaccc tcgtctggat ggctgcatga ggagctggaa ctggctgaac ggagaagaca 1500 ccaccatcca ggaaacggtg aaagtgaaca cgaggatgca gtgcttctcg gtgacggaga 1560 gaggctcttt ctaccccggg agcggcttcg ccttctacag cctggactac atgcggaccc 1620 ctctggacgt 
cgggactgaa tcaacctggg aagtagaagt cgtggctcac atccgcccag 1680 ccgcagacac aggcgtgctg tttgcgctct gggcccccga cctccgtgcc gtgcctctct 1740 ctgtggcact ggtagactat cactccacga agaaactcaa gaagcagctg gtggtcctgg 1800 ccgtggagca tacggccttg gccctaatgg agatcaaggt ctgcgacggc caagagcacg 1860 tggtcaccgt ctcgctgagg gacggtgagg ccaccctgga ggtggacggc accaggggcc 1920 agagcgaggt gagcgccgcg cagctgcagg agaggctggc cgtgctcgag aggcacctgc 1980 ggagccccgt gctcaccttt gctggcggcc tgccagatgt gccggtgact tcagcgccag 2040 tcaccgcgtt ctaccgcggc tgcatgacac tggaggtcaa ccggaggctg ctggacctgg 2100 acgaggcggc gtacaagcac agcgacatca cggcccactc ctgccccccc gtggagcccg 2160 ccgcagccta ggcccccacg ggacgcggca ggcttctcag tctctgtccg agacagccgg 2220 gaggagcctg ggggctcctc accacgtggg gccatgctga gagctgggct ttcctctgtg 2280 accatcccgg cctgtaacat atctgtaaat agtgagatgg acttggggcc tctgacgccg 2340 cgcactcagc cgtgggcccg ggcgcgggga ggccggcgca gcgcagagcg ggctcgaaga 2400 aaataattct ctattatttt tattaccaag cgcttctttc tgactctaaa atatggaaaa 2460 t 2461 7 2127 DNA Homo sapiens 7 ctcgcactcc ctctggccgg cccagggcgc cttcagccca acctccccag ccccacgggc 60 gccacggaac ccgctcgatc tcgccgccaa ctggtagaca tggagacccc tgcctggccc 120 cgggtcccgc gccccgagac cgccgtcgct cggacgctcc tgctcggctg ggtcttcgcc 180 caggtggccg gcgcttcagg cactacaaat actgtggcag catataattt aacttggaaa 240 tcaactaatt tcaagacaat tttggagtgg gaacccaaac ccgtcaatca agtctacact 300 gttcaaataa gcactaagtc aggagattgg aaaagcaaat gcttttacac aacagacaca 360 gagtgtgacc tcaccgacga gattgtgaag gatgtgaagc agacgtactt ggcacgggtc 420 ttctcctacc cggcagggaa tgtggagagc accggttctg ctggggagcc tctgtatgag 480 aactccccag agttcacacc ttacctggag acaaacctcg gacagccaac aattcagagt 540 tttgaacagg tgggaacaaa agtgaatgtg accgtagaag atgaacggac tttagtcaga 600 aggaacaaca ctttcctaag cctccgggat gtttttggca aggacttaat ttatacactt 660 tattattgga aatcttcaag ttcaggaaag aaaacagcca aaacaaacac taatgagttt 720 ttgattgatg tggataaagg agaaaactac tgtttcagtg ttcaagcagt gattccctcc 780 cgaacagtta accggaagag tacagacagc ccggtagagt gtatgggcca ggagaaaggg 840 
gaattcagag aaatattcta catcattgga gctgtggtat ttgtggtcat catccttgtc 900 atcatcctgg ctatatctct acacaagtgt agaaaggcag gagtggggca gagctggaag 960 gagaactccc cactgaatgt ttcataaagg aagcactgtt ggagctactg caaatgctat 1020 attgcactgt gaccgagaac ttttaagagg atagaataca tggaaacgca aatgagtatt 1080 tcggagcatg aagaccctgg agttcaaaaa actcttgata tgacctgtta ttaccattag 1140 cattctggtt ttgacatcag cattagtcac tttgaaatgt aacgaatggt actacaacca 1200 attccaagtt ttaattttta acaccatggc accttttgca cataacatgc tttagattat 1260 atattccgca ctcaaggagt aaccaggtcg tccaagcaaa aacaaatggg aaaatgtctt 1320 aaaaaatcct gggtggactt ttgaaaagct tttttttttt tttttttttg agacggagtc 1380 ttgctctgtt gcccaggctg gagtgcagta gcacgatctc ggctcactgc accctccgtc 1440 tctcgggttc aagcaattgt ctgcctcagc ctcccgagta gctgggatta caggtgcgca 1500 ctaccacacc aagctaattt ttgtattttt tagtagagat ggggtttcac catcttggcc 1560 aggctggtct tgaattcctg acctcagttg atccacccac cttggcctcc caaagtgcta 1620 gtattatggg cgtgaaccac catgcccagc cgaaaagctt ttgaggggct gacttcaatc 1680 catgtaggaa agtaaaatgg aaggaaattg ggtgcatttc taggactttt ctaacatatg 1740 tctataatat agtgtttagg ttcttttttt tttcaggaat acatttggaa attcaaaaca 1800 attggcaaac tttgtattaa tgtgttaagt gcaggagaca ttggtattct gggcaccttc 1860 ctaatatgct ttacaatctg cactttaact gacttaagtg gcattaaaca tttgagagct 1920 aactatattt ttataagact actatacaaa ctacagagtt tatgatttaa ggtacttaaa 1980 gcttctatgg ttgacattgt atatataatt ttttaaaaag gttttctata tggggatttt 2040 ctatttatgt aggtaatatt gttctatttg tatatattga gataatttat ttaatatact 2100 ttaaataaag gtgactggga attgtta 2127 8 5426 DNA Homo sapiens 8 ggggaggaag aaaggcgaag gcaaggcgaa ggggtggaga gtgatatgaa gagcgagaga 60 aaagagagga cagcggacga gcagatccgg tatctggaat cccggcgcct agaacgtgtt 120 tttcgggaga gcaaaggctg tgtctacggc aggctgggga tatagcctct ccttccgatg 180 aaaagagaaa ggaagaatgg actacagcca ccaaacgtcc ctagtcccat gtggacaaga 240 taaatacatt tccaaaaatg aacttctctt gcatctgaag acctacaact tgtactatga 300 aggccagaat ttacagctcc ggcaccggga ggaagaagac gagttcattg tggaggggct 360 cctgaacatc tcctggggcc 
tgcgccggcc cattcgcctg cagatgcagg atgacaacga 420 acgcattcga ccccctccat cctcctcctc ctggcactct ggctgtaacc tgggggctca 480 gggaaccact ctgaagcccc tgactgtgcc caaagttcag atctcagagg tggatgcccc 540 gccggagggt gaccagatgc caagctccac agactccagg ggcctgaagc ccctgcagga 600 ggacacccca cagctgatgc gcacacgcag tgatgttggg gtgcgtcgcc gtggcaatgt 660 gaggacgcct agtgaccagc ggcgaatcag acgccaccgc ttctccatca acggccattt 720 ctacaaccat aagacatccg tgttcacacc agcctatggc tctgtcacca acgtccgcat 780 caacagcacc atgaccaccc cacaggtcct gaagctgctg ctcaacaaat ttaagattga 840 gaattcagca gaggagtttg ccttgtacgt ggtccatacg agtggtgaga aacagaagct 900 gaaggccacc gattacccgc tgattgcccg aatcctccag ggcccatgtg agcagatctc 960 caaagtgttc ctaatggaga aggaccaggt ggaggaagtc acctacgacg tggcccagta 1020 tataaagttc gagatgccgg tacttaaaag cttcattcag aagctccagg aggaagaaga 1080 tcgggaagta aagaagctga tgcgcaagta caccgtgctc cggctaatga ttcgacagag 1140 gctggaggag atagccgaga ccccagcaac aatctgagcc atgagaacga ggggatctgg 1200 gcaccccagg aaccgccatt gcccataaga cccccaggaa gctaggcact ttctttccat 1260 ggaaacattt agacacaaac ctccccagct ccggccaagc catcatttgc tacctggagc 1320 tggatgtaga agtcagcaga cagctcccta tccctggacc cctgccctcc ttttttctgc 1380 tcacaaggac ttttgatttt agttataagg aggacccaaa atgtgtgtgt gtacatgtgt 1440 gtgcacacat ggtacgtgtc catgtgccta cctgatactt tcacatgtaa ttaaattcca 1500 ggcaaccagc acaagagccg tgagcttggc acatgtgctg ctcgtgagca ggaaaatcag 1560 aggagccact gatctgagtg gtatttaggt tgaaggaaag atttctcctc tcaagtgcca 1620 gggagcagcc acacgtctgt ctgtgtttag agagggaaga gggttctcca ggttcaccat 1680 ttgggttgtt tatatgttgg tagaaattct ccctgtatgc ctagaaggat cagtgaatgt 1740 aagagccttg gaaattaaca aaataacagc cacataacct tgcggcaagt ctgatggaaa 1800 gaaaaagata aaccatccgt ggggtagatg caataagccc acgtattttt acactggaaa 1860 cgttgattgt tttaaatgac aaagacatat gtgatgttct atgtggaaac ctgtgaagag 1920 tggattctgc ctccatctct gcctccatgg ctacctttag gagacagaga agatcctgtg 1980 tgtttctctg tacccagctg acagcctgtc tctatggcgc ttccttgagt ggaaggaaat 2040 gtctcaagaa acaaagatct cgctggtgcg tacacagtgc 
tgaccagcta gtgtggccag 2100 ggcctggtgg cctggtggcc aggaagtttc aggttgaagg gaaatgtcga ggctacctgc 2160 agatatgaca ggtgccttga acgcagccca tcttcatgtc atcaaaggtc ttcctgcact 2220 tgaagctggg gcgatgtttg cagtcaagac cattctttcc aacctctggg ttcttgcaag 2280 ttgccctcac cttgtgtgtg gagatgcatt ccaagaatga agcctcatct tgctactgag 2340 tgtggggttc agggaagctc tttaggccac ctggtgaagg tgcatgggga ggatggagct 2400 tctcctcagc tcctctgagc agccacctat gtgatcttta aatccaaccc caatgggaga 2460 aaagggcaag aacagtctgt gccctgggac tcctatcagg aagcttgaca ggcagctggg 2520 catcagtgca gctgatatcg tttgaggagg gagacagatg cttggacctg ggtgcctggc 2580 tatggagatt gaccaagcaa gatcaggagc tcctgatagc aggcgtcttt gagcctagct 2640 ggggtagagg cactgcccat ctcttctcca ccttctctcc acagaatgtt tgcagagctg 2700 ggcagttgag gaaaggacag cccctggttg gtgcctccaa aggaaggtgg acttttttgg 2760 tggagacgtt tctgccctgg gcaccctcct gcccccgatt catacctatg gcttcttgag 2820 aaggctcaca gctgtggtct taacgtagac tgcagaaaga tggcatgcgg cccctggcat 2880 ttcgccaagg gttttatagc aagtctcctt cctccatagg gacagcagca ccagccctgt 2940 ggggcatgga gtggaagccc agaagggctt ctgcaagctg cacagaactg gggtaagaag 3000 acaaagagta gccaccggga gaggcttcct ttgttacagc tgggaaagaa cagttctgtg 3060 aatgcaaaca cctcctgagt tttgcaattg agaaaatgat ttggagaact tctcttctgg 3120 taatttttat tttgaatgtt cagggcctta gttggcccca gtaattctcc ttggaggact 3180 tgggagaaga atttccacaa agcaaactac taaccactag ctcttactgg acagcgattt 3240 ctggcttata agagttctct ttgatttgca ctagcactac gatagtgtta gatggggaaa 3300 tactgcaaca tgtccagttg gccagatcac tttccaaggg agcgatacta aggcagactc 3360 agctttttaa agatgggagg tcaggaggtg gaagtgagag gagatcccat ctcacacaac 3420 acacttccac gtaatgcaga ccacactttt ccattttgtc ctgccctctt gagaggtcat 3480 ttctcacgtc ctaagaacct gatcagaaat tttggaaggg ttctttgaaa tagcagcagt 3540 tgaaacagag acactttgcc acagtgtgga gcagattttc tcactggtat cacatggtct 3600 tgcagttttg aactcttcga ccgatttgtg ggagtttatg taattgcgtg caatgaacct 3660 gaaattgtgt aaaggacaaa agaccagttt atagggttgg gttttttttc caacttgtga 3720 aaagcagttt agctgcatct gtctccccac cacccccacc ccgggagggg 
cttatgttac 3780 aaggtgatca agtgaaggaa aaacctgagc ctatctggct gggatggtgg aattaagcac 3840 aaggtcacat tctctgtgat cacatgagag ggaaggtgat gacttaaatg gcagggggtg 3900 gggattatct tggggagagg ctgaaaagca caaaagatag tcttccctgt acgtattggt 3960 gaagaacgtg cacaaggctg gatggacttc aacttggagt tgagttgagg caagaggatt 4020 tctggatatt agtcacccat ctgcaagaaa aatgctgagg cctcgggtca agattttgat 4080 ctgagacatg ctgatgcttc aaggagaaat attttcacaa tcctctcttc cctcaccaga 4140 agagaacagt actctctcct agaaacctct aggtaaacac attttatcct aatatcggta 4200 gcatataatg ccccccccaa aatatctgtt ttccatgcaa aaaagtctca acaagaagtc 4260 tgtggagttg agtggttact tcaaagtgtc aggagagtga agaaattggc cacagaagag 4320 caagaagctc tcttaagaaa agggaattct ctttaaagaa accaccacca acaacaaaac 4380 aaccaaaaac catgttttat gtcaaagctc tgtagcacag agaatgtggt gtcacagata 4440 catcgccgag agaggtttct ttctttcttt tttttttttt tgagacagag tctggttctg 4500 tttcccaggc tggagtgcag tggtgggatc tcagctcact gcaacatccg cctctggggt 4560 tcaagtgatt ctcctgtctc agcctcccaa gtagctggaa ttacagggac ccgccaccac 4620 gcccggctaa tttttttgtg tggttttagt agaggtgggg tttcaccatc ttggccaggc 4680 tggtcttgaa ctcctgacct cgtgatccac ccgcctaggc ctcccaaagt gttgggatta 4740 caggcgtgag ccactgtgcc cagccaaaag agaaatttct acatgaacaa ggcaatttca 4800 gtgtcttaca gcggccaaac catgacgtga agaatgagat aggagacagg agatcaccat 4860 aagcgtccct gatatagcag cacacatttt cacgtttcca cttaaatcgt tttgcacaaa 4920 gtcttgcttc gctcagatga gatgagatat gatttcctag agatgtaaaa ataagaatga 4980 atgtggcgcc cccttcttcc agatgtaata gaaagctctg ccctatcaca aggggggtgt 5040 tgaagcgccc cttgtgtttt aactgtattt aactgagcac aagatgcaca agctgtggtg 5100 ggaaaccctc agtttacctt tggagtcttc cctgcagatc gcagacctgt ttccaggctg 5160 atgtttctgg tgtgtaattg ctagcgtttc tgaagggttt tcccaattgt tttagccttg 5220 tgaagtattc ttaattataa cttgcctttc agcgatggta catgacttga ttcaacgttt 5280 ggttctgaac ttacacactg atgcgtttac tcatctaaca taatctgaca gggcctcagc 5340 aagggagcca tacatttttg taacattttg atatgtttta atgcatctga cttagatctt 5400 actgaaataa agcacttttc aaagag 5426 9 3095 DNA Homo sapiens 9 
tagcagagca atcaccacca agcctggaat aactgcaagg gctctgctga catcttcctg 60 aggtgccaag gaaatgagga tggaggaagg aatgaatgtt ctccatgact ttgggatcca 120 gtcaacacat tacctccagg tgaattacca agactcccag gactggttca tcttggtgtc 180 cgtgatcgca gacctcagga atgccttcta cgtcctcttc cccatctggt tccatcttca 240 ggaagctgtg ggcattaaac tcctttgggt agctgtgatt ggagactggc tcaacctcgt 300 ctttaagtgg attctctttg gacagcgtcc atactggtgg gttttggata ctgactacta 360 cagcaacact tccgtgcccc tgataaagca gttccctgta acctgtgaga ctggaccagg 420 gagcccctct ggccatgcca tgggcacagc aggtgtatac tacgtgatgg tcacatctac 480 tctttccatc tttcagggaa agataaagcc gacctacaga tttcggtgct tgaatgtcat 540 tttgtggttg ggattctggg ctgtgcagct gaatgtctgt ctgtcacgaa tctaccttgc 600 tgctcatttt cctcatcaag ttgttgctgg agtcctgtca ggcattgctg ttacagaaac 660 tttcagccac atccacagca tctataatgc cagcctcaag aaatattttc tcattacctt 720 cttcctgttc agcttcgcca tcggatttta tctgctgctc aagggactgg gtgtagacct 780 cctgtggact ctggagaaag cccagaggtg gtgcgagcag ccagaatggg tccacattga 840 caccacaccc tttgccagcc tcctcaagaa cctgggcacg ctctttggcc tggggctggc 900 tctcaactcc agcatgtaca gggagagctg caaggggaaa ctcagcaagt ggctcccatt 960 ccgcctcagc tctattgtag cctccctcgt cctcctgcac gtctttgact ccttgaaacc 1020 cccatcccaa gtcgagctgg tcttctacgt cttgtccttc tgcaagagtg cggtagtgcc 1080 cctggcatcc gtcagtgtca tcccctactg cctcgcccag gtcctgggcc agccgcacaa 1140 gaagtcgttg taagagatgt ggagtcttcg gtgtttaaag tcaacaacca tgccagggat 1200 tgaggaggac tactatttga agcaatgggc actggtattt ggagcaagtg acatgccatc 1260 cattctgccg tcgtggaatt aaatcacgga tggcagattg gagggtcgcc tggcttattc 1320 ccatgtgtga ctccagcctg ccctcagcac agactctttc agatggaggt gccatatcac 1380 gtacaccata tgcaagtttc ccgccaggag gtcctcctct ctctacttga atactctcac 1440 aagtagggag ctcactccca ctggaacagc ccattttatc tttgaatggt cttctgccag 1500 cccattttga ggccagaggt gctgtcagct caggtggtcc tcttttacaa tcctaatcat 1560 attgggtaat gtttttgaaa agctaatgaa gctattgaga aagacctgtt gctagaagtt 1620 gggttgttct ggattttccc ctgaagactt acttattctt ccgtcacata tacaaaagca 1680 agacttccag gtagggccag 
ctcacaagcc caggctggag atcctaactg agaattttct 1740 acctgtgttc attcttaccg agaaaaggag aaaggagctc tgaatctgat aggaaaagaa 1800 ggctgcctaa ggaggagttt ttagtatgtg gcgtatcatg caagtgctat gccaagccat 1860 gtctaaatgg ctttaattat atagtaatgc actctcagta atgggggacc agcttaagta 1920 taattaatag atggttagtg gggtaattct gcttctagta ttttttttac tgtgcataca 1980 tgttcatcgt atttccttgg atttctgaat ggctgcagtg acccagatat tgcactaggt 2040 caaaacattc aggtatagct gacatctcct ctatcacatt acatcatcct ccttataagc 2100 ccagctctgc tttttccaga ttcttccact ggctccacat ccaccccact ggatcttcag 2160 aaggctagag ggcgactctg gtggtgcttt tgtatgtttc aattaggctc tgaaatcttg 2220 ggcaaaatga caaggggagg gccaggattc ctctctcagg tcactccagt gttactttta 2280 attcctagag ggtaaatatg actcctttct ctatcccaag ccaaccaaga gcacattctt 2340 aaaggaaaag tcaacatctt ctctcttttt tttttttttt gagacagggt ctcactatgt 2400 tgcccaggct gctcttgaat tcctgggctc aagcagtcct cccaccctac cacagcgtcc 2460 cgcgtagctg gcatacaggt gcaagccact atgtccagct agccaactcc tccttgcctg 2520 cttttctttt tttttctttt tttgagacgg cgcacctatc acccaggctg gagtggagtg 2580 gcacgatctt ggctcactgc aacctcttcc tcctggttca agcgattctc atgtctcagc 2640 ctcctcagta gctaggacta ccggcgtgca ccaccatgcc aggctaattt ttatattttt 2700 agaattttag aagagatggg atttcatcat gttggccagg ctggtctcga actcctgacc 2760 tcaagtgatc cacctgcctt ggcctcccaa ggtgctagga ttacaggcat gagccaccgc 2820 accgggccct ccttgcctgt ttttcaatct catctgatat gcagagtatt tctgccccac 2880 ccacctaccc cccaaaaaaa gctgaagcct atttatttga aagtccttgt ttttgctact 2940 aattatatag tataccatac attatcattc aaaacaacca tcctgctcat aacatctttg 3000 aaaagaaaaa tatatatgtg cagtatttta ttaaagcaac attttattta agaataaagt 3060 cttgttaatt actatatttt agatgcaatg tgatc 3095 10 4460 DNA Homo sapiens 10 cggggcagca accaggagat tccctgggcc tgcaggaagc ccttccgcgg accgaaagat 60 tgttccccat tttggagatg aagaaactga gactcaaagc agctgagtga ccttcccaag 120 gacacacact gaactgggcg gtgatcagga tctgaatgca cagggcgggt gttcagcgat 180 tgtttactac gttgaacgtg acctccagga aagcagttct ggccgagatc ccctgacaac 240 gcaaagcaag aagtaacgtg gaaggaggct 
ccccaagctg gctggccatt ttgctgctgt 300 gtgtggaggt gctgtcagtg gcatgcccaa acccaaagct ggaagaggaa taaattacaa 360 gtggtcaagg ttgcatcctt ttgagctcag gacctgcttg taagccgaga gggttctctg 420 gccctaatct agccaagcac catggagaga atcagtgcct tcttcagctc tatctgggac 480 accatcttga ccaaacacca agaaggcatc tacaacacca tctgcctggg agtcctcctg 540 ggcctgccac tcttggtgat catcacactc ctcttcatct gttgccattg ctgctggagc 600 ccaccaggca agaggggcca gcagccagag aagaaaaaga agaagaagaa gaagaaggat 660 gaagaagacc tctggatctc tgctcaaccc aagcttctcc agatggagaa gagaccatca 720 ctgcctgttt agttaggcag gaagcagagg tgtttccttt ctggggctaa gcctccttct 780 gaccacacac agacatttca ggaacccctg aaataatgca ctatgtccat gtccacagag 840 taactactca accaaggaac aaacctcaga ctaagtgtcc cagtggaggg cagtcccagg 900 gaccacgtgg acaattcttg gatactgtct tggcagctat gtgtccaata gcaatgctcc 960 ttactgcaga cccaggcatg cctcccacct gtctctggca taccccacat gcaaagcaca 1020 aagaacattt atccatacat ctcaatatgg ttcccaagtg tgtgcacatg cacgtaacac 1080 acacacacac aaattcaggt agcaggtacg tgggcaagta tattctgctc atcaaatggt 1140 cattggctat gtactttgtg cagggaagta cattatctac agtcacaaaa atgtctcatg 1200 ggaaagcctt gccagattca gacacatata tacaatttcc taaccagcaa ggcccccata 1260 caccatctat tccataaacc actcaggtta cagatgcatg ctttcctatt tctaactcta 1320 cacataaact tttactggaa gtactcataa ttggacattc cagcaacctg ctacagtccc 1380 cacccttgtg tgtcttgata cagacacacc aagtttctgt gcctctgacc cctcacctgt 1440 gccaagatgt ttaaagtgtg atggttcaaa attcattgaa agctcttttc ttgtaactca 1500 tgacaaagtc cgtcctcatt gccactgaga ggtgtttaat gtgatccaag acctctctgt 1560 gaaacattac ccccgcaaac cactcagcaa agtgcctttc tccaagcaag aacaaagagc 1620 tcttggtggt gactgctaga aaattatgga agcccactca tttatgtcag tggactgcaa 1680 ctgtgtacct gtgcaatgtt tacagatgga aagggtgagg agatgctaca cctgagctag 1740 gtatctccta tataaccaaa gtttccagca gggaaggaac tagacaatca tcagtgcagt 1800 ctcacagaag gcaacactgg aagtgatgtc ataaggttgt gatgtgtgca cggtacggca 1860 caggtgggat gcagaggtaa cagagtttaa atgaaagtag gatgaagcta taaagaggtt 1920 tatttatatt tatattgaag ctcaggcaag tgccttgcac acagtaggta 
cttataacta 1980 actgtggtta ctgttggata tgtgatgttg ttaagggtaa gcttgtaata cctcaccaat 2040 tctctgcgag tgatcttctc ttctaagtga gcccactaat tgctgcaatg gatgaaattg 2100 ggtgtttaat gctggagagc acatgtaggt gacacatgtg ccttgaggta tgtgaggaca 2160 tgtaaattag atccacagtg agctgaggag ggctttcccc gccagagtga ggttgggaag 2220 cagagttaat ccacttatag gatgaactgc ttggtatttt tattgtattg tgactgtatt 2280 acaaagatgg acaattcact ccttgggagc aagttatgct ctagaagttt atttacaaat 2340 atgctgggca gctctcttga aatattttcc caaggaagct attctacaca gtggcaaaat 2400 tgctatctaa ttaataatgt agctaaacta tgatatttat agtagcaaaa aactaaattc 2460 tataagattg cattaaagga aagatatatt ctatttgctc acttgggctg cttggtactc 2520 acctgccctc caggtgtact ttaggcctgt ggagggtggg catttagtgg tgacccttgc 2580 accagggttt tctaacagat gaccctgtga atcataattt aaacctgcat atattttata 2640 gccagtcaca tttgccctct caccctatat ggccataaac tgcctaagca ctcaggcctc 2700 ccactcatca acccctttga ccagagaaag aagcactctg gttctctatc cccttgtcac 2760 atagagagtt tgtcatgggg cctctggctg tgcccttcac ataacagaat aacttgccat 2820 ctgcctgcac caaacccagg gatgtggaag acatctcccc acaactgcca ctgctcacca 2880 ggacaagctg cccttcctgt ctccacctct cagtccccct agaatggatg gctggggaga 2940 ggtggaggct gacagctgag acgtagtgtc agatatgatc taggagggcg gatcaccggg 3000 atccgggacc atacaagtaa catggtttcc atggcaactg cttgctcgtt tgaattaaga 3060 cagcagtcag ttgtcattgc catgacaagg cctctatctc caggcacaat gtccctgctg 3120 tctcctaatc caatggactt gctctcaccc cagggatgaa acacccagaa actcacttct 3180 cagtcacttc cacagccgat gactcagaag agccaaaccc agaatggggc ctctcttttc 3240 cccatcacag actcccctga caacctttcc tggcgtaact agaggagtcc cagtgcagga 3300 taggccctaa acgttttgtt aaataaacag gtgcatgaaa ggagcctaag gccattgttg 3360 atatccactc tcttctttcc acttccttct catctttttc tccatgtttt atgcttctct 3420 gattccctct tctgcctgca ccagaccagc cccagccctt tattcctctc cattttcact 3480 ccttccagcc tctgtccctg aactgccact ggcaacccat gggacctcag gaccagagac 3540 tgcttgactc atctggggag ggtaagttca cgggggacaa aaaaatgatt cctaaagaag 3600 aggcttccta gaccagcaca ggctccagaa agacatcccc taggcctgga cttctgagca 
3660 gctttagcca ggctccggac ggcagccaga ggaggccttt ccccattgct cctttcccca 3720 ttgctcaatg gattccatgt ttctttttct tggggggagc agggagggag aaaggtagaa 3780 aaatggcagc cacctttcca agaaaaatat aaagggtcca agctgtatag tatttgtcag 3840 tatttttttc tgtaaaattc gaacacacac aaaagaaaaa tttatttaaa taaaatactt 3900 tgaaaatgaa aagtcttgat gtagtcagat ggttactttc ttaacattag gtattacccc 3960 cactcagaca tcactcagaa atgatcaatg cagggactct ttctgtgaca caaatgtccc 4020 agccctccct ggtcaccgcc ttcgccatgg tagagtcgta ggtctgagga tgaggaatgt 4080 ggctgtctca cccttgcttg caaaacagat ggccttggag accagactcc ctcaaaggtg 4140 ccagctacag gaaaaataca ctgatgttcc ttggcaacac ttacagaact ttccatcaat 4200 gagggtccat caatggcttc ttaaaggaaa aggggggaaa tagcaaaaac ctaaggaaga 4260 atggaccttt gagttaaatc cagtgtttgt tgggaaagga gggatcaaaa acctctatag 4320 tagccactag ggcaaaaact gtgtgtatgt gtgtgtgtat gtgtgtgtac actgttcaat 4380 atggttcaat atggtaccaa tagccacatg tgactattta aattcattgc aatgaaataa 4440 aattaaaggt atactagctc 4460 11 3076 DNA Homo sapiens 11 gaattcaaaa tgtcttcagt tgtaaatctt accattattt tacgtacctc taagaaataa 60 aagtgcttct aattaaaata tgatgtcatt aattatgaaa tacttcttga taacagaagt 120 tttaaaatag ccatcttaga atcagtgaaa tatggtaatg tattattttc ctcctttgag 180 ttaggtcttg tgcttttttt tcctggccac taaatttcac aatttccaaa aagcaaaata 240 aacatattct gaatattttt gctgtgaaac acttgacagc agagctttcc accatgaaaa 300 gaagcttcat gagtcacaca ttacatcttt gggttgattg aatgccactg aaacattcta 360 gtagcctgga gaagttgacc tacctgtgga gatgcctgcc attaaatggc atcctgatgg 420 cttaatacac atcactcttc tgtgaagggt tttaattttc aacacagctt actctgtagc 480 atcatgttta cattgtatgt ataaagatta tacaaaggtg caattgtgta tttcttcctt 540 aaaatgtatc agtataggat ttagaatctc catgttgaaa ctctaaatgc atagaaataa 600 aaataataaa aaatttttca ttttggcttt tcagcctagt attaaaactg ataaaagcaa 660 agccatgcac aaaactacct ccctagagaa aggctagtcc cttttcttcc ccattcattt 720 cattatgaac atagtagaaa acagcatatt cttatcaaat ttgatgaaaa gcgccaacac 780 gtttgaactg aaatacgact tgtcatgtga actgtaccga atgtctacgt attccacttt 840 tcctgctggg gttcctgtct cagaaaggag 
tcttgctcgt gctggtttct attacactgg 900 tgtgaatgac aaggtcaaat gcttctgttg tggcctgatg ctggataact ggaaaagagg 960 agacagtcct actgaaaagc ataaaaagtt gtatcctagc tgcagattcg ttcagagtct 1020 aaattccgtt aacaacttgg aagctacctc tcagcctact tttccttctt cagtaacaaa 1080 ttccacacac tcattacttc cgggtacaga aaacagtgga tatttccgtg gctcttattc 1140 aaactctcca tcaaatcctg taaactccag agcaaatcaa gatttttctg ccttgatgag 1200 aagttcctac cactgtgcaa tgaataacga aaatgccaga ttacttactt ttcagacatg 1260 gccattgact tttctgtcgc caacagatct ggcaaaagca ggcttttact acataggacc 1320 tggagacaga gtggcttgct ttgcctgtgg tggaaaattg agcaattggg aaccgaagga 1380 taatgctatg tcagaacacc tgagacattt tcccaaatgc ccatttatag aaaatcagct 1440 tcaagacact tcaagataca cagtttctaa tctgagcatg cagacacatg cagcccgctt 1500 taaaacattc tttaactggc cctctagtgt tctagttaat cctgagcagc ttgcaagtgc 1560 gggtttttat tatgtgggta acagtgatga tgtcaaatgc ttttgctgtg atggtggact 1620 caggtgttgg gaatctggag atgatccatg ggttcaacat gccaagtggt ttccaaggtg 1680 tgagtacttg ataagaatta aaggacagga gttcatccgt caagttcaag ccagttaccc 1740 tcatctactt gaacagctgc tatccacatc agacagccca ggagatgaaa atgcagagtc 1800 atcaattatc cattttgaac ctggagaaga ccattcagaa gatgcaatca tgatgaatac 1860 tcctgtgatt aatgctgccg tggaaatggg ctttagtaga agcctggtaa aacagacagt 1920 tcaaagaaaa atcctagcaa ctggagagaa ttatagacta gtcaatgatc ttgtgttaga 1980 cttactcaat gcagaagatg aaataaggga agaggagaga gaaagagcaa ctgaggaaaa 2040 agaatcaaat gatttattat taatccggaa gaatagaatg gcactttttc aacatttgac 2100 ttgtgtaatt ccaatcctgg atagtctact aactgccgga attattaatg aacaagaaca 2160 tgatgttatt aaacagaaga cacagacgtc tttacaagca agagaactga ttgatacgat 2220 tttagtaaaa ggaaatattg cagccactgt attcagaaac tctctgcaag aagctgaagc 2280 tgtgttatat gagcatttat ttgtgcaaca ggacataaaa tatattccca cagaagatgt 2340 ttcagatcta ccagtggaag aacaattgcg gagactacaa gaagaaagaa catgtaaagt 2400 gtgtatggac aaagaagtgt ccatagtgtt tattccttgt ggtcatctag tagtatgcaa 2460 agattgtgct ccttctttaa gaaagtgtcc tatttgtagg agtacaatca agggtacagt 2520 tcgtacattt ctttcatgaa gaagaaccaa aacatcatct 
aaactttaga attaatttat 2580 taaatgtatt ataactttaa cttttatcct aatttggttt ccttaaaatt tttatttatt 2640 tacaactcaa aaaacattgt tttgtgtaac atatttatat atgtatctaa accatatgaa 2700 catatatttt ttagaaacta agagaatgat aggcttttgt tcttatgaac gaaaaagagg 2760 tagcactaca aacacaatat tcaatcaaaa tttcagcatt attgaaattg taagtgaagt 2820 aaaacttaag atatttgagt taacctttaa gaattttaaa tattttggca ttgtactaat 2880 acctggtttt ttttttgttt tgtttttttg tacagacagg gcagcatact gagaccctgc 2940 ctttaaaaac aaacagaaca aaaacaaaac accagggaca catttctctg tcttttttga 3000 tcagtgtcct atacatcgaa ggtgtgcata tatgttgaat gacattttag ggacatggtg 3060 tttttataaa gaattc 3076 12 3056 DNA Homo sapiens 12 cccagctggt gctgaagctc gtcagttcac catccgccct cggcttccgc ggggcgctgg 60 gccgccagcc tcggcaccgt cctttccttt ctccctcgcg ttaggcaggt gacagcaggg 120 acatgtctcg ggagatgcag gatgtagacc tcgctgaggt gaagcctttg gtggagaaag 180 gggagaccat caccggcctc ctgcaagagt ttgatgtcca ggagcaggac atcgagactt 240 tacatggctc tgttcacgtc acgctgtgtg ggactcccaa gggaaaccgg cctgtcatcc 300 tcacctacca tgacatcggc atgaaccaca aaacctgcta caaccccctc ttcaactacg 360 aggacatgca ggagatcacc cagcactttg ccgtctgcca cgtggacgcc cctggccagc 420 aggacggcgc agcctccttc cccgcagggt acatgtaccc ctccatggat cagctggctg 480 aaatgcttcc tggagtcctt caacagtttg ggctgaaaag cattattggc atgggaacag 540 gagcaggcgc ctacatccta actcgatttg ctctaaacaa ccctgagatg gtggagggcc 600 ttgtccttat caacgtgaac ccttgtgcgg aaggctggat ggactgggcc gcctccaaga 660 tctcaggatg gacccaagct ctgccggaca tggtggtgtc ccaccttttt gggaaggaag 720 aaatgcagag taacgtggaa gtggtccaca cctaccgcca gcacattgtg aatgacatga 780 accccggcaa cctgcacctg ttcatcaatg cctacaacag ccggcgcgac ctggagattg 840 agcgaccaat gccgggaacc cacacagtca ccctgcagtg ccctgctctg ttggtggttg 900 gggacagctc gcctgcagtg gatgccgtgg tggagtgcaa ctcaaaattg gacccaacaa 960 agaccactct cctcaagatg gcggactgtg gcggcctccc gcagatctcc cagccggcca 1020 agctcgctga ggccttcaag tacttcgtgc agggcatggg atacatgccc tcggctagca 1080 tgacccgcct gatgcggtcc cgcacagcct ctggttccag cgtcacttct ctggatggca 1140 cccgcagccg ctcccacacc 
agcgagggca cccgaagccg ctcccacacc agcgagggca 1200 cccgcagccg ctcgcacacc agcgaggggg cccacctgga catcaccccc aactcgggtg 1260 ctgctgggaa cagcgccggg cccaagtcca tggaggtctc ctgctaggcg gcctgcccag 1320 ctgccgcccc cggactctga tctctgtagt ggccccctcc tccccggccc cttttcgccc 1380 cctgcctgcc atactgcgcc taactcggta ttaatccaaa gcttattttg taagagtgag 1440 ctctggtgga gacaaatgag gtctattacg tgggtgccct ctccaaaggc ggggtggcgg 1500 tggaccaaag gaaggaagca agcatctccg catcgcatcc tcttccatta accagtggcc 1560 ggttgccact ctcctcccct ccctcagaga caccaaactg ccaaaaacaa gacgcgtagc 1620 agcacacact tcacaaagcc aagcctaggc cgccctgagc atcctggttc aaacgggtgc 1680 ctggtcagaa ggccagccgc ccacttcccg tttcctcttt aactgaggag aagctgatcc 1740 agtttccgga aacaaaatcc ttttctcatt tggggagggg ggtaatagtg acatgcaggc 1800 acctctttta aacaggcaaa acaggaaggg ggaaaaggtg ggattcatgt cgaggctaga 1860 ggcatttgga acaacaaatc tacgtagtta acttgaagaa accgattttt aaagttggtg 1920 catctagaaa gctttgaatg cagaagcaaa caagcttgat ttttctagca tcctcttaat 1980 gtgcagcaaa agcaggcaac aaaatctcct ggctttacag acaaaaatat ttcagcaaac 2040 gttgggcatc atggtttttg aaggctttag ttctgctttc tgcctctcct ccacagcccc 2100 aacctcccac ccctgataca tgagccagtg attattcttg ttcagggaga agatcattta 2160 gatttgtttt gcattcctta gaatggaggg caacattcca cagctgccct ggctgtgatg 2220 agtgtccttg caggggccgg agtaggagca ctggggtggg ggcggaattg gggttactcg 2280 atgtaaggga ttccttgttg ttgtgttgag atccagtgca gttgtgattt ctgtggatcc 2340 cagcttggtt ccaggaattt tgtgtgattg gcttaaatcc agttttcaat cttcgacagc 2400 tgggctggaa cgtgaactca gtagctgaac ctgtctgacc cggtcacgtt cttggatcct 2460 cagaactctt tgctcttgtc ggggtggggg tgggaactca cgtggggagc ggtggctgag 2520 aaaatgtaag gattctggaa tacatattcc atgggacttt ccttccctct cctgcttcct 2580 cttttcctgc tccctaacct ttcgccgaat ggggcagcac cactgacgtt tctgggcggc 2640 cagtgcggct gccaggttcc tgtactactg ccttgtactt ttcattttgg ctcaccgtgg 2700 attttctcat aggaagtttg gtcagagtga attgaatatt gtaagtcagc cactgggacc 2760 cgaggatttc tgggaccccg cagttgggag gaggaagtag tccagccttc caggtggcgt 2820 gagaggcaat gactcgttac ctgccgccca 
tcaccttgga ggccttccct ggccttgagt 2880 agaaaagtcg gggatcgggg caagagaggc tgagtacgga tgggaaacta ttgtgcacaa 2940 gtctttccag aggagtttct taatgagata tttgtattta tttccagacc aataaatttg 3000 taactttgca gcggaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 3056 13 1930 DNA Homo sapiens 13 ggagagagag aggacagaga gcaagtcact cccggctgcc tttttcacct ctgacagagc 60 ccagacacca tgaacgcaag tgaattccga aggagaggga aggagatggt ggattacgtg 120 gccaactaca tggaaggcat tgagggacgc caggtctacc ctgacgtgga gcccgggtac 180 ctgcggccgc tgatccctgc cgctgcccct caggagccag acacgtttga ggacatcatc 240 aacgacgttg agaagataat catgcctggg gtgacgcact ggcacagccc ctacttcttc 300 gcctacttcc ccactgccag ctcgtacccg gccatgcttg cggacatgct gtgcggggcc 360 attggctgca tcggcttctc ctgggcggca agcccagcat gcacagagct ggagactgtg 420 atgatggact ggctcgggaa gatgctggaa ctaccaaagg catttttgaa tgagaaagct 480 ggagaagggg gaggagtgat ccagggaagt gccagtgaag ccaccctggt ggccctgctg 540 gccgctcgga ccaaagtgat ccatcggctg caggcagcgt ccccagagct cacacaggcc 600 gctatcatgg agaagctggt ggcttactca tccgatcagg cacactcctc agtggaaaga 660 gctgggttaa ttggtggagt gaaattaaaa gccatcccct cagatggcaa cttcgccatg 720 cgtgcgtctg ccctgcagga agccctggag agagacaaag cggctggcct gattcctttc 780 tttatggttg ccaccctggg gaccacaaca tgctgctcct ttgacaatct cttagaagtc 840 ggtcctatct gcaacaagga agacatatgg ctgcacgttg atgcagccta cgcaggcagt 900 gcattcatct gccctgagtt ccggcacctt ctgaatggag tggagtttgc agattcattc 960 aactttaatc cccacaaatg gctattggtg aattttgact gttctgccat gtgggtgaaa 1020 aagagaacag acttaacggg agcctttaga ctggacccca cttacctgaa gcacagccat 1080 caggattcag ggcttatcac tgactaccgg cattggcaga taccactggg cagaagattt 1140 cgctctttga aaatgtggtt tgtatttagg atgtatggag tcaaaggact gcaggcttat 1200 atccgcaagc atgtccagct gtcccatgag tttgagtcac tggtgcgcca ggatccccgc 1260 tttgaaatct gtgtggaagt cattctgggg cttgtctgct ttcggctaaa gggttccaac 1320 aaagtgaatg aagctcttct gcaaagaata aacagtgcca aaaaaatcca cttggttcca 1380 tgtcacctca gggacaagtt tgtcctgcgc tttgccatct gttctcgcac ggtggaatct 1440 gcccatgtgc agcgggcctg ggaacacatc 
aaagagctgg cggccgacgt gctgcgagca 1500 gagagggagt aggagtgaag ccagctgcag gaatcaaaaa ttgaagagag atatatctga 1560 aaactggaat aagaagcaaa taaatatcat cctgccttca tggaactcag ctgtctgtgg 1620 cttcccatgt ctttctccaa agccatccag agggttgtga ttttgtctgc ttagtatctc 1680 atcaacaaag aaatattatt tgctaattaa aaagttaatc ttcatggcca tagcttttat 1740 tcattagctg tgatttttgt tgattaaaac attatagatt ttcatgttct tgcagtcatc 1800 agaagtggta ggaaagcctc actgatatat tttccagggc aatcaatgtt cacgcaactt 1860 gaaattatat ctgtggtctt caaattgtct tttgtcatgt ggctaaatgc ctaataaaca 1920 attcaagtga 1930 14 512 DNA Homo sapiens 14 gccctttctg cctctgcggg gctctggtcg ccggccaagg aaaaacgagg ctggaccctg 60 aacagcgcgg gctacctgct gggcccacat gccgttggca accacaggtc attcagcgac 120 aagaatggcc tcaccagcaa gcgggagctg cggcccgaag atgacatgaa accaggaagc 180 tttgacaggt ccatacctga aaacaatatc atgcgcacaa tcattgagtt tctgtctttc 240 ttgcatctca aagaggccgg tgccctcgac cgcctcctgg atctccccgc cgcagcctcc 300 tcagaagaca tcgagcggtc ctgagagcct cctgggcacg tttgtctgtg tgctgtaacc 360 tgaagtcaaa ccttaagata atggataatc ttcggccaat ttatgcggag tcagccattc 420 ctgttctctt tgccttgatg ttgtgttgtt atcatttaag attttttttt tttggtaatt 480 attttgagtg gcaaaataaa gaatagcaat ta 512 15 1637 DNA Homo sapiens 15 gaggcgaacc ggagcgcggg gccgcggtcg ccccgaccag agccgggaga ccgcagcacc 60 cgcagccgcc cgcgagcgcg ccgaagacag cgcgcaggcg agagcgcgcg ggcgggggcg 120 cgcaggccct gcccgcccct tccgtcccca cccccctccg ccctttcctc tccccacctt 180 cctctcgcct cccgcgcccc cgcaccgggc gcccaccctg tcctcctcct gcgggagcgt 240 tgtccgtgtt ggcggccgca gcgggccggg ccggtccggc gggccggggg atggcgctgc 300 tggacctggc cttggaggga atggccgtct tcgggttcgt cctcttcttg gtgctgtggc 360 tgatgcattt catggctatc atctacaccc gattacacct caacaagaag gcaactgaca 420 aacagcctta tagcaagctc ccaggtgtct ctcttctgaa accactgaaa ggggtagatc 480 ctaacttaat caacaacctg gaaacattct ttgaattgga ttatcccaaa tatgaagtgc 540 tcctttgtgt acaagatcat gatgatccag ccattgatgt atgtaagaag cttcttggaa 600 aatatccaaa tgttgatgct agattgttta taggtggtaa aaaagttggc attaatccta 660 aaattaataa tttaatgcca 
ggatatgaag ttgcaaagta tgatcttata tggatttgtg 720 atagtggaat aagagtaatt ccagatacgc ttactgacat ggtgaatcaa atgacagaaa 780 aagtaggctt ggttcacggg ctgccttacg tagcagacag acagggcttt gctgccacct 840 tagagcaggt atattttgga acttcacatc caagatacta tatctctgcc aatgtaactg 900 gtttcaaatg tgtgacagga atgtcttgtt taatgagaaa agatgtgttg gatcaagcag 960 gaggacttat agcttttgct cagtacattg ccgaagatta ctttatggcc aaagcgatag 1020 ctgaccgagg ttggaggttt gcaatgtcca ctcaagttgc aatgcaaaac tctggctcat 1080 attcaatttc tcagtttcaa tccagaatga tcaggtggac caaactacga attaacatgc 1140 ttcctgctac aataatttgt gagccaattt cagaatgctt tgttgccagt ttaattattg 1200 gatgggcagc ccaccatgtg ttcagatggg atattatggt atttttcatg tgtcattgcc 1260 tggcatggtt tatatttgac tacattcaac tcaggggtgt ccagggtggc acactgtgtt 1320 tttcaaaact tgattatgca gtcgcctggt tcatccgcga atccatgaca atatacattt 1380 ttttgtctgc attatgggac ccaactataa gctggagaac tggtcgctac agattacgct 1440 gtgggggtac agcagaggaa atcctagatg tataactaca gctttgtgac tgtatataaa 1500 ggaaaaaaga gaagtattat aaattatgtt tatataaatg cttttaaaaa tctaccttct 1560 gtagttttat cacatgtatg ttttggtatc tgttctttaa tttatttttg catggcactt 1620 gcatctgtga aaaaaaa 1637 16 2172 DNA Homo sapiens 16 agatcatcaa atcaaattcc acagggattg gtgaccaacc agaaggctca gacatctgat 60 tgctgacctg tccagacatc atctggtctc cctgaacctg aaatcacacc atggatgatt 120 ttgagcgtcg cagagaactt agaaggcaaa agagggagga gatgcgactc gaagcagaaa 180 gaatcgccta ccagaggaat gacgatgatg aagaggaggc agcccgggaa cgccgccgcc 240 gagcccgaca ggaacggctg cggcagaagc aggaggaaga atccttggga caggtgaccg 300 accaggtgga ggtgaatgcc cagaacagtg tgcctgacga ggaggccaag acaaccacca 360 caaacactca agtggaaggg gatgatgagg ccgcattcct ggagcgcctg gctcggcgtg 420 aggaaagacg ccaaaaacgc cttcaggagg ctctggagcg gcagaaggag ttcgacccaa 480 caataacaga tgcaagtctg tcgctcccaa gcagaagaat gcaaaatgac acagcagaaa 540 atgaaactac cgagaaggaa gaaaaaagtg aaagtcgcca agaaagatac gagatagagg 600 aaacagaaac agtcaccaag tcctaccaga agaatgattg gagggatgct gaagaaaaca 660 agaaagaaga caaggaaaag gaggaggagg aagaggagaa gccaaagcga gggagcattg 720 
gagaaaatca gatcaaagat gaaaagatta aaaaggacaa agaacccaaa gaagaagtta 780 agagcttcat ggatcgaaag aagggattta cagaagttaa gtcgcagaat ggagaattca 840 tgacccacaa acttaaacat actgagaata ctttcagccg ccctggaggg agggccagcg 900 tggacaccaa ggaggctgag ggcgcccccc aggtggaagc cggcaaaagg ctggaggagc 960 ttcgtcgtcg tcgcggggag accgagagcg aagagttcga gaagctcaaa cagaagcagc 1020 aggaggcggc tttggagctg gaggaactca agaaaaagag ggaggagaga aggaaggtcc 1080 tggaggagga agagcagagg aggaagcagg aggaagccga tcgaaaactc agagaggagg 1140 aagagaagag gaggctaaag gaagagattg aaaggcgaag agcagaagct gctgagaaac 1200 gccagaagat gccagaagat ggcttgtcag atgacaagaa accattcaag tgtttcactc 1260 ctaaaggttc atctctcaag atagaagagc gagcagaatt tttgaataag tctgtgcaga 1320 aaagcagtgg tgtcaaatcg acccatcaag cagcaatagt ctccaagatt gacagcagac 1380 tggagcagta taccagtgca attgagggaa caaaaagcgc aaaacctaca aagccggcag 1440 cctcggatct tcctgttcct gctgaaggtg tacgcaacat caagagtatg tgggagaaag 1500 ggaatgtgtt ttcatccccc actgcagcag gcacaccaaa taaggaaact gctggcttga 1560 aggtaggggt ttctagccgc atcaatgaat ggctaactaa aaccccagat ggaaacaagt 1620 cacctgctcc caaaccttct gacttgagac caggagacgt atccagcaag cggaacctct 1680 gggaaaagca atctgtggat aaggtcactt cccccactaa ggtttgagac agttccagaa 1740 agaacccaag ctcaagacgc aggacgagct cagttgtaga gggctaattc gctctgtttt 1800 gtatttatgt tgatttacta aattgggttc attatctttt atttttcaat atcccagtaa 1860 acccatgtat attatcacta tatttaataa tcacagtcta gagatgttca tggtaaaagt 1920 actgcctttg cacaggatcc tgtttctaaa gaaacccatg ctgtgaaata gagacttttc 1980 tactgatcat cataactctg tatctgagca gtgataccaa ccacatctga agtcaacaga 2040 agatccaagt ttaaaattgc tgcggaatgt gtgcagtatc tagaaaaatg aaccgtagtt 2100 tttgtttttt taaatacaga agtcatgttg tttctgcact ttataataaa gcatggaaga 2160 aattatctta gt 2172 17 5035 DNA Homo sapiens 17 gcggcggcgg cggcggcggc ggcagcggcg gccaagcggc caggttggcg gccggggctc 60 cgggccgcgc gaggccacgg ccacgccgcg ccgctgcgca caaccaacga ggcagagcgc 120 cgcccggcgc gagactgcgg ccgaagcgtg gggcgcgcgt gcggaggacc aggcgcggcg 180 cggctgcggc tgagagtgga gcctttcagg ctggcatgga 
gagcttaagg ggcaactgaa 240 ggagacacac tggccaagcg cggagttctg cttacttcag tcctgctgag atactctctc 300 agtccgctcg caccgaagga agctgccttg ggatcagagc agacataaag ctagaaaaat 360 ttcaagacag aaacagtctc cgccagtcaa gaaaccctca aaagtatttt gccatggata 420 tagaagatga agaaaacatg agttccagca gcactgatgt gaaggaaaac cgcaatctgg 480 acaacgtgtc ccccaaggat ggcagcacac ctgggcctgg cgagggctct cagctctcca 540 atgggggtgg tggtggcccc ggcagaaagc ggcccctgga ggagggcagc aatggccact 600 ccaagtaccg cctgaagaaa aggaggaaaa caccagggcc cgtcctcccc aagaacgccc 660 tgatgcagct gaatgagatc aagcctggtt tgcagtacac actcctgtcc cagactgggc 720 ccgtgcacgc gcctttgttt gtcatgtctg tggaggtgaa tggccaggtt tttgagggct 780 ctggtcccac aaagaaaaag gcaaaactcc atgctgctga gaaggccttg aggtctttcg 840 ttcagtttcc taatgcctct gaggcccacc tggccatggg gaggaccctg tctgtcaaca 900 cggacttcac atctgaccag gccgacttcc ctgacacgct cttcaatggt tttgaaactc 960 ctgacaaggc ggagcctccc ttttacgtgg gctccaatgg ggatgactcc ttcagttcca 1020 gcggggacct cagcttgtct gcttccccgg tgcctgccag cctagcccag cctcctctcc 1080 ctgtcttacc accattccca cccccgagtg ggaagaatcc cgtgatgatc ttgaacgaac 1140 tgcgcccagg actcaagtat gacttcctct ccgagagcgg ggagagccat gccaagagct 1200 tcgtcatgtc tgtggtcgtg gatggtcagt tctttgaagg ctcggggaga aacaagaagc 1260 ttgccaaggc ccgggctgcg cagtctgccc tggccgccat ttttaacttg cacttggatc 1320 agacgccatc tcgccagcct attcccagtg agggtcttca gctgcattta ccgcaggttt 1380 tagctgacgc tgtctcacgc ctggtcctgg gtaagtttgg tgacctgacc gacaacttct 1440 cctcccctca cgctcgcaga aaagtgctgg ctggagtcgt catgacaaca ggcacagatg 1500 ttaaagatgc caaggtgata agtgtttcta caggaacaaa atgtattaat ggtgaataca 1560 tgagtgatcg tggccttgca ttaaatgact gccatgcaga aataatatct cggagatcct 1620 tgctcagatt tctttataca caacttgagc tttacttaaa taacaaagat gatcaaaaaa 1680 gatccatctt tcagaaatca gagcgagggg ggtttaggct gaaggagaat gtccagtttc 1740 atctgtacat cagcacctct ccctgtggag atgccagaat cttctcacca catgagccaa 1800 tcctggaagg gtctcgctct tacacccagg ctggagtgca gtggtgcaat catggctcac 1860 tgcagcctcg acctcctggg ctcttaagcg atccttccac ctcaaccttc caaggagctg 
1920 ggactacaga accagcagat agacacccaa atcgtaaagc aagaggacag ctacggacca 1980 aaatagagtc tggtgagggg acgattccag tgcgctccaa tgcgagcatc caaacgtggg 2040 acggggtgct gcaaggggag cggctgctca ccatgtcctg cagtgacaag attgcacgct 2100 ggaacgtggt gggcatccag ggatccctgc tcagcatttt cgtggagccc atttacttct 2160 cgagcatcat cctgggcagc ctttaccacg gggaccacct ttccagggcc atgtaccagc 2220 ggatctccaa catagaggac ctgccacctc tctacaccct caacaagcct ttgctcagtg 2280 gcatcagcaa tgcagaagca cggcagccag ggaaggcccc caacttcagt gtcaactgga 2340 cggtaggcga ctccgctatt gaggtcatca acgccacgac tgggaaggat gagctgggcc 2400 gcgcgtcccg cctgtgtaag cacgcgttgt actgtcgctg gatgcgtgtg cacggcaagg 2460 ttccctccca cttactacgc tccaagatta ccaagcccaa cgtgtaccat gagtccaagc 2520 tggcggcaaa ggagtaccag gccgccaagg cgcgtctgtt cacagccttc atcaaggcgg 2580 ggctgggggc ctgggtggag aagcccaccg agcaggacca gttctcactc acgccctgac 2640 ccgggcagac atgatggggg gtgcaggggg ctgtgggcat ccagcgtcat cctccagaac 2700 ctcacatctg aactgggggc aggtgcatac cttggggagg gagtaggggg acacggggga 2760 ccaccaggtg tccacggttg tccccagcat ctcacatcag acctggggca ggtgcgcagt 2820 gtggggaggg gatggggtgc gtcagggccc agcatcgccg cctggcatct ctctgccgca 2880 gcatttcccc ttctgaaccg tccagtgact gctttcaatc tcggtttacg tttagaaatt 2940 gagttctact gagtagggct tccttaagtt taggaaaata gaaattactt tgtgtgaaat 3000 tcttgaataa ataatttatt cagagctagg aatgtggttt ataaaatagg aagtaattgt 3060 gtcaggtcac ttttatgcca cattatttta attgcaaaaa agcatctata tatggaggag 3120 ggtgggaaaa tagaggtagg aaatagtagc ctaaaggaaa tcgccacacg tctgtctaaa 3180 cttaggtctc ttttctccgt aggtacctcc ctgggtagtt ccacacacta ggttgtaaca 3240 gtctctccct gaggagcaga ctcccagcat ggtgtagcgt ggccctgtca tgcacatggg 3300 gtcccgcagc agtgactgtg tgtcctgcag aggcgtgacc caggcccctg tagccctcag 3360 cctcctctag aagcttctgt actccttgta ggatcagatc atggaaaact tttctcagtt 3420 tacttctaag taatcacaga taatacatgg ccagtaatcc caggctggcc attcattcag 3480 gttttttaaa ggatatttaa cttttatgga ctagaaggaa tcacgagggc tactgcacaa 3540 tacatggcct aagttccctc tgttccttcc tctgaatcga atggatgtgg gtgaccgccc 3600 
gaaggccttc acaggatgga agtagaatga tttcagtaga tactcattct tggaaaatgc 3660 catagtttta aattattgtt tccagcttta tcaaagacat gtttgaaaaa taaaaagcat 3720 ccaagtgaga gctggtgaga ccacgtgctg ctggcgtagt gtaggccaga cattgacagt 3780 cctgacggga gctcagggct gcccagcgcc cagcgtgcac gggacggccc cacgacagag 3840 ggagtcagcc cgggaggtca ggagcgcggc gggcgagggc cctgtgtgga ccacctccac 3900 caagctcaga gatttgcaac caggtgcctt gttgcctccg ctcaggatga aagaggagct 3960 gagagaagtg ctctgcctgc cagtgcagtg cccagctcca aggctctaga gggtgttcag 4020 gtacactgag gaggggacgg ctccgtcttc acattgtgca cagatctgag gatgggatta 4080 gcgaagctgt ggagactgca catccggacc tgcccatgtc tcaaaacaaa cacatgtaca 4140 gtggctcttt ttccttctca aacactttac cccagaagca ggtggtctgc cccaggcata 4200 aagaaggaaa attggccatc tttcccacct ctaaattctg taaaattata gacttgctca 4260 aaagattcct ttttatcatc cccacgctgt gtaagtggaa agggcattgt gttccgtgtg 4320 tgtccagttt acagcgtctc tgccccctag cgtgttttgt gacaatctcc cctgggtgag 4380 gagtgggtgc acccagcccc gaggccagtg gttgctcggg gccttccgtg tgagttctag 4440 tgttcacttg atgccgggga atagaattag agaaaactct gacctgccgg gttccaggga 4500 ctggtggagg tggatggcag gtccgactcg accatgactt agttgtaagg gtgtgtcggc 4560 tttttcagtc tcatgtgaaa atcctcctgt ctctggcagc actgtctgca ctttcttgtt 4620 tactgtttga agggacgagt accaagccac aaggaacact tcttttggcc acagcataag 4680 ctgatggtat gtaaggaacc gatgggccat taaacatgaa ctgaacggtt aaaagcacag 4740 tctatggaac gctaatggag tcagccccta aagctgtttg ctttttcagg ctttggatta 4800 catgctttta atttgatttt agaatctgga cactttctat gaatgtaatt cggctgagaa 4860 acatgttgct gagatgcaat cctcagtgtt ctctgtatgt aaatctgtgt atacaccaca 4920 cgttacaact gcatgagctt cctctcgcac aagaccagct ggaactgagc atgagacgct 4980 gtcaaataca gacaaaggat ttgagatgtt ctcaataaaa agaaaatgtt tcact 5035 18 1700 DNA Homo sapiens 18 gccgaggctg cctgactgga atgagggtag ctgcggcgac tgcggcggct ggagcggggc 60 cggccatggc ggtgtggacg cgggccacca aagcggggct ggtggagctg ctcctgaggg 120 agcgctgggt ccgagtggtg gccgagctga gcggggagag cctgagcctg acgggcgacg 180 ccgccgcggc cgagctggag cccgctctgg gacccgcggc cgccgccttc 
aacggcctcc 240 caaacggcgg cggcgcgggc gactcgctgc ccgggagccc aagccgcggc ctggggcccc 300 cgagcccgcc ggcgccgcct cggggccccg cgggtgaggc gggcgcgtcg ccgcccgtgc 360 gccgggtgcg ggtggtgaag caagaggcgg gcggcctggg catcagcatc aagggcggcc 420 gcgagaaccg gatgccgatc ctcatctcca agatcttccc cgggctggct gccgaccaga 480 gccgggcgct gcggctgggc gacgccatcc tgtcggtgaa cggcaccgac ctgcgccagg 540 ccacccacga ccaggccgtg caggcgctga agcgcgcggg caaggaggtg ctgctggagg 600 tcaagttcat ccgagaagta acaccatata tcaagaagcc atcattagta tcagatctgc 660 cgtgggaagg tgcagccccc cagtcaccaa gctttagtgg cagtgaggac tctggttcgc 720 caaaacacca gaacagcacc aaggacagga agatcatccc tctcaaaatg tgctttgctg 780 ctagaaacct aagcatgccg gatctggaaa acagattgat agagctacat tctcctgata 840 gcaggaacac gttgatccta cgctgcaaag atacagccac agcacactcc tggttcgtag 900 ctatccacac caacataatg gctctcctcc cacaggtgtt ggctgaactc aacgccatgc 960 ttggggcaac cagtacagca ggaggcagta aagaggtgaa gcatattgcc tggctggcag 1020 aacaggcaaa actagatggt ggaagacagc aatggagacc tgtcctcatg gctgtgactg 1080 agaaggattt gctgctctat gactgtatgc cgtggacaag agatgcctgg gcgtcaccat 1140 gccacagcta cccacttgtt gccaccaggt tggttcattc tggctccgga tgtcgatccc 1200 cctcccttgg atctgacctt acatttgcta ccaggacagg ctctcgacag ggcattgaga 1260 tgcatctctt cagggtggag acacatcggg atctgtcatc ctggaccagg atacttgttc 1320 agggttgcca tgctgctgct gagctgatca aggaagtctc tctaggctgc atgttaaatg 1380 gccaagaggt gaggcttact attcactatg aaaatgggtt caccatctca agggaaaatg 1440 gaggctccag cagcatattg taccgctacc cctttgaaag gctgaagatg tctgctgatg 1500 atggcatccg aaatctatac ttggattttg gtggtcccga gggagaactg accatggacc 1560 tgcactcttg tccgaagccg attgtatttg tgttgcacac gtttttatcg gccaaagtca 1620 ctcgtatggg actgcttgta tgagcaacaa aaaatcagaa aagagccttg actgtcacaa 1680 gaaatatttc cacctccaaa 1700 19 3086 DNA Homo sapiens 19 actgccacct cggtcggtcg gtgcttactt cgctgccagc tggtctgtcg ccatgaaccc 60 ggacctgcgc agggagcggg attccgccag cttcaacccg gagctgctta cacacatcct 120 ggacggcagc cccgagaaaa cgcggcgccg ccgagagatc gagaacatga tcctgaacga 180 cccagacttc cagcatgagg 
acttgaactt cctaactcgc agccagcgtt atgaggtggc 240 tgtcaggaaa agtgccatca tggtgaagaa gatgagggag tttggcatcg ctgaccctga 300 tgaaattatg tggtttaaaa aactacattt ggtcaatttt gtggaacctg tgggcctcaa 360 ttactccatg tttattccta ccttgctgaa tcagggcacc actgctgaga aagagaaatg 420 gctgctttca tccaaaggac tccagataat tggcacctac gcccagacgg aaatgggcca 480 cggaactcac cttcgaggct tggaaaccac agccacgtat gaccctgaaa cccaggagtt 540 cattctcaac agtcctactg tgacctccat taaatggtgg cctggtgggc ttggaaaaac 600 ttcaaatcat gcaatagtcc ttgcccagct catcactaag gggaaatgct atggattaca 660 tgcctttatc gtacctatcc gtgaaatcgg gacccataag cctttgccag gaattaccgt 720 tggtgacatc gggcccaaat ttggttatga tgagatagac aatggctacc tcaaaatgga 780 caaccatcgt attcccagag aaaacatgct gatgaagtat gcccaggtga agcctgatgg 840 cacatacgtg aaaccgctga gtaacaagct gacttacggg accatggtgt ttgtcaggtc 900 cttccttgtg ggagaagctg ctcgggctct gtctaaggcg tgcaccattg ccatccgata 960 cagcgctgtg aggcaccagt ctgaaatgaa gccaggtgaa ccagaaccac agattttgga 1020 ttttcaaacc cagcagtata aactctttcc actcctggcc actgcctatg ccttccagtt 1080 tgtgggcgca tacatgaagg agacctatca ccggattaac gaaggcattg gtcaagggga 1140 cctgagtgaa ctgcctgagc ttcatgccct caccgctgga ctgaaggctt tcacctcctg 1200 gactgcaaac actggcattg aagcatgtcg gatggcttgt ggtgggcatg gctattctca 1260 ttgcagtggt cttccaaata tttatgtcaa tttcacccca agctgtacct ttgagggaga 1320 aaacactgtc atgatgctcc agacggctag gttcctgatg aaaagttatg atcaggtgca 1380 ctcaggaaag ttggtgtgtg gcatggtgtc ctatttgaac gacctgccca gtcagcgcat 1440 ccagccacag caggtagcag tctggccaac catggtggat atcaacagcc ccgaaagcct 1500 aaccgaagca tataaactcc gtgcagccag attagtagaa attgctgcaa aaaaccttca 1560 aaaagaagtg attcacagaa aaagcaagga ggtagcttgg aacctaactt ctgttgacct 1620 tgttcgagca agtgaggcac attgccacta tgtggtagtt aagctctttt cagaaaaact 1680 cctcaaaatt caagataaag ccattcaagc tgtcttaagg agtttatgtc tgctgtattc 1740 tctgtatgga atcagtcaga acgcggggga tttccttcag gggagcatca tgacagagcc 1800 tcagattaca caagtaaacc agcgtgtaaa ggagttactc actctgattc gctcagatgc 1860 tgttgctttg gttgatgcat ttgattttca ggatgtgaca 
cttggctctg tgcttggccg 1920 ctatgatggg aatgtgtatg aaaacttgtt tgagtgggct aagaactccc cactgaacaa 1980 agcagaggtc cacgaatctt accacaagca cctgaagtca ctgcagtcca agctctgaag 2040 tgtcacaagg acaagtttaa tctgcttcag aaagcgcctg tgtgcaactc aaattttgtg 2100 gaatcttttc gaattcaaat agctatagag caaatgataa attgacccct ttttataaat 2160 ggagggaaaa aatgaacaga tttcagagat taaatgaaaa aaagcagatg tgttttaagt 2220 gcaattaaca ctgaaagaga cctgttaaac cattcagaaa aagcttaaga aatgcgatat 2280 gacttccttt tgtaatgctg ctgatcccag tagactatga cttttgataa ttagcagaat 2340 ttaactactg agtagttgat tattttcaca ttttaattgc taatcactgg ctatataagt 2400 gtttttaagc aagggtattt ttgaagtggt gtagaaccct tccacgcttt cctgctcagt 2460 gttctaccag acaagaaaag ggacttgggg aaggaaactt attggaaact tgatgcgaat 2520 taggttcttc tttgcacaaa ctctgcctgc ttgctctccc ttgctgatgg gttgcaattc 2580 tcaaactatt catgctagca atttttccac gggggggcct ttttcccacg ggggcctcta 2640 taggggccca tttctccggt aaataggaat ttccccttta aggggtgcca gtagtaggag 2700 tatagggaac ctctcagctg tggcactgtt gtagctttgg agtcagagtg tactctgggc 2760 aatcagattt ccacatattc tgcatcttgg ataagcatta aaagttggga tactaatttg 2820 gataaaaaaa tgcactaggc aaactccagc gagacagaaa gtatagggaa acctctcagc 2880 tgtggcactg ttgtagcttt ggagtgcaga gtgtaactct ggcgacaatc agatttcaca 2940 tattctgtca tcttggcata agccattaaa agcttggaga ttactgtatt tggcattaaa 3000 aaaaaatgtc acttaggtca gcactcccag acgtagcaca gaaaaaccct ttgacacaaa 3060 ccatgtgttc tgatttttgg ttcaga 3086 20 1302 DNA Homo sapiens 20 gcttcgggtg ccatggggac tcctcccggc ctgcagaccg actgcgaggc gctgctcagc 60 cgcttccagg agacggacag tgtacgcttc gaggacttca cggagctctg gagaaacatg 120 aagttcggga ctatcttctg tggcagaatg agaaatttag aaaagaacat gtttacaaaa 180 gaagctttag ctttggcttg gcgatatttt ttacctccat acaccttcca gatcagagtt 240 ggtgctttgt atctgctata tggattatat aatacccaac tgtgtcaacc aaaacaaaag 300 atcagagttg ccctgaagga ttgggatgaa gttttaaaat ttcagcaaga tttagtaaat 360 gcacagcatt ttgatgcagc ttatattttt aggaagctac gactagacag agcatttcac 420 tttacagcaa tgcccaaatt gctgtcatat aggatgaaga aaaaaattca ccgagctgaa 480 
gttacagaag aatttaagga cccaagtgat cgtgtgatga aacttatcac ttctgatgta 540 ttagaggaaa tgctgaatgt tcatgatcat tatcagaaca tgaaacatgt aatttcagtt 600 gataagtcca agccagataa agccctcagc ttgataaagg atgatttttt tgacaatatt 660 aagaacatag ttttggagca tcagcagtgg cacaaagaca gaaagaatcc atccttaaag 720 tcaaaaacta atgatggaga agaaaaaatg gaaggaaatt cacaagaaac ggagagatgt 780 gaaagggcag aatcattagc gaaaataaaa tcaaaggcct tttcagttgt catacaggca 840 tccaaatcaa gaaggcatcg tcaagtcaaa ctcgactctt ctgactctga ttctgcatct 900 ggtcaagggc aagtcaaagc aactaggaaa aaagagaaga aagaaagatt gaaaccagca 960 ggaaggaaga tgtctctcag aaacaaaggc aatgtgcaga atatacacaa ggaagataaa 1020 cctttaagtc tgagtatgcc tgtaattaca gaagaagaag agaatgaaag tttgagtgga 1080 acagagttca ctgcatccaa gaagaggaga aaacactgaa caaagagcct ggtgtagttt 1140 ttaattttga gttttctgac agaagaaaag attgatattt tgtgtattga acaggaagac 1200 tgccagtatt aaaaaaatcc ttctgggaat ctgtaggtta tttcttggaa attgcaatac 1260 gtagttctag aataaaagta caaaaaatta gaataagaat tc 1302 21 2081 DNA Homo sapiens 21 atggatggat ggcccgccaa gagaaggagc agtgcactgt ggtcagagat gctggacatc 60 accatgaagg agtctctcac caccagggag atcagacggc aggaggcaat atatgaaatg 120 tcccgaggtg aacaggattt aattgaggat ctcaaacttg caagaaaggc ctaccatgac 180 cccatgttaa agttgtccat catgtcagaa gaggaactca cacatatatt tggtgatctg 240 gactcttaca tacctctgca tgaagatttg ttgacaagaa taggagaagc aaccaagcct 300 gatggaacag tggagcagat tggtcacatt ctcgtgagct ggttaccgcg cttgaatgcc 360 tacagaggtt actgtagtaa ccagctggca gccaaagctc ttcttgatca aaagaaacag 420 gatccaagag tccaagactt cctccagcga tgtctcgagt ctcccttcag tcgaaaacta 480 gatctttgga gtttcctaga tatccctcga agtcgcctag tcaaataccc tttactgtta 540 aaagaaattc ttaaacacac tccaaaagag caccctgatg ttcagcttct ggaggatgct 600 atattgataa tacagggagt cctctctgat atcaacttga agaaaggtga atccgagtgc 660 cagtattaca tcgacaagct ggagtacctg gatgaaaagc agagggaccc cagaatcgaa 720 gcgagcaaag tgctgctgtg ccatggggag ctgcggagca agagtggaca taaactttac 780 attttcctgt ttcaagacat cttggttctg actcggcccg tcacacggaa cgaacggcac 840 tcttaccagg tttaccggca 
gccaatccca gtccaagagc tagtcctaga agacctgcag 900 gatggagatg tgagaatggg aggctccttt cgaggagctt tcagtaactc agagaaagct 960 aaaaatatct ttagaattcg cttccatgac ccctctccag cccagtctca cactctgcaa 1020 gccaatgacg tgttccacaa gcagcagtgg ttcaactgta ttcgagcggc cattgccccc 1080 ttccagtcgg caggcagtcc acctgagctg cagggcctgc cggagctgca cgaagagtgt 1140 gaggggaacc acccctctgc gaggaaactc acagcccaga ggagggcatc cacagtttcc 1200 agtgttactc aggtagaagt tgatgaaaac gcttacagat gtggctctgg catgcagatg 1260 gcagaggaca gcaagagctt aaagacacac cagacacagc ccggcatccg aagagcgagg 1320 gacaaagccc ttctggtggc aaacggaaag agactttggt gtagagaagg ctctgtgtgt 1380 taactgatgg gagagactgt ttgtttataa atgtgtacag ttttgttttc tcgtaagggg 1440 agcatcatag ggttacttta taccagttgt aacattttca ttgtttttgg ttgttctttt 1500 ttcttttttt aatggcagct aaagatatac agattactgt taaattgcag tccttttttt 1560 tttaaagata ttttcttgag ttatttagaa catggtaagc ctggtatttt ttaatcaaac 1620 aaaatattta tgaaatgggt tttctcttaa ttctggattc atcatggctt tctaatacca 1680 attgtaatat ttacaatatt caccaaaact tagaattttg caaatgcagg aattctgcca 1740 gtgtttcttt gctaagcctt gcatgcaaaa tttgaaattt taacattggc acccaaaacc 1800 tacatggaat gtatgtctgg agtatttcaa actttacatt gaaacataat ttccttggaa 1860 aacaaaccat aagcctgagg aggtttttat caactggaat gctttatatt agtttgtttt 1920 tcactgtaca ttcctcattt tacattcatt taacctgccg attatttaat ttttttattg 1980 taaagtagtt tttagcattt gcttttattt ttttactttg atgccttaac aaattggcac 2040 gtctttaaag tatttttctt cctgattaaa aatgtgtgtg t 2081 22 968 DNA Homo sapiens 22 gaattccgaa gccggcgacc ggtctgacgt cccgagcagg gcatggtcta gtggcccagt 60 caggacgcga aacactccct ggaggttctg acccactccc tctcagcctc cgcctggtct 120 ctggtgtagt cgccgccgcc agccgccatg ggcaaacaga acagcaagct gcggcccgag 180 gtgctgcagg acctgcggga gaagacggag ttcaccgacc acgagctgca ggagtggtac 240 aagggcttcc tcaaggactg ccccaccggc cacctgaccg tggacgagtt caagaagatc 300 tacgccaact tcttccccta cggcgacgct tccaagttcg ccgagcacgt cttccgcacc 360 ttcgacacca acggcgacgg caccatcgac ttccgggagt tcatcattgg cctgagcgtg 420 actcgcgggg gcaagctgga gcagaagctc 
aagtgggcct tcagcatgta cgacctggac 480 ggcaacggct acatcagccg cagcgagatg ctggagatcg tgcaggccat ctacaagatg 540 gtgtcgtctg tgatgaagat gccggaggat gagtccaccc cggagaagcg cacagacaag 600 atcttcaggc agatggacac caacaatgac ggcaaactgt ccttggaaga attcatcaga 660 ggtgccaaga gcgacccctc catcgtccgc ctgctgcagt gcgaccccag cagtgccagt 720 cagttctgag cgagcggccc ctggacagtt gcagagaaac acaggcttgt cgtgccgttt 780 aagctttgct tgcaagagtg gatgccccgc aatcgttcct gctctcccgg gcccccgctg 840 ggcatgtccg tttgcacctg cccgggcgcc ggtgcgcctc cctcctccac ctgaccaacg 900 cgacattcct cccctcacgc ctggcccggt cccttccagg aactccaggg atgtggtgac 960 atgcaggg 968 23 1204 DNA Homo sapiens 23 ctctgaggag aagcagcagc aaacatttgc tagtcagaca agtgacaggg aatggattcc 60 aaacagcagt gtgtaaagct aaatgatggc cacttcatgc ctgtattggg atttggcacc 120 tatgcacctc cagaggttcc gagaagtaaa gctttggagg tcacaaaatt agcaatagaa 180 gctgggttcc gccatataga ttctgctcat ttatacaata atgaggagca ggttggactg 240 gccatccgaa gcaagattgc agatggcagt gtgaagagag aagacatatt ctacacttca 300 aagctttggt ccacttttca tcgaccagag ttggtccgac cagccttgga aaactcactg 360 aaaaaagctc aattggacta tgttgacctc tatcttattc attctccaat gtctctaaag 420 ccaggtgagg aactttcacc aacagatgaa aatggaaaag taatatttga catagtggat 480 ctctgtacca cctgggaggc catggagaag tgtaaggatg caggattggc caagtccatt 540 ggggtgtcaa acttcaaccg caggcagctg gagatgatcc tcaacaagcc aggactcaag 600 tacaagcctg tctgcaacca ggtagaatgt catccgtatt tcaaccggag taaattgcta 660 gatttctgca agtcgaaaga tattgttctg gttgcctata gtgctctggg atctcaacga 720 gacaaacgat gggtggaccc gaactccccg gtgctcttgg aggacccagt cctttgtgcc 780 ttggcaaaaa agcacaagcg aaccccagcc ctgattgccc tgcgctacca gctgcagcgt 840 ggggttgtgg tcctggccaa gagctacaat gagcagcgca tcagacagaa cgtgcaggtt 900 tttgagttcc agttgactgc agaggacatg aaagccatag atggcctaga cagaaatctc 960 cactatttta acagtgatag ttttgctagc caccctaatt atccatattc agatgaatat 1020 taacatggag agctttgcct gatgtctacc agaagccctg tgtgtggatg gtgacgcaga 1080 ggacgtctct atgccggtga ctggacatat cacctctact taaatccgtc ctgtttagcg 1140 acttcagtca actacagctg 
agtccatagg ccagaaagac aataaatttt tatcattttg 1200 aaat 1204 24 1698 DNA Homo sapiens 24 tcggcacagg agcgaggaga cccgagagca gacgcgccct ggcgcccgcc ctgcgcagtc 60 accatggcga tgcatttcat cttctcagat acagcggtgc ttctgtttga tttctggagt 120 gtccacagtc ctgctggcat ggccctttcg gtgttggtgc tcctgcttct ggctgtactg 180 tatgaaggca tcaaggttgg caaagccaag ctgctcaacc aggtactggt gaacctgcca 240 acctccatca gccagcagac catcgcagag acagacgggg actctgcagg ctcagattca 300 ttccctgttg gcagaaccca ccacaggtgg tacttgtgtc actttggcca gtctctaatc 360 catgtcatcc aggtggtcat cggctacttc atcatgctgg ccgtaatgtc ctacaacacc 420 tggattttcc ttggtgtggt cttgggctct gctgtgggct actacctagc ttacccactt 480 ctcagcacag cttagatggt gaggaacgtg caggcactga ggctggaggg acatggagcc 540 ccctcttcca gacactatac ttccaactgc cctttcttct gatggctatt cctccacctt 600 attcccagcc cctggaaact ttgagctgaa gccagcactt gctccctgga gttcggaagc 660 cattgcagca accttccttc tcagccagcc tacgtagggc ccaggcatgg tcttgtgtct 720 taagacagct gctgtgacca aagggagaat ggagataaca ggggtggcag ggttactgag 780 cccatgacaa tgcttctctg tgactcaaac caggaatttc caaagatttc aagccaggga 840 gaagggttct tggtgatgca gggcatggaa cctggacacc ctcagctctc ctgctttgtg 900 ccttatctac aggagcatcg cccattggac ttcctgacct cttctgtctt tgagggacag 960 agaccaagct agatcctttt tctcaccttt ctgcctttgg aacacatgaa gatcatctcg 1020 tctatggatc atgttgacaa actaagtttt ttttattttt cccattgaac tcctagttgg 1080 caattttgca cattcataca aaaaaatttt taatgaaatg atttcattga ttcatgatgg 1140 atggcagaaa ctgctgagac ctatttccct ttcttgggga gagaataagt gacagctgat 1200 taaaggcaga gacacaggac tgctttcagg ctcctggttt attctctgat tgactgagct 1260 ccttccacca gaaggcactg cctgcaggaa gaagatgatc tgatggccgt gggtgtctgg 1320 gaagctcttc gtggcctcaa tgccctcctt tatcctcatc tttcttctat gcagaacaaa 1380 aagctgcatc taataatgtt caatacttaa tattctctat ttattactta ctgcttactc 1440 gtaatgatct agtggggaaa catgattcat tcacttaaaa tactgattaa gccatgggca 1500 ggtactgact gaagatgcaa tccaaccaaa gccattacat tttttgagtt agatgggact 1560 ctctggatag ttgaacctct tcactttata aaaaaggaaa gagagaaaat cactgctgta 1620 tactaaatac 
ctcacagatt agatgaaaag atggttgtaa gctttgggaa ttaaaaacaa 1680 atacatttta gtaaatat 1698 25 3213 DNA Homo sapiens 25 aatcatcgct cgcagcggcg gcgcccgcag tggccgcagc agcgcgccgg gccctggccg 60 cgccccagcc gagcgcagcg cggagtcgcc ccgacctttc tctgcgcagt acggccgccg 120 ggaccgcagc atggcgggca tcgcggccaa gctggcgaag gaccgggagg cggccgaggg 180 gctgggctcc cacgagaggg ccatcaagta cctcaaccag gactacgagg cgctgcggaa 240 cgagtgcctg gaggccggga cgctcttcca ggacccgtcc ttcccggcca tcccctcggc 300 cctgggcttc aaggagttgg ggccctactc cagcaaaacc cggggcatga gatggaagcg 360 ccccacggag atctgcgctg acccccagtt tatcattgga ggagccaccc gcacagacat 420 ctgccaagga gccctaggtg actgctggct gctggcagcc attgcctccc tcaccttgaa 480 tgaagaaatc ctggctcgag tcgtccccct aaaccagagc ttccaggaaa actatgcagg 540 gatctttcac ttccagttct ggcaatacgg cgagtgggtg gaggtggtgg tggatgacag 600 gctgcccacc aaggacgggg agctgctctt tgtgcattca gccgaaggga gcgagttctg 660 gagcgccctg ctggagaagg catacgccaa gatcaacgga tgctatgaag ctctatcagg 720 gggtgccacc actgagggct tcgaagactt caccggaggc attgctgagt ggtatgagtt 780 gaagaagccc cctcccaacc tgttcaagat catccagaaa gctctgcaaa aaggctctct 840 ccttggctgc tccatcgaca tcaccagcgc cgcggactcg gaggccatca cgtttcagaa 900 gctggtgaag gggcacgcgt actcggtcac cggagccgag gaggttgaaa gtaacggaag 960 cctacagaaa ctgatccgca tccgaaatcc ctggggagaa gtggagtgga cagggcggtg 1020 gaatgacaac tgcccaagct ggaacactat agacccagag gagagggaaa ggctgaccag 1080 acggcatgaa gatggagaat tctggatgtc tttcagtgac ttcctgaggc actattcccg 1140 cctggagatc tgtaacctga ccccagacac tctcaccagc gatacctaca agaagtggaa 1200 actcaccaaa atggatggga actggaggcg gggctccacc gcgggaggtt gcaggaacta 1260 cccgaacaca ttctggatga accctcagta cctgatcaag ctggaggagg aggatgagga 1320 cgaggaggat ggggagagcg gctgcacctt cctggtgggg ctcattcaga agcaccgacg 1380 gcggcagagg aagatgggcg aggacatgca caccatcggc tttggcatct atgaggttcc 1440 agaggagtta agtgggcaga ccaacatcca cctcagcaaa aacttcttcc tgacgaatcg 1500 cgccagggag cgctcagaca ccttcatcaa cctccgggag gtgctcaacc gcttcaagct 1560 gccgccagga gagtacattc tcgtgccttc caccttcgaa cccaacaagg 
atggggattt 1620 ctgcatccgg gtcttttctg aaaagaaagc tgactaccaa gctgtcgatg atgaaatcga 1680 ggccaatctt gaagagttcg acatcagcga ggatgacatt gatgatggag tcaggagact 1740 gtttgcccag ttggcaggag aggatgcgga gatctctgcc tttgagctgc agaccatcct 1800 gagaagggtt ctagcaaagc gccaagatat caagtcagat ggcttcagca tcgagacatg 1860 caaaattatg gttgacatgc tagattcgga cgggagtggc aagctggggc tgaaggagtt 1920 ctacattctc tggacgaaga ttcaaaaata ccaaaaaatt taccgagaaa tcgacgttga 1980 caggtctggt accatgaatt cctatgaaat gcggaaggca ttagaagaag caggtttcaa 2040 gatgccctgt caactccacc aagtcatcgt tgctcggttt gcagatgacc agctcatcat 2100 cgattttgat aattttgttc ggtgtttggt tcggctggaa acgctattca agatatttaa 2160 gcagctggat cccgagaata ctggaacaat agagctcgac cttatctctt ggctctgttt 2220 ctcagtactt tgaagttata actaatctgc ctgaagactt ctcatgatgg aaaatcagcc 2280 aaggactaag cttccataga aatacacttt gtatctggac ctcaaaatta tgggaacatt 2340 tacttaaacg gatgatcata gctgaaaata atgatactgt caatttgaga tagcagaagt 2400 ttcacacatc aaagtaaaag atttgcatat cattatacta aatgcaaatg agtcgcttaa 2460 cccttgacaa ggtcaaagaa agctttaaat ctgtaaatag tatacacttt ttacttttac 2520 acactttcct gttcatagca atattaaatc aggaaaaaaa aatgcaggga ggtatttaac 2580 agctgagcaa aaacattgag tcgctctcaa aggacacgag gcccttggca gggaatattt 2640 aaagcaactt caagtttaaa atgcagctgt tgattctacc aaacaacagt ccaagattac 2700 catttcccat gagccaactg ggaaacatgg tatatcatga agtaatcttg tcaaggcatc 2760 tggagagtcc aggagaggag actcacctct gtcgcttggg ttaaacaaga gacaggtttt 2820 gtagaatatt gattggtaat agtaaatcgt tctccttaca atcaagttct tgaccctatt 2880 cggccttata catctggtct tacaaagacc aaagggatcc tgcgcttgat caactgaacc 2940 agtatgccaa aaccaggcat ccaatttgta aaccaattat gataaaggac aaaataagct 3000 gtttgccacc tcaaaacttt atgaacttca ccaccactag tgtctgtcca tggagttaga 3060 ggggacatca cttagaagtt cttatagaaa ggacacaagt ttgtttcctg gctttacctt 3120 gggaaaatgc tagcaacatt atagaaattt tgccttgttg ccttatcttc ttccaaatgt 3180 actgttaaat aaaaataaag ggttacccca tcg 3213 26 5316 DNA Homo sapiens 26 atcatggcgg atggccccag gtgtaagcgc agaaagcagg cgaacccgcg gcgcaataac 
60 gttacaaatt ataatactgt ggtagaaaca aattcagatt cagatgatga agacaaactg 120 catattgtgg aagaagaaag tgttacagat gcagctgact gtgaaggtgt accagaggat 180 gacctgccaa cagaccagac agtgttacca gggaggagca gtgaaagaga agggaatgct 240 aagaactgct gggaggatga cagaaaggaa gggcaagaaa tcctggggcc tgaagctcag 300 gcagatgaag caggatgtac agtaaaagat gatgaatgcg agtcagatgc agaaaatgag 360 caaaaccatg atcctaatgt tgaagagttt ctacaacaac aagacactgc tgtcattttt 420 cctgaggcac ctgaagagga ccagaggcag ggcacaccag aagccagtgg tcatgatgaa 480 aatggaacac cagatgcatt ttcacaatta ctcacctgtc catattgtga tagaggctat 540 aaacgcttta cctctctgaa agaacacatt aaatatcgtc atgaaaagaa tgaagataac 600 tttagttgct ccctgtgcag ttacaccttt gcatacagaa cccaacttga acgtcacatg 660 acatcacata aatcaggaag agatcaaaga catgtgacgc agtctgggtg taatcgtaaa 720 ttcaaatgca ctgagtgtgg aaaagctttc aaatacaaac atcacctaaa agagcactta 780 agaattcaca gtggagagaa gccatatgaa tgcccaaact gcaagaaacg cttttcccat 840 tctggctcct atagctcaca cataagcagt aagaaatgta tcagcttgat acctgtgaat 900 gggcgaccaa gaacaggact caagacatct cagtgttctt caccgtctct ttcagcatca 960 ccaggcagtc ccacacgacc acagatacgg caaaagatag agaataaacc ccttcaagaa 1020 caactttctg ttaaccaaat taaaactgaa cctgtggatt atgaattcaa acccatagtg 1080 gttgcttcag gaatcaactg ttcaacccct ttacaaaatg gggttttcac tggtggtggc 1140 ccattacagg caaccagttc tcctcagggc atggtgcaag ctgttgttct gccaacagtt 1200 ggtttggtgt ctcccataag tatcaattta agtgatattc agaatgtact taaagtggcg 1260 gtagatggta atgtaataag gcaagtgttg gagaataatc aagccaatct tgcatccaaa 1320 gaacaagaaa caatcaatgc ttcacccata caacaaggtg gccattctgt tatttcagcc 1380 atcagtcttc ctttggttga tcaagatgga acaaccaaaa ttatcatcaa ctacagtctt 1440 gagcagccta gccaacttca agttgttcct caaaatttaa aaaaagaaaa tccagtcgct 1500 acaaacagtt gtaaaagtga aaagttacca gaagatctta ctgttaagtc tgagaaggac 1560 aaaagctttg aagggggggt gaatgatagc acttgtcttc tgtgtgatga ttgtccagga 1620 gatattaatg cacttccaga attaaagcac tatgacctaa agcagcctac tcagcctcct 1680 ccactccctg cagcagaagc tgagaagcct gagtcctctg tttcatcagc tactggagat 1740 ggcaatttgt ctcctagtca 
gccaccttta aagaacctct tgtctctcct aaaagcatat 1800 tatgctttga atgcacaacc aagtgcagaa gagctctcaa aaattgctga ttcagtaaac 1860 ctaccactgg atgtagtaaa aaagtggttt gaaaagatgc aagctggaca gatttcagtg 1920 cagtcttctg aaccatcttc tcctgaacca ggcaaagtaa atatccctgc caagaacaat 1980 gatcagcctc aatctgcaaa tgcaaatgaa ccccaggaca gcacagtaaa tctacaaagt 2040 cctttgaaga tgactaactc cccagtttta ccagtgggat caaccaccaa tggttccaga 2100 agtagtacac catccccatc acctctaaac ctttcctcat ccagaaatac acagggttac 2160 ttgtacacag ctgagggtgc acaagaagag ccacaagtag aacctcttga tctttcacta 2220 ccaaagcaac agggagaatt attagaaagg tcaactatca ctagtgttta ccagaacagt 2280 gtttattctg tccaggaaga acccttgaac ttgtcttgcg caaaaaagga gccacaaaag 2340 gacagttgtg ttacagactc agaaccagtt gtaaatgtaa tcccaccaag tgccaacccc 2400 ataaatatcg ctatacctac agtcactgcc cagttaccca caatcgtggc cattgctgac 2460 cagaacagtg ttccatgctt aagagcgcta gctgccaata agcaaacgat tctgattccc 2520 caggtggcat acacctactc aactacggtc agccctgcag tccaagaacc acccttgaaa 2580 gtgatccagc caaatggaaa tcaggatgaa agacaagata ctagctcaga aggagtatca 2640 aatgtagagg atcagaatga ctctgattct acaccgccca aaaagaaaat gcggaagaca 2700 gaaaatggaa tgtatgcttg tgatttgtgt gacaagatat tccaaaagag tagttcatta 2760 ttgagacata aatatgaaca cacaggtaaa agacctcatg agtgtggaat ctgtaaaaag 2820 gcatttaaac acaaacatca tttgattgaa cacatgcgat tacattctgg agaaaagccc 2880 tatcaatgtg acaaatgtgg aaagcgcttc tcacactctg ggtcttattc tcaacacatg 2940 aatcatcgct actcctactg taagagagaa gcggaagaac gtgacagcac agagcaggaa 3000 gaggcagggc ctgaaatcct ctcgaatgag cacgtgggtg ccagggcgtc tccctcacag 3060 ggcgactcgg acgagagaga gagtttgaca agggaagagg atgaagacag tgaaaaagag 3120 gaagaggagg aggataaaga gatggaagaa ttgcaggaag aaaaagaatg tgaaaaacca 3180 caaggggatg aggaagagga ggaggaggag gaagaagtgg aagaagaaga ggtagaagag 3240 gcagagaatg agggagaaga agcaaaaact gaaggtctga tgaaggatga cagggctgaa 3300 agtcaagcaa gcagcttagg acaaaaagta ggcgagagta gtgagcaagt gtctgaagaa 3360 aagacaaatg aagcctaatc gtttttctag aaggaaaata aattctaatt gataatgaat 3420 ttcgttcaat attatccttg cttttcatgg 
aaacacagta acctgtatgc tgtgattcct 3480 gttcactact gtgtgtgtgt gcgcgtgcat tgattactat ccatttcttt agtcaacgct 3540 ctccacttcc tgatttctgc tttaaggaaa actgtgaact ttctgcttca tgtatcagtt 3600 ttaaagcatc ccaggcaaag atcatctaca gattctagga attctctccc ctgaaatcaa 3660 aacctggaga cttttttttc ttattttagt tgagaagttc ataaactgct caaggattag 3720 ttttccagga ctctgcggag gaacggcagg aagaacctca gagagggcag aggtgacttc 3780 aaagtgctgg ggactccgtc ctgagggtca cttggccctg agcccctgcg tgcccttgcg 3840 gaagcccaga agcttcttcc tgctgcacct cccgtttccg ctgctgctga cgtttatgca 3900 tttcatgatg gggtccaaca agaacacctg acttgggtga agttgtgcaa tattggaggc 3960 tgactgtagg gctgggcagc tgggagacag gctcatggct catggctcat ggctcagggc 4020 ggtgcctgcc ctgggccggg acccccctcc ccacccccca cctaggcttt ttgggttttg 4080 ttcaaggaag gtaaagtgag aggtttaggt cagtgttttt aagtttttgt ttttttttta 4140 aagcaaatcc tgtatatgta tctacatggg agataggtag acactactta tttgttacat 4200 tttgtactat acgtttgtgt tccaggtttc agcttccctc gctcctgttg ttaagaagcg 4260 tccctgtcag cacaggtgtg cattgaggaa ggggccccag ggccttcgct ccctcagcac 4320 tggggtggag gcggcaggaa ggggcggccc ttacctggca ggtctgggcg cacctttagc 4380 aggtggactc cgtggggctc caccagccag aagcctctgg aaggcaacga aggcaatgct 4440 gctccctgag tccagtcccc gcccccaaac ccagcccagg tgccttcagc tacttcggct 4500 tcttaaaccc tgcagtgtta aacagaggca ttgagaaagg ggaaaggcgg gtatttttaa 4560 aagccaaaga ttgacccaag ttacttgagg gtagggaggc gggcccagtg caggaggctg 4620 catccctggc ctgctggtgc ccaccggggg ctgtgcctgt gccgggccgc aggaagctgg 4680 ctgcccccat tcctgctgct gctgctgctg ctgctctgtg gctgtttcaa agactgggcg 4740 aaaggctgtc cggagggcag accaggtgcc ttgccgcaga gaaaacacca aagtctcctg 4800 ttcgctcata aagaagtttt tgggatggga gagaatccag accatcttgg ggcagccagg 4860 cccttgcctt catttttaca gaggtagcac aactgattcc aacacaaaac cccttcccct 4920 ttttaaaatg atttctgttc taatgccata gatcaaaggc ctcagaaacc attgtgtgtt 4980 tcctctttga agcaatgaca agcactttac tttcacggtg gtttttgttt tttcttattg 5040 ctgtggaacc tcttttggag gacgttaaag gcgtgtttta cttgtttttt taagagtgtg 5100 tgatgtgtgt tttgtagatt tcttgacagt gctgtaatac 
agacggcaat gcaatagcct 5160 atttaaagaa ctacgtgatc tgattgagat gtacatagtt ttttttttta ccataactga 5220 attattttat ctcttatgtt atcatgagaa atgtatgcca aatgattagt tgatgtatgt 5280 tttttaattt aatatttaaa taaaatattt ggaagg 5316 27 3045 DNA Homo sapiens 27 aattcccttg aggtggtttc acatccacat ccagttgtcc ctaaaatgga gaaagaactg 60 gtgccagacc aggcagtaat atcagacagt actttctctc tggcaaacag tccaggcagt 120 gaatcagtaa ccaaggatga cgcactttct tttgtcccct cccagaaaga aaagggaaca 180 gcaactcctg aactacatac agctacagat tatagagatg gcccagatgg aaattcgaat 240 gagcctgata cgcggccact agaagacagg gcagtaggcc tgtccacatc ctccactgct 300 gcagagcttc agcacgggat ggggaatacc agtctcacag gacttggtgg agagcatgag 360 ggtcccgccc ctccagcaat cccagaagct ctgaatatca aggggaacac tgactcttcc 420 gtgcaaagtg tgggtaaggc cactttggct ttagattcag ttttgactga agaaggaaaa 480 gttctggtgg tttcagaaag ctctgcagct caggaacaag ataaggataa agcggtgacc 540 tgttcctcta ttaaggaaaa tgctctctct tcaggaactt tgcaggaaga gcagagaaca 600 ccacctcctg gacaagatac tcaacaattt catgaaaaat caatctcagc tgactgtgcc 660 aaggacaaag cacttcagct aagtaattca ccgggtgcat cctctgcctt tcttaaggca 720 gaaactgaac ataacaagga agtggcccca caagtctcac tgctgactca aggtggggct 780 gcccagagcc tggtgccacc aggagcaagt ctggccacag agtcaaggca ggaagccttg 840 ggggcagagc acaacagctc cgctctgttg ccatgtctgt tgccagatgg gtctgatggg 900 tccgatgctc ttaactgcag tcagccttct cctctggatg ttggagtgaa gaacactcaa 960 tcccagggaa aaactagtgc ctgtgaggtg agtggagatg tgacggtgga tgttacaggg 1020 gttaatgctc tacaaggtat ggctgagccc agaagagaga atatatcaca caacacccaa 1080 gacatcctga ttccaaacgt cttgttgagc caagagaaga atgccgttct aggtttgcca 1140 gtggctctac aggacaaagc tgtgactgac ccacagggag ttggaacccc agagatgata 1200 cctcttgatt gggagaaagg gaagctggag ggagcagacc acagctgtac catgggtgac 1260 gctgaggaag cccaaataga cgatgaagca catcctgtcc tactgcagcc tgttgccaag 1320 gagctcccca cagacatgga gctctcagcc catgatgatg gggccccagc tggtgtgagg 1380 gaagtcatgc gagccccgcc ttcaggcagg gaaaggagca ctccctctct accttgcatg 1440 gtctctgccc aggacgcacc tctgcctaag ggggcagact tgatagagga ggctgccagc 1500 
cgtatagtgg atgctgtcat cgaacaagtc aaggccgctg gagcactgct tactgagggg 1560 gaggcctgtc acatgtcact gtccagccct gagttgggtc ctctcactaa aggactagag 1620 agtgctttta cagaaaaagt gagtactttc ccacctgggg agagcctacc aatgggcagt 1680 actcctgagg aagccacggg gagccttgca ggatgttttg ctggaaggga ggagccagag 1740 aagatcattt tacctgtcca ggggcctgag ccagcagcag aaatgccaga cgtgaaagct 1800 gaagatgaag tggattttag agcaagttca atttctgaag aagtggctgt agggagcata 1860 gctgctacac tgaagatgaa gcaaggccca atgacccagg cgataaaccg agaaaactgg 1920 tgtacaatag agccatgccc tgatgcagca tctcttctgg cttccaagca gagcccagaa 1980 tgtgagaact tcctggatgt tggactgggc agagagtgta cctcaaaaca aggtgtactt 2040 aaaagagaat ctgggagtga ttctgacctc tttcactcac ccagtgatga catggacagc 2100 atcatcttcc caaagccaga ggaagagcat ttggcctgtg atatcaccgg atccagttca 2160 tccaccgatg acacggcttc actggaccga cattcttctc atggcagtga tgtgtctctc 2220 tcccagattt taaagccaaa caggtcaaga gatcggcaaa gccttgatgg attctacagc 2280 catgggatgg gagctgaggg tcgagaaagt gagagtgagc ctgctgaccc aggcgacgtg 2340 gaggaggagg agatggacag tatcactgaa gtgcctgcaa actgctctgt cctaaggagc 2400 tccatgcgct ctctttctcc cttccggagg cacagctggg ggcctgggaa aaatgcagcc 2460 agcgatgcag aaatgaacca ccggagttca atgcgagttc ttggggatgt tgtcaggaga 2520 cctcccattc ataggagaag tttcagtcta gaaggcttga caggaggagc tggtgtcgga 2580 aacaagccat cctcatctct agaagtaagc tctgcaaatg ccgaagagct cagacaccca 2640 ttcagtggtg aggaacgggt tgactctttg gtgtcacttt cagaagagga tctggagtca 2700 gaccagagag aacataggat gtttgatcag cagatatgtc acagatctaa gcagcaggga 2760 tttaattact gtacatcagc catttcctct ccattgacaa aatccatctc attaatgaca 2820 atcagccatc ctggattgga caattcacgg cccttccaca gtaccttcca caataccagt 2880 gctaatctga ctgagagtat aacagaagag aactataatt tcctgccaca tagcccctcc 2940 aagaaagatt ctgaatggaa gagtggaaca aaagtcagtc gtacattcag ctacatcaag 3000 aataaaatgt ctagcagcaa gaagagcaaa gaaaagaaaa aaaag 3045 28 3634 DNA Homo sapiens 28 tcaacacagg acaatgcaag cccatgagct gttccggtat tttcgaatgc cagagctggt 60 tgacttccga cagtacgtgc gtactcttcc gaccaacacg cttatgggct tcggagcttt 120 
tgcagcactc accaccttct ggtacgccac gagacccaaa cccctgaagc cgccatgcga 180 cctctccatg cagtcagtgg aagtggcggg tagtggtggt gcacgaagat ccgcactact 240 tgacagcgac gagcccttgg tgtatttcta tgatgatgtc acaacattat acgaaggttt 300 ccagagggga atacaggtgt caaataatgg cccttgttta ggctctcgga aaccagacca 360 accctatgaa tggctttcat ataaacaggt tgcagaattg tcggagtgca taggctcagc 420 actgatccag aagggcttca agactgcccc agatcagttc attggcatct ttgctcaaaa 480 tagacctgag tgggtgatta ttgaacaagg atgctttgct tattcgatgg tgatcgttcc 540 actttatgat acccttggaa atgaagccat cacgtacata gtcaacaaag ctgaactctc 600 tctggttttt gttgacaagc cagagaaggc caaactctta ttagagggtg tagaaaataa 660 gttaatacca ggccttaaaa tcatagttgt catggatgcc tacggcagtg aactggtgga 720 acgaggccag aggtgtgggg tggaagtcac cagcatgaag gcgatggagg acctgggaag 780 agccaacaga cggaagccca agcctccagc acctgaagat cttgcagtaa tttgtttcac 840 aagtggaact acaggcaacc ccaaaggagc aatggtcact caccgaaaca tagtgagcga 900 ttgttcagct tttgtgaaag caacagagaa tacagtcaat ccttgcccag atgatacttt 960 gatatctttc ttgcctctcg cccatatgtt tgagagagtt gtagagtgtg taatgctgtg 1020 tcatggagct aaaatcggat ttttccaagg agatatcagg ctgctcatgg atgacctcaa 1080 ggtgcttcaa cccactgtct tccccgtggt tccaagactg ctgaaccgga tgtttgaccg 1140 aattttcgga caagcaaaca ccacgctgaa gcgatggctc ttggactttg cctccaagag 1200 gaaagaagca gagcttcgca gcggcatcat cagaaacaac agcctgtggg accggctgat 1260 cttccacaaa gtacagtcga gcctgggcgg aagagtccgg ctgatggtga caggagccgc 1320 cccggtgtct gccactgtgc tgacgttcct cagagcagcc ctgggctgtc agttttatga 1380 aggatacgga cagacagagt gcactgccgg gtgctgccta accatgcctg gagactggac 1440 cgcaggccat gttggggccc cgatgccgtg caatttgata aaacttgttg atgtggaaga 1500 aatgaattac atggctgccg agggcgaggg cgaggtgtgt gtgaaagggc caaatgtatt 1560 tcagggctac ttgaaggacc cagcgaaaac agcagaagct ttggacaaag acggctggtt 1620 acacacaggg gacattggaa aatggttacc aaatggcacc ttgaaaatta tcgaccggaa 1680 aaagcacata tttaagctgg cacaaggaga atacatagcc cctgaaaaga ttgaaaatat 1740 ctacatgcga agtgagcctg ttgctcaggt gtttgtccac ggagaaagcc tgcaggcatt 1800 tctcattgca attgtggtac 
cagatgttga gacattatgt tcctgggccc aaaagagagg 1860 atttgaaggg tcgtttgagg aactgtgcag aaataaggat gtcaaaaaag ctatcctcga 1920 agatatggtg agacttggga aggattctgg tctgaaacca tttgaacagg tcaaaggcat 1980 cacattgcac cctgaattat tttctatcga caatggcctt ctgactccaa caatgaaggc 2040 gaaaaggcca gagctgcgga actatttcag gtcgcagata gatgacctct attccactat 2100 caaggtttag tgtgaagaag aaagctcaga ggaaatggca cagttccaca atctcttctc 2160 ctgctgatgg ccttcatgtt gttaattttg aatacagcaa gtgtagggaa ggaagcgttc 2220 gtgtttgact tgtccattcg gggttcttct cataggaatg ctagaggaaa cagaacaccg 2280 ccttacagtc acctcatgtt gcagaccatg tttatggtaa tacacacttt ccaaaatgag 2340 ccttaaaaat tgtaaagggg atactataaa tgtgctaagt tatttgagac ttcctcagtt 2400 taaaaagtgg gttttaaatc ttctgtctcc ctgcttttct aatcaagggg ttaggacttt 2460 gctatctctg agatgtctgc tacttgctgc aaattctgca gctgtctgct gctctaaaga 2520 gtacagtgca ctagagggaa gtgttccctt taaaaataag aacaactgtc ctggctggag 2580 aatctcacaa gcggaccaga gatcttttta aatccctgct actgtccctt ctcacaggca 2640 ttcacagaac ccttctgatt cgtaagggtt acgaaactca tgttcttctc cagtcccctg 2700 tggtttctgt tggagcataa ggtttccagt aagcgggagg gcagatccaa ctcagaacca 2760 tgcagataag gagcctctgg caaatgggtg ctcatcagaa cgcgtggatt ctctttcatg 2820 gcagaatgct cttggactcg gttctccagg cctgattccc cgactccatc ctttttcagg 2880 ggttatttaa aaatctgcct tagattctat agtgaagaca agcatttcaa gaaagagtta 2940 cctggatcag ccatgctcag ctgtgacgcc tgaataactg tctactttat cttcactgaa 3000 ccactcactc tgtgtaaagg ccaacagatt tttaatgtgg ttttcatatc aaaagatcat 3060 gttgggatta acttgccttt ttccccaaaa aataaactct caggcaagca tttctttaaa 3120 gctattaagg gagtatatac ttgagtactt attgaaatgg acagtaataa gcaaatgttc 3180 ttataatgct acctgatttc tatgaaatgt gtttgacaag ccaaaattct aggatgtaga 3240 aatctggaaa gttcatttcc tgggattcac ttctccaggg attttttaaa gttaatttgg 3300 gaaattaaca gcagttcact ttattgtgag tctttgccac atttgactga attgagctgt 3360 catttgtaca tttaaagcag ctgttttggg gtctgtgaga gtacatgtat tatatacaag 3420 cacaacaggg cttgcactaa agaattgtca ttgtaataac actacttggt agcctaactt 3480 catatatgta ttcttaattg cacaaaaagt 
caataatttg tcaccttggg gttttgaatg 3540 tttgctttaa gtgttggcta tttctatgtt ttataaacca aaacaaaatt tccaaaaaca 3600 atgaaggaaa ccaaaataaa tatttctgca tttc 3634 29 4573 DNA Homo sapiens 29 cgcgtgtcta cgcggacgca ccggctaagc tgcttctgcc gccgccggcc gcctgggacc 60 ttgcggtgag gctgcgcggg gccgaggccg cctccgagcg ccaggtttat tcagtcacca 120 tgaagctgct gctgctgcac ccggccttcc agagctgcct cctgctgacc ctgcttggct 180 tatggagaac cacccctgag gctcacgctt catccctggg tgcaccagct atcagcgctg 240 cctccttcct gcaggatcta atacatcggt atggcgaggg tgacagcctc actctgcagc 300 agctgaaggc cctgctcaac cacctggatg tgggagtggg ccggggtaat gtcacccagc 360 acgtgcaagg acacaggaac ctctccacgt gctttagttc tggagacctc ttcactgccc 420 acaatttcag cgagcagtcg cggattggga gcagcgagct ccaggagttc tgccccacca 480 tcctccagca gctggattcc cgggcctgca cctcggagaa ccaggaaaac gaggagaatg 540 agcagacgga ggaggggcgg ccaagcgctg ttgaagtgtg gggatacggt ctcctctgtg 600 tgaccgtcat ctccctctgc tccctcctgg gggccagcgt ggtgcccttc atgaagaaga 660 ccttttacaa gaggctgctg ctctacttca tagctctggc gattggaacc ctctactcca 720 acgccctctt ccagctcatc ccggaggcat ttggtttcaa ccctctggaa gattattatg 780 tctccaagtc tgcagtggtg tttgggggct tttatctttt ctttttcaca gagaagatct 840 tgaagattct tcttaagcag aaaaatgagc atcatcatgg acacagccat tatgcctctg 900 agtcgcttcc ctccaagaag gaccaggagg agggggtgat ggagaagctg cagaacgggg 960 acctggacca catgattcct cagcactgca gcagtgagct ggacggcaag gcgcccatgg 1020 tggacgagaa ggtcattgtg ggctcgctct ctgtgcagga cctgcaggct tcccagagtg 1080 cttgctactg gctgaaaggt gtccgctact ctgatatcgg cactctggcc tggatgatca 1140 ctctgagcga cggcctccac aatttcatcg atggcctggc catcggtgct tccttcactg 1200 tgtcagtttt ccaaggcatc agcacctcgg tggccatcct ctgtgaggag ttcccacatg 1260 agctaggaga ctttgtcatc ctgctcaacg ctgggatgag catccaacaa gctctcttct 1320 tcaacttcct ttctgcctgc tgctgctacc tgggtctggc ctttggcatc ctggccggca 1380 gccacttctc tgccaactgg atttttgcgc tagctggagg aatgttcttg tatatttctc 1440 tggctgatat gttccctgag atgaatgagg tctgtcaaga ggatgaaagg aagggcagca 1500 tcttgattcc atttatcatc cagaacctgg gcctcctgac tggattcacc 
atcatggtgg 1560 tcctcaccat gtattcagga cagatccaga ttgggtaggg ctctgccaag agcctgtggg 1620 actggaagtc gggccctggg ctgcccgatc gccagcccga ggacttacca tccacaatgc 1680 accacggaag aggccgttct atgaaaaact gacacagact gtattcctgc attcaaatgt 1740 cagccgtttg taaaatgctg tatcctagga ataagctgcc ctggtaacca gtctctagct 1800 agtgcctctt gccctctcct cacctccttt tctctcagtg actctggaac ctgaatgcag 1860 cttacaagac aagcctgact tttttctctg attaccttgg cctcctcttg gaaccagtgc 1920 tgaaaggttt tgaatccttt acccaacaat gcaaaaatag agccaatggt tataacttgg 1980 ctagaaatat caagagttga atccatagtg tggggcccat gactctagct gggcaccttg 2040 gacctccagc tggccaatag aagagacagg agacaggaag ccttcccatt ttttcaaagt 2100 ctgtttaatt gcctattact tctctcaaag agaacctgaa gtcagaacac atgagcaggg 2160 tgagaggtga ggcaaggttc atcctgaatg ggagaggaag tcgaaccact gctgtgtgtc 2220 ttgtcaggat gctcacttgt tcctactgag atgctggata ttgattttgt aacagcacct 2280 ggtgtttcac ggctgtccga gtgagctaac gtggcggtgt ggctgcctgg acctcctctt 2340 tcaggttaac gctgacagaa tggaggctca ggctgtctgc aagaaaacag ttggtttggc 2400 tgtgattttg acctcctctt ccccactgcc atcttctaag agactttgta gctgcctcct 2460 agaagcacat tctgagcaca tttgagacct ctgtgttaga ggggagactg cacaaactat 2520 cctcccccag gttgagacgt ctgcagagtg gcaagctgac ttgtagaaat ggggtgccat 2580 ttatgctcta cttagacaag ggtaatcaga aatggaatca gtgcaggcaa aatttaggat 2640 ttgccgcttc cataaatcaa agcatgacta atagggggtc tctgaaatgt aagggcacaa 2700 acttcactta gggcatcgca gatgtttgca gaatggttgg cctaatgatt atgctacaga 2760 tgggttttaa atgacccgtc taggttactg cttccttgca aaaaaagtcg aatcctgcat 2820 tgaattgaat atgaatttct ctaactctct ccagaaaatg gatggagata acttgtcttt 2880 aaaactgtag gccagcctta gccactgtgg agcccttgcc tccgagctct ggcttcaagg 2940 ggagctcttc tccaggttca ctaggtgaat tgatttatta ttatcatatt gataatgtga 3000 gattctttag ccactttggg gagcctgtct ctccagaagc ctttcttagt ggtgcccaca 3060 gttggagccc aggggccatg tttgcaaact gattcatgtg catggctgac aggagtactg 3120 gttcactacc aatgcctgag cttttctctt acatagaaaa actgtccact ctcagtaatc 3180 acaagcagca tccgttttgt tttctcttct tgggagacat ctgtcaaacc aggaatattc 
3240 ttgaaaagaa cgtgagcagg aaaaactgct ggtgatactt tttttaagtt ttgtttttat 3300 cttgcctgtt ggcttcaata catttgagaa tacgctgaag agggaaaatt tcagtgatgg 3360 agattctaga ttaaatatca ggactgattt cctggtggga ttatggtcca gttttaccaa 3420 agaaccaatt ccttgaatgt tggaatctaa ctttttatat tgtcattatt attgttgttt 3480 ttaaacggtt ctttgtcttt tctgttttat ttttctcaag ctgctttcag gagctagcag 3540 aaaataactc aaagttgaag actctggaag attttgcttt aacctaactc gcattgatgt 3600 attaaattta taattttagc attcccaata gatcctatca ttccttaaac ataataccct 3660 ttgtcttgga gtagaatact aagttagagt tagtggattt ctagtttagg agaggagctc 3720 aaaactataa tctttaacaa attgaaaaat gaaatagggt gttttccctt tttgtgcaca 3780 cctatattac cttaagaaat ttccttccat agacagctgc ctcaaaggga aatcctcttt 3840 aaaccgtagt tggcgcagag gtcagtccta gtcggagctt aggaggggcg gagacgctca 3900 catcgtctga cttgagtcgc cactgattgt ggcaacagct ttgcctcatg agtcaaaaat 3960 tggcaatttc ttttgatttt tagttgttga atttgctgtt tcaagcattt gtacatatta 4020 gaagtctaag gagtagcaag tcagtgggag gactttttca cccctggcat tagcagcttc 4080 gacctcattt tccagatgca ccagctccta ttaataagtt agcaaggaaa gtgtatgtca 4140 cgtgcaggaa cagtgaggca gggacagggg ttctgctcct tctcacttca ccaccggcac 4200 acagcttgcc cctgtctttg cccccaaagg tattttgtgt ctagtgtcaa attggagcta 4260 ttcttcactg gtccttaacc ttgggtttta aaaagaaggc ttctctgttt gggtagcgta 4320 agagctgagt atagtaagtc ctcttccaaa gagatggcaa tatgctgggc atctacttta 4380 aaacaaagtt gtctgatttt tgcaagagag gttaggattt tattgttctt atttcccttt 4440 acagttctgc agttccatca cagtattttt ttaaataact caggtgtatg agcagaaatt 4500 agaaaagaaa attaacttat gtggactgta aatgttttat ttgtaagatt ctataaataa 4560 agctatattc tgt 4573 30 1707 DNA Homo sapiens 30 cggcgctggg ctgaggggag gggttgtctt aaaagtctct ccttccccct gtaggggcgg 60 ccggcgagtc ccagtgagag cggagggtgc cagaggtagg gggccgagaa acaaagttcc 120 cggggcttcc tccggggccg cggtcggggc tgcgcgtttg accgcccccc tcctcgcgaa 180 gcaatggctt ccaaactcct gcgcgcggtc atcctcgggc cgcccggctc gggcaagggc 240 accgtgtgcc agaggatcgc ccagaacttt ggtctccagc atctctccag cggccacttc 300 ttgcgggaga acatcaaggc cagcaccgaa 
gttggtgaga tggcaaagca gtatatagag 360 aaaagtcttt tggttccaga ccatgtgatc acacgcctaa tgatgtccga gttggagaac 420 aggcgtggac agcactggct ccttgatggt tttcctagga cattaggaca agccgaagcc 480 ctggacaaaa tctgtgaagt ggatctagtg atcagtttga atattccatt tgaaacactt 540 aaagatcgtc tcagccgccg ttggattcac cctcctagcg gaagggtata taacctggac 600 ttcaatccac ctcatgtaca tggtattgat gacgtcactg gtgaaccgtt agtccagcag 660 gaggatgata aacccgaagc agttgctgcc aggctaagac agtacaaaga cgtggcaaag 720 ccagtcattg aattatacaa gagccgagga gtgctccacc aattttccgg aacggagacg 780 aacaaaatct ggccctacgt ttacacactt ttctcaaaca agatcacacc tattcagtcc 840 aaagaagcat attgaccctg cccaatggaa gaaccaggaa gatgtggtca ttcattcaat 900 agtgtgtgta gtattggtgc tgtgtccaaa ttagaagcta gctgaggtag cttgcagcat 960 cttttctagt tgaaatggtg aactgatagg aaaacaaatg agtagaaaga gttcatgaag 1020 aggccctcct ctgcctttca aaaggctggt cacctacaca tgtttaaggt gtctctgcac 1080 atgtctcaag cccatcacaa gaaagcaagt acagtgtgga tttcaaatgg tgtgtaactt 1140 cagctccagc tggtttttga cagctgttgc tgtggtaata tttttgacat gtgatggtga 1200 tagtctctgg ttctccccat ccccacaaag gctgttgaac cacagcacca ggaagcctga 1260 gaatgaatcc tgagggctct agcccaggct ttgtcccagg ctttctggtg tgtgccctcc 1320 tggtaacagt gaaattgaag ctacttactc atagtggttg tttctctggt cttgagtgac 1380 tgtgtccaca gttcattttt ttccggtagg aataactcct tttctacatc cacgctccat 1440 agagtctctc cttttcagac atcctgggat gaaagaattt ggcttttttt tttctttttt 1500 ttttggacat ctgttttcac tcttaggctt ttaaacaata gttattgctt ttatccctct 1560 cagattctaa taactgagag cgatggggct atattgaatc tctgtatgca ctgagaactg 1620 agctatgaag agaatcttat taaactgctg gtctgacttt atggattgac actgttcctt 1680 tcttttattg tgaaaaaaaa aaaaaaa 1707 31 2916 DNA Homo sapiens misc_feature (1)..(2916) n = a, c, g or t 31 agcagagctt tcccnccatg nnagaagctt catgagtcac acattacatc tttgggttga 60 ttgaatgcca ctgaaacatt tctagtagcc tggagnagtt gacctacctg tggagatgcc 120 tgccattaaa tggcatcctg atggcttaat acacatcact cttctgtgna gggttttaat 180 tttcaacaca gcttactctg tagcatcatg tttacattgt atgtataaag attatacnaa 240 ggtgcaattg tgtatttctt 
ccttaaaatg tatcagtata ggatttagaa tctccatgtt 300 gaaactctaa atgcatagaa ataaaaataa taaaaaattt ttcattttgc cttttcagcc 360 tagtattaaa actgataaaa gcaaagccat gcacaaaact acctccctag agaaaggcta 420 gtcccttttc ttccccattc atttcattat gaacatagta gaaaacagca tattcttatc 480 aaatttgatg aaaagcgcca acacgtttga actgaaatac gacttgtcat gtgaactgta 540 ccgaatgtct acgtattcca cttttcctgc tggggttcct gtctcagaaa ggagtcttgc 600 tcgtgctggt ttctattaca ctggtgtgaa tgacaaggtc aaatgcttct gttgtggcct 660 gatgctggat aactggaaaa gaggagacag tcctactgaa aagcataaaa agttgtatcc 720 tagctgcaga ttcgttcaga gtctaaattc cgttaacaac ttggaagcta cctctcagcc 780 tacttttcct tcttcagtaa cacattccac acactcatta cttccgggta cagaaaacag 840 tggatatttc cgtggctctt attcaaactc tccatcaaat cctgtaaact ccagagcaaa 900 tcaagaattt tctgccttga tgagaagttc ctacccctgt ccaatgaata acgaaaatgc 960 cagattactt acttttcaga catggccatt gacttttctg tcgccaacag atctggcacg 1020 agcaggcttt tactacatag gacctggaga cagagtggct tgctttgcct gtggtggaaa 1080 attgagcaat tgggaaccga aggataatgc tatgtcagaa cacctgagac attttcccaa 1140 atgcccattt atagaaaatc agcttcaaga cacttcaaga tacacagttt ctaatctgag 1200 catgcagaca catgcagccc gctttaaaac attctttaac tggccctcta gtgttctagt 1260 taatcctgag cagcttgcaa gtgcgggttt ttattatgtg ggtaacagtg atgatgtcaa 1320 atgcttttgc tgtgatggtg gactcaggtg ttgggaatct ggagatgatc catgggttca 1380 acatgccaag tggtttccaa ggtgtgagta cttgataaga attaaaggac aggagttcat 1440 ccgtcaagtt caagccagtt accctcatct acttgaacag ctgctatcca catcagacag 1500 cccaggagat gaaaatgcag agtcatcaat tatccatttg gaacctggag aagaccattc 1560 agaagatgca atcatgatga atactcctgt gattaatgct gccgtggaaa tgggctttag 1620 tagaagcctg gtaaaacaga cagttcagag aaaaatccta gcaactggag agaattatag 1680 actagtcaat gatcttgtgt tagacttact caatgcagaa gatgaaataa gggaagagga 1740 gagagaaaga gcaactgagg aaaaagaatc aaatgattta ttattaatcc ggaagaatag 1800 aatggcactt tttcaacatt tgacttgtgt aattccaatc ctggatagtc tactaactgc 1860 cggaattatt aatgaacaag aacatgatgt tattaaacag aagacacaga cgtctttaca 1920 agcaagagaa ctgattgata cgattttagt aaaaggaaat 
attgcagcca ctgtattcag 1980 aaactctctg caagaagctg aagctgtgtt atatgagcat ttatttgtgc aacaggacat 2040 aaaatatatt cccacagaag atgtttcaga tctaccagtg gaagaacaat tgcggagact 2100 accagaagaa agaacatgta aagtgtgtat ggacaaagaa gtgtccatag tgtttattcc 2160 ttgtggtcat ctagtagtat gcaaagattg tgctccttct ttaagaaagt gtcctatttg 2220 taggagtaca atcaagggta cagttcgtac atttctttca tgaagaagaa ccaaaacatc 2280 gtctaaactt tagaattaat ttattaaatg tattataact ttaactttta tcctaatttg 2340 gtttccttaa aatttttatt tatttacaac tcaaaaaaca ttgttttgtg taacatattt 2400 atatatgtat ctaaaccata tgaacatata ttttttagaa actaagagaa tgataggctt 2460 ttgttcttat gaacgaaaaa gaggtagcac tacaaacaca atattcaatc caaatttcag 2520 cattattgaa attgtaagtg aagtaaaact taagatattt gagttaacct ttaagaattt 2580 taaatatttt ggcattgtac taataccggg aacatgaagc caggtgtggt ggtatgtacc 2640 tgtagtccca ggctgaggca agagaattac ttgagcccag gagtttgaat ccatcctggg 2700 cagcatactg agaccctgcc tttaaaaacn aacagnacca aanccaaaca ccagggacac 2760 atttctctgt cttttttgat cagtgtccta tacatcgaag gtgtgcatat atgttgaatc 2820 acattttagg gacatggtgt ttttataaag aattctgtga gnaaaaattt aataaagcaa 2880 ccnaaattac tcttaaaaaa aaaaaaaaaa aaaaaa 2916 32 3188 DNA Homo sapiens 32 cgggcagtga cagccggcgc ggatcgcgcg tccacggagg agaatcagct tagagaacta 60 tcaacacagg acaatgcaag cccatgagct gttccggtat tttcgaatgc cagagctggt 120 tgacttccga cagtgcgtga ctcttccgac caacacgctt atgggcttcg gagctttttc 180 cagacgactc accaccttct ggcggccacg ccacccaaaa cccctgaagc cgccatggca 240 cctctccatg cagtcagtgg aagtggcggg tagtggtggt gcacgaagat ccgcactact 300 tgacagcgac gagcccttgg tgtatttcta tgatgatgtt acaacattat acgaaggttt 360 ccagagaggg atacaggtgt caaataatgg cccttgttta ggctctcgga aaccagacca 420 accctatgaa tggctttcat ataaacaggt tgcagaattg tcggagtgca taggctcagc 480 actgatccag aagggcttca agactgcccc agatcagttc attggcatct ttgctcaaaa 540 tagacctgag tgggtgatta ttgaacaagg atgctttgct tattcgatgg tgatcgttcc 600 actttatgat acccttggaa atgaagccat cacgtacata gtcaacaaag ctgaactctc 660 tctggttttt gttgacaagc cagagaaggc caaactctta ttagagggtg tagaaaataa 
720 gttaatacca ggccttaaaa tcatagttgt catggactcg tacggcagtg aactggtgga 780 acgaggccag aggtgtgggg tggaagtcac cagcatgaag gcgatggagg acctgggaag 840 agccaacaga cggaagccca agcctccagc acctgaagat cttgcagtaa tttgtttcac 900 aagtggaact acaggcaacc ccaaaggagc aatggtcact caccgaaaca tagtgagcga 960 ttgttcagct tttgtgaaag caacagagaa tacagtcaat ccttgcccag atgatacttt 1020 gatatctttc ttgcctctcg cccatatgtt tgagagagtt gtagagtgtg taatgctgtg 1080 tcatggagct aaaatcggat ttttccaagg agatatcagg ctgctcatgg atgacctcaa 1140 ggtgcttcaa cccactgtct tccccgtggt tccaagactg ctgaaccgga tgtttgaccg 1200 aattttcgga caagcaaaca ccaccgtgaa gcgatggctc ttggactttg cctccaagag 1260 gaaagaagca gacgttcgca gcggcatcat cagaaacaac agcctgtggg accggctgat 1320 cttccacaaa gtacagtcga gcctgggcgg aagagtccgg ctgatggtga caggagccgc 1380 cccggtgtct gccactgtgc tgacgttcct cagagcagcc ctgggctgtc agttttatga 1440 aggatacgga cagacagagt gcactgccgg gtgctgccta accatgcctg gagactggac 1500 cacaggccat gttggggccc cgatgccgtg caatttgata aaacttggtt ggcagttgga 1560 agaaatgaat tacatggcgt ccgagggcga gggcgaggtg tgtgtgaaag ggccaaatgt 1620 atttcagggc tacttgaagg acccagcgaa aacagcagaa gctttggaca aagacggctg 1680 gttacacaca ggggacatcg gaaaatggtt accaaatggc accttgaaaa ttatcgaccg 1740 gaaaaagcac atatttaagc tggcacaagg agaatacata gcccctgaaa agattgaaaa 1800 tatctacatg cgaagtgagc ctgttgctca ggtgtttgtc cacggagaaa gcctgcaggc 1860 atttctcatt gcaattgtgg taccagatgt tgagacatta tgttcctggg cccaaaagag 1920 aggatttgaa gggtcgtttg aggaactgtg cagaaataag gatgtcaaaa aagctatcct 1980 cgaagatatg gtgagacttg ggaaggattc tggtctgaaa ccatttgaac aggtcaaagg 2040 catcacattg caccctgaat tattttctat cgacaatggc cttctgactc caacaatgaa 2100 ggcgaaaagg ccagagctgc ggaactattt caggtcgcag atagatgacc tctattccat 2160 catcaaggtt tagtgtgaag aagaaagctc agaggaaatg gcacagttcc acaatctctt 2220 ctcctgctga tggccttcat gttgttaatt ttgaatacag caagtgtagg gaaggaagcg 2280 ttctgtgttt gacttgtcca ttcggggttc ttctcatagg aatgctagag gaaacagaac 2340 actgccttac agtcacctca gtgttcagac catgtttatg gtaatacaca cttccaaaag 2400 tagccttaaa 
aattgtaaag ggatactata aatgtgctaa ttatttgaga cttcctcagt 2460 ttaaaaagtg ggttttaaat cttctgtctc cctgtttttc taatcaaggg gttaggactt 2520 tgctatctct gagatgtctg ctacttcgtc gaaattctgc agctgtctgc tgctctaaag 2580 agtacagtgc tctagaggga agtgttccct ttaaaaataa gaacaactgt cctggctgga 2640 gatctcacaa gcggaccaga gatcttttta aatccctgct actgtccctt ctcacaggca 2700 ttcacagaac ccttctgatt cgaagggtta cgaaactcat gttcttctcc agtcccctgt 2760 ggtttctgtt ggagcataag gtttccagta agcgggaggg cagatccaac tcagaaccat 2820 gcagataagg agcctctggc aaatgggtgc tgcatcagaa cgcgtggatt ctctttcatg 2880 gcagatgctc ttggactcgg ttctccaggc ctgattcccc gactccatcc tttttcaggg 2940 ttatttaaaa atctgcctta gattctatag tgaagacaag catttcaaga aagagttacc 3000 tggatcagcc atgctcagct gtgacgcctg ataactgtct actttatctt cactgaacca 3060 ctcactctgt gtaaaggcca acggattttt aatgtggttt tcatatcaaa agatcatgtt 3120 gggattaact tgcctttttc cccaaaaaat aaactctcag gcaaggcatt tcttttaaag 3180 ctattccg 3188 33 1342 DNA Homo sapiens 33 tcccccactc tcaaggatgc tgtgaggggt attcctccca tgtggtgart tgggaggwtt 60 tcctgaggtc cttttccatc ctgagacgct ggttttccat tttgtttctc acaggccagg 120 gctttgaccg acacttgttt gctctgcggc atctggcagc agccaaaggg atcatcttgc 180 ctgagctcta cctggaccct gcatacgggc agataaacca caatgtcctg tccacgagca 240 cactgagcag cccagcagtg aaccttgggg gctttgcccc tgtggtctct gatggctttg 300 gtgttgggta tgctgttcat gacaactgga taggctgcaa tgtctcttcc tacccaggcc 360 gcaatgcccg ggagtttctc caatgtgtgg agaaggcctt agaagacatg tttgatgcct 420 tagaaggcaa atccatcaaa agttaacttc tgggcagatg aaaagctacc atcacttcct 480 catcatgaaa actgggaggc cgggcatggt ggctcatgcc tgtaatccca gcattttgag 540 aggctgaggc gggtggatca cttgaggtca ggagtttgag accaacctgg ccaacatggt 600 gaaaccttgt ctctactaaa aatacaagaa ttagctgggt gtggtggcat gtgcctatat 660 cccagctact gggaggttga agcagaattg cttgaaccca ggaggtggag gttgcagtga 720 gctgagatca caccactgca ctccggcctg ggcgacagag cgagactgtc tcaaaaagac 780 aaaaaagaaa aaaaactggg gcctgtgtag ccagtgggtg ctattctgtg aaactaatca 840 taagctgcct aggcagccag ctacaggctt gagctttaaa ttcatggttt taaagctaaa 
900 cgtaatttcc acttgggact agatcacaac tgaagrtaac aagagattta agttttaagg 960 gcatttaatc aggaggaaag gtttggaaaa ctaactcagg tgtatttatt gtttaagcag 1020 aaataaagtt taatttttgc ttgaagatgg ttcttaattt cttttaacct aattcctaat 1080 cctcacaaag atctttccaa cagcaagttc agtaagttca ggtaacagta cgtcaccatt 1140 ggcttctggc tcattgagtg atggtgggat cgcggtttca tctctgtaaa cttgcccttg 1200 actggggaga taccatctcc ttaaaaatac tcttcatttc tcctaaggag tgaactsctg 1260 ctgcacgaat tcttatttgt ggagggagta gcttgctccc ttactttcac cycccatgca 1320 accagtgcag ggtkaacagg gg 1342 34 4859 DNA Homo sapiens 34 cacgttgggt gacataatgg ggttttttta attatagatt cacactgcat ttattcatca 60 cccctgtcct ctcatccata actcaaattt actaccagca acacaaaata caaagatgtg 120 tccagtttca ctacagctct tcgcgtttac aagtgtcgag cgcttgcttt cggaacgccc 180 ttgtgattgg ccgagccaat gccagtgaca tcaaccaact tacttttgat tggaaggctg 240 gttgctggga ctgtagcgtt tgcaggaagt cacttaactg tttgggagct ggaaaaccga 300 agctgaagtt ctcttttgcc ataggaacga gcgcaactga ctaggaaaga tgtgtcccaa 360 agctccgcaa gctggaacgt gagccaggag gcccggaccg gccacgggac cgcgaggcac 420 tccgaaagtg tgcggctgcc ccttccctgc ctcccagctg ttaccctttt aaatgtcagt 480 gttcgaggct gtaggggtag cacgaggcag cgaaacggaa cagtcggatt ggccgcacgc 540 ctcagttcta gacgcacctc tccaccgaag ccgttctgac tggcaggggg agaaagtaaa 600 cagagttgaa tcaccctccc cactggccaa ttggaggggg tttggtttgt gacgtgatgg 660 gattctgcga aattgttact gagcaagaga atgccggaac gtgcggaccg gccggagcag 720 gggttcagaa gccgtcagtg gactcgggaa aaagtgtctc ttagacctgg cgctcggcgg 780 ggccctcgcc acccgcgtcg gggtgatcgg gtgaatgtcc tggggctttg gctcgacggc 840 gaggcggccg agggcgtgca cctctcttgc agtttcctct cccagcgcct cgggggcgtt 900 ttcagtcgaa taaacttgcg accgccacgt gtggcatctt tccaagggag ccggctcaga 960 ggggccggcg cgcccgtcgg gggatcgcgg ccggcgcggg gcaggggcgg cggctagagg 1020 cggcggcgcg gcggagcccg gggccgtgga tgctgcgtgc ggaggcgctg ccggttacgt 1080 aaagatgagg ggctgaggtc gcctcggcgc tcctgcgagt cggaagcgcc ccgcgccccc 1140 gcccccttgg ccgccgcgcc gtgccgggcg ggcgggtcgt cgtccgaggc cagggagggc 1200 gagccgaacc tccgcagcca ccgccaagtt 
tgtccgcgcc gcctgggctg ccgtcgcccg 1260 caccatgtcc gcggccgcct acatggactt cgtggctgcc cagtgtctgg tttccatttc 1320 gaaccgcgct gcggtgccgg agcatggggt cgctccggac gccgagcggc tgcgactacc 1380 tgagcgcgag gtgaccaagg agcacggtga cccgggggac acctggaagg attactgcac 1440 actggtcacc atcgccaaga gcttgttgga cctgaacaag taccgaccca tccagacccc 1500 ctccgtgtgc agcgacagtc tggaaagtcc agatgaggat atgggatccg acagcgacgt 1560 gaccaccgaa tctgggtcga gtccttccca cagcccggag gagagacagg atcctggcag 1620 cgcgcccagc ccgctctccc tcctccatcc tggagtggct gcgaagggga aacacgcctc 1680 cgaaaagagg cacaagtgcc cctacagtgg ctgtgggaaa gtctatggaa aatcctccca 1740 tctcaaagcc cattacagag tgcatacagg tgaacggccc ttcccctgca cgtggccaga 1800 ctgccttaaa aagttctccc gctcagacga gctgacccgc cactaccgga cccacactgg 1860 ggaaaagcag ttccgctgtc cgctgtgtga gaagcgcttc atgaggagtg accacctcac 1920 aaagcacgcc cggcggcaca ccgagttcca ccccagcatg atcaagcgat cgaaaaaggc 1980 gctggccaac gctttgtgag gtgctgcccg tggaagccag ggagggatgg accccgaaag 2040 gacaaaagta ctcccaggaa acagacgcgt gaaaactgag ccccagaaga ggcacacttg 2100 acggcacagg aagtcactgc tctttggtca atattctgat tttcctctcc ctgcattgtt 2160 tttaaaaagc acattgtagc ctaagatcaa agtcaacaac actcggtccc cttgaagagg 2220 caactctctg aacccgtctc tgactgttgg agggaaggca aatgcttttg ggttttttgg 2280 tttttgtttt tgtttttttt tctcctttta tttttttgcg ggggagggta gggagtgggt 2340 gggggggagg gggtaaggcc aagactgggt agattttaaa gattcaacac tggtgtacat 2400 atgtccgctg ggtgagttga cctgtggcct cgcacagtga ttctaggccc tttatgcttg 2460 ctgtctctca gaattgtttt cttacctttt aatgtaatga cgagtgtgct tcagtttgtt 2520 tagcaaaacc actctcttga atcacgttaa cttttgagat taaaaaaaaa aacgccatag 2580 cacagctgtc tttatgcaag caagagcaca tctactccag catgatctgt catctaaaga 2640 cttgaaaaca aaaaacagtt acttatagtc aatgggtaag cagagtctga atttatacta 2700 atcaagacaa acctttgaaa ggttacacta agtacagaac ttttaaacct tgctttgtat 2760 gagttgtact ttttgaacat aagctgcact tttattttct aatgcagagg atgaataagt 2820 taaatacatg ctttgaggat agaagcagat gttctgtttg gcaccacgtt ataatctgct 2880 tattttacaa tatacacgtt tccctaagaa atcatgcgca 
gagatgtgag ggcagaatat 2940 acacaacaga tgctgaagga gaaggagggt agtgttttgc aaaagaaaaa gaaaagaacc 3000 aacagaattt taactctatt aacttttcca aattttccta tgcttttagt taacatcatt 3060 attgtatcct aatgccacta ggggagagag cttttgactc tgttgggttt tatttgaatg 3120 tgtgcataac agtaatgaga tctggaaaca cctatttttt ggggaaaaag gtttgttggt 3180 ctccttcctg tgttcctaca aaactcccac tctcaggtgc aagagttatg tagaaggaaa 3240 gggagctgaa ataggaacag aaaaatcaac ccctataact agtgaacacc aagggaaaat 3300 accacaatga tttcagagga gactctgcaa aatcgtccct tgtggagaat gcaggcaaca 3360 tggaatacta cgaatgaaat cacatcactg tatcttttac atcaatagcc tcaccactaa 3420 tatatcttgt atctaggtgt ctataatggc tgaaaccact acatccatct atgccattta 3480 cctgaaaact taactgtggc ctttatgagg ccagaaaagt gaactgagtt ttgtagttaa 3540 gacctcaaat gaggggagtc agcagtgatc atgggggaaa tgtttacatt ttttttttct 3600 tcagaagtaa cgctttctga tgattttatc tgatatttaa aacagggagc tatggtgcac 3660 tctagtttat acttgcgctc tgaaatgtgt aaacataggg tgcctaccta tttcacctga 3720 cccatactcg tttctgattc agaatcagtg tgggctcctg cagtgggcgc gggtcacggc 3780 tgactccaac ttccaataca acagccatca ctagcacagt gtttttttgt ttaaccaacg 3840 tagtgttatt agtagttcta taaagagaac tgcttttaac attagggact gggagcagtc 3900 catgggataa aaaggaaagt gttttctcac gagaaaacat gtcaggaaaa ataaagaaca 3960 ctttctacct ctgtttcaga tttttgaaac acttatttta aaccaaattt taatttctgt 4020 gtccaaaata agttttaagg acatctgttc ttccatacga aataggttag gctgcctatt 4080 tctcactgag ctcatggaat ggttctgctt atgatactct gcacgctgcc ttttagtgag 4140 tgaggagttt ggggttgcct agcacttgct aacttgtaaa aagtcatctt tccctcacag 4200 aaagaaacga aagaaagcaa agcaaagtca gtgaaagaca atctttatag tttcaggagt 4260 aaatctaaat gtggcttttg tcaagcactt agatggatat aaatgcagca acttgtttta 4320 aaaaaatgca catttacttc ccaaaaaagt tgttacttgc cttttcaagt gtgacaaact 4380 cacatttgat attctcttat atgttatagt aatgtaacgt ataaactcaa gcctttttat 4440 tctttgtgat taaatcctgt tttaaaatgt cacaaaacag gaaccagcat tctaattaga 4500 tttactatat caagatatgg ttcaaatagg actactagag ttcattgaac actaaaacta 4560 tgaaacaatt actttttata ttaaaaagac catggattta acttatgaaa 
atccaaatgc 4620 aggatagtaa tttttgttta cttttttaac caaactgaat ttttgaaaga ctattgcagg 4680 tgtttaaaaa gaaagaaaag ttgttttatc taatactgta agtagttgtc atattctgga 4740 aaatttaata gttttagagt taagatatct cctctctttg gttagggaag aagaaagccc 4800 ttcaccattg tggaatgatg ccctggcttt aaggtttagc tccacatcat gcttctctt 4859 35 1941 DNA Homo sapiens 35 tctcttgatt cctagtctct cgatatggca cctccgtcag tctttgccga ggttccgcag 60 gcccagcctg tcctggtctt caagctcact gccgacttca gggaggatcc ggacccccgc 120 aaggtcaacc tgggagtggg agcatatcgc acggatgact gccatccctg ggttttgcca 180 gtagtgaaga aagtggagca gaagattgct aatgacaata gcctaaatca cgagtatctg 240 ccaatcctgg gcctggctga gttccggagc tgtgcttctc gtcttgccct tggggatgac 300 agcccagcac tcaaggagaa gcgggtagga ggtgtgcaat ctttgggggg aacaggtgca 360 cttcgaattg gagctgattt cttagcgcgt tggtacaatg gaacaaacaa caagaacaca 420 cctgtctatg tgtcctcacc aacctgggag aatcacaatg ctgtgttttc cgctgctggt 480 tttaaagaca ttcggtccta tcgctactgg gatgcagaga agagaggatt ggacctccag 540 ggcttcctga atgatctgga gaatgctcct gagttctcca ttgttgtcct ccacgcctgt 600 gcacacaacc caactgggat tgacccaact ccggagcagt ggaagcagat tgcttctgtc 660 atgaagcacc ggtttctgtt ccccttcttt gactcagcct atcagggctt cgcatctgga 720 aacctggaga gagatgcctg ggccattcgc tattttgtgt ctgaaggctt cgagttcttc 780 tgtgcccagt ccttctccaa gaacttcggg ctctacaatg agagagtcgg gaatctgact 840 gtggttggaa aagaacctga gagcatcctg caagtccttt cccagatgga gaagatcgtg 900 cggattactt ggtccaatcc ccccgcccag ggagcacgaa ttgtggccag caccctctct 960 aaccctgagc tctttgagga atggacaggt aatgtgaaga caatggctga ccggattctg 1020 accatgagat ctgaactcag ggcacgacta gaagccctca aaacccctgg gacctggaac 1080 cacatcactg atcaaattgg catgttcagc ttcactgggt tgaaccccaa gcaggttgag 1140 tatctggtca atgaaaagca catctacctg ctgccaagtg gtcgaatcaa cgtgagtggc 1200 ttaaccacca aaaatctaga ttacgtggcc acctccatcc atgaagcagt caccaaaatc 1260 cagtgaagaa acaccacccg tccagtacca ccaaagtagt tctctgtcat gtgtgttccc 1320 tgcctgcaca aacctacatg tacataccat ggattagaga cacttgcagg actgaaagct 1380 gctctggtga ggcagcctct gtttaaaccg gccccacatg aagagaacat 
cccttgagac 1440 gaatttggag actgggatta gagcctttgg aggtcaaagc aaattaagat ttttatttaa 1500 gaataaaaga gtactttgat catgagacat aggtatcttg tccctctcac taaaaaggag 1560 tgttgtgtgt ggcggccacg tgcttctatg tggtgtttga ctctgtacaa attctagtcc 1620 caaagatcaa gttgtctgaa ggagccaaag tgtgaatgtg ggtgtcggct gcggcattaa 1680 attcatcatc tcaacccaga gtgtctggtc tccctgctct ttctgcatgg ttgtgtccct 1740 agtcctaagc tttggttctt tagggtgact gtggtaagaa ggatatttaa tcatgacatg 1800 cacggacacg tacatattta actgaaacaa gttttaccaa acagtattta ctcgtgatgt 1860 gcgtagtgca ttctgatatt tttgagccat tctattgtgt tctacttcac ctaaaaaaat 1920 aaaataaaaa tgttgatcaa g 1941 36 2727 DNA Homo sapiens 36 agaagagcgg agctgtgagc agtactgcgg cctcctctcc tctcctaacc tcgctctcgc 60 ggcctagctt tacccgcccg cctgctcggc gaccagaaca ccttccacca tgaccacctc 120 agcaagttcc cacttaaata aaggcatcaa gcaggtgtac atgtccctgc ctcagggtga 180 gaaagtccag gccatgtata tctggatcga tggtactgga gaaggactgc gctgcaagac 240 ccggaccctg gacagtgagc ccaagtgtgt ggaagagttg cctgagtgga atttcgatgg 300 ctctagtact ttacagtctg agggttccaa cagtgacatg tatctcgtgc ctgctgccat 360 gtttcgggac cccttccgta aggaccctaa caagctggtg ttatgtgaag ttttcaagta 420 caatcgaagg cctgcagaga ccaatttgag gcacacctgt aaacggataa tggacatggt 480 gagcaaccag cacccctggt ttggcatgga gcaggagtat accctcatgg ggacagatgg 540 gcaccccttt ggttggcctt ccaacggctt cccagggccc cagggtccat attactgtgg 600 tgtgggagca gacagagcct atggcaggga catcgtggag gcccattacc gggcctgctt 660 gtatgctgga gtcaagattg cggggactaa tgccgaggtc atgcctgccc agtgggaatt 720 tcagattgga ccttgtgaag gaatcagcat gggagatcat ctctgggtgg cccgtttcat 780 cttgcatcgt gtgtgtgaag actttggagt gatagcaacc tttgatccta agcccattcc 840 tgggaactgg aatggtgcag gctgccatac caacttcagc accaaggcca tgcgggagga 900 gaatggtctg aagtacatcg aggaggccat tgagaaacta agcaagcggc accagtacca 960 catccgtgcc tatgatccca agggaggcct ggacaatgcc cgacgtctaa ctggattcca 1020 tgaaacctcc aacatcaacg acttttctgc tggtgtagcc aatcgtagcg ccagactacg 1080 cattccccgg actgttggcc aggagaagaa gggttacttt gaagatcgtc gcccctctgc 1140 caactgcgag cccttttcgg 
tgacagaagc cctcatccgc acgtgtcttc tcaatgaaac 1200 cggcgatgag cccttccagt acaaaaatta agtggactag acctccagct gttgagcccc 1260 tcctagttct tcatccctga ctccaactct tccccctctc ccagttgtcc cgattgtaac 1320 tcaaagggtg gaatatcaag gtcgtttttt tcattccatg tgcccagtta atcttgcttt 1380 cttttgtttg gctgggatag aggggtcaag ttattaattt cttcacacct accctccttt 1440 ttttccctat cactgaagct ttttagtgca ttagtgggga ggagggtggg gagacataac 1500 cactgcttcc atttaatggg gtgcacctgt ccaataggcg tacgtatccg gacagagcac 1560 gtttgcagag gggtctctct ccaggtagct gaaagggaag acctgacgta ctctggttag 1620 gttaggactt gccctcgtgg tggaaacttt tcttaaaaag ttataaccaa cttttctatt 1680 aaaagtggga attaggagag aaggtagggg ttgggaatca gagagaatgg ctttggtctc 1740 ttgcttgtgg gactagcctg gcttgggact aaatgccctg ctctgaacac aagcttagta 1800 taaactgatg gatatcccta ccttgaaaga agaaaaggtt cttactgctt ggtccttgat 1860 ttatcacaca aagcagaata gtatttttat atttaaatgt aaagacaaaa aactatatgt 1920 atggttttgt ggattatgtg tgttttggct aaaggaaaaa accatccagg tcacggggca 1980 ccaaatttga gacaaatagt cggattagaa ataaagcatc tcattttgag tagagagcaa 2040 ggaagtggtt cttagatggt gatctgggat taggccctca agaccccttt tgggtttctg 2100 ccctgcccac cctctggaga aggtggcact gattagttaa cagaccaaca ccgttactag 2160 cagtcactga tctccgtggc tttggtttaa aagacacact tgtccacata ggtttagaga 2220 taagagttgg ctggtcaact tgagcatgtt actgacagag ggggtattgg ggttattttc 2280 tggtaggaat agcatgtcac taaagcaggc ctttgatatt aaatttttta aaaagcaaaa 2340 ttatagaagt ttagatttta atcaaatttg tagggtttct aggtatttac agatgctgtt 2400 gctcaacgtc tcctacctct gctctgagag atgggacagg ctgagtcaaa cactgtaatt 2460 ttgtatcttg atgtctttgt taagactgct gaagaattat tttttctttt ataataagga 2520 ataaacccca cctttattcc ttcatttcat ctaccatttt ctggttcttg tgttggctgt 2580 ggcaggccag ctgtggtttt cttttgccat gacaacttct aattgccatg tacagtatgt 2640 tcaaagtcaa ataactcctc attgtaaaca aactgtgtaa ctgcccaaag cagcacttat 2700 aaatcagcct aacataaaaa aaaaaaa 2727 37 831 DNA Homo sapiens 37 gttgacaaga gacattccag cccaccactt cccaagtaaa gaattaaaat gcagcatgat 60 ggctaaggca agggcctgca gaagaatgta aaggagggag 
gaagagcagg ggattcagag 120 caggaaggag gagacagtac tgtctatccc gcagacgtgg tgctctttga agggatcctg 180 gccttctact cccaggaaag gtacgagacc tgttccagat gaagcttttt gtggatacag 240 atgcggacac ccggctctca cgcagagtat taagggacat cagcgagaga ggcagggatc 300 ttgagcagat tttatctcag tacattacgt tcgtcaagcc tgcctttgag gaattctgct 360 tgccaacaaa gcagtatgct gatgtgatca tccctagagg tgcagataat ctggtggcca 420 tcaacctcat cgagcagcac atccaggaca tcctgaatgg agggccctcc aaacggcaga 480 ccaatggctg tctcaacggc tacacccctt cacgcaagag gcaggcatcg gagtccagca 540 gcaggccgca ttgacccgtc tccatcggac cccagcccct atctccaaga gacagaggag 600 gcgtcaggag gcactgctca tctgtacata ctgtttccta tgacattact gtatttaaga 660 aaacaccatg gagatgaaat gcctttgatt ttttttttct ttttgtactt tggaacgaca 720 aaatgaaaca gaacttgacc ctgagcttaa ataacaaaac tgtgccaact actactggtg 780 atgcctaatt atgaatccaa cgtgtaacca gtaataaata catatatata t 831 38 3288 DNA Homo sapiens 38 cttcctctcc acgcggttga gaagaccggt cggcctgggc aacctgcgct gaagatgccg 60 ggaaaactcc gtagtgacgc tggtttggaa tcagacaccg caatgaaaaa aggggagaca 120 ctgcgaaagc aaatcgagga gaaagagaaa aaagagaagc caaaatctga taagactgaa 180 gagatagcag aagaggaaga aactgttttc cccaaagcta aacaagttaa aaagaaagca 240 gagccttctg aagttgacat gaattctcct aaatccaaaa aggcaaaaaa gaaagaggag 300 ccatctcaaa atgacatttc tcctaaaacc aaaagtttga gaaagaaaaa ggagcccatt 360 gaaaagaaag tggtttcttc taaaaccaaa aaagtgacaa aaaatgagga gccttctgag 420 gaagaaatag atgctcctaa gcccaagaag atgaagaaag aaaaggaaat gaatggagaa 480 actagagaga aaagccccaa actgaagaat ggatttcctc atcctgaacc ggactgtaac 540 cccagtgaag ctgccagtga agaaagtaac agtgagatag agcaggaaat acctgtggaa 600 caaaaagaag gcgctttctc taattttccc atatctgaag aaactattaa acttctcaaa 660 ggccgaggag tgaccttcct atttcctata caagcaaaga cattccatca tgtttacagc 720 gggaaggact taattgcaca ggcacggaca ggaactggga agacattctc ctttgccatc 780 cctttgattg agaaacttca tggggaactg caagacagga agagaggccg tgcccctcag 840 gtactggttc ttgcacctac aagagagttg gcaaatcaag taagcaaaga cttcagtgac 900 atcacaaaaa agctgtcagt ggcttgtttt tatggtggaa ctccctatgg aggtcaattt 
960 gaacgcatga ggaatgggat tgatatcctg gttggaacac caggtcgtat caaagaccac 1020 atacagaatg gcaaactaga tctcaccaaa cttaagcatg ttgtcctgga tgaagtggac 1080 cagatgttgg atatgggatt tgctgatcaa gtggaagaga ttttaagtgt ggcatacaag 1140 aaagattctg aagacaatcc ccaaacattg cttttttctg caacttgccc tcattgggta 1200 tttaatgttg ccaagaaata catgaaatct acatatgaac aggtggacct gattggtaaa 1260 aagactcaga aaacggcaat aactgtggag catctggcta ttaagtgcca ctggactcag 1320 agggcagcag ttattgggga tgtcatccga gtatatagtg gtcatcaagg acgcactatc 1380 atcttttgtg aaaccaagaa agaagcccag gagctgtccc agaattcagc tataaagcag 1440 gatgctcagt ccttgcatgg agacattcca cagaagcaaa gggaaatcac cctgaaaggt 1500 tttagaaatg gtagttttgg agttttggtg gcaaccaatg ttgctgcacg tgggttagac 1560 atccctgagg ttgatttggt tatacaaagc tctccaccaa aggatgtaga gtcctacatt 1620 catcgatccg ggcggacagg cagagctgga aggacggggg tgtgcatctg cttttatcag 1680 cacaaggaag aatatcagtt agtacaagtg gagcaaaaag cgggaattaa gttcaaacga 1740 ataggtgttc cttctgcaac agaaataata aaagcttcca gcaaagatgc catcaggctt 1800 ttggattccg tgcctcccac tgccattagt cacttcaaac aatcagctga gaagctgata 1860 gaggagaagg gagctgtgga agctctggca gcagcactgg cccatatttc aggtgccacg 1920 tccgtagacc agcgctcctt gatcaactca aatgtgggtt ttgtgaccat gatcttgcag 1980 tgctcaattg aaatgccaaa tattagttat gcttggaaag aacttaaaga gcagctgggc 2040 gaggagattg attccaaagt gaagggaatg gtttttctca aaggaaagct gggtgtttgc 2100 tttgatgtac ctaccgcatc agtaacagaa atacaggaga aatggcatga ttcacgacgc 2160 tggcagctct ctgtggccac agagcaacca gaactggaag gaccacggga aggatatgga 2220 ggcttcaggg gacagcggga aggcagtcga ggcttcaggg gacagcggga cggaaacaga 2280 agattcagag gacagcggga aggcagtaga ggcccgagag gacagcgatc aggaggtggc 2340 aacaaaagta acagatccca aaacaaaggc cagaagcgga gtttcagtaa agcatttggt 2400 caataattag aaatagaaga tttatatagc aaaaagagaa tgatgtttgg caatatagaa 2460 ctgaacatta tttttcatgc aaagttaaaa gcacattgtg cctccttttg accacttgcc 2520 aagtccctgt ctctttcaga cacagacaag cttcatttaa attatttcat ctgatcatta 2580 tcatttataa ctttattgtt acttcttcat cagtttttcc ttttgaaagg tgtatgaatt 2640 
cattacattt ttattctaat gtattatctg tagattagaa gataaaatca agcatgtatc 2700 tgcctatact ttgtgagttc acctgtcttt atactcaaaa gtgtccctta atagtgtcct 2760 tccctgaaat aaatacctaa gggagtgtaa cagtctctgg aggaccactt tgagcctttg 2820 gaagttaagg tttcctcagc cacctgccga acagtttctc atgtggtcct attatttgtc 2880 tactgagact taatactgag caatgttttg aaacaagatt tcaaactaat ctgggttgta 2940 atacagttta taccagtgta tgctctagac ttggaagatg tagtatgttt gatgtggatt 3000 acctatactt atgttcgttt tgatacattt ttagcttctc attataaggt gattcatgct 3060 ttagtgaatt cttatagatg atatataaaa gtacatttta atagaagcca gggtttaagg 3120 aatttcacat gtataaggtg gctccatagc tttatttgta agtaggctgg ataaatggtg 3180 cttaaatggt aatgtactcc acttcttccc attggaagat taacattatt taccaagaag 3240 gacttaaggg agtagggggc gcagattagc attgctcaag agtatgga 3288 39 3442 DNA Homo sapiens 39 agccggtgcg ccgcagacta gggcgcctcg ggccagggag cgcggaggag ccatggccac 60 cgctaacggg gccgtggaaa acgggcagcc ggacgggaag ccgccggccc tgccgcgccc 120 catccgcaac ctggaggtca agttcaccaa gatatttatc aacaatgaat ggcacgaatc 180 caagagtggg aaaaagtttg ctacatgtaa cccttcaact cgggagcaaa tatgtgaagt 240 ggaagaagga gataagcccg acgtggacaa ggctgtggag gctgcacagg ttgccttcca 300 gaggggctcg ccatggcgcc ggctggatgc cctgagtcgt gggcggctgc tgcaccagct 360 ggctgacctg gtggagaggg accgcgccac cttggccgcc ctggagacga tggatacagg 420 gaagccattt cttcatgctt ttttcatcga cctggagggc tgtattagaa ccctcagata 480 ctttgcaggg tgggcagaca aaatccaggg caagaccatc cccacagatg acaacgtcgt 540 atgcttcacc aggcatgagc ccattggtgt ctgtggggcc atcactccat ggaacttccc 600 cctgctgatg ctggtgtgga agctggcacc cgccctctgc tgtgggaaca ccatggtcct 660 gaagcctgcg gagcagacac ctctcaccgc cctttatctc ggctctctga tcaaagaggc 720 cgggttccct ccaggagtgg tgaacattgt gccaggattc gggcccacag tgggagcagc 780 aatttcttct caccctcaga tcaacaagat cgccttcacc ggctccacag aggttggaaa 840 actggttaaa gaagctgcgt cccggagcaa tctgaagcgg gtgacgctgg agctgggggg 900 gaagaacccc tgcatcgtgt gtgcggacgc tgacttggac ttggcagtgg agtgtgccca 960 tcagggagtg ttcttcaacc aaggccagtg ttgcacggca gcctccaggg tgttcgtgga 1020 ggagcaggtc 
tactctgagt ttgtcaggcg gagcgtggag tatgccaaga aacggcccgt 1080 gggagacccc ttcgatgtca aaacagaaca ggggcctcag attgatcaaa agcagttcga 1140 caaaatctta gagctgatcg agagtgggaa gaaggaaggg gccaagctgg aatgcggggg 1200 ctcagccatg gaagacaagg ggctcttcat caaacccact gtcttctcag aagtcacaga 1260 caacatgcgg attgccaaag aggagatttt cgggccagtg caaccaatac tgaagttcaa 1320 aagtatcgaa gaagtgataa aaagagcgaa tagcaccgac tatggactca cagcagccgt 1380 gttcacaaaa aatctcgaca aagccctgaa gttggcttct gccttagagt ctggaacggt 1440 ctggatcaac tgctacaacg ccctctatgc acaggctcca tttggtggct ttaaaatgtc 1500 aggaaatggc agagaactag gtgaatacgc tttggccgaa tacacagaag tgaaaactgt 1560 caccatcaaa cttggcgaca agaacccctg aaggaaaggc ggggctcctt cctcaaacat 1620 cggacggcgg aatgtggcag atgaaatgtg ctggaggaaa aaaatgacat ttctgacctt 1680 cccgggacac attcttctgg aggctttaca tctactggag ttgaatgatt gctgttttcc 1740 tctcactctc ctgtttattc accagactgg ggatgcctat aggttgtctg tgaaatcgca 1800 gtcctgcctg gggagggagc tgttggccat ttctgtgttt ccctttaaac cagatcctgg 1860 agacagtgag atactcaggg cgttgttaac agggagtggt atttgaagtg tccagcagtt 1920 gcttgaaatg ctttgccgaa tctgactcca gtaagaatgt gggaaaaccc cctgtgtgtt 1980 ctgcaagcag ggctcttgca ccagcggtct cctcagggtg gacctgctta cagagcaagc 2040 cacgcctctt tccgaggtga aggtgggacc attccttggg aaaggattca cagtaaggtt 2100 ttttggtttt tgttttttgt tttcttgttt ttaaaaaaag gatttcacag tgagaaagtt 2160 ttggttagtg cataccgtgg aagggcgcca gggtctttgt ggattgcatg ttgacattga 2220 ccgtgagatt cggcttcaaa ccaatactgc ctttggaata tgacagaatc aatagcccag 2280 agagcttagt caaagacgat atcacggtct accttaacca aggcactttc ttaagcagaa 2340 aatattgttg aggttacctt tgctgctaaa gatccaatct tctaacgcca caacagcata 2400 gcaaatccta ggataattca cctcctcatt tgacaaatca gagctgtaat tcactttaac 2460 aaattacgca tttctatcac gttcactaac agcttatgat aagtctgtgt agtcttcctt 2520 ttctccagtt ctgttaccca atttagatta gtaaagcgta cacaactgga aagactgctg 2580 taataacaca gccttgttat ttttaagtcc tattttgata ttaatttctg attagttagt 2640 aaataacacc tggattctat ggaggacctc ggtcttcatc caagtggcct gagtatttca 2700 ctggcaggtt gtgaattttt 
cttttcctct ttgggaatcc aaatgatgat gtgcaatttc 2760 atgttttaac ttgggaaact gaaagtgttc ccatatagct tcaaaaacaa aaacaaatgt 2820 gttatccgac ggatactttt atggttacta actagtactt tcctaattgg gaaagtagtg 2880 cttaagtttg caaattaagt tggggagggc aataataaaa tgagggcccg taacagaacc 2940 agtgtgtgta taacgaaaac catgtataaa atgggcctat cacccttgtc agagatataa 3000 attaccacat ttggcttccc ttcatcagct aacacttatc acttatacta ccaataactt 3060 gttaaatcag gatttggctt catacactga attttcagta ttttatctca agtagatata 3120 gacactaacc ttgatagtga tacgttagag ggttcctatt cttccattgt acgataatgt 3180 ctttaatatg aaatgctaca ttatttataa ttggtagagt tattgtatct ttttatagtt 3240 gtaagtacac agaggtggta tatttaaact tctgtaatat actgtattta gaaatggaaa 3300 tatatatagt gttaggtttc acttctttta aggtttaccc ctgtggtgtg gtttaaaaat 3360 ctataggcct gggaattccg atcctagctg cagatcgcat cccacaatgc gagaatgata 3420 aaataaaatt ggatatttga ga 3442 40 1540 DNA Homo sapiens 40 gccctcggcc ccgggccggc ccgccccgcc tcggccgccg cctggcgagc cgccgggtcc 60 ccgctcggcc ggtggccgag gccggagggc cgcggcgggc ggcggccgag gcggctccgg 120 ccagggccgg gccgggggcc ggggggcggc ggcgggcagg cggccgcgtc ggccggggcc 180 gggacgatga ctctggagtc catgatggcg tgttgcctga gcgatgaggt gaaggagtcc 240 aagcggatca acgccgagat cgagaagcag ctgcggcggg acaagcgcga cgcccggcgc 300 gagctcaagc tgctgctgct cggcacgggc gagagcggga agagcacgtt catcaagcag 360 atgcgcatca tccacggcgc cggctactcg gaggaggaca agcgcggctt caccaagctc 420 gtctaccaga acatcttcac cgccatgcag gccatgatcc gggccatgga gacgctcaag 480 atcctctaca agtacgagca gaacaaggcc aatgcgctcc tgatccggga ggtggacgtg 540 gagaaggtga ccaccttcga gcatcagtac gtcagtgcca tcaagaccct gtgggaggac 600 ccgggcatcc aggaatgcta cgaccgcagg cgcgagtacc agctctccga ctctgccaag 660 tactacctga ccgacgttga ccgcatcgcc accttgggct acctgcccac ccagcaggac 720 gtgctgcggg tccgcgtgcc caccaccggc atcatcgagt accctttcga cctggagaac 780 atcatcttcc ggatggtgga tgtggggggc cagcggtcgg agcggaggaa gtggatccac 840 tgctttgaga acgtgacatc catcatgttt ctcgtcgccc tcagcgaata cgaccaagtc 900 ctggtggagt cggacaacga gaaccggatg gaggagagca aagccctgtt 
ccggaccatc 960 atcacctacc cctggttcca gaactcctcc gtcatcctct tcctcaacaa gaaggacctg 1020 ctggaggaca agatcctgta ctcgcacctg gtggactact tccccgagtt cgatggtccc 1080 cagcgggagc cccaggcggc gcgggagttc atcctgaaga tgttcgtgga cctgaacccc 1140 gacagcgaca agatcatcta ctcacacttc acgtgtgcca ccgacacgga gaacatccgc 1200 ttcgtgttcg cggccgtgaa ggacaccatc ctgcagctca acctcaagga gtacaacctg 1260 gtctgagcgc cccaggccca gggagacggg atggagacac ggggcaggac cttccttcca 1320 cggagcctgc gctgccgggc gggtggcgct gccgagtccg ggccggggct ctgccgcggg 1380 aggagatttt ttttttttca tatttttaac aaatggtttt tatttcacag ttatcagggg 1440 atgtacatct ctccctccgt acacttcgcg caccttctca ccttttgtca acggcaaagg 1500 cagccttttt ctggccttga cttatggctc gcttttttct 1540 41 1517 DNA Homo sapiens 41 attctttggg gaggcaacta ggatggtgtg gccgaccacg gatttgcatt gccgaggacg 60 ggaccccagg gcagcgaagc agaatggcca acatgcaggg actggtggaa agactggaac 120 gagctgtcag ccgcctggag tcgctgtctg cagagtccca caggccccct gggaactgcg 180 gggaagtcaa tggtgtcatt gcaggtgtgg caccctccgt ggaagccttt gacaagctga 240 tggacagtat ggtggccgag tttttaaaga acagtaggat ccttgctggg gacgtggaga 300 cccatgcaga aatggtgcac agtgctttcc aggcccagcg ggctttcctt ctgatggcct 360 ctcagtacca acaaccccac gagaatgacg tggccgcact tctgaaaccc atatcggaaa 420 agattcagga aatccaaact ttcagagaga gaaaccgggg gagtaacatg tttaatcatc 480 tttcggccgt cagcgaaagc atccctgccc ttggatggat agctgtgtct cccaaacctg 540 gtccttatgt caaggagatg aatgacgctg ccacctttta cactaacagg gtcttaaagg 600 actacaaaca cagtgatttg cgtcatgtgg attgggtgaa gtcatatttg aacatttgga 660 gtgaacttca agcatacatc aaggaacacc acaccacggg cctcacatgg agcaaaacag 720 gtcctgtagc atccacagta tcagcgtttt ctgtcctctc ctctgggcct ggccttcctc 780 caccccctcc tcctctgcct cctccagggc cacctccact tttcgagaat gaaggcaaaa 840 aagaggaatc ttctccttca cgctcagctt tatttgccca acttaaccag ggagaagcaa 900 ttacaaaagg gctccgccat gtcacagatg accagaagac atacaaaaat cccagcctgc 960 gggctcaagg agggcaaact caatctccca ccaaaagtca cactccaagt cccacatctc 1020 ctaaatctta tccttctcaa aaacatgccc cagtgttgga gttggaagga aagaaatgga 1080 
gagtggagta ccaagaggac aggaatgacc ttgtgatttc agagactgag ctgaaacaag 1140 tggcttacat tttcaaatgc gaaaaatcaa ctattcagat aaaagggaaa gtaaactcca 1200 ttataattga caactgtaag aaactcggcc tggtgtttga caatgtggtg ggcattgtgg 1260 aagtgatcaa ctcccaggac attcaaatcc aggtaatggg gagagtgcca acaatttcca 1320 ttaataagac agaaggttgc cacatatacc tcagtgaaga tgcattagac tgtgagatcg 1380 tgagcgccaa gtcatctgaa atgaacatac ttatccctca ggatggtgat tatagagaat 1440 ttcccattcc tgaacagttc aagacagcat gggatggatc caagttaatc actgaacctg 1500 cagaaattat ggcctaa 1517 42 1616 DNA Homo sapiens 42 tgctgaacca tttttcttag gatgcagccg tctcactccc ttgtcctgta aatcgtgtat 60 tcatgttgat gattcttgga gataggtttc actttttccc agctgcgtcc acaggaaagg 120 ggagtcggat gccagctgca ccccgcctgg ctcgcacagg ctaagaccac agacagagca 180 gggcttcccg gagccacaca ggccacgcac cccaggaacc cttgctgccg cgggccagga 240 acaggaatgt gttggtgcct gagacaccaa atggaagaag cacatcaaga ctgttctcct 300 gcggccaaca ctggcccgga agccgccctc catacaggcc ctcagggggc ctgccttctg 360 cgcctcagtc ccccgtgcat ccctgggcct gggtatcaca tgctctccag gaaagggacg 420 gaatcaatcg tgtgaccgat gggctcgcaa ggatgggtgc cgccgtggga gccctgcctc 480 tggtgctggc aagggattgg gtttgtgtgg gtgtctctag cctgcagagt gcagtgagtg 540 agagtccttg ggagcgcggc gctgcctgta gctgtgcctg gggatgcacg tggccacggg 600 atttcagtgg gacagcgctc ccacaggggc tgggggtggg ggtggggttt cttagttact 660 gttggaaagg gaaaaattca ccatatccaa ggggagagac gatgggctgg gtttgtttac 720 tccaacttcc cttctacacc cctcctgcag gacagtacga tttggggaga acccagctcc 780 ccactttatc tgcagactct gggacctgac aaaacagtca gagcctgagt gcactgcagc 840 ctgaactccc ttgagcagcg ctataaggga ctttgcactt taaaaagggg atgcctgtca 900 gtaaatcccc tgtgcattga ctagaactgg ggggctgcgc ccgctccctc cttaatccta 960 gatgatttgc tcatgaaata gaggtggggg acgaccgcat gcactctggg aggtgcagcc 1020 ctaaggggtg gactccagat ctccctgcaa gagacagctt ggcttggctt tggctgttgg 1080 ggaggagtcc ctgccatccc ggtgagcctg gggctgttgc ttagggtctt ctgggtggac 1140 acgtggagaa agagaaggca aacgttggaa cactaggaaa agctagaaat tcagacaaca 1200 cacatggatc cccttaaaac atgtaaatgt gtcagaacac 
ggttgacctg ccgccttctt 1260 gaacctggtg gcccccgttg gaactatcag tggcgtctcc catgcacacg ccctctgctt 1320 tctctttcct agactcgcgg tgctcacatc cagacattac cttgttggta gcccccaagt 1380 ggcgtgcagt gacaccagta tcttctctgt tgcatttttg caatcttgtg tcccgctcgg 1440 tgatgttcta caactctgtt ttaaggttga gaaagtttca agggtgaaga tctcaaaaca 1500 gtgctaaaat caaaggtgtt tgctgtgaag aaaaacatgt gtatatattg caccttgagt 1560 tgtcagaagg tagaaactga aataaactaa ctttaaaaaa aaaaaaaaaa aaaaaa 1616 43 2408 DNA Homo sapiens 43 ccgcgcctcc tcggccgcct gtcgggcatg aaaaccaaat tctgcaccgg gggcgaggcg 60 gagccctcgc cgctcgggct gctgctgagc tgcggtagcg gcagcgcggc cccggcgccc 120 ggcgtggggc agcagcgcga cgccgccagc gacctcgagt ccaagcagct ggcgccaaca 180 gccgcgctcg cgctgccccc tccgccgccg ctgccgctgc cgctgccgct gccccagccc 240 ccgccgccgc agccgcccgc agacgagcag ccggagcccc gggcgcggcg cagggcctat 300 ctgtggtgca aggagttcct gcccggcgcc tggcggggcc tccgcgagga cgagttccac 360 atcagtgtca tcagaggcgg ccttagcaac atgctgttcc agtgctccct acctgacacc 420 acagccaccc ttggtgatga gcctcggaaa gtgctcctgc ggctgtatgg agcgattttg 480 cagatgaggt cctgtaataa agagggatcc gaacaagctc agaaagaaaa tgaatttcaa 540 ggggctgagg ccatggttct ggagagcgtt atgtttgcca ttctcgcaga gaggtcactt 600 gggccaaaac tctatggcat ctttccccaa ggccgactgg agcagttcat cccgagccgg 660 cgattagata ctgaagaatt aagtttgcca gatatttctg cagaaatcgc cgagaaaatg 720 gctacatttc atggtatgaa aatgccattc aataaggaac caaaatggct ttttggcaca 780 atggaaaagt atctaaagga agtgctgaga attaaattta ctgaggaatc cagaattaaa 840 aagctccaca aattgctcag ttacaatctg cccttggaac tggaaaacct gagatcattg 900 cttgaatcta ctccatctcc agttgtattt tgtcataatg actgtcaaga aggtaatatc 960 ttgttgctgg aaggccgaga gaattctgaa aaacagaaac tgatgctcat tgatttcgaa 1020 tacagcagtt acaattacag gggattcgac attggaaatc acttctgtga gtggatgtat 1080 gattatagct atgaaaaata cccttttttc agagcaaaca tccggaagta tcccaccaag 1140 aaacaacagc tccattttat ttccagttac ttgcctgcat tccaaaatga ctttgaaaac 1200 ctcagtactg aagaaaaatc cattataaaa gaagaaatgt tgcttgaagt taataggttt 1260 gcccttgcat ctcatttcct ctggggactg tggtccattg 
tacaagccaa gatttcatct 1320 attgaatttg ggtacatgga ctacgcccaa gcaaggtttg atgcctattt ccaccagaag 1380 aggaagcttg gggtgtgact gtggggagga ctccatccac ctcatcactg gactgcatgg 1440 ggaggcagca gagcgcggtc ccctctgtgc ttcgactact gctcctgtgg caggaggctt 1500 tgggtggctc actactgaac acatgtgtat gatactaaag acggtattaa aatggagcga 1560 cgtttatttc atctcttgtt tacgatttca ctaggactca gaaacgagat cgggaagacg 1620 aaatatagtg caatagtgca acatctctga atccttttaa tctagagaag gcatttcata 1680 tttgggggct aaggtttcca gtcagatgag gcaaacagca agagtaagca gtgttacttg 1740 caggtacttt ggttaatgtt gactttaaat tttcatgaat gtgctggtga acactgtgac 1800 caggcttttg tagatggcga ctgtgttata gacggtgctc actcccaagg gacagcaagt 1860 gagcagagat gtactgcaaa gtcgccagtc actgcgtgca aggtggcctc tgcctggggc 1920 cgtccagaag ctgctccttt accctcttgg tcccatggct gaagcggagc agctggattg 1980 ctctggagca gccaaggccg ccactgtgga gacagagctc tcccctcctg ctgggcgtgt 2040 gtgacactgt agagtttcac tgtactcgat gtgacttctc ccctgccctt cctcctgatg 2100 gagtgtgcag acagccatgc gtggccacgg gggcagtgtg aggacctccc tgtctcccgc 2160 tcccctccca gggagcagct gcttgaccta gctctttggg cctctcctgc cctctgctct 2220 gcctggagtg tcggatcctg tgagtaggct gggcctcccc tgggcagggt tctccaaggc 2280 cggtttcccg gcccttacca aacctgatgc ccctgacatc atcattcttg tgggagacag 2340 cagcctgtat gtggtgtggg gcgtggatcg agtgtagctg tgaaatccat atatatgaaa 2400 tgtccaat 2408 44 1610 DNA Homo sapiens misc_feature (1)..(1610) n = a, c, g or t 44 cgtaacagga caaggagtcc tgctccggca cgtggccaca gaaaactact taggaagcct 60 gtggtgagaa caacaacagt gcctggagaa tcccacggct ctggggaagt gagccccgag 120 gatgaggctg ctcgcctggc tgattttcct ggctaactgg ggaggtgcca gggctgaacc 180 agggaagttc tggcacatcg ctgacctgca ccttgaccct gactacaagg tatccaaaga 240 ccccttccag gtgtgcccat cagctggatc ccagccagtg cccgacgcag gcccctgggg 300 tgactacctc tgtgattctc cctgggccct catcaactcc tccatctatg ccatgaagga 360 gattgagcca gagccagact tcattctctg gactggtgat gacacgcctc atgtgcccga 420 tgagaaactg ggagaggcag ctgtactgga aattgtggaa cgcctgacca agctcatcag 480 agaggtcttt ccagatacta aagtctatgc tgctttggga 
aatcatgatt ttcaccccaa 540 aaaccagttc ccagctggaa gtaacaacat ctacaatcag atagcagaac tatggaaacc 600 ctggcttagt aatgagtcca tcgctctctt caaaaaaggt gccttctact gtgagaagct 660 gccgggtccc agcggggctg ggcgaattgt ggtcctcaac accaatctgt actataccag 720 caatgcgctg acagcagaca tggcggaccc tggccagcag ttccagtggc tggaagatgt 780 gctgaccgat gcatccaaag ctggggacat ggtgtacatt gtcggccacg tgcccccggg 840 gttctttgag aagacgcaaa acaaggcatg gttccgggag ggcttcaatg aaaaatacct 900 gaaggtggtc cggaagcatc atcgcgtcat agcagggcag ttcttcgggc accaccacac 960 cgacagcttt cggatgctct atgatgatgc aggtgtcccc ataagcgcca tgttcatcac 1020 acctggagtc accccatgga aaaccacatt acctggagtg gtcaatgggg ccaacaatcc 1080 agccatccgg gtgttcgaat atgaccgagc cacactgagc ctnnaggaca tggtgaccta 1140 cttcatgaac ctgagccagg cgaatgctca ggggacgccg cgctgggagc tcgagtacca 1200 gctgaccgag gcctatgggg tgccggacgc cagcgcccac tccatcgaca cagtgctgga 1260 ccgcatcgct ggcgaccaga gcacactgca gcgctactac gtctataact cagtcagcta 1320 ctctgctggg gtctgcgacg aggcctgcag catgcagcac gtgtgtgcca tgcgccaggt 1380 ggacattgac gcttacacca cctgtctgta tgcctctggc accacgcccg tgccccagct 1440 nccgntgctg ctgatggccc tgctggggct gtgcacgact cgtgctgtga cctgccaggc 1500 tcaccattct tcctggtaac gggtaacggg ggcagcgccc aggatcaccc agagctgggc 1560 cttccaccat ttcctccgcg cctgaggagt gaactgaatg gacaccgatc 1610 45 1882 DNA Homo sapiens 45 gggcaggaag acggcgctgc ccggaggagc ggggcgggcg ggcgcgcggg ggagcgggcg 60 gcgggcggga gccaggcccg ggcgggggcg ggggcggcgg ggccagaaga ggcggcgggc 120 cgcgctccgg ccggtctgcg gcgttggcct tggctttggc tttggcggcg gcggtggaga 180 agatgctgca gtccctggcc ggcagctcgt gcgtgcgcct ggtggagcgg caccgctcgg 240 cctggtgctt cggcttcctg gtgctgggct acttgctcta cctggtcttc ggcgcagtgg 300 tcttctcctc ggtggagctg ccctatgagg acctgctgcg ccaggagctg cgcaagctga 360 agcgacgctt cttggaggag cacgagtgcc tgtctgagca gcagctggag cagttcctgg 420 gccgggtgct ggaggccagc aactacggcg tgtcggtgct cagcaacgcc tcgggcaact 480 ggaactggga cttcacctcc gcgctcttct tcgccagcac cgtgctctcc accacaggtt 540 atggccacac cgtgcccttg tcagatggag gtaaggcctt ctgcatcatc 
tactccgtca 600 ttggcattcc cttcaccctc ctgttcctga cggctgtggt ccagcgcatc accgtgcacg 660 tcacccgcag gccggtcctc tacttccaca tccgctgggg cttctccaag caggtggtgg 720 ccatcgtcca tgccgtgctc cttgggtttg tcactgtgtc ctgcttcttc ttcatcccgg 780 ccgctgtctt ctcagtcctg gaggatgact ggaacttcct ggaatccttt tatttttgtt 840 ttatttccct gagcaccatt ggcctggggg attatgtgcc tggggaaggc tacaatcaaa 900 aattcagaga gctctataag attgggatca cgtgttacct gctacttggc cttattgcca 960 tgttggtagt tctggaaacc ttctgtgaac tccatgagct gaaaaaattc agaaaaatgt 1020 tctatgtgaa gaaggacaag gacgaggatc aggtgcacat catagagcat gaccaactgt 1080 ccttctcctc gatcacagac caggcagctg gcatgaaaga ggaccagaag caaaatgagc 1140 cttttgtggc cacccagtca tctgcctgcg tggatggccc tgcaaaccat tgagcgtagg 1200 atttgttgca ttatgctaga gcaccagggt cagggtgcaa ggaagaggct taagtatgtt 1260 catttttatc agaatgcaaa agcgaaaatt atgtcacttt aagaaatagc tactgtttgc 1320 aatgtcttat taaaaaacaa caaaaaaaga cacatggaac aaagaagctg tgaccccagc 1380 aggatgtcta atatgtgagg aaatgagatg tccacctaaa attcatatgt gacaaaatta 1440 tctcgacctt acataggagg agaatacttg aagcagtatg ctgctgtggt tagaagcaga 1500 ttttatactt ttaactggaa actttggggt ttgcatttag atcatttagc tgatggctaa 1560 atagcaaaat ttatatttag aagcaaaaaa aaaaagcata gagatgtgtt ttataaatag 1620 gtttatgtgt actggtttgc atgtacccac ccaaaatgat tatttttgga gaatctaagt 1680 caaactcact atttataatg cataggtaac cattaactat gtacatataa agtataaata 1740 tgtttatatt ctgtacatat ggtttaggtc accagatcct agtgtagttc tgaaactaag 1800 actatagata ttttgtttct tttgatttct ctttatacta aagaatccag agttgctaca 1860 ataaaataag gggaataata aa 1882 46 1805 DNA Homo sapiens 46 aagagactga actgtatctg cctctatttc caaaagactc acgttcaact ttcgctcaca 60 caaagccggg aaaattttat tagtcctttt tttaaaaaaa gttaatataa aattatagca 120 aaaaaaaaaa ggaacctgaa ctttagtaac acagctggaa caatcgcagc ggcggcggca 180 gcggcgggag aagaggttta atttagttga ttttctgtgg ttgttggttg ttcgctagtc 240 tcacggtgat ggaagctgca cattttttcg aagggaccga gaagctgctg gaggtttggt 300 tctcccggca gcagcccgac gcaaaccaag gatctgggga tcttcgcact atcccaagat 360 ctgagtggga catacttttg 
aaggatgtgc aatgttcaat cataagtgtg acaaaaactg 420 acaagcagga agcttatgta ctcagtgaga gtagcatgtt tgtctccaag agacgtttca 480 ttttgaagac atgtggtacc accctcttgc tgaaagcact ggttcccctg ttgaagcttg 540 ctagggatta cagtgggttt gactcaattc aaagcttctt ttattctcgt aagaatttca 600 tgaagccttc tcaccaaggg tacccacacc ggaatttcca ggaagaaata gagtttctta 660 atgcaatttt cccaaatgga gcaggatatt gtatgggacg tatgaattct gactgttggt 720 acttatatac tctggatttc ccagagagtc gggtaatcag tcagccagat caaaccttgg 780 aaattctgat gagtgagctt gacccagcag ttatggacca gttctacatg aaagatggtg 840 ttactgcaaa ggatgtcact cgtgagagtg gaattcgtga cctgatacca ggttctgtca 900 ttgatgccac aatgttcaat ccttgtgggt attcgatgaa tggaatgaaa tcggatggaa 960 cttattggac tattcacatc actccagaac cagaattttc ttatgttagc tttgaaacaa 1020 acttaagtca gacctcctat gatgacctga tcaggaaagt tgtagaagtc ttcaagccag 1080 gaaaatttgt gaccaccttg tttgttaatc agagttctaa atgtcgcaca gtgcttgctt 1140 cgccccagaa gattgaaggt tttaagcgtc ttgattgcca gagtgctatg ttcaatgatt 1200 acaattttgt ttttaccagt tttgctaaga agcagcaaca acagcagagt tgattaagaa 1260 aaatgaagaa aaaacgcaaa aagagaacac atgtagaagg tggtggatgc tttctagatg 1320 tcgatgctgg gggcagtgct ttccataacc accactgtgt agttgcagaa agccctagat 1380 gtaatgatag tgtaatcatt ttgaattgta tgcattatta tatcaaggag ttagatatct 1440 tgcatgaatg ctctcttctg tgtttaggta ttctctgcca ctcttgctgt gaaattgaag 1500 tggatgtaga aaaaaccttt tactatatga aactttacaa cacttgtgaa agcaactcaa 1560 tttggtttat gcacagtgta atatttctcc aagtatcatc caaaattccc cacagacaag 1620 gctttcgtcc tcattaggtg ttggcctcag cctaaccctc taggactgtt ctattaaatt 1680 gctgccagaa ttttacatcc agttacctcc actttctaga acatattctt tactaatgtt 1740 attgaaacca atttctactt catactgatg tttttggaaa cagcaattaa agtttttctt 1800 ccatg 1805 47 2653 DNA Homo sapiens 47 gagcgcggct ggagtttgct gctgccgctg tgcagtttgt tcaggggctt gtggcggtga 60 gtccgagagg ctgcgtgtga gagacgtgag aaggatcctg cactgaggag gtggaaagaa 120 gaggattgct cgaggaggcc tggggtctgt gagacagcgg agctgggtga aggctgcggg 180 ttccggcgag gcctgagctg tgctgtcgtc atgcctcaaa cccgatccca ggcacaggct 240 
acaatcagtt ttccaaaaag gaagctgtct cgggcattga acaaagctaa aaactccagt 300 gatgccaaac tagaaccaac aaatgtccaa accgtaacct gttctcctcg tgtaaaagcc 360 ctgcctctca gccccaggaa acgtctgggc gatgacaacc tatgcaacac tccccattta 420 cctccttgtt ctccaccaaa gcaaggcaag aaagagaatg gtccccctca ctcacataca 480 cttaagggac gaagattggt atttgacaat cagctgacaa ttaagtctcc tagcaaaaga 540 gaactagcca aagttcacca aaacaaaata ctttcttcag ttagaaaaag tcaagagatc 600 acaacaaatt ctgagcagag atgtccactg aagaaagaat ctgcatgtgt gagactattc 660 aagcaagaag gcacttgcta ccagcaagca aagctggtcc tgaacacagc tgtcccagat 720 cggctgcctg ccagggaaag ggagatggat gtcatcagga atttcttgag ggaacacatc 780 tgtgggaaaa aagctggaag cctttacctt tctggtgctc ctggaactgg aaaaactgcc 840 tgcttaagcc ggattctgca agacctcaag aaggaactga aaggctttaa aactatcatg 900 ctgaattgca tgtccttgag gactgcccag gctgtattcc cagctattgc tcaggagatt 960 tgtcaggaag aggtatccag gccagctggg aaggacatga tgaggaaatt ggaaaaacat 1020 atgactgcag agaagggccc catgattgtg ttggtattgg acgagatgga tcaactggac 1080 agcaaaggcc aggatgtatt gtacacgcta tttgaatggc catggctaag caattctcac 1140 ttggtgctga ttggtattgc taataccctg gatctcacag atagaattct acctaggctt 1200 caagctagag aaaaatgtaa gccacagctg ttgaacttcc caccttatac cagaaatcag 1260 atagtcacta ttttgcaaga tcgacttaat caggtatcta gagatcaggt tctggacaat 1320 gctgcagttc aattctgtgc ccgcaaagtc tctgctgttt caggagatgt tcgcaaagca 1380 ctggatgttt gcaggagagc tattgaaatt gtagagtcag atgtcaaaag ccagactatt 1440 ctcaaaccac tgtctgaatg taaatcacct tctgagcctc tgattcccaa gagggttggt 1500 cttattcaca tatcccaagt catctcagaa gttgatggta acaggatgac cttgagccaa 1560 gagggagcac aagattcctt ccctcttcag cagaagatct tggtttgctc tttgatgctc 1620 ttgatcaggc agttgaaaat caaagaggtc actctgggga agttatatga agcctacagt 1680 aaagtctgtc gcaaacagca ggtggcggct gtggaccagt cagagtgttt gtcactttca 1740 gggctcttgg aagccagggg cattttagga ttaaagagaa acaaggaaac ccgtttgaca 1800 aaggtgtttt tcaagattga agagaaagaa atagaacatg ctctgaaaga taaagcttta 1860 attggaaata tcttagctac tggattgcct taaattcttc tcttacaccc cacccgaaag 1920 tattcagctg gcatttagag 
agctacagtc ttcattttag tgctttacac attcgggcct 1980 gaaaacaaat atgacctttt ttacttgaag ccaatgaatt ttaatctata gattctttaa 2040 tattagcaca gaataatatc tttgggtctt actattttta cccataaaag tgaccaggta 2100 gacccttttt aattacattc actacttcta ccacttgtgt atctctagcc aatgtgcttg 2160 caagtgtaca gatctgtgta gaggaatgtg tgtatattta cctcttcgtt tgctcaaaca 2220 tgagtgggta tttttttgtt tgtttttttt gttgttgttg tttttgaggc gcgtctcacc 2280 ctgttgccca ggctggagtg caatggcgcg ttctctgctc actacagcac ccgcttccca 2340 ggttgaagtg attctcttgc ctcagcctcc cgagtagctg ggattacagg tgcccaccac 2400 cgcgcccagc taatttttta atttttagta gagacagggt tttaccatgt tggccaggct 2460 ggtcttgaac tcctgaccct caagtgatct gcccaccttg gcctccctaa gtgctgggat 2520 tataggcgtg agccaccatg ctcagccatt aaggtatttt gttaagaact ttaagtttag 2580 ggtaagaaga atgaaaatga tccagaaaaa tgcaagcaag tccacatgga gatttggagg 2640 acactggtta aag 2653 48 1618 DNA Homo sapiens 48 atgtcccggc cgcagcttcg acgctggcgc ctcgtctcta gcccgccgag cggcgtcccg 60 ggtctagcgc tgctggcgct gctggcgctg ctggcgctgc ggctcgcggc cgggaccgac 120 tgcccatgcc cggagcctga gctctgccgc ccgattcgcc accatccaga tttcgaggtc 180 tttgtgtttg atgttggaca gaaaacttgg aaatcttatg attggtcaca gattacaact 240 gtggcaacat ttggaaaata tgactcagaa cttatgtgct acgctcattc aaaaggagcc 300 agagtagtac ttaaaggaga tgtatcctta aaggatatca ttgatcctgc tttcagagca 360 tcctggatag ctcaaaaact taatttggcc aaaacacaat atatggatgg aattaatata 420 gatatagagc aagaagttaa ttgtttatca cctgaatatg atgcattaac tgctttagtc 480 aaagaaacta cagactcttt ccatcgtgaa attgagggat cacaggtaac ctttgatgta 540 gcttggtctc caaagaacat agacagaaga tgctataatt atactggaat cgcagatgct 600 tgtgacttcc tctttgtgat gtcttatgat gaacaaagtc agatctggtc agaatgtatt 660 gcagcagcca atgctcccta taatcagaca ttaactggat ataatgacta catcaagatg 720 agcattaatc ctaagaaact tgtaatgggt gttccttggt atggttatga ttatacctgc 780 ctgaatctgt ctgaggatca tgtttgtacc attgcaaaag tccctttccg gggggctcct 840 tgtagtgacg ctgcaggacg tcaggtgccc tacaaaacga tcatgaagca aataaatagt 900 tctatttctg gaaacctatg ggataaagat cagcgggctc cttattataa ctataaagat 960 
cctgctggcc actttcatca agtatggtat gataaccctc agagtatttc tttaaaggca 1020 acatatatac aaaactatcg cttacggggc attggcatgt ggaatgcaaa ctgtcttgac 1080 tactctggag atgctgtagc caaacagcaa actgaagaaa tgtgggaagt cttaaagcca 1140 aagctgttac agagatgaac atcttttgtc aaaccattaa gagttagaaa gatgatctgt 1200 atcaacagat ctagtttctt gcatttttat tatgttgcta tatacttttg ttatccgtat 1260 actaaaaaaa aagaataaat aaatgttttg attgtttgaa tttgaaaaat acacacgaat 1320 gtcctcagta tccaggaaca taaaggcaag aagcaagtca acttacctat taaatattcc 1380 tctattagat gtttcaacac tataatttaa ttgggaaaaa ttgctttcag aattttatta 1440 tgccatattt cccttcatta tagtaaaata tatgctcacg aatcaatgct gatttttaaa 1500 atatgtataa tctgaagtgg aaattgtttg cttagagttt ttaaaaacct agtctttgaa 1560 aagcagtttg tgctatactt ttcccccaac cctccaataa atcttaaatt taaaacct 1618 49 4814 DNA Homo sapiens 49 ggcggcggga gccctggaac ggagcttcgt ggagctaagc ggagctgagc gcgaaaggcc 60 gaggcacttt cgggaattca cagtctgcag cattgggact gcaaatgccg tggctggcgc 120 cgtaaaatac agtgaaagcg cgggaggctt ttactacgtg gagagtggca agttgttctc 180 cgtaaccaga aacaggttca ttcattggaa gacctctgga gatacattgg agctgatgga 240 ggagtcactg gacataaatc tgttgaataa tgccattcgc ctaaaattcc aaaattgcag 300 tgttttacct ggaggggttt atgtctctga gactcagaat cgtgtgataa tcttgatgtt 360 aaccaatcaa acagtgcaca ggttactttt accacacccc tcccggatgt ataggagtga 420 gttggtagtt gacagtcaga tgcagtcaat attcactgac attggaaaag ttgatttcac 480 agatccttgc aactatcagt taattccagc agtacctgga atatctccta attccaccgc 540 ctctacagcc tggctcagca gtgatgggga ggccctgttt gccttaccat gtgcttctgg 600 gggaatcttt gttcttaagc tacctcctta tgacatacct ggtatggtgt cagtcgtgga 660 actgaaacag agttcagtaa tgcaacgatt gcttacaggc tggatgccaa cagctatcag 720 gggtgaccag tcgccttcag atcgtcccct cagtcttgct gttcattgtg tggagcatga 780 tgccttcatc tttgctttgt gtcaggatca taaactacga atgtggtctt acaaggagca 840 aatgtgccta atggtagctg acatgctgga gtatgtccct gtgaagaaag accttcggct 900 tactgctgga actggacaca aattacggct tgcttattcc cccaccatgg gactctacct 960 ggggatatac atgcatgcac caaaacgagg acagttctgc attttccagt tggtgagcac 1020 
tgagagtaat cgctatagtc tcgatcatat ttcttcactg ttcacttctc aggagacact 1080 gattgacttt gccttaactt ccacggatat ctgggccctg tggcatgatg ctgagaacca 1140 aacagtagtg aaatacatca actttgaaca taatgttgca ggtcagtgga atccagtttt 1200 tatgcagcct ctgccagagg aagagattgt catcagagat gatcaagacc ccagagagat 1260 gtatctgcaa agtcttttta caccaggaca attcacaaat gaagctttat gtaaggcttt 1320 acagattttc tgccgaggaa ctgagaggaa tttggatctt tcctggagtg aactgaagaa 1380 agaagttact ttagctgttg aaaatgagct tcaaggaagt gtaacagagt atgaattctc 1440 ccaggaggag tttcgaaatt tacaacaaga attctggtgc aagttctatg cctgttgtct 1500 tcagtatcaa gaagccctct ctcaccctct tgccctacat ttgaatccac acacaaacat 1560 ggtgtgcctg ctgaaaaaag ggtacctgtc tttccttatt ccctcatcct tagtggatca 1620 tttgtatctc ctgccttatg agaacctttt gacagaagat gagacaacca tatctgatga 1680 tgtggatatc gctcgggatg tcatatgtct tataaaatgc ctccggctga ttgaagagtc 1740 agtaactgtg gatatgtcag ttataatgga aatgagttgt tataacctac agtctccgga 1800 aaaggctgca gagcagattc tggaagatat gatcactatt gatgtagaaa atgtgatgga 1860 ggatatttgt agtaaactgc aagagattag gaacccaatc catgcaattg gactacttat 1920 acgggaaatg gattatgaaa cagaagtgga aatggaaaag ggattcaatc cagctcagcc 1980 tttgaatatt cgaatgaatc ttacccagct ctatggtagt aacacagcag ggtatattgt 2040 gtgcagaggg gtgcataaaa tcgccagtac tcgtttcctg atctgcagag atcttttgat 2100 cttacagcag ctgttaatga ggcttggaga tgctgtgatt tggggaactg gtcagctctt 2160 tcaagctcag caagacctac tacatcgaac agctccccta ctcttatctt attacctcat 2220 taaatgggga agtgagtgct tggcaactga tgttccactt gacacactgg agtctaatct 2280 ccaacactta tcagtactgg aattaacaga ctctggtgct ttaatggcaa ataggtttgt 2340 atctagtcct cagactattg tggagttatt cttccaagaa gttgcaagaa aacacattat 2400 atctcacctc ttctctcagc caaaggcacc tctgagccaa actggattga attggcctga 2460 aatgattact gcaattacca gttatttatt gcagctttta tggcctagca atcctggttg 2520 tctctttcta gaatgtttga tgggaaattg ccaatatgta caattgcagg attatattca 2580 actgctacat ccctggtgtc aagtcaatgt tggttcctgt cgatttatgc tgggaaggtg 2640 ttacctagtt acaggagaag gacagaaggc tctggaatgt ttttgtcagg cagcatctga 2700 agtaggcaaa 
gaggaattct tggatcgctt gattcgctca gaggatgggg agatcgtgtc 2760 tacccccagg ctgcagtatt atgacaaggt tttacgacta ctagatgtca ttggtttgcc 2820 tgaactggtt attcagttgg ctacatcagc cataactgaa gcaagtgatg actggaaaag 2880 tcaggctact ctaaggacat gtattttcaa acatcatttg gatttgggtc acaatagcca 2940 agcatatgaa gccttaaccc aaattcctga ttccagcagg caattagatt gtttacggca 3000 gttggtggta gttctttgtg aacgctcaca gctacaggat cttgtagagt ttccctatgt 3060 gaatctgcat aatgaggttg tgggaataat tgagtcacgt gctagagctg tggaccttat 3120 gactcacaat tactatgaac ttctgtatgc ctttcacatc tatcgccaca attaccgcaa 3180 ggctggcaca gtgatgtttg agtatggaat gcggcttggc agagaagttc gaactctccg 3240 gggacttgag aaacaaggca actgttatct ggctgctctc aattgtttac gacttattcg 3300 tccagaatat gcgtggattg tgcagccagt gtctggtgca gtgtatgatc gccctggagc 3360 atcccctaag aggaatcatg atggagaatg cacagctgcc cccacaaatc gacaaattga 3420 aatcctggaa ctggaagatc tggagaaaga gtgttccttg gctcgcatcc gcctcacttt 3480 ggctcagcat gatccatcag cggttgcagt tgctggaagt tcatcagcag aggaaatggt 3540 cactctcttg gttcaggcgg gcctctttga cactgccata tcactctgtc agacttttaa 3600 gcttccctta acgccagtct ttgaagggct tgccttcaaa tgcatcaaat tgcaatttgg 3660 aggagaggca gcacaagcag aagcctgggc ctggctagca gccaatcagc tctcatctgt 3720 catcactact aaggagtcta gtgctacaga tgaagcatgg cgactattat ccacttacct 3780 ggagaggtac aaagtccaga ataacttgta tcaccactgt gtaatcaaca agctcttgtc 3840 tcatggagtg cctctgccta attggcttat aaacagtcac aacatcgcac tgtcccaaaa 3900 agttgataag gcaacacggg atttattata tcgtcggacc ttgtgatttg gattgtcacc 3960 tagcctttgt aaccgcttgg tgcctcttag gacttaagac taccctacag gaaccctgta 4020 ctcaaggccg atttttgtaa ctgtaaatga tgtgtacaac attcaagtct gcattctgca 4080 caagatagga gggcggaaga gtcagaggac cctgtgcttg ctggtggtgc taacacaatt 4140 tctggtgttc aaccttggtc tcaaatagct gcttttgtat atgattcacg agctttttta 4200 gagtttatat ttttttaaac taccgaagac attcattatc tgcaaattaa gactcacctt 4260 cactttccaa aatagctgag ggttgttggc ttgttgtagc tgaccaccaa aagcagtcac 4320 tgcaaatctt ttaattcttc cctatcacct tttgtatttt aatgcaatta ttttggtcca 4380 gaactgacct gtattttctg 
tattgtacac aaaagctaat aattttgtgt actttttatt 4440 tattttggag gttttatatg atcttcaatt gagtattaaa taatttgcct agattaagcc 4500 taaaatgatg accagctaat taaagaagat attttgaatc tggttctgag ctaaagttga 4560 gtaaattctt agctaagaaa aaattggaaa tccatcatct atattagcaa cagattctca 4620 gagtaaattg ttaacttcta tgatttatga taatcaagct ggacttgatc atacaagtta 4680 gtctcataat gtattggacc aaaatgtaaa cttcattggt cagatttaga agcattcatg 4740 ctcacaagtt ttgggaaagt gaaaaataat aaaatcatct tggattttat tctgtatatt 4800 aaaatttatc tttt 4814 50 6493 DNA Homo sapiens misc_feature (1)..(6493) n = a, c, g or t 50 gaattcaagt cttgttcctg cacattccac cctggagaaa tctggggcaa gtgactgttc 60 cccgggcctt agcttctcct gtcactggga catcacaaca gcacctacct tagggcaact 120 caggccaggg aagttggtgc tgcctcacct cccaatgtgc gtcctcctgg gcctggagcc 180 tcagggcctc tggaaggagg aagtgagcgc ctctgggcag gattcctggg aggcctggga 240 gagcaaggga agcgccaaga gctgagcaga gttctgggac tgatccatgg ccctttctct 300 ctcacctttc aggaggtggg ccccctccac ccccagcact tcccacctgg tcggtcccga 360 acggcccctc cccggaggag gtggagcagc agaaaaggtg gggctgggcc ctgggtgggg 420 aaccttagcc gctgccagag ttccatatgt tctggaaccc ttgactccta gagttcagaa 480 cccagccaac ttgcagtttt cagaatgttc aagaaacttc tgacactcag agttgcagaa 540 cctcctggtc cctgcagatt cctggaaatc agaatatggt ggttgaaaga atcttgtggc 600 tgggcgtggt ggctcacgcc tgtaatccca gcactctggg aggccgaggc gggcagatcg 660 cctgaggtca ggagtttgag accagcctgg ccaacatggc gaaatcccgt ctctactgaa 720 gataacaaaa attagccggt catggtggcg cccgtgcctg taatcccagc tcggcaggcc 780 gaggcaggag aatcgcttga acccgggagg cagaggttgc agtgagccaa gatcgagcca 840 ctgcactcca gcctgggtga cagagtctca aaaaaaaaaa aagaaaagaa agaatcttgg 900 gcattttgta attcggtgtt cctgacagtt tagtgactgg gatctcgcat cctgatctct 960 ccctgtcgct gccctgccct ccattccccc tactctcacc cagccccctt cttggttccc 1020 taggggagga aggcttgggt gagtattagg agccagccac cctggagacc tctgagagag 1080 aggacggagg tcgctggccc cttcgctggc catccttagg accctgattg acggcagctc 1140 tctcgcctcc ccccacaggc agcagcccgg cccgtcggag cacatagagc gccgggtctc 1200 caatgcaggt gatgctcaga 
tagcttcggg agttgggagg gggcctccct ggaggaagtg 1260 gccagccagc tggacagtga agaatgaggc ttctctctct cagctgcccc cttttctgtg 1320 tttgtttcag gaggcccacc tgctcccccc gctgggggtc cacccccacc accaggacct 1380 ccccctcctc caggtccccc cccaccccca ggtttgcccc cttcnggggt cccagctgca 1440 gcgcacggag cagggggagg accaccccct gcaccccctc tcccggcagc acagggccct 1500 ggtggtgggg gagctggggc cccaggcctg gccgcagcta ttgctggagc caaactcagg 1560 aaagtcagca aggtgagggg ccgggagagg tgggcagggg gcaacagggc ttttatgggg 1620 gatgaggcca gggctgccgg cggtgtcatt gggctggaag gccaaaaggc ctgcccctaa 1680 agctcctgcc ccttttaaat ttctccagca ggaggaggcc tcaggggggc ccacagcccc 1740 caaagctgag agtggtcgaa gcggaggtgg gggactcatg gaagagatga acgccatgct 1800 ggcccggagg tgagcctgag cctggacccc caagtcacct ggagttccag ttcagtaggg 1860 cccagtcaga ggagggctcc aattcctgtt tagtttgttt cttttggtga atgttccccc 1920 tttgataacc aggtttggga tataatggtg gggtttgtca tgaaatgcct gaggcttgca 1980 accacctagg tagcctgtag atgttctaaa acccagaatt ctagaaccgt aggagatctt 2040 tcctcagaat tctgggaact caggttcctg caatctcagt gttccaacac agcaccgctc 2100 caccctcgga atcttactgt tccctaatat aagaatcata gaacctcctc caccctgatt 2160 ctagaaccac aatctcttga attttttttt tttttttttt tttttttttg agatggagtc 2220 ttgctctgtc acccaggctg gagtgcagtg gtatgatctc ggtccactgc aacctccgcc 2280 tcctgggttc aggcagttct tctgcctcag cctcctaagt agccgggatt acaggcatga 2340 gtcaccacac ccggctaatt tttgtatttt tagtagacac aggatttcac catgttggcc 2400 aggctggtct tgaactcttg acctcaagag atccacctgc ttcagcctct caaagtgttg 2460 gcattacagg ccactgcgcc cagcacaatc tcttgaattt ctaaaactag agtttcctta 2520 ggttttcgga gttccagaat tctatgcgct aggatctaca tttctagaac tcccctcaga 2580 aggggatggg ttgggtgacg gaagcacgtg tttttgcttt tctctcctgc agaaggaaag 2640 ccacgcaagt tggggagaaa acccccaagg atgaatctgc caatgtaagt cagggactct 2700 tcttgcccta catctcttag gccgtaccat gagggtaggg atagtgggat gtgtggggtt 2760 tgaacctgaa agaggaaatg ggcagaggtg tggcaggggc tggctcatgg cagttttatt 2820 tcctaccagc aggaggagcc agaggccaga gtcccggccc agagtggtga gtagagtgcc 2880 cagtccagcc acaggaacta caaatcccag 
aatactctgt tctcacatgt taagcaccct 2940 tataggagag tcagggcgaa tggtgctggg gattgtagtc tcctgagatg gggctttgat 3000 caggggctga tgaggttggg ggagtaagat tgattggggg gcagtctttt gtccctgatc 3060 tttctgattt cttgcctatc cccagaatct gtgcggagac cctgggagaa gaacagcaca 3120 accttgccaa ggtaggccat cggtcctggg gcccttgggg aggtaaaggc gggcagatcg 3180 cttgagccca ggaggtcaag accagcctgg gcaacatggc gacaccccat ctctacaaaa 3240 attagccagg cgtggtagca cttacctgtg gtcccagcta ctcaggaggc tgaggtggga 3300 ggattacttg agcccaggaa gttgaggcct cagcgagcca tcatcatgcc tgcactccag 3360 cctgagaaat agaatgtgac tgtctcaaaa caaaacacaa caaaccaaaa ccaaaaaaaa 3420 aaaaaactgg ggccccaaaa atacttggac ttgcccaatt tataaggcag agctcaatgt 3480 gatccctgga ataggaggcg gggaagcagg tcctctctct aatctcattg ctgtcccaaa 3540 ccacaccaac tcccccagga tgaagtcgtc ttcttcggtg accacttccg agacccaacc 3600 ctgcacgccc agctccagtg attactcgga cctacagagg gtgaaacagg taacttgggg 3660 gggaagttgg ggaccacagc aagagagatc taggtctggc ccctgccact ggcatgccgt 3720 atgatcctag ataacatctc agaaacctca ggtttccaat ctgacaaatg gagaaactgg 3780 attgggtcaa ggatgaccga gactccacac ccccttttct ggcacctgtg acagacatta 3840 ttaatctatc accgcgctca ttccagatga gtgccttgaa ttctttccgc acattgaccc 3900 agctgtccat caccaattgg agttggcagg aggctggaat gcgcttgcca accttggtac 3960 tggatgttct ccagtacttt tccggctcca aggatccaga attctcccct agaatcctcc 4020 agtcactctg cgaccttgac agcgatgtca tggtgtcgat gtaggggtag gtctcaaacc 4080 tactccccct ggcttttcca tcaacaagaa agaggggact ctggcagggc acggtggctc 4140 atgtgtgtaa tttcagcaca ttgcgaggct gaggtgggag cattgcttga ggccaggagt 4200 ttgagaccag cctggggcaa catcgggaga cccccatctc taaaaataac ttttaaaagt 4260 tacctgagaa ggccaggtgc ggtggctcat gcctgtaatc ccagcacttt gggaggccga 4320 ggtgggtgga tcacctgagg tcaggggttc aagaccagcc tggccaacat ggtgaaaccc 4380 atcgctacta aaaatacaaa aattaggctg ggaatggtgg ctcacagcca taatcccagc 4440 agtttggaag gctgatgggg acggatcacg tgaagtcaaa agttcgagac cagcctggcc 4500 aacatggcga aaccctgtct ctactaaaaa tacaaaaatt agctgggcct tgtggggggc 4560 acctgtaatc cagttatttg ggcggctgag gcaggagaat 
cgcttgaacc cgggagccag 4620 agattgcagt cagccgagat tgggccactg cactgcagtc tgggtgacag ggagactctg 4680 tttcaaaaaa aaaaagaaaa agaaaaagtt acctgattgt ggcggcaggt gactgtggtc 4740 ccagctactt gggaggctaa ggcaggagga ttacctgagc ctgggaagtt gaggctgcaa 4800 tgagctgtga tcatgccatt gcaccctagc ctaggcaaca gagcaaggtt ccttctcaaa 4860 aaataaaaga agggggattc attcctgcaa gtcccggtac ccctcctgat tagttttacc 4920 ccattaattt taggagcttc tggaagaggt gaagaaggaa ttgcagaaag tgaaagagga 4980 aatcattgaa ggtgaggtgg tttgctttgg ttttgttctt aaacatttac ttattttgga 5040 ggcatcatgt ccctgggcaa gagccctgtt ttggaaggga ggaggcagag actctgcccc 5100 tgacctctgc tccttgtttc cttccagcct tcgtccagga gctgaggaag cggggttctc 5160 cctgaccaca gggacccaga agacccgctt ctcctttccg cacacccggc ctgtcaccct 5220 gctttccctg cctctacttg acttggaatt ggctgaagac tacacaggaa tgcatcgttc 5280 ccactcccca tcccacttgg aaaactccaa gggggtgtgg cttccctgct cacacccaca 5340 ctggctgctg attggctggg gaggcccccg cccttttctc cctttggtcc ttcccctctg 5400 ccatcccctt ggggccggtc cctctgctgg ggatgcacca atgaacccca caggaagggg 5460 gaaggaagga gggaatttca cattcccttg ttctagattc actttaacgc ttaatgcctt 5520 caaagttttg gtttttttaa gaaaaaaaaa tatatatata tttgggtttt gggggaaaag 5580 ggaaattttt ttttctcttt ggttttgata aaatgggatg tgggagtttt taaatgctat 5640 agccctgggc ttgccccatt tggggcagct atttaagggg aggggatgtc tcaccgggct 5700 gggggtgaga catcccccca ccccagggac tccccttccc tctggctcct tccccttttc 5760 tatgaggaaa taagatgctg taactttttg gaacctcagt tttttgattt tttatttggg 5820 taggttttgg ggtccaggcc atttttttta ccccttggag gaaataagat gagggagaaa 5880 ggaaaagggg aggaaacttc tcccctccca ccttcacctt tagcttcttg aaaatgggcc 5940 cctgcagaat aaatctgcca gtttttataa atgctaagat ctctggagtg atttgaaggc 6000 ctgttctgat ggggatggag gtgtgctcgg cccccggtgc ccctccagga agatttggtc 6060 ctctgctgag aacccctgcc tcctcccagg aatccacctt cccttcatct tccttcccac 6120 cctgcatatt gcgcctgctc actcatcctc aggcccgcag ccaggatgat ctctgccccc 6180 tccagcctcc ctccccatgc cccttaggag gccacttcct ccccatccca ccctgccctt 6240 caccacccta ggggaggcca gaagcagcct cactttgtgt agccttgggc 
aagtccattt 6300 gcttacctca ggcctcagtt tctgatttgg gaaagggctc ataagatgat tctctgcccc 6360 cactctacca ctctcccagc ttctttcctc tttttttttt tttttttttt ttaatgagtt 6420 ggggtcttgc tctttcaccc agtctggagt gtagtggcag gatcacagct cactgcagcc 6480 ttgaactcct ggg 6493 51 5629 DNA Homo sapiens 51 gcgcgaccgt cccgggggtg gggccgggcg cagcggcgag aggaggcgaa ggtggctgcg 60 gtagcagcag cgcggcagcc tcggacccag cccggagcgc agggcggccg ctgcaggtcc 120 ccgctcccct ccccgtgcgt ccgcccatgg ccgccgccgg gcagctgtgc ttgctctacc 180 tgtcggcggg gctcctgtcc cggctcggcg cagccttcaa cttggacact cgggaggaca 240 acgtgatccg gaaatatgga gaccccggga gcctcttcgg cttctcgctg gccatgcact 300 ggcaactgca gcccgaggac aagcggctgt tgctcgtggg ggccccgcgc ggagaagcgc 360 ttccactgca gagagccaac agaacgggag ggctgtacag ctgcgacatc accgcccggg 420 ggccatgcac gcggatcgag tttgataacg atgctgaccc cacgtcagaa agcaaggaag 480 atcagtggat gggggtcacc gtccagagcc aaggtccagg gggcaaggtc gtgacatgtg 540 ctcaccgata tgaaaaaagg cagcatgtta atacgaagca ggaatcccga gacatctttg 600 ggcggtgtta tgtcctgagt cagaatctca ggattgaaga cgatatggat gggggagatt 660 ggagcttttg tgatgggcga ttgagaggcc atgagaaatt tggctcttgc cagcaaggtg 720 tagcagctac ttttactaaa gactttcatt acattgtatt tggagccccg ggtacttata 780 actggaaagg gattgttcgt gtagagcaaa agaataacac tttttttgac atgaacatct 840 ttgaagatgg gccttatgaa gttggtggag agactgagca tgatgaaagt ctcgttcctg 900 ttcctgctaa cagttactta ggtttttctt tggactcagg gaaaggtatt gtttctaaag 960 atgagatcac ttttgtatct ggtgctccca gagccaatca cagtggagcc gtggttttgc 1020 tgaagagaga catgaagtct gcacatctcc tccctgagca catattcgat ggagaaggtc 1080 tggcctcttc atttggctat gatgtggcgg tggtggacct caacaaggat gggtggcaag 1140 atatagttat tggagcccca cagtattttg atagagatgg agaagttgga ggtgcagtgt 1200 atgtctacat gaaccagcaa ggcagatgga ataatgtgaa gccaattcgt cttaatggaa 1260 ccaaagattc tatgtttggc attgcagtaa aaaatattgg agatattaat caagatggct 1320 acccagatat tgcagttgga gctccgtatg atgacttggg aaaggttttt atctatcatg 1380 gatctgcaaa tggaataaat accaaaccaa cacaggttct caagggtata tcaccttatt 1440 ttggatattc aattgctgga aacatggacc 
ttgatcgaaa ttcctaccct gatgttgctg 1500 ttggttccct ctcagattca gtaactattt tcagatcccg gcctgtgatt aatattcaga 1560 aaaccatcac agtaactcct aacagaattg acctccgcca gaaaacagcg tgtggggcgc 1620 ctagtgggat atgcctccag gttaaatcct gttttgaata tactgctaac cccgctggtt 1680 ataatccttc aatatcaatt gtgggcacac ttgaagctga aaaagaaaga agaaaatctg 1740 ggctatcctc aagagttcag tttcgaaacc aaggttctga gcccaaatat actcaagaac 1800 taactctgaa gaggcagaaa cagaaagtgt gcatggagga aaccctgtgg ctacaggata 1860 atatcagaga taaactgcgt cccattccca taactgcctc agtggagatc caagagccaa 1920 gctctcgtag gcgagtgaat tcacttccag aagttcttcc aattctgaat tcagatgaac 1980 ccaagacagc tcatattgat gttcacttct taaaagaggg atgtggagac gacaatgtat 2040 gtaacagcaa ccttaaacta gaatataaat tttgcacccg agaaggaaat caagacaaat 2100 tttcttattt accaattcaa aaaggtgtac cagaactagt tctaaaagat cagaaggata 2160 ttgctttaga aataacagtg acaaacagcc cttccaaccc aaggaatccc acaaaagatg 2220 gcgatgacgc ccatgaggct aaactgattg caacgtttcc agacacttta acctattctg 2280 catatagaga actgagggct ttccctgaga aacagttgag ttgtgttgcc aaccagaatg 2340 gctcgcaagc tgactgtgag ctcggaaatc cttttaaaag aaattcaaat gtcacttttt 2400 atttggtttt aagtacaact gaagtcacct ttgacacccc atatctggat attaatctga 2460 agttagaaac aacaagcaat caagataatt tggctccaat tacagctaaa gcaaaagtgg 2520 ttattgaact gcttttatcg gtctcgggag ttgctaaacc ttcccaggtg tattttggag 2580 gtacagttgt tggcgagcaa gctatgaaat ctgaagatga agtgggaagt ttaatagagt 2640 atgaattcag ggtaataaac ttaggtaaac ctcttacaaa cctcggcaca gcaaccttga 2700 acattcagtg gccaaaagaa attagcaatg ggaaatggtt gctttatttg gtgaaagtag 2760 aatccaaagg attggaaaag gtaacttgtg agccacaaaa ggagataaac tccctgaacc 2820 taacggagtc tcacaactca agaaagaaac gggaaattac tgaaaaacag atagatgata 2880 acagaaaatt ttctttattt gctgaaagaa aataccagac tcttaactgt agcgtgaacg 2940 tgaactgtgt gaacatcaga tgcccgctgc gggggctgga cagcaaggcg tctcttattt 3000 tgcgctcgag gttatggaac agcacatttc tagaggaata ttccaaactg aactacttgg 3060 acattctcat gcgagccttc attgatgtga ctgctgctgc cgaaaatatc aggctgccaa 3120 atgcaggcac tcaggttcga gtgactgtgt ttccctcaaa 
gactgtagct cagtattcgg 3180 gagtaccttg gtggatcatc ctagtggcta ttctcgctgg gatcttgatg cttgctttat 3240 tagtgtttat actatggaag tgtggtttct tcaagagaaa taagaaagat cattatgatg 3300 ccacatatca caaggctgag atccatgctc agccatctga taaagagagg cttacttctg 3360 atgcatagta ttgatctact tctgtaattg tgtggattct ttaaacgctc taggtacgat 3420 gacagtgttc cccgatacca tgctgtaagg atccggaaag aagagcgaga gatcaaagat 3480 gaaaagtata ttgataacct tgaaaaaaaa cagtggatca caaagtggaa cagaaatgaa 3540 agctactcat agcgggggcc taaaaaaaaa aaagcttcac agtacccaaa ctgctttttc 3600 caactcagaa attcaatttg gatttaaaag cctgctcaat ccctgaggac tgatttcaga 3660 gtgactacac acagtacgaa cctacagttt taactgtgga tattgttacg tagcctaagg 3720 ctcctgtttt gcacagccaa atttaaaact gttggaatgg atttttcttt aactgccgta 3780 atttaacttt ctgggttgcc tttgtttttg gcgtggctga cttacatcat gtgttgggga 3840 agggcctgcc cagttgcact caggtgacat cctccagata gtgtagctga ggaggcacct 3900 acactcacct gcactaacag agtggccgtc ctaacctcgg gcctgctgcg cagacgtcca 3960 tcacgttagc tgtcccacat cacaagacta tgccattggg gtagttgtgt ttcaacggaa 4020 agtgctgtct taaactaaat gtgcaataga aggtgatgtt gccatcctac cgtcttttcc 4080 tgtttcctag ctgtgtgaat acctgctcac gtcaaatgca tacaagtttc attctccctt 4140 tcactaaaaa cacacaggtg caacagactt gaatgctagt tatacttatt tgtatatggt 4200 atttattttt tcttttcttt acaaaccatt ttgttattga ctaacaggcc aaagagtctc 4260 cagtttaccc ttcaggttgg tttaatcaat cagaattaga attagagcat gggagggtca 4320 tcactatgac ctaaattatt tactgcaaaa agaaaatctt tataaatgta ccagagagag 4380 ttgttttaat aacttatcta taaactataa cctctccttc atgacagcct ccaccccaca 4440 acccaaaagg tttaagaaat agaattataa ctgtaaagat gtttatttca ggcattggat 4500 attttttact ttagaagcct gcataatgtt tctggattta catactgtaa cattcaggaa 4560 ttcttggaga agatgggttt attcactgaa ctctagtgcg gtttactcac tgctgcaaat 4620 actgtatatt caggacttga aagaaatggt gaatgcctat ggaactagtg gatccaaact 4680 gatccagtat aagactactg aatctgctac caaaacagtt aatcagtgag tcgagtgttc 4740 tattttttgt tttgtttcct cccctatctg tattcccaaa aattactttg gggctaattt 4800 aacaagaact ttaaattgtg ttttaattgt aaaaatggca gggggtggaa 
ttattactct 4860 atacattcaa cagagactga atagatatga aagctgattt tttttaatta ccatgcttca 4920 caatgttaag ttatatgggg agcaacagca aacaggtgct aatttgtttt ggatatagta 4980 taagcagtgt ctgtgttttg aaagaataga acacagtttg tagtgccact gttgttttgg 5040 ggggggcttt ttttcttttt ccggaaaatc cttaaacctt aagatactaa ggacgttgtt 5100 ttggttgtac ttggaattct tagtcacaaa atatattttg tttacaaaaa tttctgtaaa 5160 acaggttata acagtgttta aagtctcagt ttcttgcttg gggaacttgt gtccctaatg 5220 tgttagattg ctagattgct aaggagctga tacttgacag ttttttagac ctgtgttact 5280 aaaaaaaaga tgaatgtcgg aaaagggtgt tgggagggtg gtcaacaaag aaacaaagat 5340 gttatggtgt ttagacttat ggttgttaaa aatgtcatct caagtcaagt cactggtctg 5400 tttgcatttg atacattttt gtactaacta gcattgtaaa attatttcat gattagaaat 5460 tacctgtgga tatttgtata aaagtgtgaa ataaattttt tataaaagtg ttcattgttt 5520 cgtaacacag cattgtatat gtgaagcaaa ctctaaaatt ataaatgaca acctgaatta 5580 tctatttcat caaaaaaaaa aaaaaaaaaa actttatggg cacaactgg 5629 52 4994 DNA Homo sapiens 52 ccgcagcgct cggctggctg cagcggcacc gcgggttgcg cggccgggga tgctccagcg 60 ggcgcgatgg cccccgccat gcagccggcc gagatccaat ttgcccagcg gctggcgtcc 120 agcgagaagg gcatccggga ccgagcggtg aagaagctgc gccagtacat cagcgtgaag 180 acgcagaggg agacaggagg tttcagtcag gaagaacttc tacaggaaga gctcgccaac 240 accattgcac agctagtcca tgctgttaac aactcagcgg ctcaacacct gttcattcag 300 accttttggc aaaccatgaa tcgagaatgg aaaggaatag acaggctacg cctggacaaa 360 tactatatgc tgattcgtct ggtcctgagg cagtcctttg aagtcttgaa gcgaaatggc 420 tgggaagaaa gccgaatcaa ggttttcttg gatgtcctga tgaaggaggt cctgtgtcct 480 gagagtcagt ctcctaatgg agtgagattc cacttcattg atatttacct ggatgaactc 540 tccaaagtcg gggggaagga gcttttagca gatcagaatc tcaagtttat cgatccattc 600 tgcaaaattg ctgcaaagac gaaggaccac accctggtac agaccatagc tcggggtgtc 660 ttcgaagcta tcgtagatca gtctcctttt gtgcctgaag agacgatgga ggaacagaag 720 acaaaagtgg gtgatggtga cctctctgct gaggagatac ctgaaaatga ggtatccttg 780 agaagagctg tcagtaaaaa gaagacagca ctgggcaaaa accattccag aaaagatgga 840 ctcagtgatg aaagaggaag agatgactgt ggaacctttg aggacacagg gccccttctc 
900 cagtttgact ataaggctgt tgctgatcga ctcctggaaa tgaccagcag gaagaacacg 960 ccccacttca acaggaagcg cctctccaaa ctcatcaaga aattccaaga cctttctgaa 1020 ggaagcagta tatctcaact cagttttgcg gaggacattt ctgctgatga agatgaccaa 1080 atcctcagtc aaggaaagca taagaagaaa ggaaataaac ttttagagaa aactaacttg 1140 gaaaaggaga aaggaagcag agtcttttgt gtagaggaag aggacagtga aagcagtctt 1200 caaaagagaa gaaggaagaa gaagaagaag caccacctgc agcctgaaaa tccaggccca 1260 gggggtgcag ccccgtccct ggaacagaac cggggcaggg agcccgaggc ctctgggccg 1320 aaagccctga aggcacgtgt ggccgagcca ggtgcagagg ccacgtccag cactggggag 1380 gagagtggct ccgagcatcc tccagccgtc cccatgcaca ataaaaggaa acggccacgg 1440 aagaagagcc cgagggccca cagggaaatg ttggaatcag cagtgttgcc cccagaggac 1500 atgtctcaga gtggcccgag tggcagtcat cctcagggac ctagagggtc cccgacaggt 1560 ggagcccaac tcctaaaaag gaagcggaaa cttggagttg tgcccgtcaa tggcagtggc 1620 ctgtccacgc cggcctggcc tccattgcag caggaaggcc ctcccacagg ccccgcagag 1680 ggggcgaaca gccacaccac gctgccccag cgcaggaggc tgcagaaaaa gaaggcaggg 1740 cccggcagcc tggagctctg tggcctgccc agccagaaaa cagcaagttt gaaaaagagg 1800 aagaaaatga gagtgatgtc aaacttggtg gagcacaacg gggtgctgga gtccgaagct 1860 gggcaacccc aggctctggg aagcagtggg acttgcagtt ccctgaagaa gcagaagctg 1920 agggcagaga gcgactttgt gaagtttgac acccccttct taccaaagcc cctgttcttc 1980 agaagagcca agagcagcac tgccacccac cctccaggcc ctgccgtcca gctaaacaag 2040 acaccatcca gctccaagaa agtcaccttt gggctgaaca gaaacatgac tgccgaattc 2100 aagaagacag acaagagtat cttggtcagt cccacgggcc cttctcgagt ggccttcgac 2160 cctgaacaga agcccctcca cggggtgctg aagaccccca ccagctcacc tgccagctca 2220 cccctggtgg ccaagaagcc cctgaccacc acaccaagga gaaggcccag ggctatggat 2280 ttcttctgag gagcagcaga gtcccttgta aaagactgct tttgtacaga atgcgctata 2340 aattatacct ttaagaatgt ggggcctttt ttatgatttt gtaagttccc ataagttgtg 2400 tgcacgaggt tctgagagtg cccgcaggct gctgcgtcct ggcccctctg tagtggctgc 2460 gggcgtcttg gttgaatctt ttgctacaaa ccatgtttgc gtttgagctc tccaggattt 2520 tacatttttg ggtaacctca gtgattccca ttggtgtagg aaatgagacc ctctctgaag 2580 
ctgaggagag cacgttgatc tgaactttaa atcaatcagt gctgctggca caatgaaagg 2640 tggaactgca cttgtgttga gctctcagtt ctgcggaatt tggtactcat taccgtattc 2700 gccgtactaa gttggtttct gttagtctta acagtctgtt ttcttttaaa agcatgtagg 2760 gcttcattgc catgttctgt gggtgtttgg caggttaccg atggggaaga ttcttgtcac 2820 agaatcagca ataccatagt ttttctacat gtgctcagct gggggtgtgg acaggtaggg 2880 gtggggaaag aagaggctct gcgttctggg ggctttttct tctcctcccc ctacccggtt 2940 tccctccctg ttttcctacc tctacggcaa gcccaaagtg tcttcccggg agcccagcgc 3000 agcccccggc tcttacccag gaccccgccc cgtgctgagc cttctgctga ggtccttgcg 3060 tggagcacac tcattcctcc aagcccttgc gctcccgttt ctctctctct ccgtccacgt 3120 tccagccgag tcactgcctg cctgaccggc tccatggcag ctccccatct tccctagagg 3180 ctgcctgcgc atctggagcc tgcgctccgg ctcagcgacc tttcctctca aatgcggaag 3240 cgtgcactta cagttcagac cgttctcctg taagttcatt acaaacacgg gcggaaggca 3300 ctcaggcttt cgttggagaa acagaaataa ggccttcttt tgagcagcga ttgctggatc 3360 attgatctgt ttgaggaagt gtctgacctg ggcctgagag ctggagaagg tgcagattca 3420 aagtgagcgg ctcctgagga gagccgccaa ggctgctcgc cttctccgtg gcttccgcag 3480 ctaccgtctg cacggtgaga gggcacgggc acacggttcg ggctggcgtg cagctctccc 3540 agccagccac gctctgctca ggcctggaag tgaaagccgc ctccttcccg ttatgccccc 3600 catacaggag cctcggtttt tcagcaaaac gcggccagtc cccttctcca ctgctgcctc 3660 ccagcagagg gccccaggat ctccaaggtc ccagctatgg ctttggacaa cgtggcttcg 3720 gcccctgggg ttgcagagct tgcattgggt ttacctcggt ctcattcatt catggagcca 3780 agggtggggt ttcacctgcg aacatcagac tgacttgctg gcgtcaagag cagttgactc 3840 actgatgaag gccctggtga ggagaaagca ctctgttctt cgcctactct gtaatcgttt 3900 tgtcataatg agccatgaaa aaagtaatga acttgtgctg ttaatcgtca ctgtaatgag 3960 aagtcttacg tacaacatag ctgtggtggc tgcgtggttt aatggctgca ttagatagga 4020 tcctcacatc ccattcagaa ccaaaactga tacagtgaaa caattaaggt gagcaaatag 4080 ttttaacttt tctttttttt ttaagtttca ttcttcctag aatatttttc taacaatttt 4140 tatttcagct ttaaagatgg gtcatatagc caaacgggcc atataatcca acattgttga 4200 gatgtcttag gacatctaag gcaaaactgg cacatttgtt ctgcagacta ttgcaggaat 4260 gttttttcct 
agcatttcta tattatctgt ccattctgag gaaccagtga atgtcctata 4320 aatgcacctc ctgtcaaaac catgcctgag aggtcccggc tgggagtgac agggtgcttc 4380 ttagattcta ttggtccttc tctcattctc cgaacttact cctttttatg ggtaagtcaa 4440 ctaggtttac agtcccttat ttttaatgcc taagttttga cagcaggaag aaaacaattt 4500 tttaaaaatt ctcattacat agacgcacaa gaatatgtca cataaagaaa atgtgtttag 4560 aatactggtt ttctatttac gcatgatatt ttcctaagta aaattgccaa gtggacttgg 4620 aagtccagaa aggaaaataa tttaaattaa tgctggtgat cttaacaata ttttgtaaaa 4680 tgatgcttcc cccttctcca tggtctagtc aattttgtac aattaggtat ctgactttac 4740 aagtttgtta tcctttctaa tttttactga actgaaagca caaagaagac tacacagaaa 4800 atctggaaac agttgcaggt gttgggagga agatgaaatc gagctgtctt ttaacttttg 4860 tatgtgtttt atcagaattt gctggactat gctggcaagg actttgttta cgatcaaatt 4920 gtactagtgt ctgcagggtt tgtcagtact cgtcaaagcc aagtccaatt aaaaaaaaaa 4980 agtctttgcc ctcc 4994 53 1202 DNA Homo sapiens 53 ggcacgaggc gccatttgct gccgccgagc gtggacgcag gcggatctct gaagagctgg 60 gtcgccagcc tctcccgcgc acgttgcctg gcctccagca cctacttggt cccgcgcgct 120 ccctcgtgtc gcccctcgga gcagcagccg ccgcggtcgc cgctacccgg aaagaagtca 180 gagacgccgc gagtcgccgc caccgccatg cccaagaata aaggtaaagg aggtaaaaac 240 agacgcaggg gtaagaatga gaatgaatct gaaaaaagag aactggtatt caaagaggat 300 gggcaggagt atgctcaggt aatcaaaatg ttgggaaatg gacggctaga agcaatgtgt 360 ttcgatggtg taaagaggtt atgtcacatc agaggaaaat tgagaaaaaa ggtttggata 420 aatacctcgg acattatttt ggttggtctc cgagactacc aggataacaa agctgatgta 480 attttaaaat acaatgcaga cgaagctaga agtctgaagg catacggcga gcttccagag 540 catgctaaaa tcaatgaaac tgatacattt ggtcctggag atgatgatga aattcagttt 600 gatgacattg gagatgatga tgaagatatt gatgacatct aaattgaact caacatttta 660 cattccatct tttctgaaga ttgtcctaca atttggattt tgatcatgac aaagaagatt 720 aaaatttcat tagcatgaat gcaatttgtt aaagcagact gatttgtttc taagatattt 780 ttggtttttt taaaactgat aataatgctg aattatctta agtgagatgt taagcccact 840 ttgttctttt aatgtaatgg agcttatggg tagaagacca tgtctactaa ttacaaaaaa 900 aaaaaaaaac catgattgct gcttttccta ccacttccag taagaaaatg 
ggtgttttga 960 agaaatcatt tgccttgtct cacggaatct gattaagccc tggcctcttg atgtatagag 1020 tcatggatat tccagttacc tagatattcc cttgagattt tgatacaatt tgagggaggc 1080 agaagtctgc agttgaagaa aaaaaataag tctgtttgtc atatttaagt agcctgtgcg 1140 tatttttata ctgattttga tatcatgttc ttttcatagt cgtattttgc caccgtaaac 1200 at 1202 54 1745 DNA Homo sapiens 54 ctgctcgaga aggagctgga gcagagccag aaggaggcct cagaccttct ggagcagaac 60 cggctcctgc aggaccagct gagggtggcc ctgggccggg agcagagcgc ccgtgagggc 120 tacgtgctgc aggccacgtg cgagcgaggg tttgcagcaa tggaagaaac gcaccagaag 180 attgaagatc tccagaggca gcaccagcgg gagctagaga aacttcgaga agagaaagac 240 cgcctcctag ccgaggagac agcggccacc atctcagcca tcgaagccat gaagaacgcc 300 caccgggagg aaatggagcg ggagctggag aagagccagc ggtcccagat cagcagcgtc 360 aactcggatg ttgaggccct gcggcgccag tacctggagg agctgcagtc ggtgcagcgg 420 gaactggagg tcctctcgga gcagtactcg cagaagtgcc tggagaatgc ccatctggcc 480 caggcgctgg aggccgagcg gcaggccctg cggcagtgcc agcgtgagaa ccaggagctc 540 aatgcccaca accaggagct gaacaaccgc ctggctgcag agatcacacg gttgcggacg 600 ctgctgactg gggacggcgg tggggaggcc actgggtcac cccttgcaca gggcaaggat 660 gcctatgaac tagaggtctt attgcgggta aaggaatcgg aaatacagta cctgaaacag 720 gagattagct ccctcaagga tgagctgcag acggcactgc gggacaagaa gtacgcaagt 780 gacaagtaca aagacatcta cacagagctc agcatcgcga aggctaaggc tgactgtgac 840 atcagcaggt tgaaggagca gctcaaggct gcaacggaag cactggggga gaagtcccct 900 gacagtgcca cggtgtccgg atatgatata atgaaatcta aaagcaaccc tgacttcttg 960 aagaaagaca gatcctgtgt cacccggcaa ctcagaaaca tcaggtccaa gagtctgaag 1020 gaaggcctga cggtgcaaga acggttgaag ctctttgaat ccagggactt gaagaaagac 1080 taggtgtgtc ccatccaagt tgagcacgcg ccttccccag cttgcagcag cacaccccaa 1140 gcgctgcttt tcacctgtac ctttgtttta ttattattat tattattgct gttgttgtca 1200 tcgttaactg tgggcatgga atgcgtgagg ctggcttctg ggttgtccac accactctct 1260 gctgtgttga cttcctgttg tcttcaacaa agcttttttc cgtggtattc taaaattagg 1320 ccagcagtgg gggctgggag ggcatctgtg ttagtccttt cctggctgtg acccgccaca 1380 ctcactgtca gtattaaggc ccagcagcct gttgataagc 
taccctgtct caccatgtgc 1440 tggtgtggaa acggggccca gccagcacgc ctcaaggtag atggaatccc cactggtcag 1500 agaaaaagct atgcggacac tccagcttgg cctgggtcac agcactgact cctcacccgc 1560 tagtctggct gttaagagga gaaagtgcac tgccttccag cccaggagga ggacagcatt 1620 ttgtatttgt tccactgatg cagcttagac ccacacccct gagagtcgtg gcaaaccttt 1680 cacaacctgg aaaatgttga aagcaaccat tcctattttt gtttgttttt tattaaatct 1740 tgcac 1745 55 976 DNA Homo sapiens 55 cccggaacct ggcgcaactc ctagagcggt ccttggggag acgcgggtcc cagtcctgcg 60 gctcctactg gggagtgcgc tggtcggaag attgctggac tcgctgaaga gagactacgc 120 aggaaagccc cagccaccca tcaaatcaga gagaaggaat ccaccttctt acgctatggc 180 aggtaagaaa gtactcattg tctatgcaca ccaggaaccc aagtctttca acggatcctt 240 gaagaatgtg gctgtagatg aactgagcag gcagggctgc accgtcacag tgtctgattt 300 gtatgccatg aactttgagc cgagggccac agacaaagat atcactggta ctctttctaa 360 tcctgaggtt ttcaattatg gagtggaaac ccacgaagcc tacaagcaaa ggtctctggc 420 tagcgacatc actgatgagc agaaaaaggt tcgggaggct gacctagtga tatttcagtt 480 cccgctgtac tggttcagcg tgccggccat cctgaagggc tggatggata gggtgctgtg 540 ccagggcttt gcctttgaca tcccaggatt ctacgattcc ggtttgctcc agggtaaact 600 agcgctcctt tccgtaacca cgggaggcac ggccgagatg tacacgaaga caggagtcaa 660 tggagattct cgatacttcc tgtggccact ccagcatggc acattacact tctgtggatt 720 taaagtcctt gcccctcaga tcagctttgc tcctgaaatt gcatccgaag aagaaagaaa 780 ggggatggtg gctgcgtggt cccagaggct gcagaccatc tggaaggaag agcccatccc 840 ctgcacagcc cactggcact tcgggcaata actctgtggc acgtgggcat cacgtaagca 900 gcacactagg aggcccaggc gcaggcaaag agaagatggt gctgtcatga aataaaatta 960 caacatagct acctgg 976 56 3394 DNA Homo sapiens 56 gtcccgagcg ccggcctgcg gagcgtagca gcccgggcca gacgccggag gagggcgcgc 60 aggccttggc cgagttcgcg gcgctgcacg gcccggcgct gcgcgcttcg ggggtccccg 120 aacgttactg gggccgcctc ctgcacaagc tggagcacga ggttttcgac gctggggaag 180 tgtttgggat catgcaagtg gaggaggtag aagaggagga ggacgaggca gcccgggagg 240 tgcggaagca gcagcccaac ccggggaacg agctgtgcta caaggtcatc gtgaccaggg 300 agagcgggct ccaggcagcc caccccaaca gcatcttcct catcgaccac 
gcctggacgt 360 gccgtgtgga gcacgcgcgc cagcagctgc agcaggtgcc cgggctgctg caccgcatgg 420 ccaacctgat gggcattgag ttccacggtg agctgcccag tacagaggct gtggccctgg 480 tgctggagga gatgtggaag ttcaaccaga cctaccagct ggcccatggg acagctgagg 540 agaagatgcc ggtgtggtat atcatggacg agttcggttc gcggatccag cacgcggacg 600 tgcccagctt cgccacggca cccttcttct acatgccgca gcaggtggcc tacacgctgc 660 tgtggcccct gagggacctg gacactggcg aggaggtgac ccgagacttt gcctacggag 720 agacggaccc cctgatccgg aagtgcatgc tgctgccctg ggcccccacc gacatgctgg 780 acctcagctc ttgcacaccc gagccgcccg ccgagcacta ccaggccatt ctggaggaaa 840 acaaggagaa gctgccactt gacatcaacc ccgtggtgca cccccacggc cacatcttca 900 aggtctacac ggacgtgcag caggtggcca gcagcctcac ccacccgcgc ttcaccctca 960 cccagagtga ggcggacgcc gacatcctct tcaacttctc acacttcaag gactacagga 1020 aactcagcca ggagaggcca ggcgtgctgc tgaaccagtt cccctgcgag aacctgctga 1080 ctgtcaagga ctgcctggcc tccatcgcgc gccgggcagg tggccccgag ggcccaccct 1140 ggctgccccg aaccttcaac ctgcgcactg agctgcccca gtttgtcagc tacttccagc 1200 agcgggaaag gtggggcgag gacaaccact ggatctgcaa gccctggaac ctggcgcgca 1260 gcctggacac ccacgtcacc aagagcctgc acagcatcat ccggcaccga gagagcaccc 1320 ccaaggttgt gtccaagtac atcgaaagtc ccgtgttgtt ccttcgagaa gacgtgggaa 1380 aggtcaagtt cgacatccgc tacatcgtgc tgctgcggtc agtgaggccc ctacggttgt 1440 tcgtgtatga tgtgttctgg ctgcggttct ccaaccgggc ctttgcactc aacgacctgg 1500 atgactacga gaagcacttc acggtcatga actatgaccc ggatgtggtg ctgaagcagg 1560 tgcactgtga agagttcatc cccgagtttg agaagcaata cccagaattt ccctggacgg 1620 acgtccaggc tgagatcttc cgggccttca cggagctgtt ccaggtggcc tgtgccaagc 1680 caccacccct gggcctctgc gactacccct catcccgggc catgtatgcc gtcgacctca 1740 tgctgaagtg ggacaacggc ccagatggaa ggcgggtgat gcagccgcag atcctggagg 1800 tgaacttcaa ccccgactgt gagcgagcct gcaggtacca ccccaccttc ttcaacgacg 1860 tcttcagcac cttgtttctg gaccagcccg gtggctgcca cgttacctgc cttgtctagg 1920 cactcgctgt ccccaaaacc tgtgcttggg gcaggattcc aacctcagtt ctctgagctg 1980 cttctgcaaa ggcccccatg tccctcccca caccggccct gggcatagcc tcagccccag 2040 
gcctctgtcc tgccgagcca tcctcccggc gccacactcc gggagcacag catcctcctc 2100 tcacctgtgg gtcagagcag gacagtgatg gtgtccccag ggctgagcac caccccacgc 2160 cctgccctca cccctcacca ccatctgtgc actgatgagt ctccagttta gccaagggct 2220 tcgttcctgg catggagaat ttgttcctgg ctgctgtgtt tccagggggt gctgggggaa 2280 gggttccgtg gagcgagaca aggtgtcctc gggagcaggg ttccaccggg aagcgtttgg 2340 gagccctgta tcacacgggg caggcgggtt tctcttccgg ggtctctgct cttatgcatc 2400 aggacgaccc cgggacggct gtggggcccc acactgcacc cacagggctc tatgcgacag 2460 gggcccagga acagcctgag gccaccaccc agcaagcccg ccttatcacc cattccagct 2520 cacccagaac cttcaccagc aaacctcctg ctgaggtcct ggcaggaggc caccgtcttg 2580 ttaccgtttc cttttcgttt gctgagggtc acagacccca acagggaaat cagtatctgt 2640 cttcccagtg gttgccctgc tcgccgggca ctccacgggg tcccgccctt gtgtgagatg 2700 ggccaggatc cttcggcaag gggcgcctgg ggctggggct gattgtgggc ggtggagcgc 2760 cagacagaaa aggattccaa tgagccccag ccccaggcgc cccttgccga aggatcctgg 2820 ggctggggct gattgtgggc ggtggagcgc cagacagaaa aggattccaa tgagaacttc 2880 aggttaaagt cagatgccac ctaccagggt ctacagtcaa aatgttggct ttttcttatt 2940 ttttaatgta tgggagaaaa atgtaaaatt ccagttcttt tctaattgtg tttctgaaat 3000 taggagtcag ctgccagcgt ttttgtgtgg ctgcagtgtg cctgggccca gctcacgggc 3060 agtgggtgga cctaactgcc caggcaggcg agagctactt ccagagcctt ccagtgcatg 3120 ggagggcagg gctaggtgta gcggtgtctc ctctttgaaa ttaagaacta tctttcttgt 3180 agcaaagctg cacctgatga tgctgcctct cctctctgtg ttgtctgggc ccttgtttac 3240 aagcacgcgt tacccttcct gaggggagcc atgctctagc ccctggaggg cctgttgcag 3300 gggcagggcg ggcccgtcgc ctttggcagc tcctggagag ctgtggacat gcagtccccc 3360 tcagttcgtg ctgcaataaa ggccatcttc tctt 3394 57 1526 DNA Homo sapiens 57 gttttttttt ttttttttaa ttgcaagcat atttctttta atgactccag taaaattaag 60 catcaagtaa acaagtggaa agtgacctac acttttaact tgtctcacta gtgcctaaat 120 gtagtaaagg ctgcttaagt tttgtatgta gttggatttt ttggagtccg aaggtatcca 180 tctgcagaaa ttgaggccca aattgaattt ggattcaagt ggattctaaa tactttgctt 240 atcttgaaga gagaagcttc ataaggaata aacaagttga atagagaaaa cactgattga 300 taataggcat 
tttagtggtc tttttaatgt tttctgctgt gaaacatttc aagatttatt 360 gatttttttt tttcactttc cccatcacac tcacacgcac gctcacactt tttatttgcc 420 ataatgaacc gtccagcccc tgtggagatc tcctatgaga acatgcgttt tctgataact 480 cacaacccta ccaatgctac tctcaacaag ttcacagagg aacttaagaa gtatggagtg 540 acgactttgg ttcgagtttg tgatgctaca tatgataaag ctccagttga aaaagaagga 600 atccacgttc tagattggcc atttgatgat ggagctccac cccctaatca gatagtagat 660 gattggttaa acctgttaaa aaccaaattt cgtgaagagc caggttgctg tgttgcagtg 720 cattgtgttg caggattggg aagggcacct gtgctggttg cacttgcttt gattgaatgt 780 ggaatgaagt acgaagatgc agttcagttt ataagacaaa aaagaagggg agcgttcaat 840 tccaaacagc tgctttattt ggagaaatac cgacctaaga tgcgattacg cttcagagat 900 accaatgggc attgctgtgt tcagtagaag gaaatgtaaa cgaaggctga cttgattgtg 960 ccatttagag ggaactcttg gtacctggaa atgtgaatct ggaatattac ctgtgtcatc 1020 aaagtagtga tggattcagt actcctcaac cactctccta atgattggaa caaaagcaaa 1080 caaaaaagaa atctctctat aaaatgaata aaatgtttaa gaaaagagaa agagaaaagg 1140 aattaattca gtgaaggatg attttgctcc tagttttgga gtttgaattt ctgccaggat 1200 tgaattattt tgaaatctcc tgtcttttta aactttttca aaataggtct ctaaggaaaa 1260 ccagcagaac attagcctgt gcaaaaccat ctgtttgggg agcacactct tccattatgc 1320 ttggcacata gatctccctg tggtgggatt ttttttttcc ctttttttgt gggggagggt 1380 tggtggtata tttttcccct cttttttcct tcctctccta catctccctt ttcccccgat 1440 ccaagttgta gatggaatag aagcccttgt tgctgtagat gtgcgtgcag tctggcagcc 1500 ttaagcccac ctgggcactt ttagat 1526 58 8213 DNA Homo sapiens 58 cccccagcag aagggcgcga cggctgcaac atcagcggtt aaattgtaca gcctttcata 60 ggccggttca atgcatccgt actaagattg ttaaggctga gggtccctag cctggggaaa 120 aacgaaagga ggcagagggt agggagacgg gaaggaagac aaggagggtg tagaaaacgg 180 ggagaggagg gggcgggaca gcatggggaa ggcctcaggt ttactggaga gatcgtggcg 240 ttcccataga aacgtatccc tccgcccatg acccgcgtgt tagtctcttc agttccttcc 300 gcgtcgtttc ttggctgttt ccgcccagct cctttgtgcc gcgcagaaca acgagatgac 360 gcatgcgcaa agcgcagcgg ccgcatatat aaacgcgaac ccgggctctt cctcgtagtg 420 ccgccgggac tcttggcggg tgaaggtgtg tgtcagcttt 
tgcgtcactc gagccctggg 480 cgctgcttgc taaagagccg agcacgcggg tctgtcatca tgtcgcgtta cgggcggtac 540 ggaggaggta agaagctgga gtccggtgag ggacgttggt gtgggtgtag tgagcactgc 600 gaggccgtag ggttgtcgcg gaggttggga gacggttatt ccgcgtgcgt aatggcggct 660 taggagcacg ccagacgaag ccggaggcag cggaggcggg gtgctgaagg gagacgggat 720 ggcgggtgta catctctgcc gagttccgta ctcttgggca tttttgtggc ccaatccagc 780 ctaaagcagg gttgagatga cggttttcgc gttgcctttc tcggagctgc ccgccggccc 840 ccctcccccc ccgccctcgg ccggcggctg ccattttgcg cacattgagg accgtggtgg 900 cgcatttcct cagcgctttc ccgccacttc agcggacaga tctggccgca gctgtaagat 960 cgtggttgtg tttgagatag aacgaaattg gcagctgtga gctgcatgtt ctcgtcaaac 1020 aatcggttaa attgcggaat gggaatgggg acgtaatctg cgactggcgg ctgggttttt 1080 ttttagttat ttccagcgcg gtttatggct ctggggcggg gagctggagt cttgggcgag 1140 cctgtgcctg ggacgtttgc cgcggaggac gagagccggc gcagccctgc tctcctggcc 1200 cggcccctac cgaggccctc ccgccgccga cgcgctgccg ctgcgggccc gcgcgctccc 1260 ggtgcgcccg gggctgccgg gactcatggg tggggccggg ccaggtcccg ccccacgcct 1320 cggtgtatcc taccacgcgt ttctgcttgt gttcgggagg gtcaccccgc attatttaga 1380 acgttaagaa ttttgtcaaa agtctagttt ctcggggatt tgcggacttc accagtttta 1440 cgactaagtt ttgtcttgga tagagggcat taaatgtgct ttacccaatc ttgaggatgg 1500 cccgttttaa ggcaagtaag taattgaaac ttgggccaga ttttgcataa cgtgcattct 1560 tctatttgcg tttttaaaca gaaaccaagg tgtatgttgg taacctggga actggcgctg 1620 gcaaaggaga gttagaaagg gctttcagtt attatggtcc tttaagaact gtatggattg 1680 cgagaaatcc tccaggattt gcctttgtgg aattcgaaga tcctagagat gcagaagatg 1740 cagtacgagg actggatgga aagtaagtaa gatgttatga atcttctgtt cattaaaata 1800 tactgtggct agataatgaa cttagtgcta aatttggatt ctgaagtctg gaagagacct 1860 taaatagctg gtcatagtgt taaatgctaa aggcacacga aggttaaaga agatagcgga 1920 gatggagtta gggcttggta aagaccgcca aagtttgttg ggggggaagg agtggttgga 1980 aagagtgagt ggttggaaag agttcttttt aaatctataa gtcctgaata tatttttaac 2040 tttagaattt tgttaatttg cttttattag ggtgatttgt ggctcccgag tgagggttga 2100 actatcgaca ggcatgcctc ggagatcacg ttttgataga ccacctgccc 
gacgtccctt 2160 tgatccaaat gatagatgct atgagtgtgg cgaaaaggga cattatgctt atgattgtca 2220 tcgttacagc cggcgaagaa gaagcaggta tttattttaa taaaggaatg gttggtattc 2280 tagttaatca agtaattctt ttattagcaa ggcagaaact agtgtttttc tataaacttg 2340 aatgttaatt gtacaggtgt attttacaat ttgtgtttaa ttaaaaaaat gttactatat 2400 taataatcaa cctggtcaaa acctttcagg tttcttcgtt tgagtcagtc gccttgattc 2460 agaatgtcac gagccttatg atatcatgct gaggcgcctt gcaaatccga caattaagat 2520 cctcctagac cttgaggtga tcagcataag aggccagatc ccctcgagtc atctacacct 2580 agcttcacct tattctttaa agggcagaaa atttgagacg gtgatcgccg taacagtaaa 2640 tttggcttac aattggggcc cccctccggt ttagaaagag gaacaccaga ttgaccacat 2700 tcccaactag aaaaatcttc ttgcgtcaat caagcctcac ctggctcatt tggctgtcag 2760 tttgatcgtc gttagattga agaaaacatc tagatgcagc gatcggctat agatacttct 2820 agatcgtcta gatctactag accatgggcc aaagagggtc gacctgcaaa cttgcaaggt 2880 ttatgttaaa tacacattac agtgttttat attatgtaat gctaagttgt aattcagctt 2940 ttaacaaatc tttttttagg tagtaaaaaa aaaaatactc aacaactaat aggcccagag 3000 tttatttcca aatgagacac taaatttaaa tagttttgag atttgatttc agcagaggca 3060 cacaaactct taaaaacgag ttattgtctg acattttgtt ttttctctaa cttgaaaaat 3120 aggtcacggt ctagatcaca ttctcgatcc agaggaaggc gatactctcg ctcacgcagc 3180 aggagcaggg gacgaaggtg agatcttgtt taactgaagt ctttctgtat tattattaaa 3240 ttcactggta gtccaacaca gaaaaagctc attatttttt ttggagacag ggtcttgctc 3300 tgtcacccgg gctggagtac aggggcataa ccacgactca ctgctgcctt gatgatctct 3360 tgggtttaag cagttctcct acctcagcct cccgagtagc tgggactgta ggcactgcca 3420 ccatacccag ctaattttta tttttgtaga aatggtcttg cactgtttcc caggctggtc 3480 tcaagctcct gggctcaaac gatcctcccg cagtgctggg attatgggca tgagccactg 3540 caccgttccc cagttgaagt cttaacaggc caaaaaaaaa aaaaactgtg gagatggact 3600 taaagttctt tattttaggt caaggtcagc atctcctcga cgatcaagat ctatctctct 3660 tcgtagatca agatcagctt cactcagaag atctaggtct ggttctataa aaggatcgag 3720 gtatttccag tatgtaacac tttttttcct tacttgtgtt tggattgttc acatcttatc 3780 agtagagtgt cttaaggaca taattcaaat ggattgcttc agggaatatt tgagatgtaa 
3840 aagtttggaa tttatgtgta acttgtaaca taaatattac cctagtttca cagatgaaga 3900 aaagggctac tagagatttt aaggcttgtt aggccgtgtg gtagacaagg gtcccaagca 3960 atacagctct actcaacact ctgggtaggc atgttgctat aaacttttct ggcttcagat 4020 tggatgatac tagctctgaa agatggtaat tgattttccc gacaaaaagg cctattagca 4080 ccaggaaaag agatcagaag caagtagaaa catttctcat ttttggaatg atggggttga 4140 tttgagacac tggaaagttg actagggcag tagtgtgtac acagaaatga atgtggattt 4200 tttttttaga ccgtttcaga cctgaaaaaa ctaaagaacc agagctttac tatttgtaga 4260 aggccttaaa aggagataga atggaaaaaa ttgtaaaata agtattgcaa catgtaatta 4320 acaatattgt tatctgtacc aacgataaaa ccgtggtacg gaatgctact gggagttaaa 4380 ttgctgttta atagcacaaa acctttaaat gcaggaattc tgaatcttgt ggtctatttg 4440 agaaagctat gaaccatctc tttagataaa tttaaaagat agatatgtca gtctgatttg 4500 gtttgtctga cagattgatg gctctcaaac ataacttgat ccgggaagaa gcctgacaaa 4560 tggggggcgg ctttcttttc gtctggcctt atcacctgaa ttagtctcag ttcaggggtc 4620 tggttatttt catcctgcct tagcctcctg agtagctggg actgccattg tgtaccacag 4680 tgcccagctg agggatctgt gccttaagtg aggttagttt tgcttccttc ataccagtct 4740 catcaaatga aaaccatgta tttcccttgg atattacaca gtgtttgaga atgttatacc 4800 tgtacagaaa ctaaccaatt gagtgataga aacaagtaat tgaaatgggg gttccttatg 4860 tctggtaaca ctttgtttga cagtgtgtta gacagaataa ggcaagtgtt gcatcttgtt 4920 tagttttagc ttctttatgc ctgaccaacc taatacagtg ttgagtagtt aaggaaattc 4980 ctttggactg attgatataa ttgtgttttt tcactttttt tattaagatc cccgtcgagg 5040 tcaagatcaa gatccaggtc tatttcacga ccaagaagca ggtagggtaa aaatttgatt 5100 atccttttct agttatatgg caccaatatc caaagagttc aaagtgtttt taattgttga 5160 aattttaagt gttaactcta aacttaggtt ttagtgggaa cacagtacct tatttgtgta 5220 tgtcctattt attactggct gactttccct gaacaaggga atgtaaaact atagtgagaa 5280 agaagcttat gacttggggg attatattaa agaggccctt gttagaactg ataggtgcat 5340 ggagaagcat cctgaaatcg atgtgcttaa agcagaatgt aaaagattaa tcatgatgta 5400 gtaattgagt cattttttga aaaacagttg ttgaaagatt ggcttttgtt agcaacaact 5460 ggtaggatgt ttttcagttt aagtgcagtc tgacatttta agcttaggac atttgggggt 5520 
tttacggtat tggtgactac aagaaaggga ttggttagta ctctttcttt aatagaattt 5580 ctcatgtttt gacagccgat caaagtccag atctccatct ccaaaaagaa ggtaagctaa 5640 atgttttgtt gccaaatctt gcctgtcaag tgtggcctct gcagaatttg tttgcttact 5700 gctttgcagt ctttgagctc tttggagaat tggtgctata tagattaaaa tactatgcta 5760 agtttctgaa atactttttt tttttgattc agtaacatta gtttatactt ttgctggaaa 5820 tacttagtca taaaatgtta gggtgattat taagatgtga ttggtcctgt gagtacttgg 5880 tagaaatttt ggtaagatag atgccttttc cccacatgta caatagatac aaagtgtgga 5940 gaaaagtctt ggaaatagtt acctgcctag tgcttcttta tgaccagaaa acttcaaata 6000 gttgtcatat ttatctagtg cttcttaatg accagaagac ttcaaatagt tgtcatattt 6060 aactgcaggt tgaccttgca attttgacaa ggaggatagc ctaatttttt tttttttctg 6120 ggatggagtt ttcgctctgt ccccaggctt ggagtgcagt ggctcaatct tggctcactg 6180 cagcctccga ttcccgggtt caagcaatta tcctgtctca gcctcttgag cagttgggat 6240 tacaggcacc caccgccaag cctggctaat tttttgtatt tctagtagag acggagtttc 6300 accatgttgg cgaggttggt cttaaactcc tgatcttagg tgatcacctg cctcggcctc 6360 tcccaaagtg ctggggttac aggcgtgagc caccgtgcct ggccagggta gcctaatctt 6420 aagccaggga caaaagatga atatatgtaa gtttcatgtc atttttaggt ctttgctata 6480 ggaaattagt accttaggcc acctttgaag ttattgaaag ttagtacatg tacatgagag 6540 tttcaattga cactaattgg atccaaacct aatgtttttc tttttagtcg ttccccatca 6600 ggaagtcctc gcagaagtgc aagtcctgaa agaatggact gaagctctca agttcaccct 6660 ttagggaaaa gttattttgt ttacattatt ataagggatt tgtgatgtct gtaaagtgta 6720 acctaggaaa gataattcaa ccatctaatc aaaatggatc tggattacta tgtaaattca 6780 cagcagtaag gataatataa attttgttga atgtatgaac atcatatggt ctgaaaatgt 6840 gggtttttat ttggcacatt taaataacat gtttctaact agatttttga tttgtgttca 6900 atattaacac ttcttaattt gatatatttg agagtcagac attataattg ttaatcctta 6960 ttcatacata cctacattca gaattgaaag gtgttggtta agtcttgaac atcactattc 7020 tatgcataaa acttggccag gatcttaagg gactttgaaa attccatctt acccttgtag 7080 ctctgggtaa gatgacctga gtcccttatg atacagcctg aatgcatcat gacagatcct 7140 tagttagcta atccgtttga agttggtgtt agtaggtatt gtatgatcag tggtgaagca 7200 agtaggacca 
ctgatgtgtc taaatgagca tgacaggaac taaacgaaac tgattaaatg 7260 tatgagaaat agaaactgat ttctggatga tctttatact aattgcagct ttcaggctac 7320 taggtggcat agtgttaatt aggactcccc aagatatggg gagttctact ctcaatggtc 7380 ttgtttcttt gctttctaca ttagttaacc agttttatac caaaaaatgc atgtttgagg 7440 aattgtctga aattgggaca aaacaccttc atgtaaacca gctttgcaaa attttccagc 7500 ccagatactc ttcatctatt caaatggatt gtcttattct gagcaaagac ctgttgttaa 7560 tcttcaagct aggttttgca gttcccaacc acaacattct tctattttgc caggctggtg 7620 caaagtaatt aaagatgtca atcagaaatg tcaatgagac taaagtggtt ttgtaaatct 7680 cagctatatt tagcaacact ccatgtagct aatatttttt ggtagcatct ggtagacctt 7740 agaatgttac atagccagta ggttctttat tcaaatttta agtatcttaa gaatagtagg 7800 gcagtaacag ttacttttga gagttttctg gtcaagcttt taccaggcat tctctagcct 7860 tggtacaaaa aaaaaaaaaa cctgctggtt gcgcagatac ctaggcttgt ccattttatg 7920 catttcagca aagtcattgg agactattgc aacttgggaa tactggtctg catcaagttt 7980 aattcggtag tttgaccgct agtatgttgg aagttatttg gattgttttt ggaattttga 8040 ctggctgaat tatggttggt ataaagttat gtgtataact ggcaggctta tttatctgtt 8100 gcacttggtt agctttaatt gttctgtatt atttaaagat aagtttactc aacaataaat 8160 ctgcagagat tgaacaaata atcctgatac ttaatttttg gaagtgggag ctc 8213 59 2042 DNA Homo sapiens 59 gcgcctgtca gggaagcggc gcgcgcgcgc gggcggcggg cgggctgggg atccgccgcg 60 cagtgccagc gccagcgcca gacccgcgcc ccgcgctctc cggcccgtcg cctgccttgg 120 gactcgcgag cccgcactcc cgccctgcct gttcgctgcc cgagtatgga gctgctgtgt 180 tgcgaaggca cccggcacgc gccccgggcc gggccggacc cgcggctgct gggggaccag 240 cgtgtcctgc agagcctgct ccgcctggag gagcgctacg taccccgcgc ctcctacttc 300 cagtgcgtgc agcgggagat caagccgcac atgcggaaga tgctggctta ctggatgctg 360 gaggtatgtg aggagcagcg ctgtgaggag gaagtcttcc ccctggccat gaactacctg 420 gatcgctacc tgtcttgcgt ccccacccga aaggcgcagt tgcagctcct gggtgcggtc 480 tgcatgctgc tggcctccaa gctgcgcgag accacgcccc tgaccatcga aaaactgtgc 540 atctacaccg accacgctgt ctctccccgc cagttgcggg actgggaggt gctggtccta 600 gggaagctca agtgggacct ggctgctgtg attgcacatg atttcctggc cttcattctg 660 caccggctct 
ctctgccccg tgaccgacag gccttggtca aaaagcatgc ccagaccttt 720 ttggccctct gtgctacaga ttataccttt gccatgtacc cgccatccat gatcgccacg 780 ggcagcattg gggctgcagt gcaaggcctg ggtgcctgct ccatgtccgg ggatgagctc 840 acagagctgc tggcagggat cactggcact gaagtggact gcctgcgggc ctgtcaggag 900 cagatcgaag ctgcactcag ggagagcctc agggaagcct ctcagaccag ctccagccca 960 gcgcccaaag ccccccgggg ctccagcagc caagggccca gccagaccag cactcctaca 1020 gatgtcacag ccatacacct gtagccctgg agaggccctc tggagtggcc actaagcaga 1080 ggaggggccg ctgccaccca cctccctgcc tccaggaacc acaccacatc taagcctgaa 1140 ggggcgtctg ttcccccttc acaaagccca agggatctgg tcctacccat ccccgcagtg 1200 tgcactaagg ggcccggcca gccatgtctg catttcggtg gctagtcaag ctcctcctcc 1260 ctgcatctga ccagcagcgc ctttcccaac tctagctggg ggtgggccag gctgatggga 1320 cagaattgga tacatacacc agcattcctt ttgaacgccc cccccccacc cctgggggct 1380 ctcatgtttt caactgccaa aatgctctag tgccttctaa aggtgttgtc ccttctaggg 1440 ttattgcatt tggattgggg tccctctaaa atttaatgca tgatagacac atatgagggg 1500 gaatagtcta gatggctcct ctcagtactt tggaggcccc tatgtagtcc gtgctgacag 1560 ctgctcctag agggaggggc ctaggcctca gccagagaag ctataaattc ctctttgctt 1620 tgctttctgc tcagcttctc ctgtgtgatt gacagctttg ctgctgaagg ctcattttaa 1680 tttattaatt gctttgagca caactttaag aggacataat gggggcctgg ccatccacaa 1740 gtggtggtaa ccctggtggt tgctgttttc ctcccttctg ctactggcaa aaggatcttt 1800 gtggccaagg agctgctata gcctggggtg gggtcatgcc ctcctctccc attgtccctc 1860 tgccccatcc tccagcaggg aaaatgcagc agggatgccc tggaggtggc tgagcccctg 1920 tctagagagg gaggcaagcc ctgttgacac aggtctttcc taaggctgca aggtttaggc 1980 tggtggccca ggaccatcat cctactgtaa taaagatgat tgtgaaataa aactggcttt 2040 gg 2042 60 1783 DNA Homo sapiens 60 cctctcggag ctggaaatgc agctattgag atcttcgaat gctgcggagc tggaggcgga 60 ggcagctggg gaggtccgag cgatgtgacc aggccgccat cgctcgtctc ttcctctctc 120 ctgccgcctc ctgtgtcgaa aataactttt ttagtctaaa gaaagaaaga caaaagtagt 180 cgtccgcccc tcacgccctc tcttcctctc agccttccgc ccggtgagga agcccggggt 240 ggctgctccg ccgtcggggc cgcgccgccg agccccagcg ccccgggccg cccccgcacg 
300 ccgcccccat gcatcccttc tacacccggg ccgccaccat gataggcgag atcgccgccg 360 ccgtgtcctt catctccaag tttctccgca ccaaggggct gacgagcgag cgacagctgc 420 agaccttcag ccagagcctg caggagctgc tggcagaaca ttataaacat cactggttcc 480 cagaaaagcc atgcaaggga tcgggttacc gttgtattcg catcaaccat aaaatggatc 540 ctctgattgg acaggcagca cagcggattg gactgagcag tcaggagctg ttcaggcttc 600 tcccaagtga actcacactc tgggttgacc cctatgaagt gtcctacaga attggagagg 660 atggctccat ctgtgtgctg tatgaagcct caccagcagg aggtagcact caaaacagca 720 ccaacgtgca aatggtagac agccgaatca gctgtaagga ggaacttctc ttgggcagaa 780 cgagcccttc caaaaactac aatatgatga ctgtatcagg ttaagatata gtctgtggat 840 ggatcatctg atgatgatcc ataaatttga tttttgcttt gggtgggctc ctcttgggga 900 tggattatgg aatttaaacc atgtcacagc tgtgaagatc tggcacaaga tagaatggta 960 aaaaaaaaaa aaaattttaa gtgacagtgc catagtttgg acagtacctt tcaatgatta 1020 attttaatag cctgtgagtc caagtaaatg atcactttat ttgctaggga gggaagtcct 1080 agggtggttt cagtttctcc cagacatacc taaattttta catcaatcct tttaaagaaa 1140 atctgtattt caaagaatct ttctctgcag taaatctcgc aggggaattt gcactattac 1200 acttgaaagt tgttattgtt aaccttttcg gcagctttta ataggaaagt taaacgtttt 1260 aaacatggta gtactggaaa ttttacaaga cttttaccta gcacttaaat atgtataaat 1320 gtacataaag acaaactagt aagcatgacc tggggaaatg gtcagacctt gtattgtgtt 1380 tttggccttg aaagtagcaa gtgaccagaa tctgccatgg caacaggctt taaaaaagac 1440 ccttaaaaag acactgtctc aactgtggtg ttagcaccag ccagctctct gtacatttgc 1500 tagcttgtag ttttctaaga ctgagtaaac ttcttatttt tagaaagtgg aggtctggtt 1560 tgtaactttc cttgtactta attgggtaaa agtcttttcc acaaaccacc atctattttg 1620 tgaactttgt tagtcatctt ttatttggta aattatgaac tggtgtaaat ttgtacagtt 1680 catgtatatt gattgtggca aagttgtaca gatttctata ttttggatga gaaatttttc 1740 ttctctctat aataaatcgt ttcttatctt ggcattttta acc 1783 61 1433 DNA Homo sapiens 61 ttggacagcc cgggcaacct cgacaccctg caggcgaaaa agaacttctc cgtcagtcac 60 ctgctagacc tggaggaagc cggggacatg gtggcggcac aggcggatga gaacgtgggc 120 gaggctggcc ggagcctgct ggagtcgccg ggactcacca gcggcagcga caccccgcag 180 caggacaatg 
accagctgaa ctcagaagaa aaaaagaaga gaaagcagcg aaggaatagg 240 acaaccttca atagcagcca gctgcaggct ttggagcgtg tctttgagcg gacacactat 300 cctgatgctt ttgtgcgaga agaccttgcc cgccgggtga acctcaccga ggcgagagtg 360 caggtgtggt ttcagaaccg aagagccaag ttccgcagga atgagagagc catgctagcc 420 aataaaaacg cttccctcct caaatcctac tcaggagacg tgactgctgt ggagcagccc 480 atcgtacctc gtcctgctcc gagacccacc gattatctct cctgggggac agcgtctccg 540 tacagatcct cgtccctccc aagatgttgt ttacacgagg ggcttcataa cggattctaa 600 cggaagacac tgaaaagcgc catggctact tattctgcca catgtgccaa caatagccct 660 gcacagggca tcaacatggc caacagcatt gccaacctga gactgaaggc caaggaatat 720 agtttacaga ggaaccaggt gccaacagtc aactgaggaa aaaaaataat taaacaggcc 780 taagaagaaa tcaaaaacca taagacacct atcctgctct gttatttctt catctgctgg 840 ggggaaaaag taaattacaa acaaacaaac aaagcagaac taaaatattg ggaccatggc 900 agagaaaagc aggagaggag caaaatgaaa attagttaac aaatgttcct cctcctctgg 960 gataccacca ccacttgttt ctgtgtgtgt ttattttgtt tttctttcat tcatgctttg 1020 cttaatgtac tccaggcttc ttcagctagg ttcagcccac ccacccccat gcttgtaatc 1080 ccagtgcttt gggaggccaa ggcaggtgga tcacctgagg tcaggagttc gagactagcc 1140 tgttccactg acatttctta gacattcagc aaaaccccca ccttaacctc ttttctttct 1200 tgagggttgg tcctgtcccc acctccaccc tcccaccccc tggaagagga agggcccggg 1260 catcagtggc tagtccaaat aaaatatggg cttggggatg gaatgggtgg tggtaagttc 1320 acagagtgta gttagatccc aactcccatg acctctggct tcagtggtgg gtggggcagg 1380 gcagatgaaa gggcttcagt gggaacctct gagagcattt tcctgttccc aat 1433 62 643 DNA Homo sapiens 62 ggtagcgacg gtagctctag ccgggcctga gctgtgctag cacctccccc aggagaccgt 60 tgcagtcggc cagccccctt ctccacggta accatgtgcg accgaaaggc cgtgatcaaa 120 aatgcggaca tgtcggaaga gatgcaacag gactcggtgg agtgcgctac tcaggcgctg 180 gagaaataca acatagagaa ggacattgcg gctcatatca agaaggaatt tgacaagaag 240 tacaatccca cctggcattg catcgtgggg aggaacttcg gtagttatgt gacacatgaa 300 accaaacact tcatctactt ctacctgggc caagtggcca ttcttctgtt caaatctggt 360 taaaagcatg gactgtgcca cacacccagt gatccatcca gaaacaagga ctgcagccta 420 aattccaaat accagagact 
gaaattttca gccttgctaa gggaacatct cgatgtttga 480 acctttgttg tgttttgtac agggcattct ctgtactagt ttgtcgtggt tataaaacaa 540 ttagcagaat agcctacatt tgtatttatt ttctattcca tacttctgcc cacgttgttt 600 tctctcaaaa tccattcctt taaaaaataa atctgatgca ccg 643 63 4792 DNA Homo sapiens 63 ctcaaatatg tggatgacat acagaaggga aataccatca aaagactgaa catccagaag 60 aggcggaagc cgtccgtgcc atgcccagaa cccaggacca catctggtca gcaaggtata 120 tggacttcca ctgaatccct ctcatcctcc aacagtgatg acaacaagca gtgccccaac 180 ttcctcatag ccagaagtca agttacatca actccaatct caaagccacc tccccctctg 240 gagacctcac tcccttttct taccatccca gaaaatcgac agctgccacc tccctcacca 300 caactcccaa agcataacct tcatgtcacc aagacactga tggagacccg gagaagactg 360 gaacaggaga gagccaccat gcagatgaca ccgggtgagt tcagaaggcc caggctggcc 420 agttttggag gcatgggcac cacaagctcc ctcccttctt ttgtgggttc tggaaaccac 480 aatcctgcca agcaccagct tcagaatgga taccaaggta atggggatta tggtagctat 540 gccccagctg ctcccaccac ttcctccatg gggagctcca tccgccacag ccccctgagc 600 tcagggatct ccaccccagt gaccaacgtg agccccatgc acctgcagca catccgcgag 660 cagatggcca ttgctctgaa acgcctgaag gagctggagg agcaggtgcg aaccatccct 720 gtgctccagg taaagatctc tgtcttgcaa gaagagaaaa ggcagttggt ctcacagctg 780 aaaaaccaaa gggctgcatc ccagatcaat gtctgtggtg tgaggaagcg gtcctatagt 840 gcggggaacg cctcccagct ggaacagctc tcccgggccc gaagaagtgg cggggaatta 900 tacattgact atgaggagga agaaatggag accgtagaac agagcacgca gaggataaag 960 gagttccggc aacttacagc agacatgcaa gccctggagc agaagatcca ggacagcagc 1020 tgtgaggcct cctcagagct cagggagaat ggagagtgcc ggtctgtggc tgtgggtgcc 1080 gaggagaaca tgaacgacat cgtcgtgtac cacagaggct ccaggtcctg taaggatgca 1140 gctgtaggga cacttgttga gatgagaaat tgtggggtca gcgtgacaga ggccatgctt 1200 ggagtgatga ctgaagctga caaagaaatt gagctccaac agcagaccat agaagccttg 1260 aaggaaaaga tctatcgcct agaagtacag cttagagaaa ccacccatga ccgggagatg 1320 actaaactga aacaagagct gcaggctgct ggatcgagga aaaaggttga caaagccacg 1380 atggcccagc cgcttgtttt cagtaaggtg gtggaggcag tggtgcagac cagagaccaa 1440 atggtcggca gtcacatgga cctggtggac acgtgtgttg 
ggacctccgt ggaaacaaac 1500 agtgtaggca tctcctgcca gcctgaatgt aagaataaag tcgtagggcc tgagctgcct 1560 atgaattggt ggattgttaa ggagagggtg gaaatgcatg accgatgtgc tgggaggtct 1620 gtggaaatgt gtgacaagag tgtgagtgtg gaagtcagcg tctgcgaaac aggcagcaac 1680 acagaggagt ctgtgaacga cctcacactc ctcaagacaa acttgaatct caaagaagtg 1740 cggtctatcg gttgtggaga ttgttctgtt gacgtgaccg tctgctctcc aaaggagtgc 1800 gcctcccggg gcgtgaacac tgaggctgtt agccaggtgg aagctgccgt catggcagtg 1860 cctcgtactg cagaccagga cactagcaca gatttggaac aggtgcacca gttcaccaac 1920 accgagacgg ccaccctcat agagtcctgc accaacactt gtctaagcac tttggacaag 1980 cagaccagca cccagactgt ggagacgcgg acagtagctg taggagaagg ccgtgtcaag 2040 gacatcaact cctccaccaa gacgcggtcc attggtgttg gaacgttgct ttctggccat 2100 tctgggtttg acaggccatc agctgtgaag accaaagagt caggtgtggg gcagataaat 2160 attaacgaca actatctggt tggtctcaaa atgaggacta tagcttgtgg gccaccacag 2220 ttgactgtgg ggctgacagc cagcagaagg agcgtggggg ttggggatga ccctgtaggg 2280 gaatctctgg agaaccccca gcctcaagct ccacttggaa tgatgactgg cctggatcac 2340 tacattgagc gtatccagaa gctgctggca gaacagcaga cactgctggc tgagaactac 2400 agtgaactgg cagaagcttt cggggaacct cactcacaga tgggctccct caactctcag 2460 ctcatcagca ccctgtcgtc tatcaactct gtcatgaaat ctgcaagcac tgaagagctg 2520 aggaaccctg acttccagaa aaccagtctg ggtaaaatca caggcaatta tttgggatat 2580 acctgtaagt gtgggggcct tcagtcagga agtcccttaa gctcccagac atcccagcct 2640 gagcaagaag tggggacctc agaaggaaag ccaatcagca gcctggatgc cttccccact 2700 caggaaggta cgctgtctcc agtgaacctg acagacgacc agatcgccgc tggcctctat 2760 gcatgtacaa acaatgaaag tacactgaag tccatcatga agaagaaaga tggtaacaaa 2820 gattcaaatg gcgcaaaaaa gaatcttcag tttgttggca ttaatggagg gtatgaaaca 2880 acttcaagtg atgattccag ctcagatgaa agctcttctt ccgagtcaga tgacgagtgt 2940 gatgtcattg agtatcctct tgaagaagag gaggaggagg aggatgaaga cactcgggga 3000 atggcagaag ggcaccatgc agttaatatt gaaggtttga agtctgccag ggtggaagat 3060 gaaatgcagg ttcaagaatg tgaacctgag aaggtggaaa tcagagagag gtatgaatta 3120 agtgaaaaga tgttgtctgc atgcaactta ctgaaaaata ctataaatga 
ccccaaagct 3180 ttgaccagca aagatatgag gttctgtctg aacaccctcc agcacgagtg gttccgcgtg 3240 tccagtcaga agtcagccat tccagccatg gtgggggact acatagctgc ttttgaggcc 3300 atttccccag atgtcctccg ctatgtcatc aacttggcag acggcaacgg caacacagcc 3360 ctccattaca gcgtgtccca ctccaacttc gagattgtga agctgctgtt agatgccgat 3420 gtgtgtaatg tggatcacca gaacaaggca ggctacaccc ccatcatgtt ggcggccctc 3480 gccgctgtgg aagcagagaa ggacatgcgg attgtggaag aactctttgg ctgtggggat 3540 gtgaatgcca aagctagtca ggcgggacag acggccctca tgctggcggt cagtcacgga 3600 cggatagaca tggtgaaggg ccttctggcc tgtggggctg atgtcaacat ccaggatgac 3660 gagggctcca cggccctcat gtgtgccagc gagcacgggc acgtggagat tgtcaagctg 3720 ctgctggccc agcccggctg caacggtcac ctagaggaca acgatggcag cactgcgctc 3780 tcaatcgccc tggaagcagg acacaaggac atcgctgttc ttctgtatgc ccatgtcaac 3840 tttgcaaaag cccagtctcc gggcacccct aggcttggaa ggaagacgtc tcctggcccc 3900 acccaccgag gttcatttga ttgattgtat gcaaatagcc ctttatttac atgccactat 3960 taagctgcta attgttcctg ttggggtgac agatactgaa tgtatacgta ttgtgcctga 4020 gctcaccagc aaacagaagc atcaagccca ggggtaaagg ctgaagcttt cacagtgcag 4080 agactgctag cctgggcaca cgcacctcct ttctggccgt cttctgtgta gggcacactt 4140 taacccagtc tctgttgctg ttgagtctct gctccgtttt gtacagtcac agggaattct 4200 gatctgaagg ggcaccttct gttcactccc acaaagtggt gtctggttct cactgagacg 4260 ttttaagatt tttccacaaa tatttatatg tactaaatgt ggaaccatta gaaagttctt 4320 ccaaaatctc attccagcat agttttggat ttttcttttg tcttatttta aaataaggaa 4380 gtcgagatga ctttgatcat tggtaacttg ggcctgggcc agacaaagta taaaacttac 4440 aaaagaatat tctcatttgg tcttaactag gtagatgtaa tatatgactt tttataaaaa 4500 gggtatctat atgaacttga cacagtattt tcagcttttg tattccatac taaagccatg 4560 aagaactaca cgtaacatca tcatttgtat taattgcaca actccaatgc taaaggttgg 4620 attgtgttag aggaatcggc tctgtatttg cctctagaga aacacagtgt tctctttgta 4680 tttatggatt cctttttacc gtgtcacatt tactttggtc ctctatgtat ttaaatgttt 4740 gaagtgcctt agactcttgc catattttca aaataaaatt ccattaagct ct 4792 64 2199 DNA Homo sapiens 64 gtcgccgctg ccgggttgcc agcggagtcg cgcgtcggga 
gctacgtagg gcagagaagt 60 catggcttct ccgtccaaag gcaatgactt gttttcgccc gacgaggagg gcccagcagt 120 ggtggccgga ccaggcccgg ggcctggggg cgccgagggg gccgcggagg agcgccgcgt 180 caaggtctcc agcctgccct tcagcgtgga ggcgctcatg tccgacaaga agccgcccaa 240 ggaggcgtcc ccgctgccgg ccgaaagcgc ctcggccggg gccaccctgc ggccactgct 300 gctgtcgggg cacggcgctc gggaagcgca cagccccggg ccgctggtga agcccttcga 360 gaccgcctcg gtcaagtcgg aaaattcaga agatggagcg gcgtggatgc aggaacccgg 420 ccgatattcg ccgccgccaa gacatacgag ccctaccacc tgcaccctga ggaaacacaa 480 gaccaatcgg aagccgcgca cgccctttac cacatcccag ctcctcgccc tggagcgcaa 540 gttccgtcag aaacagtacc tctccattgc agagcgtgca gagttctcca gctctctgaa 600 cctcacagag acccaggtca aaatctggtt ccagaaccga agggccaagg cgaaaagact 660 gcaggaggca gaactggaaa agctgaaaat ggctgcaaaa cctatgctgc cctccagctt 720 cagtctccct ttccccatca gctcgcccct gcaggcagcg tccatatatg gagcatccta 780 cccgttccat agacctgtgc ttcccatccc gcctgtggga ctctatgcca cgccagtggg 840 atatggcatg taccacctgt cctaaggaag accagatcaa tagactccat gatggatgct 900 tgtttcaaag ggtttcctct ccctctccac gaaggcagta ccagccagta ctcctgctct 960 gctaaccctg cgtgcaccac cctaagcggc taggctgaca gggccacacg acatagctga 1020 aatttcgttc tgtaggcgga ggcaccaagc cctgttttct tggtgtaatc ttccagatgc 1080 ccccttttcc tttcacaaag attggctctg atggttttta tgtataaata tatatatata 1140 ataaaatata atacattttt atacagcaga cgtaaaaatt caaattattt taaaaggcaa 1200 aatttatata catatgtgct ttttttgtat atctcacctt cccaaaagac actgtgtaag 1260 tccatttgtt gtattttctt aaagagggag acaaattatt tgcaaaatgt gctaaagtca 1320 atgattttta cgggattatt gacttctgct tatggaaaac aaagaaacag acacagtgca 1380 cacagaaaat attagatatg gagagattat tcaaagtgaa ggggacacat catatttctg 1440 cattttactt gcattaaaag aaacctcttt atatactaca gttgttccta tttttccccc 1500 gccccccacc gccccaccac acacatattt ttaaagtttt tcctttttta agaatatttt 1560 tgtaagacca atacctggga tgagaagaat cctgagactg cctggaggtg aggtagaaaa 1620 ttagaaatac ttcctaattc ttctcaaggc tgttggtaac tttatttcag ataattggag 1680 agtaaaatgt taaaacctgt tgagaggaat tgatggtttc tgagaaatac taggtacatt 1740 
catcctcaca gattgcaaag gtgatttggg tgggggttta gtaattttct gcttaaaaaa 1800 tgagtatctt gtaaccatta cctatatgct aaatattctt gaacaattag tagatccaga 1860 aagaaaaaaa aaatatgctt tctctgtgtg tgtacctgtt gtatgtccta aacttattag 1920 aaaattttat atactttttt acatgttggg gggcagaagg taaagccatg ttttgacttg 1980 gtgaaaatgg ggttgtcaaa cagcccatta agctccctgg tatttcacct tcctgtccat 2040 ctctcccctc cctccggtat acctttatcc ctttgaaagg gtgcttgtac aatttgatat 2100 attttattga agagttatct cttattctga attaaattaa gcatttgttt tattgcagta 2160 aagtttgtcc aaactcacaa ttaaaaaaaa aaaaaaaaa 2199 65 1496 DNA Homo sapiens 65 tcactaaagg gaacaaaagc tggagctcca ccgcggtggc ggcccctcag aactagtgga 60 tcccccgggc tgcaaggaat tcggcacgag cgcgcgtcct gcccgtctgt ccccgcgggg 120 gtcgcccgcc acagcccgcg gaatgaccac ccagcagata gacctccagg gcccggggcc 180 gtggggcttc cgcctcgtgg ggcgaaagga cttcgagcag cctctcgcca tttcccgggt 240 cactcctgga agcaaggcgg ctctagctaa tttatgtatt ggagatgtaa tcacagccat 300 tgatggggaa aatactagca atatgacaca cttggaagct cagaacagaa tcaaaggctg 360 cacagacaac ttgactctca ctgtagccag atctgaacat aaagtctggt ctcctctggt 420 gacggaggaa gggaagcgtc atccatacaa gatgaattta gcctctgaac cccaggaggt 480 cctgcacata ggaagcgccc acaaccgaag tgccatgccc tttaccgcct cgcctgcctc 540 cagcactact gccagggtca tcacaaacca gtacaacaac ccagctggcc tctactcttc 600 tgaaaatatc tccaacttca acaatgccct ggagtcaaag actgctgcca gcggggtgga 660 ggcgaacagc agacccttag accatgctca gcctccaagc agccttgtca tcgacaaaga 720 atctgaagtt tacaagatgc ttcaggagaa acaggagttg aatgagcccc cgaaacagtc 780 cacgtctttc ttggttttgc aggaaatcct ggagtctgaa gaaaaagggg atcccaacaa 840 gccctcagga ttcagaagtg ttaaagctcc tgtcactaaa gtggctgcgt cgattggaaa 900 tgctcagaag ttgcctatgt gtgacaaatg tggcactggg attgttggtg tgtttgtgaa 960 gctgcgggac cgtcaccgcc accctgagtg ttatgtgtgc actgactgtg gcaccaacct 1020 gaaacagaag ggccatttct ttgtggagga tcaaatctac tgtgagaagc atgcccggga 1080 gcgagtcaca ccacctgagg gttatgaagt ggtcactgtg ttccccaagt gagccagcag 1140 atctgaccac tgttctccag caggcctctg ctgcagcttt tctctcagtg ttctggccct 1200 ctcctctctt gaaagttctc 
tgcttacttt ggttttccct ctgcttgtaa aacattgagg 1260 cccctccctg ccttggttaa ttgactcaca ccagctgtgg gatgcccgct tttacaatta 1320 aaggaaaact gttgtgttca gtgtcacctt gtcagcaaca ctgtgtccct tcgcccgccg 1380 ttcttctctg ctgcatttgg acatcagcca aatttgaacc caatcaaata taacgtgtct 1440 gacactgatt ttgtttttac tcaataaatg tatagactac aaaaaaaaaa aaaaaa 1496 66 5421 DNA Homo sapiens 66 ccgggatccg gttttttttg tttttaaaag tgtaatttcc tttttatttg catctgttta 60 tgactgaaaa aaatgactag ttattatgaa gacactactg ttgaagatgg atattttaac 120 atggagtttc aacaaaatta cttcttgaga cagagctgat gtgtttttta aataacgtga 180 ttttaagcat atatttgaac aaaactaaaa catttagtat tatgaatatg aaaaaagatc 240 agtaaatcaa tgtactcttc taggctgaat taaggtagac tatttaaggt ttcaaaaaag 300 tttggctggg gcagaataag ttttacaaaa cccatgccat ccaaaattaa gatgacatgt 360 agcagcaaga agtattccaa tgtctcataa ccagttctcg caagcaatgt gtattcctta 420 ctttaaggaa gtgtcaaaca aatagaaaaa tctggaagaa tttactaagt gtaataaatt 480 agaggtaaat cgtaataaaa gaatttatgt ctcacaaaaa tattcacaag tgggagtttt 540 cttttaccaa cttctcagag tccttctagc cccctcttca cttctgaaag atgggattta 600 ccaaaatctg gtttacattt aacttttcag ggacacatga cctgaaaaga aagatgtcag 660 ataatactga cattgcctca tgcactttct ttgtatcagt ccttcttctg taagtaatca 720 gaattgggtc caaatggcat agaatcaaac attatgtatc atgccaaata ccacttcctg 780 cccaacaaaa tttcatcttt ctccagtaat gaagaggtgg acattcttgt tggactgtag 840 catctgtgcc gcccgctcca caccaaccac ggcagctaac ctctgggcat catatttgga 900 gtagagaaca gtgcaggtcc acgtggcctc ttctcctctg ttggtggctc tcagcatatt 960 acagatttca ctgtaaaagt gtggatatgt cggcagttca tagaaaatca ggttcctgat 1020 gccttttatt gctgtagttt atttccaccc ccttccctcc tgttttctct ctctccttct 1080 ctctctctct ctctctctct tttttttccg ccctagctgg ggctgtgttg gaggagagga 1140 agaaagagag acagaggatt gcattcatcc gttacgttct tgaaatttcc taatagcaag 1200 accagcgaag cggttgcacc cttttcaatc ttgcaaagga aaaaaacaaa acaaaacaaa 1260 aaaaacccaa gtccccttcc cggcagtttt tgccttaaag ctgccctctt gaaattaatt 1320 ttttcccagg agagagatgt cttatcaggg gaagaaaaat attccacgca tcacgagcga 1380 tcgtcttctg atcaaaggag 
gtaaaattgt taatgatgac cagtcgttct atgcagacat 1440 atacatggaa gatgggttga tcaagcaaat aggagaaaat ctgattgtgc caggaggagt 1500 gaagaccatc gaggcccact cccggatggt gatccccgga ggaattgacg tccacactcg 1560 tttccagatg cctgatcagg gaatgacgtc tgctgatgat ttcttccaag gaaccaaggc 1620 ggccctggct gggggaacca ctatgatcat tgaccacgtt gttcctgagc ctgggacaag 1680 cctgctcgct gcctttgacc agtggaggga atgggccgac agcaagtcct gctgtgacta 1740 ctctctgcat gtggacatca gcgagtggca taagggcatc caggaggaga tggaagcgct 1800 tgtgaaggat cacggggtaa attccttcct cgtgtacatg gctttcaaag atcgcttcca 1860 gctaacggat tgccagattt atgaagtact gagtgtgatc cgggatattg gcgccatagc 1920 ccaagtccac gcagaaaatg gcgacatcat tgcagaggag cagcagagga tcctggatct 1980 gggcatcacg ggccccgagg gacatgtgct gagccgacct gaggaggtcg aggccgaagc 2040 cgtgaatcgt gccatcacca tcgccaacca gaccaactgc ccgctgtata tcaccaaggt 2100 gatgagcaaa agctctgctg aggtcatcgc ccaggcacgg aagaagggaa ctgtggtgta 2160 tggcgagccc atcactgcca gcttgggaac ggacggctcc cattactgga gcaagaactg 2220 ggccaaggct gctgcctttg tcacctcccc acccttgagc cctgatccaa ccactccaga 2280 ctttctcaac tccttgctgt cctgtggaga cctccaggtc acgggcagtg cccattgcac 2340 gtttaacact gcccagaagg ctgtaggaaa ggacaacttc accctgattc cggagggcac 2400 caatggcact gaggagcgga tgtccgtcat ctgggacaag gctgtggtca ctgggaagat 2460 ggatgagaac cagtttgtgg ctgtgaccag caccaatgca gccaaagtct tcaaccttta 2520 cccccggaaa ggccgcattg ctgtgggatc cgatgccgac ctggtcatct gggaccccga 2580 cagcgttaaa accatctctg ccaagacaca caacagctct ctcgagtaca acatctttga 2640 aggcatggag tgccgcggct ccccactggt ggtcatcagc caggggaaga ttgtcctgga 2700 ggacggcacc ctgcatgtca ccgaaggctc tggacgctac attccccgga agcccttccc 2760 tgattttgtt tacaagcgta tcaaggcaag gagcaggctg gctgagctga gaggggttcc 2820 tcgtggcctg tatgacggac ctgtgtgtga agtgtctgtg acgcccaaga cagtcactcc 2880 agcctcctcg gccaagacgt ctcctgccaa gcagcaggcc ccacctgtcc ggaacctgca 2940 ccagtctgga ttcagtttgt ctggtgctca gattgatgac aacattcccc gccgcaccac 3000 ccagcgtatc gtggcgcccc ccggtggccg tgccaacatc accagcctgg gctagagctc 3060 ctgggctgtg cgtccactgg ggactgggga 
tgggacacct gaggacattc tgagacttct 3120 ttcttccttc cttttttttt tttgtttttt tttttaagag cctgtgatag ttactgtgga 3180 gcagccagtt catggggtcc cccttgggcc cacaccccgt ctctcaccaa gagttactga 3240 ttttgctcat ccacttccct acacatctat gggtatcaca cccaagacta cccaccaagc 3300 tcatacaggg aaccacaccc aacacttaga catgcgaaca agcagccccc agcgagggtc 3360 tccttcgcct tcaacctcct agtgtctgtt agcattcctt ttcatggggg gagggaagat 3420 aaagtgaatt gcccagagct gcctttttct tttcttttta aaaattttaa gaagttttcc 3480 ttgtggggct ggggaggggc cggggtcagg gagagtcttt tttttttttt ttttaaatac 3540 taaattggaa catttaattc catattaata caaggggttt gaactggaca tcctaatgat 3600 gcaattacgt catcacccag ctgattccgg gtggttggca aactcatcgt gtctgtcctg 3660 agaggctcca caatgcccac ccgcatcgcc attctgtagt cttcagggtc agctgttgat 3720 aaaggggcag gcttgcgtta ttggcctaga ttttgctgca gattaaatcc tttgaggatt 3780 ctcttctctt ttaccatttt tctgcgtgct ctcactctct ctttctctct ctagcttttt 3840 aattcatgaa tattttcgtg tctgtctctc tctctctctg tgtttcctcc agcccttgtc 3900 tcggagacgg tgttttcctc ccttgcccca ttatcttttc acctcccagg tctacatttc 3960 atggtggtcg ttgggtccgc ctaaaggatt tgagcgtttg ccattgcaag catagtgctg 4020 tgtcatcctg gtccatgtag gactggtgct aaccacctgc catcatgagg atgtgtgcta 4080 gagtgtggga ccctggccaa gtgcaggaat gggccatgcc gtctcaccca cagtatcaca 4140 cgtggaaccg cagacagggc ccagaagctt tagaggtatg aggctgcaga accggagaga 4200 ttttcctctg tgcagtgctc tctggctaaa gtcacggtca aacctaaaca ccgagcctca 4260 ttaacccaag tgaaccaacc aaagtcacca gttcagaagt gctaagctaa taggagtctg 4320 acccgagggc ctgctgcttc ctggttaagt atcttttgag attctagaac acatgggagc 4380 tttttatttt cggggaaaaa ccgtattttt ttcttgtcca attatttcta aagacacact 4440 acatagaaag aggccctata aactcaaaaa gtcattggga aacttaaagt ctattctact 4500 ttgccaagag gagaaatgtg ttttatgaac gatagatcac atcagaactc ctgtggggag 4560 gaaaccttat aaattaaaca catggccccc ttagagacca caggcgatgt ctgtctccat 4620 ccttccctct ccttttctgt cacctttccc cctagctggc tcctttggac ctacccctgt 4680 ccttgctgac ttgtgttgca ttgtattcca aacgtgttta caggttctct taagcaatgt 4740 tgtatttgca ggcttttctg aataccaaat ctgctttttg 
taaagcgtaa aaacatcaca 4800 aagtaggtca ttccatcacc acccttgtct ctctacacat tttgcctttg gggatctggt 4860 tggggttttg ggttttttgt tgttgttgtt tatttgttat tttaaaggta aattgcactt 4920 ttaaaaaaat aattggttga cttaatatat ttgctttttt tctcacctgc acttagagga 4980 aatttgaaca agttggaaaa aaacaatttt tgtttcaatt ctaagaaaca cttgcagctc 5040 tagtattcac ttgagtcttc ctgtttttcc tgtaccgggt catggtaatt tttggttgtt 5100 ttggttgttt tcttaaaaaa caagttaaaa cctgacgatt tctgcagtga cttgatgctc 5160 taaaacagtg taggatttaa gaatagatgg tttttaatcc tggaaattgt gattgtgacc 5220 catgagtgga ggaactttca gttctaaagc tgataaagtg tgtagccaga agagtacttt 5280 ttttttgtaa ccactgtctt gatggcaaaa taattatggt aaaaaacaag tctcgtgttt 5340 attattcctt aagaactctg tgttatatta ccatggaacg cctaataaag caaaatgtgg 5400 ttgtttcaaa aaaaaaaaaa a 5421 67 620 DNA Homo sapiens 67 aaacatccta tcatctgtag gctcattcat ttctctaaca gcagcagcaa cagcgcatca 60 caggacacca aggagagctc tgaagagcct ccctcagaag agagccagga cacccccatt 120 tacacggagt ttgatgagga tttcgaggag gaacccacat cccccatagg tcactgtgtg 180 gccatctacc actttgaagg gtccagcgag ggcactatct ctatggccga gggtgaagac 240 ctcagtctta tggaagaaga caaaggggac ggctggaccc gggtcaggcg gaaagaggga 300 ggcgagggct acgtgcccac ctcctacctc cgagtcacgc tcaattgaac cctgccagag 360 acgggaagag gggggctgtc ggctgctgct tctgggccac ggggagcccc aggacctatg 420 cactttattt ctgaccccgt ggcttcggct gagacctgtg taacctgctg ccccctccac 480 ccccaaccca gtcctacctg tcacaccgga cggacccgct gtgccttcta ccatcgttcc 540 accattgatg tacatactca tgttttacat cttttctttc tgcgctcggc tccggccatt 600 ttgttttata caaaaatggg 620 68 1266 DNA Homo sapiens 68 ctcggaagcc cgtcaccatg tcgtgcgagt cgtctatggt tctcgggtac tgggatattc 60 gtgggctggc gcacgccatc cgcctgctcc tggagttcac ggatacctct tatgaggaga 120 aacggtacac gtgcggggaa gctcctgact atgatcgaag ccaatggctg gatgtgaaat 180 tcaagctaga cctggacttt cctaatctgc cctacctcct ggatgggaag aacaagatca 240 cccagagcaa tgccatcttg cgctacatcg ctcgcaagca caacatgtgt ggtgagactg 300 aagaagaaaa gattcgagtg gacatcatag agaaccaagt aatggatttc cgcacacaac 360 tgataaggct ctgttacagc tctgaccacg 
aaaaactgaa gcctcagtac ttggaagagc 420 tacctggaca actgaaacaa ttctccatgt ttctgtggaa attctcatgg tttgccgggg 480 aaaagctcac ctttgtggat tttctcacct atgatatctt ggatcagaac cgtatatttg 540 accccaagtg cctggatgag ttcccaaacc tgaaggcttt catgtgccgt tttgaggctt 600 tggagaaaat cgctgcctac ttacagtctg atcagttctg caagatgccc atcaacaaca 660 agatggccca gtggggcaac aagcctgtat gctgagcagg aggcagactt gcagagcttg 720 ttttgtttca tcctgtccgt aaggggtcag cgctcttgct ttgctctttt caatgaatag 780 cacttatgtt actggtgtcc agctgagttt ctcttgggta taaaggctaa aagggaaaaa 840 ggatatgtgg agaatcatca agatatgaat tgaatcgctg cgatactgtg gcatttccct 900 actccccaac tgagttcaag ggctgtaggt tcatgcccaa gccctgagag tgggtactag 960 aaaaaacgag attgcacagt tggagagagc aggtgtgtta aatggactgg agtccctgtg 1020 aagactgggt gaggataaca caagtaaaac tgtggtactg atggacttaa ccggagttcg 1080 gaaaccgtcc tgtgtacaca tgggagttta gtgtgataaa ggcagtattt cagactggtg 1140 ggctagccaa tagagttggc aattgcttat tgaaactcat taaaaataat agagccccac 1200 ttgacactat tcactaaaat taatctggaa tttaaggccc aacattaaac acaaagctgt 1260 attgat 1266 69 3858 DNA Homo sapiens 69 agtctggttt aactggttgg aacgactaaa gcacgctggc gcaaggaaag ctctcaactt 60 cgggagctga ggcgcaggct ggccagagcg tggagaggaa agccctttcc atcctcaagg 120 ccgttgcagg agatgcccgc gagccacctt cgccagcacc acaccggggt gtaatggata 180 ggtaacagag aagacctcgt cccttcctag tcagggcatc agcatgactg agtgcttcct 240 gccccccacc agcagcccca gtgaacaccg cagggtggag catggcagcg ggcttacccg 300 gacccccagc tctgaagaga tcagccctac taagtttcct ggattgtacc gcactggcga 360 gccctcacct ccccatgaca tcctccatga gcctcctgat gtagtgtctg atgatgagaa 420 agatcatggg aagaaaaaag ggaaatttaa gaaaaaggaa aagaggactg aaggctatgc 480 agcctttcag gaagatagct ctggagatga ggcagaaagt ccttctaaaa tgaagaggtc 540 caagggaatc catgttttca agaagcccag cttttctaaa aagaaggaaa aggattttaa 600 aataaaagag aaacccaaag aagaaaagca taaagaagaa aagcacaaag aagaaaaaca 660 taaagagaag aagtcaaaag acttgacagc agctgatgtt gttaaacagt ggaaggaaaa 720 gaagaaaaag aaaaagccaa ttcaggagcc agaggtgcct cagattgatg ttccaaatct 780 caaacccatt tttggaattc 
ctttggctga tgcagtagag aggaccatga tgtatgatgg 840 cattcggctg ccagccgttt tccgtgaatg tatagattac gtagagaagt atggcatgaa 900 gtgtgaaggc atctacagag tatcaggaat taaatcaaag gtggatgagc taaaagcagc 960 ctatgaccgg gaggagtcta caaacttgga agactatgag cctaacactg tagccagttt 1020 gctgaagcag tatttgcgag accttccaga gaatttgctt accaaagagc ttatgcccag 1080 atttgaagag gcttgtggga ggaccacgga gactgagaaa gtgcaggaat tccagcgttt 1140 actcaaagaa ctgccagaat gtaactatct tctgatttct tggctcattg tgcacatgga 1200 ccatgtcatt gcaaaggaac tggaaacaaa aatgaatata cagaacattt ctatagtgct 1260 cagcccaact gtgcagatca gcaatcgagt cctgtatgtg tttttcacac atgtgcaaga 1320 actctttgga aatgtggtac taaagcaagt gatgaaacct ctgcgatggt ctaacatggc 1380 cacgatgccc acgctgccag agacccaggc gggcatcaag gaggagatca ggagacagga 1440 gtttcttttg aattgtttac atcgagatct gcagggtggg ataaaggatt tgtctaaaga 1500 agaaagatta tgggaagtac aaagaatttt gacagccctc aaaagaaaac tgagagaagc 1560 taaaagacag gagtgtgaaa ccaagattgc acaagagata gccagtcttt caaaagagga 1620 tgtttccaaa gaagagatga atgaaaatga agaagttata aatattctcc ttgctcagga 1680 gaatgagatc ctgactgaac aggaggagct cctggccatg gagcagtttc tgcgccggca 1740 gattgcctca gaaaaagaag agattgaacg cctcagagct gagattgctg aaattcagag 1800 tcgccagcag cacggccgaa gtgagactga ggagtactcc tccgagagcg agagcgagag 1860 tgaggatgag gaggagctgc agatcattct ggaagactta cagagacaga acgaagagct 1920 ggaaataaag aacaatcatt tgaatcaagc aattcatgag gagcgcgagg ccatcatcga 1980 gctgcgcgtg cagctgcggc tgctccagat gcagcgagcc aaggccgagc agcaggcgca 2040 ggaggacgag gagcctgagt ggcgcggggg tgccgtccag ccgcccagag acggcgtcct 2100 tgagccaaaa gcagctaaag agcagccaaa ggcaggcaag gagccggcaa agccatcgcc 2160 cagcagggat aggaaggaga cgtccatctg agcagcctgc gtggccgtct ggagtccgtg 2220 agactgaaag gacccgtgca tcttactgta acccgggggc caggccggct ctctcgctgt 2280 acattctgta aaggtgtctt ctcttctcag actcttcctc tgtcacacgt ctgactcctt 2340 cacgtcaggc tcaggttcca tgggaggacg aagcagtgga cgcattgtgg gctttaggga 2400 cagatgagtt ttccagatag tgtcagctta tttgaagatt aattttcttt gttaacttaa 2460 aataactatt ttaacccttg agtggcttct 
ttttaaacca aaaaccgtct ttctttgctt 2520 ttttatcaca gcagaatcag gatctctttc tcattcaagg ggggaaccac accaggtcag 2580 cgctgcgcct gctgtggccg ccgcgagcca cgccctctgg gatctctggt accgtcactc 2640 ttgcttgtgc cttccacacc ttctcggtgc agatccctat gggggagctg cctcacgttc 2700 tctgactggt cagagcagcg cctggtgggt gttccctggc ccactctcct ctctccttct 2760 gcagttctaa accacagtct ataagcccga gtcaccagga cggcctgtct ggccacagac 2820 aggggctgcc tgtggagcct gcccaccggc ccccggcagt gcagtccagc ggggaggagg 2880 ctgcccgttc ctgccagttc ctcactgcgg ggaccagcaa aggccttctc actgggttgg 2940 tcaaaggtag tcaccttggc ctggtgcatc cacagaggat gttgttcaaa ccagaaatct 3000 tttaaacgac tgaccttcct taaaaacaga atgactccga ttgcttgctt gggctagaat 3060 gtacacgtct ccttgcctga ataagccata tatatgctct taaacaaaag tttgaaatta 3120 tccatatcat ctcagtgaac ctactggtgg actcccaatt gacaagattg agcaatagaa 3180 aaaaattcct ttcctttgaa tgatagctgt gattcacccc accccatttt cttgtttctg 3240 gtccatccga tgagacggat gctctgatgc tctgaggctt ctgggaggct gggccctgga 3300 ggcaacgtgc tgcaggcgca ctctgtcaga gtgaacagca ccgcgagaca ggccaggctc 3360 gtggctcgga agacaaaccc cacacacact caaggggtcg aaaacaaacc ccacacgagg 3420 gctctcacct ccttctccta ggtagtattt attttcagca cctgtttgat gcagttttta 3480 atcctctacc tattgcactg ttgtgactcg ttggccatta tttgattttg gtacgaaaaa 3540 aagctttgtt atagaaatca gcatactatt tttttaaatc tggagagaag atattctggt 3600 gactgaaagt atggtcgggt gtcagatata aatgtgcaaa tgccttcttg ctgtcctgtc 3660 ggtctcagta cgttcacttt atagctgctg gcaatatcga aggttccttt tttgtttgtg 3720 taaactctaa tttctatcaa ggtgtcatgg atttttaaaa ttagtatttc attacaaatg 3780 tctcagcatt ggttaactaa ttttgggcag gaccattatt gatcaagcaa ataaattcaa 3840 cagccatttg ggaaaaag 3858 70 4043 DNA Homo sapiens 70 cgaagcgggt cctgccccgc tgtcagctgc ggcccccggc gccgggcggg ggtggccgcg 60 accattggcg gagaggcgaa aggggcgggg ccgccgccag ccgctgcggg caaggctgaa 120 caggcggagg tgggcagccg gccagggaag cacggtccag gcggctacat tcggcccggc 180 catggcagcg gcgcccctga aagtgtgcat cgtgggctcg gggaactggg gttcagctgt 240 tgcaaaaata attggtaata acgtcaagaa acttcagaaa tttgcctcca cagtcaagat 
300 gtgggtcttt gaagaaacag tgaatggcag aaaactgaca gacatcataa ataatgacca 360 tgaaaatgta aaatatcttc ctggacacaa gctgccagaa aatgtggttg ccatgtcaaa 420 tcttagcgag gctgtgcagg atgcagacct gctggtgttt gtcattcccc accagttcat 480 tcacagaatc tgtgatgaga tcactgggag agtgcccaag aaagcgctgg gaatcaccct 540 catcaagggc atagacgagg gccccgaggg gctgaaactc atttctgaca tcatccgtga 600 gaagatgggt attgacatca gtgtgctgat gggagccaac attgccaatg aggtggctgc 660 agagaagttc tgtgagacca ccatcggcag caaagtaatg gagaacggcc ttctcttcaa 720 agaacttctg cagactccaa attttcgaat tacggtggtt gatgatgcag acactgttga 780 actctgtggt gcgcttaaga acatcgtagc tgtgggagct gggttctgcg acggcctccg 840 ctgtggagac aacaccaaag cggccgtcat ccgcctggga ctcatggaaa tgattgcttt 900 tgccaggatc ttctgcaaag gccaagtgtc tacagccacc ttcctagaga gctgcggggt 960 ggccgacctg atcaccacct gttacggagg gcggaaccgc agggtggccg aggccttcgc 1020 cagaactggg aagaccattg aagagttgga gaaggagatg ctgaatgggc aaaagctcca 1080 aggaccgcag acttctgctg aagtgtaccg catcctcaaa cagaagggac tactggacaa 1140 gtttccattg tttactgcag tgtatcagat ctgctacgaa agcagaccag ttcaagagat 1200 gttgtcttgt cttcagagcc atccagagca tacataaagt gaatcatgca acgtgttggg 1260 ggaagttctg cctttctgat caatcttttg ggttcacgtg gaaaccagga cttggcaaca 1320 tgatgtttga ctgtaatctc atcacggata tgtatgaatt tttacaggtt cgtttttgaa 1380 ttgtgagagg cagttcatta gcaaagatgt actgggcagt aactaaacac acatgcaaac 1440 atgtgaatgg tggtttattc ctcattctgt ggatgtttct atgagccaaa atttgatgtc 1500 tttttttcaa aattgcttat gaaatttcca cacaatcgta gcttataaga ttggaacgat 1560 ctcagccaaa tattttaggt gtaattcata tgtatttgag tggaggattt tttttctcat 1620 ttttctagtg ttaaatttta accagcatta acatggtaga gtggaggagt gagtgtgttc 1680 aaagatcaac atatttaact tttaaacact atctcaaagc cagcataatt aactactttg 1740 attgtgggct gacctttgtt tttttaacaa tcaggcattt ttaattagat aatccactca 1800 tgtatttccc cctcactgca gttgtctgca tttttagcct cttttctctt cgttagttgt 1860 cagaatatgc ctttgtcaag gctcagaggt aacaagacag aaaattcatc tgggattttc 1920 ctgctgtggc tggcacattc ttctgattaa cagacacttg tatgatgctt taggctagtt 1980 agtgcatttt 
ttagcaaaca tttatcttaa acatcacaga tccactgggg ggtgcaaggg 2040 gctactgtta gtcctcttgt tagatgcagt cactcctcct ggtcacctag tgagcaggga 2100 cagagccagg agtcaagtgc agtgccaagg tgcatgaccc tctgagaagt cactgggctg 2160 atttgacctc cgactcattg gttgtgtaaa tgccatgtgc agcctttcct gaggccatag 2220 gagggcttcc tgcagctgag atctatgcag gccatcctct caacaggtgc cactccaagg 2280 gcggtcctcg gtgcagcagc atcagcttca cttgtggggg ggtgggggaa ggggcggtct 2340 cagaaatgca ggttcccagg tcccaccctg gacttctgaa ggggtgtggc atctgtgttt 2400 ctgatgctta ctacaatatg tgaaccacta ctttagaaaa tctgctttaa cttggtattc 2460 ctctaattgt gttccctagg aaatgactgt cccaagagcc agtgattatt ccaggtgttc 2520 cctggaaagg tcaagtgagt ctgggaaaca ctatgtctgt acacctcttg aaggtgtcga 2580 atgtatgttt atacatcagt ggaacccatt tttctagcct agcaagtccc aaacacatta 2640 cactgaagag attttggtga ggaaacttgc tggagttttc agggaacact gttctaggct 2700 taggtgacct taggatcact caagtagacc cttcactccc tgcgagaaat taggatgaat 2760 aactacctgt ggcattgttg gttctgaact tttacagttc aggcctgctg tgaatctttg 2820 atgaagcttt aaggtgacac tgttgtacaa gatgtcagct ttgctgaaac gcacattacc 2880 tggaataagt gctttaattg tagaattaga atgggattta ctgtactgtt ttaaatgaga 2940 ttggcttcag aatccattac agttacctta catagcactt gatacgtgtt aaatgaacat 3000 atgaatgtaa tttatatatt cctagaattt aagttacttt gtgagatttg ggcctgtccc 3060 tcaatgccag tttaggattt ctttttttct ataccttgaa atgattataa aatagatttt 3120 catgggaatt ttaaaaactc tatccaaaac atttttggag cattttaaag ccccatacac 3180 agaagtatac gaaagcacac aaaacactcc aagtttcagc agttttagcg ccaccattaa 3240 cccactttgc ttgtctcatg aaaaatcttt gttaaagttt gtacacaggt aacaaaaagt 3300 tactttaaaa gatatataaa gggctgtaag ctaattgtgg tgtctagtaa gtagcataat 3360 gagatgtgag gagttggaac tttgcgtgtt ttgcgtattt tcatctgcat tcagcttctt 3420 actctgggtt tgtactcgag tgttatttct ttacaaatgc ccttgtaatt accactctga 3480 agtctgctga ctgtgtctct tgaacatact taggatattc tgcacattat ggaaaaaggt 3540 aaattttaga agtttctgct ctactaactg tagatattta tgactctgcg agttatctat 3600 ttttataacc acctgtggtc cattgttcat tttaattcac atttcttatg aagtatggta 3660 acagggaggg agacacctag 
attagcagct caatttgtac tacttcagcc aatctgtgaa 3720 tgtaaaaact acactgttgc cttgctagga tccaccctcc tataatatgg aacaaatatc 3780 tgaatgaaat ccaccctagg agacggagtc aaactaaact tgtggttttt catttaactt 3840 ttgactacag catggcccca tggcatccac accaagaggg tgttgtgatg aggtgccggt 3900 gtgcaaaggg aactttagtt tttccactgg ttcttatctg ctagcctttt acatacatgt 3960 gtactatatt tgtttataga ctgtaggtgg atatataatt taaaagcttg atttaataaa 4020 catttaaccc cctaaacttg ggg 4043 71 2108 DNA Homo sapiens 71 tgttcctcct ccgtcccacc cccataacta tactggctct gatgagacct tggttttctg 60 taaaagctct atttagaggt gtatcattat ttacttaatt gttctccttt acaacccacc 120 tgggatgagc atcttgccta gaagtctcta cttgcacagg atacatacga aatagattga 180 ggattcaaag cagatacaga actcttccca cttactttct taccctgtgt gtctccccac 240 agggttacaa gtgtataaca agtgttggaa gtttgagcat tgcaatttca acgacgtcac 300 aacccgcttg agggaaaatg agctaacgta ctactgctgc aagaaggacc tgtgtaactt 360 taacgaacag cttgaaaatg gtgggacatc cttatcagag aaaacagttc ttctgctggt 420 gactccattt ctggcagcag cctggagcct tcatccctaa gtcaacacca ggagagcttc 480 tcccaaactc cccgttcctg cgtagtccgc tttctcttgc tgccacattc taaaggcttg 540 atattttcca aatggatcct gttgggaaag aataaaatta gcttgagcaa cctggctaag 600 atagaggggc tctgggagac tttgaagacc agtcctgttt gcagggaagc cccacttgaa 660 ggaagaagtc taagagtgaa gtaggtgtga cttgaactag attgcatgct tcctcctttg 720 ctcttgggaa gaccagcttt gcagtgacag cttgagtggg ttctctgcag ccctcagatt 780 atttttcctc tggctccttg gatgtagtca gttagcatca ttagtacatc tttggagggt 840 ggggcaggag tatatgagca tcctctctca catggaacgc tttcataaac ttcagggatc 900 ccgtgttgcc atggaggcat gccaaatgtt ccatatgtgg gtgtcagtca gggacaacaa 960 gatccttaat gcagagctag aggacttctg gcagggaagt ggggaagtgt tccagatagc 1020 agggcatgaa aacttagaga ggtacaagtg gctgaaaatc gagtttttcc tctgtcttta 1080 aattttatat gggctttgtt atcttccact ggaaaagtgt aatagcatac atcaatggtg 1140 tgttaaagct atttccttgc ctttttttta ttggaatggt aggatatctt ggctttgcca 1200 cacacagtta cagagtgaac actctactac atgtgactgg cagtattaag tgtgcttatt 1260 ttaaatgtta ctggtagaaa ggcagttcag gtatgtgtgt atatagtatg 
aatgcagtgg 1320 ggacaccctt tgtggttaca gtttgagact tccaaaggtc atccttaata acaacagatc 1380 tgcaggggta tgttttacca tctgcatcca gcctcctgct aactcctagc tgactcagca 1440 tagattgtat aaaatacctt tgtaacggct cttagcacac tcacagatgt ttgaggcttt 1500 cagaagctct tctaaaaaat gatacacacc tttcacaagg gcaaactttt tccttttccc 1560 tgtgtattct agtgaatgaa tctcaagatt cagtagacct aatgacattt gtattttatg 1620 atcttggctg tatttaatgg cataggctga cttttgcaga tggaggaatt tcttgattaa 1680 tgttgaaaaa aaacccttga ttatactctg ttggacaaac cgagtgcaat gaatgatgct 1740 tttctgaaaa tgaaatataa caagtgggtg aatgtggtta tggccgaaaa ggatatgcag 1800 tatgcttaat ggtagcaact gaaagaagac atcctgagca gtgccagctt tcttctgttg 1860 atgccgttcc ctgaacatag gaaaatagaa acttgcttat caaaacttag cattaccttg 1920 gtgctctgtg ttctctgtta gctcagtgtc tttccttaca tcaataggtt tttttttttt 1980 tttttggcct gaggaagtac tgaccatgcc cacagccacc ggctgagcaa agaagctcat 2040 ttcatgtgag ttctaaggaa tgagaaacaa ttttgatgaa tttaagcaga aaatgaattt 2100 ctgggaac 2108 72 1938 DNA Homo sapiens 72 attccggttg ttgcaccatg gcgtccatgg ggaccctcgc cttcgatgaa tatgggcgcc 60 ctttcctcat catcaaggat caggaccgca agtcccgtct tatgggactt gaggccctca 120 agtctcatat aatggcagca aaggctgtag caaatacaat gagaacatca cttggaccaa 180 atgggcttga taagatgatg gtggataagg atggagatgt gactgtaact aatgatgggg 240 ccaccatctt aagcatgatg gatgttgatc atcagattgc caagctgatg gtggaactgt 300 ccaagtctca ggatgatgaa attggagatg gaaccacagg agtggttgtc ctggctggtg 360 ccttgttaga agaagcggag caattgctag accgaggcat tcacccaatc agaatagccg 420 atggctatga gcaggctgct cgtgttgcta ttgaacacct ggacaagatc agcgatagcg 480 tccttgttga cataaaggac accgaacccc tgattcagac agcaaaaacc acgctgggct 540 ccaaagtggt caacagttgt caccgacaga tggctgagat tgctgtgaat gccgtcctca 600 ctgtagcaga tatggagcgg agagacgttg actttgagct tatcaaagta gaaggcaaag 660 tgggcggcag gctggaggac actaaactga ttaagggcgt gattgtggac aaggatttca 720 gtcacccaca gatgccaaaa aaagtggaag atgcgaagat tgcaattctc acatgtccat 780 ttgaaccacc caaaccaaaa acaaagcata agctggatgt gacctctgtc gaagattata 840 aagcccttca gaaatacgaa aaggagaaat 
ttgaagagat gattcaacaa attaaagaga 900 ctggtgctaa cctagcaatt tgtcagtggg gctttgatga tgaagcaaat cacttacttc 960 ttcagaacaa cttgcctgcg gttcgctggg taggaggacc tgaaattgag ctgattgcca 1020 tcgcaacagg agggcggatc gtccccaggt tctcagagct cacagccgag aagctgggct 1080 ttgctggtct tgtacaggag atctcatttg ggacaactaa ggataaaatg ctggtcatcg 1140 agcagtgtaa gaactccaga gctgtaacca tttttattag aggaggaaat aagatgatca 1200 ttgaggaggc gaaacgatcc cttcacgatg ctttgtgtgt catccggaac ctcatccgcg 1260 ataatcgtgt ggtgtatgga ggaggggctg ctgagatatc ctgtgccctg gcagttagcc 1320 aagaggcgga taagtgcccc accttagaac agtatgccat gagagcgttt gccgacgcac 1380 tggaggtcat ccccatggcc ctctctgaaa acagtggcat gaatcccatc cagactatga 1440 ccgaagtccg agccagacag gtgaaggaga tgaaccctgc tcttggcatc gactgtttgc 1500 acaaggggac aaatgatatg aagcaacagc atgtcataga aaccttgatt ggcaaaaagc 1560 aacagatatc tcttgcaaca caaatggtta gaatgatttt gaagattgat gacattcgta 1620 agcctggaga atctgaagaa tgaagacatt gagaaaacta tgtagcaaga tccacttctg 1680 tgattaagta aatggatgtc tcgtgatgca tctacagtta tttattgtta catccttttc 1740 cagacactgt agatgctata ataaaaatag ctgtttggta accatagttt cacttgttca 1800 aagctgtgta atcgtggggg taccatctca actgcttttg tattcattgt attaaaagaa 1860 tctgtttaaa caacctttat cttctcttcg ggtttaagaa acgtttattg taacagtaat 1920 taaatgctgc cttaattg 1938 73 1231 DNA Homo sapiens 73 aggtctcagc cggtcgtcgc gacgttcgcc cgctcgctct gaggctcctg aagccgaaac 60 tagctagact ttcctccttc ccgcctgcct gtagcggcgt tgttgccact ccgccaccat 120 gttcgaggcg cgcctggtcc agggctccat cctcaagaag gtgttggagg cactcaagga 180 cctcatcaac gaggcctgct gggatattag ctccagcggt gtaaacctgc agagcatgga 240 ctcgtcccac gtctctttgg tgcagctcac cctgcggtct gagggcttcg acacctaccg 300 ctgcgaccgc aacctggcca tgggcgtgaa cctcaccagt atgtccaaaa tactaaaatg 360 cgccggcaat gaagatatca ttacactaag ggccgaagat aacgcggata ccttggcgct 420 agtatttgaa gcaccaaacc aggagaaagt ttcagactat gaaatgaagt tgatggattt 480 agatgttgaa caacttggaa ttccagaaca ggagtacagc tgtgtagtaa agatgccttc 540 tggtgaattt gcacgtatat gccgagatct cagccatatt ggagatgctg ttgtaatttc 600 
ctgtgcaaaa gacggagtga aattttctgc aagtggagaa cttggaaatg gaaacattaa 660 attgtcacag acaagtaatg tcgataaaga ggaggaagct gttaccatag agatgaatga 720 accagttcaa ctaacttttg cactgaggta cctgaacttc tttacaaaag ccactccact 780 ctcttcaacg gtgacactca gtatgtctgc agatgtaccc cttgttgtag agtataaaat 840 tgcggatatg ggacacttaa aatactactt ggctcccaag atcgaggatg aagaaggatc 900 ttaggcattc ttaaaattca agaaaataaa actaagctct ttgagaactg cttctaagat 960 gccagcatat actgaagtct tttctgtcac caaatttgta cctctaagta catatgtaga 1020 tattgttttc tgtaaataac ctattttttt tctctattct ctccaatttg tttaaagaat 1080 aaagtccaaa gtctgatctg gtctagttaa cctagaagta tttttgtctc ttagaaatac 1140 ttgtgatttt tataatacaa aagggtcttg actctaaatg cagttttaag aagtgttttt 1200 gaatttaaat aaagttactt gaatttcaaa c 1231 74 2025 DNA Homo sapiens 74 cggcacgagg caccccgaga ggagaagcgc agcgcagtgg cgagaggagc cccttgtggc 60 agcagcacta cctgcccaga aaaatgctgg aggctgggcg tggccccagg cctggggacc 120 tgtttttcct gtttcccgca gagttccctg cagcccggtc caggtccagg cgtgtgcatt 180 catgagtgag gaacccgtgc aggcgctgag catcctgacc tggagagcag gggctggtca 240 gggcgatggc agcagacctg ggcccctgga atgacaccat caatggcacc tgggatgggg 300 atgagctggg ctacaggtgc cgcttcaacg aggacttcaa gtacgtgctg ctgcctgtgt 360 cctacggcgt ggtgtgcgtg cttgggctgt gtctgaacgc cgtggcgctc tacatcttct 420 tgtgccgcct caagacctgg aatgcgtcca ccacatatat gttccacctg gctgtgtctg 480 atgcactgta tgcggcctcc ctgccgctgc tggtctatta ctacgcccgc ggcgaccact 540 ggcccttcag cacggtgctc tgcaagctgg tgcgcttcct cttctacacc aacctttact 600 gcagcatcct cttcctcacc tgcatcagcg tgcaccggtg tctgggcgtc ttacgacctc 660 tgcgctccct gcgctggggc cgggcccgct acgctcgccg ggtggccggg gccgtgtggg 720 tgttggtgct ggcctgccag gcccccgtgc tctactttgt caccaccagc gcgcgcgggg 780 gccgcgtaac ctgccacgac acctcggcac ccgagctctt cagccgcttc gtggcctaca 840 gctcagtcat gctgggcctg ctcttcgcgg tgccctttgc cgtcatcctt gtctgttacg 900 tgctcatggc tcggcgactg ctaaagccag cctacgggac ctcgggcggc ctccctaggg 960 ccaagcgcaa gtccgtgcgc accatcgccg tggtgctggc tgtcttcgcc ctctgcttcc 1020 tgccattcca cgtcacccgc accctctact 
actccttccg ctcgctggac ctcagctgcc 1080 acaccctcaa cgccatcaac atggcctaca aggttacccg gccgctggcc agtgctaaca 1140 gttgccttga ccccgtgctc tacttcctgg ctgggcagag gctcgtacgc tttgcccgag 1200 atgccaagcc acccactggc cccagccctg ccaccccggc tcgccgcagg ctgggcctgc 1260 gcagatccga cagaactgac atgcagagga taggagatgt gttgggcagc agtgaggact 1320 tcaggcggac agagtccacg ccggctggta gcgagaacac taaggacatt cggctgtagg 1380 agcagaacac ttcagcctgt gcaggtttat attgggaagc tgtagaggac caggacttgt 1440 gcagacgcca cagtctcccc agatatggac catcagtgac tcatgctgga tgaccccatg 1500 ctccgtcatt tgacaggggc tcaggatatt cactctgtgg tccagagtca actgttccca 1560 taacccctag tcatcgtttg tgtgtataag ttgggggaat taagtttcaa gaaaggcaag 1620 agctcaaggt caatgacacc cctggcctga ctcccatgca agtagctggc tgtactgcca 1680 aggtacctag gttggagtcc agcctaatca agtcaaatgg agaaacaggc ccagagagga 1740 aggtggctta ccaagatcac ataccagagt ctggagctga gctacctggg gtgggggcca 1800 agtcacaggt tggccagaaa accctggtaa gtaatgaggg ctgagtttgc acagtggtct 1860 ggaatggact gggtgccacg gtggacttag ctctgaggag tacccccagc ccaagagatg 1920 aacatctggg gactaatatc atagacccat ctggaggctc ccatgggcta ggagcagtgt 1980 gaggctgtaa cttatactaa aggttgtgtt gcctgctaaa aaaaa 2025 75 4910 DNA Homo sapiens 75 tagacgcacc ctctgaagat ggtgactccc tcctgagaag ctggacccct tggtaaaaga 60 caaggccttc tccaagaaga atatgaaagt gttactcaga cttatttgtt tcatagctct 120 actgatttct tctctggagg ctgataaatg caaggaacgt gaagaaaaaa taattttagt 180 gtcatctgca aatgaaattg atgttcgtcc ctgtcctctt aacccaaatg aacacaaagg 240 cactataact tggtataaag atgacagcaa gacacctgta tctacagaac aagcctccag 300 gattcatcaa cacaaagaga aactttggtt tgttcctgct aaggtggagg attcaggaca 360 ttactattgc gtggtaagaa attcatctta ctgcctcaga attaaaataa gtgcaaaatt 420 tgtggagaat gagcctaact tatgttataa tgcacaagcc atatttaagc agaaactacc 480 cgttgcagga gacggaggac ttgtgtgccc ttatatggag ttttttaaaa atgaaaataa 540 tgagttacct aaattacagt ggtataagga ttgcaaacct ctacttcttg acaatataca 600 ctttagtgga gtcaaagata ggctcatcgt gatgaatgtg gctgaaaagc atagagggaa 660 ctatacttgt catgcatcct acacatactt gggcaagcaa 
tatcctatta cccgggtaat 720 agaatttatt actctagagg aaaacaaacc cacaaggcct gtgattgtga gcccagctaa 780 tgagacaatg gaagtagact tgggatccca gatacaattg atctgtaatg tcaccggcca 840 gttgagtgac attgcttact ggaagtggaa tgggtcagta attgatgaag atgacccagt 900 gctaggggaa gactattaca gtgtggaaaa tcctgcaaac aaaagaagga gtaccctcat 960 cacagtgctt aatatatcgg aaattgaaag tagattttat aaacatccat ttacctgttt 1020 tgccaagaat acacatggta tagatgcagc atatatccag ttaatatatc cagtcactaa 1080 tttccagaag cacatgattg gtatatgtgt cacgttgaca gtcataattg tgtgttctgt 1140 tttcatctat aaaatcttca agattgacat tgtgctttgg tacagggatt cctgctatga 1200 ttttctccca ataaaagctt cagatggaaa gacctatgac gcatatatac tgtatccaaa 1260 gactgttggg gaagggtcta cctctgactg tgatattttt gtgtttaaag tcttgcctga 1320 ggtcttggaa aaacagtgtg gatataagct gttcatttat ggaagggatg actacgttgg 1380 ggaagacatt gttgaggtca ttaatgaaaa cgtaaagaaa agcagaagac tgattatcat 1440 tttagtcaga gaaacatcag gcttcagctg gctgggtggt tcatctgaag agcaaatagc 1500 catgtataat gctcttgttc aggatggaat taaagttgtc ctgcttgagc tggagaaaat 1560 ccaagactat gagaaaatgc cagaatcgat taaattcatt aagcagaaac atggggctat 1620 ccgctggtca ggggacttta cacagggacc acagtctgca aagacaaggt tctggaagaa 1680 tgtcaggtac cacatgccag tccagcgacg gtcaccttca tctaaacacc agttactgtc 1740 accagccact aaggagaaac tgcaaagaga ggctcacgtg cctctcgggt agcatggaga 1800 agttgccaag agttctttag gtgcctcctg tcttatggcg ttgcaggcca ggttatgcct 1860 catgctgact tgcagagttc atggaatgta actatatcat cctttatccc tgaggtcacc 1920 tggaatcaga ttattaaggg aataagccat gacgtcaata gcagcccagg gcacttcaga 1980 gtagagggct tgggaagatc ttttaaaaag gcagtaggcc cggtgtggtg gctcacgcct 2040 ataatcccag cactttggga ggctgaagtg ggtggatcac cagaggtcag gagttcgaga 2100 ccagcccagc caacatggca aaaccccatc tctactaaaa atacaaaaat gagctaggca 2160 tggtggcaca cgcctgtaat cccagctaca cctgaggctg aggcaggaga attgcttgaa 2220 ccggggagac ggaggttgca gtgagccgag tttgggccac tgcactctag cctggcaaca 2280 gagcaagact ccgtctcaaa aaaagggcaa taaatgccct ctctgaatgt ttgaactgcc 2340 aagaaaaggc atggagacag cgaactagaa gaaagggcaa gaaggaaata 
gccaccgtct 2400 acagatggct tagttaagtc atccacagcc caagggcggg gctatgcctt gtctggggac 2460 cctgtagagt cactgaccct ggagcggctc tcctgagagg tgctgcaggc aaagtgagac 2520 tgacacctca ctgaggaagg gagacatatt cttggagaac tttccatctg cttgtatttt 2580 ccatacacat ccccagccag aagttagtgt ccgaagaccg aattttattt tacagagctt 2640 gaaaactcac ttcaatgaac aaagggattc tccaggattc caaagttttg aagtcatctt 2700 agctttccac aggagggaga gaacttaaaa aagcaacagt agcagggaat tgatccactt 2760 cttaatgctt tcctccctgg catgaccatc ctgtcctttg ttattatcct gcattttacg 2820 tctttggagg aacagctccc tagtggcttc ctccgtctgc aatgtccctt gcacagccca 2880 cacatgaacc atccttccca tgatgccgct cttctgtcat cccgctcctg ctgaaacacc 2940 tcccaggggc tccacctgtt caggagctga agcccatgct ttcccaccag catgtcactc 3000 ccagaccacc tccctgccct gtcctccagc ttcccctcgc tgtcctgctg tgtgaattcc 3060 caggttggcc tggtggccat gtcgcctgcc cccagcactc ctctgtctct gctcttgcct 3120 cgacccttcc tcctcctttg cctaggaggc cttctcgcat tttctctagc tgatcagaat 3180 tttaccaaaa ttcagaacat cctccaattc cacagtctct gggagacttt ccctaagagg 3240 cgacttcctc tccagccttc tctctctggt caggcccact gcagagatgg tggtgagcac 3300 atctgggagg ctggtctccc tccagctgga attgctgctc tctgagggag aggctgtggt 3360 ggctgtctct gtccctcact gccttccagg agcaatttgc acatgtaaca tagatttatg 3420 taatgcttta tgtttaaaaa cattccccaa ttatcttatt taatttttgc aattattcta 3480 attttatata tagagaaagt gacctatttt ttaaaaaaat cacactctaa gttctattga 3540 acctaggact tgagcctcca tttctggctt ctagtctggt gttctgagta cttgatttca 3600 ggtcaataac ggtcccccct cactccacac tggcacgttt gtgagaagaa atgacatttt 3660 gctaggaagt gaccgagtct aggaatgctt ttattcaaga caccaaattc caaacttcta 3720 aatgttggaa ttttcaaaaa ttgtgtttag attttatgaa aaactcttct actttcatct 3780 attctttccc tagaggcaaa catttcttaa aatgtttcat tttcattaaa aatgaaagcc 3840 aaatttatat gccaccgatt gcaggacaca agcacagttt taagagttgt atgaacatgg 3900 agaggacttt tggtttttat atttctcgta tttaatatgg gtgaacacca acttttattt 3960 ggaataataa ttttcctcct aaacaaaaac acattgagtt taagtctctg actcttgcct 4020 ttccacctgc tttctcctgg gcccgctttg cctgcttgaa ggaacagtgc tgttctggag 
4080 ctgctgttcc aacagacagg gcctagcttt catttgacac acagactaca gccagaagcc 4140 catggagcag ggatgtcacg tcttgaaaag cctattagat gttttacaaa tttaattttg 4200 cagattattt tagtctgtca tccagaaaat gtgtcagcat gcatagtgct aagaaagcaa 4260 gccaatttgg aaacttaggt tagtgacaaa attggccaga gagtgggggt gatgatgacc 4320 aagaattaca agtagaatgg cagctggaat ttaaggaggg acaagaatca atggataagc 4380 gtgggtggag gaagatccaa acagaaaagt gcaaagttat tccccatctt ccaagggttg 4440 aattctggag gaagaagaca cattcctagt tccccgtgaa cttcctttga cttattgtcc 4500 ccactaaaac aaaacaaaaa acttttaatg ccttccacat taattagatt ttcttgcagt 4560 ttttttatgg cattttttta aagatgccct aagtgttgaa gaagagtttg caaatgcaac 4620 aaaaatattt aattaccggt tgttaaaact ggtttagcac aatttatatt ttccctctct 4680 tgcctttctt atttgcaata aaaggtattg agccattttt taaatgacat ttttgataaa 4740 ttatgtttgt actagttgat gaaggagttt tttttaacct gtttatataa ttttgcagca 4800 gaagccaaat tttttgtata ttaaagcacc aaattcatgt acagcatgca tcacggatca 4860 atagactgta cttattttcc aataaaattt tcaaactttg tactgttaaa 4910 76 2592 DNA Homo sapiens 76 gccccacgca cggacaggag tgaacccgag ctgtgccgac caacccccag gatggcggaa 60 gctcaccagg ccgtggcctt ccagttcacg gtgaccccag acggggtcga cttccggctc 120 agtcgggagg ccctgaaaca cgtctacctg tctgggatca actcctggaa gaaacgcctg 180 atccgcatca agaatggcat cctcaggggc gtgtaccctg gcagccccac cagctggctg 240 gtcgtcatca tggcaacagt gggttcctcc ttctgcaacg tggacatctc cttggggctg 300 gtcagttgca tccagagatg cctccctcag gggtgtggcc cctaccagac cccgcagacc 360 cgggcacttc tcagcatggc catcttctcc acgggcgtct gggtgacggg catcttcttc 420 ttccgccaaa ccctgaagct gcttctctgc taccatgggt ggatgtttga gatgcatggc 480 aagaccagca acttgaccag gatctgggct atgtgtatcc gccttctatc cagccggcac 540 cctatgctct acagcttcca gacatctctg cccaagcttc ctgtgcccag ggtgtcagcc 600 acaattcagc ggtacctaga gtctgtgcgc cccttgttgg atgatgagga atattaccgc 660 atggagttgc tggccaaaga attccaggac aagactgccc ccaggctgca gaaatacctg 720 gtgctcaagt catggtgggc aagtaactat gtgagtgact ggtgggaaga gtacatctac 780 cttcgaggca ggagccctct catggtgaac agcaactatt atgtcatgga ccttgtgctc 840 
atcaagaata cagacgtgca ggcagcccgc ctgggaaaca tcatccacgc catgatcatg 900 tatcgccgta aactggaccg tgaagaaatc aagcctgtga tggcactggg catagtgcct 960 atgtgctcct accagatgga gaggatgttc aacaccactc ggatcccggg caaggacaca 1020 gatgtgctac agcacctctc agacagccgg cacgtggctg tctaccacaa gggacgcttc 1080 ttcaagctgt ggctctatga gggcgcccgt ctgctcaagc ctcaggatct ggagatgcag 1140 ttccagagga tcctggacga cccctcccca cctcagcctg gggaggagaa gctggcagcc 1200 ctcactgcag gaggaagggt ggagtgggcg caggcacgcc aggccttctt tagctctgga 1260 aagaataagg ctgccttgga ggccatcgag cgtgccgctt tcttcgtggc cctggatgag 1320 gaatcctact cctatgaccc cgaagatgag gccagcctca gcctctatgg caaggccctg 1380 ctacatggca actgctacaa caggtggttt gacaaatcct tcactctcat ttccttcaag 1440 aatggccagt tgggtctcaa tgcagagcat gcgtgggcag atgctcccat cattgggcac 1500 ctctgggagt ttgtcctggg cacagacagc ttccacctgg gctacacgga gaccgggcac 1560 tgcctgggca aaccgaaccc tgcgctcgca cctcctacac ggctgcagtg ggacattcca 1620 aaacagtgcc aggcggtcat cgagagttcc taccaggtgg ccaaggcgtt ggcagacgac 1680 gtggagttgt actgcttcca gttcctgccc tttggcaaag gcctcatcaa gaagtgccgg 1740 accagccctg atgcctttgt gcagatcgcg ctgcagctgg ctcacttccg ggacaggggt 1800 aagttctgcc tgacctatga ggcctcaatg accagaatgt tccgggaggg acggactgag 1860 actgtgcgtt cctgtaccag cgagtccaca gcctttgtgc aggccatgat ggaggggtcc 1920 cacacaaaag cagacctgcg agatctcttc cagaaggctg ctaagaagca ccagaatatg 1980 taccgcctgg ccatgaccgg ggcagggatc gacaggcacc tcttctgcct ttacttggtc 2040 tccaagtacc taggagtcag ctctcctttc cttgctgagg tgctctcgga accctggcgt 2100 ctctccacca gccagatccc ccaatcccag atccgcatgt tcgacccaga gcagcacccc 2160 aatcacctgg gcgctggagg tggctttggc cctgtagcag atgatggcta tggagtttcc 2220 tacatgattg caggcgagaa cacgatcttc ttccacatct ccagcaagtt ctcaagctca 2280 gagacgaacg cccagcgctt tggaaaccac atccgcaaag ccctgctgga cattgctgat 2340 cttttccaag ttcccaaggc ctacagctga agcccttagg tacctgtgtt ttgtttggga 2400 actcggaggc cctccccctc ccccagctca gaccacagag gtggcaagag aagggctgaa 2460 gctggaagac tgttcatgag ggacttgtgt gacctgcttt gaaatgtgtg actctgctga 2520 gtgacgtagg 
ctctgagata gctgtccacg cccacgtgtt tgcttggaat aaatacttgc 2580 ctcagaacct tc 2592 77 1429 DNA Homo sapiens 77 cagcatggct acgaaatgtg ggaattgtgg acccggctac tccacccctc tggaggccat 60 gaaaggaccc agggaagaga tcgtctacct gccctgcatt taccgaaaca caggcactga 120 ggccccagat tatctggcca ctgtggatgt tgaccccaag tctccccagt attgccaggt 180 catccaccgg ctgcccatgc ccaacctgaa ggacgagctg catcactcag gatggaacac 240 ctacagcagc tgcttcggtg atagcaccaa gtcgcgcaac aagctggtct tgcccagtct 300 catctcctct cgcatctatg tggtggacgt gggctctgag cccgggcccc aaaagctgca 360 caaggtcatt gagcccaagg acatccatgc caagtgcgaa ctggcctgtc tccacaccag 420 ccactgcctg gccagcgggg aagtgatgat cagctccctg ggggacgtca agggcaatgg 480 caaagggggt tttgtgctgc tggatgggga gacgttcgag gtgaagggga catgggagag 540 acctgggggt gctgcaccgt tgggctatga cttctggtac cagcctcgac acaatgtcat 600 gatcagcact gagtgggcag ctcccaatgt cttacgagat ggctttaacc ccgctgatgt 660 ggaggctgga ctgtacggga gccacttata tgtatgggac tggcagcgcc atgagattgt 720 gcagaccctg tctctaaaag atgggctgat acccttggag atccgcttcc tgcacaaccc 780 aagtgccacc cagggttttg taggctgtgc ctcagctcca aacatccagc gcttctacaa 840 aacgagggaa ggtacatggt cagtggagaa ggtgatccag gtgcccccca agaaagtgaa 900 gggctggctg ctgccagggg tgccaggcct gatcaccgac atcctgctct ccctggacga 960 ccgcttcctc tacttcagca actggctgca tggggacctg aggcagtatg acatctctga 1020 cccacagaga ccccgcctca caggacagct cttcctcgga ggcagcattg ttaagggagg 1080 ccctgtgcaa gtgctggagg acgaggaact aaagtcccag ccagagcccc tagtggtcaa 1140 gggaaaacgg gtggctggag gccctcagat gatccagctc agcctggatg gcaagcgcct 1200 ctacatcacc acgtcgctgt acagtgcctg ggaaaagcag ttttaccctg atctcatcag 1260 ggaaggctct gtaatgctgc aggttgatgt agacacagta aaaggagggc tgaagttgaa 1320 ccccaactgc ctggtggact tcgggaagga gccccttggc ccagccctgg ctcacgagct 1380 tcgctaccct gggggcgatt gtagctctga catctggatt tgaaggctc 1429 78 5683 DNA Homo sapiens 78 ccgcccggtg ttgcgctcct tcccagaatc cgctccggcc tttccttcct gccgcgattc 60 ccaactttgc tcaaagtcgc cggactctaa gctgtcggag ggaccgctgg acagacctgg 120 gaactgacag agggcctgga gggaaatagg ccaaagaccc 
acaggatgga gctgacctca 180 accgaaagag ggaggggaca gcctctgccc tgggaacttc gactgcccct actgctaagc 240 gtgctggctg ccacactggc acaggcccct gccccggatg tccctggctg ttccagggga 300 agctgctacc ccgccacggc cgacctgctg gtgggccgag ctgacagact gactgcctca 360 tccacttgtg gcctgaatgg ccgccagccc tactgcatcg tcagtcacct gcaggacgaa 420 aagaagtgct tcctttgtga ctcccggcgc cccttctctg ctagagacaa cccacacacc 480 catcgcatcc agaatgtagt caccagcttt gcaccacagc ggcgggcagc ttggtggcag 540 tcacagaatg gtatccctgc ggtcaccatc cagctggacc tggaggctga gtttcatttc 600 acacacctca ttatgacctt caagacattt cgccctgctg ccatgctggt cgaacgctca 660 gcagactttg gccgcacctg gcatgtgtac cgatatttct cctatcactg tggggctgac 720 ttcccaggag tcccactagc acccccacgg cactgggatg atgtagtctg tgagtcccgc 780 tactcagaga ttgagccatc cactgaaggc gaggtcatct atcgtgtgct ggaccctgcc 840 atccctatcc cagaccccta cagctcacgg attcagaacc tgttgaagat caccaaccta 900 cgggtgaacc tgactcgtct acacacgttg ggagacaacc tactcgaccc acggagggag 960 atccgagaga agtactacta tgccctctat gagctggttg tacgtggcaa ctgcttctgc 1020 tacggacacg cctcagagtg tgcacccgcc ccaggggcac cagcccatgc tgagggcatg 1080 gtgcacggag cttgcatctg caaacacaac acacgtggcc tcaactgcga gcagtgtcag 1140 gatttctatc gtgacctgcc ctggcgtccg gctgaggacg gccatagtca tgcctgtagg 1200 aagtgtgatc ggcatgggca cacccacagc tgccacttcg acatggccgt atacctcgga 1260 tctggcaatg tgagtggagg tgtgtgtgat ggatgtcagc ataacacagc gtggcgccac 1320 tgtgagctct gtcggccctt cttctaccgt gacccaacca aggacctgcg ggatccggct 1380 gtgtgccgct cctgtgattg tgaccccatg ggttctcaag acggtggtcg ctgtgattcc 1440 catgatgacc ctgcactggg actggtctcc ggccagtgtc gctgcaaaga acacgtggtg 1500 ggcactcgct gccagcaatg ccgtgatggc ttctttgggc tcagcatcag tgacccgtct 1560 gggtgccggc gatgtcaatg taatgcacgg ggcacagtgc ctgggagcac tccttgtgac 1620 cccaacagtg gatcctgtta ctgcaaacgt ctagtgactg gacgtggatg tgaccgctgc 1680 ctgcctggcc actggggcct gagcctcgac ctgctcggct gccgcccctg tgactgcgac 1740 gtgggtggtg ctttggatcc ccagtgtgat gagggcacag gtcaatgcca ctgccgccag 1800 cacatggttg ggcgacgctg tgagcaggtg caacctggct acttccggcc cttcctggac 
1860 cacctaattt gggaggctga gaacacccga gggcaggtgc tcgatgtggt ggagcgcctg 1920 gtgacccccg gggaaactcc atcctggact ggctcaggct tcgtgcgact acaggaaggt 1980 cagaccctgg agttcctggt ggcctctgtg ccgaacgcga tggactatga cctgctgctg 2040 cgcttagagc cccaggtccc tgagcaatgg gcagagttgg aactgattgt gcagcgtcca 2100 gggcctgtgc ctgcccacag cctgtgtggg catttggtgc ccagggatga tcgcatccaa 2160 gggactctgc aaccacatgc caggtacttg atatttccta atcctgtctg ccttgagcct 2220 ggtatctcct acaagctgca tctgaagctg gtacggacag ggggaagtgc ccagcctgag 2280 actccctact ctggacctgg cctgctcatt gactcgctgg tgctgctgcc ccgtgtcctg 2340 gtgctagaga tgtttagtgg gggtgatgct gctgccctgg agcgccaggc cacctttgaa 2400 cgctaccaat gccatgagga gggtctggtg cccagcaaga cttctccctc tgaggcctgc 2460 gcacccctcc tcatcagcct gtccaccctc atctacaatg gtgccctgcc atgtcagtgc 2520 aaccctcaag gttcactgag ttctgagtgc aaccctcatg gtggtcagtg cctgtgcaag 2580 cctggagtgg ttgggcgccg ctgtgacacg tgtgcccctg gctactatgg ctttggcccc 2640 acaggctgtc aagcctgcca gtgcagccca cgaggggcac tcagcagtct ctgtgaaagg 2700 accagtgggc aatgtctctg tcgaactggt gcctttgggc ttcgctgtga cgcctgccag 2760 cgtggccagt ggggattccc tagctgccgg ccatgtgtct gcaatgggca tgcagatgag 2820 tgcaacaccc acacaggcgc ttgcctgggc tgccgtgatc tcacaggggg tgagcactgt 2880 gaaaggtgca ttgctggttt ccacggggac ccacggctgc catatggggc gcagtgccgg 2940 ccctgtccct gtcctgaagg ccctgggagc caacggcact ttgctacttc ttgccaccag 3000 gatgaatatt cccagcagat tgtgtgccac tgccgggcag gctatacggg gctgcgatgt 3060 gaagcttgtg cccctgggca gtttggggac ccatcaaggc caggtggccg gtgccaactg 3120 tgtgagtgca gtgggaacat tgacccaatg gatcctgatg cctgtgaccc acaccccggg 3180 caatgcctgc gctgtttaca ccacacagag ggtccacact gtgcccactc gaagcctggc 3240 ttccatggcc aggctgcccg gcagagctgt caccgctgca catgcaacct gctgggcaca 3300 aatccgcagc agtgcccatc tcctgaccag tgccactgtg atccaagcag tgggcagtgc 3360 ccatgcctcc ccaatgtcca ggccctagct gtagaccgct gtgcccccaa cttctggaac 3420 ctcaccagtg gccatggttg ccagccttgt gcctgcctcc caagcccgga agaaggcccc 3480 acctgcaacg agttcacagg gcagtgccac tgcctgtgcg gctttggagg gcggacttgt 3540 
tctgagtgcc aagagctcca ctggggagac cctgggttgc agtgccatgc ctgtgattgt 3600 gactctcgtg gaatagatac acctcagtgt caccgcttca caggtcactg cacgtgccgc 3660 ccaggggtgt ctggtgtgcg ctgtgaccag tgtgcccgtg gcttctcagg aatctttcct 3720 gcctgccatc cctgccatgc atgcttcggg gattgggacc gagtggtgca ggacttggca 3780 gcccgtacac agcgcctaga gcagcgggcg caggagttgc aacagacggg tgtgctgggt 3840 gcctttgaga gcagcttctg gcacatgcag gagaagctgg gcattgtgca gggcatcgta 3900 ggtgcccgca acacctcagc cgcctccact gcacagcttg tggaggccac agaggagctg 3960 cggcgtgaaa ttggggaggc cactgagcac ctgactcagc tcgaggcaga cctgacagat 4020 gtgcaagatg agaacttcaa tgccaaccat gcactaagtg gtctggagcg agataggctt 4080 gcacttaatc tcacactgcg gcagctcgac cagcatcttg acttgctcaa acattcaaac 4140 ttcctgggtg cctatgacag catccggcat gcccatagcc agtctgcaga ggcagaacgt 4200 cgtgccaata cctcagccct ggcagtacct agccctgtga gcaactcggc aagtgctcgg 4260 catcggacag aggcactgat ggatgctcag aaggaggact tcaacagcaa acacatggcc 4320 aaccagcggg cacttggcaa gctctctgcc catacccaca ccctgagcct gacagacata 4380 aatgagctgg tgtgtggggc ccagggattg catcatgatc gtacaagccc ttgtgggggt 4440 gccggctgtc gagatgagga tgggcagccg cgctgtgggg gcctcagctg caatggggca 4500 gcggctacag cagacctagc actgggccgg gcccggcaca cacaggcaga gctgcagcgg 4560 gcactggcag aaggtggtag catcctcagc agagtggctg agactcgtcg gcaggcaagc 4620 gaggcacagc agcgggccca ggcagccctg gacaaggcta atgcttccag gggacaggtg 4680 gaacaggcca accaggaact tcaagaactt atccagagtg tgaaggactt cctcaaccag 4740 gagggggctg atcctgatag cattgaaatg gtggccacac gggtgctaga gctctccatc 4800 ccagcttcag ctgagcagat ccagcacctg gcgggcgcga ttgcagagcg agtccggagc 4860 ctggcagatg tggatgcgat cctggcacgt actgtaggag atgtgcgtcg tgccgagcag 4920 ctactgcagg atgcacggcg ggcaaggagc tgggctgagg atgagaaaca gaaggcagag 4980 acagtacagg cagcactgga ggaggcccag cgggcacagg gtattgccca gggtgccatc 5040 cggggggcag tggctgacac acgggacaca gagcagaccc tgtaccaggt acaggagagg 5100 atggcaggtg cagagcgggc actgagctct gcaggtgaaa gggctcggca gttggatgct 5160 ctcctggagg ctctgaaatt gaaacgggca ggaaatagtc tggcagcctc tacagcagaa 5220 gaaacggcag 
gcagtgccca gggtcgtgcc caggaggctg agcagctgct acgcggtcct 5280 ctgggtgatc agtaccagac ggtgaaggcc ctagctgagc gcaaggccca aggtgtgctg 5340 gctgcacagg caagggcaga acaactgccg gatgaggctc gggacctgtt gcaagccgct 5400 caggacaagc tgcagcggct acaggaattg gaaggcacct atgaggaaaa tgagcgggca 5460 ctggagagta aggcagccca gttggacggg ttggaggcca ggatgcgcag cgtgcttcaa 5520 gccatcaact tgcaggtgca gatctacaac acctgccagt gacccctgcc caaggcctac 5580 cccagttcct agcactgccc cacatgcatg tctgcctatg cactgaagag ctcttggccc 5640 ggcagggccc ccaataaacc agtgtgaacc cccaaaaaaa aaa 5683 79 5177 DNA Homo sapiens 79 ggactgcgaa aggagcaggg ttgcggagct agggctccag cctgcggccg cgcattcttg 60 cgtctggcca gccgcgagct ctaagggtcg gccccgcccg gtccgccccc gcggctccct 120 gccaggctct cgcgggcgcg ctcggggtgg ggcctcgcgg ctggcggaga tgcggccggg 180 gctgcgcggt ggtgatgcga gcctgctggg cggcgcgccg gggcagccgg agccgcgcgc 240 cgcggcgctg taatcggaca ccaagagcgc tcgcccccgg cctccggcca ctttccattc 300 actccgaggt gcttgattga gcgacgcgga gaagagctcc gggtgccgcg gcactgcagc 360 gctgagattc ctttacaaag aaactcagag gaccgggaag aaagaatttc acctttgcga 420 cgtgctagaa aataaggtcg tctgggaaaa ggactggaga cacaagcgca tccaaccccg 480 gtagcaaact gatgactttt ccgtgctgat ttctttcaac ctcggtattt tcccttggat 540 attaacttgc atatctgaag aaatggcatt ccggacaatt tgcgtgttgg ttggagtatt 600 tatttgttct atctgtgtga aaggatcttc ccagccccaa gcaagagttt atttaacatt 660 tgatgaactt cgagaaacca agacctctga atacttcagc ctttcccacc atcctttaga 720 ctacaggatt ttattaatgg atgaagatca ggaccggata tatgtgggaa gcaaagatca 780 cattctttcc ctgaatatta acaatataag tcaagaagct ttgagtgttt tctggccagc 840 atctacaatc aaagttgaag aatgcaaaat ggctggcaaa gatcccacac acggctgtgg 900 gaactttgtc cgtgtaattc agactttcaa tcgcacacat ttgtatgtct gtgggagtgg 960 cgctttcagt cctgtctgta cttacttgaa cagagggagg agatcagagg accaagtttt 1020 catgattgac tccaagtgtg aatctggaaa aggacgctgc tctttcaacc ccaacgtgaa 1080 cacggtgtct gttatgatca atgaggagct tttctctgga atgtatatag atttcatggg 1140 gacagatgct gctatttttc gaagtttaac caagaggaat gcggtcagaa ctgatcaaca 1200 taattccaaa tggctaagtg 
aacctatgtt tgtagatgca catgtcatcc cagatggtac 1260 tgatccaaat gatgctaagg tgtacttctt cttcaaagaa aaactgactg acaataacag 1320 gagcacgaaa cagattcatt ccatgattgc tcgaatatgt cctaatgaca ctggtggact 1380 gcgtagcctt gtcaacaagt ggaccacttt cttaaaggcg aggctggtgt gctcggtaac 1440 agatgaagac ggcccagaaa cacactttga tgaattagag gatgtgtttc tgctggaaac 1500 tgataacccg aggacaacac tagtgtatgg catttttaca acatcaagct cagttttcaa 1560 aggatcagcc gtgtgtgtgt atcatttatc tgatatacag actgtgttta atgggccttt 1620 tgcccacaaa gaagggccca atcatcagct gatttcctat cagggcagaa ttccatatcc 1680 tcgccctgga acttgtccag gaggagcatt tacacccaat atgcgaacca ccaaggagtt 1740 cccagatgat gttgtcactt ttattcggaa ccatcctctc atgtacaatt ccatctaccc 1800 aatccacaaa aggcctttga ttgttcgtat tggcactgac tacaagtaca caaagatagc 1860 tgtggatcga gtgaacgctg ctgatgggag ataccatgtc ctgtttctcg gaacagatcg 1920 gggtactgtg caaaaagtgg ttgttcttcc tactaacaac tctgtcagtg gcgagctcat 1980 tctggaggag ctggaagtct ttaagaatca tgctcctata acaacaatga aaatttcatc 2040 taaaaagcaa cagttgtatg tgagttccaa tgaaggggtt tcccaagtat ctctgcaccg 2100 ctgccacatc tatggtacag cctgtgctga ctgctgcctg gcgcgggacc cttattgcgc 2160 ctgggatggc cattcctgtt ccagattcta cccaactggg aaacggagga gccgaagaca 2220 agatgtgaga catggaaacc cactgactca atgcagagga tttaatctaa aagcatacag 2280 aaatgcagct gaaattgtgc agtatggagt aaaaaataac accacttttc tggagtgtgc 2340 ccccaagtct ccgcaggcat ctatcaagtg gctgttacag aaagacaaag acaggaggaa 2400 agaggttaag ctgaatgaac gaataatagc cacttcacag ggactcctga tccgctctgt 2460 tcagggttct gaccaaggac tttatcactg cattgctaca gaaaatagtt tcaagcagac 2520 catagccaag atcaacttca aagttttaga ttcagaaatg gtggctgttg tgacggacaa 2580 atggtccccg tggacctggg ccagctctgt gagggcttta cccttccacc cgaaggacat 2640 catgggggca ttcagccact cagaaatgca gatgattaac caatactgca aagacactcg 2700 gcagcaacat cagcagggag atgaatcaca gaaaatgaga ggggactatg gcaagttaaa 2760 ggccctcatc aatagtcgga aaagtagaaa caggaggaat cagttgccag agtcataata 2820 ttttcttatg tgggtcttat gcttccatta acaaatgctc tgtcttcaat gatcaaattt 2880 tgagcaaaga aacttgtgct ttaccaaggg 
gaattactga aaaaggtgat tactcctgaa 2940 gtgagtttta cacgaactga aatgagcatg cattttcttg tatgatagtg actagcacta 3000 gacatgtcat ggtcctcatg gtgcatataa atatatttaa cttaacccag attttattta 3060 tatctttatt caccttttct tcaaaatcga tatggtggct gcaaaactag aattgttgca 3120 tccctcaatt gaatgagggc catatccctg tggtattcct ttcctgcttt ggggctttag 3180 aattctaatt gtcagtgatt ttgtatatga aaacaagttc caaatccaca gcttttacgt 3240 agtaaaagtc ataaatgcat atgacagaat ggctatcaaa agaaatagaa aaggaagacg 3300 gcatttaaag ttgtataaaa acacgagtta ttcataaaga gaaaatgatg agtttttatg 3360 gttccaatga aatatcttcc ccttttttta agattgtaaa aataatcagt tactggtatc 3420 tgtcactgac ctttgtttcc ttattcagga agataaaaat cagtaaccta ccccatgaag 3480 atatttggtg ggagttatat cagtgaagca gtttggttta tattcttatg ttatcacctt 3540 ccaaacaaaa gcacttactt tttttggaag ttatttaatt tattttagac tcaaagaata 3600 taatcttgca ctactcagtt attactgttt gttctcttat tccctagtct gtgtggcaaa 3660 ttaaacaata taagaaggaa aaatttgaag tattagactt ctaaataagg ggtgaaatca 3720 tcagaaagaa aaatcaaagt agaaactact aattttttaa gaggaattta taacaaatat 3780 ggctagtttt caacttcagt actcaaattc aatgattctt ccttttatta aaaccagtct 3840 cagatatcat actgattttt aagtcaacac tatatatttt atgatctttt cagtgtgatg 3900 gcaaggtgct tgttatgtct agaaagtaag aaaacaatat gaggagacat tctgtctttc 3960 aaaaggtaat ggtacatacg ttcactggtc tctaagtgta aaagtagtaa attttgtgat 4020 gaataaaata attatctcct aattgtatgt tagaataatt ttattagaat aatttcatac 4080 tgaaattatt ttctccaaat aaaaattaga tggaaaaatg tgaaaaaaat tattcatgct 4140 ctcatatata ttttaaaaac actacttttg cttttttatt taccttttaa gacattttca 4200 tgcttccagg taaaaacaga tattgtacca tgtacctaat ccaaatatca tataaacatt 4260 ttatttatag ttaataatct atgatgaagg taattaaagt agattatggc ctttttaagt 4320 attgcagtct aaaacttcaa aaactaaaat cattgtcaaa attaatatga ttattaatca 4380 gaatatcaga tatgattcac tatttaaact atgataaatt atgataatat atgaggaggc 4440 ctcgctatag caaaaatagt taaaatgctg acataacacc aaacttcatt ttttaaaaaa 4500 tctgttgttc caaatgtgta taattttaaa gtaatttcta aagcagttta ttataatggt 4560 ttgcctgctt aaaaggtata attaaacttc ttttctcttc 
tacattgaca cacagaaatg 4620 tgtcaatgta aagccaaaac catcttctgt gtttatggcc aatctattct caaagttaaa 4680 agtaaaattg tttcagagtc acagttccct ttatttcaca taagcccaaa ctgatagaca 4740 gtaacggtgt ttagttttat actatatttg tgctatttaa ttctttctat tttcacaatt 4800 attaaattgt gtacactttc attactttta aaaatgtaga aattcttcat gaacataact 4860 ctgctgaatg taaaagaaaa ttttttttca aaaatgctgt taatgtatac tactggtggt 4920 tgattggttt tattttatgt agcttgacaa ttcagtgact taatatctat tccatttgta 4980 ttgtacataa aattttctag aaatacactt ttttccaaag tgtaagtttg tgaatagatt 5040 ttagcatgat gaaactgtca taatggtgaa tgttcaatct gtgtaagaaa acaaactaaa 5100 tgtagttgtc acactaaaat ttaattggat attgatgaaa tcattggcct ggcaaaataa 5160 aacatgttga attcccc 5177 80 9164 DNA Homo sapiens 80 ggctggaggg gcgctgggct cggacctgcc aaggccacgg gggagcaagg gacagaggcg 60 ggggtcctag ctgacggctt ttactgccta ggatgacgct gcggcttctg gtggccgcgc 120 tctgcgccgg gatcctggca gaggcgcccc gagtgcgagc ccagcacagg gagagagtga 180 cctgcacgcg cctttacgcc gctgacattg tgttcttact ggatggctcc tcatccattg 240 gccgcagcaa tttccgcgag gtccgcagct ttctcgaagg gctggtgctg cctttctctg 300 gagcagccag tgcacagggt gtgcgctttg ccacagtgca gtacagcgat gacccacgga 360 cagagttcgg cctggatgca cttggctctg ggggtgatgt gatccgcgcc atccgtgagc 420 ttagctacaa ggggggcaac actcgcacag gggctgcaat tctccatgtg gctgaccatg 480 tcttcctgcc ccagctggcc cgacctggtg tccccaaggt ctgcatcctg atcacagacg 540 ggaagtccca ggacctggtg gacacagctg cccaaaggct gaaggggcag ggggtcaagc 600 tatttgctgt ggggatcaag aatgctgacc ctgaggagct gaagcgagtt gcctcacagc 660 ccaccagtga cttcttcttc ttcgtcaatg acttcagcat cttgaggaca ctactgcccc 720 tcgtttcccg gagagtgtgc acgactgctg gtggcgtgcc tgtgacccga cctccggatg 780 actcgacctc tgctccacga gacctggtgc tgtctgagcc aagcagccaa tccttgagag 840 tacagtggac agcggccagt ggccctgtga ctggctacaa ggtccagtac actcctctga 900 cggggctggg acagccactg ccgagtgagc ggcaggaggt gaacgtccca gctggtgaga 960 ccagtgtgcg gctgcggggt ctccggccac tgaccgagta ccaagtgact gtgattgccc 1020 tctacgccaa cagcatcggg gaggctgtga gcgggacagc tcggaccact gccctagaag 1080 ggccggaact 
gaccatccag aataccacag cccacagcct cctggtggcc tggcggagtg 1140 tgccaggtgc cactggctac cgtgtgacat ggcgggtcct cagtggtggg cccacacagc 1200 agcaggagct gggccctggg cagggttcag tgttgctgcg tgacttggag cctggcacgg 1260 actatgaggt gaccgtgagc accctatttg gccgcagtgt ggggcccgcc acttccctga 1320 tggctcgcac tgacgcttct gttgagcaga ccctgcgccc ggtcatcctg ggccccacat 1380 ccatcctcct ttcctggaac ttggtgcctg aggcccgtgg ctaccggttg gaatggcggc 1440 gtgagactgg cttggagcca ccgcagaagg tggtactgcc ctctgatgtg acccgctacc 1500 agttggatgg gctgcagccg ggcactgagt accgcctcac actctacact ctgctggagg 1560 gccacgaggt ggccacccct gcaaccgtgg ttcccactgg accagagctg cctgtgagcc 1620 ctgtaacaga cctgcaagcc accgagctgc ccgggcagcg ggtgcgagtg tcctggagcc 1680 cagtccctgg tgccacccag taccgcatca ttgtgcgcag cacccagggg gtggagcgga 1740 ccctggtgct tcctgggagt cagacagcat tcgacttgga tgacgttcag gctgggctta 1800 gctacactgt gcgggtgtct gctcgagtgg gtccccgtga gggcagtgcc agtgtcctca 1860 ctgtccgccg ggagctggaa actccacttg ctgttccagg gctgcgggtt gtggtgtcag 1920 atgcaacgcg agtgagggtg gcctggggac ccgtccctgg agccagtgga tttcggatta 1980 gctggagcac aggcagtggt ccggagtcca gccagacact gcccccagac tctactgcca 2040 cagacatcac agggctgcag cctggaacca cctaccaggt ggctgtgtcg gtactgcgag 2100 gcagagagga gggccctgct gcagtcatcg tggctcgaac ggacccactg ggcccagtga 2160 ggacggtcca tgtgactcag gccagcagct catctgtcac cattacctgg accagggttc 2220 ctggcgccac aggatacagg gtttcctggc actcagccca cggcccagag aaatcccagt 2280 tggtttctgg ggaggccacg gtggctgagc tggatggact ggagccagat actgagtata 2340 cggtgcatgt gagggcccat gtggctggcg tggatgggcc ccctgcctct gtggttgtga 2400 ggactgcccc tgagcctgtg ggtcgtgtgt cgaggctgca gatcctcaat gcttccagcg 2460 acgttctacg gatcacctgg gtaggggtca ctggagccac agcttacaga ctggcctggg 2520 gccggagtga aggcggcccc atgaggcacc agatactccc aggaaacaca gactctgcag 2580 agatccgggg tctcgaaggt ggagtcagct actcagtgcg agtgactgca cttgtcgggg 2640 accgcgaggg cacacctgtc tccattgttg tcactacgcc gcctgaggct ccgccagccc 2700 tggggacgct tcacgtggtg cagcgcgggg agcactcgct gaggctgcgc tgggagccgg 2760 tgcccagaga gcagggcttc 
cttctgcact ggcaacctga gggtggccag gaacagtccc 2820 gggtcctggg gcccgagctc agcagctatc acctggacgg gctggagcca gcgacacagt 2880 accgcgtgag gctgagtgtc ctagggccag ctggagaagg gccctctgca gaggtgactg 2940 cgcgcactga gtcacctcgt gttccaagca ttgaactacg tgtggtggac acctcgatcg 3000 actcggtgac tttggcctgg actccagtgt ccagggcatc cagctacatc ctatcctggc 3060 ggccactcag aggccctggc caggaagtgc ctgggtcccc gcagacactt ccagggatct 3120 caagctccca gcgggtgaca gggctagagc ctggcgtctc ttacatcttc tccctgacgc 3180 ctgtcctgga tggtgtgcgg ggtcctgagg catctgtcac acagacgcca gtgtgccccc 3240 gtggcctggc ggatgtggtg ttcctaccac atgccactca agacaatgct caccgtgcgg 3300 aggctacgag gagggtcctg gagcgtctgg tgttggcact tgggcctctt gggccacagg 3360 cagttcaggt tggcctgctg tcttacagtc atcggccttc cccactgttc ccactgaatg 3420 gctcccatga ccttggcatt atcttgcaaa ggatccgtga catgccctac atggacccaa 3480 gtgggaacaa cctgggcaca gccgtggtca cagctcacag atacatgttg gcaccagatg 3540 ctcctgggcg ccgccagcac gtaccagggg tgatggttct gctagtggat gaacccttga 3600 gaggtgacat attcagcccc atccgtgagg cccaggcttc tgggcttaat gtggtgatgt 3660 tgggaatggc tggagcggac ccagagcagc tgcgtcgctt ggcgccgggt atggactctg 3720 tccagacctt cttcgccgtg gatgatgggc caagcctgga ccaggcagtc agtggtctgg 3780 ccacagccct gtgtcaggca tccttcacta ctcagccccg gccagagccc tgcccagtgt 3840 attgtccaaa gggccagaag ggggaacctg gagagatggg cctgagagga caagttgggc 3900 ctcctggcga ccctggcctc ccgggcagga ccggtgctcc cggcccccag gggccccctg 3960 gaagtgccac tgccaagggc gagaggggct tccctggagc agatgggcgt ccaggcagcc 4020 ctggccgcgc cgggaatcct gggacccctg gagcccctgg cctaaagggc tctccagggt 4080 tgcctggccc tcgtggggac ccgggagagc gaggacctcg aggcccaaag ggggagccgg 4140 gggctcccgg acaagtcatc ggaggtgaag gacctgggct tcctgggcgg aaaggggacc 4200 ctggaccatc gggcccccct ggacctcgtg gaccactggg ggacccagga ccccgtggcc 4260 ccccagggct tcctggaaca gccatgaagg gtgacaaagg cgatcgtggg gagcggggtc 4320 cccctggacc aggtgaaggt ggcattgctc ctggggagcc tgggctgccg ggtcttcccg 4380 gaagccctgg accccaaggc cccgttggcc cccctggaaa gaaaggagaa aaaggtgact 4440 ctgaggatgg agctccaggc ctcccaggac 
aacctgggtc tccgggtgag cagggcccac 4500 ggggacctcc tggagctatt ggccccaaag gtgaccgggg ctttccaggg cccctgggtg 4560 aggctggaga gaagggcgaa cgtggacccc caggcccagc gggatcccgg gggctgccag 4620 gggttgctgg acgtcctgga gccaagggtc ctgaagggcc accaggaccc actggccgcc 4680 aaggagagaa gggggagcct ggtcgccctg gggaccctgc agtggtggga cctgctgttg 4740 ctggacccaa aggagaaaag ggagatgtgg ggcccgctgg gcccagagga gctaccggag 4800 tccaagggga acggggccca cccggcttgg ttcttcctgg agaccctggc cccaagggag 4860 accctggaga ccggggtccc attggcctta ctggcagagc aggaccccca ggtgactcag 4920 ggcctcctgg agagaaggga gaccctgggc ggcctggccc cccaggacct gttggccccc 4980 gaggacgaga tggtgaagtt ggagagaaag gtgacgaggg tcctccgggt gacccgggtt 5040 tgcctggaaa agcaggcgag cgtggccttc ggggggcacc tggagttcgg gggcctgtgg 5100 gtgaaaaggg agaccaggga gatcctggag aggatggacg aaatggcagc cctggatcat 5160 ctggacccaa gggtgaccgt ggggagccgg gtcccccagg acccccggga cggctggtag 5220 acacaggacc tggagccaga gagaagggag agcctgggga ccgcggacaa gagggtcctc 5280 gagggcccaa gggtgatcct ggcctccctg gagcccctgg ggaaaggggc attgaagggt 5340 ttcggggacc cccaggccca cagggggacc caggtgtccg aggcccagca ggagaaaagg 5400 gtgaccgggg tccccctggg ctggatggcc ggagcggact ggatgggaaa ccaggagccg 5460 ctgggccctc tgggccgaat ggtgctgcag gcaaagctgg ggacccaggg agagacgggc 5520 ttccaggcct ccgtggagaa caaggcctcc ctggcccctc tggtccccct ggattaccgg 5580 gaaagccagg cgaggatggg aaacctggcc tgaatggaaa aaacggagaa cctggggacc 5640 ctggagaaga cgggaggaag ggagagaaag gagattcagg cgcctctggg agagaaggtt 5700 ttcctggtgt cccaggaggc acgggcccca agggtgaccg tggggagact ggatccaaag 5760 gggagcaggg cctccctgga gagcgtggcc tgcgaggaga gcctggaagt gtgccgaatg 5820 tggatcggtt gctggaaact gctggcatca aggcatctgc cctgcgggag atcgtggaga 5880 cctgggatga gagctctggt agcttcctgc ctgtgcccga acggcgtcga ggccccaagg 5940 gggactcagg cgaacagggc cccccaggca aggagggccc catcggcttt cctggagaac 6000 gcgggctgaa gggcgaccgt ggagaccctg gccctcaggg gccacctggt ctggcccttg 6060 gggagagggg cccccccggg ccttccggcc ttgccgggga gcctggaaag cctggtattc 6120 ccgggctccc aggcagggct gggggtgtgg gagaggcagg 
aaggccagga gagaggggag 6180 aacggggaga gaaaggagaa cgtggagaac agggcagaga tggccctcct ggactccctg 6240 gaacccctgg gccccccgga ccccctggcc ccaaggtgtc tgtggatgag ccaggtcctg 6300 gactctctgg agaacaggga ccccctggac tcaagggtgc taagggggag ccgggcagca 6360 atggtgacca aggtcccaaa ggagacaggg gtgtgccagg catcaaagga gaccggggag 6420 agcctggacc gaggggtcag gacggcaacc cgggtctacc aggagagcgt ggtatggctg 6480 ggcctgaagg gaagccgggt ctgcagggtc caagaggccc ccctggccca gtgggtggtc 6540 atggagaccc tggaccacct ggtgccccgg gtcttgctgg ccctgcagga ccccaaggac 6600 cttctggcct gaagggggag cctggagaga caggacctcc aggacggggc ctgactggac 6660 ctactggagc tgtgggactt cctggacccc ccggcccttc aggccttgtg ggtccacagg 6720 ggtctccagg tttgcctgga caagtggggg agacagggaa gccgggagcc ccaggtcgag 6780 atggtgccag tggaaaagat ggagacagag ggagccctgg tgtgccaggg tcaccaggtc 6840 tgcctggccc tgtcggacct aaaggagaac ctggccccac gggggcccct ggacaggctg 6900 tggtcgggct ccctggagca aagggagaga agggagcccc tggaggcctt gctggagacc 6960 tggtgggtga gccgggagcc aaaggtgacc gaggactgcc agggccgcga ggcgagaagg 7020 gtgaagctgg ccgtgcaggg gagcccggag accctgggga agatggtcag aaaggggctc 7080 caggacccaa aggtttcaag ggtgacccag gagtcggggt cccgggctcc cctgggcctc 7140 ctggccctcc aggtgtgaag ggagatctgg gcctccctgg cctgcccggt gctcctggtg 7200 ttgttgggtt cccgggtcag acaggccctc gaggagagat gggtcagcca ggccctagtg 7260 gagagcgggg tctggcaggc cccccaggga gagaaggaat cccaggaccc ctggggccac 7320 ctggaccacc ggggtcagtg ggaccacctg gggcctctgg actcaaagga gacaagggag 7380 accctggagt agggctgcct gggccccgag gcgagcgtgg ggagccaggc atccggggtg 7440 aagatggccg ccccggccag gagggacccc gaggactcac ggggccccct ggcagcaggg 7500 gagagcgtgg ggagaagggt gatgttggga gtgcaggact aaagggtgac aagggagact 7560 cagctgtgat cctggggcct ccaggcccac ggggtgccaa gggggacatg ggtgaacgag 7620 ggcctcgggg cttggatggt gacaaaggac ctcggggaga caatggggac cctggtgaca 7680 agggcagcaa gggagagcct ggtgacaagg gctcagccgg gttgccagga ctgcgtggac 7740 tcctgggacc ccagggtcaa cctggtgcag cagggatccc tggtgacccg ggatccccag 7800 gaaaggatgg agtgcctggt atccgaggag aaaaaggaga tgttggcttc 
atgggtcccc 7860 ggggcctcaa gggtgaacgg ggagtgaagg gagcctgtgg ccttgatgga gagaagggag 7920 acaagggaga agctggtccc ccaggccgcc ccgggctggc aggacacaaa ggagagatgg 7980 gggagcctgg tgtgccgggc cagtcggggg cccctggcaa ggagggcctg atcggtccca 8040 agggtgaccg aggctttgac gggcagccag gccccaaggg tgaccagggc gagaaagggg 8100 agcggggaac cccaggaatt gggggcttcc caggccccag tggaaatgat ggctctgctg 8160 gtcccccagg gccacctggc agtgttggtc ccagaggccc cgaaggactt cagggccaga 8220 agggtgagcg aggtcccccc ggagagagag tggtgggggc tcctggggtc cctggagctc 8280 ctggcgagag aggggagcag gggcggccag ggcctgccgg tcctcgaggc gagaagggag 8340 aagctgcact gacggaggat gacatccggg gctttgtgcg ccaagagatg agtcagcact 8400 gtgcctgcca gggccagttc atcgcatctg gatcacgacc cctccctagt tatgctgcag 8460 acactgccgg ctcccagctc catgctgtgc ctgtgctccg cgtctctcat gcagaggagg 8520 aagagcgggt accccctgag gatgatgagt actctgaata ctccgagtat tctgtggagg 8580 agtaccagga ccctgaagct ccttgggata gtgatgaccc ctgttccctg ccactggatg 8640 agggctcctg cactgcctac accctgcgct ggtaccatcg ggctgtgaca ggcagcacag 8700 aggcctgtca cccttttgtc tatggtggct gtggagggaa tgccaaccgt tttgggaccc 8760 gtgaggcctg cgagcgccgc tgcccacccc gggtggtcca gagccagggg acaggtactg 8820 cccaggactg aggcccagat aatgagctga gattcagcat cccctggagg agtcggggtc 8880 tcagcagaac cccactgtcc ctccccttgg tgctagaggc ttgtgtgcac gtgagcgtgc 8940 gagtgcacgt ccgttatttc agtgacttgg tcccgtgggt ctagccttcc cccctgtgga 9000 caaaccccca ttgtggctcc tgccaccctg gcagatgact cactgtgggg gggtggctgt 9060 gggcagtgag cggatgtgac tggcgtctga cccgcccctt gacccaagcc tgtgatgaca 9120 tggtgctgat tctggggggc attaaagctg ctgttttaaa aggc 9164 81 2148 DNA Homo sapiens 81 gcttcagggt acagctcccc cgcagccaga agccgggcct gcagcccctc agcaccgctc 60 cgggacaccc cacccgcttc ccaggcgtga cctgtcaaca gcaacttcgc ggtgtggtga 120 actctctgag gaaaaaccat tttgattatt actctcagac gtgcgtggca acaagtgact 180 gagacctaga aatccaagcg ttggaggtcc tgaggccagc ctaagtcgct tcaaaatgga 240 acgaaggcgt ttgtggggtt ccattcagag ccgatacatc agcatgagtg tgtggacaag 300 cccacggaga cttgtggagc tggcagggca gagcctgctg aaggatgagg 
ccctggccat 360 tgccgccctg gagttgctgc ccagggagct cttcccgcca ctcttcatgg cagcctttga 420 cgggagacac agccagaccc tgaaggcaat ggtgcaggcc tggcccttca cctgcctccc 480 tctgggagtg ctgatgaagg gacaacatct tcacctggag accttcaaag ctgtgcttga 540 tggacttgat gtgctccttg cccaggaggt tcgccccagg aggtggaaac ttcaagtgct 600 ggatttacgg aagaactctc atcaggactt ctggactgta tggtctggaa acagggccag 660 tctgtactca tttccagagc cagaagcagc tcagcccatg acaaagaagc gaaaagtaga 720 tggtttgagc acagaggcag agcagccctt cattccagta gaggtgctcg tagacctgtt 780 cctcaaggaa ggtgcctgtg atgaattgtt ctcctacctc attgagaaag tgaagcgaaa 840 gaaaaatgta ctacgcctgt gctgtaagaa gctgaagatt tttgcaatgc ccatgcagga 900 tatcaagatg atcctgaaaa tggtgcagct ggactctatt gaagatttgg aagtgacttg 960 tacctggaag ctacccacct tggcgaaatt ttctccttac ctgggccaga tgattaatct 1020 gcgtagactc ctcctctccc acatccatgc atcttcctac atttccccgg agaaggaaga 1080 gcagtatatc gcccagttca cctctcagtt cctcagtctg cagtgcctgc aggctctcta 1140 tgtggactct ttatttttcc ttagaggccg cctggatcag ttgctcaggc acgtgatgaa 1200 ccccttggaa accctctcaa taactaactg ccggctttcg gaaggggatg tgatgcatct 1260 gtcccagagt cccagcgtca gtcagctaag tgtcctgagt ctaagtgggg tcatgctgac 1320 cgatgtaagt cccgagcccc tccaagctct gctggagaga gcctctgcca ccctccagga 1380 cctggtcttt gatgagtgtg ggatcacgga tgatcagctc cttgccctcc tgccttccct 1440 gagccactgc tcccagctta caaccttaag cttctacggg aattccatct ccatatctgc 1500 cttgcagagt ctcctgcagc acctcatcgg gctgagcaat ctgacccacg tgctgtatcc 1560 tgtccccctg gagagttatg aggacatcca tggtaccctc cacctggaga ggcttgccta 1620 tctgcatgcc aggctcaggg agttgctgtg tgagttgggg cggcccagca tggtctggct 1680 tagtgccaac ccctgtcctc actgtgggga cagaaccttc tatgacccgg agcccatcct 1740 gtgcccctgt ttcatgccta actagctggg tgcacatatc aaatgcttca ttctgcatac 1800 ttggacacta aagccaggat gtgcatgcat cttgaagcaa caaagcagcc acagtttcag 1860 acaaatgttc agtgtgagtg aggaaaacat gttcagtgag gaaaaaacat tcagacaaat 1920 gttcagtgag gaaaaaaagg ggaagttggg gataggcaga tgttgacttg aggagttaat 1980 gtgatctttg gggagataca tcttatagag ttagaaatag aatctgaatt tctaaaggga 2040 
gattctggct tgggaagtac atgtaggagt taatccctgt gtagactgtt gtaaagaaac 2100 tgttgaaaat aaagagaagc aatgtgaagc aaaaaaaaaa aaaaaaaa 2148 82 3370 DNA Homo sapiens 82 gcccccgccc ggcccgcccc gctctcctag tcccttgcaa cctggcgctg catccgggcc 60 actgtcccag gtcccaggtc ccggcccgga gctatggagc ggcgctggcc cctggggcta 120 gggctggtgc tgctgctctg cgccccgctg cccccggggg cgcgcgccaa ggaagttact 180 ctgatggaca caagcaaggc acagggagag ctgggctggc tgctggatcc cccaaaagat 240 gggtggagtg aacagcaaca gatactgaat gggacacccc tctacatgta ccaggactgc 300 ccaatgcaag gacgcagaga cactgaccac tggcttcgct ccaattggat ctaccgcggg 360 gaggaggctt cccgcgtcca cgtggagctg cagttcaccg tgcgggactg caagagtttc 420 cctgggggag ccgggcctct gggctgcaag gagaccttca accttctgta catggagagt 480 gaccaggatg tgggcattca gctccgacgg cccttgttcc agaaggtaac cacggtggct 540 gcagaccaga gcttcaccat tcgagacctt gcgtctggct ccgtgaagct gaatgtggag 600 cgctgctctc tgggccgcct gacccgccgt ggcctctacc tcgctttcca caacccgggt 660 gcctgtgtgg ccctggtgtc tgtccgggtc ttctaccagc gctgtcctga gaccctgaat 720 ggcttggccc aattcccaga cactctgcct ggccccgctg ggttggtgga agtggcgggc 780 acctgcttgc cccacgcgcg ggccagcccc aggccctcag gtgcaccccg catgcactgc 840 agccctgatg gcgagtggct ggtgcctgta ggacggtgcc actgtgagcc tggctatgag 900 gaaggtggca gtggcgaagc atgtgttgcc tgccctagcg gctcctaccg gatggacatg 960 gacacacccc attgtctcac gtgcccccag cagagcactg ctgagtctga gggggccacc 1020 atctgtacct gtgagagcgg ccattacaga gctcccgggg agggccccca ggtggcatgc 1080 acaggtcccc cctcggcccc ccgaaacctg agcttctctg cctcagggac tcagctctcc 1140 ctgcgttggg aacccccagc agatacgggg ggacgccagg atgtcagata cagtgtgagg 1200 tgttcccagt gtcagggcac agcacaggac ggggggccct gccagccctg tggggtgggc 1260 gtgcacttct cgccgggggc ccgggcgctc accacacctg cagtgcatgt caatggcctt 1320 gaaccttatg ccaactacac ctttaatgtg gaagcccaaa atggagtgtc agggctgggc 1380 agctctggcc atgccagcac ctcagtcagc atcagcatgg ggcatgcaga gtcactgtca 1440 ggcctgtctc tgagactggt gaagaaagaa ccgaggcaac tagagctgac ctgggcgggg 1500 tcccggcccc gaagccctgg ggcgaacctg acctatgagc tgcacgtgct gaaccaggat 1560 gaagaacggt 
accagatggt tctagaaccc agggtcttgc tgacagagct gcagcctgac 1620 accacataca tcgtcagagt ccgaatgctg accccactgg gtcctggccc tttctcccct 1680 gatcatgagt ttcggaccag cccaccagtg tccaggggcc tgactggagg agagattgta 1740 gccgtcatct ttgggctgct gcttggtgca gccttgctgc ttgggattct cgttttccgg 1800 tccaggagag cccagcggca gaggcagcag aggcacgtga ccgcgccacc gatgtggatc 1860 gagaggacaa gctgtgctga agccttatgt ggtacctcca ggcatacgag gaccctgcac 1920 agggagcctt ggactttacc cggaggctgg tctaattttc cttcccggga gcttgatcca 1980 gcgtggctga tggtggacac tgtcatagga gaaggagagt ttggggaagt gtatcgaggg 2040 accctcaggc tccccagcca ggactgcaag actgtggcca ttaagacctt aaaagacaca 2100 tccccaggtg gccagtggtg gaacttcctt cgagaggcaa ctatcatggg ccagtttagc 2160 cacccgcata ttctgcatct ggaaggcgtc gtcacaaagc gaaagccgat catgatcatc 2220 acagaattta tggagaatgc agccctggat gccttcctga gggagcggga ggaccagctg 2280 gtccctgggc agctagtggc catgctgcag ggcatagcat ctggcatgaa ctacctcagt 2340 aatcacaatt atgtccaccg ggacctggct gccagaaaca tcttggtgaa tcaaaacctg 2400 tgctgcaagg tgtctgactt tggcctgact cgcctcctgg atgactttga tggcacatac 2460 gaaacccagg gaggaaagat ccctatccgt tggacagccc ctgaagccat tgcccatcgg 2520 atcttcacca cagccagcga tgtgtggagc tttgggattg tgatgtggga ggtgctgagc 2580 tttggggaca agccttatgg ggagatgagc aatcaggagg ttatgaagag cattgaggat 2640 gggtaccggt tgccccctcc tgtggactgc cctgcccctc tgtatgagct catgaagaac 2700 tgctgggcat atgaccgtgc ccgccggcca cacttccaga agcttcaggc acatctggag 2760 caactgcttg ccaaccccca ctccctgcgg accattgcca actttgaccc cagggtgact 2820 cttcgcctgc ccagcctgag tggctcagat gggatcccgt atcgaaccgt ctctgagtgg 2880 ctcgagtcca tacgcatgaa acgctacatc ctgcacttcc actcggctgg gctggacacc 2940 atggagtgtg tgctggagct gaccgctgag gacctgacgc agatgggaat cacactgccc 3000 gggcaccaga agcgcattct ttgcagtatt cagggattca aggactgatc cctcctctca 3060 ccccatgccc aatcagggtg caaggagcaa ggacggggcc aaggtcgctc atggtcactc 3120 cctgcgcccc ttcccacaac ctgccagact aggctatcgg tgctgcttct gcccgcttta 3180 aggagaaccc tgctctgcac cccagaaaac ctctttgttt taaaagggag gtgggggtag 3240 aagtaaaagg atgatcatgg 
gagggagctc aggggttaat atatatacat acatacacat 3300 atatatattg ttgtaaataa acaggaaatg attttctgcc tccatcccac ccatcagggc 3360 tgcaggcact 3370 83 13863 DNA Homo sapiens misc_feature (1)..(13863) n = a, c, g or t 83 aagcttagga agcacaagag gctgagcctt tcaggtcagc aaagacttcc cagaggaggc 60 agtgcctaca ctgaggtcag agtgacaaga agagtaatgg accactgtaa agacttgggt 120 tcggccgggc gcggtggctc acgcctgtaa tcccagcact ttgggaggcc gaggcgggtg 180 gatcatgagg tcaggagatc gagaccatcc tggctaacaa ggtgaaaccc cgtctctact 240 aaaaatacag aaaattagcc gggcgcggtg gcgggcgcct gtggtcccag ctactcggga 300 ggctgaggca ggagaatggc gtgaacccgg gaagcggagc ttgcagtgag ccgagattgc 360 gccactgcag tccgcagtcc ggcctgggcg acagagcgag actccgtctc aaaaaaaaaa 420 aaagacttgg gtttgacttg attgagccca ggagttcgag acaagcctgg gcaatatagt 480 gagacctcat ctctacaaaa attttaaaaa ttagcctggt gcggtggctc atgcctgtaa 540 tcccagcact ctgggaggcc gaggtgggcg gatcacttga ggtcagaagt ttgagaccac 600 cctgaccaac atggagaaac cccgtctcta ctaaaaatac aaaattagcc gggcatggtg 660 gcgcatgcct gtaatcccag ctactcggga ggctgaggca ggagaattgt ttgaacctgg 720 gaggtggacg ttgcggtgag ccaagatcac actattgcac tccagcctgg gcaacaagag 780 caaaactccg tctcaaaaaa aaaaatttat ttttaaatta gccaggtgta gccacagctg 840 tagtcaaatc tactaggcag gctgaggtgg gaggattgct tgaacctggg aggcagaggt 900 tgcagtgagc caagatggtg ccacggcatt ccagcctgag caacagcaag accctgtgtc 960 caaaaaaaaa aaaaaaaaaa accgtaaaat aggccaggca cagtggttca tggttataag 1020 cctagcactt tggaaggctg aggagggtgg atcgcctgag ctcaggagtt caagaccagc 1080 ctgggcaaca cggtgaaacc ccatctctac caaaaaaaaa aaaaaaaaaa attagccagg 1140 catggtggtg tgtgcctgtg gtcccagcta ctcaggaggc tgaggtggaa gagtgcttgt 1200 gcctgggagg cagaggttcc agtgaaccga gatcacacca ttgtactcca gcctgggcaa 1260 cagagtaaga ccccatctca aaaaaaaaaa aaaaaattaa gataaaccct ttggcagctg 1320 cgtgctgctc ttagcctcaa acccaagtct tttttttccc cctttgagac ggggtctatt 1380 gcccaggctg gagtgcaatg gtatgatcca tactcactgc agccccgaac tcctgggctt 1440 ccaaagtgct gggattacag gtgtgagcca ccaggcccag actgctgaag ggtttaaacc 1500 agagaaagaa tgtgaccaga tttccaattt 
agaaagaccc gctctctgca gggtaaggag 1560 agcctggggg tccgggggcg gggggcaaga attgcaaggt aaccagggag gccagtgcaa 1620 tgtccaggtg ggagaggatg ctagctgaga ctagaagtgc taggaaaagg atgtgtgcag 1680 acaagaggtc actggggagg tgaaataaca aggcttggcc atgagtggaa cccaacaccc 1740 atggtgccct cttgagagag ggaagatggc acctgagatg gaagatggaa agaccagggt 1800 ccctgtgact gaggactgag cctctgtttg aggtttttgc agaggagtaa aggcaacaaa 1860 agaggcaaga gttggaagaa aggtgacaag gaacaaaagt cagctatgcc tgatgctact 1920 gggtggccag caacaatgct gacttggcca aggctctgag agctttacta tgctgggact 1980 ggaggtcaga gttgaggcta gggtaagagc aaggggctca gagatggagg gggaggagga 2040 cctgaacaag tccagaaggg aagagatttg tccctctatc caacagagta cccagtgagc 2100 agcacagagg gcacagcaag ggacatcacc cggttcccca aatgctcaga gccacaagtg 2160 aagccaaaag tgaaagacaa gatgcagaaa accgccacgg gcctttgagg aagggtaaag 2220 gcgaaagcga aagcaggaag tacagacgtg aagcctagca gaggactttt tagctgctca 2280 ctggccccgc ttgtctggcc gactcatccg cccgcgaccc ctaatcccct ctgcctgccc 2340 caagatgctg aagccagccc tggagccccg agggggcttc tccttcgaga actgccaaag 2400 gtgaagcggg ggcgcggggg gcggtcactc ctgagccgcc tctgcttgct cgtggccttt 2460 tttcctggct gggggtgggg gagggtgtgt tggtcgactt gggttccagg cttaccccgg 2520 aagatgaggg agacggggac caggttaggg gaagcaacag gggtcttgaa agcagagccg 2580 aaacatgggc gccctcctcc gtttccagaa atgcatcatt ggaacgcgtc ctcccggggc 2640 tcaaggtccc tcacgcacgc aagaccggga ccaccatcgc gggcctggtg ttccaagtga 2700 gcagcgggga gggacgggga gctggagggg agccgagagt atcgagcagg cactgaagct 2760 gcggtccctc cctctcctca ggacggggtc attctgggcg ccgatacgcg agccactaac 2820 gattcggtcg tggcggacaa gagctgcgag aagatccact tcatcgcccc caaaatctag 2880 tgagactccc gagcccagtt cccgtacgca aaaaagaacg gccccctcgt tcccactccg 2940 gtccccgcac gtcccagccc tgcccacacc gatcctccct tttgcctcag ctgctgtggg 3000 gctggagtag ccgcggacgc cgagatgacc acacggatgg tggcgtccaa gatggagcta 3060 cacgcgttat ctacgggccg cgagccccgc gtggccacgg tcactcgcat cctgcgccag 3120 acgctcttca ggtgcggggg cagggctaac aggaccccgg caggtagttt acggggttgg 3180 ggccattgga aggcgggaca gaaagaaggg cgggaccgcg 
acgggccagg tgaccggaag 3240 aggccggccc aagagaacct gggctacagg aaaaggcgat gtcagtcatc gggcgccagc 3300 ccacaggaag gagcggggat agcacctagg agctgggcat agagaggtgg gcctaggccc 3360 cagcttgtgg ccgaccccgc ccatcctcga gcaggtacca gggccacgtg ggtgcatcgc 3420 tgatcgtggg cggcgtagac ctgactggac cgcagctcta cggtgtgcat ccccatggct 3480 cctacagccg tctgcccttc acagccctgg gtgagcgctt ctgtcccttc tcctcgaact 3540 ctgcccctgg tgaccttggc ctcactccaa acggcgtcgc agcggttgac ttcagatgct 3600 tctcctgcct tcaggctctg gtcaggacgc ggccctggcg gtgctagaag accggttcca 3660 gccgaacatg acggtgagcg gcctctgtcc ccgactttgt ggtcgctggt gggatgtgca 3720 cccgggagct gggggagcac aggaccctgg cccagtgcgg gtggctaagg cttgtcggag 3780 gaggtgacca ctgaagggtg agtggagtaa gggcagagaa gtgcggtccc gacataacac 3840 cgtccaatac caaagcctgc acggctggga gaagtcgaag ctcacagagg atctttagga 3900 gccgagggcg gagagaagga ccagtagggt cctacttata tcaacgtctg gagcctagat 3960 tttgtttggg gtgggatgga agcaggtgat gttgcctcag aggtggctaa ggctcagagg 4020 gagaaacaca gtgggggttt ggagggcaag accagattgg gtaagtggac aggcaagtcc 4080 ccaggctgta gcctaagtta acagcagaga gagcccgtta ggtctcacac acccatcacc 4140 gcagctggag gctgctcagg ggctgctggt ggaagccgtc accgccggga tcttgggtga 4200 cctgggctcc gggggcaatg tggacgcatg tgtgatcaca aagactggcg ccaagctgct 4260 gcggacactg agctcaccca cagagcccgt gaagaggtga gagctggaga tcggggacca 4320 cagggatgtg tggggctata gcaggggaga tagggggctg caaaaagggg atgggccaca 4380 tgacaggccc atgttcagag gctgtccctc ctccctccca ggtctggccg ctaccacttt 4440 gtgcctggaa ccacagctgt cctgacccag acagtgaagc cactaaccct ggagctagtg 4500 gaggaaactg tgcaggctat ggaggtggag taagctgagg cttagagctt ggaacaaggg 4560 ggaataaacc cagaaaatac agttaaacag atggctgtgt cattcttgag tggaatgggg 4620 tgggcaggca gccagcaggg ctctgtagct aaggcgtccc tgcaggggcc attacctacc 4680 atagctctag tgtctggcct aagagatgcc cttcacccat aacctcaggc acctacaact 4740 ccagaacccc agccctggcc agcattgcag gcttggtctc cacccaaacc ttccttctga 4800 ctccacactt gaaggctccc ccaccactcc actgtcttgc tcttgccctc tagtccactg 4860 ggagacttgt aaattatgaa ataccccatg tactaccccc tcctagagac 
tttccatggc 4920 tcctcagtgg cccaggacaa gctcatacct ttcaatcagg cccccacagg ccccactgag 4980 ggctaaagtg ctgacaagag gagccgctcc ctgactccaa ggcaagttct caccaagcac 5040 tcctcaacct cgcaacatct ttacctgtga caccccttag atgacgaggc atgcctgcac 5100 tgctcacgtg aagctcgtct tctgtctgca catgctgggc ttgtgactcc aagttttcca 5160 ggctaataag ggtcacagga ctcacatggg gagagatgac acgtttctcc aacaaacctt 5220 tgctgggccc ctgctgagtc tcaggcctgg ctgctgggtg ccagcaagag catcctgtcc 5280 tcagcgagaa cggctgaact ccgctggagc ttcagaaatg tcagggagag tctacccagg 5340 gcccagggag ggtctatgcc gggctgcaca tccccaggct gctgagtgtg ctccctgcac 5400 cccaacattc tattaatgaa catttgtaaa tgtaacagaa aagtagaaag agttgtatat 5460 tgaataccct tatactgtca ggtcaccaca gacctgacag tattttgtta tatttgtttt 5520 atcatctatt catccctcta tccattaatt catcgctcct tttttttttt tttttttttt 5580 tttgagacgg cgtctcgctc tgtcacccag gctctggagt gcaaatattt tgttatattt 5640 gttttatcat ctattcatcc ctctatccat taattcatcg ctcctttttt tttttttttt 5700 tttgagacgg agtctcgctc tgtcacccag gctctggagt gcagtggcgc aatctcagct 5760 cactggaagc tccgcctccc aggttcacgc cattctcctg cctcagcctc ccgagtagct 5820 gggactacag gtgcccgcca ccacgcgcgg ctaatttttt tttttttttg tatttttagt 5880 agagacgagg ttctactgaa cctgttagcc aggatggtct ttgatctcct gacctcatga 5940 tccgcccgcg tcggcctccc aaagtgctgg gattacaggc gtgagccacc gtgcccagcc 6000 aattcatctc attttttggc tgatgctgtt tctttgagat ggggtctagc tccatcgccc 6060 aggccggaat gcagtggtgc actcatggct cactgcagcc ttgaacttaa gggctcaagt 6120 gatccctcct gcctcagcct tctgagttgc tgggactaca ggtgtgtacc atcataccca 6180 gcacatttct taatttaaaa aaattttttt tgtagagaca gggtttcatg atgttgctca 6240 ggctggtctc gaactcctgg aatcaagcct cctacgtctg cctcccaaag ttttgggatt 6300 acaggtgtga gccaccacac ccagccctga tctgttcttg aatcagttaa agccctcaca 6360 ctcccagaag gccgccagcc aatgcacctg ttggaacttt gcacacaggg tgtcttctcc 6420 cttcaagctt ggtctgcagc tcagtaacaa atgggctaca gacaccaggg gcttgcccat 6480 gggagcccca aggcctaaag agggtggcag agatttgatg tctgtcactc tccacctgca 6540 gcctcagtcc acggtcggcc aggcaccaag agctcacact ttgccctcct aaatgccagg 
6600 cccttcataa gtatcatctc attgttaaga gcggaggctt cagcgccaga caaatgcgag 6660 tttgcgtaca actcaaccac gtgctggtgg gagagtcacc atctctgagc agacctgtga 6720 ctcctgttcc aaatggacga ggaaccactg cgatgatgtg ttaggactcc cagcctgcca 6780 gaacctcaca gcccctggcc cttcacagca aagttgaccg cagtgagcat tccatccacc 6840 agtcagaaca ccctggacgc tgagcggacc ttctctgaaa gcctggtgcc tttgttagcc 6900 ctgggtgact cctgtgatcc cagccaccag gttgtcacta tagacctaat ttaaccatct 6960 gtcctcagta ccgagggctc aacatttgga atgggaggtg gttctgggag ccaattagag 7020 gccaggcttt gggaggtggc agaggtgagt ctcacacctt gggctctgtc tgataagtct 7080 aggtctcggt caggggacct tggcctaaag ggcctgtctt gcctggagcg tgggaggggg 7140 ctgagtctac acagctggcc tggcctcagg cctggagctt tagctcaagg acgagaagac 7200 ccataaagcc agacccagct cccaacctca catctgccac gatgttgctg ctcagcctga 7260 ccctaagcct ggttctcctc ggctcctcct ggggtgagtg ggccaggacc agccctgatt 7320 cagccctggg agcaactcag ctcccagcaa cagcccaggg aaggagctag gctggctgga 7380 agggacgaag gtggacagag tgggtaaaag aaacaggata tgccagggca gtggagcagg 7440 gaacagtcct gcagggctgg gagggggcaa gaggtggggt ggtctcacaa ataggaccag 7500 agattgagcc aggccctgga gcccgggagg gtttaggaag ctgagacagg aagacctgtc 7560 catgtctttt agaaagaacc ttctggctgc atgaagggta tgaactgttc aggtcgggag 7620 ggggcagaga gaccaggggt agagatgggg aacagcgggg actaggctgg agacagatgt 7680 aggagaacag cagggctggg ggactgggtg gatagggata accaagatag ctgtggggcc 7740 cgaaggtgct tgcatgtacc ctgttgggga aggggtagtg ctgtaccctc tcgacagacc 7800 tctctggggt gcacagcctg gggcacccaa aaggaggtgg ggaaagatgg gctgaggcat 7860 gggaagcagg tcctcattag cccaatggcc aggctgcggc attcctgcca tcaaaccggc 7920 actgagcttc agccagagga ttgtcaacgg ggagaatgca gtgttgggct cctggccctg 7980 gcaggtgtcc ctgcaggtac accaccagag gggtgggcag ggtcctgggt acgtcatgcc 8040 taggggcagc ctcagcagcc catccccact ctgacctctg agccctgacc acaggacagc 8100 agcggcttcc acttctgcgg tggttctctc atcagccagt cctgggtggt cactgctgcc 8160 cactgcaatg tcaggtgagt gcctgcattc cacctgcccc gcccctcgcc tcttcctgcc 8220 tcctcccctg gctgtccccc tctcgcgctg gcctccctgc agctgcctaa tcccaccccc 8280 
ttgcagccct ggccgccatt ttgttgtcct gggcgagtat gaccgatcat caaacgcaga 8340 gcccttgcag gttctgtccg tctctcgggt gagtgcctgg gctgcagaca cggaggaaaa 8400 gtgggcagtg caggtgggtg ggtgctggga acgaggaatt caggacatgc cctggcctac 8460 cctgctcagc acccatcaga acatggactg tttctgaccc cacaggccat tacacaccct 8520 agctggaact ctaccaccat gaacaatgac gtgacgctgc tgaagctcgc ctcgccagcc 8580 cagtacacaa cacgcatctc gccagtttgc ctggcatcct caaacgaggc tctgactgaa 8640 ggcctcacgt gtgtcaccac cggctggggt cgcctcagtg gcgtgggtag ggactcaggc 8700 caaagctcag ggtgggagga ctggggtggg gacagtgttc tgggccccat gtgaccaccc 8760 ctcctggcca caggcaatgt gacaccagca catctgcagc aggtggcttt gcccctggtc 8820 actgtgaatc agtgccggca gtactggggc tcaagtatca ctgactccat gatctgtgca 8880 ggtggcgcag gtgcctcctc gtgccaggta agccccagca cccgctcctc tgcgctgtcc 8940 tagtggtata cctccccaac cccccctact caattctccc tccctcttcc ctctcagggt 9000 gactccggag gccctcttgt ctgccagaag ggaaacacat gggtgcttat tggtattgtc 9060 tcctggggca ccaaaaactg caatgtgcgc gcacctgctg tgtatactcg agttagcaag 9120 ttcagcacct ggatcaacca ggtcatagcc tacaactgag ctcaccacag gccctcccca 9180 gctcaaccca ttaaagaccc aggccctgtc ccatcatgca ttcatgtctg tcttcctggc 9240 tcaggagaaa gaagaggctg ttgagggtcc gactccctac ttggacttct ggcacagaag 9300 gggctgagtg actccttgag tagcagtggc tcttcctaga gtagccatgc cgaggccggg 9360 gcccccaccc ctcctccagg gcaacccctt ggtcctacag caagaagcca gaactgttgg 9420 aatgaatggc agccctccct ggagaggcag cctgtttact gaatacagag gatacgttta 9480 caaactgaat acgcataata aataactgca cattctccat ccacaggcca tggcatgaag 9540 gcccaagtgg gtctatcaaa ggcccacatc tccaaacccc tgtcctgccc tcaggaccag 9600 gcccaccctg ggcaagagag aacgtaagcc ccagggcttc aggtccccag agacacttgg 9660 ggaactgggg ggaaattctg aggccatggg gcttggttct ccactgcctc ctgcccaggg 9720 ggatttgggg acggtaggag gatgtgtcta aggcatagtc gacttggcac agagtggtct 9780 ctttagtttt gtttcccact ggaggtggca catgcaggaa aagggcctgg cccaggctgc 9840 cgaccggcag aagctgagtg ggaaccaaac cctcctgcaa ttggcagggc cctgccgtca 9900 agctaaggcc aaagctgggc cctgggccca ttctacccac tgaaggcagc tgtggaggaa 9960 ggggcttggg 
ttccagcctg gtttgtggta gggggagata ccacaaaaga aatggggatg 10020 gttctggctc aggcctctgg gaaagcagcc acccaacccc acccacctcc cgcaggggct 10080 ccttccagct tgaggctcag tgggacccag actggaaggt taatgctgtg aagggaagca 10140 gcacagggtg gacggggcaa ggccagctgt gagaaggcag tgcccctggc accctggttt 10200 cagaggcagg tcacacagta tggctaagtt ccagggaggg gtgcgcagaa gctcagcaga 10260 aggggagagg tgagcagccc gggaccctcc cccagggcgg caactcctac cttcccatgt 10320 cctcatggag gactacaggt gtgcaccatg ggtgggtgtg cacgatgggc aggtgtgcac 10380 gatgggcgtg cagtgatcac tcccaggctg ccaacaccca tgcagacacc agatggcgcc 10440 ttcgtgcagc tgcagaggag ggagcaacag agcctgaagg gaaaaggcaa tggggctgca 10500 ccaaaggata gaacccaggc tgacactcga ccctaatcgg gaggaccccc ttccctctgc 10560 cttggccccc aggtgcccca ttccccaggt agcagcagtg gggctccctt taaccacccc 10620 cagttgggaa ggaggcacct ggggaatgga atggacatca acggggagag ggaggtagcg 10680 gtgctctaca aagaaggcac caagggcggt gggctgagac ccctcagaat cttggagagg 10740 ctggagcctg ggcaagccga tgaccagcat ggccacacag tccagaaggg tgaaggtcca 10800 cgccatggcc ctccaccaga ggtcctggga ccaggaaggc tccctggagg caccatgaag 10860 gaagacagat cttggctggg aggtggaggg ctgtttcgac ctagccaggg gctacgggtc 10920 cagtcaaggc acaagctttg tgcctaccag ggtctcccac tggagcataa tcttaaggat 10980 caggatgcat gggaatgtgt gaaaccaggg agaagggctc tgtggaggaa agggggtccc 11040 agaagtaact gtcccaaagg gtcctgaggc cacaggacac tccacccagc actgcagttc 11100 cctttgattg gggaaaagtc aaagggcaag ggagacagtg aaggccaggt cctatccctt 11160 cccaactcca ccagagcagc tgcccaccaa gaggggtatc agtgccagcc aggctcccag 11220 ttcaggggga gtcacagccc cctgtgctac ctctactctg tcacacctgg cccaggccat 11280 ggtgaggaca ggggctgctg aaggcacaga gaaagggctg gagccagaca ttcttcacct 11340 actgtgggcc acataggcct atctccagag agggcatcgg acccagatgg caccacagtg 11400 tgtggccagg ctgggtcgtg ctgcatgtgt gcacagccag gcggctcagc cattgtattg 11460 ctgctggtag cgcaggttga gctcccgcag ctcccgttcc cgcacacggc gtgacttatt 11520 ggagcgtgtg gagcggctgg aacgcgtgga ctgggcagat ttggtgctct ggcagcgcga 11580 ggaggcacgt ttaaggaggt tctgggatat ggagcggtgc aggttcttca tggatgaaga 
11640 ggcagccatg ctcaccaccc acgggtgcct cagggcctgc agtgcagtca tacgggctcc 11700 agggtccact gtcagcaggc ggtcaatgaa gtccttggcc aggttggaca cactaggcca 11760 gggctagaga ccaaggacaa gcattagagt gagagcatct gacactgccc accccatctg 11820 gatgaggcca ctactcagca accctcccct ttccagagag aggtgctgcc cctcctctca 11880 tgtagcactt ggggcctccc cgcccaacgc tggctcaggc tgaacaaggg ctgctctcca 11940 ggtgatggag tctggcaagg aaggaaagga cctgtgcact ctcccaggga gcaaattcta 12000 tggtgcactg gacccgaagc ctggctccag ggagatggcc tctgccaaga ccccccggaa 12060 cgtgtcccag gagtatcata actcagggga ctgttagaga atgattcaaa ctttcccacc 12120 acatcctaag tcagattgaa gctccaatct ctggatgacc aggatcaggc tacttaaagg 12180 ggaacttcct agtccttaca gagaagatcc aacctctctc caactgccga agcagtggca 12240 gaagaccact gctccctgcc tctcctcccg gcatggggag gaaaggaaac aattcaaggc 12300 aactagattt cccagtcggc tgagggcagg cgatcccggg ccaggaagga accaggaccc 12360 ttctcagtgg caccctctgg cccgcattac ttctctaagc cacaaagggc tcctggcagt 12420 gctgtgcgcc agcctcattt tagtacattc tgtcccctgg gaggaactcc ataaagccca 12480 ctctgccaca tgcaccccgg gctgcctcat ctcagccccg aacccagcag ctgtctgtct 12540 cagggcctca ggttgtacgg ctgtcttcac ctgactggat cctcaggttc tcagggtaaa 12600 ggacacttgc tcagactccc tcttagcccc cagtgcttcc agcaattatt ccagctgtaa 12660 cgtgagactg caatttcatg ttcgtttagt attcccatga gatcatgctg agctggatga 12720 gcccggcctg gtgctgcgca tacaggaagc actcagtagg cacaggctca gacagtaaac 12780 aacccacggt gctgccggat gggtgccctt tcctggagct gcttccaggc cttggggctc 12840 agccaggtga gtccttgcgt ccctgcatct cctaggaaca cttctggcac gggctctgag 12900 gctcccccaa ggataggcag ctaggacctt tcctgagcct gctgcagatg actcaacagg 12960 gatgctaacg atcccctcat cttccttcct gccaggtgag gtctgcctgt tccacccatg 13020 gtacccttca ccttgaggaa cccctgaaca tgccctccag ggggttcagg aggatctgag 13080 agaccacctt cagggcaggt gcacagccat ctagcagaca cacacactca ctgactactg 13140 ctactcccag tctggctcgc ctgacctcca actctttccc tacccccttc cccactgcca 13200 cagagggatg aggcanngag aacacgcttc caccgtcctg aggaaggcnt ggggctacct 13260 gcagctgctg tcttcaccca ctctttggaa ggttattcca 
agttttactg agctgaagtg 13320 ggagcaacag gggaaccata ttcccaaaca cacctaacag ggtcatcctc atcagtgggc 13380 cagcagcaca cagtgactcc tggggagatg ctggccccag gaggaggaag tcagggtcca 13440 ggagcatgca gccaacgaag gcccatagat gccttactat ccaagggctg tgggtgggcg 13500 cagagagcaa cagccctccc cgacaggcag gtaagtctcc tgggggcttg tgtagttcaa 13560 gattcatatt gagggccagg cgtggtggct catgcctgta atcccagcac tttggggagg 13620 ctgaggcagg tggatcacaa ggtcatgaga tcaagaccat cctggccaac atggtgaaac 13680 cccgtctcta ctaaaaatac aaaaattagt cgggcgtggt ggcgtgcctg tagtccagct 13740 actcaggaag ctgaggcagg agaattgctt gaacctgaga ggcggaggtt gcagtgagcc 13800 aagatcgcac cactgcactc caggctggga aagagggggg ttccgtttcc aaaaaaaaaa 13860 aaa 13863 84 3044 DNA Homo sapiens misc_feature (1)..(3044) n = a, c, g or t 84 aggcagggcg ggcgggcgct ctaagggttc tgctctgact ccaggttggg acagcgtctt 60 cgctgctgct ggatagtcgt gttttcgggg atcgaggata ctcaccagaa accgaaaatg 120 ccgaaaccaa tcaatgtccg agttaccacc atggatgcag agctggagtt tgcaatccag 180 ccaaatacaa ctggaaaaca gctttttgat caggtggtaa agactatcgg cctccgggaa 240 gtgtggtact ttggcctcca ctatgtggat aataaaggat ttcctacctg gctgaagctg 300 gataagaagg tgtctgccca ggaggtcagg aaggagaatc ccctccagtt caagttccgg 360 gccaagttct accctgaaga tgtggctgag gagctcatcc aggacatcac ccagaaactt 420 ttcttcctcc aagtgaagga aggaatcctt agcgatgaga tctactgccc ccctgagact 480 gccgtgctct tggggtccta cgctgtgcag gccaagtttg gggactacaa caaagaagtg 540 cacaagtctg ggtacctcag ctctgagcgg ctgatccctc aaagagtgat ggaccagcac 600 aaacttacca gggaccagtg ggaggaccgg atccaggtgt ggcatgcgga acaccgtggg 660 atgctcaaag ataatgctat gttggaatac ctgaagattg ctcaggacct ggaaatgtat 720 ggaatcaact atttcgagat aaaaaacaag aaaggaacag acctttggct tggagttgat 780 gcccttggac tgaatattta tgagaaagat gataagttaa ccccaaagat tggctttcct 840 tggagtgaaa tcaggaacat ctctttcaat gacaaaaagt ttgtcattaa acccatcgac 900 aagaaggcac ctgactttgt gttttatgcc ccacgtctga gaatcaacaa gcggatcctg 960 cagctctgca tgggcaacca tgagttgtat atgcgccgca ggaagcctga caccatcgag 1020 gtgcagcaga tgaaggccca ggcccgggag gagaagcatc 
agaagcagct ggagcggcaa 1080 cagctggaaa cagagaagaa aaggagagaa accgtggaga gagagaaaga gcagatgatg 1140 cgcgagaagg aggagttgat gctgcggctg caggactatg aggagaagac aaagaaggca 1200 gagagagagc tctcggagca gattcagagg gccctgcagc tggaggagga gaggaagcgg 1260 gcacaggagg aggccgagcg cctagaggct gaccgtatgg ctgcactgcg ggctaaggag 1320 gagctggaga gacaggcggt ggatcagata aagagccagg agcagctggc tgcggagctt 1380 gcagaataca cagccaagat tgccctcctg gaagaggcgc ggaggcgcaa ggaggatgaa 1440 gttgaagagt ggcagcacag ggccaaagaa gcccaggatg acctggtgaa gaccaaggag 1500 gagctgcacc tggtgatgac agcacccccg cccccaccac cccccgtgta cgagccggtg 1560 agctaccatg tccaggagag cttgcaggat gagggcgcag agcccacggg ctacagcgcg 1620 gagctgtcta gtgagggcat ccgggatgac cgcaatgagg agaagcgcat cactgaggca 1680 gagaagaacg agcgtgtgca gcggcagctc gtgacgctga gcagcgagct gtcccaggcc 1740 cgagatgaga ataagaggac ccacaatgac atcatccaca acgagaacat gaggcaaggc 1800 cgggacaagt acaagacgct gcggcagatc cggcagggca acaccaagca gcgcatcgac 1860 gagttcgagg ccctgtaaca gccaggccag gaccaagggc agaggggtgc tcatagcggg 1920 cgctgccagc cccgccacgc ttgtctttag tgctccaagt ctaggaactc cctcagatcc 1980 cagttccttt agaaagcagt tacccaacag aaacattctg ggctgggaac cagggaggcg 2040 ccctggtttg ttttccccag ttgtaatagt gccaagcagg cctgattctc gcgattattc 2100 tcgaatcacc tcctgtgttg tgctgggagc aggactgatt gaattacgga aaatgcctgt 2160 aaagtctgag taagaaactt catgctggcc tgtgtgatac aagagtcagc atcattaaag 2220 gaaacgtggc aggacttcca tctgtgccat acttgttctg tattcgaaat gagctcaaat 2280 tgattttttt aatttctatg aaggatccat ctttgtatat ttacatgctt agaggggtga 2340 aaattatttt ggaaattgag tctgaagcac tctcgcacac acagtgattc cctcctcccg 2400 tcactccacg cagctggcag agagcacagt gatcaccagc gtgagtggtg gaggaggaca 2460 cttggatatt tttttagttc tttttttttt ggcttaacag ttttagaata cattgtactt 2520 atacacctta ttaatgatca gctatatact atttatatac aagtgataat acagatttgt 2580 aacattagtt ttaaaaaggg aaagttttgt tctgtatatt ttgttacctt ttacagaata 2640 aaagaattac atatgaaaaa ccctctaaac catggcactt gatgtgatgt ggcaggaggg 2700 nagtggtgga gctggacctg cctgctgcag ctgcagtcac gtgtaaacag 
gattattatt 2760 agtgttttat gcatgtaatg gactatgcac acttttaatt ttgtcagatt cacacatgcc 2820 actatgagct ttcagactcc agctgtgaag agactctgtc tgcttgtgtt tgtttgcagt 2880 ctctctctgc catggccttg gcaggctgct ggaaggcagc ttgtggaggc cgttggttcc 2940 gcccactcat tccttctcgt gcactgcttt ctccttcaca gctaagatgc catgtgcagg 3000 tggattccat gccgcagaca tgaaataaaa gctttgcaaa ggca 3044 85 1953 DNA Homo sapiens 85 cgctcccacc cgcccgtggc ccgcgcccat ggccgcgcgc gctccacaca actcaccgga 60 gtccgcgccc tgcgccgccg accagttcgc agctccgcgc cacggcagcc agtctcacct 120 ggcggcaccg cccgcccacc gccccggcca cagcccctgc gcccacggca gcaatcgagg 180 cgaccgcgac agtggtgggg gacgctgctg agtggaagag agcgcagccc ggccaccgga 240 cctacttact cgccttgctg attgtctatt tttgcgttta caacttttct aagaactttt 300 gtatacaaag gaacttttta aaaaagacgc ttccaagtta tatttaatcc aaagaagaag 360 gatctcggcc aatttggggt tttgggtttt ggcttcgttt tttctcttcg ttgactttgg 420 ggttcaggtg ccccagctgc ttcgggctgc cgaggacctt ctgggccccc acattaatga 480 ggcagccacc tggcgagtct gacatggctg tcagcgacgc gctgctccca tctttctcca 540 cgttcgcgtc tggcccggcg ggaagggaga agacactgcg tcaagcaggt gccccgaata 600 accgctggcg ggaggagctc tcccacatga agcgacttcc cccagtgctt cccggccgcc 660 cctatgacct ggcggcggcg accgtggcca cagacctgga gagcggcgga gccggtgcgg 720 cttgcggcgg tagcaacctg gcgcccctac ctcggagaga gaccgaggag ttcaacgatc 780 tcctggacct ggactttatt ctctccaatt cgctgaccca tcctccggag tcagtggccg 840 ccaccgtgtc ctcgtcagcg tcagcctcct cttcgtcgtc gccgtcgagc agcggccctg 900 ccagcgcgcc ctccacctgc agcttcacct atccgatccg ggccgggaac gacccgggcg 960 tggcgccggg cggcacgggc ggaggcctcc tctatggcag ggagtccgct ccccctccga 1020 cggctccctt caacctggcg gacatcaacg acgtgagccc ctcgggcggc ttcgtggccg 1080 agctcctgcg gccagaattg gacccggtgt acattccgcc gcagcagccg cagccgccag 1140 gtggcgggct gatgggcaag ttcgtgctga aggcgtcgct gagcgcccct ggcagcgagt 1200 acggcagccc gtcggtcatc agcgtcagca aaggcagccc tgacggcagc cacccggtgg 1260 tggtggcgcc ctacaacggc gggccgccgc gcacgtgccc caagatcaag caggaggcgg 1320 tctcttcgtg cacccacttg ggcgctggac cccctctcag caatggccac cggccggctg 1380 
cacacaactt ccccctgggg cggcagctcc ccagcaggag taccccgacc ctgggttttg 1440 aggaagtgct gagcagcagg gaatgtcacc ctgccctgcc gcttcctccc ggcttccatc 1500 cccacccggg gcccaattac ccatccttcc tgcccgatca gatgcagccg caagtcccgc 1560 cgctccatta ccaagagctc atgccacccg gttcctgcat gccagaggag cccaagccaa 1620 agaggggaag acgatcgtgg ccccggaaaa ggaccgccac ccacacttgt gattacgcgg 1680 gctgcggcaa aacctacaca aagagttccc atctcaaggc acacctgcga acccacacag 1740 gtgagaaacc ttaccactgt gactgggacg gctgtggatg gaaattcgcc cgctcagatg 1800 aactgaccag gcactaccgt aaacacacgg ggcaccgccc gttccagtgc caaaaatgcg 1860 accgagcatt ttccaggtcg gaccacctcg ccttacacat gaagaggcat ttttaaatcc 1920 cagacagtgg atatgaccca cactgccaga aga 1953 86 1476 DNA Homo sapiens 86 gccacccacc ctccggaccg cggcagctgc tgacccgcca tcgccatggc ccgcgggaaa 60 gccaaggagg agggcagctg gaagaaattc atctggaact cagagaagaa ggagtttctg 120 ggcaggaccg gtggcagttg gtttaagatc cttctattct acgtaatatt ttatggctgc 180 ctggctggca tcttcatcgg aaccatccaa gtgatgctgc tcaccatcag tgaatttaag 240 cccacatatc aggaccgagt ggccccgcca ggattaacac agattcctca gatccagaag 300 actgaaattt cctttcgtcc taatgatccc aagagctatg aggcatatgt actgaacata 360 gttaggttcc tggaaaagta caaagattca gcccagaggg atgacatgat ttttgaagat 420 tgtggcgatg tgcccagtga accgaaagaa cgaggagact ttaatcatga acgaggagag 480 cgaaaggtct gcagattcaa gcttgaatgg ctgggaaatt gctctggatt aaatgatgaa 540 acttatggct acaaagaggg caaaccgtgc attattataa agctcaaccg agttctaggc 600 ttcaaaccta agcctcccaa gaatgagtcc ttggagactt acccagtgat gaagtataac 660 ccaaatgtcc ttcccgttca gtgcactggc aagcgagatg aagataagga taaagttgga 720 aatgtggagt attttggact gggcaactcc cctggttttc ctctgcagta ttatccgtac 780 tatggcaaac tcctgcagcc caaatacctg cagcccctgc tggccgtaca gttcaccaat 840 cttaccatgg acactgaaat tcgcatagag tgtaaggcgt acggtgagaa cattgggtac 900 agtgagaaag accgttttca gggacgtttt gatgtaaaaa ttaaatttta agtgacacta 960 cagaaaaaca caaaaaggtg atgggttgtg ttatgcttgt attgaatgct gtcttgacat 1020 ctcttgcctt gtcctccggt atgttctaaa gctgtgtctg agatctggat ctgcccatca 1080 ctttggctag tgacagggct 
aattaatttg ctttatacat tttcttttac tttccttttt 1140 tcctttctgg aggcatcaca tgctggtgct gtgtctttat gaatgtttta accattttca 1200 tggtggaaga attttatatt tatgcagttg tacaatttta tttttttctg caagaaaaag 1260 tgtaatgtat gaaataaacc aaagtcactt gtttgaaaat aaatctttat tttgaacttt 1320 ataaaaagca atgcagtacc ccatagactg gtgttaaatg ttgtctacag tgcaaaatcc 1380 atgttctaac atatgtaata attgccagga gtacagtgct cttgttgatc ttgtattcag 1440 tcaggttaaa acaacggtca ataaaagaat gaacac 1476 87 439 DNA Homo sapiens 87 ggtgggtctg aatctagcac catgacggaa ctagagacag ccatgggcat gatcatagac 60 gtcttttccc gatattcggg cagcgagggc agcacgcaga ccctgaccaa gggggagctc 120 aaggtgctga tggagaagga gctaccaggc ttcctgcaga gtggaaaaga caaggatgcc 180 gtggataaat tgctcaagga cctggacgcc aatggagatg cccaggtgga cttcagtgag 240 ttcatcgtgt tcgtggctgc aatcacgtct gcctgtcaca agtactttga gaaggcagga 300 ctcaaatgat gccctggaga tgtcacagat tcctgcagag ccatggtccc aggcttccca 360 aaagtgtttg ttggcaatta ttcccctagg ctgagcctgc tcatgtacct ctgattaata 420 aatgcttatg aaaaaaaaa 439 88 5431 DNA Homo sapiens 88 ggcagccggg cgccccgcgg ggctctccgc gctgcgttcc cgacccctgg ggggaggtgt 60 ggagtccaag cggtgcattc ttgaaccatc ttgtcagacg ccggcggctc gcgggctgtg 120 gcgggggctg cggtcaaggc cgcgctcctg ggggccgccg cctgggaggg tgggcgccca 180 ggcgtccctg cagccccggg tgctccgact gcgcggcggg gccgcggcgc gcgcgcccgg 240 gcgtccgggc gtccgggaca gtggtgccag acactcccaa atcccgagcc ggcccagcct 300 cgtacggagg accttttttt tggttctgtt ggtgacccgt tagccgccgc tggggcctaa 360 caccaagttg agggctcgcg gattagccgc ccgccagccg tggaaatgtg ataagagcgg 420 taccgtttgc agaaggaaat ttctgatgca actcttcgcc tttgctgatt gcctctccaa 480 acgcctgcct gacgactgcc ttggagcatg tgcgttatgg aaattaggct ttggcgctga 540 ccacaatgct gagcaggaag cagcagctgc aggcccagtg actggtagct cagtgaccag 600 cagcccagtg accggcagcc aggtcctcac ctgggtcctc tcagtgaagc cagggtggcc 660 gccccagcag acagtgctac agagccaact cctgacaggt tctgaaaata ttgtgcacag 720 ggcaggctga ggacacagcc acgtgatacc cactgtagag agagggagag agagacctcc 780 tatgcaagct gccggccctc tgttccgtag taaggacaag gtggagcaga cacctcgcag 840 
tcaacaagac ccggcaggac caggactccc cgcacagtct gaccgacttg cgaatcacca 900 ggaggatgat gtggacctgg aagccctggt gaacgatatg aatgcatccc tggagagcct 960 gtactcggcc tgcagcatgc agtcagacac ggtgcccctc ctgcagaatg gccagcatgc 1020 ccgcagccag cctcgggctt caggccctcc tcggtccatc cagccacagg tgtccccgag 1080 gcagagggtg cagcgctccc agcctgtgca catcctcgct gtcaggcgcc ttcaggagga 1140 agaccagcag tttagaacct catctctgcc ggccatcccc aatccttttc ctgaactctg 1200 tggccctggg agccccgctg tgctcacgcc gggttcttta cctccgagcc aggccgccgc 1260 aaagcaggat gttaaagtct ttagtgaaga tgggacaagc aaagtggtgg agattctagc 1320 agacatgaca gccagagacc tgtgccaatt gctggtttac aaaagtcact gtgtggatga 1380 caacagctgg acactagtgg agcaccaccc gcacctagga ttagagaggt gcttggaaga 1440 ccatgagctg gtggtccagg tggagagtac catggccagt gagagtaaat ttctattcag 1500 gaagaattac gcaaaatacg agttctttaa aaatcccatg aatttcttcc cagaacagat 1560 ggttacttgg tgccagcagt caaatggcag tcaaacccag cttttgcaga attttctgaa 1620 ctccagtagt tgtcctgaaa ttcaagggtt tttgcatgtg aaagagctgg gaaagaaatc 1680 atggaaaaag ctgtatgtgt gtttgcggag atctggcctt tattgctcca ccaagggaac 1740 ttcaaaggaa cccagacacc tgcagctgct ggccgacctg gaggacagca acatcttctc 1800 cctgatcgct ggcaggaagc agtacaacgc ccctacagac cacgggctct gcataaagcc 1860 aaacaaagtc aggaatgaaa ctaaagagct gaggttgctc tgtgcagagg acgagcaaac 1920 caggacgtgc tggatgacag cgttcagact cctcaagtat ggaatgctcc tttaccagaa 1980 ttaccgaatc cctcagcaga ggaaggcctt gctgtccccg ttctcgacgc cagtgcgcag 2040 tgtctccgag aactccctcg tggcaatgga tttttctggg caaacaggac gcgtgataga 2100 gaatccggcg gaggcccaga gcgcagccct ggaggagggc cacgcctgga ggaagcgaag 2160 cacacggatg aacatcctag gtagccaaag tcccctccac ccttctaccc taagtacagt 2220 gattcacagg acacagcact ggtttcacgg gaggatctcc agggaggaat cccacaggat 2280 cattaaacag caagggctcg tggatgggct ttttctcctc cgtgacagcc agagtaatcc 2340 aaaggcattt gtactcacac tgtgtcatca ccagaaaatt aaaaatttcc agatcttacc 2400 ttgcgaggac gacgggcaga cgttcttcag cctagatgac gggaacacca aattctctga 2460 cctgatccag ctggttgact tttaccagct gaacaaagga gtcctgcctt gcaaactcaa 2520 gcaccactgc 
atccgagtgg ccttatgacc gcagatgtcc tctcggctga agactggagg 2580 aagtgaacac tggagtgaag aagcggtctg tgcgttggtg aagaacacac atcgattctg 2640 cacctgggga cccagagcga gatgggtttg ttcggtgcca gccgaccaag attgactagt 2700 ttgttggact taaacgacga tttgctgctg tgaacccagc agggtcgcct ccctctgcgt 2760 cagccaaatt ggggagggca tggaagatcc agcggaaagt tgaaaataaa ctggaatgat 2820 catcttggct tgggccgctt aggaacaaga accggagaga agtgattgga aatgaactct 2880 tgccctggaa taatcttgac aattaaaact gatatgttta ctttttttgt attgatcact 2940 tttttgcact ccttctttgt tttcaatatt gtattcagcc tattgtagga gggggatgtg 3000 gcgtttcaac tcatataata cagaaagagt tttgaatggg cagatttcaa actgaatatg 3060 ggtccccaaa tgttcccaga gggtcctcca caccctctgc cgactaccac ggtgtggatt 3120 cagctcccaa atgacaaacc cagcccttcc cagtatactt gaaaagcttt cttgttaaaa 3180 taaaaggtgt cactgtggta ggcatttggc atattttgtg gactcagtca agcaaccaca 3240 gtctgttaat catttctcta tgctcagatg tcagatcctc ttgttattag tgtgtcttgt 3300 tctgcacagt gcaggagact ttattccttt ggaaaattca ctgttccaca aacagcaggc 3360 tgaatggcct cgcctctaga ttgacgtggg ccagcctcct tgagacacac ctggcacccg 3420 tcatcggcca gcggtggatg ctgcataatc cacctgggta cttcagcctt gcgtttccac 3480 agccttcagc ctgttctaga acgatcactg ccttacccct gctgctgcag tggtgtgagt 3540 cgtttcacgg ctgatgtccc tcgggggatt aaaggatcta aagagaaaat ggcacctggt 3600 tgtcttcgtg ctgtgtctca tgggtttcca tagtgataaa gacaaggaaa cgctgcaggg 3660 gccacaggca caggctgata tttaaagatc tttgcttgca gccctccgtc ctgctgaaaa 3720 cccccataag ccagtgaaca cagagcagct agaggctcct cctctgctgg cttagggtca 3780 gaagtacctc acagtggttg tggacatgga agagttttgt caacacaaca ctttgtcccc 3840 gctccgggag atgagtcaga tggtggcttg agttgtcact tggtcccctc cgcccctcgg 3900 gtggccccct ttgccacgtc cccttagctt agtgatcagg tgtgagagtg gccatttcct 3960 tacctttgat ccctgtaaag cagaaaggac tcctttgaca ggcgacaaac tactgtggtg 4020 agcagaatga tttccttttt caagacaaca cctgcctggc ttctattaat gtgtgctggc 4080 catgatattg ccccaaatcc gccccactga agtgttccct aaggaacagc atttctctgc 4140 tcctcagtca acccccgtag cctagagcag tgtcacaagc ttcagtaagg ccagtcagct 4200 ggaagtcagt ctaccgtata 
gtaacactgt atttcagtct acagaccaca ctctagttgt 4260 tttccatgaa aggtatacaa atgaagaatt ttctagcaaa acatgttttt aaccatcagt 4320 gctcaattgc attttcttcc tttcgcagcc agtcagtctt tcaaactatt gacagtaaga 4380 taattctcac gttcacacct ggtggcaggc ttcactgtag ggacggacat tgcagttaca 4440 ccacgattcc ttcctcttca ctggctcgag gtaaaccctt ttcaaggaaa aacaactcta 4500 ggatttcttt tttctgtgta cgtagaccag tcccatcagt gtataatctc tctctcacac 4560 gcctctctcc aatagacagc ttgtatttgc agtatttcat atttataaat atgcgtttat 4620 ttaaaaggag aacaaaagct tgactctgat tcacagtttt gtatgtagct ggtttgacgt 4680 agtcttttgt attttccctg ccgaagtgaa ttgttggaga atgtaaaccg cctccacgtg 4740 gcggcagact tcctaaggcc ccagctcgct ggcctcgcgc tgggcggctg ggaattccac 4800 ctgagaacaa gtcccgcaaa ccggggacgg aaggacattt gacttttatt tttgtattta 4860 attgacatga atgtaaaggg gacagctcag ggttgttttg gagcctgttg actttgtatc 4920 tctgcctgtg attttctttt ctaaatgaaa ctccatgtag caaccaggac gaagttgaga 4980 aggaaaacgc caaatgcttt ggttattaga gtttaatagg taagctctgt tacactaggt 5040 gttagagttc cagaatgttc ttttgtttgc taaaccttga agaaacatgt gcctcagcct 5100 agatgttttg tcttctcttt tctgcactta atacctgaca gtatgaccga tctctgcgcc 5160 tttctggggg cgggcaagct ggcggtagat ttgtgatgtc acagtgcaaa ctgcagtgac 5220 tgtaaattgg cctggcgtgt ataaacgttt tcagggaatg cagaaggtat taatgaagag 5280 acaaaacctt tattccatgt gctttgcttc attctgtaca tagctctttg gctcgtgaac 5340 ctaattgtaa actttcaggt atttttgtac aaataaggga ctgatgttct gtttcttgta 5400 attagaaata aacattaata cagtgttctt c 5431 89 1223 DNA Homo sapiens 89 acactcgctc ggctcaccat gtgtcactct cgcagctgcc acccgaccat gaccatcctg 60 caggccccga ccccggcccc ctccaccatc ccgggacccc ggcggggctc cggtcctgag 120 atcttcacct tcgaccctct cccggagccc gcagcggccc ctgccgggcg ccccagcggc 180 tctcgcgggc accgaaagcg cagccgcagg gttctctacc ctcgagtggt ccggcgccag 240 ctgccagtcg aggaaccgaa cccagccaaa aggcttctct ttctgctgct caccatcgtc 300 ttctgccaga tcctgatggc tgaagagggt gtgcgggcgc ccctgcctcc agaggacgcc 360 cctaacgccg catccctggc gcccacccct gtgtcccccg tcctcgagcc ctttaatctg 420 acttcggagc cctcggacta cgctctggac ctcagcactt 
tcctccagca acacccggcc 480 gccttctaac tgtgactccc cgcactcccc aaaaagaatc cgaaaaacca caaagaaaca 540 ccaggcgtac ctggtgcgcg agagcgtatc cccaactggg acttccgagg caacttgaac 600 tcagaacact acagcggaga cgccacccgg tgcttgaggc gggaccgagg cgcacagaga 660 ccgaggcgca tagagaccga gcacagccca gctgggctag gcccggtggg aaggagagcg 720 tcgttaattt atttcttatt gctcctaatt aatatttata tgtatttatg tacgtcctcc 780 taggtgatga gatgtgtacg taatatttat tttaacttat gcaagggtgt gagatgttcc 840 ccctgctgta aatgcaggtc tcttggtatt tattgagctt tgtgggactg gtggaagcag 900 gacacctgga actgcggcaa agtaggagaa gaaatgggga ggactcgggt gggggaggac 960 gtcccggctg ggatgaagtc tggtggtggg tcgtaagttt aggaggtgac tgcatcctcc 1020 agcattctca actccgtctg tctactgtgt gagacttcgg cggaccatta ggaatgagat 1080 ccgtgagatc cttccatctt cttgaagtcg cctttagggt ggctgcgagg tagagggttg 1140 ggggttggtg ggctgtcacg gagcgactgt cgagatcgcc tagtatgttc tgtgaacaca 1200 aataaaattg atttactgtc tgc 1223 90 3536 DNA Homo sapiens 90 ggcccctcga gcctcgaacc ggaacctcca aatccgagac gctctgctta tgaggacctc 60 gaaatatgcc ggccagtgaa aaaatcttat ggctttgagg gcttttggtt ggccaggggc 120 agtaaaaatc tcggagagct gacaccaagt cctcccctgc cacgtagcag tggtaaagtc 180 cgaagctcaa attccgagaa ttgagctctg ttgattctta gaactggggt tcttagaagt 240 ggtgatgcaa gaagtttcta ggaaaggccg gacaccaggt tttgagcaaa attttggact 300 gtgaagcaag gcattggtga agacaaaatg gcctcgccgg ctgacagctg tatccagttc 360 acccgccatg ccagtgatgt tcttctcaac cttaatcgtc tccggagtcg agacatcttg 420 actgatgttg tcattgttgt gagccgtgag cagtttagag cccataaaac ggtcctcatg 480 gcctgcagtg gcctgttcta tagcatcttt acagaccagt tgaaatgcaa ccttagtgtg 540 atcaatctag atcctgagat caaccctgag ggattctgca tcctcctgga cttcatgtac 600 acatctcggc tcaatttgcg ggagggcaac atcatggctg tgatggccac ggctatgtac 660 ctgcagatgg agcatgttgt ggacacttgc cggaagttta ttaaggccag tgaagcagag 720 atggtttctg ccatcaagcc tcctcgtgaa gagttcctca acagccggat gctgatgccc 780 caagacatca tggcctatcg gggtcgtgag gtggtggaga acaacctgcc actgaggagc 840 gcccctgggt gtgagagcag agcctttgcc cccagcctgt acagtggcct gtccacaccg 900 ccagcctctt attccatgta 
cagccacctc cctgtcagca gcctcctctt ctccgatgag 960 gagtttcggg atgtccggat gcctgtggcc aaccccttcc ccaaggagcg ggcactccca 1020 tgtgatagtg ccaggccagt ccctggtgag tacagccggc cgactttgga ggtgtccccc 1080 aatgtgtgcc acagcaatat ctattcaccc aaggaaacaa tcccagaaga ggcacgaagt 1140 gatatgcact acagtgtggc tgagggcctc aaacctgctg ccccctcagc ccgaaatgcc 1200 ccctacttcc cttgtgacaa ggccagcaaa gaagaagaga gaccctcctc ggaagatgag 1260 attgccctgc atttcgagcc ccccaatgca cccctgaacc ggaagggtct ggttagtcca 1320 cagagccccc agaaatctga ctgccagccc aactcgccca cagaggcctg cagcagtaag 1380 aatgcctgca tcctccaggc ttctggctcc cctccagcca agagccccac tgaccccaaa 1440 gcctgcaact ggaagaaata caagttcatc gtgctcaaca gcctcaacca gaatgccaaa 1500 ccaggggggc ctgagcaggc tgagctgggc cgcctttccc cacgagccta cacggcccca 1560 cctgcctgcc agccacccat ggagcctgag aaccttgacc tccagtcccc aaccaagctg 1620 agtgccagcg gggaggactc caccatccca caagccagcc ggctcaataa catcgttaac 1680 aggtccatga cgggctctcc ccgcagcagc agcgagagcc actcaccact ctacatgcac 1740 cccccgaagt gcacgtcctg cggctctcag tccccacagc atgcagagat gtgcctccac 1800 accgctggcc ccacgttcgc tgaggagatg ggagagaccc agtctgagta ctcagattct 1860 agctgtgaga acggggcctt cttctgcaat gagtgtgact gccgcttctc tgaggaggcc 1920 tcactcaaga ggcacacgct gcagacccac agtgacaaac cctacaagtg tgaccgctgc 1980 caggcctcct tccgctacaa gggcaacctc gccagccaca agaccgtcca taccggtgag 2040 aaaccctatc gttgcaacat ctgtggggcc cagttcaacc ggccagccaa cctgaaaacc 2100 cacactcgaa ttcactctgg agagaagccc tacaaatgcg aaacctgcgg agccagattt 2160 gtacaggtgg cccacctccg tgcccatgtg cttatccaca ctggtgagaa gccctatccc 2220 tgtgaaatct gtggcacccg tttccggcac cttcagactc tgaagagcca cctgcgaatc 2280 cacacaggag agaaacctta ccattgtgag aagtgtaacc tgcatttccg tcacaaaagc 2340 cagctgcgac ttcacttgcg ccagaagcat ggcgccatca ccaacaccaa ggtgcaatac 2400 cgcgtgtcag ccactgacct gcctccggag ctccccaaag cctgctgaag catggagtgt 2460 tgatgctttc gtctccagcc ccttctcaga atctacccaa aggatactgt aacactttac 2520 aatgttcatc ccatgatgta gtgcctcttt catccactag tgcaaatcat agctgggggt 2580 tgggggtggt gggggtcggg gcctggggga 
ctgggagccg cagcagctcc ccctccccca 2640 ctgccataaa acattaagaa aatcatattg cttcttctcc tatgtgtaag gtgaaccatg 2700 tcagcaaaaa gcaaaatcat tttatatgtc aaagcagggg agtatgcaaa agttctgact 2760 tgactttagt ctgcaaaatg aggaatgtat atgttttgtg ggaacagatg tttcttttgt 2820 atgtaaatgt gcattctttt aaaagacaag acttcagtat gttgtcaaag agagggcttt 2880 aattttttta accaaaggtg aaggaatata tggcagagtt gtaaatatat aaatatatat 2940 atatataaaa taaatatata taaacctaac aaagatatat taaaaatata aaactgcgtt 3000 aaaggctcga ttttgtatct gcaggcagac acggatctga gaatctttat tgagaaagag 3060 cacttaagag aatattttaa gtattgcatc tgtataagta agaaaatatt ttgtctaaaa 3120 tgcctcagtg tatttgtatt tttttgcaag tgaaggttta caatttacaa agtgtgtatt 3180 aaaaaaaacc caaagaaccc aaaaatctgc agaaggaaaa atgtgtaatt ttgttctagt 3240 tttcagtttg tatatacccg tacaacgtgt cctcacggtg ccttttttca cggaagtttt 3300 caatgatggg cgagcgtgca ccatcccttt ttgaagtgta ggcagacaca gggacttgaa 3360 gttgttacta actaaactct ctttgggaat gtttgtctca tcccattctg cgtcatgctt 3420 gtgtgataac tactccggag acagggtttg gctgtgtcta aactgcatta ccgcgttgta 3480 aaaaatagct gtaccaatat aagaataaaa tgttggaaag tcgcaaaaaa aaaaaa 3536 91 8930 DNA Homo sapiens 91 gaattccgga aagaaagaac atcgtttcag gaataaaaat gcacagtagt agttatagtt 60 accgtagcag tgattctgtg tttagtaaca ctaccagcac tcgaaccagt cttgattcaa 120 atgaaaatct tctcttggtt cattgtggtc caacactgat caactcttgc attagcttcg 180 gcagtgaatc ctttgatgga cacaggttag aaatgttgca acagattgcc aacagagttc 240 agagggacag tgtcatctgt gaagacaaac tgattcttgc tggaaatgct cttcagtctg 300 attctaaaag attagaatca ggagtgcagt ttcagaatga agcagaaatt gctgggtata 360 tacttgaatg tgagaacctt ttacgccagc atgtaattga tgtacagatt cttattgatg 420 gaaaatacta ccaggcagat caattggtac agagggttgc aaaactgcgt gacgaaatta 480 tggccttaag gaacgaatgt tcttctgtgt acagcaaagg acgcatactg acaacagaac 540 agacaaagct catgatatca ggaatcactc aaagtttaaa ctcaggattt gcacagacct 600 tacaccctag tctgacctca gggctgaccc agagtttaac accttcccta acctcttcta 660 gtatgacttc tggcctgtca tcagggatga cttcccgcct gactccatct gtcactccag 720 cttatacacc tggtttccca tcaggattag 
ttccaaattt cagttcagga gtagagccaa 780 attcattgca aactttgaag ttgatgcaga tccgaaaacc ccttctaaag tcttctttgc 840 tggatcaaaa tttaacagaa gaagaaatca atatgaaatt tgttcaggat cttttgaatt 900 gggttgatga gatgcaggta caactggacc gcactgagtg gggctcagat ttgccaagtg 960 ttgaaagcca tttagaaaat cataaaaatg ttcatagagc tattgaagaa tttgaatcta 1020 gtctcaaaga agctaaaatc agtgagattc aaatgacagc acctcttaaa ctgacttatg 1080 cagaaaagtt gcacagatta gagagtcagt atgcaaaact cttgaataca tccaggaatc 1140 aagaacggca ccttgataca ctccataatt ttgtaagtcg tgcgactaat gaacttattt 1200 ggttgaatga aaaagaagag gaggaagttg cttatgactg gagtgagaga aacaccaaca 1260 tagctaggaa aaaagattat catgctgaat taatgagaga acttgatcaa aaggaagaaa 1320 atattaaatc agttcaggag atagcagagc agctacttct agaaaatcat ccagcccggt 1380 taactattga ggcctacaga gcggcaatgc agacgcagtg gagctggatc ttacagctct 1440 gccagtgtgt ggagcagcac ataaaggaga acacagcgta tttcgagttt ttcaatgatg 1500 ccaaagaagc tactgattac ttaaggaatc taaaagatgc cattcagcgg aagtacagct 1560 gtgatagatc aagcagcatt cacaagctag aagaccttgt tcaggaatca atggaagaga 1620 aagaagaact tctgcagtac aaaagcacta tagcaaacct aatgggaaaa gcaaaaacaa 1680 taattcaact gaagccaagg aattctgact gtccactcaa aacttctatt ccgatcaaag 1740 ctatctgtga ctacagacaa attgagataa ccatttacaa agacgatgaa tgtgttttgg 1800 caaataactc tcatcgtgct aaatggaagg tcattagtcc tactgggaat gaggctatgg 1860 tcccatctgt gtgcttcacc gttcctccac caaacaaaga agcggtggac cttgccaaca 1920 gaattgagca acagtatcag aatgtcctga ctctttggca tgagtctcac ataaacatga 1980 agagtgtagt atcctggcat tatctcatca atgaaattga tagaattcga gctagcaatg 2040 tggcttcaat aaagacaatg ctacctggtg aacatcagca agttctaagt aatctacaat 2100 ctcgttttga agattttctg gaagatagcc aggaatccca agtcttttca ggctcagata 2160 taacacaact ggaaaaggag gttaatgtat gtaagcagta ttatcaagaa cttcttaaat 2220 ctgcagaaag agaggagcaa gaggaatcag tttataatct ctacatctct gaagttcgaa 2280 acattagact tcggttagag aactgtgaag atcggctgat tagacagatt cgaactcccc 2340 tggaaagaga tgatttgcat gaaagtgtgt tcagaatcac agaacaggag aaactaaaga 2400 aagagctgga acgacttaaa gatgatttgg gaacaatcac 
aaataagtgt gaggagtttt 2460 tcagtcaagc agcagcctct tcatcagtcc ctaccctacg atcagagctt aatgtggtcc 2520 ttcagaacat gaaccaagtc tattctatgt cttccactta catagataag ttgaaaactg 2580 ttaacttggg gttaaaaaac actcaagctg cagaagccct cgtaaaactc tatgaaacta 2640 aactgtgtga agaagaagca gttatagctg acaagaataa tattgagaat ctaataagta 2700 ctttaaagca atggagatct gaagtagatg aaaagagaca ggtattccat gccttagagg 2760 atgagttgca gaaagctaaa gccatcagtg atgaaatgtt taaaacgtat aaagaacggg 2820 accttgattt tgactggcac aaagaaaaag cagatcaatt agttgaaagg tggcaaaatg 2880 ttcatgtgca gattgacaac aggttacggg acttagaggg cattggcaaa tcactgaagt 2940 actacagaga cacttaccat cctttagatg attggatcca gcaggttgaa actactcaga 3000 gaaagattca ggaaaatcag cctgaaaata gtaaaaccct agccacacag ttgaatcaac 3060 agaagatgct ggtgtccgaa atagaaatga aacagagcaa aatggacgag tgtcaaaaat 3120 atgcagaaca gtactcagct acagtgaagg actatgaatt acaaacaatg acctaccggg 3180 ccatggtaga ttcacaacaa aaatctccag tgaaacgccg aagaatgcag agttcagcag 3240 atctcattat tcaagagttc atggacctaa ggactcgata tactgccctg gtcactctca 3300 tgacacaata tattaaattt gctggtgatt cattgaagag gctggaagag gaggagatta 3360 aaaggtgtaa ggagacttct gaacatgggg catattcaga tctgcttcag cgtcagaagg 3420 caacagtgct tgagaatagc aaacttacag gaaagataag tgagttggaa agaatggtag 3480 ctgaactaaa gaaacaaaag tcccgagtag aggaagaact tccgaaggtc agggaggctg 3540 cagaaaatga attgagaaag cagcagagaa atgtagaaga tatctctctg cagaagataa 3600 gggctgaaag tgaagccaag cagtaccgca gggaacttga aaccattgtg agagagaagg 3660 aagccgctga aagagaactg gagcgggtga ggcagctcac catagaggcc gaggctaaaa 3720 gagctgccgt ggaagagaac ctcctgaatt ttcgcaatca gttggaggaa aacaccttta 3780 ccagacgaac actggaagat catcttaaaa gaaaagattt aagtctcaat gatttggagc 3840 aacaaaaaaa taaattaatg gaagaattaa gaagaaagag agacaatgag gaagaactct 3900 tgaagctgat aaagcagatg gaaaaagacc ttgcatttca gaaacaggta gcagagaaac 3960 agttgaaaga aaagcagaaa attgaattgg aagcaagaag aaaaataact gaaattcagt 4020 atacatgtag agaaaatgca ttgccagtgt gtccgatcac acaggctaca tcatgcaggg 4080 cagtaacggg tctccagcaa gaacatgaca agcagaaagc agaagaactc 
aaacagcagg 4140 tagatgaact aacagctgcc aatagaaagg ctgaacaaga catgagagag ctgacatatg 4200 aacttaatgc cctccagctt gaaaaaacgt catctgagga aaaggctcgt ttgctaaaag 4260 ataaactaga tgaaacaaat aatacactca gatgccttaa gttggagctg gaaaggaagg 4320 atcaggcgga gaaagggtat tctcaacaac tcagagagct tggtaggcaa ttgaatcaaa 4380 ccacaggtaa agctgaagaa gccatgcaag aagctagtga tctcaagaaa ataaagcgca 4440 attatcagtt agaattagaa tctcttaatc atgaaaaagg gaaactacaa agagaagtag 4500 acagaatcac aagggcacat gctgtagctg agaagaatat tcagcattta aattcacaaa 4560 ttcattcttt tcgagatgag aaagaattag aaagactaca aatctgccag agaaaatcag 4620 atcatctaaa agaacaattt gagaaaagcc atgagcagtt gcttcaaaat atcaaagctg 4680 aaaaagaaaa taatgataaa atccaaaggc tcaatgaaga attggagaaa agtaatgagt 4740 gtgcagagat gctaaaacaa aaagtagagg agcttactag gcagaataat gaaaccaaat 4800 taatgatgca gagaattcag gcagaatcag agaatatagt tttagagaaa caaactatcc 4860 agcaaagatg tgaagcactg aaaattcagg cagatggttt taaagatcag ctacgcagca 4920 caaatgaaca cttgcataaa cagacaaaaa cagagcagga ttttcaaaca aaaattaaat 4980 gcctagaaga agacctggcg aaaagtcaaa atttggtaag tgaatttaag caaaagtgtg 5040 accaacagaa cattatcatc cagaatacca agaaagaagt tagaaatctg aatgcggaac 5100 tgaatgcttc caaagaagag aagcgacgcg gggagcagaa agttcagcta caacaagctc 5160 aggtgcaaga gttaaataac aggttgaaaa aagtacaaga cgaattacac ttaaagacca 5220 tagaggagca gatgacccac agaaagatgg ttctgtttca ggaagaatct ggtaaattca 5280 aacaatcagc agaggagttt cggaagaaga tggaaaaatt aatggagtcc aaagtcatca 5340 ctgaaaatga tatttcaggc attaggcttg actttgtgtc tcttcaacaa gaaaactcta 5400 gagcccaaga aaatgctaag ctttgtgaaa caaacattaa agaacttgaa agacagcttc 5460 aacagtatcg tgaacaaatg cagcaagggc agcacatgga agcaaatcat taccaaaaat 5520 gtcagaaact tgaggatgag ctgatagccc agaagcgtga ggttgaaaac ctgaagcaaa 5580 aaatggacca acagatcaaa gagcatgaac atcaattagt tttgctccag tgtgaaattc 5640 aaaaaaagag cacagccaaa gactgtacct tcaaaccaga ttttgagatg acagtgaagg 5700 agtgccagca ctctggagag ctgtcctcta gaaacactgg acaccttcac ccaacaccca 5760 gatcccctct gttgagatgg actcaagaac cacagccatt ggaagagaag tggcagcatc 
5820 gggttgttga acagataccc aaagaagtcc aattccagcc accaggggct ccactcgaga 5880 aagagaaaag ccagcagtgt tactctgagt acttttctca gacaagcacc gagttacaga 5940 taacttttga tgagacaaac cccattacaa gactgtctga aattgagaag ataagagacc 6000 aagccctgaa caattctaga ccacctgtta ggtatcaaga taacgcatgt gaaatggaac 6060 tggtgaaggt tttgacaccc ttagagatag ctaagaacaa gcagtatgat atgcatacag 6120 aagtcacaac attaaaacaa gaaaagaacc cagttcccag tgctgaagaa tggatgcttg 6180 aagggtgcag agcatctggt ggactcaaga aaggggattt ccttaagaag ggcttagaac 6240 cagagacctt ccagaacttt gatggtgatc atgcatgttc agtcagggat gatgaattta 6300 aattccaagg gcttaggcac actgtgactg ccaggcagtt ggtggaagct aagcttctgg 6360 acatgagaac aattgagcag ctgcgactcg gtcttaagac tgttgaagaa gttcagaaaa 6420 ctcttaacaa gtttctgacg aaagccacct caattgcagg gctttaccta gaatctacaa 6480 aagaaaagat ttcatttgcc tcagcggccg agagaatcat aatagacaaa atggtggctt 6540 tggcattttt agaagctcag gctgcaacag gttttataat tgatcccatt tcaggtcaga 6600 catattctgt tgaagatgca gttcttaaag gagttgttga ccccgaattc agaattaggc 6660 ttcttgaggc agagaaggca gctgtgggat attcttattc ttctaagaca ttgtcagtgt 6720 ttcaagctat ggaaaataga atgcttgaca gacaaaaagg taaacatatc ttggaagccc 6780 agattgccag tgggggtgtc attgaccctg tgagaggcat tcgtgttcct ccagaaattg 6840 ctctgcagca ggggttgttg aataatgcca tcttacagtt tttacatgag ccatccagca 6900 acacaagagt tttccctaat cccaataaca agcaagctct gtattactca gaattactgc 6960 gaatgtgtgt atttgatgta gagtcccaat gctttctgtt tccatttggg gagaggaaca 7020 tttccaatct caatgtcaag aaaacacata gaatttctgt agtagatact aaaacaggat 7080 cagaattgac cgtgtatgag gctttccaga gaaacctgat tgagaaaact atatatcttg 7140 aactttcagg gcagcaatat cagtggaagg aagctatgtt ttttgaatcc tatgggcatt 7200 cttctcatat gctgactgat actaaaacag gattacactt caatattaat gaggctatag 7260 agcagggaac aattgacaaa gccttggtca aaaagtatca ggaaggcctc atcacactta 7320 cagaacttgc tgattctttg ctgagccggt tagtccccaa gaaagatttg cacagtcctg 7380 ttgcagggta ttggctgact gctagtgggg aaaggatctc tgtactaaaa gcctcccgta 7440 gaaatttggt tgatcggatt actgccctcc gatgccttga agcccaagtc agtacagggg 7500 
gcataattga tcctcttact gtcaaaaagt accgggtggc cgaagctttg catagaggcc 7560 tggttgatga ggggtttgcc cagcagctgc gacagtgtga attagtaatc acagggattg 7620 gccatcccat cactaacaaa atgatgtcag tggtggaagc tgtgaaggca aatattataa 7680 ataaggaaat gggaatccga tgtttggaat ttcagtactt gacaggaggg ttgatagagc 7740 cacaggttca ctctcggtta tcaatagaag aggctctcca agtaggtatt atagatgtcc 7800 tcattgccac aaaactcaaa gatcaaaagt catatgtcag aaatataata tgccctcaga 7860 caaaaagaaa gttgacatat aaagaagcct tagaaaaacc tgattttgat ttccacacag 7920 gacttaaact gttagaagta tctgagcccc tgatgacagg aatttctagc ctctactatt 7980 cttcctaatg ggacatgttt aaataactgt gcaaggggtg atgcaggctg gttcatgcca 8040 ctttttcaga gtatgatgat atcggctaca tatgcagtct gtgaattatg taacatactc 8100 tatttcttga gggctgcaaa ttgctaagtg ctcaaaatag agtaagtttt aaattgaaaa 8160 ttacataaga tttaatgccc ttcaaatggt ttcatttagc cttgagaatg gttttttgaa 8220 acttggccac actaaaatgt tttttttttt acgtagaatg tgggataaac ttgatgaact 8280 ccaagttcac agtgtcattt cttcagaact ccccttcatt gaatagtgat catttattaa 8340 atgataaatt gcactcgctg aaagagcacg tcatgaagca ccatggaatc aaagagaaag 8400 atataaattc gttcccacag ccttcaagct gcagtgtttt agattgcttc aaaaaatgaa 8460 aaagttttgc ctttttctgt atatagtgac cttctttgca tattaaaatg tttaccacaa 8520 tgtcccattt ctagttaagt cttcgcactt gaaagctaac attatgaata ttatgtgttg 8580 gaggagggga aggattttct tcattctgtg tattttcctt acatgtacag tagacgttct 8640 ctattctatc agccttctat ggtacctttt tgtcaggaca attaggattg taatgctaat 8700 gcaaaggcag caattcaaag atcttctagt gcctcatgaa taaagttgag atttaaaatt 8760 tgtaacattg atggaacagc tgggaggtta gaccaatcat taaggaatgt atgccatacc 8820 tttctttgct accataaaca ttttggaggt gcatctgcta tgtgacatgg taaatatggt 8880 taagtgaatg aataaaatgt tttagtaacc tgtgtcggat tccgcggaat 8930 92 1675 DNA Homo sapiens 92 gtgagacaga gacaaatgaa cccccctcta aagtcattta actaatagcc agcacatccc 60 ttccccaaac tgtcaattga aatcttaact gaaagtttta ctgaataata ccaagctaat 120 tgctgttggg cacacctgga tggctttgca cctggtgttg aacctgctga agcaggtgga 180 tgctcaagat tacgtgcaag gaatccctcc catctggtac taaaatttca gtgtgttctg 240 
agtgtctttt aaaccaaaat ggaaatacag atacagggct gtagtattca gtaatgtgtc 300 tgctccttgt tgggcagaca ccagcggtgt gcagggagag accaagtacc atctttatct 360 acacttgggc tggcttgtgg agaagggctg ctttttttca gtcctacatt ccttcatttt 420 ttttttcatt cttgaattca ttgttttgtg ggatctaaga cccaggggtc atttgagagg 480 tttgacagta tcttttctga ccagttgcca catgacttgc ttgaccctga gcctgtggaa 540 atggcatagg gaccagtcta ctacccactg ggcctggtgt gtagaggggg agagggtagc 600 aaggtgcttc tctacgccca tgacttggga gcaggtcttg gcctccttca tgagagtcta 660 gtgccatgtc ctgtcccatg atctggaccc tgggactgtc ttggcatctt aactgcagtt 720 tcaatgaggc agagggcaaa gagagaccaa gatcagaggg gttcattata cccctggcta 780 gagaacccag ctactgacat gcaagcagct tggggctggc tggacacagg tactaggccc 840 attgtttcca ggtgaagctt tcatcacaga acagtgttgt ctccacctgg ccttagatgg 900 cacgccatga ttcgggcctg gatagactgc ctgcgtcctt accactgatc tggccaagaa 960 tgaggccctc ccaacacttt cactccctct ccaagccttg atgggacctc cacttattta 1020 ggcctcatgt gctttgaaga agctttgaga gccaatgtgt cttccacggg tctctttttt 1080 gctacaagta atcagcccca tgtgttctct taaactgaga attgcacctg ggcaattcct 1140 gttttctaag gtggtctctg ctgctattta acaacccaga gtaggcctct gtgaggcttc 1200 agtggcctca gaaaccagag ggtccagata gggggcctgc ttgggccctc tgctgccaac 1260 tgctcaaacc tgctttagct ccagccactt gtggcaaaca acctcgtttc cttacaaatt 1320 ccagcatgtg actttggtgc cgttacttgt gaaaaatcta ttctgttgtc tttgatgtgt 1380 ccaagaaaat tcgtgtagtt tacgtaaaaa tatctgactc acaagaaagc caactgtatg 1440 tcttgtgatg ggacagttca taatgtagtt gctagaccac tttacaaatt gttcttgtca 1500 ccagatgtgt tcagacattg ctgtgcaatt gttggggagg gtagggggaa aggcgagagg 1560 agatacttat tggtcttttt gtttaatacc ttccccaaga ggggacagtc tggccaactt 1620 gctccagtaa tgcaataaag acattgcaat aaagtaaaaa aaaaaaaaaa aaaaa 1675 93 4180 DNA Homo sapiens 93 ccagggtgat gctgaagatg atgaccttct tccaaggcct ctagagccat cagcctgtgc 60 caggcaccct cgacttgcct agaggccccc aaaagttgca gtccacatca gaggcagagt 120 cagaggcctc catgtcggag gcctcctctg aggacctggt gccacccctg gaggctgggg 180 cagccccata tagggaggag gaagaggcgg cgaagaagaa gaaggagaag aagaagaagt 240 
ccaaaggcct ggccaatgtg ttctgcgtct tcaccaaagg gaagaagaag aagggtcagc 300 ccagctcagc ggagcccgag gacgcagccg ggtccaggca ggggctggat ggcccgcccc 360 ccacagtgga ggagctgaag gcggcgctgg agcgcgggca gctggaggcg gcgcggccgc 420 tgctggcgct ggagcgggag ctggcggcgg cggcggcggc gggcggtgtg agcgaggagg 480 agctggtgcg gcgccagagc aaggtggagg cgctgtacga gctgctgcgc gaccaggtgc 540 tgggcgtgct gcggcggccg ctggaggcgc cgcccgagcg gctgcgccag gcgctggccg 600 tggtggcgga gcaggagcgc gaggaccgcc aggcggcggc ggcggggccg gggacctcgg 660 ggctggcggc cacgcgcccg cggcgctggc tgcagctgtg gcggcgcggc gtggcggagg 720 cggccgagga gcgcatgggc cagcggccgg ccgcgggcgc cgaggtcccc gagagcgtct 780 ttctgcactt gggccgcacc atgaaggagg acctggaggc cgtggtggag cggctgaagc 840 cgctgttccc cgccgagttc ggcgtcgtgg cggcctacgc cgagagctac caccagcact 900 tcgcggccca cctggccgcc gtggcgcagt tcgagctgtg cgagcgcgac acctacatgc 960 tgctgctctg ggtggagaac ctctacccca atgacatcat caacagcccc aagctggtgg 1020 gtgagctgca gggtatgggg ctcgggagcc tcctgccccc caggcagatc cgactgctgg 1080 aggccacatt cctgtccagt gaggcggcca atgtgaggga gttgatggac cgagctctgg 1140 agctagaggc acggcgctgg gctgaggatg tgcctcccca gaggctggac ggccactgcc 1200 acagcgagct ggccatcgac atcatccaga tcacctccca ggcccaggcc aaggccgaga 1260 gcatcacgct ggacttgggc tcacagataa agcgggtgct gctggtggag ctgcctgcgt 1320 tcctgaggag ctaccagcgc gcctttaatg aatttctgga gagaggcaag cagctgacga 1380 attacagggc caatgttatt gccaacatca acaactgcct gtccttccgg atgtccatgg 1440 agcagaattg gcaggtaccc caggacaccc tgagcctcct gctgggcccc ctgggtgagc 1500 tcaagagcca cggctttgac accctgctcc agaacctgca tgaggacctg aagccactgt 1560 tcaagaggtt cacgcacacc cgctgggcgg cccctgtgga gaccctggaa aacatcatcg 1620 ccactgtaga cacgaggctg cctgagttct cagagctgca gggctgtttc cgggaggagc 1680 tcatggaggc cttgcacctg cacctggtga aggagtacat catccaactc agcaaggggc 1740 gcctggtcct caagacggcc gagcagcagc agcagctggc tgggtacatc ctggccaatg 1800 ctgacaccat ccagcacttc tgcacccagc acggctcccc ggcgacctgg ctgcagcctg 1860 ctctccctac gctggccgag atcattcgcc tgcaggaccc cagtgccatc aagattgagg 1920 tggccactta tgccacctgc 
taccctgact tcagcaaagg ccacctgagc gctatcctgg 1980 ccatcaaggg gaacctatcc aacagtgagg tcaagcgcat ccggagcatc ttggacgtca 2040 gcatgggggc gcaggagccc tcccggcccc tattttccct tataaaggtt ggttagcttt 2100 tcctgtggcc tgacctgcct gtgagtgccc agcaagcctt gggcacaccc cgctgggagc 2160 tgttaagagc agcgctggtt ctcggttcct cccgggtctc ctgtgctctg atgctacttc 2220 tgcctagccc tggcggaggt gcaggccctg tcagctggaa ctggacagac cttggtttgt 2280 ttacatgtcc gatgggggca ggagctccca tcctgggcag ccaaccaggc aacaccaagg 2340 actctttgta aacgatagct gatcgtgtgc acgcaaggaa agaaccagga gggagagtgc 2400 agccaggctc agggatcccc ggacacctct gtccagagcc cctccacagt cggcctcatg 2460 actgtcctcc tcgtgggtgg ggccgagggc cctcttcagc tctctggaga caggggccga 2520 gcctcaccca tctgccctct gcagcccagg gccgccgtga gcgggattca gcaatggtgg 2580 aatggaagac agaactggaa gagaaagaag gaaaagatga gctctcgtct ggcaggggct 2640 tttagggtcc tgtggcgagc tgtgagcacc gccagcgtta gacgtcacat ccaggtggcc 2700 ccacggcccc tacaggctgg ccctgcaatg gggccctgag ccctccctct tcatccccca 2760 aggcctcaac tagagggtgg tcccccgagg gcttggtgtc tactaccgaa gggcccaaga 2820 cctcctgggt cctctcaggc tcccccttcc ccaaggcagg gacaggccct gggggtgcca 2880 ccgtgggccc tgccacccag aagtctggct gaggtctggg caggggcagg gcaagcttga 2940 cctctcactg ttgacccttt ggcctctgta tttgtttcct attgccgtga caggtttcca 3000 caaacttcgt ggatcaaaac gaggtcttcc agttctgcgg gtcagaaggc tgacctgggg 3060 ctcaaatctg ggtgtcggca gtcctgcact ccttctggag gctctagggg agaattcatt 3120 tctggccttt tcatttttag aggctgaccg taattcttga cttcaggctc ctccatcttc 3180 agagccagct gtgggtagtt gaatcttttt cccgtcacct cattgaggcc tcccctctcc 3240 tgcctccctc caccactttt tttttttttt ttttgagaca gggtcttgct gtgttgccca 3300 ggctggagtg cagtggcctg gtcatggcat caaggctcac tgcagcctgg acctcctggt 3360 tcaagtgatc ctcttgtctc agtcccctga gacaatcccc cacgcccagc tacatatttt 3420 tgtggataca gggtctcatt ctgttgccta ggcttgtctg gaactcctgg gctcaaggga 3480 tcttgtagcc ttagcctcct aaagtgctgg gattataggc atgagtcact cgtacccggc 3540 ctgctctacc gcttttaagg acgcttatga tcacattgcg cctacccaga gaacccaggt 3600 cgtctttcta ttttcaggtc agctgattag 
ccaccttagt tccatctgca actttagttc 3660 ccactggctg tgtaacctaa catagtcaca ggctctgggg actgtcacgt ggacatcttt 3720 gggaggccgt tattctgccc accgcaccct ccgttcatcc cctgccctgc cgggcacctc 3780 gctctacccc aggaaaatgt gagctcgttt tcctgctcgg catgtgctcc ccctaaggct 3840 ctgctcctcc ctgggcctga aagttccttc tcagcctgag agggggccct tcgatctcag 3900 gcatgactca gcccggctga tgcctctgca gtgctgagtc aggatttggg gccggctctc 3960 ttgggtctgt ccccttttcc caggtactgc cttacaaagc tgtggccagg aagtggccgg 4020 tataaaggat gcccaaggtc tttgtacgtg tgtaggagtt agcgtgtttg atattgttaa 4080 tataataata attatttttt agagtactgc ttttgtatgt atgttgaaca ggatccaggt 4140 ttttatagct tgatataaaa cagaattcaa aagtgaaaaa 4180 94 1897 DNA Homo sapiens 94 gacgagagaa agcgagtgtc cctctcgcgc cccaggccgg tgtacccccg cactccgcgc 60 cccggcctag aagctctctc tccccgctcc ccggcccggc ccccgccccg ccccgcccca 120 gcccgctggc gccatggagc gctggccttg gccgtcgggc ggcgcctggc tgctcgtggc 180 tgcccgcgcg ctgctgcagc tgctgcgctc agacctgcgt ctgggccgcc cgctgctggc 240 ggcgctggcg ctgctggccg cgctcgactg gctgtgccag cgcctgctgc ccccgccggc 300 cgcactcgcc gtgctggccg ccgccggctg gatcgcgttg tcccgcctgg cgcgcccgca 360 gcgcctgccg gtggccactc gcgcggtgct catcaccggc tgtgactctg gttttggcaa 420 ggagacggcc aagaaactgg actccatggg cttcacggtg ctggccaccg tattggagtt 480 gaacagcccc ggtgccatcg agctgcgtac ctgctgctcc cctcgcctaa ggctgctgca 540 gatggacctg accaaaccag gagacattag ccgcgtgcta gagttcacca aggcccacac 600 caccagcacc ggcctgtggg gcctcgtcaa caacgcaggc cacaatgaag tagttgctga 660 tgcggagctg tctccagtgg ccactttccg tagctgcatg gaggtgaatt tctttggcgc 720 gctcgagctg accaagggcc tcctgcccct gctgcgcagc tcaaggggcc gcatcgtgac 780 tgtggggagc ccagcggggg acatgccata tccgtgcttg ggggcctatg gaacctccaa 840 agcggccgtg gcgctactca tggacacatt cagctgtgaa ctccttccct ggggggtcaa 900 ggtcagcatc atccagcctg gctgcttcaa gacagagtca gtgagaaacg tgggtcagtg 960 ggaaaagcgc aagcaattgc tgctggccaa cctgcctcaa gagctgctgc aggcctacgg 1020 caaggactac atcgagcact tgcatgggca gttcctgcac tcgctacgcc tggccatgtc 1080 cgacctcacc ccagttgtag atgccatcac agatgcgctg ctggcagctc 
ggccccgccg 1140 ccgctattac cccggccagg gcctggggct catgtacttc atccactact acctgcctga 1200 aggcctgcgg cgccgcttcc tgcaggcctt cttcatcagt cactgtctgc ctcgagcact 1260 gcagcctggc cagcctggca ctaccccacc acaggacgca gcccaggacc caaacctgag 1320 ccccggccct tccccagcag tggctcggtg agccatgtgc acctatggcc cagccactgc 1380 agcacaggag gctccgtgag cccttggttc ctccccgaaa acccccagca ttacgatccc 1440 ccaagtgtcc tggaccctgg cctaaagaat cccaccccca cttcatgccc actgccgatg 1500 cccaatccag gcccggtgag gccaaggttt cccagtgagc ctctgcgcct ctccactgtt 1560 tcatgagccc aaacaccctc ctggcacaac gctctaccct gcagcttgga gaactccgct 1620 ggatggggag tctcatgcaa gacttcactg cagcctttca caggactctg cagatagtgc 1680 ctctgcaaac taaggagtga ctaggtgggt tggggacccc ctcaggattg tttctcggca 1740 ccagtgcctc agtgctgcaa ttgagggcta aatcccaagt gtctcttgac tggctcaaga 1800 attagggccc caactacaca cccccaagcc acagggaagc atgtactgta cttcccaatt 1860 gccacatttt aaataaagac aaatttttat ttcttct 1897 95 2291 DNA Homo sapiens 95 gaacaatgaa gaaagcccca cagccactgt tgctgagcag ggagaggata ttacctccaa 60 aaaagacagg ggagtattaa agattgtcaa aagagtgggg aatggtgagg aaacgccgat 120 gattggagac aaagtttatg tccattacaa aggaaaattg tcaaatggaa agaagtttga 180 ttccagtcat gatagaaatg aaccatttgt ctttagtctt ggcaaaggcc aagtcatcaa 240 ggcatgggac attggggtgg ctaccatgaa gaaaggagag atatgccatt tactgtgcaa 300 accagaatat gcatatggct cggctggcag tctccctaaa attccctcga atgcaactct 360 cttttttgag attgagctcc ttgatttcaa aggagaggat ttatttgaag atggaggcat 420 tatccggaga accaaacgga aaggagaggg atattcaaat ccaaacgaag gagcaacagt 480 agaaatccac ctggaaggcc gctgtggtgg aaggatgttt gactgcagag atgtggcatt 540 cactgtgggc gaaggagaag accacgacat tccaattgga attgacaaag ctctggagaa 600 aatgcagcgg gaagaacaat gtattttata tcttggacca agatatggtt ttggagaggc 660 agggaagcct aaatttggca ttgaacctaa tgctgagctt atatatgaag ttacacttaa 720 gagcttcgaa aaggccaaag aatcctggga gatggatacc aaagaaaaat tggagcaggc 780 tgccattgtc aaagagaagg gaaccgtata cttcaaggga ggcaaataca tgcaggcggt 840 gattcagtat gggaagatag tgtcctggtt agagatggaa tatggtttat cagaaaagga 900 
atcgaaagct tctgaatcat ttctccttgc tgcctttctg aacctggcca tgtgctacct 960 gaagcttaga gaatacacca aagctgttga atgctgtgac aaggcccttg gactggacag 1020 tgccaatgag aaaggcttgt ataggagggg tgaagcccag ctgctcatga acgagtttga 1080 gtcagccaag ggtgactttg agaaagtgct ggaagtaaac ccccagaata aggctgcaag 1140 actgcagatc tccatgtgcc agaaaaaggc caaggagcac aacgagcggg accgcaggat 1200 atacgccaac atgttcaaga agtttgcaga gcaggatgcc aaggaagagg ccaataaagc 1260 aatgggcaag aagacttcag aaggggtcac taatgaaaaa ggaacagaca gtcaagcaat 1320 ggaagaagag aaacctgagg gccacgtatg acgccacgcc aaggagggaa gagtcccagt 1380 gaactcggcc cctcctcaat gggctttccc ccaactcagg acagaacagt gtttaatgta 1440 aagtttgtta tagtctatgt gattctggaa gcaaatggca aaaccagtag cttcccaaaa 1500 acagcccccc tgctgctgcc cggagggttc actgaggggt ggcacgggac cactccaggt 1560 ggaacaaaca gaaatgactg tggtgtggag ggagtgagcc agcagcttaa gtccagctca 1620 tttcagtttc tatcaacctt caagtatcca attcagggtc cctggagatc atcctaacaa 1680 tgtggggctg ttaggtttta cctttgaact ttcatagcac tgcagaaacc ttttaaaaaa 1740 aaatgcttca tgaatttctc ctttcctaca gttgggtagg gtaggggaag gaggataagc 1800 ttttgttttt taaatgactg aagtgctata aatgtagtct gttgcatttt taaccaacag 1860 aacccacagt agaggggtct catgtctccc cagttccaca gcagtgtcac agacgtgaaa 1920 gccagaacct cagaggccac ttgcttgctg acttagcctc ctcccaaagt ccccctcctc 1980 agccagcctc cttgtgagag tggctttcta ccacacacag cctgtccctg ggggagtaat 2040 tctgtcattc ctaaaacacc cttcagcaat gataatgagc agatgagagt ttctggatta 2100 gcttttccta ttttcgatga agttctgaga tactgaaatg tgaaaagagc aatcagaatt 2160 gtgctttttc tcccctcctc tattcctttt agggaataat attcaataca cagtacttcc 2220 tcccagaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2280 aaaaaaaaaa a 2291 96 15571 DNA Homo sapiens 96 aagcttcctc actccttggc acctggctcc gacatcacat tgacttttcc cttcctgctt 60 ctaccatcac atcaccttct tctgactcca atctcctgcc tctttcttgc aaggatcctt 120 gtgattgtaa ttaggaccca gctggataat ccatgacaat ctcttcaaga tccttaactt 180 aatcacatct gcaaagtccc ttttgtcata gaacgataac attcacaggt tctgggtatt 240 aggacacgga taacttcggg gttccattac 
tcacccataa ctggtatgca gtgctgattt 300 ccatcctgta ggtacggttt agggatctct aggtcaatga gataatggac tcttgctcat 360 gttacatggc ataatgggaa gaaagccaaa cctagaaaaa gagggactca ggttcccttg 420 ttagagcctc ttcttactaa ctgtgggatt aggggctgat tccctgacct gctgtgttct 480 gtcttctctc caattcaatg ggaatgaact gtgagggcac tgagcaaaaa ctaaggtctc 540 aatacctagt agtagtggga cttgcctctg gatacccagt agtgatcctg ccgcctgttt 600 ctggatatcc tacagtagca atcccacctg tttctggata tcctacagta gtgatcctgc 660 ctgtttctgg atatcctaca gtagcaatcc cgcctgtttc tggatatcct acagtagtga 720 tcctgcctgc ttctggatac ccagaagtga tcatgcctgg ttctagatac ccagtagtga 780 ttgtgtttcc tctagatacc cagtaggtat tgtgcttgct tctagagatg tagtagtagc 840 tggacttagc tctagatacc cagtgctcat ctctagactg ggctgagatc agtgtctccc 900 ttgaagggtt attgtaagga tgaaaaaaga taatgcgttt aaagcacttg gtgtagtagg 960 tggtcttttt aaaagtgtga ataaatacta gttcttatta tttctgtgga tatccaacag 1020 ccacataatt gggccccaaa gccatgaaga aggaagagga aatgtcttaa aggttgtcga 1080 tggacagtgt ttgctgaaca tcaaaatcac tttccaggta ttacctctga tttgctctac 1140 caactccaca ccccacctgc agccacataa ccttccatga tcacggccat gcacaacaca 1200 ccatgtcccc caggcaaggg gaccttagaa acataaccag gcttgagaca gcactctgca 1260 ccggtgtctt ggaaatgctc ttaagagtgt atggctgagt tagggaacca ggatttcaaa 1320 gtagaaaggg agaatctacc caagcccata gaaatcctga atccactcct ttctcagcaa 1380 caagcactgg cctgggagtc agccacttat gcaccaaccc cactctgccc ctaattaaat 1440 gcatgacttt gaaaattccc ctcattcttc tgagccccaa ttcagtgatt ggtgcaatca 1500 caggcttggc tacagtgacc cattcattgc aggcatggtg agactctcaa tccctctcat 1560 ttccactaga atctaactgt tgggatctat gacccagtca gcatagcagg cctgtgggga 1620 gctctcaggt tcaagcatat gcccccccta atctacaaga aattagctgc agaaaaccaa 1680 ggaatagaac ctggaaaaag agagggtttg ctagagctgt ccctttccct gtctctggaa 1740 tgccaacaat agggaggctc tttggtcttg tctctcagga gtgcccatgc cattccagga 1800 aaatgatggc ccagctggtg gtgtaaggct tggggggcag cgagtgggca tcgtggtgaa 1860 agcctcggga tcagggagct gcgtctgcag gcaggcctgc tggccggaaa cctgccagga 1920 aaggaagggg ctgtctcggg gcggggccag ggaggggtgg agacagggcc 
ggctgtggtc 1980 agtgacaaat gctggctgca atccagccag ccctctgccc tttctgagcc cgagggactg 2040 ccacctccac tgtgtgcaca ctcagctacg ggacacagta agtaccgatg ccgcaaaggg 2100 aggtccccag ggcttgaggg catgtgaggc gaggagagga tggactctag agttttgggg 2160 tttggggtct gcaaagctct gaaggagtct catctctgca gtttcaggta tccaaggcag 2220 cagaggtgag tgggtccccc gagctctgtg accttatgct ccacactaac tctggcagag 2280 cctccgtttc ctcataggta agatggaaat aattacaccc tctggatggt gtgactgaag 2340 attaaataca gcgggtgctc tcactcagca catctggcca tgtctgcaga cacatttggt 2400 tgccacaact ggaagggggg tgggggttag tgacatctag aggccagcga tgctgctgat 2460 gatcccacaa tgcccaggac aagatcacaa agcatcatcc tgttcaaaag gtcaacagga 2520 tcaaggttga gagaccctga aataaggcca tggggacaaa atgtcggctg gataggaggt 2580 gctcagtaag tggcagcttc tgttgttttc tgtgcctgga gtcttggggc tttagaaatc 2640 aggaacaatg atccaatatt atcggcttcc gtgagataag ggcatcttgc ctggaggctg 2700 ccacccaggc cggtcatggc agctgctcat gaaggacagt aacaatttgg cagtttgtta 2760 aatgaacaaa atgtagaaat aaagtaagca gaatttttag tttttctgaa ggtagggctt 2820 ttggccagat atgcagcaat aaaagagcaa actgcttcct tgggccagtg tccttgctca 2880 tagatcagga aaccgaagca tgaagaatac aggcggcaga tgcctgaagg taacggacgt 2940 gttcatggtg ctgacggtga tgataagtga cagatgtaga ctcatctcca aacttgtcag 3000 gttatagaca ttaaatatgt gcaactttat gaatagcagt catgtctcaa tcaagtggtt 3060 ttaataaaga aataatagga agccagagct gagagacagg gagggagttg ttcaaggtca 3120 cctggcaagt gagctccggg gcggggagag ctcagctctg ggtggccagc ctggcttttt 3180 ccactgctca gtgtccagct tgcagtctaa tgtctcgaat tacagagaag gagactggtc 3240 agttcattca ttcattcatt ctacaaaggt ttatggagca tctctcctga ctgcaagctc 3300 ttgaaggtga gagcagcaca aatgagggtc ccatggagag agaggccgga atgaaaaatg 3360 tcaatgacaa atgcatatat aaaggcacat gtgtaattga aagagctttg agagaaagag 3420 tcaagggact gttccagaga atagccatgg aaggggaaaa ggtccagtgt gataaggtat 3480 tgcaaagaag tgacatttaa gcaaaagcct gcagcctatg cagaagttgg cctcagtgag 3540 aaaggttggg ggagggttcc agtagagagg gaaggtatgc aaaggcccag agttaggaca 3600 gaacttgctg tgtttgagaa actgggaaaa gaagagtgag cctgggggta tcacgtgatc 
3660 cagggcagag caggtccagg ccaggtgcag ccaggtcaca gcagccctag tgggttagag 3720 cacaaatcaa agtttagcat ttatctgaaa cacaggagtt ggccatgagt ttcttaggcg 3780 aggaagcgct gtgaccatat ttatgattga aggagattct tttatatgct gtatatagaa 3840 agcctttcag ggcaaagaaa ggaagctact ggggtagccc tgggggagat gaagggagct 3900 tccactgggg gcagtaagaa agccagggaa aggcggcagc tttaagacct gttttggaga 3960 tagaacggac aagctttgct gatgggctgg agtggaacag gaagtcaaga ttacttcttc 4020 tgggaagttc tgttcctggg tctttaggat ctagaggaag ctgtgacttt gtctctcatc 4080 tctgcctggg ctccaagcct cacatccctt tttgtaatta gaagatattg gacagaccgt 4140 cctcactaac acaattccca cagctgagtc cagggtagaa ctgggcagga cttcactgcc 4200 caacacggga aatatcagtc agcagatttg ggtttcgggg atggtggtgg gccagcggga 4260 agactgacca gggcctaccc atcacatccc caccacctcc cacctcaatt caccttggcc 4320 tgagatgaca ggtgaacatg actgatcctc tctcttccct ctgcagaaac actaaagcca 4380 gggaccagga gaggggcagc ccaaccaagc tttcaaagca ctcagtagag gctggtctgg 4440 gggatgggag gctcccaggg cttcacctgt ctctgtcaaa gccatgtatt tccaccagag 4500 gcccaagagt gcgatggcaa accctggatt tgaaactaag aaacgtaaaa caagcactga 4560 ggactccact gcctcttgag tgacctctct gaccctctgt ttcttctgca ctgttaggat 4620 aatgatacta actccatgtt gttgtagaga agtataaatg agctaataca ggtgaaccgc 4680 ctggggatac caggaggtga ggtcgaggag gaacgaggta tcactcctca gagccactca 4740 gagagaggct gtgcacgagt cagaggaacc tggattttaa ttccggttcc atcactcagt 4800 agctgaaaca agctattcca cttcacttag cctcagtcta ttcaatctgt aaaatagagt 4860 gagtttactt ttggaaaact ctgtaaaata gagagcttac ttttggtgaa ggttaaacat 4920 agtaatattt atggagtgtc tagtatgtct ttaataatta gtggttttac tgaaaagtag 4980 agagagttgg cccagaggga gcaagatttc tgggtctcaa acatgtagcc caggagagcc 5040 taagtgaacc tggggccctc tccaaacaga tcctggggga gactcagtgc acacccggag 5100 aagcagctcc tccccatcgg atctctagtg cttggcaggg ggcggggtct tgagggggtg 5160 tccacaacac atggcagact gcagatgaag aaactgaggc ccagaggggg tgaggcttgc 5220 ccagggtgac ctagtagctg aatagatggg agaatggagc cagggcctca ctgagactct 5280 ctggtcagct gcccctgggc tgtatccaat aaggaaactc ccctgcttct gaagctgttc 5340 
tcgaaattat cagctcagtg tgaccctgtg gggggttgag ccacattgtt tctttagaag 5400 catctccata catggctggt tccaaccctt ggcaggaggg accatattgt gctgtaaaat 5460 agactcattt agagaagccg gagattaaag cacccaccta tgtccttcaa agctctccag 5520 gcaagtgcca tggtgggaac aggtagggag tgtcagtggg gggaagccca gactctgctc 5580 actcattatc tgcagattag ggctattgtt ggtggctact aagtcaggga tttcaaaatc 5640 aggaagatgc agccaggaaa agaggaggca ggactctgca gaggaggcag gactctgcag 5700 agtcagagtg ataaccgagt ctgagtccaa gctttgccag tgttagcaag cgactccatc 5760 tctctgaacc tcggattacc catctgtaaa atagagctag cagcaagatg tacctttttg 5820 ggtggtgcag ggctgaagga gttggcacag tgcctgaaag agggtgcggg caatgcgccc 5880 aactgctgtg gctgctgggt ttggtgccag gttcgattct gcaggcagaa acttctacat 5940 gaggctcctt ctcggaagga gctcaggaca caatttggag gctgggctgg caagggtgac 6000 ctgctggagc tattcaactt cacttaaaga caggcctgca gtccaagcct gcccaattcc 6060 tgagaccatt ctctctccac tgctgagccc cacggccact ctgcaaggga tttcccaccc 6120 acctgtttgg ggccctttgg agtttggttt taattgggtc acgggatgct gtgacaggct 6180 gcccctgcct ggtggggatc tggggtcact gatgacattg tgcccatgga gagagcccag 6240 cagaaaggga ttccctccaa ggcgacacac agggcaaagc tcacatcaga agccaggcag 6300 gccctctgca cctggtaatt agccggcccg ggtgctgtca ggctcacacg tgtgtgtgtg 6360 tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg taaagcatgt accctatggt acagttgaga 6420 atatggaggc ctcagatggg gcttttgcag aaactgccat gcctactgct cacacttcca 6480 tagcacgtgc ccccaagcac cccatggtgt aggtgctgtt attatcacta tcttacagtt 6540 atggagcagt ggctcaaggt gtaactgatt tgcccaaaat cacactacaa ggacacagca 6600 gggctgagat ttgaacccag gcagtggctt cagagcctga gctgtttcct actgcagagg 6660 gaggaggcaa gacttctacc cgtagccaga tggggaggca tgggcacagg aacggctctt 6720 gggtgaagtg gagggaggaa gaggaggact gaaggcgaag gccacgtcag gagtgatggg 6780 ataccccaca aaggcctccc tgagaagcgc tagagacaaa gatgagtgcc tcctcatctg 6840 gaagatgaaa agatgtcttt gcctgcatgg gctgccgtca caaagtccca ggggctaggg 6900 ggcttcaaca acagaaattt ctttctttac aactctggaa gctggaagtc tgagattaag 6960 gcaccagcag gatttgttcc ttccaaggcc cctctccttg gctcacaggt ggctgccttc 7020 tccctgtctt 
cacctggtct tccctctgtg catgtctcta tcctgatctc ctctttttaa 7080 tttttgtgta aggacgtagt catattgggt tggggcccac tctagtgacc tcattctaac 7140 tcagtcccct ctttaaaagc cctatctcca gatatagtca cattctgggg tattgaaggt 7200 aaggacttca gtatatgcat tttgggggca caattcagcc agaacaggag gacggtgggg 7260 atgtccacat gaagaggttc aggcagaatt cctttaggag gggaagatgt ctctctgtgg 7320 gacaagggtg gcatggagca gcccctgggg gaaggagaag gggacagttt gcatactggt 7380 attctgccta ccccagggtg gacactcact cagcgtttgc tgaatgaaca gggcaaggcc 7440 agcagtgctg atggtcccag gcatgtagct ggtctgagtt catagaagga ccacagcgcc 7500 ctgccatgtg ccaaaccagg acaccagagt gaaggccaga agctcacatg gaagcagctt 7560 agttccctgg taacctcgag atgctgatga gacagagcag agcagaggga accctctccc 7620 tccatatccc atcctccaaa atgtgtccct tgatgtggat gggtagacag gattcctgcc 7680 ctggcagcca gacccctgcc ttgggtctgc acctcctctc cctccttcct ctccccgtca 7740 tccctaaatc ttgtcctcga gccactgcca ccctgtgtaa accctcatgt ccagtcttgg 7800 gggtgccatc ccttctcttt aaagctgaat ggaccaaaca tacccattga gtgttgggtg 7860 gggacatctc tggaaagtca gcacctggac cagctccacc cctctctgag gacaccttct 7920 ttccctttca gaacaaagaa cagccaccat gcagctcttc ctcctcttgt gcctggtgct 7980 tctcagccct cagggggcct cccttcaccg ccaccacccc cgggagatga agaagagagt 8040 cgaggacctc catgtaggtg ccacggtggc ccccagcagc agaagggact ttacctttga 8100 cctctacagg gccttggctt ccgctgcccc cagccagaac atcttcttct cccctgtgag 8160 catctccatg agcctggcca tgctctccct gggggctggg tccagcacaa agatgcagat 8220 cctggagggc ctgggcctca acctccagaa aagctcagag aaggagctgc acagaggctt 8280 tcagcagctc cttcaggaac tcaaccagcc cagagatggc ttccagctga gcctcggcaa 8340 tgcccttttc accgacctgg tggtagacct gcaggacacc ttcgtaagtg ccatgaagac 8400 gctgtacctg gcagacactt tccccaccaa ctttagggac tctgcagggg ccatgaagca 8460 gatcaatgat tatgtggcaa agcaaacgaa gggcaagatt gtggacttgc ttaagaacct 8520 cgatagcaat gcggtcgtga tcatggtgaa ttacatcttc tttaaaggta aggcccttgg 8580 gcccaaacct gcactttctt tggcttttct gctgctttta tctaaagaat acccaattcc 8640 ctcacataca taaaagacgg ggagtacgtt aagttctttt gggtgcctgt tgagaaaaat 8700 taagtaaaca agcagccaga 
gaaggtaaga tgaatgcctt cttgctgtgg atgggattag 8760 tgaggctgag atgctgtttc ctccacggag gaagagctgg ttgctgtctt cgggcccctg 8820 gggacatctg aagccccagc tttctacagg ctctgaagta tgaacccatt gtggccacca 8880 tggcaaagac accaacacct tagccactca gggcaggaca cagaccccag aagggcttaa 8940 agggcatttc ccagtccccc gtatccctca gatcttggcc cctctgccct catagaggcc 9000 aagactccct cagacaaatg cttgttcctc tgaaatgcct cctcctgact cctcagcaag 9060 agctgacctc tgcttatctc cccgacactc cttgtaagca ttcctgctcg cctctgcagc 9120 tcctgccagt tgctgaccct ggggaaagca agagtggata gagaggagaa gagaggagag 9180 gagagggtgg gaagggttgc gaaggaaggt aaattgttaa cacctcccct tcctatggtc 9240 acagatcatg agtatctttg gccatttggg tggctataac aaaataccat aaactgggtg 9300 gcttagcaac aacaaacata tatttctcat agttctggag gctgagaagt ccaggatcaa 9360 ggcactggca gatgcagtgc ccattccttg gttcatagag agtgccttct tagtatatcc 9420 ttgctggaag gaggaaggca gctctctgtg gtctcttttg taaggacacc gatcctgttc 9480 atgacagctc cacccccatg acctaatcaa ctcccaaagg ccccctgtcc taataccacc 9540 accttggggg ttaggtttca acatatgaac aatgtgggga cacaaacatt gagaccacag 9600 cagtgagtgt cgaacttgga ctctgagatt tcctatcccc tggtgcaggg cagtccccat 9660 tacaccagat tgctgagggc agctgggaaa taagctaagg acggtattga ctggggtctt 9720 ccttcgataa cgattaagaa gttggaaaca ggccaggcat ggtggctcac gcctataatc 9780 ccaacatttt aggaggccga gatgggcaga tcacctgagg tcaggagttc gagatcagcc 9840 tggccaacat agtgaaaccc cgtctctact aaaaaataca gaaattagcc aggcatggtg 9900 gtgggcgcct gtaattccag ctacttggga ggctgaggca ggagaatcac ttgaacctgg 9960 gaggtggggg ctgtagtgag ccaaaattgc gccactgcac tccagcatgg gtcacagagc 10020 gagactccat ctcaaaaaga agaaaaaaag aaaaaaaaga aaaaaagaaa taaaataaaa 10080 taaaaagaag ttggaaacaa tcacttgtag cgttttgttc agaagttccc ataggaaggt 10140 cagagaaggg tcattgaaga cttcccaatg ggaaaaacca ttcatttcca ggatccatac 10200 taacttcttt ctaaaattta aatcaaaata ttggaatgaa agtgcaaaca gagaagttca 10260 cccagatatc aggtagcatt cacagccagc cacatttttc accctcttca cttggagatt 10320 tggtcttgag taaaacgtta gagaatcaga gaacatcagg gatccagggc ctctgaagat 10380 gtgaaaacca acctccttgt 
tttgcaaatg tggaaggaaa agtcccacga aaagtccaag 10440 aatgtgccca atgttataaa gagacttgcc ttcatattca agaggttcaa cagtcactgc 10500 tctggggctg ccataaagat ggtctccgct ggctatcttt actgtcttca ctccttttat 10560 ttgcagctga gaatttctaa ttctgacaca aaattctttt tcatttttcc cttttttcat 10620 ctttagctaa gtgggagaca agcttcaacc acaaaggcac ccaagagcaa gacttctacg 10680 tgacctcgga gactgtggtg cgggtaccca tgatgagccg cgaggatcag tatcactacc 10740 tcctggaccg gaacctctcc tgcagggtgg tgggggtccc ctaccaaggc aatgccacgg 10800 ctttgttcat tctccccagt gagggaaaga tgcagcaggt ggagaatgga ctgagtgaga 10860 aaacgctgag gaagtggctt aagatgttca aaaagaggta ctttcagact accccagggc 10920 cagcctaaac ccacacagcc ccagggagac acacacgccc taccagggcc acacagcact 10980 ggtgggaagg actcacccag ccaaggagct gcctccaggc ccagaggcat cctgtgacat 11040 ccaagtcctg ggggcctagc ccagttggag ggacaagagc tggaaactgg gttccttagg 11100 gtggtgccag agtgggcaga gacctctggg cagcccacgt ccaagtccag agcaagggga 11160 ggctcatcct agaaaagagg ccagaggagc cataaccacc attgttcctt gggttaagga 11220 gtcctttttt aaaaccatca aaactaagaa tccagtgcat tatgaatcca aggggtgagg 11280 ctcagtgtgc caatgcccca gaacagtcta agaaagctcc ttttcccttt ccaggcagct 11340 cgagctttac cttcccaaat tctccattga gggctcctat cagctggaga aagtcctccc 11400 cagtctgggg atcagtaacg tcttcacctc ccatgctgat ctgtccggca tcagcaacca 11460 ctcaaatatc caggtgtctg aggtgggttc agaagctcct atgcatctgc ttcccaagat 11520 ctattctgtt ctattctttc tattctactc taccccattt cattccattc cattccactc 11580 aactccactc cactccactc cactccagtt cactctattc aattccactc cactccagtt 11640 cactctattc aattccactc cactccactc cagttcactc tattcagttc cactccactc 11700 cactccactc cactccagtt cactctattc cattccactc cattccactc ctccactcct 11760 ctcatccact ccactctact cctccactcc acatctccac tccactcctc cactccactc 11820 ctccactcca ctcatccact ccactcctcc actccactcc tccactccac tcctccactc 11880 cactccactc atccactcca ctcttccatt ccactccatt ccactcctcc actccactct 11940 tccactccac tccattccac tcctccactc cactccactc tattctattc tattccattc 12000 cattctactc tattctattc cattccattg cagtcaactc cactccactc tctactattc 12060 
tattccactc ctctcccctc cactccattc cattgcagtc cactccactg cactccactc 12120 ctttattctg ttctgttcta ttctattcta ttctattcta ttctctccct ctccctctct 12180 tttcccacaa gtagtgaaag tttcactttg tgtcttatcc ttcatgtaat gggaagccat 12240 atccaccact gttccttgag ttaaggagtc ctgttttaaa caatcaaaac taagaaggca 12300 cttcctagct atgtgatctc caaaaaatac ttgactctct gagcttcctt tctctcttct 12360 ataaaattga agaattacac cttgctcaaa gatgccatga gaattcaatg acagacacat 12420 gcgaagtcac cccccagcac agtgcctggg gcagagtagc tgctccattg ttccatttcc 12480 tacttgctcc atggctcagt tgaacagata cttagaggtt gatgcccata ggcagaagct 12540 ttgccatttg ctatgatgac ttcacctgcc cctggtggcc tggtgatgcc tggtgtctcc 12600 cctgcagatg gtgcacaaag ctgtggtgga ggtggacgag tcgggaacca gagcagcggc 12660 agccacgggg acaatcttca ctttcaggtc ggcccgcctg aactctcaga ggctagtgtt 12720 caacaggccc tttctgatgt tcattgtgga taacaacatc ctcttccttg gcaaagtgaa 12780 ccgcccctga ggtggggctt ctcctgaaat ctacaggcct cagggtggga gatgaagggg 12840 gctatgctat ggcccatctg tatgctggta gctagtgatt tacacaggtt tagttgacta 12900 atgaggcatt acaaataata ttactctatg atgattgctt ccacccacac gactgcaaca 12960 tacaggtgcc ttggggaaat gtggagaaca ttcaatcttg ccgtcactat tcatcaatga 13020 agattagcac tgagatccag agaggctgga tgacttgctc aagttcacca gcatggtagt 13080 ggcaaagaga ggtccagagt cctggccctt gatgcccagc tcagtgccac aaagctcagt 13140 aggagggatg ttccagtgga tgagggccac caggaagcac aggtccaagg ctggtcccac 13200 acttatcagc agcaacaact gtcagttcat cctgcatggg aaaaatgttg gaatgggagt 13260 ctgaaatggg gctactgttt cagtcctaac gtgctgtgtg acattgggac aacactttcc 13320 ctctctggac ctcagtttcc ctctgtatac aaggatcaga ttcttgctgt gacccaagaa 13380 ctcctgaaat catatagaaa ggctggggtg ggccctgtca ttcgtggttg atttcaatac 13440 actcaagtgc cattcatcct ttaagaaaaa catctggata tcaaggtgga aatggcccat 13500 ttaatgattg attatatcat tttgtggata tagttataat ctgatgggcc tggctgggag 13560 tggaagaagg gaagcctttt gcaaatagta gagtgtcagt tgcaggtgcc aatgactaac 13620 tttttgaatt ctatgttggc attaacaata aagcattttg caaacactgg ttataactgt 13680 ctttatggag gcagctctgg gaatggtgac attgatagct taccatgctc 
caggccgggt 13740 gcctggccct tcacctggat ggtcgcattt gcccctcata agactcccat gaagaaaggc 13800 accactatta tcccatctgt tattcacaga tgggaaaggc aaggcttgaa gtggttaggt 13860 ggcttaccca gtcacatatc ttctaagtgg tgcagccaga atttggcggg gggagtgcga 13920 ccaagaaccc tacactcagt cctgtgctct gtgctgtgga ggagagatga ccaggagcag 13980 aaacttcatt caggggcatc tcaggcacca gctcccccat gagccagcta agttccctcc 14040 ctcccttcac caagcaccat gtgtttcctc atgtgccaaa tgaagaggat tagatactca 14100 agaatggaat gagtgggtga gtgagtcctt cgctgcaccc aagtctgatt ttctgtgcgc 14160 ctgctcaccc caccctgcat gttctaagca tgcttccata aggctgtgcc ccaccctctg 14220 attctagagt ctggactgta tcagaggtga gtgcctacta gaggtaacaa ggtcaggacc 14280 ccaaaccttg tccatccccc aaagtactga gcccccacca tgcaccagcc catgccagat 14340 gctttgcact tgtgatatca cccatccctt gacaacccag caagttctat tattgttccc 14400 attttacagg caataacata agtgctttcc cagggtccca cgctggtgac agtgagggcc 14460 cagggtctga gagcccagat cgcacatgtg cgggctggtg gcaggggaga tggcagcaac 14520 cagactcaga catttctctg cagttgtgct gtgggctcag ggtggctctt tacgaagggg 14580 ccccttcgtg gggtcatgca ctcctgtgtg ctttcccttg catcatgcct tgcctgtctt 14640 ggcaaatatt tctctggagt ttacccagcc agtccaaggt cacagggaag ccctgtctgt 14700 gtctcacaca gaaggtcaac gtccagcact gtccaaactt tactcagcaa acagtcacaa 14760 agcagctcct gtgtgggggt cggggtggct cactgtggtc tctgctgcat gtcacacatt 14820 gaagcactgt gctggggtca tcgcaggctg tttaactcaa ttgtcacatg agcctgggtg 14880 cacaaaatgg tagagcagct cagagagaga tggacagaca gcatgaacct ctgaggagtc 14940 aggttttctt ggatgaaggg acactaagat ggctttggag cgtgagaagg acctcaccta 15000 gcaaatgtgg gaaaggagtg agacctccag gcagagggac tggctggaga cgagcgtgat 15060 gtggtgagcc atggagtgta tgggtcccca cagaacttca gtctgggcct gcacagggca 15120 tgtggaggag acaaggagga gggaggtcgg tgccggcggt tcagtgacag agatcctaaa 15180 tgggaggcca gtgttttgtc tgatctcttt catcccaatt tcagggtagt ttggtcatcc 15240 acgccacatt ccaagtgtcc cctgggccct ttctctccct cacccccctg tctgcacatg 15300 agtagatgcc tccacgcagc cctcccagga cgctcacctc tatccacaga tgcttctcca 15360 aaacccacca ggccctccca tggaacgagc 
tcacctacag ggtaaaatca ggtcacggtc 15420 acatataggc ctgactactc ccctcaggac cctcattcac agccactgta ttaatttgct 15480 ggggctgcca aaacaaagtg tcctcatctg ggaggctgca gtagatttgc tgaaattgat 15540 ttgctagcgt tgctgaaatt gattcaagct t 15571 97 4279 DNA Homo sapiens 97 cagacaggat attcactgct gtggcaaggc ctgtagagag tttcgaagtt aggaggactc 60 aagacggtcc ctccctggac ttttctgaag gggctcaaaa gatgacacgc gccagagctg 120 gaaggcgtcg ccaattggtc caacttttcc ctcctccctt tttgcggatg agaaaaactg 180 aggcccaggt ttgggatttc cagagcccgg gatttcccgg caacgccgac aaccacattc 240 ccccggctat tctgacccgc cccggttccg ggacgctccc tgggagccgc cgccgagggc 300 ctgctgggac tcccggggac cccgccgtcg gggcagcccc cacgcccggc gccgcccgcc 360 ggaacggcgc cgctgttgcg cacttgcagg ggagccggcg actgagggcg aggcagggag 420 ggagcaagcg gggctgggag ggctgctggc gcgggctcgc cggctgtgta tggtctatcg 480 caggcagctg acctttgagg aggaaatcgc tgctctccgc tccttcctgt agtaacagcc 540 gccgctgccg ccgccgccag gaacccggcc gggagcgaga gccgcggggc gcagagccgg 600 cccggctgcc ggacggtgcg gccccaccag gtgaacggcc atggcgggct ggatccaggc 660 ccagcagctg cagggagacg cgctgcgcca gatgcaggtg ctgtacggcc agcacttccc 720 catcgaggtc cggcactact tggcccagtg gattgagagc cagccatggg atgccattga 780 cttggacaat ccccaggaca gagcccaagc cacccagctc ctggagggcc tggtgcagga 840 gctgcagaag aaggcggagc accaggtggg ggaagatggg tttttactga agatcaagct 900 gaggcactac gccacgcagc tccagaaaac atatgaccgc tgccccctgg agctggtccg 960 ctgcatccgg cacattctgt acaatgaaca gaggctggtc cgagaagcca acaattgcag 1020 ctctccggct gggatcctgg ttgacgccat gtcccagaag caccttcaga tcaaccagac 1080 atttgaggag ctgcgactgg tcacgcagga cacagagaat gagctgaaga aactgcagca 1140 gactcaggag tacttcatca tccagtacca ggagagcctg aggatccaag ctcagtttgc 1200 ccagctggcc cagctgagcc cccaggagcg tctgagccgg gagacggccc tccagcagaa 1260 gcaggtgtct ctggaggcct ggttgcagcg tgaggcacag acactgcagc agtaccgcgt 1320 ggagctggcc gagaagcacc agaagaccct gcagctgctg cggaagcagc agaccatcat 1380 cctggatgac gagctgatcc agtggaagcg gcggcagcag ctggccggga acggcgggcc 1440 ccccgagggc agcctggacg tgctacagtc ctggtgtgag aagttggccg 
agatcatctg 1500 gcagaaccgg cagcagatcc gcagggctga gcacctctgc cagcagctgc ccatccccgg 1560 cccagtggag gagatgctgg ccgaggtcaa cgccaccatc acggacatta tctcagccct 1620 ggtgaccagc acattcatca ttgagaagca gcctcctcag gtcctgaaga cccagaccaa 1680 gtttgcagcc accgtacgcc tgctggtggg cgggaagctg aacgtgcaca tgaatccccc 1740 ccaggtgaag gccaccatca tcagtgagca gcaggccaag tctctgctta aaaatgagaa 1800 cacccgcaac gagtgcagtg gtgagatcct gaacaactgc tgcgtgatgg agtaccacca 1860 agccacgggc accctcagtg cccacttcag gaacatgtca ctgaagagga tcaagcgtgc 1920 tgaccggcgg ggtgcagagt ccgtgacaga ggagaagttc acagtcctgt ttgagtctca 1980 gttcagtgtt ggcagcaatg agcttgtgtt ccaggtgaag actctgtccc tacctgtggt 2040 tgtcatcgtc cacggcagcc aggaccacaa tgccacggct actgtgctgt gggacaatgc 2100 ctttgctgag ccgggcaggg tgccatttgc cgtgcctgac aaagtgctgt ggccgcagct 2160 gtgtgaggcg ctcaacatga aattcaaggc cgaagtgcag agcaaccggg gcctgaccaa 2220 ggagaacctc gtgttcctgg cgcagaaact gttcaacaac agcagcagcc acctggagga 2280 ctacagtggc ctgtccgtgt cctggtccca gttcaacagg gagaacttgc cgggctggaa 2340 ctacaccttc tggcagtggt ttgacggggt gatggaggtg ttgaagaagc accacaagcc 2400 ccactggaat gatggggcca tcctaggttt tgtgaataag caacaggccc acgacctgct 2460 catcaacaag cccgacggga ccttcttgtt gcgctttagt gactcagaaa tcgggggcat 2520 caccatcgcc tggaagtttg attccccgga acgcaacctg tggaacctga aaccattcac 2580 cacgcgggat ttctccatca ggtccctggc tgaccggctg ggggacctga gctatctcat 2640 ctatgtgttt cctgaccgcc ccaaggatga ggtcttctcc aagtactaca ctcctgtgct 2700 ggctaaagct gttgatggat atgtgaaacc acagatcaag caagtggtcc ctgagtttgt 2760 gaatgcatct gcagatgctg ggggcagcag cgccacgtac atggaccagg ccccctcccc 2820 agctgtgtgc ccccaggctc cctataacat gtacccacag aaccctgacc atgtactcga 2880 tcaggatgga gaattcgacc tggatgagac catggatgtg gccaggcacg tggaggaact 2940 cttacgccga ccaatggaca gtcttgactc ccgcctctcg ccccctgccg gtcttttcac 3000 ctctgccaga ggctccctct catgaatgtt tgaatcccac gcttctcttt ggaaacaata 3060 tgcaatgtga agcggtcgtg ttgtgagttt agtaaggctg tgtacactga cacctttgca 3120 ggcatgcatg tgcttgtgtg tgtgtgtgtg tgtccttgcg catgagctac gcctgcctcc 
3180 cctgtgccag tcctgggatg tggctgcagc agcggtggcc ggcctctttt cagatcatgg 3240 catccaagag tgcgccgagt ctgtctctgt catggtagag accgagcctc tgtcactgca 3300 ggcactcaat gcagccagac ctattcctcc tgtgcccctc atctgctcag cagctatttg 3360 aatgagatga ttcagaaggg gaggggagac aggtaacgtc tgtaagctga agtttcactc 3420 cggagtgaga agctttgccc tcctaagaga gagagacaga gagacagaga gagagaaaga 3480 gagagtgtgt gggtctatgt aaatgcatct gtcctcatgt gttgatgtaa ccgattcatc 3540 tctcagaagg gaggctgggg ttcattttcg agtagtattt tatactttag tgaacgtgga 3600 ctccagactc tctgtgaacc ctatgagagc gcgtctgggc ccggccatgt ccttagcaca 3660 ggggggccgc cggtttgagt gagggtttct gagctgctct gaattagtcc ttgcttggct 3720 gcttggcctt gggttcattc aagctcacga tgctgttccc acgtttcccg ggatatatat 3780 tctctcccct ccgttgggcc ccagccttct ttgcttgcct ctctgtttgt aaccttgtcg 3840 acaaagaggt agaaaagatt gggtctagga tatggtgggt ggacaggggc cccgggactt 3900 ggagggttgg tcctcttgcc tcctggaaaa aacaaaaaca aaaaactgca gtgaaagaca 3960 agctgcaaat cagccatgtg ctgcgtgcct gtggaatctg gagtgagggg taaaagctga 4020 tctggtttga ctccgctgga ggtggggcct ggagcaggcc ttgcgctgtt gcgtaactgg 4080 ctgtgttctg gtgaggcctt gctcccaacc ccacacgctc ctccctctga ggcgtgagga 4140 ctcgcagtca ggggcagctg accatggaag attgagagcc caaggtttaa acttcttctc 4200 tgaagggagg tggggatgag aagaggggtt tttttgtact ttgtacaaag accacacatt 4260 tgtgtaaaca gtgttttgg 4279 98 3799 DNA Homo sapiens 98 ctggcactgg gtggtaacca gcaagccagc tggcatccgc atccagggtt tgtttcaatg 60 atgtctcgtg gagaatatgg aggggctggt gccaggactg tccttggctt tgcctcgggg 120 tgtgaacggg gtcagtgacc tctaaaacta acctgcctct cagttctgaa tccagacaga 180 atcaatcctc agctgtgtct cgctccacac cccctgccct ggaagccagg gaaggttgga 240 ggtgctaggg ggtcaggctc ccctctgtga cccctgcagc tgttgtggtg actcatgtcc 300 caacctagct gcctctccca aggagacttt cccctgggac aagggggagg gaatggcatg 360 gaggaggccc acatcaagcg gggccaggaa cccacggtgg caggagctgg gctggtgacc 420 tacccagggc agaagggccc gggactcatc cagaggggaa ggaaggggtc ttcaggaaga 480 ccacggagat gccacaggca gaattggctt cccatctggg agataggtgg ggagaccctg 540 gcattttgac agccagaacc tggggtgctg 
agcagaatct tcatgcctgg cctggccgcc 600 ttcggaggga agctggaggg ttgggtgcga gaggagtggg gtcagagccc ctacatccgc 660 aggaccccaa atcggctggg ccccaaggcc cggactgcgc tccccggtgg ccccggcggc 720 cctccgcgaa tgcgtcctgc ccctcccctg cccaagccct ctgccctcac ccgggtccgg 780 cgccgccccc gaagtggcgg gaacaacccg aacccgaacc ttctgtcctc gggagccccc 840 agataagcgg ctgggaaccc gcggggcccg caggggaggc ccggctgttc cgcccgctaa 900 gtgcattagc acagctcacc tcccctatcg cgcctgccat cggacgggca gtgccgcgcc 960 ctgctctggg gcccccggag cgaccacagc ggaggccgga acggactgtc ctttctgggg 1020 cggggtgggg agggggtgtc gctggagggc ccggtggcat agcaacggac gagagaggcc 1080 tggaggaggg gcggggaggg ggagttgtgt ggcagttcta agggaagggt gggtgctggg 1140 acgggtgtcc gggagggagg ggagcctggc ggggtctggg gcctcgtcgc ggagggcgct 1200 gcgaggggga aactggggaa agggcctaat tccccagtct ccacctcgaa tcaggaaaga 1260 gaaggggcgg gctgctgggc aaaagaggtg aatggctgcg gggggctgga gaagagagat 1320 gggaggggcc ggccggcggg ggtgaggggg tctaaagatt gtgggggtga ggaactgagg 1380 gtggggggcg cccagaggcg ggactcgggg cggggcaggc gaggcggagg gcgagggctg 1440 cgggagcaag tacggagccg ggggtgtggg ggacgattgc cgctgcagcc gccgccccac 1500 tcacctccgg tgtgtctgca gcccggacac taagggagat ggatgaatgg gtggggagga 1560 tgcggcgcac atggccccgg gcggctcggc ggtcagctgc cgcccccaca gcggaccggt 1620 cggggcgggg gtcgggcggt agaaaaaagg gccgcgaggc gagcggggca ctgggcggac 1680 cgcggcggca gcatgagcgg cgcagaccgt agccccaatg cgggcgcagc ccctgactcg 1740 gccccgggcc aggcggcggt ggcttcggcc taccagcgct tcgagccgcg cgcctacctc 1800 cgcaacaact acgcgccccc tcgcggggac ctgtgcaacc cgaacggcgt cgggccgtgg 1860 aagctgcgct gcttggcgca gaccttcgcc accggtgagc gggggaaact gaggcacgag 1920 ggacaagagg tcgtcgggga gtgaaagcag gcgcagggaa ataaaaagaa ggaaagggag 1980 acagaccagg cgcctaacag atggggacca agaaacaaga gatagctgag aggtgcaaac 2040 agaagagaaa aaggagcaac atcccttagg agaggggcag aggagagaga ggtggagaga 2100 gggggcggag agtgctcaga attgagagct aaggtggggg atgcaggaca gactgaggtg 2160 gagatgcata ggaggaaatg gaggcagatg tgggacaggg gtgagaaact ccaggatttc 2220 ctcgctgagc ctggctggta ggtatagttg ttttctttct 
ttttctttat tttattttca 2280 tttatttact tatttttatt ttttatttgt tttgagacgg agtttcgctc ttgttgccca 2340 ggctggagta caatggcgcc atctcggctc actgcaacct ccgcctcccc gggttcaagc 2400 gattctcttg cctcagcttc cctagtagct gggattacag gcatgcgccc ccatgcctgg 2460 ctaatttatt tgtattttta gtagagacgg gacttctcca tgttggtcag gctggtctcg 2520 aactcccaac cttaggatcc acccaccccg gcctcccaaa gtgctgggat tacaggtgtg 2580 agccactgcg cccggccagt aggtatagtc ttctagatgt gaaacctgag tctcagagcg 2640 gtgaagttcc cttccgaagg gcagcccatg ttggagctgg gttcagtcta actctggggc 2700 caatgctttt tccagatgga gacacatttg cagaggagaa ggaagaacta gagagaggca 2760 gggagatgca ggggagggaa gggtaaggag gcaggggctg cctgggctgg ctggcaccag 2820 gaccctcttc ctctgccctg cccaggtgaa gtgtccggac gcaccctcat cgacattggt 2880 tcaggcccca ccgtgtacca gctgctcagt gcctgcagcc actttgagga catcaccatg 2940 acagatttcc tggaggtcaa ccgccaggag ctggggcgct ggctgcagga ggagccgggg 3000 gccttcaact ggagcatgta cagccaacat gcctgcctca ttgagggcaa ggggtaagga 3060 ctggggggtg agggttgggg aggaggcttc ccatagagtg gctggttggg gcaacagagg 3120 cctgagcgta gaacagcctt gagccctgcc ttgtgcctcc tgcacaggga atgctggcag 3180 gataaggagc gccagctgcg agccagggtg aaacgggtcc tgcccatcga cgtgcaccag 3240 ccccagcccc tgggtgctgg gagcccagct cccctgcctg ctgacgccct ggtctctgcc 3300 ttctgcttgg aggctgtgag cccagatctt gccagctttc agcgggccct ggaccacatc 3360 accacgctgc tgaggcctgg ggggcacctc ctcctcatcg gggccctgga ggagtcgtgg 3420 tacctggctg gggaggccag gctgacggtg gtgccagtgt ctgaggagga ggtgagggag 3480 gccctggtgc gtagtggcta caaggtccgg gacctccgca cctatatcat gcctgcccac 3540 cttcagacag gcgtagatga tgtcaagggc gtcttcttcg cctgggctca gaaggttggg 3600 ctgtgagggc tgtacctggt gccctgtggc ccccacccac ctggattccc tgttctttga 3660 agtggcacct aataaagaaa taataccctg ccgctgcggt cagtgctgtg tgtggctctc 3720 ctgggaagca gcaagggccc agagatctga gtgtccgggt aggggagaca ttcaccctag 3780 gctttttttc cagaagctt 3799 99 1550 DNA Homo sapiens 99 tgccgccgtc ccgcccgcca gcgccccagc gaggaagcag cgcgcagccc gcggcccagc 60 gcacccgcag cagcgcccgc agctcgtccg cgccatgttc caggcggccg agcgccccca 120 
ggagtgggcc atggagggcc cccgcgacgg gctgaagaag gagcggctac tggacgaccg 180 ccacgacagc ggcctggact ccatgaaaga cgaggagtac gagcagatgg tcaaggagct 240 gcaggagatc cgcctcgagc cgcaggaggt gccgcgcggc tcggagccct ggaagcagca 300 gctcaccgag gacggggact cgttcctgca cttggccatc atccatgaag aaaaggcact 360 gaccatggaa gtgatccgcc aggtgaaggg agacctggct ttcctcaact tccagaacaa 420 cctgcagcag actccactcc acttggctgt gatcaccaac cagccagaaa ttgctgaggc 480 acttctggga gctggctgtg atcctgagct ccgagacttt cgaggaaata cccccctaca 540 ccttgcctgt gagcagggct gcctggccag cgtgggagtc ctgactcagt cctgcaccac 600 cccgcacctc cactccatcc tgaaggctac caactacaat ggccacacgt gtctacactt 660 agcctctatc catggctacc tgggcatcgt ggagcttttg gtgtccttgg gtgctgatgt 720 caatgctcag gagccctgta atggccggac tgcccttcac ctcgcagtgg acctgcaaaa 780 tcctgacctg gtgtcactcc tgttgaagtg tggggctgat gtcaacagag ttacctacca 840 gggctattct ccctaccagc tcacctgggg ccgcccaagc acccggatac agcagcagct 900 gggccagctg acactagaaa accttcagat gctgccagag agtgaggatg aggagagcta 960 tgacacagag tcagagttca cggagttcac agaggacgag ctgccctatg atgactgtgt 1020 gtttggaggc cagcgtctga cgttatgagt gcaaaggggc tgaaagaaca tggacttgta 1080 tatttgtaca aaaaaaaagt tttatttttc taaaaaaaga aaaaagaaga aaaaatttaa 1140 agggtgtact tatatccaca ctgcacactg cctagcccaa aacgtcttat tgtggtagga 1200 tcagccctca ttttgttgct tttgtgaact ttttgtaggg gacgagaaag atcattgaaa 1260 ttctgagaaa acttctttta aacctcacct ttgtggggtt tttggagaag gttatcaaaa 1320 atttcatgga aggaccacat tttatattta ttgtgcttcg agtgactgac cccagtggta 1380 tcctgtgaca tgtaacagcc aggagtgtta agcgttcagt gatgtggggt gaaaagttac 1440 tacctgtcaa ggtttgtgtt accctcctgt aaatggtgta cataatgtat tgttggtaat 1500 tattttggta cttttatgat gtatatttat taaagagatt tttacaaatg 1550 100 4673 DNA Homo sapiens 100 tttgctcctg ctcctccgct cctcctgcgc ggggtgctga aacagcccgg ggaagtagag 60 ccgcctccgg ggagcccaac cagccgaacg ccgccggcgt cagcagcctt gcgcggccac 120 agcatgaccg ctcgcggcct ggcccttggc ctcctcctgc tgctactgtg tccagcgcag 180 gtgttttcac agtcctgtgt ttggtatgga gagtgtggaa ttgcatatgg ggacaagagg 240 tacaattgcg 
aatattctgg cccaccaaaa ccattgccaa aggatggata tgacttagtg 300 caggaactct gtccaggatt cttctttggc aatgtcagtc tctgttgtga tgttcggcag 360 cttcagacac taaaagacaa cctgcagctg cctctacagt ttctgtccag atgtccatcc 420 tgtttttata acctactgaa cctgttttgt gagctgacat gtagccctcg acagagtcag 480 tttttgaatg ttacagctac tgaagattat gttgatcctg ttacaaacca gacgaaaaca 540 aatgtgaaag agttacaata ctacgtcgga cagagttttg ccaatgcaat gtacaatgcc 600 tgccgggatg tggaggcccc ctcaagtaat gacaaggccc tgggactcct gtgtgggaag 660 gacgctgacg cctgtaatgc caccaactgg attgaataca tgttcaataa ggacaatgga 720 caggcacctt ttaccatcac tcctgtgttt tcagattttc cagtccatgg gatggagccc 780 atgaacaatg ccaccaaagg ctgtgacgag tctgtggatg aggtcacagc accatgtagc 840 tgccaagact gctctattgt ctgtggcccc aagccccagc ccccacctcc tcctgctccc 900 tggacgatcc ttggcttgga cgccatgtat gtcatcatgt ggatcaccta catggcgttt 960 ttgcttgtgt tttttggagc attttttgca gtgtggtgct acagaaaacg gtattttgtc 1020 tccgagtaca ctcccatcga tagcaatata gctttttctg ttaatgcaag tgacaaagga 1080 gaggcgtcct gctgtgaccc tgtcagcgca gcatttgagg gctgcttgag gcggctgttc 1140 acacgctggg ggtctttctg cgtccgaaac cctggctgtg tcattttctt ctcgctggtc 1200 ttcattactg cgtgttcgtc aggcctggtg tttgtccggg tcacaaccaa tccagttgac 1260 ctctggtcag cccccagcag ccaggctcgc ctggaaaaag agtactttga ccagcacttt 1320 gggcctttct tccggacgga gcagctcatc atccgggccc ctctcactga caaacacatt 1380 taccagccat acccttcggg agctgatgta ccctttggac ctccgcttga catacagata 1440 ctgcaccagg ttcttgactt acaaatagcc atcgaaaaca ttactgcctc ttatgacaat 1500 gagactgtga cacttcaaga catctgcttg gcccctcttt caccgtataa cacgaactgc 1560 accattttga gtgtgttaaa ttacttccag aacagccatt ccgtgctgga ccacaagaaa 1620 ggggacgact tctttgtgta tgccgattac cacacgcact ttctgtactg cgtacgggct 1680 cctgcctctc tgaatgatac aagtttgctc catgaccctt gtctgggtac gtttggtgga 1740 ccagtgttcc cgtggcttgt gttgggaggc tatgatgatc aaaactacaa taacgccact 1800 gcccttgtga ttaccttccc tgtcaataat tactataatg atacagagaa gctccagagg 1860 gcccaggcct gggaaaaaga gtttattaat tttgtgaaaa actacaagaa tcccaatctg 1920 accatttcct tcactgctga acgaagtatt 
gaagatgaac taaatcgtga aagtgacagt 1980 gatgtcttca ccgttgtaat tagctatgcc atcatgtttc tatatatttc cctagccttg 2040 gggcacatca aaagctgtcg caggcttctg gtggattcga aggtctcact aggcatcgcg 2100 ggcatcttga tcgtgctgag ctcggtggct tgctccttgg gtgtcttcag ctacattggg 2160 ttgcccttga ccctcattgt gattgaagtc atcccgttcc tggtgctggc tgttggagtg 2220 gacaacatct tcattctggt gcaggcctac cagagagatg aacgtcttca aggggaaacc 2280 ctggatcagc agctgggcag ggtcctagga gaagtggctc ccagtatgtt cctgtcatcc 2340 ttttctgaga ctgtagcatt tttcttagga gcattgtccg tgatgccagc cgtgcacacc 2400 ttctctctct ttgcgggatt ggcagtcttc attgactttc ttctgcagat tacctgtttc 2460 gtgagtctct tggggttaga cattaaacgt caagagaaaa atcggctaga catcttttgc 2520 tgtgtcagag gtgctgaaga tggaacaagc gtccaggcct cagagagctg tttgtttcgc 2580 ttcttcaaaa actcctattc tccacttctg ctaaaggact ggatgagacc aattgtgata 2640 gcaatatttg tgggtgttct gtcattcagc atcgcagtcc tgaacaaagt agatattgga 2700 ttggatcagt ctctttcgat gccagatgac tcctacatgg tggattattt caaatccatc 2760 agtcagtacc tgcatgcggg tccgcctgtg tactttgtcc tggaggaagg gcacgactac 2820 acttcttcca aggggcagaa catggtgtgc ggcggcatgg gctgcaacaa tgattccctg 2880 gtgcagcaga tatttaacgc ggcgcagctg gacaactata cccgaatagg cttcgccccc 2940 tcgtcctgga tcgacgatta tttcgactgg gtgaagccac agtcgtcttg ctgtcgagtg 3000 gacaatatca ctgaccagtt ctgcaatgct tcagtggttg accctgcctg cgttcgctgc 3060 aggcctctga ctccggaagg caaacagagg cctcaggggg gagacttcat gagattcctg 3120 cccatgttcc tttcggataa ccctaacccc aagtgtggca aagggggaca tgctgcctat 3180 agttctgcag ttaacatcct ccttggccat ggcaccaggg tcggagccac gtacttcatg 3240 acctaccaca ccgtgctgca gacctctgct gactttattg acgctctgaa gaaagcccga 3300 cttatagcca gtaatgtcac cgaaaccatg ggcattaacg gcagtgccta ccgagtattt 3360 ccttacagtg tgttttatgt cttctacgaa cagtacctga ccatcattga cgacactatc 3420 ttcaacctcg gtgtgtccct gggcgcgata tttctggtga ccatggtcct cctgggctgt 3480 gagctctggt ctgcagtcat catgtgtgcc accatcgcca tggtcttggt caacatgttt 3540 ggagttatgt ggctctgggg catcagtctg aacgctgtat ccttggtcaa cctggtgatg 3600 agctgtggca tctccgtgga gttctgcagc cacataacca 
gagcgttcac ggtgagcatg 3660 aaaggcagcc gcgtggagcg cgcggaagag gcacttgccc acatgggcag ctccgtgttc 3720 agtggaatca cacttacaaa atttggaggg attgtggtgt tggcttttgc caaatctcaa 3780 attttccaga tattctactt caggatgtat ttggccatgg tcttactggg agccactcac 3840 ggattaatat ttctccctgt cttactcagt tacatagggc catcagtaaa taaagccaaa 3900 agttgtgcca ctgaagagcg atacaaagga acagagcgcg aacggcttct aaatttctag 3960 ccctctcgca gggcatcctg actgaactgt gtctaagggt cggtcggttt accactggac 4020 gggtgctgca tcggcaaggc caagttgaac accggatggt gccaaccatc ggttgtttgg 4080 cagcagcttt gaacgtagcg cctgtgaact caggaatgca cagttgactt gggaagcagt 4140 attactagat ctggaggcaa ccacaggaca ctaaacttct cccagcctct tcaggaaaga 4200 aacctcattc tttggcaagc aggaggtgac actagatggc tgtgaatgtg atccgctcac 4260 tgacactctg taaaggccaa tcaatgcact gtctgtcctc tcctttttag gagtaagcca 4320 tcccacaagt tctataccat atttttagtg acagttgagg ttgtagatac actttataac 4380 attttatagt ttaaagagct ttattaatgc aataaattaa ctttgtacac atttttatat 4440 aaaaaaacag caagtgattt cagaatgttg taggcctcat tagagcttgg tctccaaaaa 4500 tctgtttgaa aaaagcaaca tgttcttcac agtgttcccc tagaaaggaa gagatttaat 4560 tgccagttag atgtggcatg aaatgaggga caaagaaagc atctcgtagg tgtgtctact 4620 gggttttaac ttatttttct ttaataaaat acattgtttt cctaaaaaaa aaa 4673 101 1362 DNA Homo sapiens 101 catttgggga cgctctcagc tctcggcgca cggcccagct tccttcaaaa tgtctactgt 60 tcacgaaatc ctgtgcaagc tcagcttgga gggtgatcac tctacacccc caagtgcata 120 tgggtctgtc aaagcctata ctaactttga tgctgagcgg gatgctttga acattgaaac 180 agccatcaag accaaaggtg tggatgaggt caccattgtc aacattttga ccaaccgcag 240 caatgcacag agacaggata ttgccttcgc ctaccagaga aggaccaaaa aggaacttgc 300 atcagcactg aagtcagcct tatctggcca cctggagacg gtgattttgg gcctattgaa 360 gacacctgct cagtatgacg cttctgagct aaaagcttcc atgaaggggc tgggaaccga 420 cgaggactct ctcattgaga tcatctgctc cagaaccaac caggagctgc aggaaattaa 480 cagagtctac aaggaaatgt acaagactga tctggagaag gacattattt cggacacatc 540 tggtgacttc cgcaagctga tggttgccct ggcaaagggt agaagagcag aggatggctc 600 tgtcattgat tatgaactga ttgaccaaga tgctcgggat 
ctctatgacg ctggagtgaa 660 gaggaaagga actgatgttc ccaagtggat cagcatcatg accgagcgga gcgtgcccca 720 cctccagaaa gtatttgata ggtacaagag ttacagccct tatgacatgt tggaaagcat 780 caggaaagag gttaaaggag acctggaaaa tgctttcctg aacctggttc agtgcattca 840 gaacaagccc ctgtattttg ctgatcggct gtatgactcc atgaagggca aggggacgcg 900 agataaggtc ctgatcagaa tcatggtctc ccgcagtgaa gtggacatgt tgaaaattag 960 gtctgaattc aagagaaagt acggcaagtc cctgtactat tatatccagc aagacactaa 1020 gggcgactac cagaaagcgc tgctgtacct gtgtggtgga gatgactgaa gcccgacacg 1080 gcctgagcgt ccagaaatgg tgctcaccat gcttccagct aacaggtcta gaaaaccagc 1140 ttgcgaataa cagtccccgt ggccatccct gtgagggtga cgttagcatt acccccaacc 1200 tcattttagt tgcctaagca ttgcctggcc ttcctgtcta gtctctcctg taagccaaag 1260 aaatgaacat tccaaggagt tggaagtgaa gtctatgatg tgaaacactt tgcctcctgt 1320 gtactgtgtc ataaacagat gaataaactg aatttgtact tt 1362 102 2591 DNA Homo sapiens 102 cccggacgtg cggctcccct cggcctcctc gccatggacg cggacgactc ccgggccccc 60 aagggctcct tgcggaagtt cctggagcac ctctccgggg ccggcaaggc catcggcgtg 120 ctgaccagcg gcggggatgc tcaaggtatg aacgctgccg tccgtgccgt ggtgcgcatg 180 ggtatctacg tgggggccaa ggtgtacttc atctacgagg gctaccaggg catggtggac 240 ggaggctcaa acatcgcaga ggccgactgg gagagtgtct ccagcatcct gcaagtgggc 300 gggacgatca ttggcagtgc gcggtgccag gccttccgca cgcgggaagg ccgcctgaag 360 gctgcttgca acctgctgca gcgcggcatc accaacctgt gtgtgatcgg cggggacggg 420 agcctcaccg gggccaacct cttccggaag gagtggagtg ggctgctgga ggagctggcc 480 aggaacggcc agatcgataa ggaggccgtg cagaagtacg cctacctcaa cgtggtgggc 540 atggtgggct ccatcgacaa tgatttctgc ggcaccgaca tgaccatcgg cacggactcc 600 gccctgcaca ggatcatcga ggtcgtcgac gccatcatga ccacggccca gagccaccag 660 aggaccttcg ttctggaggt gatgggacga cactgtgggt acctggccct ggtgagtgcc 720 ttggcctgcg gtgcggactg ggtgttcctt ccagaatctc caccagagga aggctgggag 780 gagcagatgt gtgtcaaact ctcggagaac cgtgcccgga aaaaaaggct gaatattatt 840 attgtggctg aaggagcaat tgatacccaa aataaaccca tcacctctga gaaaatcaaa 900 gagcttgtcg tcacgcagct gggctatgac acacgtgtga ccatcctcgg gcacgtgcag 
960 agaggaggga ccccttcggc attcgacagg atcttggcca gccgcatggg agtggaggca 1020 gtcatcgcct tgctagaggc caccccggac accccagctt gcgtcgtgtc actgaacggg 1080 aaccacgccg tgcgcctgcc gctgatggag tgcgtgcaga tgactcagga tgtgcagaag 1140 gcgatggacg agaggagatt tcaagatgcg gttcgactcc gagggaggag ctttgcgggc 1200 aacctgaaca cctacaagcg acttgccatc aagctgccgg atgatcagat cccaaagacc 1260 aattgcaacg tagctgtcat caacgtgggg gcacccgcgg ctgggatgaa cgcggccgta 1320 cgctcagctg tgcgcgtggg cattgccgac ggccacagga tgctcgccat ctatgatggc 1380 tttgacggct tcgccaaggg ccagatcaaa gaaatcggct ggacagatgt cgggggctgg 1440 accggccaag gaggctccat tcttgggaca aaacgcgttc tcccggggaa gtacttggaa 1500 gagatcgcca cacagatgcg cacgcacagc atcaacgcgc tgctgatcat cggtggattc 1560 gaggcctacc tgggactcct ggagctgtca gccgcccggg agaagcacga ggagttctgt 1620 gtccccatgg tcatggttcc cgctactgtg tccaacaatg tgccgggttc cgatttcagc 1680 atcggggcag acaccgccct gaacactatc accgacacct gcgaccgcat caagcagtcc 1740 gccagcggaa ccaagcggcg cgtgttcatc atcgagacca tgggcggcta ctgtggctac 1800 ctggccaaca tgggggggct cgcggccgga gctgatgccg catacatttt cgaagagccc 1860 ttcgacatca gggatctgca gtccaacgtg gagcacctga cggagaaaat gaagaccacc 1920 atccagagag gccttgtgct cagaaatgag agctgcagtg aaaactacac caccgacttc 1980 atttaccagc tgtattcaga agagggcaaa ggcgtgtttg actgcaggaa gaacgtgctg 2040 ggtcacatgc agcagggtgg ggcaccctct ccatttgata gaaactttgg aaccaaaatc 2100 tctgccagag ctatggagtg gatcactgca aaactcaagg aggcccgggg cagaggaaaa 2160 aaatttacca ccgatgattc catttgtgtg ctgggaataa gcaaaagaaa cgttattttt 2220 caacctgtgg cagagctgaa gaagcaaacg gattttgagc acaggattcc caaagaacag 2280 tggtggctca agctacggcc cctcatgaaa atcctggcca agtacaaggc cagctatgac 2340 gtgtcggact caggccagct ggaacatgtg cagccctgga gtgtctgacc cagtcccgcc 2400 tgcatgtgcc tgcagccacc gtggactgtc tgtttttgta acacttaagt tattttatca 2460 gcactttatg cacgtattat tgacattaat acctaatcgg cgagtgccca tctgccccac 2520 cagctccagt gcgtgctgtc tgtggagtgt gtctcatgct ttcagatgtg catatgagca 2580 gaattaatta a 2591 103 865 DNA Homo sapiens 103 gaattccgga gttccgggcg 
cgcgcgacgt cagtttgagt tctgtgttct ccccgcccgt 60 gtcccgcccg acccgcgccc gcgatgctgg cgctgcgctg cggctcccgc tggctcggcc 120 tgctctccgt cccgcgctcc gtgccgctgc gcctccccgc ggcccgcgcc tgcagcaagg 180 gctccggcga cccgtcctct tcctcctcct ccgggaaccc gctcgtgtac ctggacgtgg 240 acgccaacgg gaagccgctc ggccgcgtgg tgctggagct gaaggcagat gtcgtcccaa 300 agacagctga gaacttcaga gccctgtgca ctggtgagaa gggcttcggc tacaaaggct 360 ccaccttcca cagggtgatc ccttccttca tgtgccaggc gggcgacttc accaaccaca 420 atggcacagg cgggaagtcc atctacggaa gccgctttcc tgacgagaac tttacactga 480 agcacgtggg gccaggtgtc ctgtccatgg ctaatgctgg tcctaacacc aacggctccc 540 agttcttcat ctgcaccata aagacagact ggttggatgg caagcatgtt gtgttcggtc 600 acgtcaaaga gggcatggac gtcgtgaaga aaatagaatc tttcggctct aagagtggga 660 ggacatccaa gaagattgtc atcacagact gtggccagtt gagctaatct gtggccaggg 720 tgctggcatg gtggcagctg caaatgtcca tgcacccagg tggccgcgtt gggctgtcag 780 ccaaggtgcc tgaaacgata cgtgtgccca ctccactgtc acagtgtgcc tgaggaaggc 840 tgctagggat gttagacgga attcc 865 104 661 DNA Homo sapiens 104 tcaaactgaa gctcgcactc tcgcctccag catgaaagtc tctgccgccc ttctgtgcct 60 gctgctcata gcagccacct tcattcccca agggctcgct cagccagatg caatcaatgc 120 cccagtcacc tgctgctata acttcaccaa taggaagatc tcagtgcaga ggctcgcgag 180 ctatagaaga atcaccagca gcaagtgtcc caaagaagct gtgatcttca agaccattgt 240 ggccaaggag atctgtgctg accccaagca gaagtgggtt caggattcca tggaccacct 300 ggacaagcaa acccaaactc cgaagacttg aacactcact ccacaaccca agaatctgca 360 gctaacttat tttcccctag ctttccccag acatcctgtt ttattttatt ataatgaatt 420 ttgtttgttg atgtgaaaca ttatgcctta agtaatgtta attcttattt aagttattga 480 tgttttaagt ttatctttca tggtactagt gttttttaga tacagagact tggggaaatt 540 gcttttcctc ttgaaccaca gttctacccc tgggatgttt tgagggtctt tgcaagaatc 600 atttttttaa cattccaatg catttaatac aaagaattgc taaaatatta ttgtggaaat 660 g 661 105 420 DNA Homo sapiens 105 gggggctggc cgagcgccgt gcgcgcttgg gagaaggccg gaagcttacc agccgagaag 60 gaattcctag ctagcttcag agccggtgcc tccggagcca gcgtggtggc catagacaac 120 aagttcgaac aggccatgga tctggtgaag 
aatcatctga tgtatgctgt gagagaggag 180 gtggagatcc tgaaggagca gatccgagag ctggtggaga agaactccca gctagagcgt 240 gagaacaccc tgttgaagac cctggcaagc ccagagcagc tggagaagtt ccagtcctgt 300 ctgagccctg aagagccagc tcccgaatcc ccacaagtgc ccgaggcccc tggtggttct 360 gcggtgtaag tcgctctgtc ctcagggtgg gcagagccac taaacttgtt ttacctaggg 420 106 926 DNA Homo sapiens 106 gaatctcttt ctctcccttc agaatcttat cttggctttg gatcttagaa gagaatcact 60 aaccagagac gagactcagt gagtgagcag gtgttttgga caatggactg gttgagccca 120 tccctattat aaaaatgtct cagagcaacc gggagctggt ggttgacttt ctctcctaca 180 agctttccca gaaaggatac agctggagtc agtttagtga tgtggaagag aacaggactg 240 aggccccaga agggactgaa tcggagatgg agacccccag tgccatcaat ggcaacccat 300 cctggcacct ggcagacagc cccgcggtga atggagccac tgcgcacagc agcagtttgg 360 atgcccggga ggtgatcccc atggcagcag taaagcaagc gctgagggag gcaggcgacg 420 agtttgaact gcggtaccgg cgggcattca gtgacctgac atcccagctc cacatcaccc 480 cagggacagc atatcagagc tttgaacagg tagtgaatga actcttccgg gatggggtaa 540 actggggtcg cattgtggcc tttttctcct tcggcggggc actgtgcgtg gaaagcgtag 600 acaaggagat gcaggtattg gtgagtcgga tcgcagcttg gatggccact tacctgaatg 660 accacctaga gccttggatc caggagaacg gcggctggga tacttttgtg gaactctatg 720 ggaacaatgc agcagccgag agccgaaagg gccaggaacg cttcaaccgc tggttcctga 780 cgggcatgac tgtggccggc gtggttctgc tgggctcact cttcagtcgg aaatgaccag 840 acactgacca tccactctac cctcccaccc ccttctctgc tccaccacat cctccgtcca 900 gccgccattg ccaccaggag aacccg 926 107 1293 DNA Homo sapiens 107 cacgtcagcc ggggctagaa aaggcggcgg ggctgggccc agcgaggtga cagcctcgct 60 tggacgcaga gcccggcccg acgccgccat gacggccgcg ctcttcagcc tggacggccc 120 ggccggcggc gcgccctggc ctgcggagcc tgcgcccttc tacgaaccgg gccgggcggg 180 caagccgggc cgcggggccg agccaggggc cctaggcgag ccaggcgccg ccgcccccgc 240 catgtacgac gacgagagcg ccatcgactt cagcgcctac atcgactcca tggccgccgt 300 gcccaccctg gagctgtgcc acgacgagct cttcgccgac ctcttcaaca gcaatcacaa 360 ggcgggcggc gcggggcccc tggagcttct tcccggcggc cccgcgcgcc ccttgggccc 420 gggccctgcc gctccccgcc tgctcaagcg cgagcccgac 
tggggcgacg gcgacgcgcc 480 cggctcgctg ttgcccgcgc aggtgggccc gtgcgcacag accgtggtga gcttggcggc 540 cgcagggcag cccaccccgc ccacgtcgcc ggagccgccg cgcagcagcc ccaggcagac 600 ccccgcgccc ggccccgccc gggagaagag cgccggcaag aggggcccgg accgcggcag 660 ccccgagtac cggcagcggc gcgagcgcaa caacatcgcc gtgcgcaaga gccgcgacaa 720 ggccaagcgg cgcaaccagg agatgcagca gaagttggtg gagctgtcgg ctgagaacga 780 gaagctgcac cagcgcgtgg agcagctcac gcgggacctg gccggcctcc ggcagttctt 840 caagcagctg cccagcccgc ccttcctgcc ggccgccggg acagcagact gccggtaacg 900 cgcggccggg gcgggagaga ctcagcaacg acccatacct cagacccgac ggcccggagc 960 ggacgccctg ctgccgacgc cagagccgcc gcgtgcccgc tgcagtttct tggacataga 1020 ccaaagaagc tacagcctgg acttaccacc actaaactgc gagagaagct aaacgtgttt 1080 attttccctt aaattatttt tgtaatggta gctttttcta catcttactc ctgttgatgc 1140 agctaaggta catttgtaaa aagaaaaaaa accagacttt tcagacaaac cctttgtatt 1200 gtagataaga ggaaaagact gagcatgctc acttttttat attaattttt aggacagtat 1260 ttgtaagaat aaagcagcat ttgaaatgcc cct 1293 108 2529 DNA Homo sapiens 108 ccagcaaaac ctgtttagac acatggacaa gaatcccagc gctacaaggc acacagtccg 60 cttcttcgtc ctcagggttg ccagcgcttc ctggaagtcc tgaagctctc gcagtgcagt 120 gagttcatgc accttcttgc caagcctcag tctttgggat ctggggaggc cgcctggttt 180 tcctccctcc ttctgcacgt ctgctggggt ctcttcctct ccaggccttg ccgtccccct 240 ggcctctctt cccagctcac acatgaagat gcacttgcaa agggctctgg tggtcctggc 300 cctgctgaac tttgccacgg tcagcctctc tctgtccact tgcaccacct tggacttcgg 360 ccacatcaag aagaagaggg tggaagccat taggggacag atcttgagca agctcaggct 420 caccagcccc cctgagccaa cggtgatgac ccacgtcccc tatcaggtcc tggcccttta 480 caacagcacc cgggagctgc tggaggagat gcatggggag agggaggaag gctgcaccca 540 ggaaaacacc gagtcggaat actatgccaa agaaatccat aaattcgaca tgatccaggg 600 gctggcggag cacaacgaac tggctgtctg ccctaaagga attacctcca aggttttccg 660 cttcaatgtg tcctcagtgg agaaaaatag aaccaaccta ttccgagcag aattccgggt 720 cttgcgggtg cccaacccca gctctaagcg gaatgagcag aggatcgagc tcttccagat 780 ccttcggcca gatgagcaca ttgccaaaca gcgctatatc ggtggcaaga atctgcccac 840 
acggggcact gccgagtggc tgtcctttga tgtcactgac actgtgcgtg agtggctgtt 900 gagaagagag tccaacttag gtctagaaat cagcattcac tgtccatgtc acacctttca 960 gcccaatgga gatatcctgg aaaacattca cgaggtgatg gaaatcaaat tcaaaggcgt 1020 ggacaatgag gatgaccatg gccgtggaga tctggggcgc ctcaagaagc agaaggatca 1080 ccacaaccct catctaatcc tcatgatgat tcccccacac cggctcgaca acccgggcca 1140 ggggggtcag aggaagaagc gggctttgga caccaattac tgcttccgca acttggagga 1200 gaactgctgt gtgcgccccc tctacattga cttccgacag gatctgggct ggaagtgggt 1260 ccatgaacct aagggctact atgccaactt ctgctcaggc ccttgcccat acctccgcag 1320 tgcagacaca acccacagca cggtgctggg actgtacaac actctgaacc ctgaagcatc 1380 tgcctcgcct tgctgcgtgc cccaggacct ggagcccctg accatcctgt actatgttgg 1440 gaggaccccc aaagtggagc agctctccaa catggtggtg aagtcttgta aatgtagctg 1500 agaccccacg tgcgacagag agaggggaga gagaaccacc actgcctgac tgcccgctcc 1560 tcgggaaaca cacaagcaac aaacctcact gagaggcctg gagcccacaa ccttcggctc 1620 cgggcaaatg gctgagatgg aggtttcctt ttggaacatt tctttcttgc tggctctgag 1680 aatcacggtg gtaaagaaag tgtgggtttg gttagaggaa ggctgaactc ttcagaacac 1740 acagactttc tgtgacgcag acagagggga tggggataga ggaaagggat ggtaagttga 1800 gatgttgtgt ggcaatggga tttgggctac cctaaaggga gaaggaaggg cagagaatgg 1860 ctgggtcagg gccagactgg aagacacttc agatctgagg ttggatttgc tcattgctgt 1920 accacatctg ctctagggaa tctggattat gttatacaag gcaagcattt tttttttttt 1980 ttaaagacag gttacgaaga caaagtccca gaattgtatc tcatactgtc tgggattaag 2040 ggcaaatcta ttacttttgc aaactgtcct ctacatcaat taacatcgtg ggtcactaca 2100 gggagaaaat ccaggtcatg cagttcctgg cccatcaact gtattgggcc ttttggatat 2160 gctgaacgca gaagaaaggg tggaaatcaa ccctctcctg tctgcctctg ggtccctcct 2220 ctcacctctc cctcgatcat atttcccctt ggacacttgg ttagacgcct tccaggtcag 2280 gatgcacatt tctggattgt ggttccatgc agggttgggg cattatgggt tcttccccca 2340 cttcccctcc aagaccctgt gttcatttgg tgttcctgga agcaggtgcg acaacatgtg 2400 aggcattcgg ggaagctcga catgtgccac acagtgactt ggccccagac gcatagactg 2460 aggtataaag acaagtatga atattactct caaaatcttt gtataaataa atatttttgg 2520 ggcatcctg 2529

Claims (83)

What is claimed is:
1. A method to identify agonist ligands of progesterone receptors, comprising:
a. contacting a progesterone receptor with a putative agonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative agonist ligand, said progesterone receptor is not activated;
b. detecting expression of at least one gene that is regulated by said progesterone receptor when said progesterone receptor is activated, said at least one gene being selected from the group consisting of:
i. at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1;
ii. at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2;
iii. at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3;
iv. at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4;
v. at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5;
vi. at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and,
vii. at least one gene that is regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7; and,
c. comparing the expression of said at least one gene in the presence and in the absence of said putative agonist ligand, wherein detection of regulation of the expression of said at least one gene in the manner associated with activation of said progesterone receptor as set forth in (b) indicates that said putative agonist ligand is a progesterone receptor agonist.
2. The method of claim 1, wherein said progesterone receptor is PR-A.
3. The method of claim 1, wherein said progesterone receptor is PR-B.
4. The method of claim 1, wherein said progesterone receptor comprises both PR-A and PR-B.
5. The method of claim 1, wherein detection of upregulation of expression of at least one gene chosen from a gene in Table 1, or detection of downregulation of at least one gene chosen from a gene in Table 2, in the presence of said putative agonist ligand, indicates that said putative agonist ligand is a selective agonist of PR-A.
6. The method of claim 1, wherein detection of upregulation of expression of at least one gene chosen from a gene in Table 3, or detection of downregulation of at least one gene chosen from a gene in Table 4, in the presence of said putative agonist ligand, indicates that said putative agonist ligand is a selective agonist of PR-B.
7. The method of claim 1, wherein said step (b) of detecting comprises detecting expression of at least five genes from any one or more of said Tables 1-7.
8. The method of claim 1, wherein said step (b) of detecting comprises detecting expression of at least ten genes from any one or more of said Tables 1-7.
9. The method of claim 1, wherein said step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of said Tables 1-7.
10. The method of claim 1, further comprising a step of detecting expression of at least one gene chosen from the genes in Table 8.
11. The method of claim 1, wherein said progesterone receptor is expressed by a cell.
12. The method of claim 11, wherein said progesterone receptor is endogenously expressed by said cell.
13. The method of claim 11, wherein said progesterone receptor is recombinantly expressed by said cell.
14. The method of claim 11, wherein said cell is part of a tissue from a test animal.
15. The method of claim 14, wherein said step of contacting is performed by administration of said putative agonist ligand to said test animal or to said tissue of said test animal.
16. The method of claim 1, wherein expression of said at least one gene is detected by measuring amounts of transcripts of said at least one gene before and after contact of said progesterone receptor with said putative agonist ligand.
17. The method of claim 1, wherein expression of said at least one gene is detected by detecting hybridization of at least a portion of said at least one gene or a transcript thereof to a nucleic acid molecule comprising a portion of said at least one gene or a transcript thereof in a nucleic acid array.
18. The method of claim 1, wherein expression of said at least one gene is detected by measuring expression of a reporter gene that is operatively linked to at least the regulatory region of said at least one gene.
19. The method of claim 1, wherein expression of said at least one gene is detected by detecting the production of a protein encoded by said at least one gene.
20. The method of claim 1, wherein said putative agonist ligand is a product of rational drug design.
21. The method of claim 1, comprising, in step (b), detecting expression of: 11-beta-hydroxysteroid dehydrogenase type 2, tissue factor gene, PCI gene (plasminogen activator inhibitor 3), MAD-3 IκB-alpha, Niemann-Pick C disease (NPC1), platelet-type phosphofructokinase, phenylethanolamine N-methyltransferase (PNMT), transforming growth factor-beta 3 (TGF-beta3), Monocyte Chemotactic Protein 1, delta sleep inducing peptide (related to TSC-22), and estrogen receptor-related protein (hERRa1).
22. The method of claim 1, comprising, in step (b), detecting expression of: growth arrest-specific protein (gas6), tissue factor gene, NF-IL6-beta (C/EBPbeta), PCI gene (plasminogen activator inhibitor), Stat5A, calcium-binding protein S100P, MSX-2, lipocortin II (calpactin I), selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
23. The method of claim 1, comprising, in step (b), detecting expression of phenylethanolamine n-methyltransferase (PNMT) adrenal medulla.
24. The method of claim 1, comprising, in step (b), detecting expression of proteasome-like subunit MECL-1.
25. The method of claim 1, comprising, in step (b), detecting expression of: growth arrest-specific protein and tissue factor gene.
26. A method to identify antagonists of progesterone receptors, comprising:
a. contacting a progesterone receptor with a putative antagonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated;
b. detecting expression of at least one gene that is regulated by said progesterone receptor when said progesterone receptor is activated, said at least one gene being selected from the group consisting of:
i. at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1;
ii. at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2;
iii. at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3;
iv. at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4;
v. at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5;
vi. at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and,
vii. at least one gene that is regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7; and,
c. comparing the expression of said at least one gene in the presence and in the absence of said putative antagonist ligand, wherein detection of inhibition or reversal of the regulation of expression of said at least one gene as compared to the regulation of expression of said at least one gene in the manner associated with activation of said progesterone receptor as set forth in (b), indicates that said putative antagonist ligand is a progesterone receptor antagonist.
27. The method of claim 26, wherein said progesterone receptor is PR-A.
28. The method of claim 26, wherein said progesterone receptor is PR-B.
29. The method of claim 26, wherein said progesterone receptor comprises both PR-A and PR-B.
30. The method of claim 26, wherein said progesterone receptor is activated by contacting said receptor with a compound that activates said receptor, said step of contacting being performed prior to, simultaneously with, or after said step of contacting of (a).
31. The method of claim 26, wherein detection of inhibition of expression or downregulation of expression of at least one gene chosen from a gene in Table 1 in the presence of said putative antagonist ligand as compared to the expression of said at least one gene in the presence of said compound that activates said progesterone receptor, or detection of inhibition of expression or upregulation of expression of at least one gene chosen from a gene in Table 2 in the presence of said putative antagonist ligand as compared to the expression of said at least one gene in the presence of said compound that activates said progesterone receptor, indicates that said putative antagonist ligand is a selective antagonist of PR-A.
32. The method of claim 26, wherein detection of inhibition of expression or downregulation of expression of at least one gene chosen from a gene in Table 3 in the presence of said putative antagonist ligand as compared to the expression of said at least one gene in the presence of said compound that activates said progesterone receptor, or detection of inhibition of expression or upregulation of expression of at least one gene chosen from a gene in Table 4, in the presence of said putative antagonist ligand as compared to the expression of said at least one gene in the presence of said compound that activates said progesterone receptor, indicates that said putative antagonist ligand is a selective antagonist of PR-B.
33. The method of claim 26, wherein said step (b) of detecting comprises detecting expression of at least five genes from any one or more of said Tables 1-7.
34. The method of claim 26, wherein said step (b) of detecting comprises detecting expression of at least ten genes from any one or more of said Tables 1-7.
35. The method of claim 26, wherein said step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of said Tables 1-7.
36. The method of claim 26, further comprising a step of detecting expression of at least one gene chosen from the genes in Table 8.
37. The method of claim 26, wherein said progesterone receptor is expressed by a cell.
38. The method of claim 37, wherein said progesterone receptor is endogenously expressed by said cell.
39. The method of claim 37, wherein said progesterone receptor is recombinantly expressed by said cell.
40. The method of claim 37, wherein said cell is part of a tissue from a test animal.
41. The method of claim 40, wherein said step of contacting is performed by administration of said putative antagonist ligand to said test animal.
42. The method of claim 26, wherein expression of said at least one gene is detected by measuring amounts of transcripts of said at least one gene before and after contact of said progesterone receptor with said putative antagonist ligand.
43. The method of claim 26, wherein expression of said at least one gene is detected by detecting hybridization of at least a portion of said at least one gene or a transcript thereof to a nucleic acid molecule comprising a portion of said at least one gene or a transcript thereof in a nucleic acid array.
44. The method of claim 26, wherein expression of said at least one gene is detected by measuring expression of a reporter gene that is operatively linked to at least the regulatory region of said at least one gene.
45. The method of claim 26, wherein expression of said at least one gene is detected by detecting the production of a protein encoded by said at least one gene.
46. The method of claim 26, wherein said putative antagonist ligand is a product of rational drug design.
47. A method to identify isoform-specific agonists of progesterone receptors, comprising:
a. contacting a progesterone receptor with a putative agonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein in the absence of said putative agonist ligand, said progesterone receptor is not activated;
b. detecting expression of at least one gene that is regulated by said progesterone receptor when said progesterone receptor is activated, said at least one gene being selected from the group consisting of:
i. at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and,
ii. at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4; and,
c. comparing the expression of said at least one gene in the presence and in the absence of said putative agonist ligand, wherein detection of regulation of the expression of said at least one gene in the manner associated with activation of said progesterone receptor as set forth in (b)(i) but not (b)(ii), indicates that said putative agonist ligand is a PR-A-specific agonist, and wherein detection of regulation of the expression of said at least one gene in the manner associated with activation of said progesterone receptor as set forth in (b)(ii) but not (b)(i), indicates that said putative agonist ligand is a PR-B-specific agonist.
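The comparison in step (c) of claim 47 can be read as a small decision rule. The Python sketch below is an illustration only and is not part of the claims. The gene identifiers are placeholders rather than entries from Tables 1-4, and the log2 fold-change representation and the `cutoff` value are assumptions, since the claims state no numerical threshold.

```python
from typing import Dict, Set


def signature_detected(log2_fc: Dict[str, float], up: Set[str],
                       down: Set[str], cutoff: float = 1.0) -> bool:
    """Return True if at least one gene moves in the direction its table predicts.

    log2_fc maps gene -> log2(expression with ligand / expression without ligand).
    Genes that were not measured are treated as unchanged (0.0).
    """
    up_hit = any(log2_fc.get(gene, 0.0) >= cutoff for gene in up)
    down_hit = any(log2_fc.get(gene, 0.0) <= -cutoff for gene in down)
    return up_hit or down_hit


def classify_agonist(log2_fc: Dict[str, float],
                     pr_a_up: Set[str], pr_a_down: Set[str],
                     pr_b_up: Set[str], pr_b_down: Set[str],
                     cutoff: float = 1.0) -> str:
    """Apply the (b)(i) versus (b)(ii) comparison of step (c)."""
    a_hit = signature_detected(log2_fc, pr_a_up, pr_a_down, cutoff)
    b_hit = signature_detected(log2_fc, pr_b_up, pr_b_down, cutoff)
    if a_hit and not b_hit:
        return "PR-A-specific agonist"
    if b_hit and not a_hit:
        return "PR-B-specific agonist"
    if a_hit and b_hit:
        return "agonist, not isoform-specific"
    return "no agonist activity detected"


# Placeholder gene names (hypothetical, not taken from Tables 1-4).
result = classify_agonist(
    {"A_UP_1": 2.3, "B_DOWN_1": -0.1},
    pr_a_up={"A_UP_1"}, pr_a_down={"A_DOWN_1"},
    pr_b_up={"B_UP_1"}, pr_b_down={"B_DOWN_1"},
)
print(result)
```

In this sketch a single gene moving in the expected direction is enough to count as a signature, which mirrors the "at least one gene" wording; a practical screen would likely require several genes, as in claims 49-51.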
48. The method of claim 47, wherein said progesterone receptor comprises both PR-A and PR-B.
49. The method of claim 47, wherein said step (b) of detecting comprises detecting expression of at least five genes from any one or more of said Tables 1-4.
50. The method of claim 47, wherein said step (b) of detecting comprises detecting expression of at least ten genes from any one or more of said Tables 1-4.
51. The method of claim 47, wherein said step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of said Tables 1-4.
52. A method to identify isoform-specific antagonists of progesterone receptors, comprising:
a. contacting a progesterone receptor with a putative antagonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated;
b. detecting expression of at least one gene that is regulated by said progesterone receptor when said progesterone receptor is activated, said at least one gene being selected from the group consisting of:
i. at least one gene that is exclusively upregulated or downregulated by PR-A, chosen from a Table selected from the group consisting of Table 1 and Table 2; and,
ii. at least one gene that is exclusively upregulated or downregulated by PR-B chosen from a Table selected from the group consisting of Table 3 and Table 4; and,
c. comparing the expression of said at least one gene in the presence and in the absence of said putative antagonist ligand, wherein, in the presence of said putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of said at least one gene as compared to the regulation of expression of said at least one gene in the manner associated with activation of said progesterone receptor as set forth in (b)(i) but not (b)(ii), indicates that said putative antagonist ligand is a PR-A-specific antagonist, and wherein, in the presence of said putative antagonist ligand, detection of inhibition or reversal of the regulation of expression of said at least one gene as compared to the regulation of the expression of said at least one gene in the manner associated with activation of said progesterone receptor as set forth in (b)(ii) but not (b)(i), indicates that said putative antagonist ligand is a PR-B-specific antagonist.
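Step (c) of claim 52 turns on "inhibition or reversal" of regulation. One possible way to make that comparison concrete is sketched below in Python. It is illustrative only: the `min_reduction` fraction, the log2 fold-change inputs, and the gene names are assumptions not found in the claims.

```python
from typing import Dict, Set


def antagonist_effect(activated: float, treated: float,
                      min_reduction: float = 0.5) -> str:
    """Compare one gene's regulation with and without the candidate antagonist.

    activated: log2 fold change with the activating compound alone.
    treated:   log2 fold change with the activating compound plus the candidate.
    """
    if activated == 0:
        return "unchanged"  # the gene was not regulated to begin with
    ratio = treated / activated
    if ratio < 0:
        return "reversed"   # regulation now runs in the opposite direction
    if ratio <= 1.0 - min_reduction:
        return "inhibited"  # same direction, but substantially weaker
    return "unchanged"


def classify_antagonist(activated: Dict[str, float], treated: Dict[str, float],
                        pr_a_genes: Set[str], pr_b_genes: Set[str]) -> str:
    """Label the candidate by which isoform's gene set it blocks."""
    def blocked(genes: Set[str]) -> bool:
        return any(
            antagonist_effect(activated[g], treated[g]) != "unchanged"
            for g in genes if g in activated and g in treated
        )

    a_blocked, b_blocked = blocked(pr_a_genes), blocked(pr_b_genes)
    if a_blocked and not b_blocked:
        return "PR-A-specific antagonist"
    if b_blocked and not a_blocked:
        return "PR-B-specific antagonist"
    if a_blocked and b_blocked:
        return "antagonist, not isoform-specific"
    return "no antagonist activity detected"


# Placeholder genes (hypothetical): the PR-A gene loses most of its induction,
# while the PR-B gene stays repressed.
label = classify_antagonist(
    activated={"A_UP_1": 2.0, "B_DOWN_1": -2.0},
    treated={"A_UP_1": 0.2, "B_DOWN_1": -1.9},
    pr_a_genes={"A_UP_1"}, pr_b_genes={"B_DOWN_1"},
)
print(label)
```

Taking the ratio of the two fold changes lets the same test cover both upregulated and downregulated genes without separate branches.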
53. The method of claim 52, wherein said progesterone receptor comprises both PR-A and PR-B.
54. The method of claim 52, wherein said step (b) of detecting comprises detecting expression of at least five genes from any one or more of said Tables 1-4.
55. The method of claim 52, wherein said step (b) of detecting comprises detecting expression of at least ten genes from any one or more of said Tables 1-4.
56. The method of claim 52, wherein said step (b) of detecting comprises detecting expression of at least 15 genes from any one or more of said Tables 1-4.
57. A method to identify a tissue-specific agonist of a progesterone receptor, comprising:
a. providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when said progesterone receptor is activated, wherein said at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7;
b. contacting a progesterone receptor expressed by a first tissue type with a putative agonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative agonist ligand, said progesterone receptor is not activated;
c. contacting a progesterone receptor expressed by a second tissue type with said putative agonist ligand under conditions wherein, in the absence of said putative agonist ligand, said progesterone receptor is not activated, wherein said progesterone receptor is the same isoform as the progesterone receptor contacted in (b);
d. detecting expression of said at least one gene from (a);
e. comparing the expression of said at least one gene in the presence and in the absence of said putative agonist ligand in each of said first and second tissue types, wherein detection of regulation of the expression of said at least one gene in one of said first or second tissue types in the manner associated with activation of said progesterone receptor as set forth in said expression profile of (a), and detection of inhibition of regulation or no regulation of said at least one gene in the other of said first or second tissue types, as compared to the expression of said at least one gene associated with activation of said progesterone receptor as set forth in said expression profile of (a), indicates that said putative agonist ligand is a tissue-specific progesterone receptor agonist.
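The two-tissue comparison of step (e) in claim 57 reduces to asking whether the profiled regulation appears in exactly one tissue. A minimal Python sketch follows, assuming log2 fold changes as input and an arbitrary `cutoff`; neither is specified in the claims.

```python
from typing import Optional


def regulated_as_profiled(log2_fc: float, expected_direction: int,
                          cutoff: float = 1.0) -> bool:
    """expected_direction is +1 for a gene the profile lists as upregulated
    and -1 for one it lists as downregulated."""
    return log2_fc * expected_direction >= cutoff


def tissue_specific_agonist(fc_first: float, fc_second: float,
                            expected_direction: int,
                            cutoff: float = 1.0) -> Optional[str]:
    """Return the tissue in which the ligand acts as an agonist,
    or None if it acts in both tissues or in neither."""
    first = regulated_as_profiled(fc_first, expected_direction, cutoff)
    second = regulated_as_profiled(fc_second, expected_direction, cutoff)
    if first and not second:
        return "first"
    if second and not first:
        return "second"
    return None


# A gene the profile lists as upregulated responds only in the first tissue.
print(tissue_specific_agonist(1.8, 0.2, expected_direction=+1))
```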
58. The method of claim 57, wherein said first tissue type is breast, and wherein said at least one gene is selected from the group consisting of:
i. at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1;
ii. at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2;
iii. at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3;
iv. at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4;
v. at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5;
vi. at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and,
vii. at least one gene that is regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7.
59. The method of claim 57, wherein said second tissue type is selected from the group consisting of breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth.
60. The method of claim 57, wherein said first tissue type is a non-malignant tissue and wherein said second tissue type is a malignant tissue from the same tissue source as the first tissue type.
61. The method of claim 60, wherein said tissue source is breast tissue.
62. The method of claim 57, wherein said first tissue type is a normal tissue and wherein said second tissue type is a non-malignant, abnormal tissue.
63. The method of claim 57, wherein said expression profile of genes regulated by a progesterone receptor in said first or second tissue type is provided by a method comprising:
a. providing a first cell of a selected tissue type that expresses a progesterone receptor A (PR-A) and not a progesterone receptor B (PR-B) and a second cell of the same tissue type that expresses PR-B and not PR-A;
b. stimulating said progesterone receptors in (a) by contacting said first and second cells with a progesterone receptor stimulatory ligand;
c. detecting expression of genes by said first and second cells in the presence of said stimulatory ligand and in the absence of said stimulatory ligand, wherein a difference in the expression of a gene in the presence of said stimulatory ligand as compared to in the absence of said stimulatory ligand, indicates that said gene is regulated by said progesterone receptor in said selected tissue type.
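Once the PR-A-only and PR-B-only cells of claim 63 have been measured with and without the stimulatory ligand, each gene can be sorted into the kinds of categories enumerated in claims 58 and 65. The Python sketch below shows one way to do that sorting. The `cutoff` and the log2 fold-change inputs are assumptions, and the category labels are shorthand rather than claim language.

```python
def direction(log2_fc: float, cutoff: float = 1.0) -> int:
    """+1 if upregulated, -1 if downregulated, 0 if within the cutoff."""
    if log2_fc >= cutoff:
        return 1
    if log2_fc <= -cutoff:
        return -1
    return 0


def categorize(fc_pr_a: float, fc_pr_b: float, cutoff: float = 1.0) -> str:
    """Classify a gene from its ligand response in PR-A-only and PR-B-only cells."""
    da, db = direction(fc_pr_a, cutoff), direction(fc_pr_b, cutoff)
    if da == 0 and db == 0:
        return "not regulated"
    if db == 0:
        return "PR-A selective, up" if da > 0 else "PR-A selective, down"
    if da == 0:
        return "PR-B selective, up" if db > 0 else "PR-B selective, down"
    if da == db:
        return "regulated by both isoforms"
    return "reciprocally regulated"


print(categorize(2.0, 0.0))   # responds only in the PR-A-only cell
print(categorize(1.5, -1.5))  # opposite responses in the two cells
```

This sketch does not cover the Table 7 category, which depends on a third measurement in cells expressing both isoforms.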
64. A method to identify a tissue-specific antagonist of a progesterone receptor, comprising:
a. providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in both a first and second tissue type when said progesterone receptor is activated, wherein said at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7;
b. contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated;
c. contacting a progesterone receptor expressed by a second tissue type with said putative antagonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated;
d. detecting expression of said at least one gene from (a); and,
e. comparing the expression of said at least one gene in the presence and in the absence of said putative antagonist ligand in each of said first and second tissue types, wherein detection of regulation of the expression of said at least one gene in one of said first or second tissue types in the manner associated with activation of said progesterone receptor as set forth in said expression profile of (a) in the presence of said putative antagonist ligand, and detection of inhibition or reversal of regulation of expression of said at least one gene in the other of said first or second tissue types in the presence of said putative antagonist ligand, as compared to the expression of said at least one gene associated with activation of said progesterone receptor as set forth in said expression profile of (a), indicates that said putative antagonist ligand is a tissue-specific progesterone receptor antagonist.
65. The method of claim 64, wherein said first tissue type is breast, and wherein said at least one gene is selected from the group consisting of:
i. at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1;
ii. at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2;
iii. at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3;
iv. at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4;
v. at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5;
vi. at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and,
vii. at least one gene that is regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7.
66. The method of claim 64, wherein said second tissue type is selected from the group consisting of breast, uterus, bone, cartilage, cardiovascular tissues, heart, lung, brain, meninges, pituitary, ovary, oocyte, corpus luteum, oviduct, fallopian tubes, T lymphocytes, B lymphocytes, thymocytes, salivary gland, placenta, skin, gut, pancreas, liver, testis, epididymis, bladder, urinary tract, eye, and teeth.
67. The method of claim 64, wherein said first tissue type is a non-malignant tissue and wherein said second tissue type is a malignant tissue from the same tissue source as the first tissue type.
68. The method of claim 67, wherein said tissue source is breast tissue.
69. A method to identify a tissue-specific agonist of a progesterone receptor, comprising:
a. providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first tissue type but not a second tissue type when said progesterone receptor is activated, wherein said at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7;
b. contacting a progesterone receptor expressed by said first tissue type with a putative agonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative agonist ligand, said progesterone receptor is not activated;
c. detecting expression of said at least one gene from (a);
d. comparing the expression of said at least one gene in the presence and in the absence of said putative agonist ligand in said first tissue type, wherein detection of regulation of the expression of said at least one gene in said first tissue type in the manner associated with activation of said progesterone receptor as set forth in said expression profile of (a) indicates that said putative agonist ligand is a tissue-specific progesterone receptor agonist for said first tissue type.
70. A method to identify a tissue-specific antagonist of a progesterone receptor, comprising:
a. providing an expression profile for at least one gene that is known to be regulated by a progesterone receptor in a first but not in a second tissue type when said progesterone receptor is activated, wherein said at least one gene is chosen from the genes in any one or more of the genes in Tables 1-7;
b. contacting a progesterone receptor expressed by a first tissue type with a putative antagonist ligand, wherein said progesterone receptor is selected from the group consisting of progesterone receptor A (PR-A) and progesterone receptor B (PR-B), under conditions wherein, in the absence of said putative antagonist ligand, said progesterone receptor is activated;
c. detecting expression of said at least one gene from (a); and,
d. comparing the expression of said at least one gene in the presence and in the absence of said putative antagonist ligand in said first tissue type, wherein detection of inhibition or reversal of regulation of expression of said at least one gene in said first tissue type in the presence of said putative antagonist ligand, as compared to the expression of said at least one gene associated with activation of said progesterone receptor as set forth in said expression profile of (a), indicates that said putative antagonist ligand is a tissue-specific progesterone receptor antagonist of said first tissue type.
71. A method to determine the profile of genes regulated by progesterone receptors in a breast tumor sample, comprising:
a. obtaining from a patient a breast tumor sample;
b. detecting expression of at least one gene in said breast tumor sample that is regulated by a progesterone receptor when said progesterone receptor is activated, said at least one gene being selected from the group consisting of:
i. at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 9;
ii. at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 10;
iii. at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 11;
iv. at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 12;
v. at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 13;
vi. at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 14; and,
vii. at least one gene that is regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 15; and,
c. producing a profile of genes for said tumor sample that are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B.
72. A plurality of polynucleotides for the detection of the expression of genes regulated by progesterone receptors in breast tissue;
wherein said plurality of polynucleotides consists of polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes that are regulated by progesterone receptors; and
wherein said plurality of polynucleotides comprises polynucleotide probes that are complementary to RNA transcripts, or nucleotides derived therefrom, of genes selected from the group consisting of:
a. at least one gene that is selectively upregulated by PR-A chosen from a gene in Table 1;
b. at least one gene that is selectively downregulated by PR-A chosen from a gene in Table 2;
c. at least one gene that is selectively upregulated by PR-B chosen from a gene in Table 3;
d. at least one gene that is selectively downregulated by PR-B chosen from a gene in Table 4;
e. at least one gene that is upregulated or downregulated by both PR-A and PR-B chosen from a gene in Table 5;
f. at least one gene that is reciprocally regulated by PR-A and PR-B chosen from a gene in Table 6; and,
g. at least one gene that is regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from a gene in Table 7.
73. The plurality of polynucleotides of claim 72, wherein said polynucleotide probes are immobilized on a substrate.
74. The plurality of polynucleotides of claim 72, wherein said polynucleotide probes are hybridizable array elements in a microarray.
75. The plurality of polynucleotides of claim 72, wherein said polynucleotide probes are conjugated to detectable markers.
76. The plurality of polynucleotides of claim 72, wherein said plurality of polynucleotides further comprises at least one polynucleotide probe that is complementary to RNA transcripts, or nucleotides derived therefrom, of at least one gene chosen from the genes in Table 8.
77. A plurality of antibodies, or antigen binding fragments thereof, for the detection of the expression of genes regulated by progesterone receptors in breast tissue;
wherein said plurality of antibodies, or antigen binding fragments thereof, consists of antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes that are regulated by progesterone receptors; and
wherein said plurality of antibodies, or antigen binding fragments thereof, comprises antibodies, or antigen binding fragments thereof, that selectively bind to proteins encoded by genes selected from the group consisting of:
a. genes that are selectively upregulated by PR-A chosen from genes in Table 1;
b. genes that are selectively downregulated by PR-A chosen from genes in Table 2;
c. genes that are selectively upregulated by PR-B chosen from genes in Table 3;
d. genes that are selectively downregulated by PR-B chosen from genes in Table 4;
e. genes that are upregulated or downregulated by both PR-A and PR-B chosen from genes in Table 5;
f. genes that are reciprocally regulated by PR-A and PR-B chosen from genes in Table 6; and,
g. genes that are regulated by one of said PR-A or said PR-B, wherein regulation of said gene is altered when the other of said PR-A or PR-B is expressed by the same cell, chosen from genes in Table 7.
78. The plurality of antibodies, or antigen binding fragments thereof, of claim 77, wherein said plurality of antibodies, or antigen binding fragments thereof, further comprises at least one antibody, or an antigen binding fragment thereof, that selectively binds to a protein encoded by a gene chosen from the genes in Table 8.
79. A method to identify genes that are regulated by a progesterone receptor in two or more tissue types, comprising:
a. activating a progesterone receptor in two or more tissue types that express said progesterone receptor;
b. detecting expression of at least one gene in said two or more tissue types, said at least one gene being chosen from a gene in any one or more of Tables 1-7, and,
c. identifying genes that are regulated by said progesterone receptor in each of said two or more tissue types.
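Step (c) of claim 79 amounts to intersecting the per-tissue sets of regulated genes. A short Python sketch, using placeholder tissue and gene names that are assumptions rather than data from the Tables:

```python
from typing import Dict, Set


def shared_regulated_genes(regulated_by_tissue: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes regulated in every tissue type supplied."""
    gene_sets = list(regulated_by_tissue.values())
    if len(gene_sets) < 2:
        raise ValueError("at least two tissue types are required")
    return set.intersection(*gene_sets)


# Hypothetical per-tissue results; G1..G4 are placeholder identifiers.
shared = shared_regulated_genes({
    "breast": {"G1", "G2", "G3"},
    "uterus": {"G2", "G3"},
    "bone": {"G3", "G4"},
})
print(shared)
```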
80. The method of claim 79, further comprising detecting whether said genes are regulated selectively by PR-A, selectively by PR-B, or by both PR-A and PR-B.
81. A method to regulate the expression of a gene selected from the group consisting of any one or more of said genes in Tables 1-7, wherein said method comprises administering to a cell that expresses a progesterone receptor a compound selected from the group consisting of: progesterone, a progestin, and an antiprogestin, wherein said compound is effective to regulate the expression of said gene.
82. The method of claim 81, wherein said gene is selected from the group consisting of: growth arrest-specific protein (gas6), NF-IL6-beta (C/EBPbeta), calcium-binding protein S100P, MSX-2, selenium-binding protein (hSBP), and bullous pemphigoid antigen (plakin family).
83. The method of claim 81, wherein said cell that expresses a progesterone receptor is in the breast tissue of a patient that has, or is at risk of developing, breast cancer.
US10/776,827 2000-06-28 2004-02-10 Progesterone receptor-regulated gene expression and methods related thereto Abandoned US20040132086A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/776,827 US20040132086A1 (en) 2000-06-28 2004-02-10 Progesterone receptor-regulated gene expression and methods related thereto

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US21487000P 2000-06-28 2000-06-28
US09/814,915 US6750015B2 (en) 2000-06-28 2001-03-21 Progesterone receptor-regulated gene expression and methods related thereto
US10/776,827 US20040132086A1 (en) 2000-06-28 2004-02-10 Progesterone receptor-regulated gene expression and methods related thereto

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/814,915 Division US6750015B2 (en) 2000-06-28 2001-03-21 Progesterone receptor-regulated gene expression and methods related thereto

Publications (1)

Publication Number Publication Date
US20040132086A1 (en) 2004-07-08

Family

ID=43706153

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/814,915 Expired - Fee Related US6750015B2 (en) 2000-06-28 2001-03-21 Progesterone receptor-regulated gene expression and methods related thereto
US10/776,827 Abandoned US20040132086A1 (en) 2000-06-28 2004-02-10 Progesterone receptor-regulated gene expression and methods related thereto

Country Status (1)

Country Link
US (2) US6750015B2 (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009138544A1 (en) 2008-05-16 2009-11-19 Proyecto De Biomedicina Cima, S.L. Self-inactivating helper adenoviruses for the production of high-capacity recombinant adenoviruses
WO2009147271A2 (en) 2008-06-04 2009-12-10 Proyecto De Biomedicina Cima, S.L. System for packaging high-capacity adenoviruses
EP2407534A1 (en) 2010-07-14 2012-01-18 Neo Virnatech, S.L. Methods and reagents for obtaining transcriptionally active virus-like particles and recombinant virions
WO2012045905A2 (en) 2010-10-06 2012-04-12 Fundació Privada Institut De Recerca Biomèdica Method for the diagnosis, prognosis and treatment of breast cancer metastasis
WO2013153458A2 (en) 2012-04-09 2013-10-17 Inbiomotion S.L. Method for the prognosis and treatment of cancer metastasis
WO2013182912A2 (en) 2012-06-06 2013-12-12 Fundacio Privada Institut De Recerca Biomedica Method for the diagnosis, prognosis and treatment of lung cancer metastasis
EP2687852A1 (en) 2012-07-17 2014-01-22 Laboratorios Del. Dr. Esteve, S.A. Method for diagnosing and treating chronic fatigue syndrome
WO2014140933A2 (en) 2013-03-15 2014-09-18 Fundacio Privada Institut De Recerca Biomedica Method for the prognosis and treatment of cancer metastasis
WO2014140896A2 (en) 2013-03-15 2014-09-18 Fundacio Privada Institut De Recerca Biomedica Method for the diagnosis, prognosis and treatment of cancer metastasis
WO2014184679A2 (en) 2013-03-15 2014-11-20 Inbiomotion S.L. Method for the prognosis and treatment of renal cell carcinoma metastasis
WO2014188042A1 (en) 2013-05-20 2014-11-27 3P Biopharmaceuticals Alphaviral vectors and cell lines for producing recombinant proteins
WO2015052583A2 (en) 2013-10-09 2015-04-16 Fundacio Institut De Recerca Biomedica (Irb Barcelona) Method for the prognosis and treatment of cancer metastasis
WO2015101666A1 (en) 2014-01-03 2015-07-09 Fundación Biofísica Bizkaia VLPs, METHODS FOR THEIR OBTENTION AND APPLICATIONS THEREOF
US9567365B2 (en) 2014-03-31 2017-02-14 Super Well Biotechnology Corporation Method for separating estrogen from placenta
EP3553186A1 (en) 2012-10-12 2019-10-16 Inbiomotion S.L. Method for the diagnosis, prognosis and treatment of prostate cancer metastasis
EP4371569A1 (en) 2022-11-16 2024-05-22 Universidad del País Vasco/Euskal Herriko Unibertsitatea Vlps against acute myeloid leukaemia

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020001797A1 (en) * 2000-06-30 2002-01-03 Mitsuko Ishihara Method for detecting endocrine disrupting action of a test substance
US20120226090A1 (en) * 2002-11-13 2012-09-06 University Of North Texas Health Science Center At Fort Worth Protection Against and Treatment of Ionizing Radiation
US8163692B2 (en) * 2002-11-13 2012-04-24 University Of North Texas Health Science Center Of Fort Worth Protection against and treatment of ionizing radiation
US20050123594A1 (en) * 2002-11-13 2005-06-09 Sanjay Awasthi Liposomes for protection against toxic compounds
US9895413B2 (en) 2002-11-13 2018-02-20 Board Of Regents, University Of Texas System Protection against and treatment of ionizing radiation
US20060182749A1 (en) 2002-11-13 2006-08-17 Board Of Regents, The University Of Texas System Therapies for cancer using RLIP76
CA2516182A1 (en) * 2003-02-28 2004-09-16 Bayer Pharmaceuticals Corporation Expression profiles for breast cancer and methods of use
US20050064472A1 (en) * 2003-07-23 2005-03-24 Affymetrix, Inc. Methods of monitoring gene expression
WO2006094014A2 (en) 2005-02-28 2006-09-08 The Regents Of The University Of California Methods for diagnosis and treatment of endometrial cancer
US8318906B2 (en) * 2005-04-15 2012-11-27 The Regents Of The University Of California EMP2 antibodies and their therapeutic uses
WO2006113526A2 (en) 2005-04-15 2006-10-26 The Regents Of The University Of California Prevention of chlamydia infection using a protective antibody
US8648052B2 (en) * 2005-04-15 2014-02-11 The Regents Of The University Of California Prevention of chlamydia infection using SIRNA
US20110020433A1 (en) * 2008-02-07 2011-01-27 Cunningham C Casey Compositions for delivery of cargo such as drugs proteins and/or genetic materials
US8968210B2 (en) 2008-10-01 2015-03-03 Covidien LLP Device for needle biopsy with integrated needle protection
US11298113B2 (en) 2008-10-01 2022-04-12 Covidien Lp Device for needle biopsy with integrated needle protection
US9782565B2 (en) 2008-10-01 2017-10-10 Covidien Lp Endoscopic ultrasound-guided biliary access system
US9332973B2 (en) 2008-10-01 2016-05-10 Covidien Lp Needle biopsy device with exchangeable needle and integrated needle protection
US9186128B2 (en) 2008-10-01 2015-11-17 Covidien Lp Needle biopsy device
EP2456455A4 (en) * 2009-07-24 2012-12-12 Terapio Corp Compositions and methods of use for post-radiation protection
CA2781290A1 (en) 2009-11-20 2011-05-26 Lynn K. Gordon Epithelial membrane protein-2 (emp2) and proliferative vitreoretinopathy (pvr)
AU2013203713B2 (en) 2012-02-13 2015-07-16 Terapio Corporation RLIP76 as a medical chemical countermeasure
MX2014013044A (en) * 2012-04-27 2015-06-23 Univ Minnesota Breast cancer prognosis, prediction of progesterone receptor subtype and|prediction of response to antiprogestin treatment based on gene expression.
WO2013177245A2 (en) * 2012-05-22 2013-11-28 Nanostring Technologies, Inc. Nano46 genes and methods to predict breast cancer outcome
US9177098B2 (en) 2012-10-17 2015-11-03 Celmatix Inc. Systems and methods for determining the probability of a pregnancy at a selected point in time
WO2015042163A1 (en) 2013-09-17 2015-03-26 Terapio Corporation Methods of preventing or treating mucositis using rlip76
CA2939241A1 (en) * 2014-04-08 2015-10-15 Arno Therapeutics, Inc. Systems and methods for identifying progesterone receptor subtypes
AU2015289464A1 (en) * 2014-07-17 2017-02-02 Celmatix Inc. Methods and systems for assessing infertility and related pathologies
CN113559075A (en) 2014-11-17 2021-10-29 Context Biopharma Inc. Onapristone extended release compositions and methods
CA2998924A1 (en) 2015-09-25 2017-03-30 Context Biopharma Inc. Methods of making onapristone intermediates
AU2016370499B2 (en) 2015-12-15 2022-06-30 Context Biopharma Inc. Amorphous onapristone compositions and methods of making the same
WO2018102369A1 (en) 2016-11-30 2018-06-07 Arno Therapeutics, Inc. Methods for onapristone synthesis dehydration and deprotection

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5364791A (en) 1992-05-14 1994-11-15 Elisabetta Vegeto Progesterone receptor having C. terminal hormone binding domain truncations
WO1998005679A2 (en) 1996-08-05 1998-02-12 Duke University Mixed agonists of the progesterone receptor and assays therefor

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5506791A (en) * 1989-12-22 1996-04-09 American Sigma, Inc. Multi-function flow monitoring apparatus with multiple flow sensor capability
US5945279A (en) * 1991-05-02 1999-08-31 Baylor College Of Medicine Screening system for identifying compounds that regulate steroid and orphan receptors mediation of DNA transcription
US5808139A (en) * 1992-04-21 1998-09-15 Ligand Pharmaceuticals Incorporated Non-steroid progesterone receptor agonist and antagonist and compounds and methods
US5759785A (en) * 1992-05-14 1998-06-02 Baylor College Of Medicine Method of identifying hormone antagonists and agonists
US5935934A (en) * 1992-05-14 1999-08-10 Baylor College Of Medicine Mutated steroid hormone receptors, methods for their use and molecular switch for gene therapy
US5506102A (en) * 1993-10-28 1996-04-09 Ligand Pharmaceuticals Incorporated Methods of using the A form of the progesterone receptor to screen for antagonists of steroid intracellar receptor-mediated transcription
US5770176A (en) * 1995-12-08 1998-06-23 Chiron Diagnostics Corporation Assays for functional nuclear receptors
US5683885A (en) * 1996-04-12 1997-11-04 Baylor College Of Medicine Methods for diagnosing an increased risk for breast or ovarian cancer

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009138544A1 (en) 2008-05-16 2009-11-19 Proyecto De Biomedicina Cima, S.L. Self-inactivating helper adenoviruses for the production of high-capacity recombinant adenoviruses
WO2009147271A2 (en) 2008-06-04 2009-12-10 Proyecto De Biomedicina Cima, S.L. System for packaging high-capacity adenoviruses
EP2407534A1 (en) 2010-07-14 2012-01-18 Neo Virnatech, S.L. Methods and reagents for obtaining transcriptionally active virus-like particles and recombinant virions
WO2012007557A1 (en) 2010-07-14 2012-01-19 Neo Virnatech, S.L. Methods and reagents for obtaining transcriptionally active virus-like particles and recombinant virions
WO2012045905A2 (en) 2010-10-06 2012-04-12 Fundació Privada Institut De Recerca Biomèdica Method for the diagnosis, prognosis and treatment of breast cancer metastasis
EP3517630A1 (en) 2010-10-06 2019-07-31 Institució Catalana de Recerca i Estudis Avançats Method for the diagnosis, prognosis and treatment of breast cancer metastasis
WO2013153458A2 (en) 2012-04-09 2013-10-17 Inbiomotion S.L. Method for the prognosis and treatment of cancer metastasis
EP3825692A1 (en) 2012-04-09 2021-05-26 Fundació Institut de Recerca Biomèdica (IRB Barcelona) Method for the prognosis and treatment of cancer metastasis
EP3467124A1 (en) 2012-06-06 2019-04-10 Fundació Institut de Recerca Biomèdica IRB (Barcelona) Method for the diagnosis, prognosis and treatment of lung cancer metastasis
US11352673B2 (en) 2012-06-06 2022-06-07 Fundacio Institut De Recerca Biomedica (Irb Barcelona) Method for the diagnosis, prognosis and treatment of lung cancer metastasis
WO2013182912A2 (en) 2012-06-06 2013-12-12 Fundacio Privada Institut De Recerca Biomedica Method for the diagnosis, prognosis and treatment of lung cancer metastasis
US10006091B2 (en) 2012-06-06 2018-06-26 Fundació Institut De Recerca Biomèdica (Irb Barcelona) Method for the diagnosis, prognosis and treatment of lung cancer metastasis
EP2687852A1 (en) 2012-07-17 2014-01-22 Laboratorios Del. Dr. Esteve, S.A. Method for diagnosing and treating chronic fatigue syndrome
EP3553186A1 (en) 2012-10-12 2019-10-16 Inbiomotion S.L. Method for the diagnosis, prognosis and treatment of prostate cancer metastasis
WO2014184679A2 (en) 2013-03-15 2014-11-20 Inbiomotion S.L. Method for the prognosis and treatment of renal cell carcinoma metastasis
US11591599B2 (en) 2013-03-15 2023-02-28 Fundació Institut De Recerca Biomèdica (Irb Barcelona) Method for the diagnosis, prognosis and treatment of cancer metastasis
EP3272880A2 (en) 2013-03-15 2018-01-24 Fundació Institut de Recerca Biomèdica IRB (Barcelona) Method for the diagnosis, prognosis and treatment of metastatic cancer
WO2014140896A2 (en) 2013-03-15 2014-09-18 Fundacio Privada Institut De Recerca Biomedica Method for the diagnosis, prognosis and treatment of cancer metastasis
WO2014140933A2 (en) 2013-03-15 2014-09-18 Fundacio Privada Institut De Recerca Biomedica Method for the prognosis and treatment of cancer metastasis
US10011847B2 (en) 2013-05-20 2018-07-03 3P Biopharmaceuticals, S.L. Alphaviral vectors and cell lines for producing recombinant proteins
WO2014188042A1 (en) 2013-05-20 2014-11-27 3P Biopharmaceuticals Alphaviral vectors and cell lines for producing recombinant proteins
EP3524698A1 (en) 2013-10-09 2019-08-14 Fundació Institut de Recerca Biomèdica IRB (Barcelona) Method for the prognosis and treatment of cancer metastasis
WO2015052583A2 (en) 2013-10-09 2015-04-16 Fundacio Institut De Recerca Biomedica (Irb Barcelona) Method for the prognosis and treatment of cancer metastasis
WO2015101666A1 (en) 2014-01-03 2015-07-09 Fundación Biofísica Bizkaia VLPs, METHODS FOR THEIR OBTENTION AND APPLICATIONS THEREOF
US9567365B2 (en) 2014-03-31 2017-02-14 Super Well Biotechnology Corporation Method for separating estrogen from placenta
EP4371569A1 (en) 2022-11-16 2024-05-22 Universidad del País Vasco/Euskal Herriko Unibertsitatea Vlps against acute myeloid leukaemia

Also Published As

Publication number Publication date
US20030027208A1 (en) 2003-02-06
US6750015B2 (en) 2004-06-15

Similar Documents

Publication Publication Date Title
US6750015B2 (en) Progesterone receptor-regulated gene expression and methods related thereto
CN110382521B (en) Method for differentiating tumor-inhibiting FOXO activity from oxidative stress
US20030087266A1 (en) IGs as modifiers of the p53 pathway and methods of use
EP1463928A2 (en) Methods of diagnosis of lung cancer, compositions and methods of screening for modulators of lung cancer
US6506607B1 (en) Methods and compositions for the identification and assessment of prostate cancer therapies and the diagnosis of prostate cancer
CN107743524B (en) Method for prognosis of prostate cancer
US20040033495A1 (en) Methods of diagnosis of angiogenesis, compositions and methods of screening for angiogenesis modulators
EP1405083B1 (en) Natural ligand of gpcr chemr23 and uses thereof
EP1864131B1 (en) Natural ligand of g protein coupled receptor rcc356 and uses thereof
EP1418943A1 (en) Methods of diagnosis of angiogenesis, compositions and methods of screening for angiogenesis modulators
EP1434881A2 (en) Methods of diagnosis of cancer compositions and methods of screening for modulators of cancer
CA2512536A1 (en) Biomarkers and methods for determining sensitivity to epidermal growth factor receptor modulators
US20030068636A1 (en) Compositions, kits and methods for identification, assessment, prevention, and therapy of breast and ovarian cancer
JPH09500023A (en) Identification of ligands by selective amplification of cells transfected with the receptor
WO2003042661A2 (en) Methods of diagnosis of cancer, compositions and methods of screening for modulators of cancer
TW201632629A (en) Methods for cancer diagnosis and prognosis
KR20170095306A (en) Method for predicting response to breast cancer therapeutic agents and method of treatment of breast cancer
US20030215840A1 (en) Methods and compositions for treating cardiovascular disease using 1682, 6169, 6193, 7771, 14395, 29002, 33216, 43726, 69292, 26156, 32427, 2402, 7747, 1720, 9151, 60491, 1371, 7077, 33207, 1419, 18036, 16105, 38650, 14245, 58848, 1870, 25856, 32394, 3484, 345, 9252, 9135, 10532, 18610, 8165, 2448, 2445, 64624, 84237, 8912, 2868, 283, 2554, 9464, 17799, 26686, 43848, 32135, 12208, 2914, 51130, 19489, 21833, 2917, 59590, 15992, 2094, 2252, 3474, 9792, 15400, 1452 or 6585 molecules
US20040219579A1 (en) Methods of diagnosis of cancer, compositions and methods of screening for modulators of cancer
CN102459645B (en) Phosphodiesterase 9A as a marker for prostate cancer
KR20110073451A (en) Interferon response in clinical samples (iris)
US7527935B2 (en) G-protein coupled receptor having eicosanoid as ligand and gene thereof
EP1729930A2 (en) Methods for identifying risk of osteoarthritis and treatments thereof
KR20130087585A (en) Methods for detecting low grade inflammation
CA2487098A1 (en) Novel targets for obesity from fat tissue

Legal Events

Date Code Title Description
AS Assignment

Owner name: RICHER, JENNIFER, COLORADO

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:REGENTS OF THE UNIVERSITY OF COLORADO, THE;REEL/FRAME:015213/0132

Effective date: 20040405

Owner name: HORWITZ, KATHRYN B., COLORADO

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:REGENTS OF THE UNIVERSITY OF COLORADO, THE;REEL/FRAME:015213/0132

Effective date: 20040405

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: NIH - DIETR, MARYLAND

Free format text: CONFIRMATORY LICENSE;ASSIGNOR:THE REGENTS OF THE UNIVERSITY OF COLORADO A BODY CORPORATE;REEL/FRAME:047550/0268

Effective date: 20181119

AS Assignment

Owner name: NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF HEALTH AND HUMAN SERVICES (DHHS), U.S. GOVERNMENT

Free format text: CONFIRMATORY LICENSE;ASSIGNOR:UNIVERSITY OF COLORADO;REEL/FRAME:047928/0496

Effective date: 20180814