
WO2001047944A2 - Nucleic acids containing single nucleotide polymorphisms and methods of use thereof - Google Patents



Publication number
WO2001047944A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
polymorphic
nucleotide
complement
nucleic acid
Prior art date
Application number
PCT/US2000/035498
Other languages
French (fr)
Other versions
WO2001047944A3 (en)
Inventor
Richard A. Shimkets
Martin Leach
Original Assignee
Curagen Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Curagen Corporation
Priority to AU29145/01A (AU2914501A)
Priority to CA002395926A (CA2395926A1)
Priority to EP00993615A (EP1244688A1)
Publication of WO2001047944A2
Publication of WO2001047944A3


Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • Sequence polymorphism-based analysis of nucleic acid sequences can augment or replace previously known methods for determining the identity and relatedness of individuals.
  • the approach is generally based on alterations in nucleic acid sequences between related individuals.
  • This analysis has been widely used in a variety of genetic, diagnostic, and forensic applications. For example, polymorphism analyses are used in identity and paternity analysis, and in genetic mapping studies.
  • RFLP: restriction fragment length polymorphism
  • STR sequences typically include tandem repeats of 2, 3, or 4 nucleotide sequences that are present in a nucleic acid from one individual but absent from a second, related individual at the corresponding genomic location.
  • SNPs: single nucleotide polymorphisms
  • cSNPs: single nucleotide polymorphisms in transcribed sequences
  • SNPs can arise in several ways.
  • a single nucleotide polymorphism may arise due to a substitution of one nucleotide for another at the polymorphic site.
  • Substitutions can be transitions or transversions.
  • a transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine.
  • a transversion is the replacement of a purine by a pyrimidine, or the converse.
  • Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
  • the polymorphic site is a site at which one allele bears a gap with respect to a single nucleotide in another allele.
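The categories above (transition, transversion, insertion, deletion) can be illustrated with a minimal Python sketch. The function name `classify_snp` and the use of `-` as the gap symbol are assumptions made for this illustration; they do not come from the application.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
GAP = "-"  # assumed symbol for "no nucleotide at this site"


def classify_snp(reference: str, variant: str) -> str:
    """Classify a single-site difference between a reference and a variant allele."""
    # Normalise case and treat RNA uracil as thymine.
    ref = reference.upper().replace("U", "T")
    var = variant.upper().replace("U", "T")
    if ref == var:
        raise ValueError("alleles are identical; no polymorphism")
    if ref == GAP:
        return "insertion"   # variant carries a nucleotide the reference lacks
    if var == GAP:
        return "deletion"    # variant lacks a nucleotide the reference carries
    valid = PURINES | PYRIMIDINES
    if ref not in valid or var not in valid:
        raise ValueError("alleles must be A, C, G, T/U or the gap symbol")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition.
    same_class = {ref, var} <= PURINES or {ref, var} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_snp("A", "G")` is a transition, while `classify_snp("A", "T")` is a transversion.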
  • Some SNPs occur within or near genes.
  • One such class includes SNPs falling within regions of genes encoding for a polypeptide product. These SNPs may result in an alteration of the amino acid sequence of the polypeptide product and give rise to the expression of a defective or other variant protein.
  • Such variant products can, in some cases, result in a pathological condition, e.g., genetic disease.
  • Genetic diseases that arise from a polymorphism within a coding sequence include sickle cell anemia and cystic fibrosis.
  • Other SNPs do not result in alteration of the polypeptide product.
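Whether a coding SNP alters the polypeptide can be checked by translating the reference and variant codons. The sketch below assumes the standard genetic code and DNA (not RNA) codons; the function name `coding_effect` and its return labels are hypothetical and chosen for illustration only.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def coding_effect(ref_codon: str, position: int, variant_base: str):
    """Return (effect, reference amino acid, variant amino acid).

    `position` is the 0-based index (0, 1 or 2) of the SNP within the codon.
    """
    ref_codon = ref_codon.upper()
    var_codon = ref_codon[:position] + variant_base.upper() + ref_codon[position + 1:]
    ref_aa, var_aa = CODON_TABLE[ref_codon], CODON_TABLE[var_codon]
    if ref_aa == var_aa:
        effect = "synonymous"   # polypeptide product unchanged
    elif var_aa == "*":
        effect = "nonsense"     # premature stop codon
    elif ref_aa == "*":
        effect = "stop-loss"    # stop codon replaced by an amino acid
    else:
        effect = "missense"     # one amino acid replaced by another
    return effect, ref_aa, var_aa
```

The sickle cell substitution (GAG to GTG, glutamate to valine) gives `coding_effect("GAG", 1, "T")`, which evaluates to `("missense", "E", "V")`, whereas `coding_effect("GAA", 2, "G")` is synonymous.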
  • SNPs can also occur in noncoding regions of genes.
  • SNPs tend to occur with great frequency and are spaced uniformly throughout the genome.
  • the frequency and uniformity of SNPs mean that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest.
  • the invention is based in part on the discovery of novel single nucleotide polymorphisms (SNPs) in regions of human DNA.
  • the invention provides an isolated polynucleotide which includes one or more of the SNPs described herein.
  • the polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 and the Sequence Listing (SEQ ID NOS: 1 - 7867) and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site.
  • the polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences (SEQ ID NOS: 1-7867), or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the polynucleotide can be, e.g., DNA or RNA, and can be between about 10 and about 100 nucleotides, e.g., 10-90, 10-75, 10-51, 10-40, or 10-30 nucleotides, in length.
  • the polymorphic site in the polymorphic sequence includes a nucleotide other than the nucleotide listed in Table 1, column 5 for the polymorphic sequence, e.g., the polymorphic site includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the complement of the polymorphic site includes a nucleotide other than the complement of the nucleotide listed in Table 1, column 5 for the complement of the polymorphic sequence, e.g., the complement of the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
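The requirement that a fragment, or its complement, still include the polymorphic site can be sketched as follows. The helper names (`reverse_complement`, `fragment_with_site`, `site_in_reverse_complement`) and the 0-based indexing are assumptions for illustration; the application does not specify how fragments are chosen.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def fragment_with_site(seq: str, site: int, length: int):
    """Return (fragment, offset): a window of `length` nucleotides that
    contains the 0-based polymorphic `site`, and the site's index within it."""
    if not 0 <= site < len(seq):
        raise ValueError("site lies outside the sequence")
    if not 1 <= length <= len(seq):
        raise ValueError("length must be between 1 and len(seq)")
    # Centre the window on the site, then clamp it to the sequence bounds.
    start = min(max(site - length // 2, 0), len(seq) - length)
    return seq[start:start + length], site - start


def site_in_reverse_complement(fragment_length: int, offset: int) -> int:
    """Index of the polymorphic site on the reverse-complement strand."""
    return fragment_length - 1 - offset
```

With these helpers, the base at the site on the complementary strand is the complement of the base at the site on the original strand, which is the relationship the two preceding paragraphs rely on.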
  • the polymorphic sequence is associated with a polypeptide related to one of the protein families disclosed herein.
  • the nucleic acid may be associated with a polypeptide related to an ATPase associated protein, a cadherin, or any of the other proteins identified in Table 1, column 10.
  • the invention provides an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site.
  • the first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS: 1-7867), provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
  • the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences (SEQ ID NOS: 1-7867), provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the oligonucleotide does not hybridize under stringent conditions to a second polynucleotide.
  • the second polynucleotide can be, e.g., (a) a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS: 1-7867), wherein the polymorphic sequence includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence; (b) a nucleotide sequence that is a fragment of any of the polymorphic sequences; (c) a complementary nucleotide sequence including a sequence complementary to one or more polymorphic sequences (SEQ ID NOS: 1-7867), wherein the polymorphic sequence includes the complement of the nucleotide listed in Table 1, column 5; and (d) a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
  • the invention also provides a method of detecting a polymorphic site in a nucleic acid.
  • the method includes contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the method also includes determining whether the nucleic acid and the oligonucleotide hybridize. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphic site in the nucleic acid.
  • the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
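The allele-specific detection logic can be modelled very crudely in Python by treating perfect complementarity as a stand-in for hybridization under stringent conditions. That substitution is an assumption of this sketch, as are the function names and the default flank of 7 nucleotides (giving a 15-base probe, inside the stated 10 to 100 base range); real hybridization depends on temperature, salt and sequence composition.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_probe(polymorphic_seq: str, site: int, variant_base: str, flank: int = 7) -> str:
    """Build an oligonucleotide complementary to the variant allele,
    centred on the 0-based polymorphic `site`."""
    seq = polymorphic_seq.upper()
    start, end = max(site - flank, 0), min(site + flank + 1, len(seq))
    # Target strand carrying the variant nucleotide at the polymorphic site.
    target = seq[start:site] + variant_base.upper() + seq[site + 1:end]
    return reverse_complement(target)


def hybridizes(probe: str, nucleic_acid: str) -> bool:
    """Toy criterion: the probe 'hybridizes' only if its exact complement
    occurs in the given strand (a single mismatch prevents binding)."""
    return reverse_complement(probe) in nucleic_acid.upper()
```

Under this model a probe designed against the variant allele reports `True` for a sample carrying the variant nucleotide and `False` for a sample carrying the reference nucleotide, mirroring the positive and negative conditions described above.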
  • the oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
  • the polymorphic sequence identified by the oligonucleotide is associated with a polypeptide related to one of the protein families disclosed herein.
  • the nucleic acid may be associated with a polypeptide related to an ATPase associated protein, cadherin, or any of the other protein families identified in Table 1, column 10.
  • the method includes determining if a sequence polymorphism is present in a subject, such as a human.
  • the method includes providing a nucleic acid from the subject and contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • Hybridization between the nucleic acid and the oligonucleotide is then determined. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphism in said subject.
  • the invention provides a method of determining the relatedness of a first and second nucleic acid.
  • the method includes providing a first nucleic acid and a second nucleic acid and contacting the first nucleic acid and the second nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the method also includes determining whether the first nucleic acid and the second nucleic acid hybridize to the oligonucleotide, and comparing hybridization of the first and second nucleic acids to the oligonucleotide. Hybridization of both the first and second nucleic acids to the oligonucleotide indicates that the first and second subjects are related.
  • the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
  • the oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
  • the method can be used in a variety of applications.
  • the first nucleic acid may be isolated from physical evidence gathered at a crime scene.
  • the second nucleic acid may be obtained from a person suspected of having committed the crime. Matching the two nucleic acids using the method can establish whether the physical evidence originated from the person.
  • the first sample may be from a human male suspected of being the father of a child and the second sample may be from the child. Establishing a match using the described method can establish whether the male is the father of the child.
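Comparing hybridization results for two samples can be sketched as a concordance score over a panel of SNP probes. This is only a toy illustration: the profile format (a mapping from a hypothetical SNP identifier to whether the allele-specific probe hybridized) and the function name `allele_concordance` are assumptions, and the application does not state a scoring rule or threshold for declaring a match.

```python
def allele_concordance(profile_a: dict, profile_b: dict) -> float:
    """Fraction of shared SNP sites at which two samples give the same
    hybridization result (True = probe hybridized, False = it did not)."""
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        raise ValueError("the two profiles have no SNP sites in common")
    matches = sum(profile_a[snp] == profile_b[snp] for snp in shared)
    return matches / len(shared)


# Hypothetical profiles; the identifiers are placeholders, not Table 1 entries.
evidence = {"snp_a": True, "snp_b": False, "snp_c": True}
suspect = {"snp_a": True, "snp_b": False, "snp_c": True}
print(allele_concordance(evidence, suspect))
```

A score near 1.0 over many independent sites is consistent with a common origin; how many sites are needed would depend on allele frequencies, which this sketch does not model.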
  • the invention provides an isolated polypeptide comprising a polymorphic site at one or more amino acid residues, and wherein the protein is encoded by a polynucleotide including one of the polymorphic sequences SEQ ID NOS: 1-7867, or their complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the polypeptide can be, e.g., related to one of the protein families disclosed herein.
  • the polypeptide can be related to an ATPase associated protein, cadherin, or any of the other proteins provided in Table 1, column 10.
  • the polypeptide is translated in the same open reading frame as is a wild type protein whose amino acid sequence is identical to the amino acid sequence of the polymorphic protein except at the site of the polymorphism.
  • the polypeptide is encoded by a polymorphic sequence that includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence, or by its complement, which includes the complement of the nucleotide listed in Table 1, column 6.
  • the invention also provides an antibody that binds specifically to a polypeptide encoded by a polynucleotide comprising a nucleotide sequence selected from the group consisting of polymorphic sequences SEQ ID NOS: 1-7867, or its complement.
  • the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the antibody binds specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the antibody does not bind specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.
  • the invention further provides a method of detecting the presence of a polypeptide having one or more amino acid residue polymorphisms in a subject. The method includes providing a protein sample from the subject and contacting the sample with the above-described antibody under conditions that allow for the formation of antibody-antigen complexes. The antibody-antigen complexes are then detected. The presence of the complexes indicates the presence of the polypeptide.
  • the invention also provides a method of treating a subject suffering from, at risk for, or suspected of suffering from a pathology ascribed to the presence of a sequence polymorphism in a subject, e.g., a human, non-human primate, cat, dog, rat, mouse, cow, pig, goat, or rabbit.
  • the method includes providing a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, and treating the subject by administering to the subject an effective dose of a therapeutic agent.
  • Aberrant expression can include qualitative alterations in expression of a gene, e.g., expression of a gene encoding a polypeptide having an altered amino acid sequence with respect to its wild-type counterpart.
  • Qualitatively different polypeptides can include shorter, longer, or altered polypeptides relative to the amino acid sequence of the wild-type polypeptide.
  • Aberrant expression can also include quantitative alterations in expression of a gene. Examples of quantitative alterations in gene expression include lower or higher levels of expression of the gene relative to its wild-type counterpart, or alterations in the temporal or tissue-specific expression pattern of a gene.
  • aberrant expression may also include a combination of qualitative and quantitative alterations in gene expression.
  • the therapeutic agent can be administered to a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence.
  • the therapeutic agent can include, e.g., a second nucleic acid comprising the polymorphic sequence, provided that the second nucleic acid comprises the nucleotide present in the wild-type allele.
  • the second nucleic acid sequence comprises a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.
  • the therapeutic agent can be a polypeptide encoded by a polynucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or by a polynucleotide comprising a nucleotide sequence that is complementary to any one of the polymorphic sequences SEQ ID NOS: 1-7867, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the therapeutic agent may further include an antibody as herein described, or an oligonucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or a nucleotide sequence that is complementary to any one of the polymorphic sequences SEQ ID NOS: 1-7867, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 5 or Table 1, column 6 for the polymorphic sequence.
  • the invention provides an oligonucleotide array comprising one or more oligonucleotides hybridizing to a first polynucleotide at a polymorphic site encompassed therein.
  • the first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS: 1-7867); a nucleotide sequence that is a fragment of any of the nucleotide sequences, provided that the fragment includes a polymorphic site in the polymorphic sequence; a complementary nucleotide sequence comprising a sequence complementary to one or more polymorphic sequences (SEQ ID NOS: 1-7867); or a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the array comprises 10; 100; 1,000; 10,000; 100,000 or more oligonucleotides.
  • the invention also provides a kit comprising one or more of the herein-described nucleic acids.
  • the kit can include, e.g., a polynucleotide which includes one or more of the SNPs described herein.
  • the polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 and the Sequence Listing (SEQ ID NOS: 1 - 7867) and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site.
  • the polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences (SEQ ID NOS: 1-7867), or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the invention provides an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site.
  • the first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS: 1-7867), provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
  • the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences (SEQ ID NOS: 1-7867), provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the invention provides human SNPs in sequences which are transcribed, i.e., are cSNPs.
  • Many SNPs have been identified in genes related to polypeptides of known function.
  • SNPs associated with various polypeptides can be used together.
  • SNPs can be grouped according to whether they are derived from a nucleic acid encoding a polypeptide related to particular protein family or involved in a particular function.
  • SNPs can be grouped according to the functions played by their gene products. Such functions include structural proteins and proteins associated with metabolic pathways, including fatty acid metabolism, glycolysis, intermediary metabolism, calcium metabolism, proteases, and amino acid metabolism, etc.
  • the present invention provides a large number of human cSNPs based on at least one gene product that has not been previously identified.
  • the cSNPs involve nucleic acid sequences that are assembled from at least one known sequence.
  • these four or more sequences could be clustered and assembled to make a consensus contig that included an ORF.
  • the assembled contigs defined associated sets of two, or possibly more than two, alleles defined by an SNP at a particular polymorphic site.
  • the nucleotide change from the consensus sequence had to occur in at least two individual sequences, and had to have a "Phred" score of 23 or higher at the site of the presumed SNP.
  • no more than 50% mismatching with the consensus sequence was allowed.
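The filtering criteria just described (a variant seen in at least two individual sequences, a Phred score of 23 or higher at the site, and no more than 50% mismatching with the consensus) can be sketched for a single aligned column. The function names, the `(base, phred)` input format, and the reading of the 50% limit as a cap on the fraction of reads at the site that disagree with the consensus are all assumptions of this illustration; the software actually used is not described in this text. The helper `phred_to_error_probability` applies the standard Phred definition Q = -10 log10(P), so Q = 23 corresponds to an error probability of about 0.005.

```python
from collections import Counter


def phred_to_error_probability(q: float) -> float:
    """Standard Phred relationship: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def call_snp(column, min_reads=2, min_phred=23, max_mismatch_fraction=0.5):
    """Apply the thresholds to one alignment column.

    `column` is a list of (base, phred_score) pairs, one per aligned read.
    Returns (consensus_base, variant_base), or None if no SNP is called.
    """
    if not column:
        return None
    counts = Counter(base for base, _ in column)
    consensus = counts.most_common(1)[0][0]
    mismatches = sum(1 for base, _ in column if base != consensus)
    # Assumed reading of the 50% rule: reject sites where too many reads disagree.
    if mismatches / len(column) > max_mismatch_fraction:
        return None
    # Count only high-quality reads that differ from the consensus.
    good_variants = Counter(
        base for base, q in column if base != consensus and q >= min_phred
    )
    if not good_variants:
        return None
    variant, support = good_variants.most_common(1)[0]
    return (consensus, variant) if support >= min_reads else None
```

For example, five high-quality `A` reads plus two `G` reads with Phred scores of 30 and 25 yield `("A", "G")`, whereas a single `G` read, or two `G` reads below Phred 23, yields `None`.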
  • the SNP alleles occur in polynucleotides found in public databases. Furthermore, it was found that the assembled contigs defined associated sets of two, or possibly more than two, alleles defined by an SNP at a particular polymorphic site. These associations were not previously known. The SNPs are presented in Table 1.
  • allelic sets in which one allele defines a known polypeptide sequence that includes the polymorphic site and another polypeptide allele is not previously known. Then, various associations of alleles are possible. For example, it is possible that an allelic pair is defined in a noncoding region of the contig containing an ORF. In such cases the inventors believe that the invention resides in the recognition of the allelic pair; this association has not heretofore been made. Alternatively, sets of allelic contigs may exist in which the polymorphic site is within an ORF, but does not result in an amino acid change among the allelic polypeptides.
  • the invention resides in the recognition of the allelic pair; and that this association has not heretofore been made.
  • the polymorphic site resides within an ORF and results in an amino acid change, or a frameshift, among the alleles of the allelic set.
  • at least one of the alleles at the polypeptide level is a known protein.
  • At least one of the remaining allele or alleles in the set, carrying a variant amino acid at the polymorphic site, is a novel polypeptide not heretofore known.
  • the invention resides at least in the recognition of the polymorphic allele as being a variant of the known reference polypeptide.
  • Table 1 provides information concerning the allelic sequences.
  • One of the sequences may be termed a reference polymorphic sequence, and the corresponding second sequence includes the variant SNP at the polymorphic site. Since the reference polypeptide sequence is already known, the Sequence Listing accompanying this application provides only the sequence of the polymorphic allele, while its SEQ ID NO is provided in the Table. A reference to the SEQ ID NO that corresponds to the translated amino acid sequence is also given.
  • the Table includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and a description of each, are given below.
  • SNPs disclosed in Table 1 were detected by aligning large numbers of sequences from genetically diverse sources of publicly available mRNA libraries (Clontech). Software designed specifically to look for multiple examples of variant bases differing from a consensus sequence was created and deployed. A criterion of a minimum of 2 occurrences of a sequence differing from the consensus in high quality sequence reads was used to identify an SNP.
  • SNPs described herein may be useful in diagnostic kits, for DNA arrays on chips and for other uses that involve hybridization of the SNP.
  • Specific SNPs may have utility where a disease has already been associated with that gene. Examples of possible disease correlations between the claimed SNPs with members of the genes of each classification are listed below:
  • Amylase is responsible for endohydrolysis of 1,4-alpha-glucosidic linkages in oligosaccharides and polysaccharides. Variations in amylase gene may be indicative of delayed maturation and of various amylase producing neoplasms and carcinomas.
  • the serum amyloid A (SAA) proteins comprise a family of vertebrate proteins that associate predominantly with high density lipoproteins (HDL). The synthesis of certain members of the family is greatly increased in inflammation. Prolonged elevation of plasma SAA levels, as in chronic inflammation, results in a pathological condition, called amyloidosis, which affects the liver, kidney and spleen and which is characterized by the highly insoluble accumulation of SAA in these tissues. Amyloid selectively inhibits insulin-stimulated glucose utilization and glycogen deposition in muscle, while not affecting adipocyte glucose metabolism.
  • Deposition of fibrillar amyloid proteins intraneuronally, as neurofibrillary tangles, extracellularly, as plaques and in blood vessels, is characteristic of both Alzheimer's disease and aged Down's syndrome. Amyloid deposition is also associated with type II diabetes mellitus.
  • Angiogenesis is also an essential step in tumor growth in order for the tumor to get the blood supply it needs to expand. Variation in these genes may be predictive of any form of heart disease, numerous blood clotting disorders, stroke, hypertension and predisposition to tumor formation and metastasis. In particular, these variants may be predictive of the response to various antihypertensive drugs and chemotherapeutic and anti-tumor agents.
  • Apoptosis-related proteins
  • apoptosis: active cell suicide
  • Apoptosis is induced by events such as growth factor withdrawal and toxins. It is controlled by regulators, which have either an inhibitory effect on programmed cell death (anti-apoptotic) or block the protective effect of inhibitors (pro-apoptotic).
  • anti-apoptotic: having an inhibitory effect on programmed cell death
  • pro-apoptotic: blocking the protective effect of inhibitors
  • Many viruses have found a way of countering defensive apoptosis by encoding their own anti-apoptosis genes preventing their target-cells from dying too soon. Variants of apoptosis related genes may be useful in formulation of antiaging drugs.
  • Members of the cell division/cell cycle pathways such as cyclins, many transcription factors and kinases, DNA polymerases, histones, helicases and other oncogenes play a critical role in carcinogenesis where the uncontrolled proliferation of cells leads to tumor formation and eventually metastasis.
  • Variation in these genes may be predictive of predisposition to any form of cancer, from increased risk of tumor formation to increased rate of metastasis. In particular, these variants may be predictive of the response to various chemotherapeutic and anti-tumor agents.
  • Granulocyte/macrophage colony-stimulating factors are cytokines that act in hematopoiesis by controlling the production, differentiation, and function of 2 related white cell populations of the blood, the granulocytes and the monocytes-macrophages.
  • Complement proteins are immune associated cytotoxic agents, acting in a chain reaction to exterminate target cells to that were opsonized (primed) with antibodies, by forming a membrane attack complex (MAC). The mechanism of killing is by opening pores in the target cell membrane.
  • Variations in 20 complement genes or their inhibitors are associated with many autoimmune disorders. Modified serum levels of complement products cause edemas of various tissues, lupus (SLE), vasculitis, glomerulonephritis, renal failure, hemolytic anemia, thrombocytopenia, and arthritis. They interfere with mechanisms of ADCC (antibody dependent cell cytotoxicity), severely impair immune competence and reduce phagocytic ability.
  • Variants of complement genes may also be indicative of type I diabetes mellitus, meningitis, neurological disorders such as nemaline myopathy and neonatal hypotonia, muscular disorders such as congenital myopathy, and other diseases.
  • the respiratory chain is a key biochemical pathway which is essential to all aerobic cells.
  • cytochromes involved in the chain. These are heme bound proteins which serve as electron carriers. Modifications in these genes may be predictive of ataxia areflexia, dementia and myopathic and neuropathic changes in muscles. They may also be associated with various types of solid tumors.
  • Kinesins are tubulin molecular motors that function to transport organelles within cells and to move chromosomes along microtubules during cell division. Modifications of these genes may be indicative of neurological disorders such as Pick disease of the brain and tuberous sclerosis.
  • Cytokines such as erythropoietin are cell-specific in their growth stimulation; erythropoietin is useful for the stimulation of the proliferation of erythroblasts.
  • Variants in cytokines may be predictive for a wide variety of diseases, including cancer predisposition.
  • G-protein coupled receptors (also called R7G) are an extensive group of hormone, neurotransmitter, odorant and light receptors which transduce extracellular signals by interaction with guanine nucleotide-binding (G) proteins. Alterations in genes coding for G-coupled proteins may be involved in and indicative of a vast number of physiological conditions. These include blood pressure regulation, renal dysfunctions, male infertility, dopamine associated cognitive, emotional, and endocrine functions, hypercalcemia, chondrodysplasia and osteoporosis, pseudohypoparathyroidism, growth retardation and dwarfism.
  • Thioesterases
  • Eukaryotic thiol proteases are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Variants of thioester associated genes may be predictive of neuronal disorders and mental illnesses such as Ceroid Lipofuscinosis, Neuronal 1, Infantile, Santavuori disease and more.
  • SNPs are shown in Table 1 and the Sequence Listing. Both provide a summary of the polymorphic sequences disclosed herein.
  • a "SNP" is a polymorphic site embedded in a polymorphic sequence.
  • the polymorphic site is occupied by a single nucleotide, which is the position of nucleotide variation between the wild type and polymorphic allelic sequences.
  • the site is usually preceded by and followed by relatively highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations).
  • a polymorphic sequence can include one or more of the following sequences: (1) a sequence having the nucleotide denoted in Table 1, column 5 at the polymorphic site in the polymorphic sequence; or (2) a sequence having a nucleotide other than the nucleotide denoted in Table 1, column 5 at the polymorphic site in the polymorphic sequence.
  • An example of the latter sequence is a polymorphic sequence having the nucleotide denoted in Table 1, column 6 at the polymorphic site in the polymorphic sequence.
  • Each cSNP entry provides information concerning the wild type nucleotide sequence as well as the corresponding sequence that includes the SNP at the polymorphic site. Since the wild type sequence is already known, the Sequence Listing accompanying this application provides only the sequence of the polymorphic allele; its SEQ ID NO: is also cross-referenced in Table 1. A reference to the SEQ ID NO: giving the translated amino acid sequence is also given if appropriate.
  • the Table includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and an explanation for each, are given below.
  • SEQ ID provides the cross-references to the nucleotide SEQ ID NOs: for the polymorphic sequences, which are numbered consecutively, and, as explained below, amino acid SEQ ID NOs: as well, in the Sequence Listing of the application. Conversely, each sequence entry in the Sequence Listing also includes a cross-reference to the CuraGen sequence ID, under the label "CuraGen sequence ID".
  • the first SEQ ID NO: given in the first column of each row of the Table is the SEQ ID NO: identifying the nucleic acid sequence for the polymorphism.
  • If a polymorphism carries an entry for an amino acid in a coding region, then a second SEQ ID NO: appears in parentheses in the column "Amino acid after" (see below) for the polymorphic amino acid sequence.
  • the latter SEQ ID NOs: refer to amino acid sequences giving the polymorphic amino acid sequences that are the translation of the nucleotide polymorphism. If a polymorphism carries no entry for the protein portion of the row, only one SEQ ID NO: is provided, in the first column.
  • Base pos. of SNP gives the numerical position of the nucleotide in the nucleic acid at which the cSNP is found, as identified in this invention.
  • Polymorphic sequence provides a 51-base sequence with the polymorphic site at the 26th base in the sequence, as well as 25 bases from the reference sequence on the 5' side and the 3' side of the polymorphic site.
  • the designation at the polymorphic site is enclosed in square brackets, and provides first, the reference nucleotide; second, a "slash (/)"; and third, the polymorphic nucleotide.
  • the polymorphism may be an insertion or a deletion. In that case, the position which is "unfilled" (i.e., the reference or the polymorphic position) is indicated by the word "gap".
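The bracketed notation described above can be read mechanically. The following is a minimal Python sketch, assuming an entry is written as 5' flanking bases, then `[reference/polymorphic]`, then 3' flanking bases, with the literal word `gap` marking an unfilled position. The names `PolymorphicEntry`, `parse_polymorphic_sequence` and `alleles` are hypothetical and are not part of the application.

```python
import re
from typing import NamedTuple, Tuple


class PolymorphicEntry(NamedTuple):
    five_prime: str   # bases on the 5' side of the polymorphic site
    reference: str    # reference nucleotide; "" when the entry says "gap"
    variant: str      # polymorphic nucleotide; "" when the entry says "gap"
    three_prime: str  # bases on the 3' side of the polymorphic site


# Assumed layout: flank [ref/alt] flank, where ref or alt may be the word "gap".
_ENTRY = re.compile(
    r"^([ACGT]*)\[([ACGT]+|gap)/([ACGT]+|gap)\]([ACGT]*)$", re.IGNORECASE
)


def parse_polymorphic_sequence(text: str) -> PolymorphicEntry:
    """Split a bracketed polymorphic sequence into flanks and the two alleles."""
    match = _ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"not a bracketed polymorphic sequence: {text!r}")
    five, ref, alt, three = match.groups()
    ref = "" if ref.lower() == "gap" else ref.upper()
    alt = "" if alt.lower() == "gap" else alt.upper()
    return PolymorphicEntry(five.upper(), ref, alt, three.upper())


def alleles(entry: PolymorphicEntry) -> Tuple[str, str]:
    """Return (reference sequence, polymorphic sequence) with brackets removed."""
    return (
        entry.five_prime + entry.reference + entry.three_prime,
        entry.five_prime + entry.variant + entry.three_prime,
    )
```

For a substitution entry with 25 flanking bases on each side, both returned sequences are 51 bases long and differ only at the 26th base; for a `gap` entry the two sequences differ in length by the inserted or deleted bases.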
  • Base before provides the nucleotide present in the reference sequence at the position at which the polymorphism is found.
  • Base after provides the altered nucleotide at the position of the polymorphism.
  • amino acid before provides the amino acid in the reference protein, if the polymorphism occurs in a coding region.
  • amino acid after provides the amino acid in the polymorphic protein, if the polymorphism occurs in a coding region.
  • This column also includes the SEQ ID NO: in parentheses for the translated polymorphic amino acid sequence if the polymorphism occurs in a coding region.
  • Type of change provides information on the nature of the polymorphism.
  • SILENT-NONCODING is used if the polymorphism occurs in a noncoding region of a nucleic acid.
  • SILENT-CODING is used if the polymorphism occurs in a coding region of a nucleic acid and results in no change of amino acid in the translated polymorphic protein.
  • CONSERVATIVE is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in the same class as the reference amino acid.
  • the classes are:
  • Acidic: Asp, Glu, Asn, Gln;
  • End defines a termination codon
  • NONCONSERVATIVE is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in a different class than the reference amino acid.
  • FRAMESHIFT relates to an insertion or a deletion. If the frameshift occurs in a coding region, the Table provides the translation of the frameshifted codons 3' to the polymorphic site.
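The "Type of change" labels follow a simple decision order, which can be sketched in Python. This is an illustration only: the class table below contains just the classes legible in this text (the acidic class and the "End" termination class, the latter treated here as a class of its own, which is an assumption), so a full table would have to be supplied by the caller. The function names are hypothetical.

```python
from typing import Dict, Optional, Set

# Partial class table: only the entries legible in this text are included.
# Treating "End" (termination codon) as its own class is an assumption.
AMINO_ACID_CLASSES: Dict[str, Set[str]] = {
    "Acidic": {"Asp", "Glu", "Asn", "Gln"},
    "End": {"End"},
}


def class_of(residue: str, classes: Dict[str, Set[str]]) -> str:
    """Return the name of the class that contains the residue."""
    for name, members in classes.items():
        if residue in members:
            return name
    raise KeyError(f"{residue!r} is not in the supplied class table")


def type_of_change(
    in_coding_region: bool,
    aa_before: Optional[str] = None,
    aa_after: Optional[str] = None,
    is_indel: bool = False,
    classes: Dict[str, Set[str]] = AMINO_ACID_CLASSES,
) -> str:
    """Assign the Table's 'Type of change' label to one polymorphism."""
    if is_indel:
        return "FRAMESHIFT"            # insertion or deletion
    if not in_coding_region:
        return "SILENT-NONCODING"      # no protein consequence to classify
    if aa_before == aa_after:
        return "SILENT-CODING"         # codon changed, amino acid did not
    if class_of(aa_before, classes) == class_of(aa_after, classes):
        return "CONSERVATIVE"          # same amino acid class
    return "NONCONSERVATIVE"           # different amino acid class
```

With the partial table, `type_of_change(True, "Asp", "Glu")` is labeled CONSERVATIVE, while a residue outside the supplied classes raises `KeyError` rather than being guessed.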
  • Protein classification of CuraGen gene provides a generic class into which the protein is classified. During the course of the work leading to the filing of the four applications identified above, approximately 100 classes of proteins were identified.
  • "Name of protein identified following a BLASTX analysis of the CuraGen sequence" provides the database reference for the protein found to resemble the novel reference-polymorphism cognate pair most closely. (The next paragraph explains how a sequence was determined to be "novel").
  • Similarity (pvalue) following a BLASTX analysis provides the pvalue, a statistical measure from the BLASTX analysis that the polymorphic sequence is similar to, and therefore an allele of, the reference, or wild-type, sequence.
  • a cutoff of pvalue > 1 x 10^-50 is used to establish that the reference-polymorphic cognate pairs are novel.
  • Map location provides any information available at the time of filing related to localization of a gene on a chromosome.
  • the polymorphisms are arranged in the Table in the following order.
  • SEQ ID NOs: 1-5696 are nucleotide sequences for SNPs that are silent.
  • SEQ ID NOs: 5697-6011 are nucleotide sequences for SNPs that lead to conservative amino acid changes.
  • SEQ ID NOs: 6012-6740 are nucleotide sequences for SNPs that lead to nonconservative amino acid changes.
  • SEQ ID NOs: 6741-7867 are nucleotide sequences for SNPs that involve a gap.
  • the allelic cSNP introduces an additional nucleotide (an insertion) or deletes a nucleotide (a deletion).
  • An SNP that involves a gap generates a frame shift.
  • SEQ ID NOs: 7868-8182 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to conservative amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
  • SEQ ID NOs: 8183-8911 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to nonconservative amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
  • SEQ ID NOs: 8912-10038 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to frameshift-induced amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
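The SEQ ID NO: ranges listed above lend themselves to a small lookup. The Python sketch below encodes the seven ranges as stated. The pairing of a nucleotide SEQ ID NO: with its amino acid SEQ ID NO: by position within the block is an assumption drawn from the statement that the amino acid order mirrors the nucleotide order; the Table itself remains the authority. The function names are hypothetical.

```python
from typing import Optional, Tuple

# (first, last, molecule, kind) exactly as listed in the text.
SEQ_ID_RANGES = [
    (1, 5696, "nucleotide", "silent"),
    (5697, 6011, "nucleotide", "conservative"),
    (6012, 6740, "nucleotide", "nonconservative"),
    (6741, 7867, "nucleotide", "gap"),
    (7868, 8182, "amino acid", "conservative"),
    (8183, 8911, "amino acid", "nonconservative"),
    (8912, 10038, "amino acid", "frameshift"),
]

# Start of each nucleotide block paired with the start of its amino acid block.
_COGNATE_BLOCKS = [
    (5697, 6011, 7868),
    (6012, 6740, 8183),
    (6741, 7867, 8912),
]


def describe_seq_id(seq_id: int) -> Tuple[str, str]:
    """Return (molecule type, kind of change) for a SEQ ID NO:."""
    for first, last, molecule, kind in SEQ_ID_RANGES:
        if first <= seq_id <= last:
            return molecule, kind
    raise ValueError(f"SEQ ID NO: {seq_id} is outside the listed ranges")


def cognate_amino_acid_seq_id(nucleotide_seq_id: int) -> Optional[int]:
    """Assumed position-wise pairing; silent SNPs have no amino acid entry."""
    for first, last, aa_first in _COGNATE_BLOCKS:
        if first <= nucleotide_seq_id <= last:
            return aa_first + (nucleotide_seq_id - first)
    return None
```

Each nucleotide block has the same number of entries as its amino acid block (315, 729 and 1127 respectively), which is consistent with the position-wise pairing assumed here.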
  • compositions which include, or are capable of detecting, nucleic acid sequences having these polymorphisms, as well as methods of using nucleic acids.
  • polymorphic alleles of the invention may be detected at either the DNA, the RNA, or the protein level using a variety of techniques that are well known in the art. Strategies for identification and detection are described in, e.g., EP 730,663, EP 717,113, and PCT US97/02102.
  • the present methods usually employ pre-characterized polymorphisms. That is, the genotyping location and nature of polymorphic forms present at a site have already been determined. The availability of this information allows sets of probes to be designed for specific identification of the known polymorphic forms.
  • PCR DNA Amplification
  • PCR Protocols A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202.
  • recombinant protein or “recombinantly produced protein” refers to a peptide or protein produced using non-native cells that do not have an endogenous copy of
  • a recombinantly produced protein relates to the gene product of a polymorphic allele, i.e., a "polymorphic protein" containing an altered amino acid at the site of translation of the nucleotide polymorphism.
  • the cells produce the protein because they have been genetically altered by the introduction of the appropriate nucleic acid sequence.
  • the recombinant protein will not be found in association with proteins and other subcellular components normally associated with the cells producing the protein.
  • the terms "protein” and “polypeptide” are used interchangeably herein.
  • "substantially purified" or "isolated", when referring to a nucleic acid, peptide or protein, means that the chemical composition is in a milieu containing fewer, or preferably, essentially none, of other cellular components with which it is naturally associated.
  • isolated or substantially pure refers to nucleic acid preparations that lack at least one protein or nucleic acid normally associated with the nucleic acid in a host cell. It is preferably in a homogeneous state although it can be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as gel electrophoresis or high performance liquid chromatography.
  • a substantially purified or isolated nucleic acid or protein will comprise more than 80% of all macromolecular species present in the preparation.
  • the nucleic acid or protein is purified to represent greater than 90% of all macromolecular species present. More preferably the nucleic acid or protein is purified to greater than 95%, and most preferably the nucleic acid or protein is purified to essential homogeneity, wherein other macromolecular species are not detected by conventional analytical procedures.
  • the genomic DNA used for the diagnosis may be obtained from any nucleated cells of the body, such as those present in peripheral blood, urine, saliva, buccal samples, surgical specimen, and autopsy specimens.
  • the DNA may be used directly or may be amplified enzymatically in vitro through use of PCR (Saiki et al. Science 239:487-491 (1988)) or other in vitro amplification methods such as the ligase chain reaction (LCR) (Wu and Wallace Genomics 4:560-569 (1989)), strand displacement amplification (SDA) (Walker et al. Proc. Natl. Acad. Sci. U.S.A. 89:392-396 (1992)), or self-sustained sequence replication (3SR) (Fahy et al. PCR Methods Appl. 1:25-33 (1992)), prior to mutation analysis.
  • LCR: ligase chain reaction
  • SDA: strand displacement amplification
  • 3SR: self-sustained sequence replication
  • nucleic acid is a deoxyribonucleotide or ribonucleotide polymer in either single-or double-stranded form, including known analogs of natural nucleotides unless otherwise indicated.
  • nucleic acids refers to either DNA or RNA.
  • Nucleic acid sequence or “polynucleotide sequence” refers to a single-stranded sequence of deoxyribonucleotide or ribonucleotide bases read from the 5' end to the 3' end.
  • The direction of 5' to 3' addition of nascent RNA transcripts is referred to as the transcription direction; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 5' end of the RNA transcript in the 5' direction are referred to as "upstream sequences"; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 3' end of the RNA transcript in the 3' direction are referred to as "downstream sequences".
  • Detection of polymorphisms in specific DNA sequences can be accomplished by a variety of methods including, but not limited to, restriction-fragment-length-polymorphism detection based on allele-specific restriction-endonuclease cleavage (Kan and Dozy Lancet ii:910-912 (1978)), hybridization with allele-specific oligonucleotide probes (Wallace et al. Nucl. Acids Res. 6:3543-3557 (1978)), including immobilized oligonucleotides (Saiki et al. Proc. Natl. Acad. Sci. USA.
  • DGGE: denaturing-gradient gel electrophoresis
  • single-strand-conformation-polymorphism detection (Orita et al. Genomics 5:874-879 (1989))
  • RNAase cleavage at mismatched base-pairs (Myers et al. Science 230:1242 (1985))
  • chemical (Cotton et al. Proc. Natl. Acad. Sci. U.S.A. 85:4397-4401 (1988))
  • enzymatic Youil et al. Proc.
  • Specific hybridization refers to the binding, or duplexing, of a nucleic acid molecule only to a second particular nucleotide sequence to which the nucleic acid is complementary, under suitably stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA).
  • Stringent conditions are conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and are different in different circumstances. Longer sequences hybridize specifically at higher temperatures than shorter ones.
  • stringent conditions are selected such that the temperature is about 5°C lower than the thermal melting point (Tm) for the specific sequence to which hybridization is intended to occur at a defined ionic strength and pH.
  • Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the target sequence hybridizes to the complementary probe at equilibrium.
  • stringent conditions include a salt concentration of at least about 0.01 to about 1.0 M Na ion concentration (or other salts), at pH 7.0 to 8.3.
  • the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides).
  • Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. For example, conditions of 5X SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30°C are suitable for allele-specific probe hybridization.
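The relationship between Tm and the hybridization temperature can be illustrated numerically. The sketch below estimates Tm with the Wallace rule (2°C per A or T, 4°C per G or C), a common rule of thumb for short oligonucleotides that is not taken from this text, and then applies the "about 5°C lower than Tm" guideline stated above. It ignores ionic strength, pH and formamide, which the text notes also matter, so the result is only a starting point for empirical optimization. Function names are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Rule-of-thumb Tm in degrees C for a short DNA oligonucleotide.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C). This approximation is an
    assumption brought in for illustration; it is not defined in the text.
    """
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm_celsius: float, offset: float = 5.0) -> float:
    """Hybridization temperature chosen about `offset` degrees C below Tm."""
    return tm_celsius - offset


if __name__ == "__main__":
    probe = "ACGTACGTACGTACG"  # hypothetical 15-mer
    tm = wallace_tm(probe)
    print(f"estimated Tm: {tm:.0f} C, hybridize near {stringent_temperature(tm):.0f} C")
```

For the hypothetical 15-mer above (7 A/T and 8 G/C bases) the estimate is 46°C, giving a hybridization temperature of about 41°C.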
  • “Complementary” or “target” nucleic acid sequences refer to those nucleic acid sequences which selectively hybridize to a nucleic acid probe. Proper annealing conditions depend, for example, upon a probe's length, base composition, and the number of mismatches and their position on the probe, and must often be determined empirically. For discussions of nucleic acid probe design and annealing conditions, see, for example, Sambrook et al., or Current Protocols in Molecular Biology, F. Ausubel et al., ed., Greene Publishing and Wiley-Interscience, New York (1987).
  • a perfectly matched probe has a sequence perfectly complementary to a particular target sequence.
  • the test probe is typically perfectly complementary to a portion of the target sequence.
  • a "polymorphic" marker or site is the locus at which a sequence difference occurs with respect to a reference sequence.
  • Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu.
  • the reference allelic form may be, for example, the most abundant form in a population, or the first allelic form to be identified, and other allelic forms are designated as alternative, variant or polymorphic alleles.
  • the allelic form occurring most frequently in a selected population is sometimes referred to as the "wild type" form, and herein may also be referred to as the "reference" form.
  • Diploid organisms may be homozygous or heterozygous for allelic forms.
  • a diallelic polymorphism has two distinguishable forms (i.e., base sequences), and a triallelic polymorphism has three such forms.
  • an "oligonucleotide” is a single-stranded nucleic acid ranging in length from 2 to about 60 bases. Oligonucleotides are often synthetic but can also be produced from naturally occurring polynucleotides.
  • a probe is an oligonucleotide capable of binding to a target nucleic acid of a complementary sequence through one or more types of chemical bonds, usually through complementary base pairing via hydrogen bond formation. Oligonucleotide probes are often between 5 and 60 bases, and, in specific embodiments, may be between 10-40, or 15-30 bases long.
  • An oligonucleotide probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.).
  • the bases in an oligonucleotide probe may be joined by a linkage other than a phosphodiester bond, such as a phosphoramidite linkage or a phosphorothioate linkage, or they may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than by phosphodiester bonds, so long as it does not interfere with hybridization.
  • primer refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and a polymerization agent, such as DNA polymerase, RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
  • the appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template.
  • a primer need not be perfectly complementary to the exact sequence of the template, but should be sufficiently complementary to hybridize with it.
  • primer site refers to the sequence of the target DNA to which a primer hybridizes.
  • primer pair refers to a set of primers including a 5' (upstream) primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
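A primer pair for a given target can be derived from the two ends of the sequence to be amplified. The Python sketch below follows the usual laboratory convention, stated here as an assumption: the target is written 5' to 3', the upstream primer copies its 5' end (and so anneals to the opposite strand), and the downstream primer is the reverse complement of its 3' end. The 15 to 30 nucleotide bounds come from the typical primer length given above; the function names are hypothetical.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(target: str, length: int = 20) -> Tuple[str, str]:
    """Return (upstream primer, downstream primer), both written 5' to 3'.

    `target` is the strand to be amplified, written 5' to 3'.
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length is typically 15 to 30 nucleotides")
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    upstream = target[:length].upper()
    downstream = reverse_complement(target[-length:])
    return upstream, downstream
```

This sketch does not check melting temperature, self-complementarity or specificity, all of which would matter when choosing real primers.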
  • DNA fragments can be prepared, for example, by digesting plasmid DNA, or by use of PCR.
  • Oligonucleotides for use as primers or probes are chemically synthesized by methods known in the field of the chemical synthesis of polynucleotides, including by way of non-limiting example the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett 22:1859-1862 (1981) and the triester method provided by Matteucci, et al., J. Am. Chem. Soc. 103:3185 (1981), both incorporated herein by reference. These syntheses may employ an automated synthesizer, as described in Needham-VanDevanter, D.R., et al., Nucleic Acids Res. 12:6159-6168 (1984).
  • Purification of oligonucleotides may be carried out by either native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson, J.D. and Regnier, F.E., J. Chrom. 255:137-149 (1983).
  • a double stranded fragment may then be obtained, if desired, by annealing appropriate complementary single strands together under suitable conditions or by synthesizing the complementary strand using a DNA polymerase with an appropriate primer sequence.
  • Where a specific sequence for a nucleic acid probe is given, it is understood that the complementary strand is also identified and included. The complementary strand will work equally well in situations where the target is a double-stranded nucleic acid.
  • The sequence of the synthetic oligonucleotide or of any nucleic acid fragment can be obtained using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1989), which is incorporated herein by reference and is hereinafter referred to as "Sambrook et al."; Zyskind et al. (1988), Recombinant DNA Laboratory Manual (Acad. Press, New York)). Oligonucleotides useful in diagnostic assays are typically at least 8 consecutive nucleotides in length, and may range upwards of 18 nucleotides in length to greater than 100 or more consecutive nucleotides.
  • antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the SNP-containing nucleotide sequences of the invention, or fragments, analogs or derivatives thereof.
  • An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence.
  • antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, about 25, about 50, or about 60 nucleotides or an entire SNP coding strand, or to only a portion thereof.
  • an antisense nucleic acid molecule is antisense to a "coding region" of the coding strand of a polymorphic nucleotide sequence of the invention.
  • coding region refers to the region of the nucleotide sequence comprising codons which are translated into amino acids.
  • the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence of the invention.
  • noncoding region refers to 5' and 3' sequences which flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
  • antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing.
  • the antisense nucleic acid molecule can generally be complementary to the entire coding region of an mRNA, but more preferably as embodied herein, it is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of the mRNA.
  • An antisense oligonucleotide can range in length between about 5 and about 60 nucleotides, preferably between about 10 and about 45 nucleotides, more preferably between about 15 and 40 nucleotides, and still more preferably between about 15 and 30 in length.
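Under Watson and Crick pairing, an antisense oligonucleotide against a portion of an mRNA is the reverse complement of that portion. The following Python sketch builds a DNA antisense oligonucleotide for a chosen window of an mRNA and enforces the approximate 5 to 60 nucleotide range given above. The function name and the choice of a DNA (rather than RNA or modified) oligonucleotide are illustrative assumptions.

```python
# Map each mRNA base to the DNA base that pairs with it.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return a DNA oligo (5' to 3') antisense to mrna[start:start + length].

    `mrna` is written 5' to 3'; T is accepted and read as U.
    """
    if not 5 <= length <= 60:
        raise ValueError("antisense oligos here range from about 5 to about 60 nt")
    rna = mrna.upper().replace("T", "U")
    if set(rna) - set("ACGU"):
        raise ValueError("mrna may contain only A, C, G, U (or T)")
    if start < 0:
        raise ValueError("start must be non-negative")
    window = rna[start:start + length]
    if len(window) != length:
        raise ValueError("window runs past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return window.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]
```

For example, the first five bases of the hypothetical mRNA `AUGGCCAAGU` give the antisense oligonucleotide `GCCAT`.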
  • an antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art.
  • an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids; for example, phosphorothioate derivatives and acridine substituted nucleotides can be used.
  • modified nucleotides that can be used to generate the antisense nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl- 2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxy
  • the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following section).
  • the antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a polymorphic protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation.
  • the hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in the major groove of the double helix.
  • An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site.
  • antisense nucleic acid molecules can be modified to target selected cells and then administered systemically.
  • antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens.
  • the antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.
  • the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule.
  • An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641).
  • the antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
  • reference sequence is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full- length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence.
  • Optimal alignment of sequences for aligning a comparison window may, for example, be conducted by the local homology algorithm of Smith and Waterman Adv. Appl. Math.
  • nucleic acid sequence encoding refers to a nucleic acid which directs the expression of a specific protein, peptide or amino acid sequence.
  • the nucleic acid sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into protein, peptide or amino acid sequence.
  • the nucleic acid sequences include both the full length nucleic acid sequences disclosed herein as well as non-full length sequences derived from the full length protein. It is further understood that the sequence includes the degenerate codons of the native sequence or sequences which may be introduced to provide codon preference in a specific host cell. Consequently, the principles of probe selection and array design can readily be extended to analyze more complex polymorphisms (see EP 730,663). For example, to characterize a triallelic SNP polymorphism, three groups of probes can be designed, tiled on the three polymorphic forms as described above.
  • Genomic DNA is typically amplified before analysis. Amplification is usually effected by PCR using primers flanking a suitable fragment, e.g., of 50-500 nucleotides, containing the locus of the polymorphism to be analyzed. Target is usually labeled in the course of amplification.
  • the amplification product can be RNA or DNA, single stranded or double stranded. If double stranded, the amplification product is typically denatured before application to an array. If genomic DNA is analyzed without amplification, it may be desirable to remove RNA from the sample before applying it to the array. Such can be accomplished by digestion with DNase-free RNase.
  • the SNPs disclosed herein can be used to determine which forms of a characterized polymo ⁇ hism are present in individuals under analysis.
  • Allele-specific probes for analyzing polymo ⁇ hisms is described by e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726, Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymo ⁇ hic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles.
  • Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 7, 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
  • Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form.
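  • The probe-pair design described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of the cited references: the function names, the example sequence, and the reading of "the 7 position" in a 15-mer as a 0-based offset of `length // 2` are all assumptions made for the example. Each probe is returned as the reverse complement of the target strand so that it would hybridize to that strand.

```python
# Hypothetical helper names; only the centering idea comes from the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def allele_specific_probes(target, snp_index, alleles, length=15):
    """Build one probe per allele with the polymorphic site centered.

    target    -- target strand, 5'->3'
    snp_index -- 0-based index of the polymorphic site in target
    alleles   -- bases to place at the site, e.g. ("G", "A")
    length    -- probe length; the site sits at 0-based offset length // 2
    """
    offset = length // 2
    start = snp_index - offset
    end = start + length
    if start < 0 or end > len(target):
        raise ValueError("not enough flanking sequence for a centered probe")
    probes = {}
    for allele in alleles:
        # Substitute the allele into the target, then take the window.
        variant = target[:snp_index] + allele + target[snp_index + 1:]
        probes[allele] = reverse_complement(variant[start:end])
    return probes


probes = allele_specific_probes("ACGTACGTACGTACGTACGTA", 10, ("G", "A"))
```

  • The two probes returned for a diallelic site differ at exactly one base, the one that pairs with the polymorphic site, which is the property the pair relies on for discrimination.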
  • Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence.
  • The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in published PCT application WO 95/11995.
  • WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism. Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence.
  • The second group of probes is designed by the same principles, except that the probes exhibit complementarity to the second reference sequence.
  • The inclusion of a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases).
  • An allele-specific primer hybridizes to a site on a target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acids Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates that the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single-base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed.
  • The method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism, because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
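  • As a rough sketch of the allele-specific primer idea, the following Python builds one forward primer per allele with its 3'-most base on the polymorphic site, and applies an idealized all-or-nothing extension rule. The function names and the example sequence are hypothetical, and the rule that a 3' mismatch completely blocks extension is a simplifying assumption; real reactions depend on the polymerase and conditions.

```python
def allele_specific_primers(target, snp_index, alleles, length=20):
    """Forward primers (same sense as target) ending on the polymorphic site.

    target    -- strand whose sequence the primer copies, 5'->3'
    snp_index -- 0-based index of the polymorphic site
    alleles   -- candidate bases at the site
    length    -- primer length, including the 3'-terminal allele base
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = target[start:snp_index]
    return {allele: upstream + allele for allele in alleles}


def would_extend(primer, target, snp_index):
    """Idealized model: extend only if every base, including the 3' end, matches."""
    start = snp_index - len(primer) + 1
    return start >= 0 and target[start:snp_index + 1] == primer


template = "GATTACAGATTACAGATTACA"  # illustrative sequence; site at index 12 is C
primers = allele_specific_primers(template, 12, ("C", "T"), length=8)
```

  • With this template, only the primer ending in C satisfies `would_extend`, mirroring the control reaction in which the mismatched primer yields no product.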
  • Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification (W.H. Freeman and Co., New York, 1992), Chapter 7.
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989).
  • Amplified PCR products can be generated and heated or otherwise denatured, to form single stranded amplification products.
  • Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence.
  • The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.
  • The genotype of an individual with respect to a pathology suspected of being caused by a genetic polymorphism may be assessed by association analysis.
  • Phenotypic traits suitable for association analysis include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute intermittent porphyria).
  • Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms.
  • Autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves' disease.
  • Cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, oral cavity, ovary, pancreas, prostate, skin, stomach, liver, lung, and uterus, and leukemia.
  • Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.
  • Polymorphisms of the invention are often used in conjunction with polymorphisms in distal genes.
  • Preferred polymorphisms for use in forensics are diallelic because the population frequencies of two polymorphic forms can usually be determined with greater accuracy than those of multiple polymorphic forms at multi-allelic loci.
  • The capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample. If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene. If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.
  • p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site. In diallelic loci, four genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism with frequencies x and y, the probabilities of each genotype in a diploid organism are (see WO 95/12607): $p(AA) = x^2$, $p(AB) = p(BA) = xy$, and $p(BB) = y^2$.
  • The probability of identity at one locus is then $p(ID) = (x^2)^2 + (2xy)^2 + (y^2)^2$.
  • The cumulative probability of identity (cum p(ID)) for multiple unlinked loci is determined by multiplying the probabilities provided by each locus: $\text{cum } p(ID) = p(ID_1)\, p(ID_2) \cdots p(ID_n)$.
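  • The probability-of-identity calculation for diallelic loci can be illustrated numerically. The sketch below assumes each locus is described only by the frequency x of allele A, with y = 1 - x, and that the loci are unlinked; the function names are chosen for this example only.

```python
from math import prod


def p_identity(x):
    """Probability that two random individuals share a genotype at one diallelic locus.

    x is the frequency of allele A; allele B has frequency y = 1 - x.
    The three terms are the squared probabilities of AA, of a
    heterozygote (AB or BA, 2xy in total), and of BB.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    y = 1.0 - x
    return (x ** 2) ** 2 + (2 * x * y) ** 2 + (y ** 2) ** 2


def cumulative_p_identity(frequencies):
    """Product of the per-locus probabilities for unlinked loci."""
    return prod(p_identity(x) for x in frequencies)


# Ten loci, each with equal allele frequencies (illustrative values only).
chance_match = cumulative_p_identity([0.5] * 10)
```

  • At x = 0.5 the per-locus value is 0.375, so each additional unlinked locus of this kind multiplies the chance of a coincidental match by 0.375.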
  • The object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child.
  • If the set of polymorphisms in the child attributable to the father does not match the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.
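  • The exclusion logic described above can be sketched as a per-locus check. This is an illustrative simplification with hypothetical function names: genotypes are given as unordered pairs of allele labels, the mother is assumed to be the true mother, and experimental error and new mutations are ignored.

```python
def paternal_candidates(child, mother):
    """Alleles the child could have inherited from the father at one locus.

    child, mother -- genotypes as 2-tuples of allele labels, e.g. ("A", "B").
    An allele of the child is a paternal candidate if the child's other
    allele could have come from the mother.
    """
    c1, c2 = child
    candidates = set()
    if c1 in mother:
        candidates.add(c2)
    if c2 in mother:
        candidates.add(c1)
    if not candidates:
        raise ValueError("child genotype is inconsistent with the stated mother")
    return candidates


def is_excluded(child, mother, putative_father):
    """True if the putative father carries none of the paternal candidates."""
    return not (paternal_candidates(child, mother) & set(putative_father))


def excluded_at_any_locus(trios):
    """trios is an iterable of (child, mother, putative_father) genotypes."""
    return any(is_excluded(c, m, f) for c, m, f in trios)
```

  • A single excluding locus is enough to rule out paternity under these assumptions; when no locus excludes, the statistical calculation mentioned above is still needed to weigh a coincidental match.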
  • The probability of parentage exclusion (representing the probability that a random male will have a polymorphic form at a given polymorphic site that makes him incompatible as the father) is given by the equation (see WO 95/12607):
  • The probability of non-exclusion is the complement of the probability of exclusion: $p(\text{non-exc}) = 1 - p(\text{exc})$.
  • The cumulative probability of exclusion of a random male is very high. This probability can be taken into account in assessing the liability of a putative father whose polymorphic marker set matches the child's polymorphic marker set attributable to his/her father.
  • The polymorphisms of the invention may contribute to the phenotype of an organism in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure. The effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation. A single polymorphism may affect more than one phenotypic trait. Likewise, a single phenotypic trait may be affected by polymorphisms in different genes. Further, some polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype.
  • Correlation is performed for a population of individuals who have been tested for the presence or absence of a phenotypic trait of interest and for polymorphic marker sets.
  • A set of polymorphisms (i.e., a polymorphic set)
  • The alleles of each polymorphism of the set are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest. Correlation can be performed by standard statistical methods, and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted.
  • Allele A1 at polymorphism A correlates with heart disease.
  • Allele B1 at polymorphism B correlates with increased milk production of a farm animal.
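  • The text does not name a particular statistical method. As one illustrative choice among the "standard statistical methods", the sketch below computes a Pearson chi-square statistic (1 degree of freedom, no continuity correction) for a 2x2 table of allele presence against trait presence. The function name and the counts are invented for the example.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 contingency table.

                      trait present   trait absent
    allele present         a               b
    allele absent          c               d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must contain at least one count")
    return n * (a * d - b * c) ** 2 / denominator


# Illustrative counts only: 20 of 30 carriers show the trait versus
# 10 of 30 non-carriers.
statistic = chi_square_2x2(20, 10, 10, 20)

# 3.841 is the conventional 5% critical value for 1 degree of freedom.
is_significant = statistic > 3.841
```

  • When many polymorphisms are tested against the same trait, a correction for multiple comparisons would normally be applied before a correlation is reported as significant.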
  • Such correlations can be exploited in several ways.
  • Detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient.
  • Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions.
  • The female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring.
  • The previous section concerns identifying correlations between phenotypic traits and polymorphisms that directly or indirectly contribute to those traits.
  • The present section describes identification of a physical linkage between a genetic locus associated with a trait of interest and polymorphic markers that are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it.
  • Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc.
  • Linkage studies are typically performed on members of a family. Available members of the family are characterized for the presence or absence of a phenotypic trait and for a set of polymorphic markers. The distribution of polymorphic markers in an informative meiosis is then analyzed to determine which polymorphic markers co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080 (1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).
  • Linkage is analyzed by calculation of LOD (log of the odds) values.
  • A lod value is the relative likelihood of obtaining observed segregation data for a marker and a genetic locus when the two are located at a recombination fraction RF, versus the situation in which the two are not linked, and thus segregating independently (Thompson & Thompson, Genetics in Medicine (5th ed., W.B. Saunders Company, Philadelphia, 1991); Strachan, "Mapping the human genome" in The Human Genome (BIOS Scientific Publishers Ltd, Oxford), Chapter 4).
  • The likelihood at a given value of RF is the ratio of the probability of the data if the loci are linked at RF to the probability of the data if the loci are unlinked.
  • The computed likelihood is usually expressed as the $\log_{10}$ of this ratio (i.e., a lod score). For example, a lod score of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence.
  • the use of logarithms allows data collected from different families to be combined by simple addition. Computer programs are available for the calculation of lod scores for differing values of RF (e.g., LIPED, MLINK (Lathrop, Proc. Nat. Acad. Sci.
  • a recombination fraction may be determined from mathematical tables. See Smith et al., Mathematical tables for research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum. Genet. 32, 127-150
  • Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of RF) than the possibility that the two loci are unlinked.
  • A combined lod score of +3 or greater is considered definitive evidence that two loci are linked.
  • A negative lod score of -2 or less is taken as definitive evidence against linkage of the two loci being compared.
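  • A lod score can be computed directly in the simplest case. The sketch below assumes phase-known, fully informative meioses that can each be scored as recombinant or non-recombinant; this is a textbook simplification and not a substitute for the programs cited above. The function name and the counts are illustrative.

```python
from math import log10


def lod_score(recombinants, nonrecombinants, rf):
    """log10 of L(data | linked at rf) / L(data | unlinked, rf = 0.5).

    recombinants, nonrecombinants -- counts of scored meioses
    rf -- recombination fraction being tested, 0 < rf <= 0.5
    """
    if not 0.0 < rf <= 0.5:
        raise ValueError("rf must satisfy 0 < rf <= 0.5")
    n = recombinants + nonrecombinants
    log_linked = recombinants * log10(rf) + nonrecombinants * log10(1.0 - rf)
    log_unlinked = n * log10(0.5)
    return log_linked - log_unlinked


# Because the scores are logarithms, families are combined by addition.
family_1 = lod_score(0, 10, 0.1)
family_2 = lod_score(1, 9, 0.1)
combined = family_1 + family_2
```

  • Evaluating the function over a grid of rf values and taking the maximum gives the best-supported recombination fraction under the same assumptions.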
  • Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations.
  • The invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated.
  • Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote. See Hogan et al., "Manipulating the Mouse Embryo, A Laboratory Manual," Cold Spring Harbor Laboratory (1989).
  • Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker.
  • The transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.
  • The invention further provides methods for assessing the pharmacogenomic susceptibility of a subject harboring a single nucleotide polymorphism to a particular pharmaceutical compound, or to a class of such compounds.
  • Genetic polymorphisms in drug-metabolizing enzymes, drug transporters, receptors for pharmaceutical agents, and other drug targets have been correlated with individual differences in the efficacy and toxicity of the pharmaceutical agent administered to a subject.
  • Pharmacogenomic characterization of a subject's susceptibility to a drug enhances the ability to tailor a dosing regimen to the particular genetic constitution of the subject, thereby enhancing and optimizing the therapeutic effectiveness of the therapy.
  • A method of treating such a condition includes administering to a subject experiencing the pathology the wild type cognate of the polymorphic protein. Once administered in an effective dosing regimen, the wild type cognate provides complementation or remediation of the defect due to the polymorphic protein. The subject's condition is ameliorated by this protein therapy.
  • A subject suspected of suffering from a pathology ascribable to a polymorphic protein that arises from a cSNP is to be diagnosed using any of a variety of diagnostic methods capable of identifying the presence of the cSNP in the nucleic acid, or of the cognate polymorphic protein, in a suitable clinical sample taken from the subject.
  • The subject is treated with a pharmaceutical composition that includes a nucleic acid that harbors the correcting wild-type gene, or a fragment containing a correcting sequence of the wild-type gene.
  • Non-limiting examples of ways in which such a nucleic acid may be administered include incorporating the wild-type gene in a viral vector, such as an adenovirus or adeno-associated virus, and administration of a naked DNA in a pharmaceutical composition that promotes intracellular uptake of the administered nucleic acid.
  • Once the nucleic acid that includes the gene coding for the wild-type allele of the polymorphism is incorporated within a cell of the subject, it will initiate de novo biosynthesis of the wild-type gene product. If the nucleic acid is further incorporated into the genome of the subject, the treatment will have long-term effects, providing de novo synthesis of the wild-type protein for a prolonged duration. The synthesis of the wild-type protein in the cells of the subject will contribute to a therapeutic enhancement of the
  • A subject suffering from a pathology ascribed to a SNP may be treated so as to correct the genetic defect.
  • Such a subject is identified by any method that can detect the polymorphism in a sample drawn from the subject.
  • Such a genetic defect may be permanently corrected by administering to such a subject a nucleic acid fragment incorporating a repair sequence that supplies the wild-type nucleotide at the position of the SNP.
  • This site-specific repair sequence encompasses an RNA/DNA oligonucleotide which operates to promote endogenous repair of a subject's genomic DNA.
  • A genetic defect leading to an inborn pathology may be overcome, as the chimeric oligonucleotide induces incorporation of the wild-type sequence into the subject's genome.
  • The wild-type gene product is expressed, and the replacement is propagated, thereby engendering a permanent repair.
  • Kits comprising at least one allele-specific oligonucleotide as described above.
  • The kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism.
  • The allele-specific oligonucleotides are provided immobilized to a substrate.
  • The same substrate can comprise allele-specific oligonucleotide probes for detecting at least 10, 100, 1000 or all of the polymorphisms shown in the Table.
  • Kits include, for example, restriction enzymes, reverse transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), and the appropriate buffers for reverse transcription, PCR, or hybridization reactions.
  • The kit also contains instructions for carrying out the hybridizing methods.
  • Several aspects of the present invention rely on having available the polymorphic proteins encoded by the nucleic acids comprising a SNP of the invention. There are various methods of isolating these nucleic acid sequences. For example, DNA is isolated from a genomic or cDNA library using labeled oligonucleotide probes having sequences complementary to the sequences disclosed herein.
  • Probes can be used directly in hybridization assays.
  • Probes can be designed for use in amplification techniques such as PCR.
  • mRNA is isolated from tissue such as heart or pancreas, preferably a tissue wherein expression of the gene or gene family is likely to occur.
  • cDNA is prepared from the mRNA and ligated into a recombinant vector.
  • The vector is transfected into a recombinant host for propagation, screening and cloning. Methods for making and screening cDNA libraries are well known. See Gubler, U. and Hoffman, B.J., Gene 25:263-269 (1983) and Sambrook et al.
  • The DNA is extracted from tissue and either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb. The fragments are then separated by gradient centrifugation from undesired sizes and are constructed in bacteriophage lambda vectors. These vectors and phage are packaged in vitro, as described in Sambrook et al. Recombinant phage are analyzed by plaque hybridization as described in Benton and Davis, Science 196:180-182 (1977). Colony hybridization is carried out as generally described in M. Grunstein et al., Proc. Natl. Acad. Sci. USA 72:3961-3965 (1975). DNA of interest is identified in either cDNA or genomic libraries by its ability to hybridize with nucleic acid probes, for example on Southern blots, and these DNA regions are isolated by standard methods familiar to those of skill in the art. See Sambrook et al.
  • Oligonucleotide primers complementary to the two 3' borders of the DNA region to be amplified are synthesized. The polymerase chain reaction is then carried out using the two primers. See PCR Protocols: a Guide to Methods and Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). Primers can be selected to amplify the entire regions encoding a full-length sequence of interest or to amplify smaller DNA segments as desired. PCR can be used in a variety of protocols to isolate cDNAs encoding a sequence of interest.
  • Primers and probes for amplifying DNA encoding a sequence of interest are generated from analysis of the DNA sequences listed herein. Once such regions are PCR-amplified, they can be sequenced and oligonucleotide probes can be prepared from the sequence.
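  • Selecting a primer pair for the borders of a region can be sketched as follows. The function names, the example sequence, and the fixed primer length are assumptions for illustration; practical primer design also weighs melting temperature, GC content, and self-complementarity, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def flanking_primers(template, start, end, length=20):
    """Return (forward, reverse) primers bounding template[start:end].

    template   -- one strand of the region, written 5'->3'
    start, end -- 0-based, end-exclusive bounds of the amplicon
    The forward primer copies the first `length` bases of the amplicon
    and so anneals to the 3' border of the complementary strand; the
    reverse primer is the reverse complement of the last `length` bases
    and anneals to the 3' border of the given strand.
    """
    if start < 0 or end > len(template):
        raise ValueError("amplicon bounds fall outside the template")
    if end - start < 2 * length:
        raise ValueError("amplicon is too short for non-overlapping primers")
    forward = template[start:start + length]
    reverse = reverse_complement(template[end - length:end])
    return forward, reverse


forward, reverse = flanking_primers("ATGCCGTAAGCTTGGATCCA", 2, 18, length=5)
```

  • Both primers are reported 5'->3', which is the orientation in which oligonucleotides are conventionally written and ordered.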
  • Once DNA encoding a sequence comprising a cSNP is isolated and cloned, one can express the encoded polymorphic proteins in a variety of recombinantly engineered cells. It is expected that those of skill in the art are knowledgeable in the numerous expression systems available for expression of DNA encoding a sequence of interest. No attempt to describe in detail the various methods known for the expression of proteins in prokaryotes or eukaryotes is made here.
  • The expression of natural or synthetic nucleic acids encoding a sequence of interest will typically be achieved by operably linking the DNA or cDNA to a promoter (which is either constitutive or inducible), followed by incorporation into an expression vector.
  • The vectors can be suitable for replication and integration in either prokaryotes or eukaryotes.
  • Typical expression vectors contain initiation sequences, transcription and translation terminators, and promoters useful for regulation of the expression of a polynucleotide sequence of interest.
  • Such expression plasmids contain, at a minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator.
  • The expression vectors may also comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the plasmid in both eukaryotes and prokaryotes, i.e., shuttle vectors, and selection markers for both prokaryotic and eukaryotic systems. See Sambrook et al.
  • Prokaryotic expression systems may be used to express the polymorphic proteins of the invention. Examples include E. coli, Bacillus, Streptomyces, and the like.
  • Regulatory regions suitable for this purpose in E. coli include the promoter and operator region of the E. coli tryptophan biosynthetic pathway as described by Yanofsky, C., J. Bacteriol. 158:1018-1024 (1984) and the leftward promoter of phage lambda as described by Herskowitz, I. and Hagen, P., Ann. Rev. Genet. 14:399-445 (1980).
  • The inclusion of selection markers in DNA vectors transformed in E. coli is also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, or chloramphenicol. See Sambrook et al. for details concerning selection markers for use in E. coli.
  • The expressed protein may first be denatured and then renatured. This can be accomplished by solubilizing the bacterially produced proteins in a chaotropic agent such as guanidine HCl and reducing all the cysteine residues with a reducing agent such as beta-mercaptoethanol. The protein is then renatured, either by slow dialysis or by gel filtration. See U.S. Patent No. 4,511,503. Detection of the expressed antigen is achieved by methods known in the art, such as radioimmunoassay, Western blotting, or immunoprecipitation. Purification from E. coli can be achieved following procedures such as those described in U.S. Patent No. 4,511,503.
  • Any of a variety of eukaryotic expression systems, such as yeast, insect cell lines, and bird, fish, and mammalian cells, may also be used to express a polymorphic protein of the invention.
  • A nucleotide sequence harboring a cSNP may be expressed in these eukaryotic systems. Synthesis of heterologous proteins in yeast is well known. Methods in Yeast Genetics, Sherman, F., et al., Cold Spring Harbor Laboratory (1982) is a well recognized work describing the various methods available to produce the protein in yeast.
  • Suitable vectors usually have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or other glycolytic enzymes, and an origin of replication, termination sequences and the like as desired.
  • Suitable vectors are described in the literature (Botstein, et al., Gene 8:17-24 (1979); Broach, et al., Gene 8:121-133 (1979)).
  • Yeast cells are first converted into protoplasts using zymolyase, lyticase or glusulase, followed by addition of DNA and polyethylene glycol (PEG).
  • The PEG-treated protoplasts are then regenerated in a 3% agar medium under selective conditions. Details of this procedure are given in the papers by J.D. Beggs, Nature (London) 275:104-109 (1978); and Hinnen, A., et al., Proc. Natl. Acad. Sci. USA 75:1929-1933 (1978).
  • The second procedure does not involve removal of the cell wall.
  • The cells are treated with lithium chloride or acetate and PEG and put on selective plates (Ito, H., et al., J. Bact. 153:163-168 (1983)). Proteins can be isolated from yeast by lysing the cells and applying standard protein isolation techniques to the lysates.
  • The purification process can be monitored by using Western blot techniques or radioimmunoassay or other standard techniques.
  • The sequences encoding the proteins of the invention can also be ligated to various immunoassay expression vectors for use in transforming cell cultures of, for instance, mammalian, insect, bird or fish origin.
  • Illustrative of cell cultures useful for the production of the polypeptides are mammalian cells.
  • Mammalian cell systems often will be in the form of monolayers of cells, although mammalian cell suspensions may also be used.
  • Suitable host cell lines capable of expressing intact proteins have been developed in the art, and include the HEK293, BHK21, and CHO cell lines, and various human cells such as COS cell lines, HeLa cells, myeloma cell lines, Jurkat cells, etc.
  • Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter (e.g., the CMV promoter, an HSV tk promoter or pgk (phosphoglycerate kinase) promoter), an enhancer (Queen et al., Immunol. Rev. 89:49 (1986)) and necessary processing information sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites (e.g., an SV40 large T Ag poly A addition site), and transcriptional terminator sequences.
  • Vectors for expressing the proteins of the invention in insect cells are usually derived from baculovirus.
  • Insect cell lines include mosquito larvae, silkworm, armyworm, moth and Drosophila cell lines such as a Schneider cell line (see Schneider, J. Embryol. Exp. Morphol. 27:353-365 (1987)).
  • The vector, e.g., a plasmid, which is used to transform the host cell, preferably contains DNA sequences to initiate transcription and sequences to control the translation of the protein. These sequences are referred to as expression control sequences.
  • Polyadenylation or transcription terminator sequences from known mammalian genes need to be incorporated into the vector.
  • An example of a terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for accurate splicing of the transcript may also be included.
  • An example of a splicing sequence is the VP1 intron from SV40 (Sprague, J. et al., J. Virol. 45: 773-781 (1983)).
  • Gene sequences to control replication in the host cell may be incorporated into the vector. See Saveria-Campo, M., 1985, "Bovine Papilloma virus DNA a Eukaryotic Cloning Vector" in DNA Cloning Vol.
  • the host cells are competent or rendered competent for transformation by various means. There are several well-known methods of introducing DNA into animal cells. These include: calcium phosphate precipitation, fusion of the recipient cells with bacterial protoplasts containing the DNA, treatment of the recipient cells with liposomes containing the DNA, DEAE dextran, electroporation and micro-injection of the DNA directly into the cells.
  • the transformed cells are cultured by means well known in the art (Biochemical
  • the expressed polypeptides are isolated from cells grown as suspensions or as monolayers. The latter are recovered by well known mechanical, chemical or enzymatic means.
  • operably linked refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence.
  • operably linked means that the isolated polynucleotide of the invention and an expression control sequence are situated within a vector or cell in such a way that the gene encoding the protein is expressed by a host cell which has been transformed (transfected) with the ligated polynucleotide/expression sequence.
  • vector refers to viral expression systems, autonomous self-replicating circular DNA (plasmids), and includes both expression and nonexpression plasmids.
  • gene as used herein is intended to refer to a nucleic acid sequence which encodes a polypeptide. This definition includes various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the gene product.
  • gene is intended to include not only coding sequences but also regulatory regions such as promoters, enhancers, termination regions and similar untranslated nucleotide sequences. The term further includes all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites.
  • Mammalian host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells.
  • yeast strains include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida or any yeast strain capable of expressing heterologous proteins.
  • Potentially suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it may be necessary to modify the protein produced therein, for example by phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional protein.
  • the protein may also be produced by operably linking the isolated polynucleotide of the invention to suitable control sequences in one or more insect expression vectors, and employing an insect expression system.
  • Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego, California, U.S.A. (the MaxBac® kit), and such methods are well known in the art, as described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by reference.
  • an insect cell capable of expressing a polynucleotide of the present invention is "transformed."
  • the protein of the invention may be prepared by culturing transformed host cells under culture conditions suitable to express the recombinant protein.
  • the polymorphic protein of the invention may also be expressed as a product of transgenic animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized by somatic or germ cells containing a nucleotide sequence encoding the protein.
  • the protein may also be produced by known conventional chemical synthesis. Methods for constructing the proteins of the present invention by synthetic means are known to those skilled in the art.
  • the polymorphic proteins produced by recombinant DNA technology may be purified by techniques commonly employed to isolate or purify recombinant proteins.
  • Recombinantly produced proteins can be directly expressed or expressed as a fusion protein.
  • the protein is then purified by a combination of cell lysis (e.g., sonication) and affinity chromatography.
  • subsequent digestion of the fusion protein with an appropriate proteolytic enzyme releases the desired polypeptide.
  • the polypeptides of this invention may be purified to substantial purity by standard techniques well known in the art, including selective precipitation with such substances as ammonium sulfate, column chromatography, immunopurification methods, and others. See, for instance, R.
  • antibodies may be raised to the proteins of the invention as described herein.
  • Cell membranes are isolated from a cell line expressing the recombinant protein, the protein is extracted from the membranes and immunoprecipitated. The proteins may then be further purified by standard protein chemistry techniques as described above.
  • the resulting expressed protein may then be purified from such culture (i.e., from culture medium or cell extracts) using known purification processes, such as gel filtration and ion exchange chromatography.
  • the purification of the protein may also include an affinity column containing agents which will bind to the protein; one or more column steps over such affinity resins as concanavalin A-agarose, heparin-Toyopearl® or Cibacron blue 3GA Sepharose B; one or more steps involving hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.
  • the protein of the invention may also be expressed in a form which will facilitate purification.
  • fusion protein such as those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX). Kits for expression and purification of such fusion proteins are commercially available from New England BioLab (Beverly, MA), Pharmacia (Piscataway, NJ) and InVitrogen, respectively.
  • the protein can also be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope.
  • One such epitope ("Flag") is commercially available from Kodak (New Haven, CT).
  • RP-HPLC reverse-phase high performance liquid chromatography
  • hydrophobic RP-HPLC media e.g., silica gel having pendant methyl or other aliphatic groups
  • antibody refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen, such as a polymorphic protein.
  • Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab and F(ab')2 fragments, and an Fab expression library.
  • antibodies to human polymorphic proteins are disclosed.
  • the specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample.
  • Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein.
  • immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein.
  • solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, a Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.
  • Antibodies that immunospecifically bind to polymorphic gene products but not to the corresponding prototypical or "wild-type" gene products are also provided.
  • Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide.
  • Monoclonal antibodies are screened as are described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product.
  • An isolated polymorphic protein, or a portion or fragment thereof, can be used as an immunogen to generate the antibody that binds the polymorphic protein using standard techniques for polyclonal and monoclonal antibody preparation.
  • the full-length polymorphic protein can be used or, alternatively, the invention provides antigenic peptide fragments of the polymorphic protein for use as immunogens.
  • the antigenic peptide of a polymorphic protein of the invention comprises at least 8 amino acid residues of the amino acid sequence encompassing the polymorphic amino acid and encompasses an epitope of the polymorphic protein such that an antibody raised against the peptide forms a specific immune complex with the polymorphic protein.
  • the antigenic peptide comprises at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
  • Preferred epitopes encompassed by the antigenic peptide are regions of the polymorphic protein that are located on the surface of the protein, e.g., hydrophilic regions.
  • For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, goat, mouse or other mammal) may be immunized by injection with the polymorphic protein.
  • An appropriate immunogenic preparation can contain, for example, recombinantly expressed polymorphic protein or a chemically synthesized polymorphic polypeptide. The preparation can further include an adjuvant.
  • adjuvants used to increase the immunological response include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.), human adjuvants such as Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents.
  • the antibody molecules directed against polymorphic proteins can be isolated from the mammal (e.g., from the blood) and further purified by well known techniques, such as protein A chromatography, to obtain the IgG fraction.
  • monoclonal antibody or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that originates from the clone of a single hybridoma cell, and that contains only one type of antigen binding site capable of immunoreacting with a particular epitope of a polymorphic protein.
  • a monoclonal antibody composition thus typically displays a single binding affinity for a particular polymorphic protein with which it immunoreacts.
  • any technique that provides for the production of antibody molecules by continuous cell line culture may be utilized.
  • Such techniques include, but are not limited to, the hybridoma technique (see Kohler & Milstein, 1975 Nature 256: 495-497); the trioma technique; the human B-cell hybridoma technique (see Kozbor, et al, 1983 Immunol Today 4: 72) and the EBV hybridoma technique to produce human monoclonal antibodies (see Cole, et al, 1985 In: MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).
  • Human monoclonal antibodies may be utilized in the practice of the present invention and may be produced by using human hybridomas (see Cote, et al, 1983.
  • techniques can be adapted for the production of single-chain antibodies specific to a polymorphic protein (see e.g., U.S. Patent No. 4,946,778).
  • methodologies can be adapted for the construction of Fab expression libraries (see e.g., Huse, et al, 1989 Science 246: 1275-1281) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a polymorphic protein or derivatives, fragments, analogs or homologs thereof.
  • Non-human antibodies can be "humanized" by techniques well known in the art. See e.g., U.S. Patent No. 5,225,539.
  • Antibody fragments that contain the idiotypes to a polymorphic protein may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.
  • recombinant anti-polymorphic protein antibodies such as chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, which can be made using standard recombinant DNA techniques, are within the scope of the invention.
  • chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art, for example using methods described in PCT International Application No. PCT/US86/02269; European Patent Application No. 184,187; European Patent Application No. 171,496; European Patent Application No. 173,494; PCT
  • methodologies for the screening of antibodies that possess the desired specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and other immunologically-mediated techniques known within the art.
  • Anti-polymorphic protein antibodies may be used in methods known within the art relating to the detection, quantitation and/or cellular or tissue localization of a polymorphic protein (e.g., for use in measuring levels of the polymorphic protein within appropriate physiological samples, for use in diagnostic methods, for use in imaging the protein, and the like).
  • antibodies for polymorphic proteins, or derivatives, fragments, analogs or homologs thereof, that contain the antibody-derived CDR are utilized as pharmacologically-active compounds in therapeutic applications intended to treat a pathology in a subject that arises from the presence of the cSNP allele in the subject.
  • An anti-polymorphic protein antibody (e.g., monoclonal antibody) can be used to isolate polymorphic proteins by a variety of immunochemical techniques, such as immunoaffinity chromatography or immunoprecipitation.
  • An anti-polymorphic protein antibody can facilitate the purification of natural polymorphic protein from cells and of recombinantly produced polymorphic proteins expressed in host cells.
  • an anti-polymorphic protein antibody can be used to detect polymorphic protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the polymorphic protein.
  • Anti-polymorphic antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance.
  • detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials.
  • suitable enzymes include horseradish peroxidase, alkaline phosphatase,
  • examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin;
  • suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin;
  • an example of a luminescent material is luminol;
  • examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Toxicology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides nucleic acids containing single-nucleotide polymorphisms identified for transcribed human sequences, as well as methods of using the nucleic acids.

Description

NUCLEIC ACIDS CONTAINING SINGLE NUCLEOTIDE POLYMORPHISMS AND METHODS OF USE THEREOF
BACKGROUND OF THE INVENTION
Sequence polymorphism-based analysis of nucleic acid sequences can augment or replace previously known methods for determining the identity and relatedness of individuals. The approach is generally based on alterations in nucleic acid sequences between related individuals. This analysis has been widely used in a variety of genetic, diagnostic, and forensic applications. For example, polymorphism analyses are used in identity and paternity analysis, and in genetic mapping studies.
One such type of variation is a restriction fragment length polymorphism (RFLP). RFLPs can create or delete a recognition sequence for a restriction endonuclease in one nucleic acid relative to a second nucleic acid. The result of the variation is an alteration in the relative length of restriction enzyme-generated DNA fragments in the two nucleic acids.
Other polymorphisms take the form of short tandem repeat (STR) sequences, which are also referred to as variable number of tandem repeat (VNTR) sequences. STR sequences typically include tandem repeats of 2, 3, or 4 nucleotide sequences that are present in a nucleic acid from one individual but absent from a second, related individual at the corresponding genomic location.
Other polymorphisms take the form of single nucleotide variations, termed single nucleotide polymorphisms (SNPs), between individuals. A SNP can, in some instances, be referred to as a "cSNP" to denote that the nucleotide sequence containing the SNP originates as a cDNA.
SNPs can arise in several ways. A single nucleotide polymorphism may arise due to a substitution of one nucleotide for another at the polymorphic site. Substitutions can be transitions or transversions. A transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine, or the converse.
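The transition/transversion distinction described above can be illustrated with a minimal Python sketch. The function name `classify_substitution` is hypothetical and is introduced here only for illustration; it is not part of the disclosure.

```python
# Minimal sketch: classify a single-nucleotide substitution.
# Purines are A and G; pyrimidines are C and T (U in RNA).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a ref -> alt substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("ref and alt must each be one of A, C, G, T, U")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
    # is a transition; a change of class is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` falls in the transition class, while `classify_substitution("A", "T")` falls in the transversion class.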
Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele. Thus, the polymorphic site is a site at which one allele bears a gap with respect to a single nucleotide in another allele.
Some SNPs occur within or near genes. One such class includes SNPs falling within regions of genes encoding a polypeptide product. These SNPs may result in an alteration of the amino acid sequence of the polypeptide product and give rise to the expression of a defective or other variant protein. Such variant products can, in some cases, result in a pathological condition, e.g., genetic disease. Examples of genetic diseases in which a polymorphism within a coding sequence gives rise to the disease include sickle cell anemia and cystic fibrosis. Other SNPs do not result in alteration of the polypeptide product. Of course, SNPs can also occur in noncoding regions of genes.
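How a coding-region SNP may or may not alter the polypeptide product can be sketched as follows. This is an illustrative sketch only: it assumes the standard genetic code, DNA codons on the coding strand, and a hypothetical helper named `snp_effect`. The example codon change GAG to GTG (glutamate to valine) is the substitution classically associated with sickle cell anemia.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def snp_effect(codon: str, pos: int, alt: str):
    """Return (reference amino acid, variant amino acid, effect label).

    `codon` is a coding-strand DNA codon, `pos` is the 0-based position
    of the SNP within the codon, and `alt` is the variant nucleotide.
    """
    codon, alt = codon.upper(), alt.upper()
    if len(codon) != 3 or not 0 <= pos <= 2:
        raise ValueError("codon must have 3 bases and pos must be 0, 1 or 2")
    variant = codon[:pos] + alt + codon[pos + 1:]
    ref_aa, var_aa = CODON_TABLE[codon], CODON_TABLE[variant]
    if ref_aa == var_aa:
        kind = "silent"        # polypeptide product unchanged
    elif var_aa == "*":
        kind = "nonsense"      # premature stop codon
    elif ref_aa == "*":
        kind = "stop-lost"     # stop codon replaced by an amino acid
    else:
        kind = "missense"      # altered amino acid sequence
    return ref_aa, var_aa, kind
```

Under these assumptions, `snp_effect("GAG", 1, "T")` reports a change from E to V, whereas `snp_effect("GAA", 2, "G")` leaves the encoded amino acid unchanged.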
SNPs tend to occur with great frequency and are spaced uniformly throughout the genome. The frequency and uniformity of SNPs means that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest.
SUMMARY OF THE INVENTION
The invention is based in part on the discovery of novel single nucleotide polymorphisms (SNPs) in regions of human DNA.
Accordingly, in one aspect, the invention provides an isolated polynucleotide which includes one or more of the SNPs described herein. The polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 and the Sequence Listing (SEQ ID NOS: 1 - 7867) and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site. The polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences (SEQ ID NOS: 1-7867), or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
The polynucleotide can be, e.g., DNA or RNA, and can be between about 10 and about 100 nucleotides, e.g., 10-90, 10-75, 10-51, 10-40, or 10-30 nucleotides in length.
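As a rough illustration of a fragment that retains the polymorphic site, and of its complementary sequence, consider the following Python sketch. The function names, the 0-based site index, and the treatment of "about 10" and "about 100" as hard limits are all assumptions made for this example; no sequence from the Sequence Listing is used.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fragment_with_site(seq: str, site: int, length: int):
    """Return (fragment, offset) where the fragment has `length`
    nucleotides and contains the polymorphic site.

    `site` is the 0-based index of the polymorphic site in `seq`;
    `offset` is its 0-based index within the returned fragment.
    The 10..100 bounds are treated as hard limits for this sketch.
    """
    if not 10 <= length <= 100:
        raise ValueError("length must be between 10 and 100")
    if length > len(seq):
        raise ValueError("length exceeds sequence length")
    if not 0 <= site < len(seq):
        raise ValueError("site is outside the sequence")
    # Center the window on the site where possible, then clamp the
    # window so that it stays inside the sequence.
    start = site - length // 2
    start = max(0, min(start, len(seq) - length))
    return seq[start:start + length], site - start
```

A fragment of the complementary sequence can then be obtained by applying `reverse_complement` to the returned fragment.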
In some embodiments, the polymorphic site in the polymorphic sequence includes a nucleotide other than the nucleotide listed in Table 1, column 5 for the polymorphic sequence, e.g., the polymorphic site includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence. In other embodiments, the complement of the polymorphic site includes a nucleotide other than the complement of the nucleotide listed in Table 1, column 5 for the complement of the polymorphic sequence, e.g., the complement of the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
In some embodiments, the polymorphic sequence is associated with a polypeptide related to one of the protein families disclosed herein. For example, the nucleic acid may be associated with a polypeptide related to an ATPase associated protein, a cadherin, or any of the other proteins identified in Table 1, column 10.
In another aspect, the invention provides an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site. The first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS: 1 - 7867), provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence. Alternatively, the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences (SEQ ID NOS: 1 - 7867), provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
In some embodiments, the oligonucleotide does not hybridize under stringent conditions to a second polynucleotide. The second polynucleotide can be, e.g., (a) a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS: 1 - 7867), wherein the polymorphic sequence includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence; (b) a nucleotide sequence that is a fragment of any of the polymorphic sequences; (c) a complementary nucleotide sequence including a sequence complementary to one or more polymorphic sequences (SEQ ID NOS: 1 - 7867), wherein the polymorphic sequence includes the complement of the nucleotide listed in Table 1, column 5; and (d) a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence. The oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
The invention also provides a method of detecting a polymorphic site in a nucleic acid. The method includes contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The method also includes determining whether the nucleic acid and the oligonucleotide hybridize. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphic site in the nucleic acid.
In preferred embodiments, the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
The oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
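The detection logic above can be sketched in Python under a deliberately simplified assumption: an oligonucleotide is treated as hybridizing only when it is perfectly complementary to a stretch of the target over its full length. Real hybridization depends on stringency, temperature, and salt conditions, which are not modeled here. The flanking sequences and alleles below are invented for illustration and are not taken from Table 1 or the Sequence Listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridizes(oligo: str, target: str) -> bool:
    """Simplified model: True only if the oligo is perfectly
    complementary to some stretch of the single-stranded target."""
    return reverse_complement(oligo) in target.upper()


# Hypothetical flanking sequence around a polymorphic site.
LEFT, RIGHT = "ACCTGACTCCTG", "GGAGAAGTCTGC"
REFERENCE_TARGET = LEFT + "A" + RIGHT   # stands in for a "column 5" allele
VARIANT_TARGET = LEFT + "T" + RIGHT     # stands in for a "column 6" allele

# Allele-specific probe: complementary to the variant allele, spanning
# 8 nucleotides on either side of the polymorphic site (17 bases).
PROBE = reverse_complement(LEFT[-8:] + "T" + RIGHT[:8])

if __name__ == "__main__":
    print("variant target:  ", hybridizes(PROBE, VARIANT_TARGET))
    print("reference target:", hybridizes(PROBE, REFERENCE_TARGET))
```

In this model the probe pairs with the variant target but not with the reference target, which corresponds to the preferred embodiment in which the oligonucleotide does not hybridize to the sequence carrying the nucleotide recited in Table 1, column 5.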
In some embodiments, the polymorphic sequence identified by the oligonucleotide is associated with a polypeptide related to one of the protein families disclosed herein. For example, the nucleic acid may be associated with a polypeptide related to an ATPase associated protein, cadherin, or any of the other protein families identified in Table 1, column 10.
In another aspect, the method includes determining if a sequence polymorphism is present in a subject, such as a human. The method includes providing a nucleic acid from the subject and contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. Hybridization between the nucleic acid and the oligonucleotide is then determined. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphism in said subject.
In a further aspect, the invention provides a method of determining the relatedness of a first and second nucleic acid. The method includes providing a first nucleic acid and a second nucleic acid and contacting the first nucleic acid and the second nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The method also includes determining whether the first nucleic acid and the second nucleic acid hybridize to the oligonucleotide, and comparing hybridization of the first and second nucleic acids to the oligonucleotide. Hybridization of both the first and second nucleic acids to the oligonucleotide indicates that the first and second nucleic acids are related.
In preferred embodiments, the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
The oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
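The comparison step of the relatedness method can be sketched as follows. This sketch extends the single-oligonucleotide comparison to a hypothetical panel of oligonucleotides and summarizes agreement as a concordance fraction; both the panel and the concordance measure are assumptions made for illustration and are not described in the text. Hybridization is again modeled as perfect complementarity only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hybridizes(oligo: str, target: str) -> bool:
    """Simplified model: perfect complementarity over the full oligo."""
    return reverse_complement(oligo) in target.upper()


def hybridization_profile(sample: str, probes) -> tuple:
    """Record, for each probe, whether it hybridizes to the sample."""
    return tuple(hybridizes(p, sample) for p in probes)


def concordance(first: str, second: str, probes) -> float:
    """Fraction of probes giving the same result for both samples."""
    if not probes:
        raise ValueError("at least one probe is required")
    a = hybridization_profile(first, probes)
    b = hybridization_profile(second, probes)
    return sum(x == y for x, y in zip(a, b)) / len(probes)
```

Identical hybridization patterns give a concordance of 1.0; how any particular value would be interpreted in a forensic or paternity setting is outside the scope of this sketch.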
The method can be used in a variety of applications. For example, the first nucleic acid may be isolated from physical evidence gathered at a crime scene, and the second nucleic acid may be obtained from a person suspected of having committed the crime. Matching the two nucleic acids using the method can establish whether the physical evidence originated from the person.
In another example, the first sample may be from a human male suspected of being the father of a child and the second sample may be from the child. Establishing a match using the described method can establish whether the male is the father of the child.
In another aspect, the invention provides an isolated polypeptide comprising a polymorphic site at one or more amino acid residues, and wherein the protein is encoded by a polynucleotide including one of the polymorphic sequences SEQ ID NOS: 1-7867, or their complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
The polypeptide can be, e.g., related to one of the protein families disclosed herein. For example, the polypeptide can be related to an ATPase associated protein, cadherin, or any of the other proteins provided in Table 1, column 10.
In some embodiments, the polypeptide is translated in the same open reading frame as is a wild type protein whose amino acid sequence is identical to the amino acid sequence of the polymorphic protein except at the site of the polymorphism.
In some embodiments, the polypeptide encoded by the polymorphic sequence, or its complement, includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence, or the complement includes the complement of the nucleotide listed in Table 1, column 6.
The invention also provides an antibody that binds specifically to a polypeptide encoded by a polynucleotide comprising a nucleotide sequence encoded by a polynucleotide selected from the group consisting of polymorphic sequences SEQ ID NOS: 1-7867, or its complement. The polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
In some embodiments, the antibody binds specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
Preferably, the antibody does not bind specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.
The invention further provides a method of detecting the presence of a polypeptide having one or more amino acid residue polymorphisms in a subject. The method includes providing a protein sample from the subject and contacting the sample with the above-described antibody under conditions that allow for the formation of antibody-antigen complexes. The antibody-antigen complexes are then detected. The presence of the complexes indicates the presence of the polypeptide.
The invention also provides a method of treating a subject suffering from, at risk for, or suspected of suffering from a pathology ascribed to the presence of a sequence polymorphism in a subject, e.g., a human, non-human primate, cat, dog, rat, mouse, cow, pig, goat, or rabbit. The method includes providing a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1 - 7867, or its complement, and treating the subject by administering to the subject an effective dose of a therapeutic agent. Aberrant expression can include qualitative alterations in expression of a gene, e.g., expression of a gene encoding a polypeptide having an altered amino acid sequence with respect to its wild-type counterpart. Qualitatively different polypeptides can include shorter, longer, or altered polypeptides relative to the amino acid sequence of the wild-type polypeptide. Aberrant expression can also include quantitative alterations in expression of a gene. Examples of quantitative alterations in gene expression include lower or higher levels of expression of the gene relative to its wild-type counterpart, or alterations in the temporal or tissue-specific expression pattern of a gene. Finally, aberrant expression may also include a combination of qualitative and quantitative alterations in gene expression.
The therapeutic agent can be administered to a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence. The therapeutic agent can include, e.g., a second nucleic acid comprising the polymorphic sequence, provided that the second nucleic acid comprises the nucleotide present in the wild type allele. In some embodiments, the second nucleic acid sequence comprises a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.
Alternatively, the therapeutic agent can be a polypeptide encoded by a polynucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS:1 - 7867, or by a polynucleotide comprising a nucleotide sequence that is complementary to any one of the polymorphic sequences SEQ ID NOS:1 - 7867, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
The therapeutic agent may further include an antibody as herein described, or an oligonucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS:1 - 7867, or comprising a nucleotide sequence that is complementary to any one of the polymorphic sequences SEQ ID NOS:1 - 7867, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 5 or Table 1, column 6 for the polymorphic sequence.
In another aspect, the invention provides an oligonucleotide array comprising one or more oligonucleotides hybridizing to a first polynucleotide at a polymorphic site encompassed therein. The first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS:1 - 7867); a nucleotide sequence that is a fragment of any of the nucleotide sequences, provided that the fragment includes a polymorphic site in the polymorphic sequence; a complementary nucleotide sequence comprising a sequence complementary to one or more polymorphic sequences (SEQ ID NOS:1 - 7867); or a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
In preferred embodiments, the array comprises 10; 100; 1,000; 10,000; 100,000 or more oligonucleotides.
The invention also provides a kit comprising one or more of the herein-described nucleic acids. The kit can include, e.g., a polynucleotide which includes one or more of the SNPs described herein. The polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 and the Sequence Listing (SEQ ID NOS:1 - 7867) and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site. The polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences (SEQ ID NOS:1 - 7867), or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
The invention provides an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site. The first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences (SEQ ID NOS:1 - 7867), provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence. Alternatively, the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences (SEQ ID NOS:1 - 7867), provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5. The first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Other features and advantages of the invention will be apparent from the following detailed description and claims.
DETAILED DESCRIPTION OF THE INVENTION
The invention provides human SNPs in sequences which are transcribed, i.e., are cSNPs. Many SNPs have been identified in genes related to polypeptides of known function. If desired, SNPs associated with various polypeptides can be used together. For example, SNPs can be grouped according to whether they are derived from a nucleic acid encoding a polypeptide related to a particular protein family or involved in a particular function. Similarly, SNPs can be grouped according to the functions played by their gene products. Such functions include structural proteins and proteins which are associated with metabolic pathways, including fatty acid metabolism, glycolysis, intermediary metabolism, calcium metabolism, proteases, and amino acid metabolism, etc. Specifically, the present invention provides a large number of human cSNPs based on at least one gene product that has not been previously identified. In contrast, and as defined specifically in the following paragraph, the cSNPs involve nucleic acid sequences that are assembled from at least one known sequence.
7867 distinct polymorphic sites were identified by the present inventors, using the following procedure. Raw traces underlying sequence data were drawn from public databases and from the proprietary database of the Assignee of the present invention. The sequences were obtained by calling the bases from these traces, and included assigning "Phred" quality scores for each called base. For each allelic set, at the polynucleotide level, four or more nucleotide sequences were identified having at least partial overlap with one another.
As illustrated in FIG. 1, these four or more sequences could be clustered and assembled to make a consensus contig that included an ORF. In this way, the inventors found that the assembled contigs defined associated sets of two, or possibly more than two, alleles defined by an SNP at a particular polymorphic site. In order to be confirmed as a SNP site, the nucleotide change from the consensus sequence had to occur in at least two individual sequences, and had to have a "Phred" score of 23 or higher at the site of the presumed SNP. Furthermore, in a window of 5 bases on either side of the SNP, no more than 50% mismatching with the consensus sequence was allowed. In the assembly leading to each of the contigs defining the allelic set, the SNP alleles occur in polynucleotides found in public databases. Furthermore, it was found that the assembled contigs defined associated sets of two, or possibly more than two, alleles defined by an SNP at a particular polymorphic site. These associations were not previously known. The SNPs are presented in Table 1.
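The confirmation criteria stated above (a non-consensus base seen in at least two individual sequences, a "Phred" score of 23 or higher at the candidate site, and no more than 50% mismatching with the consensus in a window of 5 bases on either side) can be sketched as a small filter. This is only an illustration under stated assumptions, not the software used by the inventors: the names `AlignedRead`, `supports_variant` and `is_confirmed_snp` are hypothetical, reads are assumed to be aligned column-for-column to the consensus without gaps, and the quality and window tests are assumed to apply to each supporting read individually.

```python
from dataclasses import dataclass

MIN_SUPPORTING_READS = 2     # change must occur in at least two sequences
MIN_PHRED = 23               # minimum "Phred" score at the candidate site
WINDOW = 5                   # bases examined on either side of the site
MAX_WINDOW_MISMATCH = 0.5    # at most 50% mismatching inside the window


@dataclass
class AlignedRead:
    bases: str   # assumed gapless, same coordinates as the consensus
    quals: list  # one Phred score per base


def supports_variant(read, consensus, pos):
    """Return True if the read shows a clean, high-quality variant at pos."""
    if read.bases[pos] == consensus[pos]:
        return False
    if read.quals[pos] < MIN_PHRED:
        return False
    lo = max(0, pos - WINDOW)
    hi = min(len(consensus), pos + WINDOW + 1)
    flank = [i for i in range(lo, hi) if i != pos]
    if not flank:
        return True
    mismatches = sum(read.bases[i] != consensus[i] for i in flank)
    return mismatches / len(flank) <= MAX_WINDOW_MISMATCH


def is_confirmed_snp(reads, consensus, pos):
    """Return True if some variant base at pos has enough supporting reads."""
    counts = {}
    for read in reads:
        if supports_variant(read, consensus, pos):
            base = read.bases[pos]
            counts[base] = counts.get(base, 0) + 1
    return any(n >= MIN_SUPPORTING_READS for n in counts.values())
```

Counting support per variant base, rather than per site, is a design choice of this sketch; the text does not say whether two different non-consensus bases at one site would be pooled.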
At the level of translation of an ORF contained in the contigs, however, the inventors identified allelic sets in which one allele defines a known polypeptide sequence that includes the polymorphic site and another polypeptide allele is not previously known. Then, various associations of alleles are possible. For example, it is possible that an allelic pair is defined in a noncoding region of the contig containing an ORF. In such cases the inventors believe that the invention resides in the recognition of the allelic pair; this association has not heretofore been made. Alternatively, sets of allelic contigs may exist in which the polymorphic site is within an ORF, but does not result in an amino acid change among the allelic polypeptides. Here too it is believed that the invention resides in the recognition of the allelic pair; and that this association has not heretofore been made. In yet another alternative, the polymorphic site resides within an ORF and results in an amino acid change, or a frameshift, among the alleles of the allelic set. In the sets of gene products that fall within this group, at least one of the alleles at the polypeptide level is a known protein. At least one of the remaining allele or alleles in the set, carrying a variant amino acid at the polymorphic site, is a novel polypeptide not heretofore known. The invention resides at least in the recognition of the polymorphic allele as being a variant of the known reference polypeptide.
Table 1 provides information concerning the allelic sequences. One of the sequences may be termed a reference polymorphic sequence, and the corresponding second sequence includes the variant SNP at the polymorphic site. Since the reference polypeptide sequence is already known, the Sequence Listing accompanying this application provides only the sequence of the polymorphic allele, while its SEQ ID NO is provided in the Table. A reference to the SEQ ID NO that corresponds to the translated amino acid sequence is also given. The Table includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and a description of each, are given below.
SNPs disclosed in Table 1 were detected by aligning large numbers of sequences from genetically diverse sources of publicly available mRNA libraries (Clontech). Software designed specifically to look for multiple examples of variant bases differing from a consensus sequence was created and deployed. A criterion of a minimum of 2 occurrences of a sequence differing from the consensus in high quality sequence reads was used to identify an SNP.
The SNPs described herein may be useful in diagnostic kits, for DNA arrays on chips and for other uses that involve hybridization of the SNP. Specific SNPs may have utility where a disease has already been associated with that gene. Examples of possible disease correlations between the claimed SNPs with members of the genes of each classification are listed below:
Amylases
Amylase is responsible for endohydrolysis of 1,4-alpha-glucosidic linkages in oligosaccharides and polysaccharides. Variations in amylase gene may be indicative of delayed maturation and of various amylase producing neoplasms and carcinomas.
Amyloid
The serum amyloid A (SAA) proteins comprise a family of vertebrate proteins that associate predominantly with high density lipoproteins (HDL). The synthesis of certain members of the family is greatly increased in inflammation. Prolonged elevation of plasma SAA levels, as in chronic inflammation, results in a pathological condition, called amyloidosis, which affects the liver, kidney and spleen and which is characterized by the highly insoluble accumulation of SAA in these tissues. Amyloid selectively inhibits insulin-stimulated glucose utilization and glycogen deposition in muscle, while not affecting adipocyte glucose metabolism. Deposition of fibrillar amyloid proteins intraneuronally, as neurofibrillary tangles, extracellularly, as plaques and in blood vessels, is characteristic of both Alzheimer's disease and aged Down's syndrome. Amyloid deposition is also associated with type II diabetes mellitus.
Angiopoietin
Members of the angiopoietin/fibrinogen family have been shown to stimulate the generation of new blood vessels, inhibit the generation of new blood vessels, and perform several roles in blood clotting. This generation of new blood vessels, called angiogenesis, is also an essential step in tumor growth in order for the tumor to get the blood supply it needs to expand. Variation in these genes may be predictive of any form of heart disease, numerous blood clotting disorders, stroke, hypertension and predisposition to tumor formation and metastasis. In particular, these variants may be predictive of the response to various antihypertensive drugs and chemotherapeutic and anti-tumor agents.
Apoptosis-related proteins
Active cell suicide (apoptosis) is induced by events such as growth factor withdrawal and toxins. It is controlled by regulators, which have either an inhibitory effect on programmed cell death (anti-apoptotic) or block the protective effect of inhibitors (pro-apoptotic). Many viruses have found a way of countering defensive apoptosis by encoding their own anti-apoptosis genes, preventing their target cells from dying too soon. Variants of apoptosis-related genes may be useful in formulation of antiaging drugs.
Cadherin, Cyclin, Polymerase, Oncogenes, Histones, Kinases
Members of the cell division/cell cycle pathways such as cyclins, many transcription factors and kinases, DNA polymerases, histones, helicases and other oncogenes play a critical role in carcinogenesis where the uncontrolled proliferation of cells leads to tumor formation and eventually metastasis. Variation in these genes may be predictive of predisposition to any form of cancer, from increased risk of tumor formation to increased rate of metastasis. In particular, these variants may be predictive of the response to various chemotherapeutic and anti-tumor agents.
Colony-stimulating factor-related proteins
Granulocyte/macrophage colony-stimulating factors are cytokines that act in hematopoiesis by controlling the production, differentiation, and function of 2 related white cell populations of the blood, the granulocytes and the monocytes-macrophages.
Complement-related proteins
Complement proteins are immune associated cytotoxic agents, acting in a chain reaction to exterminate target cells that were opsonized (primed) with antibodies, by forming a membrane attack complex (MAC). The mechanism of killing is by opening pores in the target cell membrane. Variations in complement genes or their inhibitors are associated with many autoimmune disorders. Modified serum levels of complement products cause edemas of various tissues, lupus (SLE), vasculitis, glomerulonephritis, renal failure, hemolytic anemia, thrombocytopenia, and arthritis. They interfere with mechanisms of ADCC (antibody dependent cell cytotoxicity), severely impair immune competence and reduce phagocytic ability. Variants of complement genes may also be indicative of type I diabetes mellitus, meningitis, neurological disorders such as Nemaline myopathy, Neonatal hypotonia, muscular disorders such as congenital myopathy and other diseases.
Cytochrome
The respiratory chain is a key biochemical pathway which is essential to all aerobic cells. There are five different cytochromes involved in the chain. These are heme-bound proteins which serve as electron carriers. Modifications in these genes may be predictive of ataxia areflexia, dementia and myopathic and neuropathic changes in muscles. They may also be associated with various types of solid tumors.
Kinesins
Kinesins are tubulin molecular motors that function to transport organelles within cells and to move chromosomes along microtubules during cell division. Modifications of these genes may be indicative of neurological disorders such as Pick disease of the brain and tuberous sclerosis.
Cytokines, Interferon, Interleukin
Members of the cytokine families are known for their potent ability to stimulate cell growth and division even at low concentrations. Cytokines such as erythropoietin are cell-specific in their growth stimulation; erythropoietin is useful for the stimulation of the proliferation of erythroblasts. Variants in cytokines may be predictive for a wide variety of diseases, including cancer predisposition.
G-protein coupled receptors
G-protein coupled receptors (also called R7G) are an extensive group of receptors for hormones, neurotransmitters, odorants and light, which transduce extracellular signals by interaction with guanine nucleotide-binding (G) proteins. Alterations in genes coding for G-coupled proteins may be involved in and indicative of a vast number of physiological conditions. These include blood pressure regulation, renal dysfunctions, male infertility, dopamine associated cognitive, emotional, and endocrine functions, hypercalcemia, chondrodysplasia and osteoporosis, pseudohypoparathyroidism, growth retardation and dwarfism.
Thioesterases
Eukaryotic thiol proteases are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Variants of thioester associated genes may be predictive of neuronal disorders and mental illnesses such as Ceroid Lipofuscinosis, Neuronal 1, Infantile, Santavuori disease and more.
The SNPs are shown in Table 1 and the Sequence Listing. Both provide a summary of the polymorphic sequences disclosed herein. In the Table, a "SNP" is a polymorphic site embedded in a polymorphic sequence. The polymorphic site is occupied by a single nucleotide, which is the position of nucleotide variation between the wild type and polymorphic allelic sequences. The site is usually preceded by and followed by relatively highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations). Thus, a polymorphic sequence can include one or more of the following sequences: (1) a sequence having the nucleotide denoted in Table 1, column 5 at the polymorphic site in the polymorphic sequence; or (2) a sequence having a nucleotide other than the nucleotide denoted in Table 1, column 5 at the polymorphic site in the polymorphic sequence. An example of the latter sequence is a polymorphic sequence having the nucleotide denoted in Table 1, column 6 at the polymorphic site in the polymorphic sequence.
Nucleotide sequences for a reference-polymorphic pair are presented in Table 1.
Each cSNP entry provides information concerning the wild type nucleotide sequence as well as the corresponding sequence that includes the SNP at the polymorphic site. Since the wild type sequence is already known, the Sequence Listing accompanying this application provides only the sequence of the polymorphic allele; its SEQ ID NO: is also cross-referenced in Table 1. A reference to the SEQ ID NO: giving the translated amino acid sequence is also given if appropriate. The Table includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and an explanation for each, are given below.
"SEQ ID" provides the cross-references to the nucleotide SEQ ID NOs: for the polymorphic sequences, which are numbered consecutively, and, as explained below, amino acid SEQ ID NOs: as well, in the Sequence Listing of the application. Conversely, each sequence entry in the Sequence Listing also includes a cross-reference to the CuraGen sequence ID, under the label "CuraGen sequence ID". The first SEQ ID NO: given in the first column of each row of the Table is the SEQ ID NO: identifying the nucleic acid sequence for the polymorphisms. If a polymorphism carries an entry for an amino acid in a coding region, then a second SEQ ID NO: appears in parentheses in the column "Amino acid after" (see below) for the polymorphic amino acid sequence. The latter SEQ ID NOs: refer to amino acid sequences giving the polymorphic amino acid sequences that are the translation of the nucleotide polymorphism. If a polymorphism carries no entry for the protein portion of the row, only one SEQ ID NO: is provided, in the first column.
"Base pos. of SNP" gives the numerical position of the nucleotide in the nucleic acid at which the cSNP is found, as identified in this invention.
"Polymorphic sequence" provides a 51-base sequence with the polymorphic site at the 26th base in the sequence, as well as 25 bases from the reference sequence on the 5' side and the 3' side of the polymorphic site. The designation at the polymorphic site is enclosed in square brackets, and provides first, the reference nucleotide; second, a "slash (/)"; and third, the polymorphic nucleotide. In certain cases the polymorphism is an insertion or a deletion. In that case, the position which is "unfilled" (i.e., the reference or the polymorphic position) is indicated by the word "gap".
"Base before" provides the nucleotide present in the reference sequence at the position at which the polymorphism is found.
"Base after" provides the altered nucleotide at the position of the polymorphism.
"Amino acid before" provides the amino acid in the reference protein, if the polymorphism occurs in a coding region.
"Amino acid after" provides the amino acid in the polymorphic protein, if the polymorphism occurs in a coding region. This column also includes the SEQ ID NO: in parentheses for the translated polymorphic amino acid sequence if the polymorphism occurs in a coding region.
"Type of change" provides information on the nature of the polymorphism. "SILENT-NONCODING" is used if the polymorphism occurs in a noncoding region of a nucleic acid.
"SILENT-CODING" is used if the polymorphism occurs in a coding region of a nucleic acid and results in no change of amino acid in the translated polymorphic protein.
"CONSERVATIVE" is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in the same class as the reference amino acid. The classes are:
Aliphatic: Gly, Ala, Val, Leu, Ile;
Aromatic: Phe, Tyr, Trp;
Sulfur-containing: Cys, Met;
Aliphatic OH: Ser, Thr;
Basic: Lys, Arg, His;
Acidic: Asp, Glu, Asn, Gln;
Pro falls in none of the other classes; and
End defines a termination codon.
"NONCONSERVATIVE" is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in a different class than the reference amino acid.
"FRAMESHIFT" relates to an insertion or a deletion. If the frameshift occurs in a coding region, the Table provides the translation of the frameshifted codons 3' to the polymorphic site.
"Protein classification of CuraGen gene" provides a generic class into which the protein is classified. During the course of the work leading to the filing of the four applications identified above, approximately 100 classes of proteins were identified. "Name of protein identified following a BLASTX analysis of the CuraGen sequence" provides the database reference for the protein found to resemble the novel reference-polymorphism cognate pair most closely. (The next paragraph explains how a sequence was determined to be "novel").
"Similarity (pvalue) following a BLASTX analysis" provides the pvalue, a statistical measure from the BLASTX analysis that the polymorphic sequence is similar to, and therefore an allele of, the reference, or wild-type, sequence. In the present application, a cutoff of pvalue > 1 x 10^-50 (entered, for example, as 1.0E-50 in the Table) is used to establish that the reference-polymorphic cognate pairs are novel.
"Map location" provides any information available at the time of filing related to localization of a gene on a chromosome.
The polymorphisms are arranged in the Table in the following order.
SEQ ID NOs: 1-5696 are nucleotide sequences for SNPs that are silent.
SEQ ID NOs: 5697-6011 are nucleotide sequences for SNPs that lead to conservative amino acid changes.
SEQ ID NOs: 6012-6740 are nucleotide sequences for SNPs that lead to nonconservative amino acid changes.
SEQ ID NOs: 6741-7867 are nucleotide sequences for SNPs that involve a gap. With respect to the reference or wild-type sequence at the position of the polymorphism, the allelic cSNP introduces an additional nucleotide (an insertion) or deletes a nucleotide (a deletion). An SNP that involves a gap generates a frame shift.
SEQ ID NOs: 7868-8182 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to conservative amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
SEQ ID NOs: 8183-8911 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to nonconservative amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
SEQ ID NOs: 8912-10038 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to frameshift-induced amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
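The "Type of change" labels and the SEQ ID NO ranges listed above can be sketched together as a small lookup. This is a minimal illustration, not part of the specification: the function names are hypothetical, Pro and End are each treated as a class of their own (an assumption drawn from "Pro falls in none of the other classes"), and the nucleotide-to-amino-acid mapping is an inference. The three amino acid blocks have the same sizes as their nucleotide blocks (315, 729 and 1127 entries) and are said to mirror their order, which would correspond to a constant offset of 2171, but the text does not state this as a formula, so Table 1 remains the authority.

```python
# Amino acid classes as listed in the description of the "Type of change" column.
AMINO_ACID_CLASSES = {
    "aliphatic": {"Gly", "Ala", "Val", "Leu", "Ile"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "sulfur-containing": {"Cys", "Met"},
    "aliphatic OH": {"Ser", "Thr"},
    "basic": {"Lys", "Arg", "His"},
    "acidic": {"Asp", "Glu", "Asn", "Gln"},
    "proline": {"Pro"},   # assumption: singleton class
    "end": {"End"},       # assumption: termination codon as its own class
}
CLASS_OF = {aa: name for name, members in AMINO_ACID_CLASSES.items()
            for aa in members}


def type_of_change(coding, before=None, after=None, gap=False):
    """Return the 'Type of change' label for one polymorphism."""
    if gap:
        return "FRAMESHIFT"
    if not coding:
        return "SILENT-NONCODING"
    if before == after:
        return "SILENT-CODING"
    if CLASS_OF[before] == CLASS_OF[after]:
        return "CONSERVATIVE"
    return "NONCONSERVATIVE"


# Nucleotide SEQ ID NO blocks, in the order given for the Table.
NUCLEOTIDE_BLOCKS = [
    (1, 5696, "silent"),
    (5697, 6011, "conservative"),
    (6012, 6740, "nonconservative"),
    (6741, 7867, "gap"),
]
AMINO_ACID_OFFSET = 7868 - 5697  # inferred, see the note above


def block_for(seq_id):
    """Return the block name for a nucleotide SEQ ID NO, or None."""
    for first, last, name in NUCLEOTIDE_BLOCKS:
        if first <= seq_id <= last:
            return name
    return None


def amino_acid_seq_id(seq_id):
    """Return the inferred amino acid SEQ ID NO, or None if there is none."""
    if block_for(seq_id) in (None, "silent"):
        return None
    return seq_id + AMINO_ACID_OFFSET
```

For example, `amino_acid_seq_id(6012)` yields 8183, the first entry of the nonconservative amino acid block.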
Provided herein are compositions which include, or are capable of detecting, nucleic acid sequences having these polymorphisms, as well as methods of using nucleic acids.
IDENTIFICATION OF INDIVIDUALS CARRYING SNPs
Individuals carrying polymorphic alleles of the invention may be detected at either the DNA, the RNA, or the protein level using a variety of techniques that are well known in the art. Strategies for identification and detection are described in e.g., EP 730,663, EP 717,113, and PCT US97/02102. The present methods usually employ pre-characterized polymorphisms. That is, the genotyping location and nature of polymorphic forms present at a site have already been determined. The availability of this information allows sets of probes to be designed for specific identification of the known polymorphic forms.
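As a rough illustration of how a pre-characterized polymorphism can drive probe design, the sketch below reads a "Polymorphic sequence" entry in the bracketed form described for Table 1 (flanking bases, then the reference nucleotide, a slash and the polymorphic nucleotide in square brackets, with the word "gap" for an unfilled position) and returns a pair of probe sequences centered on the polymorphic site. The function names, the exact text layout of an entry, and the default of 8 flanking bases are assumptions made for this example; the specification does not prescribe a probe length here.

```python
import re

# Assumed entry layout: 5' flank, [reference/polymorphic], 3' flank.
_ENTRY = re.compile(
    r"^([ACGT]+)\[([ACGT]|gap)/([ACGT]|gap)\]([ACGT]+)$", re.IGNORECASE
)


def _allele(token):
    """Map the word 'gap' to an empty allele, otherwise upper-case the base."""
    return "" if token.lower() == "gap" else token.upper()


def parse_entry(entry):
    """Split an entry into (5' flank, reference, polymorphic, 3' flank)."""
    match = _ENTRY.match("".join(entry.split()))
    if match is None:
        raise ValueError(f"unrecognized polymorphic sequence entry: {entry!r}")
    five_prime, ref, alt, three_prime = match.groups()
    return five_prime.upper(), _allele(ref), _allele(alt), three_prime.upper()


def allele_specific_probes(entry, flank=8):
    """Return (reference_probe, polymorphic_probe) centered on the site."""
    five_prime, ref, alt, three_prime = parse_entry(entry)
    left = five_prime[-flank:] if flank else ""
    right = three_prime[:flank]
    return left + ref + right, left + alt + right
```

With 25 flanking bases on each side the polymorphic site sits at base 26 of the 51-base entry, so `len(parse_entry(entry)[0]) + 1` recovers that position.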
Many of the methods described below require amplification of DNA from target samples. This can be accomplished by e.g., PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H.A. Erlich, Freeman Press, NY, NY, 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis, et al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202.
The phrase "recombinant protein" or "recombinantly produced protein" refers to a peptide or protein produced using non-native cells that do not have an endogenous copy of DNA able to express the protein. In particular, as used herein, a recombinantly produced protein relates to the gene product of a polymorphic allele, i.e., a "polymorphic protein" containing an altered amino acid at the site of translation of the nucleotide polymorphism. The cells produce the protein because they have been genetically altered by the introduction of the appropriate nucleic acid sequence. The recombinant protein will not be found in association with proteins and other subcellular components normally associated with the cells producing the protein. The terms "protein" and "polypeptide" are used interchangeably herein.
The phrase "substantially purified" or "isolated" when referring to a nucleic acid, peptide or protein, means that the chemical composition is in a milieu containing fewer, or preferably, essentially none, of other cellular components with which it is naturally associated. Thus, the phrase "isolated" or "substantially pure" refers to nucleic acid preparations that lack at least one protein or nucleic acid normally associated with the nucleic acid in a host cell. It is preferably in a homogeneous state although it can be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as gel electrophoresis or high performance liquid chromatography. Generally, a substantially purified or isolated nucleic acid or protein will comprise more than 80% of all macromolecular species present in the preparation. Preferably, the nucleic acid or protein is purified to represent greater than 90% of all macromolecular species present. More preferably the nucleic acid or protein is purified to greater than 95%, and most preferably the nucleic acid or protein is purified to essential homogeneity, wherein other macromolecular species are not detected by conventional analytical procedures.
The genomic DNA used for the diagnosis may be obtained from any nucleated cells of the body, such as those present in peripheral blood, urine, saliva, buccal samples, surgical specimens, and autopsy specimens. The DNA may be used directly or may be amplified enzymatically in vitro through use of PCR (Saiki et al. Science 239:487-491 (1988)) or other in vitro amplification methods such as the ligase chain reaction (LCR) (Wu and Wallace Genomics 4:560-569 (1989)), strand displacement amplification (SDA) (Walker et al. Proc. Natl. Acad. Sci. U.S.A. 89:392-396 (1992)), or self-sustained sequence replication (3SR) (Fahy et al. PCR Methods Appl. 1:25-33 (1992)), prior to mutation analysis.
The method for preparing nucleic acids in a form that is suitable for mutation detection is well known in the art. A "nucleic acid" is a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, including known analogs of natural nucleotides unless otherwise indicated. The term "nucleic acids", as used herein, refers to either DNA or RNA. "Nucleic acid sequence" or "polynucleotide sequence" refers to a single-stranded sequence of deoxyribonucleotide or ribonucleotide bases read from the 5' end to the 3' end. The direction of 5' to 3' addition of nascent RNA transcripts is referred to as the transcription direction; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 5' end of the RNA transcript in the 5' direction are referred to as "upstream sequences"; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 3' end of the RNA transcript in the 3' direction are referred to as "downstream sequences". The term includes self-replicating plasmids, infectious polymers of DNA or RNA and nonfunctional DNA or RNA. The complement of any nucleic acid sequence of the invention is understood to be included in the definition of that sequence. "Nucleic acid probes" may be DNA or RNA fragments.
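Because the complement of each sequence is included in its definition, and sequences are read 5' to 3', it can help to see how a sequence and the position of its polymorphic site carry over to the complementary strand. The following is a minimal sketch for DNA only; the function names are hypothetical and ambiguity codes are not handled.

```python
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_DNA_COMPLEMENT)[::-1]


def site_on_complement(position, length):
    """Map a 1-based site position onto the complementary strand."""
    return length - position + 1
```

For a 51-base polymorphic sequence with the site at base 26, `site_on_complement(26, 51)` is again 26, so the site stays centered on either strand.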
The detection of polymorphisms in specific DNA sequences can be accomplished by a variety of methods including, but not limited to, restriction-fragment-length-polymorphism detection based on allele-specific restriction-endonuclease cleavage (Kan and Dozy Lancet ii:910-912 (1978)), hybridization with allele-specific oligonucleotide probes (Wallace et al. Nucl. Acids Res. 6:3543-3557 (1978)), including immobilized oligonucleotides (Saiki et al. Proc. Natl. Acad. Sci. U.S.A. 86:6230-6234 (1989)) or oligonucleotide arrays (Maskos and Southern Nucl. Acids Res. 21:2269-2270 (1993)), allele-specific PCR (Newton et al. Nucl. Acids Res. 17:2503-2516 (1989)), mismatch-repair detection (MRD) (Faham and Cox Genome Res. 5:474-482 (1995)), binding of MutS protein (Wagner et al. Nucl. Acids Res. 23:3944-3948 (1995)), denaturing-gradient gel electrophoresis (DGGE) (Fisher and Lerman Proc. Natl. Acad. Sci. U.S.A. 80:1579-1583 (1983)), single-strand-conformation-polymorphism detection (Orita et al. Genomics 5:874-879 (1983)), RNAase cleavage at mismatched base-pairs (Myers et al. Science 230:1242 (1985)), chemical (Cotton et al. Proc. Natl. Acad. Sci. U.S.A. 85:4397-4401 (1988)) or enzymatic (Youil et al. Proc. Natl. Acad. Sci. U.S.A. 92:87-91 (1995)) cleavage of heteroduplex DNA, methods based on allele-specific primer extension (Syvanen et al. Genomics 8:684-692 (1990)), genetic bit analysis (GBA) (Nikiforov et al. Nucl. Acids Res. 22:4167-4175 (1994)), the oligonucleotide-ligation assay (OLA) (Landegren et al. Science 241:1077 (1988)), the allele-specific ligation chain reaction (LCR) (Barany Proc. Natl. Acad. Sci. U.S.A. 88:189-193 (1991)), gap-LCR (Abravaya et al. Nucl. Acids Res. 23:675-682 (1995)), radioactive and/or fluorescent DNA sequencing using standard procedures well known in the art, and peptide nucleic acid (PNA) assays (Orum et al., Nucl. Acids Res. 21:5332-5356 (1993); Thiede et al. Nucl. Acids Res. 24:983-984 (1996)).
"Specific hybridization" or "selective hybridization" refers to the binding, or duplexing, of a nucleic acid molecule only to a second particular nucleotide sequence to which the nucleic acid is complementary, under suitably stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA). "Stringent conditions" are conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and are different in different circumstances. Longer sequences hybridize specifically at higher temperatures than shorter ones. Generally, stringent conditions are selected such that the temperature is about 5°C lower than the thermal melting point (Tm) for the specific sequence to which hybridization is intended to occur at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the target sequence hybridizes to the complementary probe at equilibrium. Typically, stringent conditions include a salt concentration of at least about 0.01 to about 1.0 M Na ion concentration (or other salts), at pH 7.0 to 8.3. The temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) . Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. For example, conditions of 5X SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30°C are suitable for allele-specific probe hybridization.
"Complementary" or "target" nucleic acid sequences refer to those nucleic acid sequences which selectively hybridize to a nucleic acid probe. Proper annealing conditions depend, for example, upon a probe's length, base composition, and the number of mismatches and their position on the probe, and must often be determined empirically. For discussions of nucleic acid probe design and annealing conditions, see, for example, Sambrook et al., or Current Protocols in Molecular Biologv, F. Ausubel et al., ed., Greene Publishing and Wiley-Interscience, New York (1987).
A perfectly matched probe has a sequence perfectly complementary to a particular target sequence. The test probe is typically perfectly complementary to a portion of the target sequence. A "polymorphic" marker or site is the locus at which a sequence difference occurs with respect to a reference sequence. Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTRs), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. The reference allelic form may be, for example, the most abundant form in a population, or the first allelic form to be identified, and other allelic forms are designated as alternative, variant or polymorphic alleles. The allelic form occurring most frequently in a selected population is sometimes referred to as the "wild type" form, and herein may also be referred to as the "reference" form. Diploid organisms may be homozygous or heterozygous for allelic forms. A diallelic polymorphism has two distinguishable forms (i.e., base sequences), and a triallelic polymorphism has three such forms.
As used herein, an "oligonucleotide" is a single-stranded nucleic acid ranging in length from 2 to about 60 bases. Oligonucleotides are often synthetic but can also be produced from naturally occurring polynucleotides. A probe is an oligonucleotide capable of binding to a target nucleic acid of a complementary sequence through one or more types of chemical bonds, usually through complementary base pairing via hydrogen bond formation. Oligonucleotide probes are often between 5 and 60 bases, and, in specific embodiments, may be between 10-40, or 15-30 bases long. An oligonucleotide probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in an oligonucleotide probe may be joined by a linkage other than a phosphodiester bond, such as a phosphoramidite linkage or a phosphorothioate linkage, or they may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than by phosphodiester bonds, so long as the linkage does not interfere with hybridization.
As used herein, the term "primer" refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and a polymerization agent, such as DNA polymerase, RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not be perfectly complementary to the exact sequence of the template, but should be sufficiently complementary to hybridize with it. The term "primer site" refers to the sequence of the target DNA to which a primer hybridizes. The term "primer pair" refers to a set of primers including a 5' (upstream) primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
DNA fragments can be prepared, for example, by digesting plasmid DNA, or by use of PCR. Oligonucleotides for use as primers or probes are chemically synthesized by methods known in the field of the chemical synthesis of polynucleotides, including by way of non-limiting example the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett. 22:1859-1862 (1981) and the triester method provided by Matteucci et al., J. Am. Chem. Soc. 103:3185 (1981), both incorporated herein by reference. These syntheses may employ an automated synthesizer, as described in Needham-VanDevanter, D.R., et al., Nucleic Acids Res. 12:6159-6168 (1984). Purification of oligonucleotides may be carried out by either native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson, J.D. and Regnier, F.E., J. Chrom. 255:137-149 (1983). A double-stranded fragment may then be obtained, if desired, by annealing appropriate complementary single strands together under suitable conditions or by synthesizing the complementary strand using a DNA polymerase with an appropriate primer sequence. Where a specific sequence for a nucleic acid probe is given, it is understood that the complementary strand is also identified and included. The complementary strand will work equally well in situations where the target is a double-stranded nucleic acid.
The sequence of the synthetic oligonucleotide or of any nucleic acid fragment can be obtained using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1989), which is incorporated herein by reference and is hereinafter referred to as "Sambrook et al."; Zyskind et al., Recombinant DNA Laboratory Manual, Acad. Press, New York (1988)). Oligonucleotides useful in diagnostic assays are typically at least 8 consecutive nucleotides in length, and may range upwards of 18 nucleotides in length to greater than 100 or more consecutive nucleotides.
Another aspect of the invention pertains to isolated antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the SNP-containing nucleotide sequences of the invention, or fragments, analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, about 25, about 50, or about 60 nucleotides or an entire SNP coding strand, or to only a portion thereof.
In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of the coding strand of a polymorphic nucleotide sequence of the invention. The term "coding region" refers to the region of the nucleotide sequence comprising codons which are translated into amino acids. In another embodiment, the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences which flank the coding region and are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
Given the coding strand sequences disclosed herein, antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing. For example, the antisense nucleic acid molecule can generally be complementary to the entire coding region of an mRNA, but more preferably as embodied herein, it is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of the mRNA. An antisense oligonucleotide can range in length between about 5 and about 60 nucleotides, preferably between about 10 and about 45 nucleotides, more preferably between about 15 and 40 nucleotides, and still more preferably between about 15 and 30 in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
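As an illustration of the Watson-Crick design rule described above, the following minimal Python sketch derives a DNA antisense oligonucleotide from a portion of a coding-strand sequence. The function names, the 0-based `start` convention and the length check are assumptions made for this sketch; Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Hypothetical helpers for illustration only; not part of the disclosure.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """Antisense oligonucleotide (written 5' to 3') against a portion of
    the coding strand.

    start  : 0-based index of the first targeted base (assumed convention)
    length : oligonucleotide length; the text gives about 5 to about 60
    """
    if not 5 <= length <= 60:
        raise ValueError("length outside the about 5 to about 60 range")
    if start < 0 or start + length > len(coding_strand):
        raise ValueError("target window falls outside the sequence")
    return reverse_complement(coding_strand[start:start + length])
```

Calling `antisense_oligo(seq, 0, 15)` on a coding-strand string `seq` returns a 15-mer complementary to its first 15 bases.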
Examples of modified nucleotides that can be used to generate the antisense nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following section).
The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a polymorphic protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in the major groove of the double helix. An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For example, for systemic administration, antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: "reference sequence", "comparison window", "sequence identity", "percentage of sequence identity", and "substantial identity". A "reference sequence" is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Optimal alignment of sequences for aligning a comparison window may, for example, be conducted by the local homology algorithm of Smith and Waterman Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman Proc. Natl. Acad. Sci. U.S.A. 85:2444 (1988), or by computerized implementations of these algorithms (for example, GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, WI).
Techniques for nucleic acid manipulation of the nucleic acid sequences harboring the cSNPs of the invention, such as subcloning nucleic acid sequences encoding polypeptides into expression vectors, labeling probes, DNA hybridization, and the like, are described generally in Sambrook et al. The phrase "nucleic acid sequence encoding" refers to a nucleic acid which directs the expression of a specific protein, peptide or amino acid sequence. The nucleic acid sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into protein, peptide or amino acid sequence. The nucleic acid sequences include both the full length nucleic acid sequences disclosed herein as well as non-full length sequences derived from the full length protein. It is further understood that the sequence includes the degenerate codons of the native sequence or sequences which may be introduced to provide codon preference in a specific host cell. Consequently, the principles of probe selection and array design can readily be extended to analyze more complex polymorphisms (see EP 730,663). For example, to characterize a triallelic SNP polymorphism, three groups of probes can be designed tiled on the three polymorphic forms as described above. As a further example, to analyze a diallelic polymorphism involving a deletion of a nucleotide, one can tile a first group of probes based on the undeleted polymorphic form as the reference sequence and a second group of probes based on the deleted form as the reference sequence.
For assays of genomic DNA, virtually any convenient biological tissue sample can be used. Suitable samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. Genomic DNA is typically amplified before analysis. Amplification is usually effected by PCR using primers flanking a suitable fragment, e.g., of 50-500 nucleotides, containing the locus of the polymorphism to be analyzed. Target is usually labeled in the course of amplification. The amplification product can be RNA or DNA, single stranded or double stranded. If double stranded, the amplification product is typically denatured before application to an array. If genomic DNA is analyzed without amplification, it may be desirable to remove RNA from the sample before applying it to the array. This can be accomplished by digestion with DNase-free RNase.
DETECTION OF POLYMORPHISMS IN A NUCLEIC ACID SAMPLE
The SNPs disclosed herein can be used to determine which forms of a characterized polymorphism are present in individuals under analysis.
The design and use of allele-specific probes for analyzing polymorphisms is described by, e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726; Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer, at the 7 position; in a 16-mer, at either the 7, 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence. The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in published PCT application WO 95/11995. WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism. Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence. The second group of probes is designed by the same principles, except that the probes exhibit complementarity to the second reference sequence. The inclusion of a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases).
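The pairing of a reference-form probe with a variant-form probe, each with the polymorphic site at a central position, can be sketched as follows. This is a minimal illustration only: the function name, the 0-based indexing, and the default `length=15` and `offset=7` are assumptions of the sketch, and the returned sequences are written on the same strand as the reference (the complementary strand being equally usable, as noted earlier).

```python
def allele_specific_probes(reference, site, alleles, length=15, offset=7):
    """Build one probe sequence per allelic form.

    reference : sequence containing the polymorphic site
    site      : 0-based index of the polymorphic base (assumed convention)
    alleles   : bases to place at the site, e.g. ("G", "A")
    length    : probe length
    offset    : 0-based position of the polymorphic base within the probe
    """
    start = site - offset
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence for this probe")
    left = reference[start:site]
    right = reference[site + 1:end]
    # Each probe differs from the others only at the polymorphic position.
    return {allele: left + allele + right for allele in alleles}
```

For a triallelic site, passing three bases in `alleles` yields three probes that differ only at the central position.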
An allele-specific primer hybridizes to a site on a target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17:2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
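Placement of the polymorphic base at the 3'-most position of an allele-specific primer can be sketched as below. The function name, the 0-based `site` index and the default primer length of 20 are assumptions for illustration; the primers are written on the same strand as the supplied sequence, so they would anneal to its complement.

```python
def allele_specific_primers(reference, site, alleles, length=20):
    """Build one primer per allele, ending (3' end) on the polymorphic base.

    reference : sequence containing the polymorphic site
    site      : 0-based index of the polymorphic base (assumed convention)
    alleles   : bases to test at the site, e.g. ("T", "C")
    length    : primer length; the text gives 15 to 30 as typical
    """
    start = site - length + 1
    if start < 0 or site >= len(reference):
        raise ValueError("not enough upstream sequence for this primer")
    upstream = reference[start:site]
    # Only the 3'-terminal base differs between the primers.
    return {allele: upstream + allele for allele in alleles}
```

The primer matching the allele present in the sample is expected to prime amplification, while the primer with the 3' mismatch serves as the control described above.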
Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. See Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification (W.H. Freeman and Co., New York, 1992), Chapter 7.
Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated and heated or otherwise denatured, to form single stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.
The genotype of an individual with respect to a pathology suspected of being caused by a genetic polymorphism may be assessed by association analysis. Phenotypic traits suitable for association analysis include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute intermittent porphyria).
Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms. Some examples of autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease. Some examples of cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, oral cavity, ovary, pancreas, prostate, skin, stomach, leukemia, liver, lung, and uterus. Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.
Determination of which polymorphic forms occupy a set of polymorphic sites in an individual identifies a set of polymorphic forms that distinguishes the individual. See generally National Research Council, The Evaluation of Forensic DNA Evidence (Eds. Pollard et al., National Academy Press, DC, 1996). Since the polymorphic sites are within a 50,000 bp region in the human genome, the probability of recombination between these polymorphic sites is low. That low probability means the haplotype (the set of all 10 polymorphic sites) set forth in this application should be inherited without change for at least several generations. The more sites that are analyzed, the lower the probability that the set of polymorphic forms in one individual is the same as that in an unrelated individual. Preferably, if multiple sites are analyzed, the sites are unlinked. Thus, polymorphisms of the invention are often used in conjunction with polymorphisms in distal genes. Preferred polymorphisms for use in forensics are diallelic because the population frequencies of two polymorphic forms can usually be determined with greater accuracy than those of multiple polymorphic forms at multi-allelic loci.
The capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample. If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene. If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.
p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site. In diallelic loci, four genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism with frequencies x and y, the probabilities of each genotype in a diploid organism are (see WO 95/12607):
Homozygote: p(AA) = x^2
Homozygote: p(BB) = y^2 = (1-x)^2
Single Heterozygote: p(AB) = p(BA) = xy = x(1-x)
Both Heterozygotes: p(AB+BA) = 2xy = 2x(1-x)
The probability of identity at one locus (i.e., the probability that two individuals, picked at random from a population, will have identical polymorphic forms at a given locus) is given by the equation:
p(ID) = (x^2)^2 + (2xy)^2 + (y^2)^2. These calculations can be extended for any number of polymorphic forms at a given locus. For example, the probability of identity p(ID) for a 3-allele system where the alleles have the frequencies in the population of x, y and z, respectively, is equal to the sum of the squares of the genotype frequencies:
p(ID) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4
In a locus of n alleles, the appropriate binomial expansion is used to calculate p(ID) and p(exc).
The cumulative probability of identity (cum p(ID)) for each of multiple unlinked loci is determined by multiplying the probabilities provided by each locus:
cum p(ID) = p(ID1) p(ID2) p(ID3) ... p(IDn)
The cumulative probability of non-identity for n loci (i.e. the probability that two random individuals will be different at 1 or more loci) is given by the equation:
cum p(nonID) = 1 - cum p(ID).
If several polymorphic loci are tested, the cumulative probability of non-identity for random individuals becomes very high (e.g., one billion to one). Such probabilities can be taken into account together with other evidence in determining the guilt or innocence of the suspect.
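The cumulative calculation over unlinked loci is a simple product, sketched below in Python. The function names are hypothetical, and the per-locus p(ID) values are taken as inputs that have already been computed for each locus.

```python
from math import prod


def cumulative_p_identity(per_locus_p_id):
    """cum p(ID): product of the per-locus probabilities of identity."""
    return prod(per_locus_p_id)


def cumulative_p_nonidentity(per_locus_p_id):
    """cum p(nonID) = 1 - cum p(ID)."""
    return 1.0 - cumulative_p_identity(per_locus_p_id)
```

Because each per-locus p(ID) is less than 1, the product shrinks with every additional unlinked locus, which is why the cumulative probability of non-identity approaches 1 as more loci are tested.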
The object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child.
If the set of polymorphisms in the child attributable to the father does not match the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match. The probability of parentage exclusion (representing the probability that a random male will have a polymorphic form at a given polymorphic site that makes him incompatible as the father) is given by the equation (see WO 95/12607):
p(exc) = xy(1-xy)
where x and y are the population frequencies of alleles A and B of a diallelic polymorphic site. (At a triallelic site, p(exc) = xy(1-xy) + yz(1-yz) + xz(1-xz) + 3xyz(1-xyz), where x, y and z are the respective population frequencies of alleles A, B and C.) The probability of non-exclusion is:
p(non-exc) = 1 - p(exc)
The cumulative probability of non-exclusion (representing the value obtained when n loci are used) is thus:
cum p(non-exc) = p(non-exc1) p(non-exc2) p(non-exc3) ... p(non-excn)
The cumulative probability of exclusion for n loci (representing the probability that a random male will be excluded) is:
cum p(exc) = 1 - cum p(non-exc).
If several polymorphic loci are included in the analysis, the cumulative probability of exclusion of a random male is very high. This probability can be taken into account in assessing the liability of a putative father whose polymorphic marker set matches the child's polymorphic marker set attributable to his/her father.
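The exclusion formulas above can be evaluated directly, as in the following minimal Python sketch. The function names are hypothetical; the expressions are transcribed from the diallelic and triallelic equations given in the text, and the loci are assumed to be unlinked so that the per-locus non-exclusion probabilities multiply.

```python
from math import prod


def p_exclusion_diallelic(x):
    """p(exc) = xy(1 - xy) for allele frequencies x and y = 1 - x."""
    y = 1.0 - x
    return x * y * (1.0 - x * y)


def p_exclusion_triallelic(x, y, z):
    """p(exc) = xy(1-xy) + yz(1-yz) + xz(1-xz) + 3xyz(1-xyz)."""
    return (x * y * (1.0 - x * y)
            + y * z * (1.0 - y * z)
            + x * z * (1.0 - x * z)
            + 3.0 * x * y * z * (1.0 - x * y * z))


def cumulative_p_exclusion(per_locus_p_exc):
    """cum p(exc) = 1 - product over loci of (1 - p(exc))."""
    return 1.0 - prod(1.0 - p for p in per_locus_p_exc)
```

For a diallelic site with x = y = 0.5, p(exc) = 0.25 × 0.75 = 0.1875, so a single such locus excludes fewer than one random male in five; combining several loci raises the cumulative probability of exclusion.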
The polymorphisms of the invention may contribute to the phenotype of an organism in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure. The effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation. A single polymorphism may affect more than one phenotypic trait. Likewise, a single phenotypic trait may be affected by polymorphisms in different genes. Further, some polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype.
Phenotypic traits include diseases that have known but hitherto unmapped genetic components. Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms. Some examples of autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease. Some examples of cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and uterus. Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.
Correlation is performed for a population of individuals who have been tested for the presence or absence of a phenotypic trait of interest and for polymorphic marker sets. To perform such analysis, the presence or absence of a set of polymorphisms (i.e., a polymorphic set) is determined for a set of the individuals, some of whom exhibit a particular trait, and some of whom lack the trait. The alleles of each polymorphism of the set are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest. Correlation can be performed by standard statistical methods and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted. For example, it might be found that the presence of allele A1 at polymorphism A correlates with heart disease. As a further example, it might be found that the combined presence of allele A1 at polymorphism A and allele B1 at polymorphism B correlates with increased milk production of a farm animal.
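One standard statistical method for testing whether an allele is associated with a trait is a chi-square test on a 2x2 contingency table. The sketch below is an illustration only: the choice of test, the function name and the counts are assumptions made for this example and are not prescribed by the specification.

```python
from math import erfc, sqrt


def allele_trait_chi_square(carriers_with_trait, carriers_without_trait,
                            noncarriers_with_trait, noncarriers_without_trait):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table.

    Returns (chi_square, p_value). No continuity correction is applied.
    """
    a, b = carriers_with_trait, carriers_without_trait
    c, d = noncarriers_with_trait, noncarriers_without_trait
    n = a + b + c + d
    # Product of the two row totals and the two column totals.
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi_square = n * (a * d - b * c) ** 2 / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts: allele A1 carriers and non-carriers, with or without the trait.
chi_square, p_value = allele_trait_chi_square(30, 10, 10, 30)
print(chi_square, p_value)
```

A small p-value indicates that the allele and the trait are unlikely to be independent in the sampled population. When many polymorphisms are tested at once, a correction for multiple comparisons would normally be needed before calling a correlation statistically significant.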
Such correlations can be exploited in several ways. In the case of a strong correlation between a set of one or more polymorphic forms and a disease for which treatment is available, detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient. Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions. For example, the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring. In the case of a weaker, but still statistically significant correlation between a polymorphic set and human disease, immediate therapeutic intervention or monitoring may not be justified. Nevertheless, the patient can be motivated to begin simple life-style changes (e.g., diet, exercise) that can be accomplished at little cost to the patient but confer potential benefits in reducing the risk of conditions to which the patient may have increased susceptibility by virtue of variant alleles. Identification of a polymorphic set in a patient correlated with enhanced receptiveness to one of several treatment regimes for a disease indicates that this treatment regime should be followed.
For animals and plants, correlations between characteristics and phenotype are useful for breeding for desired characteristics. For example, Beitz et al., U.S. Pat. No. 5,292,639 discuss use of bovine mitochondrial polymorphisms in a breeding program to improve milk production in cows. To evaluate the effect of mtDNA D-loop sequence polymorphism on milk production, each cow was assigned a value of 1 if variant or 0 if wild type with respect to a prototypical mitochondrial DNA sequence at each of 17 locations considered.
The previous section concerns identifying correlations between phenotypic traits and polymorphisms that directly or indirectly contribute to those traits. The present section describes identification of a physical linkage between a genetic locus associated with a trait of interest and polymorphic markers that are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it. Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987); Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics 121, 185-199 (1989). Genes localized by linkage can be cloned by a process known as directional cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins, Nature Genetics 1, 3-6 (1992) (each of which is incorporated by reference in its entirety for all purposes).
Linkage studies are typically performed on members of a family. Available members of the family are characterized for the presence or absence of a phenotypic trait and for a set of polymorphic markers. The distribution of polymorphic markers in an informative meiosis is then analyzed to determine which polymorphic markers co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080 (1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).
Linkage is analyzed by calculation of LOD (log of the odds) values. A lod value is the relative likelihood of obtaining observed segregation data for a marker and a genetic locus when the two are located at a recombination fraction RF, versus the situation in which the two are not linked, and thus segregating independently (Thompson & Thompson, Genetics in Medicine (5th ed, W.B. Saunders Company, Philadelphia, 1991); Strachan, "Mapping the human genome" in The Human Genome (BIOS Scientific Publishers Ltd, Oxford), Chapter 4). A series of likelihood ratios is calculated at various recombination fractions (RF), ranging from RF=0.0 (coincident loci) to RF=0.50 (unlinked). Thus, the likelihood ratio at a given value of RF is the probability of the data if the loci are linked at RF divided by the probability of the data if the loci are unlinked. The computed likelihood ratio is usually expressed as its log10 (i.e., a lod score). For example, a lod score of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence. The use of logarithms allows data collected from different families to be combined by simple addition. Computer programs are available for the calculation of lod scores for differing values of RF (e.g., LIPED, MLINK (Lathrop, Proc. Nat. Acad. Sci. (USA) 81, 3443-3446 (1984))). For any particular lod score, a recombination fraction may be determined from mathematical tables. See Smith et al., Mathematical tables for research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum. Genet. 32, 127-150 (1968). The value of RF at which the lod score is the highest is considered to be the best estimate of the recombination fraction.
Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of RF) than the possibility that the two loci are unlinked. By convention, a combined lod score of +3 or greater (equivalent to greater than 1000:1 odds in favor of linkage) is considered definitive evidence that two loci are linked. Similarly, by convention, a negative lod score of -2 or less is taken as definitive evidence against linkage of the two loci being compared. Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations.
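The lod score calculation can be illustrated with a short sketch. It assumes the simplest case of phase-known, fully informative meioses, in which each meiosis can be scored unambiguously as recombinant or non-recombinant; programs such as LIPED and MLINK handle general pedigrees and are not reproduced here. The function names, the grid of RF values and the family counts are hypothetical.

```python
from math import log10


def lod_score(recombinants, meioses, rf):
    """Lod score at recombination fraction rf for phase-known meioses.

    Computes log10 of [rf**k * (1 - rf)**(n - k)] / [0.5**n], where k is
    the number of recombinants among n informative meioses.
    """
    if not 0.0 <= rf <= 0.5:
        raise ValueError("rf must lie between 0.0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if rf == 0.0 and recombinants > 0:
        # Any recombinant is impossible if the loci are coincident.
        return float("-inf")
    linked = (meioses - recombinants) * log10(1.0 - rf)
    if recombinants:
        linked += recombinants * log10(rf)
    unlinked = meioses * log10(0.5)
    return linked - unlinked


def combined_lod(families, rf):
    """Add lod scores across families; each family is (recombinants, meioses)."""
    return sum(lod_score(k, n, rf) for k, n in families)


def best_recombination_fraction(families, steps=50):
    """Return the grid value of rf (0.0 to 0.5) with the highest combined lod."""
    grid = [i * 0.5 / steps for i in range(steps + 1)]
    return max(grid, key=lambda rf: combined_lod(families, rf))


# Hypothetical data: two families, each with 1 recombinant in 10 meioses.
families = [(1, 10), (1, 10)]
best_rf = best_recombination_fraction(families)
print(best_rf, combined_lod(families, best_rf))
```

Comparing the maximum combined lod score with the conventional thresholds of +3 and -2 then indicates whether linkage is accepted, excluded, or left undecided for the marker in question.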
The invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated. Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote. See Hogan et al., "Manipulating the Mouse Embryo, A Laboratory Manual," Cold Spring Harbor Laboratory (1989). Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker. See Capecchi, Science 244, 1288-1292 (1989). The transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.
The invention further provides methods for assessing the pharmacogenomic susceptibility of a subject harboring a single nucleotide polymorphism to a particular pharmaceutical compound, or to a class of such compounds. Genetic polymorphisms in drug-metabolizing enzymes, drug transporters, receptors for pharmaceutical agents, and other drug targets have been correlated with individual differences in the efficacy and toxicity of the pharmaceutical agent administered to a subject. Pharmacogenomic characterization of a subject's susceptibility to a drug enhances the ability to tailor a dosing regimen to the particular genetic constitution of the subject, thereby enhancing and optimizing the therapeutic effectiveness of the therapy.
In cases in which a cSNP leads to a polymorphic protein that is ascribed to be the cause of a pathological condition, a method of treating such a condition includes administering to a subject experiencing the pathology the wild type cognate of the polymorphic protein. Once administered in an effective dosing regimen, the wild type cognate provides complementation or remediation of the defect due to the polymorphic protein. The subject's condition is ameliorated by this protein therapy.
A subject suspected of suffering from a pathology ascribable to a polymorphic protein that arises from a cSNP is to be diagnosed using any of a variety of diagnostic methods capable of identifying the presence of the cSNP in the nucleic acid, or of the cognate polymorphic protein, in a suitable clinical sample taken from the subject. Once the presence of the cSNP has been ascertained, and the pathology is correctable by administering a normal or wild-type gene, the subject is treated with a pharmaceutical composition that includes a nucleic acid that harbors the correcting wild-type gene, or a fragment containing a correcting sequence of the wild-type gene. Non-limiting examples of ways in which such a nucleic acid may be administered include incorporating the wild-type gene in a viral vector, such as an adenovirus or adeno-associated virus, and administration of a naked DNA in a pharmaceutical composition that promotes intracellular uptake of the administered nucleic acid. Once the nucleic acid that includes the gene coding for the wild-type allele of the polymorphism is incorporated within a cell of the subject, it will initiate de novo biosynthesis of the wild-type gene product. If the nucleic acid is further incorporated into the genome of the subject, the treatment will have long-term effects, providing de novo synthesis of the wild-type protein for a prolonged duration. The synthesis of the wild-type protein in the cells of the subject will contribute to a therapeutic enhancement of the clinical condition of the subject.
A subject suffering from a pathology ascribed to a SNP may be treated so as to correct the genetic defect. (See Kren et al., Proc. Natl. Acad. Sci. USA 96:10349-10354 (1999)). Such a subject is identified by any method that can detect the polymorphism in a sample drawn from the subject. Such a genetic defect may be permanently corrected by administering to such a subject a nucleic acid fragment incorporating a repair sequence that supplies the wild-type nucleotide at the position of the SNP. This site-specific repair sequence encompasses an RNA/DNA oligonucleotide which operates to promote endogenous repair of a subject's genomic DNA. Upon administration in an appropriate vehicle, such as a complex with polyethylenimine or encapsulated in anionic liposomes, a genetic defect leading to an inborn pathology may be overcome, as the chimeric oligonucleotide induces incorporation of the wild-type sequence into the subject's genome. Upon incorporation, the wild-type gene product is expressed, and the replacement is propagated, thereby engendering a permanent repair.
The invention further provides kits comprising at least one allele-specific oligonucleotide as described above. Often, the kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In some kits, the allele-specific oligonucleotides are provided immobilized to a substrate. For example, the same substrate can comprise allele-specific oligonucleotide probes for detecting at least 10, 100, 1000 or all of the polymorphisms shown in the Table. Optional additional components of the kit include, for example, restriction enzymes, reverse-transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), and the appropriate buffers for reverse transcription, PCR, or hybridization reactions. Usually, the kit also contains instructions for carrying out the hybridizing methods.
Several aspects of the present invention rely on having available the polymorphic proteins encoded by the nucleic acids comprising a SNP of the invention. There are various methods of isolating these nucleic acid sequences. For example, DNA is isolated from a genomic or cDNA library using labeled oligonucleotide probes having sequences complementary to the sequences disclosed herein. Such probes can be used directly in hybridization assays. Alternatively, probes can be designed for use in amplification techniques such as PCR.
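A pair of allele-specific oligonucleotides for one polymorphic site can be sketched as follows. This is an illustration under stated assumptions: the probe is built with the polymorphic base in a central position flanked by equal lengths of surrounding sequence, and the function name, flank length and sequences are hypothetical rather than taken from the Table.

```python
def allele_specific_probes(upstream, alleles, downstream, flank=7):
    """Build one probe per allele with the polymorphic base in the center.

    upstream and downstream are the sequences immediately 5' and 3' of the
    polymorphic position on the same strand, written 5' to 3'.
    """
    if flank < 1:
        raise ValueError("flank must be at least 1")
    if len(upstream) < flank or len(downstream) < flank:
        raise ValueError("flanking sequence is shorter than the requested flank")
    left = upstream[-flank:]
    right = downstream[:flank]
    return {allele: left + allele + right for allele in alleles}


# Hypothetical flanking sequences around an A/G polymorphism.
probes = allele_specific_probes("ACGTACGTAC", "AG", "TTGCAAGGCT", flank=4)
for allele, probe in probes.items():
    print(allele, probe)
```

The two probes differ only at the central base, so under suitably stringent hybridization conditions each is expected to pair preferentially with its own allelic form. Practical probe design would also consider melting temperature and secondary structure, which this sketch ignores.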
To prepare a cDNA library, mRNA is isolated from tissue such as heart or pancreas, preferably a tissue wherein expression of the gene or gene family is likely to occur. cDNA is prepared from the mRNA and ligated into a recombinant vector. The vector is transfected into a recombinant host for propagation, screening and cloning. Methods for making and screening cDNA libraries are well known. See Gubler, U. and Hoffman, B.J., Gene 25:263-269 (1983) and Sambrook et al.
For a genomic library, for example, the DNA is extracted from tissue and either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb. The fragments are then separated by gradient centrifugation from undesired sizes and are constructed in bacteriophage lambda vectors. These vectors and phage are packaged in vitro, as described in Sambrook, et al. Recombinant phage are analyzed by plaque hybridization as described in Benton and Davis, Science 196:180-182 (1977). Colony hybridization is carried out as generally described in M. Grunstein et al., Proc. Natl. Acad. Sci. USA 72:3961-3965 (1975). DNA of interest is identified in either cDNA or genomic libraries by its ability to hybridize with nucleic acid probes, for example on Southern blots, and these DNA regions are isolated by standard methods familiar to those of skill in the art. See Sambrook, et al.
In PCR techniques, oligonucleotide primers complementary to the two 3' borders of the DNA region to be amplified are synthesized. The polymerase chain reaction is then carried out using the two primers. See PCR Protocols: a Guide to Methods and Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990). Primers can be selected to amplify the entire regions encoding a full-length sequence of interest or to amplify smaller DNA segments as desired. PCR can be used in a variety of protocols to isolate cDNAs encoding a sequence of interest. In these protocols, appropriate primers and probes for amplifying DNA encoding a sequence of interest are generated from analysis of the DNA sequences listed herein. Once such regions are PCR-amplified, they can be sequenced and oligonucleotide probes can be prepared from the sequence.
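The relationship between a target region and its two primers can be sketched as follows. The forward primer matches the 5' end of the top strand of the region (and is therefore complementary to the 3' border of the bottom strand), and the reverse primer is the reverse complement of the 3' end of the top strand. The function names, primer length and template sequence are hypothetical, and the sketch ignores practical design criteria such as melting temperature and primer-dimer formation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, length=18):
    """Return (forward, reverse) primers flanking template[start:end].

    template is the top strand written 5' to 3'; start and end are 0-based
    with end exclusive. Both primers are returned 5' to 3'.
    """
    region = template[start:end]
    if len(region) < length:
        raise ValueError("region is shorter than the requested primer length")
    forward = region[:length]
    reverse = reverse_complement(region[-length:])
    return forward, reverse


# Hypothetical template; the whole sequence is used as the target region.
template = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
forward, reverse = design_primers(template, 0, len(template), length=5)
print(forward, reverse)
```

Choosing start and end so that the region spans a whole coding sequence amplifies the full-length sequence, whereas a narrower window amplifies only a smaller segment, for example one surrounding a single polymorphic site.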
Once DNA encoding a sequence comprising a cSNP is isolated and cloned, one can express the encoded polymorphic proteins in a variety of recombinantly engineered cells. It is expected that those of skill in the art are knowledgeable in the numerous expression systems available for expression of DNA encoding a sequence of interest. No attempt to describe in detail the various methods known for the expression of proteins in prokaryotes or eukaryotes is made here.
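Before expression, it can be useful to check what a cSNP does to the encoded amino acid. The sketch below uses the standard genetic code to compare the reference and variant codons at a given position of an in-frame coding sequence. The function name and the short coding sequence are hypothetical and serve only as an illustration.

```python
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_csnp(coding_sequence, position, variant_base):
    """Return (reference_aa, variant_aa, kind) for a substitution.

    coding_sequence must start in frame; position is 0-based. kind is
    "silent", "missense" or "nonsense" ("*" denotes a stop codon).
    """
    codon_start = position - position % 3
    reference_codon = coding_sequence[codon_start:codon_start + 3]
    if len(reference_codon) < 3:
        raise ValueError("position does not fall within a complete codon")
    offset = position - codon_start
    variant_codon = (reference_codon[:offset] + variant_base
                     + reference_codon[offset + 1:])
    reference_aa = CODON_TABLE[reference_codon]
    variant_aa = CODON_TABLE[variant_codon]
    if variant_aa == reference_aa:
        kind = "silent"
    elif variant_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return reference_aa, variant_aa, kind


# Hypothetical coding sequence ATG GAG TAA with an A-to-T change in codon 2.
print(classify_csnp("ATGGAGTAA", 4, "T"))
```

Only a missense or nonsense change gives rise to a polymorphic protein that differs from its wild-type cognate; a silent change leaves the amino acid sequence unchanged.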
In brief summary, the expression of natural or synthetic nucleic acids encoding a sequence of interest will typically be achieved by operably linking the DNA or cDNA to a promoter (which is either constitutive or inducible), followed by incorporation into an expression vector. The vectors can be suitable for replication and integration in either prokaryotes or eukaryotes. Typical expression vectors contain initiation sequences, transcription and translation terminators, and promoters useful for regulation of the expression of a polynucleotide sequence of interest. To obtain high level expression of a cloned gene, it is desirable to construct expression plasmids which contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator. The expression vectors may also comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the plasmid in both eukaryotes and prokaryotes, i.e., shuttle vectors, and selection markers for both prokaryotic and eukaryotic systems. See Sambrook et al.
A variety of prokaryotic expression systems may be used to express the polymorphic proteins of the invention. Examples include E. coli, Bacillus, Streptomyces, and the like.
It is preferred to construct expression plasmids which contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator. Examples of regulatory regions suitable for this purpose in E. coli are the promoter and operator region of the E. coli tryptophan biosynthetic pathway as described by Yanofsky, C., J. Bacteriol. 158:1018-1024 (1984) and the leftward promoter of phage lambda as described by Herskowitz, I. and Hagen, D., Ann. Rev. Genet. 14:399-445 (1980). The inclusion of selection markers in DNA vectors transformed in E. coli is also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, or chloramphenicol. See Sambrook et al. for details concerning selection markers for use in E. coli.
To enhance proper folding of the expressed recombinant protein during purification from E. coli, the expressed protein may first be denatured and then renatured. This can be accomplished by solubilizing the bacterially produced proteins in a chaotropic agent such as guanidine HCl and reducing all the cysteine residues with a reducing agent such as beta-mercaptoethanol. The protein is then renatured, either by slow dialysis or by gel filtration. See U.S. Patent No. 4,511,503. Detection of the expressed antigen is achieved by methods known in the art, such as radioimmunoassay, Western blotting techniques or immunoprecipitation. Purification from E. coli can be achieved following procedures such as those described in U.S. Patent No. 4,511,503.
Any of a variety of eukaryotic expression systems such as yeast, insect cell lines, bird, fish, and mammalian cells, may also be used to express a polymorphic protein of the invention. As explained briefly below, a nucleotide sequence harboring a cSNP may be expressed in these eukaryotic systems. Synthesis of heterologous proteins in yeast is well known. Methods in Yeast Genetics, Sherman, F., et al., Cold Spring Harbor Laboratory, (1982) is a well recognized work describing the various methods available to produce the protein in yeast. Suitable vectors usually have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or other glycolytic enzymes, and an origin of replication, termination sequences and the like as desired. For instance, suitable vectors are described in the literature (Botstein, et al., Gene 8:17-24 (1979); Broach, et al., Gene 8:121-133 (1979)).
Two procedures are used in transforming yeast cells. In one case, yeast cells are first converted into protoplasts using zymolyase, lyticase or glusulase, followed by addition of DNA and polyethylene glycol (PEG). The PEG-treated protoplasts are then regenerated in a 3% agar medium under selective conditions. Details of this procedure are given in the papers by J.D. Beggs, Nature (London) 275:104-109 (1978); and Hinnen, A., et al., Proc. Natl. Acad. Sci. USA, 75:1929-1933 (1978). The second procedure does not involve removal of the cell wall. Instead the cells are treated with lithium chloride or acetate and PEG and put on selective plates (Ito, H., et al., J. Bact. 153:163-168 (1983)). The polymorphic protein, once expressed, can be isolated from yeast by lysing the cells and applying standard protein isolation techniques to the lysates. The purification process can be monitored by using Western blot techniques or radioimmunoassay or other standard techniques. The sequences encoding the proteins of the invention can also be ligated to various expression vectors for use in transforming cell cultures of, for instance, mammalian, insect, bird or fish origin. Illustrative of cell cultures useful for the production of the polypeptides are mammalian cells. Mammalian cell systems often will be in the form of monolayers of cells although mammalian cell suspensions may also be used. A number of suitable host cell lines capable of expressing intact proteins have been developed in the art, and include the HEK293, BHK21, CHO and COS cell lines, and various human cells such as HeLa cells, myeloma cell lines, Jurkat cells, etc. Expression vectors for these cells can include expression control sequences, such as an origin of replication, a promoter (e.g., the CMV promoter, an HSV tk promoter or pgk (phosphoglycerate kinase) promoter), an enhancer (Queen et al., Immunol. Rev. 89:49 (1986)) and necessary processing information sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites (e.g., an SV40 large T Ag poly A addition site), and transcriptional terminator sequences.
Other animal cells are available, for instance, from the American Type Culture Collection Catalogue of Cell Lines and Hybridomas (7th edition, (1992)). Appropriate vectors for expressing the proteins of the invention in insect cells are usually derived from baculovirus. Insect cell lines include mosquito larvae, silkworm, armyworm, moth and Drosophila cell lines such as a Schneider cell line (See Schneider, J. Embryol. Exp. Morphol. 27:353-365 (1987)). As indicated above, the vector, e.g., a plasmid, which is used to transform the host cell, preferably contains DNA sequences to initiate transcription and sequences to control the translation of the protein. These sequences are referred to as expression control sequences. As with yeast, when higher animal host cells are employed, polyadenylation or transcription terminator sequences from known mammalian genes need to be incorporated into the vector. An example of a terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for accurate splicing of the transcript may also be included. An example of a splicing sequence is the VP1 intron from SV40 (Sprague, J. et al., J. Virol. 45:773-781 (1983)). Additionally, gene sequences to control replication in the host cell may be incorporated into the vector. See Saveria-Campo, M., 1985, "Bovine Papilloma virus DNA a Eukaryotic Cloning Vector" in DNA Cloning Vol. II a Practical Approach Ed. D.M. Glover, IRL Press, Arlington, Virginia pp. 213-238. The host cells are competent or rendered competent for transformation by various means. There are several well-known methods of introducing DNA into animal cells. These include: calcium phosphate precipitation, fusion of the recipient cells with bacterial protoplasts containing the DNA, treatment of the recipient cells with liposomes containing the DNA, DEAE dextran, electroporation and micro-injection of the DNA directly into the cells.
The transformed cells are cultured by means well known in the art (Biochemical Methods in Cell Culture and Virology, Kuchler, R.J., Dowden, Hutchinson and Ross, Inc., (1977)). The expressed polypeptides are isolated from cells grown as suspensions or as monolayers. The latter are recovered by well known mechanical, chemical or enzymatic means.
General methods of expressing recombinant proteins are also known and are exemplified in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked" refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence. Specifically, "operably linked" means that the isolated polynucleotide of the invention and an expression control sequence are situated within a vector or cell in such a way that the gene encoding the protein is expressed by a host cell which has been transformed (transfected) with the ligated polynucleotide/expression sequence. The term "vector" refers to viral expression systems and autonomous self-replicating circular DNA (plasmids), and includes both expression and nonexpression plasmids.
The term "gene" as used herein is intended to refer to a nucleic acid sequence which encodes a polypeptide. This definition includes various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not affect the function of the gene product. The term "gene" is intended to include not only coding sequences but also regulatory regions such as promoters, enhancers, termination regions and similar untranslated nucleotide sequences. The term further includes all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites.
A number of types of cells may act as suitable host cells for expression of the protein. Mammalian host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast or in prokaryotes such as bacteria. Potentially suitable yeast strains include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida or any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it may be necessary to modify the protein produced therein, for example by phosphorylation or glycosylation of the appropriate sites, in order to obtain the functional protein.
The protein may also be produced by operably linking the isolated polynucleotide of the invention to suitable control sequences in one or more insect expression vectors, and employing an insect expression system. Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego, California, U.S.A. (the MaxBac® kit), and such methods are well known in the art, as described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an insect cell capable of expressing a polynucleotide of the present invention is "transformed." The protein of the invention may be prepared by culturing transformed host cells under culture conditions suitable to express the recombinant protein.
The polymorphic protein of the invention may also be expressed as a product of transgenic animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized by somatic or germ cells containing a nucleotide sequence encoding the protein. The protein may also be produced by known conventional chemical synthesis. Methods for constructing the proteins of the present invention by synthetic means are known to those skilled in the art.
The polymorphic proteins produced by recombinant DNA technology may be purified by techniques commonly employed to isolate or purify recombinant proteins. Recombinantly produced proteins can be directly expressed or expressed as a fusion protein. The protein is then purified by a combination of cell lysis (e.g., sonication) and affinity chromatography. For fusion products, subsequent digestion of the fusion protein with an appropriate proteolytic enzyme releases the desired polypeptide. The polypeptides of this invention may be purified to substantial purity by standard techniques well known in the art, including selective precipitation with such substances as ammonium sulfate, column chromatography, immunopurification methods, and others. See, for instance, R. Scopes, Protein Purification: Principles and Practice, Springer-Verlag: New York (1982), incorporated herein by reference. For example, in an embodiment, antibodies may be raised to the proteins of the invention as described herein. Cell membranes are isolated from a cell line expressing the recombinant protein, the protein is extracted from the membranes and immunoprecipitated. The proteins may then be further purified by standard protein chemistry techniques as described above.
The resulting expressed protein may then be purified from such culture (i.e., from culture medium or cell extracts) using known purification processes, such as gel filtration and ion exchange chromatography. The purification of the protein may also include an affinity column containing agents which will bind to the protein; one or more column steps over such affinity resins as concanavalin A-agarose, heparin-Toyopearl® or Cibacron blue 3GA Sepharose®; one or more steps involving hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography. Alternatively, the protein of the invention may also be expressed in a form which will facilitate purification. For example, it may be expressed as a fusion protein, such as those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX). Kits for expression and purification of such fusion proteins are commercially available from New England BioLabs (Beverly, MA), Pharmacia (Piscataway, NJ) and Invitrogen, respectively. The protein can also be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope. One such epitope ("Flag") is commercially available from Kodak (New Haven, CT). Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing purification steps, in various combinations, can also be employed to provide a substantially homogeneous isolated recombinant protein. The protein thus purified is substantially free of other mammalian proteins and is defined in accordance with the present invention as an "isolated protein."
The term "antibody" as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen, such as a polymorphic protein. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab and F(ab')2 fragments, and an Fab expression library. In a specific embodiment, antibodies to human polymorphic proteins are disclosed.
The phrases "specifically binds to", "immunospecifically binds to" and "specifically immunoreactive with", when referring to the binding of an antibody to a protein or peptide, refer to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biological materials. Thus, for example, under designated immunoassay conditions, the specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein. Of particular interest in the present invention is an antibody that binds immunospecifically to a polymorphic protein but not to its cognate wild type allelic protein, or vice versa. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, a Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.
Polyclonal and/or monoclonal antibodies that immunospecifically bind to polymorphic gene products but not to the corresponding prototypical or "wild-type" gene products are also provided. Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide. Monoclonal antibodies are screened as described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product.
An isolated polymorphic protein, or a portion or fragment thereof, can be used as an immunogen to generate the antibody that binds the polymorphic protein using standard techniques for polyclonal and monoclonal antibody preparation. The full-length polymorphic protein can be used or, alternatively, the invention provides antigenic peptide fragments of polymorphic proteins for use as immunogens. The antigenic peptide of a polymorphic protein of the invention comprises at least 8 amino acid residues of the amino acid sequence encompassing the polymorphic amino acid and encompasses an epitope of the polymorphic protein such that an antibody raised against the peptide forms a specific immune complex with the polymorphic protein. Preferably, the antigenic peptide comprises at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues. Preferred epitopes encompassed by the antigenic peptide are regions of the polymorphic protein that are located on the surface of the protein, e.g., hydrophilic regions.
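By way of illustration only, the selection of candidate antigenic peptides described above can be sketched in Python. The sketch enumerates every contiguous peptide of a chosen length (at least 8 residues) that contains the polymorphic amino acid and ranks the candidates by mean hydropathy, so that the most hydrophilic windows come first. The function names, the toy sequence and the use of the Kyte-Doolittle hydropathy scale are assumptions made for this example; the specification does not name a scale, and hydrophilicity is only a rough proxy for surface exposure.

```python
# Kyte-Doolittle hydropathy values (assumption: the specification names no scale).
# Lower values indicate more hydrophilic residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

MIN_PEPTIDE_LENGTH = 8  # "at least 8 amino acid residues" in the text above


def candidate_peptides(protein, variant_index, length=MIN_PEPTIDE_LENGTH):
    """Return (start, peptide) for every window of `length` residues that
    contains the polymorphic residue at 0-based position `variant_index`."""
    if length < MIN_PEPTIDE_LENGTH:
        raise ValueError("peptide length must be at least 8 residues")
    if not 0 <= variant_index < len(protein):
        raise ValueError("variant_index lies outside the protein sequence")
    first_start = max(0, variant_index - length + 1)
    last_start = min(variant_index, len(protein) - length)
    return [(start, protein[start:start + length])
            for start in range(first_start, last_start + 1)]


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy of a peptide."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide) / len(peptide)


def rank_by_hydrophilicity(peptides):
    """Sort (start, peptide) pairs so the most hydrophilic peptide is first."""
    return sorted(peptides, key=lambda item: mean_hydropathy(item[1]))


if __name__ == "__main__":
    toy_protein = "MKTAYIAKQRDESGLVPWFNH"  # hypothetical sequence, not from the table
    toy_variant_index = 10                 # hypothetical polymorphic position
    windows = candidate_peptides(toy_protein, toy_variant_index, length=10)
    for start, peptide in rank_by_hydrophilicity(windows):
        print(start, peptide, round(mean_hydropathy(peptide), 2))
```

Longer peptides (10, 15, 20 or 30 residues) are obtained by passing a larger `length`; a protein shorter than the requested length simply yields no candidates.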
For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, goat, mouse or other mammal) may be immunized by injection with the polymorphic protein. An appropriate immunogenic preparation can contain, for example, recombinantly expressed polymorphic protein or a chemically synthesized polymorphic polypeptide. The preparation can further include an adjuvant. Various adjuvants used to increase the immunological response include, but are not limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.), human adjuvants such as Bacille Calmette-Guerin and Corynebacterium parvum, or similar immunostimulatory agents. If desired, the antibody molecules directed against polymorphic proteins can be isolated from the mammal (e.g., from the blood) and further purified by well known techniques, such as protein A chromatography, to obtain the IgG fraction.
The term "monoclonal antibody" or "monoclonal antibody composition", as used herein, refers to a population of antibody molecules that originates from the clone of a single hybridoma cell, and that contains only one type of antigen binding site capable of immunoreacting with a particular epitope of a polymorphic protein. A monoclonal antibody composition thus typically displays a single binding affinity for a particular polymorphic protein with which it immunoreacts. For preparation of monoclonal antibodies directed towards a particular polymorphic protein, or derivatives, fragments, analogs or homologs thereof, any technique that provides for the production of antibody molecules by continuous cell line culture may be utilized. Such techniques include, but are not limited to, the hybridoma technique (see Kohler & Milstein, 1975 Nature 256: 495-497); the trioma technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the practice of the present invention and may be produced by using human hybridomas (see Cote, et al., 1983 Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).
According to the invention, techniques can be adapted for the production of single-chain antibodies specific to a polymorphic protein (see e.g., U.S. Patent No. 4,946,778). In addition, methodologies can be adapted for the construction of Fab expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of monoclonal Fab fragments with the desired specificity for a polymorphic protein or derivatives, fragments, analogs or homologs thereof. Non-human antibodies can be "humanized" by techniques well known in the art. See e.g., U.S. Patent No. 5,225,539. Antibody fragments that contain the idiotypes to a polymorphic protein may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.
Additionally, recombinant anti-polymorphic protein antibodies, such as chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, which can be made using standard recombinant DNA techniques, are within the scope of the invention. Such chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art, for example using methods described in PCT International Application No. PCT/US86/02269; European Patent Application No. 184,187; European Patent Application No. 171,496; European Patent Application No. 173,494; PCT
In one embodiment, methodologies for the screening of antibodies that possess the desired specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and other immunologically-mediated techniques known within the art.
Anti-polymorphic protein antibodies may be used in methods known within the art relating to the detection, quantitation and/or cellular or tissue localization of a polymorphic protein (e.g., for use in measuring levels of the polymorphic protein within appropriate physiological samples, for use in diagnostic methods, for use in imaging the protein, and the like). In a given embodiment, antibodies for polymorphic proteins, or derivatives, fragments, analogs or homologs thereof, that contain the antibody-derived CDR, are utilized as pharmacologically-active compounds in therapeutic applications intended to treat a pathology in a subject that arises from the presence of the cSNP allele in the subject.
An anti-polymorphic protein antibody (e.g., monoclonal antibody) can be used to isolate polymorphic proteins by a variety of immunochemical techniques, such as immunoaffinity chromatography or immunoprecipitation. An anti-polymorphic protein antibody can facilitate the purification of natural polymorphic protein from cells and of recombinantly produced polymorphic proteins expressed in host cells. Moreover, an anti-polymorphic protein antibody can be used to detect polymorphic protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the polymorphic protein. Anti-polymorphic protein antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material is luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include 125I, 131I, 35S or 3H.
EQUIVALENTS
From the foregoing detailed description of the specific embodiments of the invention, it should be apparent that unique compositions and methods of use thereof in SNPs in known genes have been described. Although particular embodiments have been disclosed herein in detail, this has been done by way of example for purposes of illustration only, and is not intended to be limiting with respect to the scope of the appended claims which follow. In particular, it is contemplated by the inventor that various substitutions, alterations, and modifications may be made to the invention without departing from the spirit and scope of the invention as defined by the claims.
1268 | cg43143315 | 2383 | GGTTACAAACCGTTTCAGGCCCTGC[C/gap]TACCACATTCACTGTTTGAATCTTT | C | gap | SILENT-NONCODING | cyto450 | Human Gene SWISSNEW-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa.|pcls:SWISSPROT-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa. | 1.90E-279 | 20
1269 | cg43143315 | 2717 | CTAGTGATTCACTGGGGCATTATTT[T/gap]GTTAGAGGACCTTAAAATTGTTTAT | T | gap | SILENT-NONCODING | cyto450 | Human Gene SWISSNEW-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa.|pcls:SWISSPROT-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa. | 1.90E-279 | 20
1270 | cg43143315 | 2951 | GATTTAGGATCTGTGGTGCAGGGCA[A/gap]TGTTTCAAAGTTTAGTCACAGCTTA | A | gap | SILENT-NONCODING | cyto450 | Human Gene SWISSNEW-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa.|pcls:SWISSPROT-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa. | 1.90E-279 | 20
1271 | cg43143315 | 3396 | TGTGAAATTATTTTTAGAATTATAA[A/gap]TTCACGTCTTGTCAGATTTCATCTG | A | gap | SILENT-NONCODING | cyto450 | Human Gene SWISSNEW-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa.|pcls:SWISSPROT-ID:Q07973 CYTOCHROME P450-CC24 MITOCHONDRIAL PRECURSOR (EC 1.14.-.-) (P450-CC24) (VITAMIN D(3) 24-HYDROXYLASE) (1,25-DIHYDROXYVITAMIN D(3) 24-HYDROXYLASE) (24-OHASE) - HOMO SAPIENS (HUMAN), 513 aa. | 1.90E-279 | 20
2949 | cg43064068 | 1741 | CTTTTCCCTTTGGGCCCTTGGCCTT[C/A]CTATGATGATATGAGATTCTTTATG | SILENT-NONCODING | synthase | Human Gene Similar to SWISSNEW-ID:P39062 ACETYL-COENZYME A SYNTHETASE (EC 6.2.1.1) (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) (ACETYL-COA SYNTHASE) - BACILLUS SUBTILIS, 572 aa.|pcls:SWISSPROT-ID:P39062 ACETYL-COENZYME A SYNTHETASE (EC 6.2.1.1) (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) (ACETYL-COA SYNTHASE) - BACILLUS SUBTILIS, 572 aa. | 7.4E-65
2950 | cg43064068 | 1767 | CTATGATGATATGAGATTCTTTATG[G/A]AAGAACATGAATATAAGTTTTGTCT | SILENT-NONCODING | synthase | Human Gene Similar to SWISSNEW-ID:P39062 ACETYL-COENZYME A SYNTHETASE (EC 6.2.1.1) (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) (ACETYL-COA SYNTHASE) - BACILLUS SUBTILIS, 572 aa.|pcls:SWISSPROT-ID:P39062 ACETYL-COENZYME A SYNTHETASE (EC 6.2.1.1) (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) (ACETYL-COA SYNTHASE) - BACILLUS SUBTILIS, 572 aa. | 7.4E-65
2951 | cg43064068 | 1780 | AGATTCTTTATGGAAGAACATGAAT[A/G]TAAGTTTTGTCTTGCCCTGGTTTTG | SILENT-NONCODING | synthase | Human Gene Similar to SWISSNEW-ID:P39062 ACETYL-COENZYME A SYNTHETASE (EC 6.2.1.1) (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) (ACETYL-COA SYNTHASE) - BACILLUS SUBTILIS, 572 aa.|pcls:SWISSPROT-ID:P39062 ACETYL-COENZYME A SYNTHETASE (EC 6.2.1.1) (ACETATE-COA LIGASE) (ACYL-ACTIVATING ENZYME) (ACETYL-COA SYNTHASE) - BACILLUS SUBTILIS, 572 aa. | 7.4E-65
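As a reading aid for the polymorphic sequence entries in the table rows above, the following Python sketch expands the bracketed notation (for example `[C/A]` or `[C/gap]`) into the two allelic sequences. It is only an illustration: the function name is hypothetical, and it assumes that the bracket holds the two alternative alleles at the polymorphic site and that `gap` denotes the absence of a base at that position.

```python
import re

# flank, "[", first allele, "/", second allele, "]", flank.
# Assumption: alleles are written as bases (A, C, G, T) or the literal word "gap".
_POLYMORPHIC_SITE = re.compile(
    r"^([ACGT]*)\[([ACGT]+|gap)/([ACGT]+|gap)\]([ACGT]*)$"
)


def expand_alleles(polymorphic_sequence):
    """Return (first_allele_sequence, second_allele_sequence, site_offset).

    site_offset is the 0-based index of the polymorphic site within the
    returned sequences, i.e. the length of the 5' flank.
    """
    match = _POLYMORPHIC_SITE.match(polymorphic_sequence.strip())
    if match is None:
        raise ValueError("not a recognized polymorphic sequence: %r"
                         % polymorphic_sequence)
    left_flank, first, second, right_flank = match.groups()

    def render(allele):
        # A "gap" allele contributes no base (assumed to mean a deletion).
        return left_flank + ("" if allele == "gap" else allele) + right_flank

    return render(first), render(second), len(left_flank)


if __name__ == "__main__":
    # Polymorphic sequence of SEQ ID NO 2949, taken from the table row above.
    entry = "CTTTTCCCTTTGGGCCCTTGGCCTT[C/A]CTATGATGATATGAGATTCTTTATG"
    first_seq, second_seq, offset = expand_alleles(entry)
    print(offset, first_seq[offset], second_seq[offset])
```

Each entry in the rows above carries 25 bases on either side of the bracket, so a substitution expands to two 51-base sequences, while a `gap` allele yields a 50-base sequence under the stated assumption.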
Figure imgf000865_0001
Figure imgf000866_0001
Figure imgf000867_0001
oe
-4
Figure imgf000868_0001
oe oe
Figure imgf000869_0001
Figure imgf000870_0001
oe
-4
O
Figure imgf000871_0001
oe
-4
Figure imgf000872_0001
oe
-4
Figure imgf000873_0001
oe
-4
Ui
Figure imgf000874_0001
oe
-4
Figure imgf000875_0001
oe
Figure imgf000876_0001
oe
-4
Figure imgf000877_0001
oe
-4 -4
Figure imgf000878_0001
oe
-4 oe
Figure imgf000879_0001
oe
Figure imgf000880_0001
oe oe o
Figure imgf000881_0001
oe oe
Figure imgf000882_0001
oe oe
Figure imgf000883_0001
oe oe
Figure imgf000884_0001
oe oe
Figure imgf000885_0001
oe oe
Figure imgf000886_0001
oe oe
Figure imgf000887_0001
oe oe
-4
Figure imgf000888_0001
oe oe oe
Figure imgf000889_0001
oe oe
Figure imgf000890_0001
oe o
Figure imgf000891_0001
oe
Figure imgf000892_0001
oe
Figure imgf000893_0001
oe Uι
Figure imgf000894_0001
oe
Figure imgf000895_0001
oe
Figure imgf000896_0001
oe
Figure imgf000897_0001
oe
Figure imgf000898_0001
oe oe
Figure imgf000899_0001
oe
Figure imgf000900_0001
o o
Figure imgf000901_0001
Figure imgf000902_0001
Figure imgf000903_0001
Figure imgf000904_0001
Figure imgf000905_0001
Figure imgf000906_0001
Figure imgf000907_0001
o
-4
Figure imgf000908_0001
o oe
Figure imgf000909_0001
Figure imgf000910_0001
Figure imgf000911_0001
Figure imgf000912_0001
Figure imgf000913_0001
Figure imgf000914_0001
Figure imgf000915_0001
Figure imgf000916_0001
Figure imgf000917_0001
Figure imgf000918_0001
oe
Figure imgf000919_0001
Figure imgf000920_0001
Figure imgf000921_0001
Figure imgf000922_0001
Figure imgf000923_0001
Figure imgf000924_0001
Figure imgf000925_0001
Figure imgf000926_0001
Figure imgf000927_0001
Figure imgf000928_0001
Figure imgf000929_0001
Figure imgf000930_0001
Figure imgf000931_0001
Figure imgf000932_0001
Figure imgf000933_0001
Figure imgf000934_0001
Figure imgf000935_0001
Figure imgf000936_0001
Figure imgf000937_0001
Figure imgf000938_0001
Figure imgf000939_0001
Figure imgf000940_0001
Figure imgf000941_0001
Figure imgf000942_0001
Figure imgf000943_0001
Figure imgf000944_0001
Figure imgf000945_0001
Figure imgf000946_0001
Figure imgf000947_0001
Figure imgf000948_0001
Figure imgf000949_0001
Figure imgf000950_0001
Figure imgf000951_0001
Figure imgf000952_0001
Figure imgf000953_0001
Figure imgf000954_0001
Figure imgf000955_0001
Figure imgf000956_0001
Figure imgf000957_0001
Figure imgf000958_0001
Figure imgf000959_0001
Figure imgf000960_0001
Figure imgf000961_0001
Figure imgf000962_0001
Figure imgf000963_0001
Figure imgf000964_0001
Figure imgf000965_0001
Figure imgf000966_0001
Figure imgf000967_0001
Figure imgf000968_0001
oe
Figure imgf000969_0001
Figure imgf000970_0001
Figure imgf000971_0001
Figure imgf000972_0001
Figure imgf000973_0001
Figure imgf000974_0001
Figure imgf000975_0001
Figure imgf000976_0001
Figure imgf000977_0001
Figure imgf000978_0001
oe
Figure imgf000979_0001
Figure imgf000980_0001
oe ©
Figure imgf000981_0001
Figure imgf000982_0001
Figure imgf000983_0001
oe
Figure imgf000984_0001
Figure imgf000985_0001
oe
Ul
Figure imgf000986_0001
Figure imgf000987_0001
oe
-4
Figure imgf000988_0001
oe oe
Figure imgf000989_0001
Figure imgf000990_0001
Figure imgf000991_0001
Figure imgf000992_0001
Figure imgf000993_0001
Figure imgf000994_0001
Figure imgf000995_0001
Figure imgf000996_0001
Figure imgf000997_0001
Figure imgf000998_0001
oe
Figure imgf000999_0001
Figure imgf001000_0001
© © ©
Figure imgf001001_0001
© ©
Figure imgf001002_0001
© ©
Figure imgf001003_0001
© ©
Ui
Figure imgf001004_0001
© ©
Figure imgf001005_0001
© ©
Ul
Figure imgf001006_0001
© ©
Figure imgf001007_0001
© ©
-4
Figure imgf001008_0001
© © oe
Figure imgf001009_0001
© ©
Figure imgf001010_0001
©
I— '
©
Figure imgf001011_0001
Figure imgf001012_0001
Figure imgf001013_0001
Figure imgf001014_0001
Figure imgf001015_0001
Figure imgf001016_0001
Figure imgf001017_0001
©
I— ' -4
Figure imgf001018_0001
oe
Figure imgf001019_0001
Figure imgf001020_0001
© ©
Figure imgf001021_0001
Figure imgf001022_0001
Figure imgf001023_0001
©
Ui
Figure imgf001024_0001
Figure imgf001025_0001
©
Ul
Figure imgf001026_0001
Figure imgf001027_0001
©
-4
Figure imgf001028_0001
© oe
Figure imgf001029_0001
Figure imgf001030_0001
©
Ui
©
Figure imgf001031_0001
©
Ui
Figure imgf001032_0001
Figure imgf001033_0001
©
Ui Ui
Figure imgf001034_0001
Figure imgf001035_0001
©
Ui
Ul
Figure imgf001036_0001
Figure imgf001037_0001
©
Ui -4
Figure imgf001038_0001
© ( oe
Figure imgf001039_0001
Figure imgf001040_0001
©
4-
©
Figure imgf001041_0001
Figure imgf001042_0001
©
4-
Figure imgf001043_0001
©
4- Ui
Figure imgf001044_0001
Figure imgf001045_0001
©
4- Ul
Figure imgf001046_0001
©
4-
Figure imgf001047_0001
©
4- -4
©
4- oe
Figure imgf001049_0001
Figure imgf001050_0001
©
Ul
©
Figure imgf001051_0001
©
Ul
Figure imgf001052_0001
©
Ul
Figure imgf001053_0001
©
Ul
Ui
Figure imgf001054_0001
©
Ul
Figure imgf001055_0001
©
Ul Ul
Figure imgf001056_0001
©
Ul
Figure imgf001057_0001
©
Ul -4
Figure imgf001058_0001
©
Ul oe
Figure imgf001059_0001
Figure imgf001060_0001
© ©
Figure imgf001061_0001
Figure imgf001062_0001
Figure imgf001063_0001
©
Figure imgf001064_0001
Figure imgf001065_0001
©
Ul
Figure imgf001066_0001
Figure imgf001067_0001
©
-4
Figure imgf001068_0001
© oe
Figure imgf001069_0001
Figure imgf001070_0001
©
-4
©
Figure imgf001071_0001
©
-4
Figure imgf001072_0001
©
-4
Figure imgf001073_0001
©
-4
Ui
Figure imgf001074_0001
©
-4
Figure imgf001075_0001
©
-4 Ul
Figure imgf001076_0001
©
-4
Figure imgf001077_0001
©
-4 -4
Figure imgf001078_0001
©
-4 oe
Figure imgf001079_0001
Figure imgf001080_0001
© oe ©
Figure imgf001081_0001
© oe
Figure imgf001082_0001
© oe
Figure imgf001083_0001
© oe
Figure imgf001084_0001
© oe
4-
Figure imgf001085_0001
© oe
Ul
Figure imgf001086_0001
© oe
Figure imgf001087_0001
© oe
-4
Figure imgf001088_0001
© oe oe
Figure imgf001089_0001
© oe vo
Figure imgf001090_0001
©
©
Figure imgf001091_0001
Figure imgf001092_0001
Figure imgf001093_0001
© Ui
Figure imgf001094_0001
Figure imgf001095_0001
Figure imgf001096_0001
Figure imgf001097_0001
Figure imgf001098_0001
© oe
Figure imgf001099_0001
Figure imgf001100_0001
© ©
Figure imgf001101_0001
Figure imgf001102_0001
Figure imgf001103_0001
© (
Figure imgf001104_0001
Figure imgf001105_0001
©
Ul
Figure imgf001106_0001
Figure imgf001107_0001
©
-4
Figure imgf001108_0001
© oe
Figure imgf001109_0001
Figure imgf001110_0001
Figure imgf001111_0001
Figure imgf001112_0001
Figure imgf001113_0001
Figure imgf001114_0001
Figure imgf001115_0001
Figure imgf001116_0001
Figure imgf001117_0001
Figure imgf001118_0001
Figure imgf001119_0001
Figure imgf001120_0001
Figure imgf001121_0001
Figure imgf001122_0001
Figure imgf001123_0001
Figure imgf001124_0001
Figure imgf001125_0001
Figure imgf001126_0001
Figure imgf001127_0001
Figure imgf001128_0001
Figure imgf001129_0001
Figure imgf001130_0001
Figure imgf001131_0001
Figure imgf001132_0001
Figure imgf001133_0001
u, u,
Figure imgf001134_0001
Figure imgf001135_0001
Figure imgf001136_0001
ON
Figure imgf001137_0001
U -4
Figure imgf001138_0002
Figure imgf001138_0001
Ui oe
Figure imgf001139_0001
Figure imgf001140_0001
Figure imgf001140_0002
Figure imgf001141_0001
Figure imgf001142_0002
Figure imgf001142_0001
Figure imgf001143_0001
Figure imgf001144_0001
Figure imgf001145_0001
Figure imgf001146_0001
Figure imgf001147_0001
Figure imgf001148_0001
Figure imgf001149_0001
Figure imgf001150_0001
Figure imgf001151_0001
Figure imgf001152_0001
Figure imgf001153_0001
Ul
Ui
Figure imgf001154_0001
Figure imgf001155_0001
Figure imgf001156_0001
Figure imgf001157_0001
-4
Figure imgf001158_0001
Ul oe
Figure imgf001159_0001
Figure imgf001160_0001
o
Figure imgf001161_0001
Figure imgf001162_0001
Figure imgf001163_0001
Figure imgf001164_0001
Figure imgf001165_0001
Figure imgf001166_0001
Figure imgf001167_0001
Figure imgf001168_0001
Figure imgf001169_0001
Figure imgf001170_0001
-4
O
Figure imgf001171_0001
Figure imgf001172_0001
-4
Figure imgf001173_0001
-4
Ui
Figure imgf001174_0001
Figure imgf001175_0001
-4 Ul
Figure imgf001176_0001
Figure imgf001177_0001
-4 -4
Figure imgf001178_0001
-4 oe
Figure imgf001179_0001
Figure imgf001180_0001
oe β
Figure imgf001181_0001
Figure imgf001182_0001
Figure imgf001183_0001
oe
Figure imgf001184_0001
Figure imgf001185_0001
oe
Ul
Figure imgf001187_0001
oe
-4
Figure imgf001188_0001
oe oe
Figure imgf001189_0001
oe
Figure imgf001190_0001
Figure imgf001191_0001
Figure imgf001192_0001
Figure imgf001193_0001
Figure imgf001194_0001
Figure imgf001195_0001
Figure imgf001196_0001
Figure imgf001197_0001
Figure imgf001198_0001
Figure imgf001199_0001
Figure imgf001200_0001
© ©
Figure imgf001201_0001
Figure imgf001202_0001
Figure imgf001203_0001
©
Ui
Figure imgf001204_0001
Figure imgf001205_0001
©
Ul
Figure imgf001206_0001
Figure imgf001207_0001
©
-4
Figure imgf001208_0001
© oe
Figure imgf001209_0001
Figure imgf001210_0001
Figure imgf001211_0001
Figure imgf001212_0001
Figure imgf001213_0001
Figure imgf001214_0001
Figure imgf001215_0001
Figure imgf001216_0001
Figure imgf001217_0001
Figure imgf001218_0001
Figure imgf001219_0001
Figure imgf001220_0001
Figure imgf001221_0001
Figure imgf001222_0001
Figure imgf001223_0001
Figure imgf001224_0001
Figure imgf001225_0001
Figure imgf001226_0001
Figure imgf001227_0001
Figure imgf001228_0001
Figure imgf001229_0001
Figure imgf001230_0001
Figure imgf001231_0001
Figure imgf001232_0001
Figure imgf001233_0001
Figure imgf001234_0001
Figure imgf001235_0001
Figure imgf001236_0001
Figure imgf001237_0001
Figure imgf001238_0001
Figure imgf001239_0001
Figure imgf001240_0001
4-
O
Figure imgf001241_0001
Figure imgf001242_0001
-
Figure imgf001243_0001
- i
Figure imgf001244_0001
Figure imgf001245_0001
4- Ul
Figure imgf001246_0001
Figure imgf001247_0001
4- -4
Figure imgf001248_0001
4- oe
Figure imgf001249_0001
Figure imgf001250_0001
Ul
©
Figure imgf001251_0001
Figure imgf001252_0001
Ul
Figure imgf001253_0001
Ul
Ui
Figure imgf001254_0001
Figure imgf001255_0001
Ul Ul
Figure imgf001256_0001
Figure imgf001257_0001
Ul -4
Figure imgf001258_0001
Ul oe
Figure imgf001259_0001
Figure imgf001260_0001
Figure imgf001261_0001
Figure imgf001262_0001
Figure imgf001263_0001
Figure imgf001264_0001
Figure imgf001265_0001
Figure imgf001266_0001
Figure imgf001267_0001
Figure imgf001268_0001
Figure imgf001269_0001
Figure imgf001270_0001
-4
O
Figure imgf001271_0001
Figure imgf001272_0001
Figure imgf001273_0001
-4
Ui
Figure imgf001274_0001
Figure imgf001275_0001
Figure imgf001276_0001
-4
Figure imgf001277_0001
-4 -4
Figure imgf001278_0001
-4 oe
Figure imgf001279_0001
Figure imgf001280_0001
oe o
Figure imgf001281_0001
Figure imgf001282_0001
Figure imgf001283_0001
Figure imgf001284_0001
Figure imgf001285_0001
Figure imgf001286_0001
Figure imgf001287_0001
oe
-4
Figure imgf001288_0001
oe oe
Figure imgf001289_0001
oe
Figure imgf001290_0001
Figure imgf001291_0001
Figure imgf001292_0001
Figure imgf001293_0001
Figure imgf001294_0001
Figure imgf001295_0001
Figure imgf001296_0001
Figure imgf001297_0001
Figure imgf001298_0001
Figure imgf001299_0001
Figure imgf001300_0001
U
© o
Figure imgf001301_0001
Figure imgf001302_0001
Figure imgf001303_0001
Figure imgf001304_0001
Figure imgf001305_0001
Figure imgf001306_0001
Figure imgf001307_0001
Figure imgf001308_0001
Figure imgf001309_0001
Figure imgf001310_0001
Figure imgf001311_0001
Figure imgf001312_0001
Figure imgf001313_0001
Figure imgf001314_0001
Figure imgf001315_0001
Figure imgf001316_0001
Figure imgf001317_0001
Figure imgf001318_0001
Figure imgf001319_0001
Figure imgf001320_0001
Figure imgf001321_0001
Figure imgf001322_0001
Figure imgf001323_0001
Figure imgf001324_0001
Figure imgf001325_0001
Figure imgf001326_0001
Figure imgf001327_0001
U to
-4
Figure imgf001328_0001
Figure imgf001329_0001
Figure imgf001330_0001
Figure imgf001331_0001
Figure imgf001332_0001
Figure imgf001333_0001
Figure imgf001334_0001
Figure imgf001335_0001
Figure imgf001336_0001
Figure imgf001337_0001
Figure imgf001338_0001
Figure imgf001339_0001
Figure imgf001340_0001
Ui 4-
O
Figure imgf001341_0001
Figure imgf001342_0001
Figure imgf001343_0001
- i
Figure imgf001344_0001
Figure imgf001345_0001
U 4- Ul
Figure imgf001346_0001
Figure imgf001347_0001
U 4- -4
Figure imgf001348_0001
Figure imgf001349_0001
Figure imgf001350_0001
U
Ul
©
Figure imgf001351_0001
Figure imgf001352_0001
Figure imgf001353_0001
Figure imgf001354_0001
Figure imgf001355_0001
U
Ul Ul
Figure imgf001356_0001
Figure imgf001357_0001
U
Ul -4
Figure imgf001358_0002
Figure imgf001358_0001
U
Ul
00
Figure imgf001359_0002
Figure imgf001359_0001
Figure imgf001360_0001
Figure imgf001361_0001
Figure imgf001362_0001
Figure imgf001363_0001
Figure imgf001364_0001
Figure imgf001365_0001
Figure imgf001366_0001
Figure imgf001367_0001
Figure imgf001368_0001
Figure imgf001369_0001
Figure imgf001370_0001
U -4
O
Figure imgf001371_0001
Figure imgf001372_0001
Figure imgf001373_0001
Figure imgf001374_0001
Figure imgf001375_0001
Figure imgf001376_0001
Ui -4
Figure imgf001377_0002
Figure imgf001377_0001
U -4 -4
Figure imgf001378_0001
U -4
00
Figure imgf001379_0001
U -4
Figure imgf001380_0001
Figure imgf001380_0002
Figure imgf001381_0001
Figure imgf001382_0001
Figure imgf001383_0001
Ul
00 Ul
Figure imgf001384_0001
Ul
00
Figure imgf001385_0001
Figure imgf001386_0001
Figure imgf001387_0001
Figure imgf001388_0001
Figure imgf001389_0001

Claims

WHAT IS CLAIMED IS:
1. An isolated polynucleotide selected from the group consisting of: a) a nucleotide sequence comprising one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867; b) a fragment of said nucleotide sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence; c) a complementary nucleotide sequence comprising a sequence complementary to one or more of said polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867; and d) a fragment of said complementary nucleotide sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence.
2. The polynucleotide of claim 1, wherein said polynucleotide sequence is DNA.
3. The polynucleotide of claim 1, wherein said polynucleotide sequence is RNA.
4. The polynucleotide of claim 1, wherein said polynucleotide sequence is between about 10 and about 100 nucleotides in length.
5. The polynucleotide of claim 1, wherein said polynucleotide sequence is between about 10 and about 90 nucleotides in length.
6. The polynucleotide of claim 1, wherein said polynucleotide sequence is between about 10 and about 75 nucleotides in length.
7. The polynucleotide of claim 1, wherein said polynucleotide is between about 10 and about 50 bases in length.
8. The polynucleotide of claim 1, wherein said polynucleotide is between about 10 and about 40 bases in length.
9. The polynucleotide of claim 1, wherein said polynucleotide is between about 15 and about 30 bases in length.
10. The polynucleotide of claim 1, wherein said polymorphic site includes a nucleotide other than the nucleotide listed in Table 1, column 5 for said polymorphic sequence.
11. The polynucleotide of claim 1, wherein the complement of said polymorphic site includes a nucleotide other than the complement of the nucleotide listed in Table 1, column 5 for the complement of said polymorphic sequence.
12. The polynucleotide of claim 1, wherein said polymorphic site includes the nucleotide listed in Table 1, column 6 for said polymorphic sequence.
13. The polynucleotide of claim 1, wherein the complement of said polymorphic site includes the complement of the nucleotide listed in Table 1, column 6 for said polymorphic sequence.
14. An isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide at a polymorphic site encompassed therein, wherein the first polynucleotide is selected from the group consisting of: a) a nucleotide sequence comprising one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867 provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence; b) a nucleotide sequence that is a fragment of said polymorphic sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence; c) a complementary nucleotide sequence comprising a sequence complementary to one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867, provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5; and d) a nucleotide sequence that is a fragment of said complementary sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence.
15. The oligonucleotide of claim 14, wherein the oligonucleotide does not hybridize under stringent conditions to a second polynucleotide selected from the group consisting of: a) a nucleotide sequence comprising one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867, wherein said polymorphic sequence includes the nucleotide listed in Table 1, column 5 for said polymorphic sequence; b) a nucleotide sequence that is a fragment of any of said nucleotide sequences; c) a complementary nucleotide sequence comprising a sequence complementary to one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867, wherein said polymorphic sequence includes the complement of the nucleotide listed in Table 1, column 5; and d) a nucleotide sequence that is a fragment of said complementary sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence.
16. The oligonucleotide of claim 15, wherein the oligonucleotide is between about 10 and about 51 bases in length.
17. The oligonucleotide of claim 15, wherein the oligonucleotide is between about 10 and about 40 bases in length.
18. The oligonucleotide of claim 15, wherein the oligonucleotide is between about 15 and about 30 bases in length.
19. A method of detecting a polymorphic site in a nucleic acid, the method comprising: a) contacting said nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5; and b) determining whether said nucleic acid and said oligonucleotide hybridize; whereby hybridization of said oligonucleotide to said nucleic acid sequence indicates the presence of the polymorphic site in said nucleic acid.
20. The method of claim 19, wherein said oligonucleotide does not hybridize to said polymorphic sequence when said polymorphic sequence includes the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for said polymorphic sequence.
21. The method of claim 19, wherein said oligonucleotide is between about 10 and about 51 bases in length.
22. The method of claim 19, wherein said oligonucleotide is between about 10 and about 40 bases in length.
23. A method of detecting the presence of a sequence polymorphism in a subject, the method comprising: a) providing a nucleic acid from said subject; b) contacting said nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5; and c) determining whether said nucleic acid and said oligonucleotide hybridize; whereby hybridization of said oligonucleotide to said nucleic acid sequence indicates the presence of the polymorphism in said subject.
24. A method of determining the relatedness of a first and second nucleic acid, the method comprising: a) providing a first nucleic acid and a second nucleic acid; b) contacting said first nucleic acid and said second nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5; c) determining whether said first nucleic acid and said second nucleic acid hybridize to said oligonucleotide; and d) comparing hybridization of said first and second nucleic acids to said oligonucleotide, wherein hybridization of first and second nucleic acids to said nucleic acid indicates the first and second subjects are related.
25. The method of claim 24, wherein said oligonucleotide does not hybridize to said polymorphic sequence when said polymorphic sequence includes the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for said polymorphic sequence.
26. The method of claim 24, wherein the oligonucleotide is between about 10 and about 51 bases in length.
27. The method of claim 24, wherein the oligonucleotide is between about 10 and about 40 bases in length.
28. The method of claim 24, wherein the oligonucleotide is between about 15 and about 30 bases in length.
29. An isolated polypeptide comprising a polymorphic site at one or more amino acid residues, wherein the protein is encoded by a polynucleotide selected from the group consisting of polymorphic sequences SEQ ID NOS: 1-7867, or their complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
30. The polypeptide of claim 29, wherein said polypeptide is translated in the same open reading frame as is a wild type protein whose amino acid sequence is identical to the amino acid sequence of the polymorphic protein except at the site of the polymorphism.
31. The polypeptide of claim 29, wherein the polypeptide encoded by said polymorphic sequence, or its complement, includes the nucleotide listed in Table 1, column 6 for said polymorphic sequence, or the complement includes the complement of the nucleotide listed in Table 1, column 6.
32. An antibody that binds specifically to a polypeptide encoded by a polynucleotide comprising a nucleotide sequence selected from the group consisting of polymorphic sequences SEQ ID NOS: 1-7867, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for said polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
33. The antibody of claim 32, wherein said antibody binds specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 6 for said polymorphic sequence.
34. The antibody of claim 32, wherein said antibody does not bind specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for said polymorphic sequence.
35. A method of detecting the presence of a polypeptide having one or more amino acid residue polymorphisms in a subject, the method comprising a) providing a protein sample from said subject; b) contacting said sample with the antibody of claim 34 under conditions that allow for the formation of antibody-antigen complexes; and c) detecting said antibody-antigen complexes, whereby the presence of said complexes indicates the presence of said polypeptide.
36. A method of treating a subject suffering from, at risk for, or suspected of, suffering from a pathology ascribed to the presence of a sequence polymorphism in a subject, the method comprising: a) providing a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement; and b) administering to the subject an effective therapeutic dose of a second nucleic acid comprising the polymorphic sequence, provided that the second nucleic acid comprises the nucleotide present in the wild type allele, thereby treating said subject.
37. The method of claim 36, wherein the second nucleic acid sequence comprises a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for said polymorphic sequence.
38. A method of treating a subject suffering from, at risk for, or suspected of, suffering from a pathology ascribed to the presence of a sequence polymorphism in a subject, the method comprising: a) providing a subject suffering from a pathology associated with aberrant expression of a polymorphic sequence selected from the group consisting of polymorphic sequences SEQ ID NOS: 1-7867, or its complement; and b) administering to the subject an effective therapeutic dose of a polypeptide, wherein said polypeptide is encoded by a polynucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or by a polynucleotide comprising a nucleotide sequence that is complementary to any one of polymorphic sequences SEQ ID NOS: 1-7867, provided that said polymorphic sequence includes the nucleotide listed in Table 1, column 6 for said polymorphic sequence.
39. A method of treating a subject suffering from, at risk for, or suspected of suffering from, a pathology ascribed to the presence of a sequence polymorphism in a subject, the method comprising: a) providing a subject suffering from, at risk for, or suspected of suffering from, a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement; and b) administering to the subject an effective dose of the antibody of claim 34, thereby treating said subject.
40. A method of treating a subject suffering from, at risk for, or suspected of suffering from, a pathology ascribed to the presence of a sequence polymorphism in a subject, the method comprising: a) providing a subject suffering from, at risk for, or suspected of suffering from, a pathology associated with aberrant expression of a nucleic acid comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or its complement; and b) administering to the subject an effective dose of an oligonucleotide comprising a polymorphic sequence selected from the group consisting of SEQ ID NOS: 1-7867, or by a polynucleotide comprising a nucleotide sequence that is complementary to any one of polymorphic sequences SEQ ID NOS: 1-7867, provided that said polymorphic sequence includes the nucleotide listed in Table 1, column 5 or Table 1, column 6 for said polymorphic sequence, thereby treating said subject.
41. An oligonucleotide array, comprising one or more oligonucleotides hybridizing to a first polynucleotide at a polymorphic site encompassed therein, wherein the first polynucleotide is chosen from the group consisting of: a) a nucleotide sequence comprising one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867; b) a nucleotide sequence that is a fragment of any of said nucleotide sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence; c) a complementary nucleotide sequence comprising a sequence complementary to one or more polymorphic sequences selected from the group consisting of SEQ ID NOS: 1-7867; and d) a nucleotide sequence that is a fragment of said complementary sequence, provided that the fragment includes a polymorphic site in said polymorphic sequence.
42. The array of claim 41, wherein said array comprises about 10 oligonucleotides.
43. The array of claim 41, wherein said array comprises about 100 oligonucleotides.
44. The array of claim 41, wherein said array comprises about 1000 oligonucleotides.
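
The relationships recited in the claims above (a fragment that includes the polymorphic site, the complement of a polymorphic sequence, and an oligonucleotide that pairs with one allele but not the other) can be illustrated with a minimal sketch. It is not part of the claims and rests on stated assumptions: the sequence, the site position, and the two alleles are invented for illustration and are not taken from Table 1 or from any of SEQ ID NOS: 1-7867; all function names are hypothetical; and exact base complementarity is used only as a crude stand-in for hybridization under stringent conditions, which in practice depends on temperature, salt concentration, and probe length.

```python
# Illustrative sketch only: toy data, hypothetical helper names.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def with_allele(seq: str, site: int, allele: str) -> str:
    """Return seq with the base at 0-based index `site` replaced by `allele`."""
    return seq[:site] + allele + seq[site + 1:]


def fragment_covers_site(start: int, length: int, site: int) -> bool:
    """True if the fragment seq[start:start+length] includes the polymorphic site."""
    return start <= site < start + length


def design_probe(seq: str, site: int, allele: str, length: int = 21) -> str:
    """Build an oligonucleotide complementary to a window around `site`
    that carries `allele` at the polymorphic position."""
    half = length // 2
    start = max(0, site - half)
    end = min(len(seq), start + length)
    start = max(0, end - length)  # shift the window left near the 3' end
    window = with_allele(seq, site, allele)[start:end]
    return reverse_complement(window)


def perfectly_matches(probe: str, target: str) -> bool:
    """Assumed proxy for stringent hybridization: the probe is exactly
    complementary to some stretch of the target strand."""
    return reverse_complement(probe) in target.upper()


# Toy example: an invented 29-base sequence with a polymorphic site at index 14.
REFERENCE = "ACGTTGCAAGGCTTACCGATGGCATTCGA"  # carries 'A' at the site (assumed reference allele)
SITE = 14
VARIANT = with_allele(REFERENCE, SITE, "G")   # assumed alternative allele
PROBE = design_probe(REFERENCE, SITE, "G", length=15)
```

In this toy setup `PROBE` pairs exactly with `VARIANT` and not with `REFERENCE`, mirroring the distinction drawn between claims 14 and 15. The 15-base length was chosen only because it falls inside the "about 10 to about 40 bases" ranges recited above.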
PCT/US2000/035498 1999-12-28 2000-12-28 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof WO2001047944A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU29145/01A AU2914501A (en) 1999-12-28 2000-12-28 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
CA002395926A CA2395926A1 (en) 1999-12-28 2000-12-28 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP00993615A EP1244688A1 (en) 1999-12-28 2000-12-28 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17341999P 1999-12-28 1999-12-28
US60/173,419 1999-12-28

Publications (2)

Publication Number Publication Date
WO2001047944A2 (en) 2001-07-05
WO2001047944A3 (en) 2003-02-20

Family

ID=22631931

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/035498 WO2001047944A2 (en) 1999-12-28 2000-12-28 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof

Country Status (3)

Country Link
AU (1) AU2914501A (en)
CA (1) CA2395926A1 (en)
WO (1) WO2001047944A2 (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003012032A2 (en) * 2001-08-01 2003-02-13 Isis Pharmaceuticals, Inc. Antisense modulation of p70 s6 kinase expression
EP1308522A1 (en) * 2001-11-05 2003-05-07 Haferlach, Torsten, PD Dr. Dr. Novel genetic markers for leukemias
FR2836484A1 (en) * 2002-02-25 2003-08-29 Assist Publ Hopitaux De Paris In vitro detection of tumor cells, in a biological sample, uses a highlight of allelic imbalance in insertion-deletion chromosome markers
US6696247B2 (en) 1998-03-18 2004-02-24 Corixa Corporation Compounds and methods for therapy and diagnosis of lung cancer
US6710170B2 (en) 1999-09-10 2004-03-23 Corixa Corporation Compositions and methods for the therapy and diagnosis of ovarian cancer
WO2004111274A1 (en) * 2003-06-10 2004-12-23 bioMérieux B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of sars coronavirus
US6960570B2 (en) 1998-03-18 2005-11-01 Corixa Corporation Compositions and methods for the therapy and diagnosis of lung cancer
US7235358B2 (en) 2001-06-08 2007-06-26 Expression Diagnostics, Inc. Methods and compositions for diagnosing and monitoring transplant rejection
US7258860B2 (en) 1998-03-18 2007-08-21 Corixa Corporation Compositions and methods for the therapy and diagnosis of lung cancer
EP1492886A4 (en) * 2002-04-03 2007-11-21 Syngenta Participations Ag Detection of wheat and barley fungal pathogens which are resistant to certain fungicides using the polymerase chain reaction
WO2007140625A1 (en) * 2006-06-09 2007-12-13 The University Of British Columbia Interferon gamma polymorphisms as indicators of subject outcome in critically ill subjects
EP1923463A1 (en) * 2005-08-09 2008-05-21 Kumamoto University Cancer-rejection antigen peptide derived from glypican-3 (gpc3) for use in hal-a2-positive patient and pharmaceutical comprising the antigen
US7485297B2 (en) 2003-08-12 2009-02-03 Dyax Corp. Method of inhibition of vascular development using an antibody
US7579160B2 (en) 1998-03-18 2009-08-25 Corixa Corporation Methods for the detection of cervical cancer
US7598051B2 (en) 1999-09-10 2009-10-06 Corixa Corporation Compositions and methods for the therapy and diagnosis of ovarian cancer
US7741468B2 (en) * 2003-01-03 2010-06-22 Shanghai Institutes For Biological Sciences, Chinese Academy Of Sciences Human liver regeneration associated protein and the use thereof
US7871610B2 (en) 2003-08-12 2011-01-18 Dyax Corp. Antibodies to Tie1 ectodomain
US8110364B2 (en) 2001-06-08 2012-02-07 Xdx, Inc. Methods and compositions for diagnosing or monitoring autoimmune and chronic inflammatory diseases
JP2012506407A (en) * 2008-10-22 2012-03-15 ヴェクト−オリュス Peptide derivatives and their use as molecular vectors in the form of conjugates
US8148093B2 (en) 2003-08-15 2012-04-03 Diadexus, Inc. Pro108 antibody compositions and methods of use and use of Pro108 to assess cancer risk
US8283459B2 (en) * 2002-05-30 2012-10-09 Memorial Sloan-Kettering Cancer Center Kinase suppressor of Ras inactivation for therapy of Ras mediated tumorigenesis
JP2013538558A (en) * 2010-07-19 2013-10-17 イエダ リサーチ アンド ディベロップメント カンパニー リミテッド Peptides based on the transmembrane domain of TOLL-like receptor (TLR) for treating TLR-mediated diseases
US9546198B2 (en) 2007-10-12 2017-01-17 Cancer Research Technology Limited Cyclic peptides as ADAM protease inhibitors
US9908922B2 (en) 2015-07-01 2018-03-06 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
WO2019048555A1 (en) * 2017-09-06 2019-03-14 Jaerver Peter Single stranded oligonucleotides inhibiting endocytosis
CN110719790A (en) * 2017-05-24 2020-01-21 加利福尼亚大学董事会 Antisense therapy for the treatment of cancer
US10961305B2 (en) 2016-12-21 2021-03-30 Mereo Biopharma 3 Limited Use of anti-sclerostin antibodies in the treatment of osteogenesis imperfecta
US11230581B2 (en) 2015-07-01 2022-01-25 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992013069A1 (en) * 1991-01-21 1992-08-06 Imperial College Of Science, Technology & Medicine Test and model for alzheimer's disease
US5656477A (en) * 1992-05-01 1997-08-12 American Cyanamid Company Amyloid precursor proteins and method of using same to assess agents which down-regulate formation of β-amyloid peptide
US5795963A (en) * 1992-06-04 1998-08-18 Alzheimer's Institute Of America Amyloid precursor protein in alzheimer's disease
WO1998020165A2 (en) * 1996-11-06 1998-05-14 Whitehead Institute For Biomedical Research Biallelic markers
WO1998038846A2 (en) * 1997-03-07 1998-09-11 Affymetrix, Inc. Genetic compositions and methods

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FAN J ET AL: "Genetic mapping: Finding and analyzing single-nucleotide polymorphisms with high-density DNA arrays" AMERICAN JOURNAL OF HUMAN GENETICS, UNIVERSITY OF CHICAGO PRESS, CHICAGO, US, vol. 61, no. 4, SUPPL, 1 October 1997 (1997-10-01), page 1601 XP002089397 ISSN: 0002-9297 *
WANG D G ET AL: "Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome" SCIENCE, AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE, US, vol. 280, 1998, pages 1077-1082, XP002089398 ISSN: 0036-8075 *

Cited By (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6960570B2 (en) 1998-03-18 2005-11-01 Corixa Corporation Compositions and methods for the therapy and diagnosis of lung cancer
US7585506B2 (en) 1998-03-18 2009-09-08 Corixa Corporation Compositions and methods for the therapy and diagnosis of lung cancer
US7579160B2 (en) 1998-03-18 2009-08-25 Corixa Corporation Methods for the detection of cervical cancer
US6696247B2 (en) 1998-03-18 2004-02-24 Corixa Corporation Compounds and methods for therapy and diagnosis of lung cancer
US7258860B2 (en) 1998-03-18 2007-08-21 Corixa Corporation Compositions and methods for the therapy and diagnosis of lung cancer
US7598051B2 (en) 1999-09-10 2009-10-06 Corixa Corporation Compositions and methods for the therapy and diagnosis of ovarian cancer
US6710170B2 (en) 1999-09-10 2004-03-23 Corixa Corporation Compositions and methods for the therapy and diagnosis of ovarian cancer
US7985843B2 (en) 1999-09-10 2011-07-26 Corixa Corporation Compositions and methods for the therapy and diagnosis of ovarian cancer
US7749505B2 (en) 1999-12-17 2010-07-06 Corixa Corporation Compositions and methods for the therapy and diagnosis of lung cancer
US7235358B2 (en) 2001-06-08 2007-06-26 Expression Diagnostics, Inc. Methods and compositions for diagnosing and monitoring transplant rejection
US8110364B2 (en) 2001-06-08 2012-02-07 Xdx, Inc. Methods and compositions for diagnosing or monitoring autoimmune and chronic inflammatory diseases
WO2003012032A3 (en) * 2001-08-01 2004-08-12 Isis Pharmaceuticals Inc Antisense modulation of p70 s6 kinase expression
WO2003012032A2 (en) * 2001-08-01 2003-02-13 Isis Pharmaceuticals, Inc. Antisense modulation of p70 s6 kinase expression
EP1308522A1 (en) * 2001-11-05 2003-05-07 Haferlach, Torsten, PD Dr. Dr. Novel genetic markers for leukemias
FR2836484A1 (en) * 2002-02-25 2003-08-29 Assist Publ Hopitaux De Paris In vitro detection of tumor cells, in a biological sample, uses a highlight of allelic imbalance in insertion-deletion chromosome markers
EP1492886A4 (en) * 2002-04-03 2007-11-21 Syngenta Participations Ag Detection of wheat and barley fungal pathogens which are resistant to certain fungicides using the polymerase chain reaction
US7691569B2 (en) 2002-04-24 2010-04-06 Xdx, Inc. Methods and compositions for diagnosing and monitoring transplant rejection
US8283459B2 (en) * 2002-05-30 2012-10-09 Memorial Sloan-Kettering Cancer Center Kinase suppressor of Ras inactivation for therapy of Ras mediated tumorigenesis
US7741468B2 (en) * 2003-01-03 2010-06-22 Shanghai Institutes For Biological Sciences, Chinese Academy Of Sciences Human liver regeneration associated protein and the use thereof
US8106172B2 (en) 2003-06-10 2012-01-31 Biomerieux, B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of SARS coronavirus
WO2004111274A1 (en) * 2003-06-10 2004-12-23 bioMérieux B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of sars coronavirus
EP1911853A2 (en) * 2003-06-10 2008-04-16 bioMerieux B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of SARS coronavirus
EP1911853A3 (en) * 2003-06-10 2011-11-16 bioMerieux B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of SARS coronavirus
US7871610B2 (en) 2003-08-12 2011-01-18 Dyax Corp. Antibodies to Tie1 ectodomain
US7485297B2 (en) 2003-08-12 2009-02-03 Dyax Corp. Method of inhibition of vascular development using an antibody
US8148093B2 (en) 2003-08-15 2012-04-03 Diadexus, Inc. Pro108 antibody compositions and methods of use and use of Pro108 to assess cancer risk
EP1923463A1 (en) * 2005-08-09 2008-05-21 Kumamoto University Cancer-rejection antigen peptide derived from glypican-3 (gpc3) for use in hal-a2-positive patient and pharmaceutical comprising the antigen
US8053556B2 (en) 2005-08-09 2011-11-08 Oncotherapy Science, Inc. Glypican-3 (GPC3)-derived tumor rejection antigenic peptides useful for HLA-A2-positive patients and pharmaceutical comprising the same
EP1923463A4 (en) * 2005-08-09 2009-09-09 Onco Therapy Science Inc Cancer-rejection antigen peptide derived from glypican-3 (gpc3) for use in hal-a2-positive patient and pharmaceutical comprising the antigen
CN101313063B (en) * 2005-08-09 2013-04-03 肿瘤疗法·科学股份有限公司 Cancer-rejection antigen peptide derived from glypican-3 (GPC3) for use in HLA-A2-positive patient and pharmaceutical comprising the antigen
US8535942B2 (en) 2005-08-09 2013-09-17 Oncotherapy Science, Inc. Glypican-3 (GPC3)-derived tumor rejection antigenic peptides useful for HLA-A2-positive patients and pharmaceutical comprising the same
CN104877008B (en) * 2005-08-09 2018-06-05 肿瘤疗法·科学股份有限公司 The cancer rejection antigen peptide from GPC3 for HLA-A2 Positive Populations and the drug containing the peptide
CN104877008A (en) * 2005-08-09 2015-09-02 肿瘤疗法·科学股份有限公司 Cancer-rejection antigen peptide derived from glypican-3 GPC3 for use in HAL-A2-positive patient and pharmaceutical comprising the antigen
WO2007140625A1 (en) * 2006-06-09 2007-12-13 The University Of British Columbia Interferon gamma polymorphisms as indicators of subject outcome in critically ill subjects
EP2197898B1 (en) * 2007-10-12 2018-05-30 Cancer Research Technology Limited Protease inhibition
US10472393B2 (en) 2007-10-12 2019-11-12 Cancer Research Technology Limited Method for inhibiting ADAM proteases with cyclic peptides
US9546198B2 (en) 2007-10-12 2017-01-17 Cancer Research Technology Limited Cyclic peptides as ADAM protease inhibitors
JP2015212264A (en) * 2008-10-22 2015-11-26 ヴェクト−オリュスVect−Horus Peptide derivatives and use thereof as vectors for molecules in form of conjugates
JP2012506407A (en) * 2008-10-22 2012-03-15 ヴェクト−オリュス Peptide derivatives and their use as molecular vectors in the form of conjugates
US9890202B2 (en) 2010-07-19 2018-02-13 Yeda Research And Development Co. Ltd. Peptides based on the transmembrane domain of a toll-like receptor (TLR) for treatment of TLR-mediated diseases
JP2013538558A (en) * 2010-07-19 2013-10-17 イエダ リサーチ アンド ディベロップメント カンパニー リミテッド Peptides based on the transmembrane domain of TOLL-like receptor (TLR) for treating TLR-mediated diseases
US11384127B2 (en) 2015-07-01 2022-07-12 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US12071458B2 (en) 2015-07-01 2024-08-27 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10738094B2 (en) 2015-07-01 2020-08-11 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10227388B2 (en) 2015-07-01 2019-03-12 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against non-small cell lung cancer and other cancers
US11912749B2 (en) 2015-07-01 2024-02-27 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10239925B2 (en) 2015-07-01 2019-03-26 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10253076B2 (en) 2015-07-01 2019-04-09 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10280205B2 (en) 2015-07-01 2019-05-07 Immatics Biotechnologies Gmbh Method for treating cancer with an activated T lymphocyte that selectively recognizes a cancer cell consisting of specific peptide
US10464978B2 (en) 2015-07-01 2019-11-05 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10472401B2 (en) 2015-07-01 2019-11-12 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10000539B2 (en) 2015-07-01 2018-06-19 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10494413B2 (en) 2015-07-01 2019-12-03 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10533041B2 (en) 2015-07-01 2020-01-14 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US11912748B2 (en) 2015-07-01 2024-02-27 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10174089B2 (en) 2015-07-01 2019-01-08 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10934333B2 (en) 2015-07-01 2021-03-02 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10047131B2 (en) 2015-07-01 2018-08-14 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US11485765B2 (en) 2015-07-01 2022-11-01 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US11401310B2 (en) 2015-07-01 2022-08-02 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US11230581B2 (en) 2015-07-01 2022-01-25 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US11384128B2 (en) 2015-07-01 2022-07-12 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US9908922B2 (en) 2015-07-01 2018-03-06 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US11384129B2 (en) 2015-07-01 2022-07-12 Immatics Biotechnologies Gmbh Peptides and combination of peptides for use in immunotherapy against ovarian cancer and other cancers
US10961305B2 (en) 2016-12-21 2021-03-30 Mereo Biopharma 3 Limited Use of anti-sclerostin antibodies in the treatment of osteogenesis imperfecta
EP3634494A4 (en) * 2017-05-24 2021-03-17 The Regents of the University of California Antisense therapies for treating cancer
US11643656B2 (en) 2017-05-24 2023-05-09 The Regents Of The University Of California Antisense therapies for treating cancer
CN110719790A (en) * 2017-05-24 2020-01-21 加利福尼亚大学董事会 Antisense therapy for the treatment of cancer
US11345916B2 (en) 2017-09-06 2022-05-31 Tirmed Pharma Ab Single stranded oligonucleotides inhibiting endocytosis
WO2019048555A1 (en) * 2017-09-06 2019-03-14 Jaerver Peter Single stranded oligonucleotides inhibiting endocytosis
AU2018330372B2 (en) * 2017-09-06 2024-05-23 Tirmed Pharma Ab Single stranded oligonucleotides inhibiting endocytosis
CN111742047A (en) * 2017-09-06 2020-10-02 蒂尔梅德制药有限公司 Single-stranded oligonucleotides that inhibit endocytosis

Also Published As

Publication number Publication date
AU2914501A (en) 2001-07-09
CA2395926A1 (en) 2001-07-05
WO2001047944A3 (en) 2003-02-20

Similar Documents

Publication Publication Date Title
WO2001047944A2 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
WO2000029623A2 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP0812922A2 (en) Polymorphisms in human mitochondrial nucleic acid
WO2001048245A2 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
AU1940001A (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US6833240B2 (en) Very low density lipoprotein receptor polymorphisms and uses therefor
AU1915700A (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP1250456A2 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US20040235041A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
AU2004203849B2 (en) Nucleic Acids Containing Single Nucleotide Polymorphisms and Methods of Use Thereof
US20040235026A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP1244688A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US20030224413A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US20030232365A1 (en) BDNF polymorphisms and association with bipolar disorder
WO2001047942A2 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US20030009016A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US7339049B1 (en) Polymorphisms in human mitochondrial DNA
WO2001090161A2 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
WO2002054939A2 (en) Methods and compositions for diagnosing and treating neuropsychiatric disorders such as schizophrenia
US6913885B2 (en) Association of dopamine beta-hydroxylase polymorphisms with bipolar disorder
WO2003087309A2 (en) Bdnf polymorphisms and association with bipolar disorder

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2395926

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2000993615

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 29145/01

Country of ref document: AU

WWP Wipo information: published in national office

Ref document number: 2000993615

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Ref document number: 2000993615

Country of ref document: EP

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)