WO2016009006A1 - Tobacco protease genes - Google Patents

Publication number
WO2016009006A1
Authority
WO
WIPO (PCT)
Prior art keywords
plant
tobacco
seq
mutant
expression
Application number
PCT/EP2015/066341
Other languages
French (fr)
Inventor
Lucien Bovet
Dion FLORACK
James BATTEY
Original Assignee
Philip Morris Products S.A
Priority to RU2017105148A (publication RU2756102C2)
Priority to KR1020177001642A (publication KR20170032317A)
Priority to CN201580038165.0A (publication CN106661556A)
Priority to BR112017000932A (publication BR112017000932A2)
Priority to MX2017000834A (publication MX2017000834A)
Priority to CA2954828A (publication CA2954828A1)
Application filed by Philip Morris Products S.A
Priority to AP2017009676A (publication AP2017009676A0)
Priority to JP2017502853A (publication JP2017529063A)
Priority to EP15738907.3A (publication EP3169149B1)
Priority to US15/325,997 (publication US20170265516A1)
Publication of WO2016009006A1
Priority to PH12016502546A (publication PH12016502546A1)

Classifications

    • AHUMAN NECESSITIES
    • A24TOBACCO; CIGARS; CIGARETTES; SIMULATED SMOKING DEVICES; SMOKERS' REQUISITES
    • A24BMANUFACTURE OR PREPARATION OF TOBACCO FOR SMOKING OR CHEWING; TOBACCO; SNUFF
    • A24B3/00Preparing tobacco in the factory
    • A24B3/12Steaming, curing, or flavouring tobacco
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8213Targeted insertion of genes into the plant genome by homologous recombination
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10Transferases (2.)
    • C12N9/1025Acyltransferases (2.3)
    • C12N9/1029Acyltransferases (2.3) transferring groups other than amino-acyl groups (2.3.1)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10Transferases (2.)
    • C12N9/1025Acyltransferases (2.3)
    • C12N9/104Aminoacyltransferases (2.3.2)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/48Hydrolases (3) acting on peptide bonds (3.4)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/48Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/63Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from plants
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y203/00Acyltransferases (2.3)
    • C12Y203/01Acyltransferases (2.3) transferring groups other than amino-acyl groups (2.3.1)
    • C12Y203/01091Sinapoylglucose--choline O-sinapoyltransferase (2.3.1.91)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y203/00Acyltransferases (2.3)
    • C12Y203/02Aminoacyltransferases (2.3.2)
    • C12Y203/02002Gamma-glutamyltransferase (2.3.2.2)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y304/00Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21Serine endopeptidases (3.4.21)
    • C12Y304/21107Peptidase Do (3.4.21.107)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y304/00Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21Serine endopeptidases (3.4.21)
    • C12Y304/21112Site-1 protease (3.4.21.112), i.e. subtilisin kexin isozyme-1
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y304/00Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/22Cysteine endopeptidases (3.4.22)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y304/00Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/22Cysteine endopeptidases (3.4.22)
    • C12Y304/22002Papain (3.4.22.2)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y304/00Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/23Aspartic endopeptidases (3.4.23)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y304/00Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/24Metalloendopeptidases (3.4.24)
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
    • Y02A40/146Genetically Modified [GMO] plants, e.g. transgenic plants

Definitions

  • the present invention concerns the use of proteases expressed in tobacco to alter the characteristics of cured tobacco products.
  • the invention provides processes for altering the curing of tobacco leaf and modulating tobacco leaf composition by modulating the expression of one or more tobacco protease genes.
  • Tobacco curing is a process of physical and biochemical changes that bring out the aroma and flavor of each variety of tobacco. After tobacco has been harvested, it is necessary to cure it and then age it before consumption, to improve its flavour. There are four common methods of curing, and the method used depends on the type of tobacco and its intended use.
  • Air-cured tobacco is sheltered from wind and sun in a well-ventilated chamber, where it air-dries for six to eight weeks. Air-cured tobacco is low in sugar, which gives the tobacco smoke a light, sweet flavor, and high in nicotine. Cigar and burley tobaccos are air cured. In fire curing, smoke from a low-burning fire permeates the leaves. This gives the leaves a distinctive smoky aroma and flavour. Fire curing takes three to ten weeks and produces a tobacco low in sugar and high in nicotine. Pipe tobacco, chewing tobacco, and snuff are fire cured.
  • Flue-cured tobacco is kept in an enclosed heated area, but it is not directly exposed to smoke. This method produces cigarette tobacco that is high in sugar and has medium to high levels of nicotine. It is the fastest method of curing, requiring about a week. Virginia tobacco that has been flue cured is also called bright tobacco, because flue curing turns its leaves gold, orange, or yellow.
  • Sun-cured tobacco dries uncovered in the sun. This method is used in Turkey, Greece and other Mediterranean countries to produce oriental tobacco. Sun-cured tobacco is low in sugar and nicotine and is used in cigarettes.
  • Curing produces various compounds in the tobacco leaves that give cured tobacco its specific flavour and taste, such as for example a sweet hay, tea, rose oil, or fruity aromatic flavor.
  • the chlorophyll content is reduced.
  • This phase takes between 2 and 8 days depending on the tobacco type.
  • leaf metabolic activities are drastically changed. Not only is chlorophyll degraded but also, for example, starch and proteins.
  • the only methods for altering the curing process which have been proposed are based on altering the actual conditions to which the tobacco is exposed in the chosen curing procedure. Very little is known about gene expression in tobacco during curing, and moreover few data have been reported on the activities of proteases in tobacco leaf and their resulting products.
  • protease genes that are activated during leaf curing in the three main tobacco types, Burley, Virginia and Oriental. We have found that specific protease expression is associated with particular flavour profiles in tobacco.
  • protease genes SEQ ID NO: 1-80 were identified that are up-regulated in Burley tobacco upon air curing, Virginia tobacco upon flue curing and Oriental tobacco upon sun curing. Details on such up-regulation in one or more of the different tobacco types are summarised in Figure 2 and Tables 1 and 2.
  • polynucleotide sequences SEQ ID NO: 1-80 include exon and intron sequences.
  • the protein sequences relating to the coding sequence part of the polynucleotide sequences SEQ ID NO: 1-80 are depicted in SEQ ID NO: 81-160.
  • mutant, non-naturally occurring or transgenic tobacco plant cell comprising:
  • a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to any one of SEQ ID NO:1 to SEQ ID No: 80;
  • polypeptide comprising, consisting or consisting essentially of a sequence encoding a protease and having at least 95% sequence identity to SEQ ID NO:81 to SEQ ID No: 160; or
  • the expression or activity of said protease is upregulated compared to the control tobacco plant cell.
  • the expression or activity of said protease is downregulated compared to the control tobacco plant cell.
  • at least one protease can be upregulated at the same time as at least one protease is downregulated in the same cell.
  • a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4 wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 30 to 41 is modulated in an Oriental type tobacco.
  • a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4 wherein the expression or activity of a protease selected from SEQ ID NO: 17 to 22 is modulated in a Virginia type tobacco.
  • a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4 wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 42 to 44 is modulated in a Burley type tobacco.
  • a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4 wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 45 to 61 is modulated in a Virginia or Oriental type tobacco.
  • a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4 wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 23 to 29 is modulated in a Burley or Virginia type tobacco.
  • the mutant, non-naturally occurring or transgenic tobacco plant cell can be a tobacco plant cell wherein said mutation(s) is a heterozygous or homozygous mutation.
  • the expression of the one or more proteases is increased by about 10% to about 1000%, for example by at least 10%, at least 20%, at least 25%, at least 50%, at least 100%, at least 200%, at least 500%, at least 750% or up to 1000%.
  • plant material including biomass, seed, stem, flowers or leaves from the plant of the second aspect of the invention.
  • a method for preparing a tobacco plant with modulated levels of protease comprising the steps of:
  • the tobacco plant in step (b) is a mutant tobacco plant, preferably, wherein said mutant tobacco plant comprises one or more mutations in one or more further sequence encoding a functional protease and having at least 95% sequence identity to at least one of SEQ ID NO:1 to SEQ ID No: 80.
  • a plant can be constructed in which one or more cells comprise multiple mutated proteases.
  • the mutated cells comprising modulated protease expression or activity impart a different flavour profile to tobacco leaf during the curing process.
  • the genome of a cell of a tobacco plant is modified by a genome editing technology or by genome engineering techniques selected from CRISPR/Cas technology, zinc finger nuclease-mediated mutagenesis, chemical or radiation mutagenesis, homologous recombination, oligonucleotide-directed mutagenesis and meganuclease-mediated mutagenesis.
  • a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to any one of SEQ ID NO:1 to SEQ ID No: 80;
  • polypeptide comprising, consisting or consisting essentially of a sequence encoding a protease and having at least 95% sequence identity to SEQ ID NO:81 to SEQ ID No: 160; or
  • the curing procedure according to this aspect of the invention can be selected from the group consisting of air curing, fire curing, smoke curing and flue curing.
  • Modification or modulation of protease activity during curing can be through (further) up-regulation or down-regulation. Modification or modulation can be through genetic engineering using, for example, certain promoter sequences that are (at least) active during such curing. Modulation can also be through, for example, mutagenesis as claimed above of such sequences and/or their regulatory region, resulting in either up- or down-regulation, or complete knock-out, of the protease activity encoded thereby under the respective curing conditions.
  • certain gene sequences are only up-regulated in one or two of the three tobacco types (as defined according to tobacco type and curing method)
  • certain gene sequences can potentially be used to modify or modulate protease activity during curing such that the outcome with respect to leaf chemistry (for example the metabolite content of the cell) and the properties of the obtained tobacco leaf cell are changed, such that for example an air-cured Burley tobacco acquires certain characteristics of a flue-cured Virginia-type tobacco or sun-cured Oriental tobacco upon curing.
  • This for example can be done by modulating the expression of one or more of the gene sequences that are up-regulated in one or two of tobacco types and not in the other tobacco.
  • 17 gene sequences are uniquely up-regulated in air-cured Burley, 19 in flue-cured Virginia, and 12 in both types of tobacco during curing.
  • By selectively modulating one or more of the 17 gene sequences that are only up-regulated in air-cured Burley, now in flue-cured Virginia, the leaf cell composition of the flue-cured Virginia tobacco upon curing can be altered towards a more Burley type.
  • this can be achieved using for example a promoter sequence that is active under the curing conditions of the targeted tobacco type.
  • Promoter sequences of use therefore are for example the regulatory sequences driving the expression of the gene sequences listed here.
  • the mutated gene sequence can be active under the curing conditions of the targeted tobacco type.
  • a regulatory sequence is mutated such that the gene sequence downstream is active under the desired curing conditions. For example, by selectively modifying or modulating the expression of one or more of the 19 sequences that are uniquely up-regulated in flue-cured Virginia in an air-cured Burley type of tobacco, the leaf cell composition of the Burley type tobacco upon curing can be altered towards a more Virginia type. Also, by selectively modulating the expression of one or more of the 12 sequences that are up-regulated both in air-cured Burley and flue-cured Virginia, in a sun-cured Oriental tobacco, the leaf cell composition of the sun-cured Oriental tobacco upon curing can be altered such that it acquires Burley and Virginia characteristics.
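The grouping described above (sequences up-regulated only in one tobacco type versus in both) is ordinary set arithmetic. The following is a minimal sketch under stated assumptions: the function name `partition_upregulated` and the gene identifiers are hypothetical placeholders for illustration, and they are not the SEQ ID assignments reported in the source.

```python
from typing import Dict, Set


def partition_upregulated(burley: Set[str], virginia: Set[str]) -> Dict[str, Set[str]]:
    """Split two sets of up-regulated gene identifiers into
    type-specific and shared groups."""
    return {
        "burley_only": burley - virginia,    # up-regulated only in air-cured Burley
        "virginia_only": virginia - burley,  # up-regulated only in flue-cured Virginia
        "shared": burley & virginia,         # up-regulated in both types
    }


# Hypothetical placeholder identifiers, NOT the patent's SEQ ID lists.
burley_up = {"gene_a", "gene_b", "gene_c"}
virginia_up = {"gene_c", "gene_d"}

groups = partition_upregulated(burley_up, virginia_up)
for name in sorted(groups):
    print(name, sorted(groups[name]))
```

Under this sketch, candidates for shifting one type towards another would be drawn from the other type's "only" group, and candidates for transfer into a third type from the "shared" group.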
  • one of the gene sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of the listed sequences is up-regulated. In another embodiment more than one of the gene sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such sequences, are up-regulated. In another embodiment, one or more of the gene sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such listed sequences, are down-regulated.
  • sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such listed sequences are up-regulated, and one or more sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such listed sequences are down-regulated.
  • the invention also provides tobacco leaves and products comprising such leaves, obtained according to the methods claimed above.
  • Such products include but are not limited to chewing tobacco, tobacco sticks, extracts obtained therefrom and other smoking articles comprising such leaf material or a material derived therefrom.
  • isolated refers to any entity that is taken from its natural milieu, but the term does not connote any degree of purification.
  • An "expression vector” is a nucleic acid vehicle that comprises a combination of nucleic acid components for enabling the expression of nucleic acid. Suitable expression vectors include episomes capable of extra-chromosomal replication such as circular, double-stranded nucleic acid plasmids; linearized double-stranded nucleic acid plasmids; and other functionally equivalent expression vectors of any origin.
  • An expression vector comprises at least a promoter positioned upstream and operably-linked to a nucleic acid, nucleic acid constructs or nucleic acid conjugate, as defined below.
  • construct refers to a double-stranded, recombinant nucleic acid fragment comprising one or more polynucleotides.
  • the construct comprises a "template strand" base-paired with a complementary "sense or coding strand."
  • a given construct can be inserted into a vector in two possible orientations, either in the same (or sense) orientation or in the reverse (or anti-sense) orientation with respect to the orientation of a promoter positioned within a vector - such as an expression vector.
  • a "vector" refers to a nucleic acid vehicle that comprises a combination of nucleic acid components for enabling the transport of nucleic acid, nucleic acid constructs and nucleic acid conjugates and the like. Suitable vectors include episomes capable of extra-chromosomal replication such as circular, double-stranded nucleic acid plasmids; linearized double-stranded nucleic acid plasmids; and other vectors of any origin.
  • a “promoter” refers to a nucleic acid element/sequence, typically positioned upstream and operably-linked to a double-stranded DNA fragment. Promoters can be derived entirely from regions proximate to a native gene of interest, or can be composed of different elements derived from different native promoters or synthetic DNA segments.
  • the terms "homology, identity or similarity” refer to the degree of sequence similarity between two polypeptides or between two nucleic acid molecules compared by sequence alignment.
  • the degree of homology between two discrete nucleic acid sequences being compared is a function of the number of identical, or matching, nucleotides at comparable positions.
  • the percent identity may be determined by visual inspection and mathematical calculation. Alternatively, the percent identity of two nucleic acid sequences may be determined by comparing sequence information using a computer program such as ClustalW, BLAST, FASTA or Smith-Waterman.
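The "mathematical calculation" of percent identity can be illustrated for two sequences that have already been aligned. This is a minimal sketch, not the method of any of the named programs: the function names are hypothetical, the gap character `-` is an assumption, and so is the choice to count only columns where neither sequence has a gap (alignment tools differ in how they define the denominator).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of matching residues over aligned columns without gaps.

    Assumes the two inputs are rows of an existing pairwise alignment,
    so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the pair reaches the given percent identity (95% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACGT", "ACGA"))
print(meets_identity_threshold("ACGT", "ACGA"))
```

A different denominator (for example, the full alignment length including gaps) would give lower values for gapped alignments, so the convention should be fixed before comparing against a threshold such as 95%.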
  • a “variant” means a substantially similar sequence.
  • a variant can have a similar function or substantially similar function as a wild-type sequence.
  • a similar function is at least about 50%, 60%, 70%, 80% or 90% of wild-type enzyme function under the same conditions.
  • a substantially similar function is at least about 90%, 95%, 96%, 97%, 98% or 99% of wild-type enzyme function under the same conditions.
  • wild-type protease sequences are set forth in SEQ ID Nos: 81-160.
  • the variants can have one or more mutations that result in the enzyme having a reduced level of protease activity as compared to the wild-type protease.
  • the variants can have one or more mutations that result in their protease activity being knocked out (i.e. a 100% inhibition, and thus a nonfunctional polypeptide). Variants can also have increased activity, leading to a more active protease enzyme function.
  • plant refers to any plant or part of a plant at any stage of its life cycle or development, and its progenies.
  • the plant is a "tobacco plant”, which refers to a plant belonging to the genus Nicotiana. Preferred species of tobacco plant are described herein.
  • Plant parts include plant cells, plant protoplasts, plant cell tissue cultures from which a whole plant can be regenerated, plant calli, plant clumps and plant cells that are intact in plants or parts of plants such as embryos, pollen, anthers, ovules, seeds, leaves, flowers, stems, branches, fruit, roots, root tips and the like. Progeny, variants and mutants of regenerated plants are also included within the scope of the disclosure, provided that they comprise the introduced polynucleotides described herein.
  • a "plant cell” refers to a structural and physiological unit of a plant.
  • the plant cell may be in the form of a protoplast without a cell wall, an isolated single cell or a cultured cell, or as a part of higher organized unit such as but not limited to, plant tissue, a plant organ, or a whole plant.
  • plant material refers to any solid, liquid or gaseous composition, or a combination thereof, obtainable from a plant, including biomass, leaves, stems, roots, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, cuttings, secretions, extracts, cell or tissue cultures, or any other parts or products of a plant.
  • the plant material comprises or consists of biomass, stem, seed or leaves.
  • the plant material comprises or consists of leaves.
  • variety refers to a population of plants that share constant characteristics which separate them from other plants of the same species. While possessing one or more distinctive traits, a variety is further characterized by a very small overall variation between individuals within that variety. A variety is often sold commercially.
  • a "type” of tobacco is defined by origin and curing method. Flue-cured tobacco, which accounts for 40% of global production, is also known as “Bright” and “Virginia” tobacco. It is used almost entirely in cigarette blends. Some of the heavier leaves may be used in mixtures for pipe smoking. Some English cigarettes are 100% flue-cured. Flue-cured leaf is characterized by a high sugar: nitrogen ratio. This ratio is enhanced by the picking of the leaf in an advanced stage of ripeness, and by the unique curing process which allows certain chemical changes to occur in the leaf. Cured leaves vary from lemon to orange to mahogany in colour.
  • Burley is a light air-cured type derived from the White Burley, which arose as a mutant on a farm in Ohio in 1864. Burley is used primarily in cigarette blends. Some of the heavier leaf is used in pipe blends and also for chewing.
  • Cured burley leaf is characterized by low sugar content and a very low sugar to nitrogen ratio (high nicotine). This is enhanced by high nitrogen fertilizer, harvesting at an early stage of senescence, and the air curing process which allows oxidation of any sugars which may have occurred.
  • Maryland is another light air-cured type. It is used to some extent in American blended cigarettes and to a greater extent in certain Swiss cigarette blends.
  • Maryland tobacco is extremely fluffy, has good burning properties, low nicotine, and neutral aroma.
  • Dark air-cured tobacco encompasses a number of types used mainly for chewing, snuff, cigar, and pipe blends. Most of the world production is confined to the tropics.
  • Oriental tobacco gives a mild smoke with very characteristic aroma. Resins, waxes and gum exuded by glandular hairs (trichomes) furnish the aroma. Nicotine is low averaging around
  • Dark-fired tobacco is used in the production of snuff, chewing tobacco, and pipe blends. Dark-fired leaves are subjected to smoke from smoldering wood during the early stage of curing. The type of wood used is very important in determining taste and grown. Cured leaves are very dark in color and are long and heavy bodied.
  • modulating may refer to reducing, inhibiting, increasing or otherwise affecting the expression or activity of a polypeptide.
  • the term may also refer to reducing, inhibiting, increasing or otherwise affecting the activity of a gene encoding a polypeptide which can include, but is not limited to, modulating transcriptional activity.
  • reduce refers to a reduction of from about 10% to about 99%, or a reduction of at least 10%, at least 20%, at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, or at least 100% or more of a quantity or an activity, such as but not limited to polypeptide activity, transcriptional activity and protein expression.
  • inhibit refers to a reduction of from about 98% to about 100%, or a reduction of at least 98%, at least 99%, but particularly of 100%, of a quantity or an activity, such as but not limited to polypeptide activity, transcriptional activity and protein expression.
  • increase refers to an increase of from about 5% to about 99%, or an increase of at least 5%, at least 10%, at least 20%, at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, at least 100%, at least 500% or at least 1000% or more of a quantity or an activity, such as but not limited to polypeptide activity, transcriptional activity and protein expression.
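The three definitions above can be read as thresholds on a percent change relative to a control value. The following is a minimal sketch under that reading, not a method taken from the specification: the function names, the "unchanged" label, and the choice to report the most specific term (so a 99% reduction is labelled "inhibit" rather than "reduce") are all assumptions made for illustration, and the word "about" in the definitions is treated as a hard cut-off.

```python
def percent_change(control: float, subject: float) -> float:
    """Signed percent change of the subject value relative to the control value."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (subject - control) / control * 100.0


def classify_change(control: float, subject: float) -> str:
    """Label a change using the thresholds quoted in the definitions.

    Assumed cut-offs: >= 98% reduction -> "inhibit", >= 10% reduction -> "reduce",
    >= 5% increase -> "increase"; anything else gets the hypothetical label "unchanged".
    """
    change = percent_change(control, subject)
    if change <= -98.0:
        return "inhibit"
    if change <= -10.0:
        return "reduce"
    if change >= 5.0:
        return "increase"
    return "unchanged"
```

For example, a protease activity of 50 units in a subject plant against 100 units in the control would be labelled "reduce" under these assumptions.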
  • control in the context of a control plant means a plant or plant cell in which the expression or activity of an enzyme has not been modified (for example, increased or reduced) and so it can provide a comparison with a plant in which the expression or activity of the enzyme has been modified.
  • the control plant may comprise an empty vector.
  • the control plant or plant cell may correspond to a wild-type plant or wild-type plant cell.
  • the control plant or plant cell can be the same genotype as the starting material for the genetic alteration that resulted in the subject plant. In all such cases, the subject plant and the control plant are cultured and harvested using the same protocols for comparative purposes.
  • Changes in levels, ratios, activity, or distribution of the genes or polypeptides described herein, or changes in tobacco plant phenotype, particularly reduced production of proteases can be measured by comparing a subject plant to the control plant, where the subject plant and the control plant have been cultured, harvested and cured using the same protocols.
  • the control plant can provide a reference point for measuring changes in phenotype of the subject plant.
  • changes in phenotype can be measured at any time in a plant, including during plant development, senescence, or preferably after curing.
  • Changes in phenotype can be measured in plants grown under any conditions, including plants grown in a growth chamber, a greenhouse, or a field. Changes in phenotype can be measured by determining the expression or activity of the proteases identified herein in SEQ ID NOs: 81-160.
  • an isolated polynucleotide comprising, consisting or consisting essentially of a polynucleotide sequence having at least 95% sequence identity to any of the sequences described herein, including any of the polynucleotides shown in the sequence listing.
  • the isolated polynucleotide comprises, consists or consists essentially of a sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity thereto.
  • the polynucleotide(s) described herein encode a protein with protease activity that is at least about 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, 100% or more of the activity of the protein set forth in SEQ ID NOs: 81-160.
  • a polynucleotide as described herein can include a polymer of nucleotides, which may be unmodified or modified deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Accordingly, a polynucleotide can be, without limitation, a genomic DNA, complementary DNA (cDNA), mRNA, or antisense RNA or a fragment(s) thereof.
  • a polynucleotide can be single-stranded or double-stranded DNA, DNA that is a mixture of single-stranded and double-stranded regions, a hybrid molecule comprising DNA and RNA, or a hybrid molecule with a mixture of single-stranded and double-stranded regions or a fragment(s) thereof.
  • the polynucleotide can be composed of triple-stranded regions comprising DNA, RNA, or both or a fragment(s) thereof.
  • a polynucleotide can contain one or more modified bases, such as phosphothioates, and can be a peptide nucleic acid.
  • polynucleotides can be assembled from isolated or cloned fragments of cDNA, genomic DNA, oligonucleotides, or individual nucleotides, or a combination of the foregoing.
  • sequences described herein are shown as DNA sequences, the sequences include their corresponding RNA sequences, and their complementary (for example, completely complementary) DNA or RNA sequences, including the reverse complements thereof.
  • a polynucleotide as described herein will generally contain phosphodiester bonds, although in some cases, polynucleotide analogues are included that may have alternate backbones, comprising, for example, phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages; and peptide polynucleotide backbones and linkages.
  • polynucleotide analogues include those with positive backbones; non-ionic backbones, and non-ribose backbones.
  • Modifications of the ribose-phosphate backbone may be done for a variety of reasons, for example, to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring polynucleotides and analogues can be made; alternatively, mixtures of different polynucleotide analogues, and mixtures of naturally occurring polynucleotides and analogues may be made.
  • polynucleotide analogues including, for example, phosphoramidate, phosphorothioate, phosphorodithioate, O-methylphosphoroamidite linkages and peptide polynucleotide backbones and linkages.
  • Other analogue polynucleotides include those with positive backbones, non-ionic backbones and non-ribose backbones.
  • Polynucleotides containing one or more carbocyclic sugars are also included.
  • analogues include peptide polynucleotides which are peptide polynucleotide analogues. These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring polynucleotides. This may result in advantages.
  • the peptide polynucleotide backbone may exhibit improved hybridization kinetics.
  • Peptide polynucleotides have larger changes in the melting temperature for mismatched versus perfectly matched base pairs. DNA and RNA typically exhibit a 2-4 °C drop in melting temperature for an internal mismatch. With the non-ionic peptide polynucleotide backbone, the drop is closer to 7-9 °C.
  • peptide polynucleotides may not be degraded or degraded to a lesser extent by cellular enzymes, and thus may be more stable.
  • fragments as probes in nucleic acid hybridisation assays or primers for use in nucleic acid amplification assays.
  • Such fragments generally comprise at least about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 or more contiguous nucleotides of a DNA sequence.
  • a DNA fragment comprises at least about 10, 15, 20, 30, 40, 50 or 60 or more contiguous nucleotides of a DNA sequence.
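As an illustration of what "contiguous nucleotides of a DNA sequence" means for candidate probes or primers, the sketch below enumerates every contiguous fragment of a chosen length. The function name and the example sequence are hypothetical, and no probe design criteria (melting temperature, specificity, secondary structure) are considered.

```python
from typing import Iterator


def contiguous_fragments(sequence: str, length: int) -> Iterator[str]:
    """Yield every contiguous fragment of the given length, in 5'-to-3' order."""
    if length <= 0:
        raise ValueError("length must be positive")
    # A sequence shorter than the requested length yields no fragments.
    for start in range(len(sequence) - length + 1):
        yield sequence[start:start + length]


# Hypothetical 12-nucleotide example sequence; fragments of 10 contiguous nucleotides.
example = "ATGGCTAGCTAA"
fragments = list(contiguous_fragments(example, 10))
```

A sequence of n nucleotides contains n - k + 1 contiguous fragments of length k, so the example above gives three 10-nucleotide fragments.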
  • a method for detecting a polynucleotide encoding a protein with nicotine N-demethylase activity member or encoding a nicotine N-demethylase enzyme comprising the use of the probes or primers or both.
  • oligonucleotides are useful as primers, for example, in polymerase chain reactions (PCR), whereby DNA fragments are isolated and amplified.
  • degenerate primers can be used as probes for genetic libraries.
  • libraries would include but are not limited to cDNA libraries, genomic libraries, and even electronic express sequence tag or DNA libraries. Homologous sequences identified by this method would then be used as probes to identify homologues of the sequences identified herein.
  • polynucleotides and oligonucleotides for example, primers or probes
  • polynucleotide(s) that hybridize under reduced stringency conditions, typically moderately stringent conditions, and commonly highly stringent conditions to the polynucleotide(s) as described herein.
  • the basic parameters affecting the choice of hybridization conditions and guidance for devising suitable conditions are set forth by Sambrook, J., E. F. Fritsch, and T. Maniatis (1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) and can be readily determined by those having ordinary skill in the art based on, for example, the length or base composition of the polynucleotide.
  • One way of achieving moderately stringent conditions involves the use of a prewashing solution containing 5x Standard Sodium Citrate, 0.5% Sodium Dodecyl Sulphate, 1.0 mM Ethylenediaminetetraacetic acid (pH 8.0), hybridization buffer of about 50% formamide, 6x Standard Sodium Citrate, and a hybridization temperature of about 55 °C (or other similar hybridization solutions, such as one containing about 50% formamide, with a hybridization temperature of about 42 °C), and washing conditions of about 60 °C, in 0.5x Standard Sodium Citrate, 0.1% Sodium Dodecyl Sulphate.
  • highly stringent conditions are defined as hybridization conditions as above, but with washing at approximately 68 °C, 0.2x Standard Sodium Citrate, 0.1% Sodium Dodecyl Sulphate.
  • SSPE (1x SSPE is 0.15 M sodium chloride, 10 mM sodium phosphate, and 1.25 mM Ethylenediaminetetraacetic acid, pH 7.4) can be substituted for Standard Sodium Citrate (1x Standard Sodium Citrate is 0.15 M sodium chloride and 15 mM sodium citrate) in the hybridization and wash buffers; washes are performed for 15 minutes after hybridization is complete.
  • wash temperature and wash salt concentration can be adjusted as necessary to achieve a desired degree of stringency by applying the basic principles that govern hybridization reactions and duplex stability, as known to those skilled in the art and described further below (see, for example, Sambrook, J., E. F. Fritsch, and T. Maniatis (1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.)).
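The buffer strengths quoted above (for example 6x, 0.5x or 0.2x) scale the 1x recipe linearly. The sketch below simply multiplies the 1x component concentrations given in the text; the dictionary layout, the function name, and the use of "SSC" as the key for Standard Sodium Citrate are assumptions made for illustration.

```python
# 1x component concentrations in mM, as stated in the text above.
ONE_X_BUFFERS_MM = {
    "SSC": {"sodium chloride": 150.0, "sodium citrate": 15.0},
    "SSPE": {"sodium chloride": 150.0, "sodium phosphate": 10.0, "EDTA": 1.25},
}


def buffer_composition(name: str, fold: float) -> dict:
    """Return component concentrations (mM) of a buffer at the given fold strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    base = ONE_X_BUFFERS_MM[name.upper()]
    return {component: conc * fold for component, conc in base.items()}


# Hybridization buffer strength (6x) and moderately stringent wash strength (0.5x).
hybridization = buffer_composition("SSC", 6)
moderate_wash = buffer_composition("SSC", 0.5)
```

Lower fold values mean lower salt and therefore a more stringent wash, which is why the highly stringent wash uses 0.2x rather than 0.5x.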
  • the hybrid length is assumed to be that of the hybridizing polynucleotide.
  • the hybrid length can be determined by aligning the sequences of the polynucleotides and identifying the region or regions of optimal sequence complementarity.
  • each such hybridizing polynucleotide has a length that is at least 25% (commonly at least 50%, 60%, or 70%, and most commonly at least 80%) of the length of a polynucleotide to which it hybridizes, and has at least 60% sequence identity (for example, at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100%) with a polynucleotide to which it hybridizes.
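The length and identity thresholds above can be checked once two sequences have been aligned. The sketch below assumes the alignment has already been produced by some other tool and uses one common convention for percent identity (identical columns divided by all alignment columns, gaps counted as mismatches); other conventions exist and give different values. The function names and default thresholds are illustrative, with the defaults set to the minimum values quoted in the text (25% length, 60% identity).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_thresholds(
    aligned_probe: str,
    aligned_target: str,
    target_length: int,
    min_length_fraction: float = 0.25,
    min_identity: float = 60.0,
) -> bool:
    """Check the length-fraction and identity criteria for a hybridizing polynucleotide."""
    probe_length = sum(1 for base in aligned_probe if base != "-")
    long_enough = probe_length / target_length >= min_length_fraction
    return long_enough and percent_identity(aligned_probe, aligned_target) >= min_identity
```

The same `percent_identity` helper could be compared against the 95% to 100% identity levels recited for the isolated polynucleotides and polypeptides, subject to the same caveat about conventions.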
  • a linear DNA has two possible orientations: the 5'-to-3' direction and the 3'-to-5' direction.
  • the reference sequence and the second sequence are orientated in the same direction, or have the same orientation.
  • a promoter sequence and a gene of interest under the regulation of the given promoter are positioned in the same orientation.
  • the reference sequence and the second sequence are orientated in anti-sense direction, or have anti-sense orientation.
  • Two sequences having anti-sense orientations with respect to each other can be alternatively described as having the same orientation, if the reference sequence (5'-to-3' direction) and the reverse complementary sequence of the reference sequence (reference sequence positioned in the 5'-to-3') are positioned within the same polynucleotide molecule/strand.
  • the sequences set forth herein are shown in the 5'-to-3' direction.
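The relationship between a sequence written 5'-to-3' and its reverse complement, also written 5'-to-3', can be sketched as follows. This is a minimal example for unambiguous DNA bases only; the function name is hypothetical and IUPAC ambiguity codes and RNA are not handled.

```python
# Translation table mapping each DNA base to its complement (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5'-to-3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


sense = "ATGC"                         # reference sequence, 5'-to-3'
antisense = reverse_complement(sense)  # opposite strand, also written 5'-to-3'
```

Applying the function twice returns the original sequence, which reflects the point made above that the two descriptions refer to the same double-stranded molecule.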
  • Recombinant constructs provided herein can be used to transform plants or plant cells in order to modulate protein expression and/or activity levels.
  • a recombinant polynucleotide construct can comprise a polynucleotide encoding one or more polypeptides as described herein, operably linked to a regulatory region suitable for expressing the polypeptide.
  • a polynucleotide can comprise a coding sequence that encodes the polypeptide as described herein.
  • Plants or plant cells in which protein expression and/or activity levels are modulated can include mutant, non-naturally occurring, transgenic, man-made or genetically engineered plants or plant cells.
  • the transgenic plant or plant cell comprises a genome that has been altered by the stable integration of recombinant DNA.
  • Recombinant DNA includes DNA which has been genetically engineered and constructed outside of a cell and includes DNA containing naturally occurring DNA or cDNA or synthetic DNA.
  • a transgenic plant can include a plant regenerated from an originally-transformed plant cell and progeny transgenic plants from later generations or crosses of a transformed plant.
  • the transgenic modification alters the expression or activity of the polynucleotide or the polypeptide described herein as compared to a control plant.
  • the polypeptide encoded by a recombinant polynucleotide can be a native polypeptide, or can be heterologous to the cell.
  • the recombinant construct contains a polynucleotide that modulates expression, operably linked to a regulatory region. Examples of suitable regulatory regions are described herein.
  • Vectors containing recombinant polynucleotide constructs such as those described herein are also provided.
  • Suitable vector backbones include, for example, those routinely used in the art such as plasmids, viruses, artificial chromosomes, bacterial artificial chromosomes, yeast artificial chromosomes, or bacteriophage artificial chromosomes.
  • Suitable expression vectors include, without limitation, plasmids and viral vectors derived from, for example, bacteriophage, baculoviruses, and retroviruses. Numerous vectors and expression systems are commercially available.
  • the vectors can include, for example, origins of replication, scaffold attachment regions or markers.
  • a marker gene can confer a selectable phenotype on a plant cell.
  • a marker can confer biocide resistance, such as resistance to an antibiotic (for example, kanamycin, G418, bleomycin, or hygromycin), or an herbicide (for example, glyphosate, chlorsulfuron or phosphinothricin).
  • an expression vector can include a tag sequence designed to facilitate manipulation or detection (for example, purification or localization) of the expressed polypeptide.
  • Tag sequences such as luciferase, beta-glucuronidase, green fluorescent protein, glutathione S-transferase, polyhistidine, c-myc or hemagglutinin sequences typically are expressed as a fusion with the encoded polypeptide.
  • tags can be inserted anywhere within the polypeptide, including at either the carboxyl or amino terminus.
  • a plant or plant cell can be transformed by having the recombinant polynucleotide integrated into its genome to become stably transformed.
  • the plant or plant cell described herein can be stably transformed.
  • Stably transformed cells typically retain the introduced polynucleotide with each cell division.
  • a plant or plant cell can be transiently transformed such that the recombinant polynucleotide is not integrated into its genome. Transiently transformed cells typically lose all or some portion of the introduced recombinant polynucleotide with each cell division such that the introduced recombinant polynucleotide cannot be detected in daughter cells after a sufficient number of cell divisions.
  • a number of methods are available in the art for transforming a plant cell which are all encompassed herein, including biolistics, gene gun techniques, Agrobacterium-mediated transformation, viral vector-mediated transformation and electroporation.
  • the Agrobacterium system for integration of foreign DNA into plant chromosomes has been extensively studied, modified, and exploited for plant genetic engineering. Naked recombinant DNA molecules comprising DNA sequences corresponding to the subject purified tobacco protein operably linked, in the sense or antisense orientation, to regulatory sequences are joined to appropriate T-DNA sequences by conventional methods. These are introduced into tobacco protoplasts by polyethylene glycol techniques or by electroporation techniques, both of which are standard.
  • such vectors comprising recombinant DNA molecules encoding the subject purified tobacco protein are introduced into live Agrobacterium cells, which then transfer the DNA into the tobacco plant cells. Transformation by naked DNA without accompanying T-DNA vector sequences can be accomplished via fusion of tobacco protoplasts with DNA-containing liposomes or via electroporation. Naked DNA unaccompanied by T-DNA vector sequences can also be used to transform tobacco cells via inert, high velocity microprojectiles.
  • plants can be regenerated from transformed cultures if desired, by techniques known to those skilled in the art.
  • regulatory regions to be included in a recombinant construct depends upon several factors, including, but not limited to, efficiency, selectability, inducibility, desired expression level, and cell- or tissue-preferential expression. It is a routine matter for one of skill in the art to modulate the expression of a coding sequence by appropriately selecting and positioning regulatory regions relative to the coding sequence. Transcription of a polynucleotide can be modulated in a similar manner. Some suitable regulatory regions initiate transcription only, or predominantly, in certain cell types. Methods for identifying and characterizing regulatory regions in plant genomic DNA are known in the art.
  • Suitable promoters include tissue-specific promoters recognized by tissue-specific factors present in different tissues or cell types (for example, root-specific promoters, shoot-specific promoters, xylem-specific promoters), or present during different developmental stages, or present in response to different environmental conditions. Suitable promoters include constitutive promoters that can be activated in most cell types without requiring specific inducers. Examples of suitable promoters for controlling RNAi polypeptide production include the cauliflower mosaic virus 35S (CaMV/35S), SSU, OCS, Iib4, usp, STLS1, B33, nos or ubiquitin- or phaseolin-promoters. Persons skilled in the art are capable of generating multiple variations of recombinant promoters.
  • Tissue-specific promoters are transcriptional control elements that are only active in particular cells or tissues at specific times during plant development, such as in vegetative tissues or reproductive tissues. Tissue-specific expression can be advantageous, for example, when the expression of polynucleotides in certain tissues is preferred.
  • tissue-specific promoters under developmental control include promoters that can initiate transcription only (or primarily only) in certain tissues, such as vegetative tissues, for example, roots or leaves, or reproductive tissues, such as fruit, ovules, seeds, pollen, pistils, flowers, or any embryonic tissue.
  • Reproductive tissue-specific promoters may be, for example, anther-specific, ovule-specific, embryo-specific, endosperm-specific, integument-specific, seed and seed coat-specific, pollen-specific, petal-specific, sepal-specific, or combinations thereof.
  • Suitable leaf-specific promoters include pyruvate, orthophosphate dikinase (PPDK) promoter from C4 plant (maize), cab-m1 Ca+2 promoter from maize, the Arabidopsis thaliana myb-related gene promoter (Atmyb5), the ribulose bisphosphate carboxylase (RBCS) promoters (for example, the tomato RBCS1, RBCS2 and RBCS3A genes expressed in leaves and light-grown seedlings, RBCS1 and RBCS2 expressed in developing tomato fruits or ribulose bisphosphate carboxylase promoter expressed almost exclusively in mesophyll cells in leaf blades and leaf sheaths at high levels).
  • Suitable senescence-specific promoters include a tomato promoter active during fruit ripening, senescence and abscission of leaves, a maize promoter of gene encoding a cysteine protease, the promoter of 82E4 and the promoter of SAG genes. Suitable anther-specific promoters can be used. Suitable root-preferred promoters known to persons skilled in the art may be selected. Suitable seed-preferred promoters include both seed-specific promoters (those promoters active during seed development such as promoters of seed storage proteins) and seed-germinating promoters (those promoters active during seed germination).
  • Such seed-preferred promoters include, but are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); milps (myo-inositol-1-phosphate synthase); mZE40-2, also known as Zm-40; nuclc; and celA (cellulose synthase).
  • Gamma-zein is an endosperm-specific promoter.
  • Glob-1 is an embryo-specific promoter.
  • seed-specific promoters include, but are not limited to, bean beta-phaseolin, napin, β-conglycinin, soybean lectin, cruciferin, and the like.
  • seed-specific promoters include, but are not limited to, a maize 15 kDa zein promoter, a 22 kDa zein promoter, a 27 kDa zein promoter, a g-zein promoter, a 27 kDa gamma-zein promoter (such as gzw64A promoter, see Genbank Accession number S78780), a waxy promoter, a shrunken 1 promoter, a shrunken 2 promoter, a globulin 1 promoter (see Genbank Accession number L22344), an Itp2 promoter, cim1 promoter, maize end1 and end2 promoters, nuc1 promoter, Zm40 promoter, eep1 and eep2; lec1, thioredoxin H promoter; mlip15 promoter, PCNA2 promoter; and the shrunken-2 promoter.
  • inducible promoters include promoters responsive to pathogen attack, anaerobic conditions, elevated temperature, light, drought, cold temperature, or high salt concentration.
  • Pathogen-inducible promoters include those from pathogenesis-related proteins (PR proteins), which are induced following infection by a pathogen (for example, PR proteins, SAR proteins, beta-1,3-glucanase, chitinase).
  • promoters may be derived from bacterial origin for example, the octopine synthase promoter, the nopaline synthase promoter and other promoters derived from Ti plasmids, or may be derived from viral promoters (for example, 35S and 19S RNA promoters of cauliflower mosaic virus (CaMV), constitutive promoters of tobacco mosaic virus, cauliflower mosaic virus (CaMV) 19S and 35S promoters, or figwort mosaic virus 35S promoter).
  • Preferred promoters include the control elements provided herein, as part of SEQ ID NOs: 1-80, which demonstrate desirable expression during curing procedures in tobacco leaf.
  • an isolated polypeptide comprising, consisting or consisting essentially of a polypeptide sequence having at least 95% sequence identity to any of the polypeptide sequences described herein, including any of the polypeptides shown in the sequence listing.
  • the isolated polypeptide comprises, consists or consists essentially of a sequence having at least 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or 100% sequence identity thereto.
  • the polypeptide can include sequences comprising a sufficient or substantial degree of identity or similarity to SEQ ID NOs: 81-160 to function as proteases. Fragments of the polypeptide(s) typically retain some or all of the activity of the full length sequence.
  • the polypeptides also include mutants produced by introducing any type of alterations (for example, insertions, deletions, or substitutions of amino acids; changes in glycosylation states; changes that affect refolding or isomerizations, three-dimensional structures, or self-association states), which can be deliberately engineered or isolated naturally provided that they still have some or all of their function or activity as a protease.
  • the function or activity as a protease is modulated, increased or reduced.
  • Polypeptides include variants produced by introducing any type of alterations (for example, insertions, deletions, or substitutions of amino acids; changes in glycosylation states; changes that affect refolding or isomerizations, three-dimensional structures, or self-association states), which can be deliberately engineered or isolated naturally.
  • the variant may have alterations which produce a silent change and result in a functionally equivalent protein. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity and the amphipathic nature of the residues as long as the secondary binding activity of the substance is retained.
  • negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, valine, glycine, alanine, asparagine, glutamine, serine, threonine, phenylalanine, and tyrosine. Conservative substitutions may be made, for example according to the Table below. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:
  • the polypeptide may be a mature protein or an immature protein or a protein derived from an immature protein.
  • Polypeptides may be in linear form or cyclized using known methods. Polypeptides typically comprise at least 10, at least 20, at least 30, or at least 40 contiguous amino acids.
  • a tobacco plant or plant cell comprising a mutation in a gene encoding a protease as described herein wherein said mutation results in modulated expression or modulated function of said protease.
  • the expression or function of the protease(s) may be enhanced.
  • the mutant plants or plant cells can have one or more further mutations in one or more other genes or polypeptides.
  • aside from the one or more mutations in a protease gene, the mutants can have one or more further mutations in one or more other genes or polypeptides - such as one or more other protease genes or polypeptides as described in the Sequence Listing.
  • a protease is expressed in the leaves of the mutant plant during the curing procedure.
  • a method for modulating the level of a protease in a (cured) tobacco plant or in (cured) tobacco plant material comprising introducing into the genome of said plant one or more mutations that modulate expression of at least one protease gene, wherein said at least one protease gene is selected from SEQ ID NOs: 1-80.
  • a method for identifying a tobacco plant with increased levels of protease comprising screening a nucleic acid sample from a tobacco plant of interest for the presence of one or more mutations in SEQ ID NOs: 1-80, and optionally correlating the identified mutation(s) with mutation(s) that are known to modulate levels of protease.
  • a tobacco plant or plant cell that is heterozygous or homozygous for mutations in a gene encoding a protease, wherein said mutation results in modulated (enhanced or reduced) expression or function of said protease.
  • a number of approaches can be used to combine mutations in one plant including sexual crossing.
  • a plant having one or more favourable heterozygous or homozygous mutations in a protease gene that enhances or reduces protease expression or activity can be crossed with a plant having one or more favourable heterozygous or homozygous mutations in one or more other protease genes that enhance or reduce protease activity.
  • crosses are made in order to introduce one or more favourable heterozygous or homozygous mutations within a protease gene within the same plant.
  • protease activity is statistically lower or higher than the protease activity of the same protease(s) in a tobacco plant that has not been modified to inhibit the activity of that protease polypeptide and which has been cultured, harvested and cured using the same protocols.
  • the mutation(s) is introduced into a tobacco plant or plant cell using a mutagenesis approach, and the introduced mutation is identified or selected using methods known to those of skill in the art - such as Southern blot analysis, DNA sequencing, PCR analysis, or phenotypic analysis. Mutations that impact gene expression or that interfere with the function of the encoded protein can be determined using methods that are well known in the art. Insertional mutations in gene exons usually result in null-mutants. Mutations in conserved residues can be particularly effective in inhibiting the metabolic function of the encoded protein.
  • Any plant of interest including a plant cell or plant material can be genetically modified by various methods known to induce mutagenesis, including site-directed mutagenesis, oligonucleotide-directed mutagenesis, chemically-induced mutagenesis, irradiation-induced mutagenesis, mutagenesis utilizing modified bases, mutagenesis utilizing gapped duplex DNA, double-strand break mutagenesis, mutagenesis utilizing repair-deficient host strains, mutagenesis by total gene synthesis, DNA shuffling and other equivalent methods.
  • Fragments of protease polynucleotides and polypeptides encoded thereby are also disclosed.
  • Fragments of a polynucleotide may encode protein fragments that retain the biological activity of the native protein and hence are involved in the metabolic conversion of nicotine to nornicotine.
  • fragments of a polynucleotide that are useful as hybridization probes or PCR primers generally do not encode fragment proteins retaining biological activity.
  • fragments of the disclosed nucleotide sequences include those that can be assembled within recombinant constructs as discussed herein.
  • Fragments of a polynucleotide sequence may range from at least about 25 nucleotides, about 50 nucleotides, about 75 nucleotides, about 100 nucleotides, about 150 nucleotides, about 200 nucleotides, about 250 nucleotides, about 300 nucleotides, about 400 nucleotides, about 500 nucleotides, about 600 nucleotides, about 700 nucleotides, about 800 nucleotides, about 900 nucleotides, about 1000 nucleotides, about 1100 nucleotides, about 1200 nucleotides, about 1300 nucleotides or about 1400 nucleotides and up to the full-length polynucleotide encoding the polypeptides described herein.
  • Fragments of a polypeptide sequence may range from at least about 25 amino acids, about 50 amino acids, about 75 amino acids, about 100 amino acids, about 150 amino acids, about 200 amino acids, about 250 amino acids, about 300 amino acids, about 400 amino acids, about 500 amino acids, and up to the full-length polypeptide described herein.
  • Mutant polypeptide variants can be used to create mutant, non-naturally occurring or transgenic plants (for example, mutant, non-naturally occurring, transgenic, man-made or genetically engineered plants) or plant cells comprising one or more mutant polypeptide variants.
  • mutant polypeptide variants retain the activity of the unmutated polypeptide.
  • the activity of the mutant polypeptide variant may be higher, lower or about the same as the unmutated polypeptide.
  • Mutations in the nucleotide sequences and polypeptides described herein can include man-made mutations or synthetic mutations or genetically engineered mutations. Mutations in the nucleotide sequences and polypeptides described herein can be mutations that are obtained or obtainable via a process which includes an in vitro or an in vivo manipulation step. Mutations in the nucleotide sequences and polypeptides described herein can be mutations that are obtained or obtainable via a process which includes intervention by man.
  • the process may include mutagenesis using exogenously added chemicals - such as mutagenic, teratogenic, or carcinogenic organic compounds, for example ethyl methanesulfonate (EMS), that produce random mutations in genetic material.
  • the process may include one or more genetic engineering steps - such as one or more of the genetic engineering steps that are described herein or combinations thereof.
  • the process may include one or more plant crossing steps.
  • a polypeptide may be prepared by culturing transformed or recombinant host cells under culture conditions suitable to express a polypeptide.
  • the resulting expressed polypeptide may then be purified from such culture using known purification processes.
  • the purification of the polypeptide may include an affinity column containing agents which will bind to the polypeptide; one or more column steps over such affinity resins; one or more steps involving hydrophobic interaction chromatography; or immunoaffinity chromatography.
  • the polypeptide may also be expressed in a form that will facilitate purification. For example, it may be expressed as a fusion polypeptide, such as those of maltose binding polypeptide, glutathione-S-transferase, his-tag or thioredoxin.
  • Kits for expression and purification of fusion polypeptides are commercially available.
  • the polypeptide may be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope.
  • One or more liquid chromatography steps such as reverse-phase high performance liquid chromatography can be employed to further purify the polypeptide.
  • Some or all of the foregoing purification steps, in various combinations, can be employed to provide a substantially homogeneous recombinant polypeptide.
  • the polypeptide thus purified may be substantially free of other polypeptides and is defined herein as a "substantially purified polypeptide"; such purified polypeptides include polypeptides, fragments, variants, and the like.
  • Expression, isolation, and purification of the polypeptides and fragments can be accomplished by any suitable technique, including but not limited to the methods described herein.
  • an affinity column, such as one carrying a monoclonal antibody generated against the polypeptides, can be used to affinity-purify expressed polypeptides.
  • These polypeptides can be removed from an affinity column using conventional techniques, for example, in a high salt elution buffer and then dialyzed into a lower salt buffer for use or by changing pH or other components depending on the affinity matrix utilized, or be competitively removed using the naturally occurring substrate of the affinity moiety.
  • Isolated or substantially purified polynucleotides or protein compositions are disclosed.
  • An "isolated" or "purified" polynucleotide or protein, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the polynucleotide or protein as found in its naturally occurring environment.
  • an isolated or purified polynucleotide or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
  • an "isolated" polynucleotide is free of sequences (optimally protein encoding sequences) that naturally flank the polynucleotide (for example, sequences located at the 5' and 3' ends of the polynucleotide) in the genomic DNA of the organism from which the polynucleotide is derived.
  • the isolated polynucleotide can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequence that naturally flanks the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived.
  • a protein that is substantially free of cellular material includes preparations of protein having less than about 30%, 20%, 10%, 5%, or 1% (by dry weight) of contaminating protein.
  • a polypeptide may also be produced by known conventional chemical synthesis. Methods for constructing the polypeptides or fragments thereof by synthetic means are known to those skilled in the art. The synthetically-constructed polypeptide sequences, by virtue of sharing primary, secondary or tertiary structural or conformational characteristics with native polypeptides may possess biological properties in common therewith, including biological activity.
  • non-naturally occurring as used herein describes an entity (for example, a polynucleotide, a genetic mutation, a polypeptide, a plant, a plant cell and plant material) that is not formed by nature or that does not exist in nature.
  • Such non-naturally occurring entities or artificial entities may be made, synthesized, initiated, modified, intervened, or manipulated by methods described herein or that are known in the art.
  • Such non-naturally occurring entities or artificial entities may be made, synthesized, initiated, modified, intervened, or manipulated by man.
  • a non-naturally occurring plant, a non-naturally occurring plant cell or non-naturally occurring plant material may be made using traditional plant breeding techniques - such as backcrossing - or by genetic manipulation technologies - such as antisense RNA, interfering RNA, meganuclease and the like.
  • a non-naturally occurring plant, a non-naturally occurring plant cell or non-naturally occurring plant material may be made by introgression of or by transferring one or more genetic mutations (for example one or more polymorphisms) from a first plant or plant cell into a second plant or plant cell (which may itself be naturally occurring), such that the resulting plant, plant cell or plant material or the progeny thereof comprises a genetic constitution (for example, a genome, a chromosome or a segment thereof) that is not formed by nature or that does not exist in nature.
  • the resulting plant, plant cell or plant material is thus artificial or non-naturally occurring.
  • an artificial or non-naturally occurring plant or plant cell may be made by modifying a genetic sequence in a first naturally occurring plant or plant cell, even if the resulting genetic sequence occurs naturally in a second plant or plant cell that comprises a different genetic background from the first plant or plant cell.
  • a mutation is not a naturally occurring mutation that exists naturally in a nucleotide sequence or a polypeptide - such as a gene or a protein.
  • Differences in genetic background can be detected by phenotypic differences or by molecular biology techniques known in the art - such as nucleic acid sequencing, presence or absence of genetic markers (for example, microsatellite RNA markers).
  • Antibodies that are immunoreactive with the polypeptides described herein are also provided.
  • the polypeptides, fragments, variants, fusion polypeptides, and the like, as set forth herein, can be employed as "immunogens" in producing antibodies immunoreactive therewith.
  • Such antibodies may specifically bind to the polypeptide via the antigen-binding sites of the antibody.
  • Specifically binding antibodies are those that will specifically recognize and bind with a polypeptide, homologues, and variants, but not with other molecules.
  • the antibodies are specific for polypeptides having an amino acid sequence as set forth herein and do not cross-react with other polypeptides.
  • polypeptides, fragments, variants, fusion polypeptides, and the like contain antigenic determinants or epitopes that elicit the formation of antibodies.
  • antigenic determinants or epitopes can be either linear or conformational (discontinuous).
  • Linear epitopes are composed of a single section of amino acids of the polypeptide, while conformational or discontinuous epitopes are composed of amino acid sections from different regions of the polypeptide chain that are brought into close proximity upon polypeptide folding.
  • Epitopes can be identified by any of the methods known in the art.
  • epitopes from the polypeptides can be used as research reagents, in assays, and to purify specific binding antibodies from substances such as polyclonal sera or supernatants from cultured hybridomas.
  • Such epitopes or variants thereof can be produced using techniques known in the art such as solid-phase synthesis, chemical or enzymatic cleavage of a polypeptide, or using recombinant DNA technology.
  • Both polyclonal and monoclonal antibodies to the polypeptides can be prepared by conventional techniques.
  • Hybridoma cell lines that produce monoclonal antibodies specific for the polypeptides are also contemplated herein. Such hybridomas can be produced and identified by conventional techniques.
  • various host animals may be immunized by injection with a polypeptide, fragment, variant, or mutants thereof. Such host animals may include, but are not limited to, rabbits, mice, and rats, to name a few.
  • Various adjuvants may be used to increase the immunological response.
  • such adjuvants include, but are not limited to, Freund's (complete and incomplete), mineral gels such as aluminium hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.
  • the monoclonal antibodies can be recovered by conventional techniques. Such monoclonal antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD, and any subclass thereof.
  • the antibodies can also be used in assays to detect the presence of the polypeptides or fragments, either in vitro or in vivo.
  • the antibodies also can be employed in purifying polypeptides or fragments by immunoaffinity chromatography.
  • compositions that can modulate the expression or the activity of one or more of the proteases described herein include, but are not limited to, sequence-specific polynucleotides that can interfere with the transcription of one or more endogenous gene(s); sequence-specific polynucleotides that can interfere with the translation of RNA transcripts (for example, double-stranded RNAs, siRNAs, ribozymes); sequence-specific polypeptides that can interfere with the stability of one or more proteins; sequence-specific polynucleotides that can interfere with the enzymatic activity of one or more proteins or the binding activity of one or more proteins with respect to substrates or regulatory proteins; antibodies that exhibit specificity for one or more proteins; small molecule compounds that can interfere with the stability of one or more proteins or the enzymatic activity of one or more proteins or the binding activity of one or more proteins; zinc finger proteins that bind one or more polynucleotides; and meganucleases that have activity towards one or more polynucleo
  • TALENs: transcription activator-like effector nucleases
  • Non-homologous end joining reconnects DNA from either side of a double-strand break where there is very little or no sequence overlap for annealing.
  • This repair mechanism induces errors in the genome via insertion or deletion, or chromosomal rearrangement. Any such errors may render the gene products coded at that location nonfunctional.
  • Another method of gene editing involves the use of the bacterial CRISPR/Cas system.
  • CRISPR: clustered regularly interspaced short palindromic repeats
  • crRNAs: CRISPR RNAs
  • tracrRNA: trans-activating crRNA
  • Cas: CRISPR-associated proteins
  • Cas9 is normally programmed by a dual RNA consisting of the crRNA and tracrRNA. However, the core components of these RNAs can be combined into a single hybrid 'guide RNA' for Cas9 targeting.
  • the use of a noncoding RNA guide to target DNA for site-specific cleavage promises to be significantly more straightforward than existing technologies - such as TALENs.
  • retargeting the nuclease complex only requires introduction of a new RNA sequence and there is no need to reengineer the specificity of protein transcription factors.
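  • As a rough illustration of why retargeting only requires a new RNA sequence, the sketch below scans a DNA string for candidate Cas9 target sites. It assumes the commonly used Streptococcus pyogenes Cas9, which recognises a 20-nucleotide target followed by an NGG motif; that enzyme, the 20-nucleotide length and the function name `find_cas9_targets` are assumptions for illustration and are not specified in this document. Only the forward strand is scanned.

```python
import re
from typing import List, Tuple


def find_cas9_targets(dna: str, guide_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, target, pam) for each forward-strand site followed by NGG.

    Assumption: S. pyogenes Cas9 with an NGG motif next to the target.
    """
    seq = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence, for illustration only.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for start, target, pam in find_cas9_targets(example):
    # The guide RNA carries the target sequence written as RNA (T -> U).
    print(start, target.replace("T", "U"), pam)
```

  • Choosing a different gene or position changes only the 20-nucleotide string passed to the guide; the nuclease itself is unchanged, which is the contrast with protein-based targeting drawn above.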
  • Antisense technology is another well-known method that can be used to modulate the expression of a polypeptide.
  • a polynucleotide of the gene to be repressed is cloned and operably linked to a regulatory region and a transcription termination sequence so that the antisense strand of RNA is transcribed.
  • the recombinant construct is then transformed into a plant cell and the antisense strand of RNA is produced.
  • the polynucleotide need not be the entire sequence of the gene to be repressed, but typically will be substantially complementary to at least a portion of the sense strand of the gene to be repressed.
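  • The complementarity requirement can be sketched as follows: given a portion of the sense strand, the antisense RNA is its reverse complement written with U in place of T. This is a minimal sketch; the function name `antisense_rna` and the 12-nucleotide fragment are hypothetical and are not taken from the sequences disclosed herein.

```python
# Complement table for DNA bases of the sense strand.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_rna(sense_dna: str) -> str:
    """Return the antisense RNA (5'->3') complementary to a sense-strand DNA fragment."""
    sense = sense_dna.upper()
    # Reverse the fragment and complement each base, then write T as U for RNA.
    rev_comp = "".join(_COMPLEMENT[base] for base in reversed(sense))
    return rev_comp.replace("T", "U")


# Hypothetical portion of a sense strand, for illustration only.
fragment = "ATGGCTTCAGGA"
print(antisense_rna(fragment))
```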
  • a polynucleotide may be transcribed into a ribozyme, or catalytic RNA, that affects expression of an mRNA.
  • Ribozymes can be designed to specifically pair with virtually any target RNA and cleave the phosphodiester backbone at a specific location, thereby functionally inactivating the target RNA.
  • Heterologous polynucleotides can encode ribozymes designed to cleave particular mRNA transcripts, thus preventing expression of a polypeptide.
  • Hammerhead ribozymes are useful for destroying particular mRNAs, although various ribozymes that cleave mRNA at site-specific recognition sequences can be used.
  • Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target RNA contains a 5'-UG-3' nucleotide sequence.
  • the construction and production of hammerhead ribozymes is known in the art.
  • Hammerhead ribozyme sequences can be embedded in a stable RNA such as a transfer RNA (tRNA) to increase cleavage efficiency in vivo.
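  • The site requirement stated above (a 5'-UG-3' dinucleotide in the target RNA) can be sketched as a simple scan. The helper names `find_ug_sites` and `flanking_arms`, the arm length and the example sequence are hypothetical; the sketch only locates candidate sites and the neighbouring target sequence to which flanking regions would be made complementary, and does not design a ribozyme.

```python
from typing import List, Tuple


def find_ug_sites(mrna: str) -> List[int]:
    """Return 0-based positions of every 5'-UG-3' dinucleotide in an mRNA."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "UG"]


def flanking_arms(mrna: str, site: int, arm: int = 8) -> Tuple[str, str]:
    """Return the target sequence on each side of the UG dinucleotide at `site`."""
    seq = mrna.upper().replace("T", "U")
    left = seq[max(0, site - arm):site]
    right = seq[site + 2:site + 2 + arm]
    return left, right


# Hypothetical mRNA fragment, for illustration only.
mrna = "AUGGCUUGCAUGA"
for pos in find_ug_sites(mrna):
    print(pos, flanking_arms(mrna, pos, arm=3))
```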
  • the sequence-specific polynucleotide that can interfere with the translation of RNA transcript(s) is interfering RNA.
  • RNA interference or RNA silencing is an evolutionarily conserved process by which specific mRNAs can be targeted for enzymatic degradation.
  • a double-stranded RNA (dsRNA) is introduced or produced by a cell (for example, double-stranded RNA virus, or interfering RNA polynucleotides) to initiate the interfering RNA pathway.
  • the double-stranded RNA can be converted into multiple small interfering RNA duplexes of 21-24 bp length by RNases III, which are double-stranded RNA-specific endonucleases.
  • the small interfering RNAs can be subsequently recognized by RNA-induced silencing complexes that promote the unwinding of small interfering RNA through an ATP-dependent process.
  • the unwound antisense strand of the small interfering RNA guides the activated RNA-induced silencing complexes to the targeted mRNA comprising a sequence complementary to the small interfering RNA anti-sense strand.
  • the targeted mRNA and the anti-sense strand can form an A-form helix, and the major groove of the A-form helix can be recognized by the activated RNA-induced silencing complexes.
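  • The 21-24 bp size range mentioned above can be illustrated by enumerating every window of a target sequence whose length falls in that range. This is a simplified sketch: it lists possible segments only and does not model how RNase III enzymes actually process double-stranded RNA in a cell. The function name `sirna_windows` is hypothetical.

```python
from typing import Iterator, Tuple


def sirna_windows(target: str, min_len: int = 21, max_len: int = 24) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, length, segment) for every window of 21-24 nucleotides."""
    seq = target.upper()
    for length in range(min_len, max_len + 1):
        # range() is empty when the sequence is shorter than the window.
        for start in range(len(seq) - length + 1):
            yield start, length, seq[start:start + length]


# Hypothetical 30-nucleotide target, for illustration only.
target = "AUGGCUUCAGGAUCCGAUACGGUAUCCGAA"
print(sum(1 for _ in sirna_windows(target)))
```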
  • Interfering RNA expression vectors may comprise interfering RNA constructs encoding interfering RNA polynucleotides that exhibit RNA interference activity by reducing the expression level of mRNAs, pre-mRNAs, or related RNA variants.
  • the expression vectors may comprise a promoter positioned upstream and operably-linked to an interfering RNA construct, as further described herein.
  • Interfering RNA expression vectors may comprise a suitable minimal core promoter, an interfering RNA construct of interest, an upstream (5') regulatory region, a downstream (3') regulatory region, including transcription termination and polyadenylation signals, and other sequences known to persons skilled in the art, such as various selection markers.
  • Various embodiments are directed to methods for modulating the expression level of one or more of the polynucleotide(s) described herein (or any combination thereof as described herein) by integrating multiple copies of the polynucleotide(s) into a (tobacco) plant genome, comprising: transforming a plant cell host with an expression vector that comprises a promoter operably-linked to a polynucleotide.
  • compositions and methods are provided for modulating the endogenous gene expression level by modulating the translation of mRNA.
  • a host (tobacco) plant cell can be transformed with an expression vector comprising: a promoter operably-linked to a polynucleotide, positioned in anti-sense orientation with respect to the promoter to enable the expression of RNA polynucleotides having a sequence complementary to a portion of mRNA.
  • RNA polynucleotide operably-linked to a polynucleotide in which the sequence is positioned in anti-sense orientation with respect to the promoter.
  • the lengths of anti-sense RNA polynucleotides can vary, and may be from about 15-20 nucleotides, about 20-30 nucleotides, about 30-50 nucleotides, about 50-75 nucleotides, about 75-100 nucleotides, about 100-150 nucleotides, about 150-200 nucleotides, and about 200-300 nucleotides.
  • the expression of one or more polypeptides can be modulated by non-transgenic means - such as creating one or more mutations in one or more genes, as discussed herein.
  • Methods that introduce a mutation randomly in a gene sequence can include chemical mutagenesis, EMS mutagenesis and radiation mutagenesis.
  • Methods that introduce one or more targeted mutations into a cell include but are not limited to genome editing technology, particularly zinc finger nuclease-mediated mutagenesis and targeting induced local lesions in genomes (TILLING), homologous recombination, oligonucleotide-directed mutagenesis, and meganuclease-mediated mutagenesis. In one embodiment, TILLING is used.
  • TILLING is a mutagenesis technology that can be used to generate and/or identify polynucleotides encoding polypeptides with modified expression and/or activity. TILLING also allows selection of plants carrying such mutants. TILLING combines high-density mutagenesis with high-throughput screening methods. Methods for TILLING are well known in the art (see McCallum et al., (2000) Nat Biotechnol 18: 455-457 and Stemple (2004) Nat Rev Genet 5(2): 145-50).
  • mutant or non-naturally occurring plants will include at least a portion of foreign or synthetic or man-made nucleic acid (for example, DNA or RNA) that was not present in the plant before it was manipulated.
  • the foreign nucleic acid may be a single nucleotide, two or more nucleotides, two or more contiguous nucleotides or two or more non-contiguous nucleotides - such as at least 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400 or 1500 or more contiguous or non-contiguous nucleotides.
  • the mutant or non-naturally occurring plants or plant cells can have any combination of one or more mutations in one or more genes which results in modulated protein levels.
  • the mutant or non-naturally occurring plants or plant cells may have a single mutation in a single gene; multiple mutations in a single gene; a single mutation in two or more or three or more or four or more genes; or multiple mutations in two or more or three or more or four or more genes. Examples of such mutations are described herein.
  • the mutant or non-naturally occurring plants or plant cells may have one or more mutations in a specific portion of the gene(s) - such as in a region of the gene that encodes an active site of the protein or a portion thereof.
  • the mutant or non-naturally occurring plants or plant cells may have one or more mutations in a region outside of one or more gene(s) - such as in a region upstream or downstream of the gene it regulates provided that they modulate the activity or expression of the gene(s).
  • Upstream elements can include promoters, enhancers or transcription factors. Some elements - such as enhancers - can be positioned upstream or downstream of the gene they regulate. The element(s) need not be located near to the gene that they regulate since some elements have been found located several hundred thousand base pairs upstream or downstream of the gene that they regulate.
  • the mutant or non-naturally occurring plants or plant cells may have one or more mutations located within the first 100 nucleotides of the gene(s), within the first 200 nucleotides of the gene(s), within the first 300 nucleotides of the gene(s), within the first 400 nucleotides of the gene(s), within the first 500 nucleotides of the gene(s), within the first 600 nucleotides of the gene(s), within the first 700 nucleotides of the gene(s), within the first 800 nucleotides of the gene(s), within the first 900 nucleotides of the gene(s), within the first 1000 nucleotides of the gene(s), within the first 1100 nucleotides of the gene(s), within the first 1200 nucleotides of the gene(s), within the first 1300 nucleotides of the gene(s), within the first 1400 nucleotides of the gene(s) or within the first 1500 nucleotides of the gene(s).
  • the mutant or non-naturally occurring plants or plant cells may have one or more mutations located within the first, second, third, fourth, fifth, sixth, seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth, fourteenth or fifteenth set of 100 nucleotides of the gene(s) or combinations thereof.
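  • The "set of 100 nucleotides" in which a mutation falls follows from simple integer arithmetic on its position. The sketch below assumes 1-based positions counted from the first nucleotide of the gene; the function name `hundred_nt_set` is hypothetical.

```python
def hundred_nt_set(position: int, set_size: int = 100) -> int:
    """Map a 1-based nucleotide position to its 1-based set of `set_size` nucleotides."""
    if position < 1:
        raise ValueError("position must be 1-based and positive")
    # Positions 1-100 fall in set 1, 101-200 in set 2, and so on.
    return (position - 1) // set_size + 1


# Hypothetical mutation positions, for illustration only.
for pos in (1, 100, 101, 1450):
    print(pos, hundred_nt_set(pos))
```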
  • Mutant or non-naturally occurring plants or plant cells comprising the mutant polypeptide variants are disclosed.
  • seeds from plants are mutagenised and then grown into first generation mutant plants.
  • the first generation plants are then allowed to self-pollinate and seeds from the first generation plant are grown into second generation plants, which are then screened for mutations in their loci.
  • Although the mutagenized plant material can be screened for mutations, an advantage of screening the second generation plants is that all somatic mutations correspond to germline mutations.
  • plant materials including but not limited to, seeds, pollen, plant tissue or plant cells, may be mutagenised in order to create the mutant plants.
  • the type of plant material mutagenised may affect when the plant nucleic acid is screened for mutations.
  • For example, when pollen is mutagenised before pollination of a non-mutagenised plant, the seeds resulting from that pollination are grown into first generation plants. Every cell of the first generation plants will contain mutations created in the pollen; thus these first generation plants may then be screened for mutations instead of waiting until the second generation.
  • Mutagens that create primarily point mutations and short deletions, insertions, transversions, and/or transitions, including chemical mutagens or radiation, may be used to create the mutations.
  • Mutagens include, but are not limited to, ethyl methanesulfonate, methyl methanesulfonate, N-ethyl-N-nitrosourea, triethylmelamine, N-methyl-N-nitrosourea, procarbazine, chlorambucil, cyclophosphamide, diethyl sulfate, acrylamide monomer, melphalan, nitrogen mustard, vincristine, dimethylnitrosamine, N-methyl-N'-nitro-N-nitrosoguanidine, nitrosoguanidine, 2-aminopurine, 7,12-dimethylbenz(a)anthracene, ethylene oxide, hexamethylphosphoramide, busulfan, diepoxyalkanes (diepoxyoctane,
  • Suitable mutagenic agents can also include, for example, ionising radiation - such as X-rays, gamma rays, fast neutron irradiation and UV radiation. Any method of plant nucleic acid preparation known to those of skill in the art may be used to prepare the plant nucleic acid for mutation screening.
  • Prepared nucleic acid from individual plants, plant cells, or plant material can optionally be pooled in order to expedite screening for mutations in the population of plants originating from the mutagenized plant tissue, cells or material.
  • One or more subsequent generations of plants, plant cells or plant material can be screened.
  • the size of the optionally pooled group is dependent upon the sensitivity of the screening method used.
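  • The dependence of pool size on screening sensitivity can be sketched with a simple dilution calculation. The sketch assumes that each pooled plant contributes an equal amount of nucleic acid, that the locus is present in two copies per plant, and that the mutation is heterozygous in a single plant of the pool; these assumptions and the function names are for illustration only and are not stated in this document.

```python
import math


def mutant_allele_fraction(pool_size: int, copies_per_plant: int = 2) -> float:
    """Fraction of mutant copies when one heterozygous plant is in the pool."""
    return 1.0 / (copies_per_plant * pool_size)


def max_pool_size(min_detectable_fraction: float, copies_per_plant: int = 2) -> int:
    """Largest pool in which one heterozygous mutant is still detectable.

    Returns 0 if even an unpooled heterozygous plant would be missed.
    """
    if not 0 < min_detectable_fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    return math.floor(1.0 / (copies_per_plant * min_detectable_fraction))


# Hypothetical sensitivity: the assay detects a mutant allele at 1 part in 16.
print(max_pool_size(1 / 16))
```

  • Under these assumptions, a more sensitive screening method (a smaller detectable fraction) permits a proportionally larger pool.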
  • After the nucleic acid samples are optionally pooled, they can be subjected to polynucleotide-specific amplification techniques, such as Polymerase Chain Reaction.
  • Any one or more primers or probes specific to the gene or the sequences immediately adjacent to the gene may be utilized to amplify the sequences within the optionally pooled nucleic acid sample.
  • the one or more primers or probes are designed to amplify the regions of the locus where useful mutations are most likely to arise. Most preferably, the primer is designed to detect mutations within regions of the polynucleotide. Additionally, it is preferable for the primer(s) and probe(s) to avoid known polymorphic sites in order to ease screening for point mutations.
  • the one or more primers or probes may be labelled using any conventional labelling method. Primer(s) or probe(s) can be designed based upon the sequences described herein using methods that are well understood in the art.
  • Polymorphisms may be identified by means known in the art and some have been described in the literature.
  • a method of preparing a mutant plant involves providing at least one cell of a plant comprising a gene encoding a functional polynucleotide described herein (or any combination thereof as described herein). Next, the at least one cell of the plant is treated under conditions effective to modulate the activity of the polynucleotide(s) described herein. The at least one mutant plant cell is then propagated into a mutant plant, where the mutant plant has a modulated level of polypeptide(s) described (or any combination thereof as described herein) as compared to that of a control plant.
  • the treating step involves subjecting the at least one cell to a chemical mutagenising agent as described above and under conditions effective to yield at least one mutant plant cell.
  • the treating step involves subjecting the at least one cell to a radiation source under conditions effective to yield at least one mutant plant cell.
  • the term "mutant plant" includes mutant plants in which the genotype is modified as compared to a control plant, suitably by means other than genetic engineering or genetic modification.
  • the mutant plant, mutant plant cell or mutant plant material may comprise one or more mutations that have occurred naturally in another plant, plant cell or plant material and confer a desired trait.
  • This mutation can be incorporated (for example, introgressed) into another plant, plant cell or plant material (for example, a plant, plant cell or plant material with a different genetic background to the plant from which the mutation was derived) to confer the trait thereto.
  • a mutation that occurred naturally in a first plant may be introduced into a second plant - such as a second plant with a different genetic background to the first plant.
  • the skilled person is therefore able to search for and identify a plant carrying naturally in its genome one or more mutant alleles of the genes described herein which confer a desired trait.
  • the mutant allele(s) that occur naturally can be transferred to the second plant by various methods including breeding, backcrossing and introgression to produce lines, varieties or hybrids that have one or more mutations in the genes described herein.
  • Plants showing a desired trait may be screened out of a pool of mutant plants.
  • the selection is carried out utilising the knowledge of the nucleotide sequences as described herein. Consequently, it is possible to screen for a genetic trait as compared to a control.
  • Such a screening approach may involve the application of conventional nucleic acid amplification and/or hybridization techniques as discussed herein.
  • a further aspect of the present invention relates to a method for identifying a mutant plant comprising the steps of: (a) providing a sample comprising nucleic acid from a plant; and (b) determining the nucleic acid sequence of the polynucleotide, wherein a difference in the sequence of the polynucleotide as compared to the polynucleotide sequence of a control plant is indicative that said plant is a mutant plant.
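  • Step (b) of the method above amounts to comparing the sample sequence with that of a control plant. A minimal sketch follows, assuming the two sequences have already been aligned to equal length (insertions and deletions would need a proper alignment step first); the function names and example sequences are hypothetical.

```python
from typing import List, Tuple


def sequence_differences(control: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, control base, sample base) where the sequences differ."""
    if len(control) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(control.upper(), sample.upper())
    return [(i + 1, c, s) for i, (c, s) in enumerate(pairs) if c != s]


def is_mutant(control: str, sample: str) -> bool:
    """A difference relative to the control is taken as indicating a mutant plant."""
    return bool(sequence_differences(control, sample))


# Hypothetical control and sample sequences, for illustration only.
control_seq = "ATGGCTTCA"
sample_seq = "ATGGCTTTA"
print(sequence_differences(control_seq, sample_seq), is_mutant(control_seq, sample_seq))
```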
  • a method for identifying a mutant plant which accumulates increased or reduced levels of protease as compared to a control plant comprising the steps of: (a) providing a sample from a plant to be screened; (b) determining if said sample comprises one or more mutations in one or more of the polynucleotides described herein; and (c) determining at least the protease content of said plant during or after a curing procedure.
  • a method for preparing a mutant plant which has increased or reduced levels of protease as compared to a control plant comprising the steps of: (a) providing a sample from a first plant; (b) determining if said sample comprises one or more mutations in one or more of the polynucleotides described herein that result in modulated levels of a protease; and (c) transferring the one or more mutations into a second plant.
  • Suitably, at least the protease content is determined in cured leaf material.
  • the mutation(s) can be transferred into the second plant using various methods that are known in the art - such as by genetic engineering, genetic manipulation, introgression, plant breeding, backcrossing and the like.
  • the first plant is a naturally occurring plant.
  • the second plant has a different genetic background to the first plant.
  • a method for preparing a mutant plant which has increased or reduced levels of a protease as compared to a control plant comprising the steps of: (a) providing a sample from a first plant; (b) determining if said sample comprises one or more mutations in one or more of the polynucleotides described herein that results in modulated levels of the protease; and (c) introgressing the one or more mutations from the first plant into a second plant.
  • at least the protease content is determined in cured leaf material.
  • the step of introgressing comprises plant breeding, optionally including backcrossing and the like.
  • the first plant is a naturally occurring plant.
  • the second plant has a different genetic background to the first plant.
  • the first plant is not a cultivar or an elite cultivar.
  • the second plant is a cultivar or an elite cultivar.
  • a further aspect relates to a mutant plant (including a cultivar or elite cultivar mutant plant) obtained or obtainable by the methods described herein.
  • the "mutant plants" may have one or more mutations localised only to a specific region of the plant - such as within the sequence of the one or more polynucleotide(s) described herein. According to this embodiment, the remaining genomic sequence of the mutant plant will be the same or substantially the same as the plant prior to the mutagenesis.
  • the mutant plants may have one or more mutations localised in more than one region of the plant - such as within the sequence of one or more of the polynucleotides described herein and in one or more further regions of the genome. According to this embodiment, the remaining genomic sequence of the mutant plant will not be the same or will not be substantially the same as the plant prior to the mutagenesis.
  • the mutant plants may not have one or more mutations in one or more, two or more, three or more, four or more or five or more exons of the polynucleotide(s) described herein; or may not have one or more mutations in one or more, two or more, three or more, four or more or five or more introns of the polynucleotide(s) described herein; or may not have one or more mutations in a promoter of the polynucleotide(s) described herein; or may not have one or more mutations in the 3' untranslated region of the polynucleotide(s) described herein; or may not have one or more mutations in the 5' untranslated region of the polynucleotide(s) described herein; or may not have one or more mutations in the coding region of the polynucleotide(s) described herein; or may not have one or more mutations in the non-coding region of the polynucleotide(s)
  • a method of identifying a plant, a plant cell or plant material comprising a mutation in a gene encoding a polynucleotide described herein comprising: (a) subjecting a plant, a plant cell or plant material to mutagenesis; (b) obtaining a nucleic acid sample from said plant, plant cell or plant material or descendants thereof; and (c) determining the nucleic acid sequence of the gene encoding a polynucleotide described herein or a variant or a fragment thereof, wherein a difference in said sequence is indicative of one or more mutations therein.
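Step (c) of the method above amounts to comparing the determined sequence with the sequence of the unmutagenized plant. A minimal sketch in Python, assuming the two sequences are already aligned and of equal length (real sequencing data would first need an alignment step). The function name and the example sequences are hypothetical and are not taken from the sequences described herein:

```python
def find_sequence_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Assumes both sequences are pre-aligned and of
    equal length, which is an illustrative simplification.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper())
        )
        if ref_base != sample_base
    ]


# Hypothetical example sequences (not from the sequence listing).
reference_seq = "ATGGCTTCAGGA"
mutant_seq = "ATGGCTTTAGGA"
differences = find_sequence_differences(reference_seq, mutant_seq)
print(differences)
```

Any non-empty result would be indicative of one or more mutations in the compared region; an empty list means no difference was found at the positions compared.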
  • Zinc finger proteins can be used to modulate the expression or the activity of one or more of the polynucleotides described herein.
  • a genomic DNA sequence comprising a part of or all of the coding sequence of the polynucleotide is modified by zinc finger nuclease-mediated mutagenesis.
  • the genomic DNA sequence is searched for a unique site for zinc finger protein binding.
  • the genomic DNA sequence is searched for two unique sites for zinc finger protein binding wherein both sites are on opposite strands and close together, for example, 1, 2, 3, 4, 5, 6 or more base pairs apart. Accordingly, zinc finger proteins that bind to polynucleotides are provided.
  • a zinc finger protein may be engineered to recognize a selected target site in a gene.
  • a zinc finger protein can comprise any combination of motifs derived from natural zinc finger DNA-binding domains and non-natural zinc finger DNA-binding domains by truncation or expansion or a process of site-directed mutagenesis coupled to a selection method such as, but not limited to, phage display selection, bacterial two-hybrid selection or bacterial one-hybrid selection.
  • the term "non-natural zinc finger DNA-binding domain" refers to a zinc finger DNA-binding domain that binds a three-base pair sequence within the target nucleic acid and that does not occur in the cell or organism comprising the nucleic acid which is to be modified. Methods for the design of zinc finger proteins which bind specific nucleotide sequences that are unique to a target gene are known in the art.
  • a zinc finger protein may be selected to bind to a regulatory sequence of a polynucleotide. More specifically, the regulatory sequence may comprise a transcription initiation site, a start codon, a region of an exon, a boundary of an exon-intron, a terminator, or a stop codon. Accordingly, the invention provides a mutant, non-naturally occurring or transgenic plant or plant cells, produced by zinc finger nuclease-mediated mutagenesis in the vicinity of or within one or more polynucleotides described herein, and methods for making such a plant or plant cell by zinc finger nuclease-mediated mutagenesis. Methods for delivering zinc finger protein and zinc finger nuclease to a tobacco plant are similar to those described below for delivery of meganuclease.
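The search described above for two binding sites on opposite strands, separated by a short spacer, can be sketched as follows. This is an illustrative Python sketch only: the 9-base-pair site sequences, the function names and the default spacer range of 1 to 6 base pairs are assumptions chosen for the example, and uniqueness of each site within the genome is not checked here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(seq, motif):
    """Return all 0-based start positions of motif in seq (overlaps allowed)."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def paired_sites(seq, left_site, right_site, min_gap=1, max_gap=6):
    """Find left_site on the forward strand followed by right_site on the
    reverse strand, separated by min_gap..max_gap base pairs.

    right_site is written 5'->3' on the reverse strand, so on the forward
    strand it appears as its reverse complement.
    Returns a list of (left_start, right_start, gap) tuples.
    """
    right_on_forward = reverse_complement(right_site)
    pairs = []
    for left_start in find_all(seq, left_site):
        left_end = left_start + len(left_site)
        for right_start in find_all(seq, right_on_forward):
            gap = right_start - left_end
            if min_gap <= gap <= max_gap:
                pairs.append((left_start, right_start, gap))
    return pairs


# Hypothetical target: two 9-bp sites separated by a 4-bp spacer.
left = "GCGGAAGCC"
right = "GGGTCAGAA"
genomic = "TT" + left + "ACGT" + reverse_complement(right) + "TT"
print(paired_sites(genomic, left, right))
```

In practice the candidate pairs would still have to be filtered for uniqueness in the genome and for the availability of zinc finger domains recognizing each triplet.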
  • Plants suitable for use in genetic modification include, but are not limited to, monocotyledonous and dicotyledonous plants and plant cell systems, including species from one of the following families: Acanthaceae, Alliaceae, Alstroemeriaceae, Amaryllidaceae, Apocynaceae, Arecaceae, Asteraceae, Berberidaceae, Bixaceae, Brassicaceae, Bromeliaceae, Cannabaceae, Caryophyllaceae, Cephalotaxaceae, Chenopodiaceae, Colchicaceae, Cucurbitaceae, Dioscoreaceae, Ephedraceae, Erythroxylaceae, Euphorbiaceae, Fabaceae, Lamiaceae, Linaceae, Lycopodiaceae, Malvaceae, Melanthiaceae, Musaceae, Myrtaceae, Nyssaceae, Papaveraceae, Pinaceae
  • Suitable species may include members of the genera Abelmoschus, Abies, Acer, Agrostis, Allium, Alstroemeria, Ananas, Andrographis, Andropogon, Artemisia, Arundo, Atropa, Berberis, Beta, Bixa, Brassica, Calendula, Camellia, Camptotheca, Cannabis, Capsicum, Carthamus, Catharanthus, Cephalotaxus, Chrysanthemum, Cinchona, Citrullus, Coffea, Colchicum, Coleus, Cucumis, Cucurbita, Cynodon, Datura, Dianthus, Digitalis, Dioscorea, Elaeis, Ephedra, Erianthus, Erythroxylum, Eucalyptus, Festuca, Fragaria, Galanthus, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Hyoscyamus, Jatropha, Lactuca, Linum, Lolium, Lup
  • Suitable species may include Panicum spp., Sorghum spp., Miscanthus spp., Saccharum spp., Erianthus spp., Populus spp., Andropogon gerardii (big bluestem), Pennisetum purpureum (elephant grass), Phalaris arundinacea (reed canarygrass), Cynodon dactylon (bermudagrass), Festuca arundinacea (tall fescue), Spartina pectinata (prairie cord-grass), Medicago sativa (alfalfa), Arundo donax (giant reed), Secale cereale (rye), Salix spp.
  • Phleum pratense (timothy)
  • Panicum virgatum (switchgrass)
  • Sorghum bicolor (sorghum, sudangrass)
  • Miscanthus giganteus (miscanthus)
  • compositions and methods can be applied to any species of the genus Nicotiana, including N. rustica and N. tabacum (for example, LA B21, LN KY171, TI 1406, Basma, Galpao, Perique, Beinhart 1000-1, and Petico).
  • Other species include N. acaulis, N. acuminata, N. africana, N. alata, N. amplexicaulis, N. arentsii, N. attenuata, N. azambujae, N. benavidesii, N. benthamiana, N. bigelovii, N. bonariensis, N. cavicola, N. clevelandii, N. cordifolia, N. corymbosa, N. debneyi, N. excelsior, N. forgetiana, N. fragrans, N. glauca, N. glutinosa, N. goodspeedii, N. gossei, N. hybrid, N. ingulba, N. kawakamii, N. knightiana, N. langsdorffii, N.
  • the transgenic, non-naturally occurring or mutant plant may therefore be a tobacco variety or elite tobacco cultivar that comprises one or more transgenes, or one or more genetic mutations or a combination thereof.
  • the genetic mutation(s) can be mutations that do not exist naturally in the individual tobacco variety or tobacco cultivar (for example, elite tobacco cultivar) or can be genetic mutation(s) that do occur naturally provided that the mutation does not occur naturally in the individual tobacco variety or tobacco cultivar (for example, elite tobacco cultivar).
  • Nicotiana tabacum varieties include Burley type, dark type, flue-cured type, and Oriental type tobaccos.
  • varieties or cultivars are: BD 64, CC 101, CC 200, CC 27, CC 301, CC 400, CC 500, CC 600, CC 700, CC 800, CC 900, Coker 176, Coker 319, Coker 371 Gold, Coker 48, CD 263, DF911, DT 538 LC, Galpao tobacco, GL 26H, GL 350, GL 600, GL 737, GL 939, GL 973, HB 04P, HB 04P LC, HB3307PLC, Hybrid 403LC, Hybrid 404LC, Hybrid 501LC, K 149, K 326, K 346, K 358, K394, K 399, K 730, KDH 959, KT 200, KT204LC, KY10, KY14, KY 160, KY 17, KY 171
  • Embodiments are also directed to compositions and methods for producing mutant plants, non-naturally occurring plants, hybrid plants, or transgenic plants that have been modified to modulate the expression or activity of a polynucleotide(s) described herein (or any combination thereof as described herein).
  • the mutant plants, non-naturally occurring plants, hybrid plants, or transgenic plants that are obtained may be similar or substantially the same in overall appearance to control plants.
  • Various phenotypic characteristics such as degree of maturity, number of leaves per plant, stalk height, leaf insertion angle, leaf size (width and length), internode distance, and lamina-midrib ratio can be assessed by field observations.
  • One aspect relates to a seed of a mutant plant, a non-naturally occurring plant, a hybrid plant or a transgenic plant described herein.
  • the seed is a tobacco seed.
  • a further aspect relates to pollen or an ovule of a mutant plant, a non-naturally occurring plant, a hybrid plant or a transgenic plant that is described herein.
  • a mutant plant, a non-naturally occurring plant, a hybrid plant or a transgenic plant as described herein which further comprises a nucleic acid conferring male sterility.
  • the regenerable cells include but are not limited to cells from leaves, pollen, embryos, cotyledons, hypocotyls, roots, root tips, anthers, flowers and a part thereof, ovules, shoots, stems, stalks, pith and capsules or callus or protoplasts derived therefrom.
  • a still further aspect relates to a cured plant material - such as cured leaf or cured tobacco - derived or derivable from a mutant, non-naturally occurring or transgenic plant or cell, wherein expression of one or more of the polynucleotides described herein or the activity of the protein encoded thereby is modulated.
  • the visual appearance of said plant is substantially the same as the control plant.
  • the plant is a tobacco plant.
  • Embodiments are also directed to compositions and methods for producing mutant, non-naturally occurring or transgenic plants or plant cells that have been modified to modulate the expression or activity of the one or more of the polynucleotides or polypeptides described herein which can result in plants or plant components (for example, leaves - such as green leaves or cured leaves - or tobacco) or plant cells with modulated levels of proteases.
  • a method for modulating (e.g. increasing) the amount of protease in at least a part of a plant (for example, the leaves - such as cured leaves - or in tobacco), comprising the steps of: (i) modulating (e.g. increasing) the expression or activity of one or more of the polypeptides (or any combination thereof as described herein), suitably, wherein the polypeptide(s) is encoded by the corresponding polynucleotide sequence described herein; (ii) measuring the protease content in at least a part (for example, the leaves - such as cured leaves - or tobacco or in smoke) of the mutant, non-naturally occurring or transgenic plant obtained in step (i); and (iii) identifying a mutant, non-naturally occurring or transgenic plant in which the protease content therein has been modulated (e.g. increased) in comparison to a control plant.
  • the visual appearance of said mutant, non-naturally occurring or transgenic plant is substantially the same as the control plant.
  • the plant is a tobacco plant.
  • a method for modulating (e.g. increasing) the amount of protease in at least a part of cured plant material - such as cured leaf - comprising the steps of: (i) modulating (e.g. increasing) the expression or activity of one or more of the polypeptides (or any combination thereof as described herein), suitably, wherein the polypeptide(s) is encoded by the corresponding polynucleotide sequence described herein; (ii) harvesting plant material - such as one or more of the leaves - and curing for a period of time; (iii) measuring the protease content in at least a part of the cured plant material obtained in step (ii) or during step (ii); and (iv) identifying cured plant material in which the protease content therein has been modulated (e.g. increased) in comparison to a control plant.
  • An increase in expression as compared to the control may be from about 5 % to about 100 %, or an increase of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 % or more - such as 200%, 300%, 500%, 1000% or more, which includes an increase in transcriptional activity or polynucleotide expression or polypeptide expression or a combination thereof.
  • An increase in activity as compared to a control may be from about 5 % to about 100 %, or an increase of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 % or more - such as 200%, 300%, 500%, 1000% or more.
  • a reduction in expression as compared to a control may be from about 5 % to about 100 %, or a reduction of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 %, which includes a reduction in transcriptional activity or polynucleotide expression or polypeptide expression or a combination thereof.
  • a reduction in activity as compared to a control may be from about 5 % to about 100 %, or a reduction of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 %.
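The percentage increases and reductions listed above are all expressed relative to a control. A minimal sketch of that arithmetic in Python, where the function name and the example measurement values are hypothetical and chosen only for illustration:

```python
def percent_change(sample_value, control_value):
    """Percentage change of a measurement relative to a control.

    Positive values are increases, negative values are reductions.
    A reduction of 100 % corresponds to a sample value of zero.
    """
    if control_value == 0:
        raise ValueError("control value must be non-zero")
    return (sample_value - control_value) / control_value * 100.0


# Hypothetical expression or activity measurements in arbitrary units.
print(percent_change(150.0, 100.0))  # increase relative to control
print(percent_change(25.0, 100.0))   # reduction relative to control
```

On this definition an increase can exceed 100 % (for example, a threefold measurement is a 200 % increase), whereas a reduction cannot exceed 100 %, which is consistent with the ranges recited above.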
  • Polynucleotides and recombinant constructs described herein can be used to modulate the expression of the proteases described herein in a plant species of interest, suitably tobacco.
  • a number of polynucleotide based methods can be used to increase gene expression in plants and plant cells.
  • a construct, vector or expression vector that is compatible with the plant to be transformed can be prepared which comprises the gene of interest together with an upstream promoter that is capable of overexpressing the gene in the plant or plant cell.
  • Exemplary promoters are described herein. Following transformation and when grown under suitable conditions, the promoter can drive expression in order to modulate (for example, reduce) the levels of this enzyme in the plant, or in a specific tissue thereof.
  • a vector carrying one or more polynucleotides described herein (or any combination thereof as described herein) is generated to overexpress the gene in a plant or plant cell.
  • the vector carries a suitable promoter - such as the cauliflower mosaic virus CaMV 35S promoter - upstream of the transgene driving its constitutive expression in all tissues of the plant.
  • the vector also carries an antibiotic resistance gene in order to confer selection of the transformed calli and cell lines.
  • a promoter and regulatory sequences are derived from one or more of SEQ ID NOs: 1-80. These regulatory sequences can be used in conjunction with cognate or non-cognate expression sequences to increase expression of said sequences in a tobacco plant during the curing procedure.
  • sequences from promoters can be enhanced by including expression control sequences, including enhancers, chromatin activating elements, transcription factor responsive elements and the like.
  • control sequences may be constitutive, and upregulate transcription in a universal manner; or they may be facultative, and upregulate transcription in response to specific signals. Signals associated with senescence and signals which are active during the curing procedure are specifically indicated.
  • Various embodiments are therefore directed to methods for modulating (for example, increasing) the expression level of one or more polynucleotides described herein (or any combination thereof as described herein) by integrating multiple copies of the polynucleotide into a plant genome, comprising: transforming a plant cell host with an expression vector that comprises a promoter operably-linked to one or more polynucleotides described herein.
  • the polypeptide encoded by a recombinant polynucleotide can be a native polypeptide, or can be heterologous to the cell.
  • a tobacco plant carrying a mutant allele of one or more polynucleotides described herein (or any combination thereof as described herein) can be used in a plant breeding program to create useful lines, varieties and hybrids.
  • the mutant allele is introgressed into the commercially important varieties described above.
  • methods for breeding plants that comprise crossing a mutant plant, a non-naturally occurring plant or a transgenic plant as described herein with a plant comprising a different genetic identity.
  • the method may further comprise crossing the progeny plant with another plant, and optionally repeating the crossing until a progeny with the desirable genetic traits or genetic background is obtained.
  • One purpose served by such breeding methods is to introduce a desirable genetic trait into other varieties, breeding lines, hybrids or cultivars, particularly those that are of commercial interest. Another purpose is to facilitate stacking of genetic modifications of different genes in a single plant variety, lines, hybrids or cultivars. Intraspecific as well as interspecific matings are contemplated. The progeny plants that arise from such crosses, also referred to as breeding lines, are examples of non-naturally occurring plants of the invention.
  • a method for producing a non-naturally occurring tobacco plant comprising: (a) crossing a mutant or transgenic tobacco plant with a second tobacco plant to yield progeny tobacco seed; (b) growing the progeny tobacco seed, under plant growth conditions, to yield the non-naturally occurring tobacco plant.
  • the method may further comprise: (c) crossing the previous generation of non-naturally occurring tobacco plant with itself or another tobacco plant to yield progeny tobacco seed; (d) growing the progeny tobacco seed of step (c) under plant growth conditions, to yield additional non-naturally occurring tobacco plants; and (e) repeating the crossing and growing steps of (c) and (d) multiple times to generate further generations of non-naturally occurring tobacco plants.
  • the method may optionally comprise, prior to step (a), a step of providing a parent plant which comprises a genetic identity that is characterized and that is not identical to the mutant or transgenic plant.
  • the crossing and growing steps are repeated from 0 to 2 times, from 0 to 3 times, from 0 to 4 times, from 0 to 5 times, from 0 to 6 times, from 0 to 7 times, from 0 to 8 times, from 0 to 9 times or from 0 to 10 times, in order to generate generations of non-naturally occurring tobacco plants.
  • Backcrossing is an example of such a method wherein a progeny is crossed with one of its parents or another plant genetically similar to its parent, in order to obtain a progeny plant in the next generation that has a genetic identity which is closer to that of one of the parents.
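How quickly backcrossing brings the progeny closer to the recurrent parent can be estimated with the standard textbook expectation that each backcross halves the remaining donor genome on average. The Python sketch below assumes unlinked loci and no selection, counts the F1 as zero backcrosses, and uses hypothetical function names; actual values in a breeding program will vary around this expectation.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected average fraction of the genome derived from the recurrent
    parent after n backcrosses (F1 = 0 backcrosses), assuming no selection.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction):
    """Smallest number of backcrosses whose expected recurrent-parent
    fraction reaches the target (target must be in [0, 1))."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target fraction must be in [0, 1)")
    n = 0
    while expected_recurrent_fraction(n) < target_fraction:
        n += 1
    return n


for generation in range(5):
    print(generation, expected_recurrent_fraction(generation))
```

Under these assumptions the expected fraction is 50 % in the F1 and 75 % after the first backcross; marker-based selection of progeny can reach a given level of identity in fewer generations than this average suggests.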
  • Techniques for plant breeding, particularly tobacco plant breeding, are well known and can be used in the methods of the invention.
  • the invention further provides non-naturally occurring tobacco plants produced by these methods. Certain embodiments exclude the step of selecting a plant.
  • lines resulting from breeding and screening for variant genes are evaluated in the field using standard field procedures.
  • Control genotypes including the original unmutagenized parent are included and entries are arranged in the field in a randomized complete block design or other appropriate field design.
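In a randomized complete block design, every entry (each line under evaluation plus the controls) appears exactly once in each block, and the order within each block is randomized independently. A minimal Python sketch, in which the entry names, the number of blocks and the function name are hypothetical:

```python
import random


def randomized_complete_block_design(entries, n_blocks, seed=None):
    """Return a list of blocks; each block contains every entry exactly
    once, in an independently shuffled order."""
    rng = random.Random(seed)  # seed allows a reproducible field plan
    layout = []
    for _ in range(n_blocks):
        block = list(entries)
        rng.shuffle(block)
        layout.append(block)
    return layout


# Hypothetical entries: the unmutagenized parent as control plus three lines.
entries = ["parent_control", "line_A", "line_B", "line_C"]
layout = randomized_complete_block_design(entries, n_blocks=3, seed=1)
for number, block in enumerate(layout, start=1):
    print(f"block {number}: {block}")
```

Blocking in this way lets field variation between blocks be separated from differences between the lines and the parental control in the later statistical analysis.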
  • standard agronomic practices are used, for example, the tobacco is harvested, weighed, and sampled for chemical and other common testing before and during curing.
  • Statistical analyses of the data are performed to confirm the similarity of the selected lines to the parental line. Cytogenetic analyses of the selected plants are optionally performed to confirm the chromosome complement and chromosome pairing relationships.
  • DNA fingerprinting, single nucleotide polymorphism, microsatellite markers, or similar technologies may be used in a marker-assisted selection (MAS) breeding program to transfer or breed mutant alleles of a gene into other tobaccos, as described herein.
  • a breeder can create segregating populations from hybridizations of a genotype containing a mutant allele with an agronomically desirable genotype. Plants in the F2 or backcross generations can be screened using a marker developed from a genomic sequence or a fragment thereof, using one of the techniques listed herein. Plants identified as possessing the mutant allele can be backcrossed or self-pollinated to create a second population to be screened.
  • successful crosses yield F1 plants that are fertile.
  • Selected F1 plants can be crossed with one of the parents, and the first backcross generation plants are self-pollinated to produce a population that is again screened for variant gene expression (for example, the null version of the gene).
  • the process of backcrossing, self-pollination, and screening is repeated, for example, at least 4 times until the final screening produces a plant that is fertile and reasonably similar to the recurrent parent.
  • This plant, if desired, is self-pollinated and the progeny are subsequently screened again to confirm that the plant exhibits variant gene expression.
  • a plant population in the F2 generation is screened for variant gene expression; for example, a plant that fails to express a polypeptide due to the absence of the gene is identified according to standard methods, for example, by using a PCR method with primers based upon the nucleotide sequence information for the polynucleotide(s) described herein (or any combination thereof as described herein).
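When an F2 population is screened for plants homozygous for a mutant allele, the population size needed can be estimated from the expected segregation ratio. The Python sketch below assumes simple single-locus Mendelian segregation (1:2:1, so one quarter of F2 plants are expected to be homozygous mutant); this is an illustrative assumption, and because N. tabacum is an allotetraploid, additional gene copies may need to be considered separately. The function name is hypothetical.

```python
import math


def plants_to_screen(probability_of_success, genotype_frequency=0.25):
    """Smallest population size n such that at least one plant of the
    desired genotype is present with the given probability.

    Solves 1 - (1 - f)**n >= p for n, where f is the expected frequency
    of the desired genotype (0.25 for a homozygote in an F2).
    """
    if not 0.0 < probability_of_success < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0.0 < genotype_frequency < 1.0:
        raise ValueError("frequency must be strictly between 0 and 1")
    return math.ceil(
        math.log(1.0 - probability_of_success)
        / math.log(1.0 - genotype_frequency)
    )


print(plants_to_screen(0.95))
print(plants_to_screen(0.99))
```

The same function can be reused for other expected frequencies, for example one sixteenth for a double homozygote at two unlinked loci.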
  • Hybrid tobacco varieties can be produced by preventing self-pollination of female parent plants (that is, seed parents) of a first variety, permitting pollen from male parent plants of a second variety to fertilize the female parent plants, and allowing F1 hybrid seeds to form on the female plants.
  • Self-pollination of female plants can be prevented by emasculating the flowers at an early stage of flower development.
  • pollen formation can be prevented on the female parent plants using a form of male sterility.
  • male sterility can be produced by cytoplasmic male sterility (CMS), or transgenic male sterility wherein a transgene inhibits microsporogenesis and/or pollen formation, or self-incompatibility.
  • Female parent plants containing CMS are particularly useful. In embodiments in which the female parent plants are CMS, pollen is harvested from
  • Varieties and lines described herein can be used to form single-cross tobacco F1 hybrids.
  • the plants of the parent varieties can be grown as substantially homogeneous adjoining populations to facilitate natural cross-pollination from the male parent plants to the female parent plants.
  • the F1 seed formed on the female parent plants is selectively harvested by conventional means.
  • One also can grow the two parent plant varieties in bulk and harvest a blend of F1 hybrid seed formed on the female parent and seed formed upon the male parent as the result of self-pollination.
  • three-way crosses can be carried out wherein a single-cross F1 hybrid is used as a female parent and is crossed with a different male parent.
  • double-cross hybrids can be created wherein the F1 progeny of two different single-crosses are themselves crossed.
  • a population of mutant, non-naturally occurring or transgenic plants can be screened or selected for those members of the population that have a desired trait or phenotype.
  • a population of progeny of a single transformation event can be screened for those plants having a desired level of expression or activity of the polypeptide(s) encoded thereby.
  • Physical and biochemical methods can be used to identify expression or activity levels.
  • Such methods include Southern analysis or PCR amplification for detection of a polynucleotide; Northern blots, S1 RNase protection, primer-extension, or RT-PCR amplification for detecting RNA transcripts; enzymatic assays for detecting enzyme or ribozyme activity of polypeptides and polynucleotides; and protein gel electrophoresis, Western blots, immunoprecipitation, and enzyme-linked immunoassays to detect polypeptides.
  • Other techniques such as in situ hybridization, enzyme staining, and immunostaining and enzyme assays also can be used to detect the presence or expression or activity of polypeptides or polynucleotides.
  • Mutant, non-naturally occurring or transgenic plant cells and plants are described herein comprising one or more recombinant polynucleotides, one or more polynucleotide constructs, one or more double-stranded RNAs, one or more conjugates or one or more vectors/expression vectors.
  • the plants described herein may be modified for other purposes either before or after the expression or activity has been modulated according to the present invention.
  • One or more of the following genetic modifications can be present in the mutant, non-naturally occurring or transgenic plants.
  • one or more genes that are involved in the conversion of nitrogenous metabolic intermediates are modified, resulting in plants or parts of plants (such as leaves) that, when cured, produce lower levels of at least one tobacco-specific nitrosamine than control plants.
  • Non-limiting examples of genes that can be modified include genes encoding a nicotine demethylase, such as CYP82E4, CYP82E5 and CYP82E10 which participate in the conversion of nicotine to nornicotine and are described in WO2006091194, WO2008070274, WO2009064771 and PCT/US2011/021088 and as described in detail herein.
  • one or more genes that are involved in heavy metal uptake or heavy metal transport are modified resulting in plants or parts of plants (such as leaves) having a lower heavy metal content than control plants or parts thereof without the modification(s).
  • Non-limiting examples include genes in the family of multidrug resistance associated proteins, the family of cation diffusion facilitators (CDF), the family of Zrt-, Irt-like proteins (ZIP), the family of cation exchangers (CAX), the family of copper transporters (COPT), the family of heavy-metal P-type ATPases (for example, HMAs, as described in WO2009074325), the family of homologs of natural resistance-associated macrophage proteins (NRAMP), and the family of ATP-binding cassette (ABC) transporters (for example, MRPs, as described in WO2012/028309), which participate in the transport of heavy metals, such as cadmium.
  • the term heavy metal as used herein includes transition metals.
  • Glyphosate resistant transgenic plants have been developed by transferring the aroA gene (a glyphosate EPSP synthetase from Salmonella typhimurium and E. coli). Sulphonylurea resistant plants have been produced by transforming the mutant ALS (acetolactate synthetase) gene from Arabidopsis. The QB protein of photosystem II from mutant Amaranthus hybridus has been transferred into plants to produce atrazine resistant transgenic plants; and bromoxynil resistant transgenic plants have been produced by incorporating the bxn gene from the bacterium Klebsiella pneumoniae.
  • Bacillus thuringiensis (Bt) toxins can provide an effective way of delaying the emergence of Bt-resistant pests, as recently illustrated in broccoli where pyramided cry1Ac and cry1C Bt genes controlled diamondback moths resistant to either single protein and significantly delayed the evolution of resistant insects.
  • Another exemplary modification results in plants that are resistant to diseases caused by pathogens (for example, viruses, bacteria, fungi). Plants expressing the Xa21 gene (resistance to bacterial blight) with plants expressing both a Bt fusion gene and a chitinase gene (resistance to yellow stem borer and tolerance to sheath blight) have been engineered.
  • Another exemplary modification results in altered reproductive capability, such as male sterility.
  • Another exemplary modification results in plants that are tolerant to abiotic stress (for example, drought, temperature, salinity), and tolerant transgenic plants have been produced by transferring acyl glycerol phosphate enzyme from Arabidopsis; genes coding mannitol dehydrogenase and sorbitol dehydrogenase which are involved in synthesis of mannitol and sorbitol improve drought resistance.
  • Other exemplary modifications can result in plants with improved storage proteins and oils, plants with enhanced photosynthetic efficiency, plants with prolonged shelf life, plants with enhanced carbohydrate content, and plants resistant to fungi; plants encoding an enzyme involved in the biosynthesis of alkaloids.
  • Transgenic plants in which the expression of S-adenosyl-L-methionine (SAM) and/or cystathionine gamma-synthase (CGS) has been modulated are also contemplated.
  • One or more such traits may be introgressed into the mutant, non-naturally occurring or transgenic tobacco plants from another tobacco cultivar or may be directly transformed into it.
  • the introgression of the trait(s) into the mutant, non-naturally occurring or transgenic tobacco plants of the invention may be achieved by any method of plant breeding known in the art, for example, pedigree breeding, backcrossing, doubled-haploid breeding, and the like (see, Wernsman, E. A., and Rufty, R. C. 1987. Chapter Seventeen. Tobacco. Pages 669-698 In: Cultivar Development. Crop Species. W. H. Fehr (ed.), MacMillan Publishing Co., Inc., New York, N.Y., 761 pp.).
  • Molecular biology-based techniques described above, in particular RFLP and microsatellite markers, can be used in such backcrosses to identify the progenies having the highest degree of genetic identity with the recurrent parent. This permits one to accelerate the production of tobacco varieties having at least 90%, preferably at least 95%, more preferably at least 99% genetic identity with the recurrent parent, yet more preferably genetically identical to the recurrent parent, and further comprising the trait(s) introgressed from the donor parent. Such determination of genetic identity can be based on molecular markers known in the art. The last backcross generation can be selfed to give pure breeding progeny for the nucleic acid(s) being transferred.
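One simple way to express marker-based genetic identity with the recurrent parent is the fraction of scored markers at which a progeny plant carries the same genotype as the recurrent parent. The Python sketch below illustrates this under stated assumptions: the marker names, genotype codes, plant names and function names are all hypothetical, and real programs may weight markers by map position or use other identity measures.

```python
def marker_identity(progeny_genotypes, recurrent_genotypes):
    """Fraction of shared markers at which the progeny genotype equals
    the recurrent-parent genotype."""
    shared = [m for m in recurrent_genotypes if m in progeny_genotypes]
    if not shared:
        raise ValueError("no markers in common")
    matches = sum(
        1 for m in shared if progeny_genotypes[m] == recurrent_genotypes[m]
    )
    return matches / len(shared)


def rank_progeny(progeny, recurrent_genotypes):
    """Return (plant, identity) pairs sorted from highest to lowest identity."""
    scores = {
        name: marker_identity(genotypes, recurrent_genotypes)
        for name, genotypes in progeny.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical marker data for a recurrent parent and three backcross progeny.
recurrent = {"m1": "AA", "m2": "BB", "m3": "AA", "m4": "BB"}
progeny = {
    "plant_1": {"m1": "AA", "m2": "BB", "m3": "AB", "m4": "BB"},
    "plant_2": {"m1": "AA", "m2": "BB", "m3": "AA", "m4": "BB"},
    "plant_3": {"m1": "AB", "m2": "AB", "m3": "AA", "m4": "BB"},
}
ranking = rank_progeny(progeny, recurrent)
print(ranking)
```

Progeny at the top of the ranking that also carry the introgressed trait would be the preferred parents for the next backcross.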
  • the resulting plants generally have essentially all of the morphological and physiological characteristics of the mutant, non-naturally occurring or transgenic tobacco plants of the invention, in addition to the transferred trait(s) (for example, one or more single gene traits).
  • the exact backcrossing protocol will depend on the trait being altered to determine an appropriate testing protocol. Although backcrossing methods are simplified when the trait being transferred is a dominant allele, a recessive allele may also be transferred. In this instance, it may be necessary to introduce a test of the progeny to determine if the desired trait has been successfully transferred.
  • Various embodiments provide mutant plants, non-naturally occurring plants or transgenic plants, as well as biomass in which the expression level of a polynucleotide (or any combination thereof as described herein) is modulated to modulate the protease activity therein.
  • Parts of such plants, particularly tobacco plants, and more particularly the leaf lamina and midrib of tobacco plants, can be incorporated into or used in making various consumable products including but not limited to aerosol forming materials, aerosol forming devices, smoking articles, smokable articles, smokeless products, and tobacco products.
  • aerosol forming materials include but are not limited to tobacco compositions, tobaccos, tobacco extract, cut tobacco, cut filler, cured tobacco, expanded tobacco, homogenized tobacco, reconstituted tobacco, and pipe tobaccos.
  • Smoking articles and smokable articles are types of aerosol forming devices. Examples of smoking articles or smokable articles include but are not limited to cigarettes, cigarillos, and cigars.
  • smokeless products comprise chewing tobaccos, and snuffs.
  • a tobacco composition or another aerosol forming material is heated by one or more electrical heating elements to produce an aerosol.
  • an aerosol is produced by the transfer of heat from a combustible fuel element or heat source to a physically separate aerosol forming material, which may be located within, around or downstream of the heat source.
  • Smokeless tobacco products and various tobacco-containing aerosol forming materials may contain tobacco in any form, including as dried particles, shreds, granules, powders, or a slurry, deposited on, mixed in, surrounded by, or otherwise combined with other ingredients in any format, such as flakes, films, tabs, foams, or beads.
  • the term 'smoke' is used to describe a type of aerosol that is produced by smoking articles, such as cigarettes, or by combusting an aerosol forming material.
  • cured plant material from the mutant, transgenic and non-naturally occurring tobacco plants described herein.
  • Processes of curing green tobacco leaves are known by those skilled in the art and include without limitation air-curing, fire-curing, flue-curing and sun-curing as described herein.
  • tobacco products including tobacco-containing aerosol forming materials comprising plant material - such as leaves, preferably cured leaves - from the mutant tobacco plants, transgenic tobacco plants or non-naturally occurring tobacco plants described herein.
  • plant material - such as leaves, preferably cured leaves - from the mutant tobacco plants, transgenic tobacco plants or non-naturally occurring tobacco plants described herein.
  • the tobacco products described herein can be a blended tobacco product which may further comprise unmodified tobacco.
  • mutant, non-naturally occurring or transgenic plants may have other uses in, for example, agriculture.
  • mutant, non-naturally occurring or transgenic plants described herein can be used to make animal feed and human food products.
  • the invention also provides methods for producing seeds comprising cultivating the mutant plant, non-naturally occurring plant, or transgenic plant described herein, and collecting seeds from the cultivated plants.
  • Seeds from plants described herein can be conditioned and bagged in packaging material by means known in the art to form an article of manufacture.
  • Packaging material such as paper and cloth are well known in the art.
  • a package of seed can have a label, for example, a tag or label secured to the packaging material, a label printed on the package that describes the nature of the seeds therein.
  • compositions, methods and kits for genotyping plants for identification, selection, or breeding can comprise a means of detecting the presence of a polynucleotide (or any combination thereof as described herein) in a sample of polynucleotide. Accordingly, a composition is described comprising one or more primers for specifically amplifying at least a portion of one or more of the polynucleotides and optionally one or more probes and optionally one or more reagents for conducting the amplification or detection.
  • gene specific oligonucleotide primers or probes comprising about 10 or more contiguous polynucleotides corresponding to the polynucleotide(s) described herein are disclosed.
  • Said primers or probes may comprise or consist of about 15, 20, 25, 30, 40, 45 or 50 or more contiguous polynucleotides that hybridise (for example, specifically hybridise) to the polynucleotide(s) described herein.
  • the primers or probes may comprise or consist of about 10 to 50 contiguous nucleotides, about 10 to 40 contiguous nucleotides, about 10 to 30 contiguous nucleotides or about 15 to 30 contiguous nucleotides that may be used in sequence-dependent methods of gene identification (for example, Southern hybridization) or isolation (for example, in situ hybridization of bacterial colonies or bacteriophage plaques) or gene detection (for example, as one or more amplification primers in nucleic acid amplification or detection).
  • the one or more specific primers or probes can be designed and used to amplify or detect a part or all of the polynucleotide(s).
  • two primers may be used in a polymerase chain reaction protocol to amplify a nucleic acid fragment encoding a nucleic acid - such as DNA or RNA.
  • the polymerase chain reaction may also be performed using one primer that is derived from a nucleic acid sequence and a second primer that hybridises to the sequence upstream or downstream of the nucleic acid sequence - such as a promoter sequence, the 3' end of the mRNA precursor or a sequence derived from a vector.
  • Examples of thermal and isothermal techniques useful for in vitro amplification of polynucleotides are well known in the art.
  • the sample may be or may be derived from a plant, a plant cell or plant material or a tobacco product made or derived from the plant, the plant cell or the plant material as described herein.
  • a method of detecting a polynucleotide(s) described herein (or any combination thereof as described herein) in a sample comprising the steps of: (a) providing a sample comprising, or suspected of comprising, a polynucleotide; (b) contacting said sample with one or more primers or one or more probes for specifically detecting at least a portion of the polynucleotide(s); and (c) detecting the presence of an amplification product, wherein the presence of an amplification product is indicative of the presence of the polynucleotide(s) in the sample.
  • kits for detecting at least a portion of the polynucleotide(s) are also provided which comprise one or more primers or probes for specifically detecting at least a portion of the polynucleotide(s).
  • the kit may comprise reagents for polynucleotide amplification - such as PCR - or reagents for probe hybridization-detection technology - such as Southern Blots, Northern Blots, in-situ hybridization, or microarray.
  • the kit may comprise reagents for antibody binding-detection technology such as Western Blots, ELISAs, SELDI mass spectrometry or test strips.
  • the kit may comprise reagents for DNA sequencing.
  • the kit may comprise reagents and instructions for determining at least the protease content.
  • the kit comprises reagents and instructions for determining at least protease content in plant material, cured plant material or cured leaves.
  • kits may comprise instructions for one or more of the methods described.
  • the kits described may be useful for genetic identity determination, phylogenetic studies, genotyping, haplotyping, pedigree analysis or plant breeding, particularly with co-dominant scoring.
  • the present invention also provides a method of genotyping a plant, a plant cell or plant material comprising a polynucleotide as described herein.
  • Genotyping provides a means of distinguishing homologs of a chromosome pair and can be used to differentiate segregants in a plant population.
  • Molecular marker methods can be used for phylogenetic studies, characterizing genetic relationships among crop varieties, identifying crosses or somatic hybrids, localizing chromosomal segments affecting monogenic traits, map based cloning, and the study of quantitative inheritance.
  • the specific method of genotyping may employ any number of molecular marker analytic techniques including amplification fragment length polymorphisms (AFLPs).
  • AFLPs amplification fragment length polymorphisms
  • AFLPs are the product of allelic differences between amplification fragments caused by nucleotide sequence variability.
  • the present invention further provides a means to follow segregation of one or more genes or nucleic acids as well as chromosomal sequences genetically linked to these genes or nucleic acids using such techniques as AFLP analysis.
  • cured plant material from the mutant, transgenic and non-naturally occurring plants described herein.
  • processes of curing tobacco leaves include without limitation air-curing, fire-curing, flue-curing and sun-curing.
  • tobacco products including tobacco products comprising plant material - such as leaves, suitably cured plant material - such as cured leaves - from the mutant, transgenic and non-naturally occurring plants described herein or which are produced by the methods described herein.
  • the tobacco products described herein may further comprise unmodified tobacco.
  • tobacco products comprising plant material, preferably leaves - such as cured leaves, from the mutant, transgenic and non-naturally occurring plants described herein.
  • the plant material may be added to the inside or outside of the tobacco product and so upon burning a desirable aroma is released.
  • the tobacco product according to this embodiment may even be an unmodified tobacco or a modified tobacco.
  • the tobacco product according to this embodiment may even be derived from a mutant, transgenic or non-naturally occurring plant which has modifications in one or more genes other than the genes disclosed herein.
  • a 48h time-point following the curing start was selected to screen for curing-activated genes based on Affymetrix data essentially as described by Martin et al. (2012) BMC Genomics, 13:674.
  • exon candidates from genomic DNA and from EST contigs were joined and the genomic candidates were cleaned for redundancies (98% threshold). This resulted in a set of 312,053 exon candidates, 12,925 of which were represented by ESTs, but were not included in the genome assembly. Data sets were verified as described by the manufacturer (Affymetrix).
  • quality checks included probe-level models, Normalized Unscaled Standard Error (NUSE) and Relative Log Expression (RLE) plots, and the analysis of DABG results as described by the manufacturer.
  • RMA Robust Multi-array Average
  • tissue samples were sequenced using RNA-seq; reads were mapped to the genomes of the 3 varieties using Tophat2.
  • Previously published gene models were used as the basis for the differential gene expression analysis. Expression changes during curing were calculated using the Cuffdiff2 software based on the mapped reads. Genes were considered up-regulated if their expression levels increased significantly during the first 48h of curing, and not if the change was insignificant or decreased.
  • Tobacco proteins were identified by a BLAST search against a database of transcripts for the 3 varieties and equivalent genes in the 3 varieties were identified by a mutual best BLAST hit search of the transcripts of the 3 varieties Burley, Virginia and Oriental (e-value cutoff 1e-80).
  • Example 2. The protease genes identified in Example 2 were analysed for membership of known protease families. The results are set forth in Table 1.
  • the 80 curing-activated protease genes were found to belong to 21 different protease families.
  • AC air-cured
  • FC flue-cured
  • SC sun-cured.
  • AC+FC+SC up-regulated in all three types of tobacco;
  • AC+FC up-regulated in air-cured and flue-cured tobacco;
  • AC+SC up-regulated in air-cured and sun-cured tobacco;
  • FC+SC up-regulated in flue-cured and sun-cured tobacco;
  • AC, FC and SC up-regulated only in the respective tobacco type.
  • Heat shock protein 101 1 1 - - 1 1 -
  • APA 1 is encoded by a single gene in Arabidopsis thaliana and 4 in Tomato.
  • the gene activated in flue-cured Virginia tobacco is close to APA1-Tomato-1.
  • Affymetrix data confirmed the activation of the S form (upper panel) and apparently not the T form during Virginia flue-curing (lower panel).
  • Table 2 illustrates the differential up-regulation of SEQ ID NO:1 to 80 in the three tobacco types air-cured Burley (AC), flue-cured Virginia (FC) and sun-cured Oriental (SC).
  • AC air-cured Burley
  • FC flue-cured Virginia
  • SC sun-cured Oriental
  • ATCTTAGGAAAATGTTACTTTTCTTGCTGAGCTGTTGAAGGTTCAAAGGAACAAGGAAAT
  • CAATCACAGATTCTCATGTGTTCCACATCTGCAGTTATGTTGGGCACGATGAGGGAGGA
  • CAACTTACTGAAGCTGTTAGGAGGCGGCCTTACAGTGTTGTGCTCTTTGATGAAGTTGA
  • AAAGCCATAAAATAATTCAAACAATAATTTACTTAACAAATTACCTTCAATACCACGAATC

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • Plant Pathology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Cell Biology (AREA)
  • Botany (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
  • Manufacture Of Tobacco Products (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Physiology (AREA)
  • Developmental Biology & Embryology (AREA)
  • Environmental Sciences (AREA)

Abstract

The invention provides protease genes which are regulated in a specific manner during curing of tobacco plant material and which affect the flavour of cured tobacco.

Description

TOBACCO PROTEASE GENES
FIELD OF THE INVENTION
The present invention concerns the use of proteases expressed in tobacco to alter the characteristics of cured tobacco products. In particular, the invention provides processes for altering the curing of tobacco leaf and modulating tobacco leaf composition by modulating the expression of one or more tobacco protease genes.
BACKGROUND OF THE INVENTION
Tobacco curing is a process of physical and biochemical changes that bring out the aroma and flavor of each variety of tobacco. After tobacco has been harvested, it is necessary to cure it and then age it before consumption, to improve its flavour. There are four common methods of curing, and the method used depends on the type of tobacco and its intended use.
Air-cured tobacco is sheltered from wind and sun in a well-ventilated chamber, where it air-dries for six to eight weeks. Air-cured tobacco is low in sugar, which gives the tobacco smoke a light, sweet flavor, and high in nicotine. Cigar and burley tobaccos are air cured. In fire curing, smoke from a low-burning fire permeates the leaves. This gives the leaves a distinctive smoky aroma and flavour. Fire curing takes three to ten weeks and produces a tobacco low in sugar and high in nicotine. Pipe tobacco, chewing tobacco, and snuff are fire cured.
Flue-cured tobacco is kept in an enclosed heated area, but it is not directly exposed to smoke. This method produces cigarette tobacco that is high in sugar and has medium to high levels of nicotine. It is the fastest method of curing, requiring about a week. Virginia tobacco that has been flue cured is also called bright tobacco, because flue curing turns its leaves gold, orange, or yellow.
Sun-cured tobacco dries uncovered in the sun. This method is used in Turkey, Greece and other Mediterranean countries to produce oriental tobacco. Sun-cured tobacco is low in sugar and nicotine and is used in cigarettes.
Curing produces various compounds in the tobacco leaves that give cured tobacco its specific flavour and taste, such as for example a sweet hay, tea, rose oil, or fruity aromatic flavor.
During the first phase of curing, corresponding to the so-called yellowing phase and also known as color curing, the chlorophyll content is reduced. This phase takes between 2 and 8 days depending on the tobacco type. During this phase leaf metabolic activities are drastically changed. Not only is chlorophyll degraded but also, for example, starch and proteins. To date, the only methods for altering the curing process which have been proposed are based on altering the actual conditions to which the tobacco is exposed in the chosen curing procedure. Very little is known about gene expression in tobacco during curing, and moreover few data have been reported on the activities of proteases in tobacco leaf and their resulting products.
We have identified 80 protease genes that are activated during leaf curing in the three main tobacco types, Burley, Virginia and Oriental. We have found that specific protease expression is associated with particular flavour profiles in tobacco.
SUMMARY OF THE INVENTION
80 protease genes (SEQ ID NO: 1-80) were identified that are up-regulated in Burley tobacco upon air curing, Virginia tobacco upon flue-curing and Oriental tobacco upon sun curing. Details on such up-regulation in one or more of the different tobacco types are summarised in Figure 2 and Tables 1 and 2.
Such gene sequences and their regulatory sequences can be used to modulate or modify protease activity during curing. The polynucleotide sequences SEQ ID NO: 1-80 include exon and intron sequences. The protein sequences relating to the coding sequence part of the polynucleotide sequences SEQ ID NO: 1-80 are depicted in SEQ ID NO: 81-160.
Accordingly, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell comprising:
(i) a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to any one of SEQ ID NO:1 to SEQ ID No: 80;
(ii) a polypeptide encoded by the polynucleotide set forth in (i);
(iii) a polypeptide comprising, consisting or consisting essentially of a sequence encoding a protease and having at least 95% sequence identity to SEQ ID NO:81 to SEQ ID No: 160; or
(iv) a construct, vector or expression vector comprising the isolated polynucleotide set forth in (i),
and wherein the expression or activity of said protease is modulated as compared to a control tobacco plant cell in which the expression or activity of said protease has not been altered.
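The "at least 95% sequence identity" criterion used above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and that identity is counted over all aligned columns; both are assumptions made here for illustration, since alignment tools differ in how they define the denominator. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns in which both sequences carry a gap
    are ignored; every other column counts towards the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no alignable columns")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the aligned pair reaches the identity threshold (assumed inclusive)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice the alignment itself would come from an established alignment program; this sketch only covers the final counting step.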
Alteration of protease expression in tobacco cells during the curing process imparts different flavours to the cured tobacco and products manufactured therefrom. The effects of different genes on different tobacco flavour profiles are further discussed below.
In embodiments, the expression or activity of said protease is upregulated compared to the control tobacco plant cell. However, in certain embodiments, the expression or activity of said protease is downregulated compared to the control tobacco plant cell. In still further embodiments, at least one protease can be upregulated at the same time as at least one protease is downregulated in the same cell.
In an exemplary embodiment, therefore, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell according to any preceding claim, wherein the expression or activity is modulated of a protease selected from:
at least one of SEQ ID NO: 1 to 16; or
at least one of SEQ ID NO: 30 to 41; or
at least one of SEQ ID NO: 17 to 22; or
at least one of SEQ ID NO: 42 to 44; or
at least one of SEQ ID NO: 45 to 61; or
at least one of SEQ ID NO: 62 to 80; or
at least one of SEQ ID NO: 23 to 29.
In a specific embodiment, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 30 to 41 is modulated in an Oriental type tobacco. In a specific embodiment, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from SEQ ID NO: 17 to 22 is modulated in a Virginia type tobacco.
In a specific embodiment, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 42 to 44 is modulated in a Burley type tobacco. In a specific embodiment, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 45 to 61 is modulated in a Virginia or Oriental type tobacco.
In a specific embodiment, there is provided a non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 62 to 80 is modulated in a Burley or Oriental type tobacco.
In a specific embodiment, there is provided a mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 23 to 29 is modulated in a Burley or Virginia type tobacco.
The mutant, non-naturally occurring or transgenic tobacco plant cell can be a tobacco plant cell wherein said mutation(s) is a heterozygous or homozygous mutation. In embodiments of the invention, the expression of the one or more proteases is increased by about 10% to about 1000%, for example by at least 10%, at least 20%, at least 25%, at least 50%, at least 100%, at least 200%, at least 500%, at least 750% or up to 1000%.
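The percentage ranges above can be made concrete with a short worked calculation. The sketch below assumes expression is measured as a single positive number per sample (for example a normalised transcript level) and that the increase is expressed relative to the control; the function names and the measurement scale are illustrative assumptions, not part of the specification.

```python
def percent_increase(control_level: float, modified_level: float) -> float:
    """Relative change in expression versus the control, in percent."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return 100.0 * (modified_level - control_level) / control_level


def within_stated_range(control_level: float, modified_level: float,
                        low: float = 10.0, high: float = 1000.0) -> bool:
    """True if the increase lies between the lower and upper percentage.

    The word "about" in the text is not modelled here; the bounds are
    applied exactly as written.
    """
    return low <= percent_increase(control_level, modified_level) <= high
```

On this scale an increase of 100% is a doubling of expression, and an increase of 1000% corresponds to an eleven-fold level relative to the control.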
In a second aspect, there is provided a mutant, non-naturally occurring or transgenic plant or component or part thereof comprising the plant cell according to the preceding aspect of the invention.
In a third aspect, there is provided plant material including biomass, seed, stem, flowers or leaves from the plant of the second aspect of the invention.
In a fourth aspect, there is provided a method for preparing a tobacco plant with modulated levels of protease, said method comprising the steps of:
(a) providing a plant comprising (i) a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to at least one of SEQ ID NO:1 to SEQ ID No: 80;
(b) inserting one or more mutations into said polynucleotide of said tobacco plant to create a mutant tobacco plant; and
(c) curing the tobacco plant material.
In some embodiments, the tobacco plant in step (b) is a mutant tobacco plant, preferably, wherein said mutant tobacco plant comprises one or more mutations in one or more further sequence encoding a functional protease and having at least 95% sequence identity to at least one of SEQ ID NO:1 to SEQ ID No: 80. Thus, a plant can be constructed in which one or more cells comprise multiple mutated proteases.
The mutated cells comprising modulated protease expression or activity impart a different flavour profile to tobacco leaf during the curing process. By replicating a leaf chemistry of one tobacco type in another, it is possible to transfer flavour characteristics to a tobacco type which does not normally possess those characteristics.
In embodiments, the genome of a cell of a tobacco plant is modified by a genome editing technology or by genome engineering techniques selected from CRISPR/Cas technology, zinc finger nuclease-mediated mutagenesis, chemical or radiation mutagenesis, homologous recombination, oligonucleotide-directed mutagenesis and meganuclease-mediated mutagenesis.
In a further aspect, therefore, there is provided a method for producing cured plant material, preferably cured leaves, or flowers with an altered flavour profile as compared to control plant material comprising the steps of:
(a) providing a plant or the plant material according to the foregoing aspects of the invention;
(b) optionally harvesting the plant material therefrom; and
(c) curing the plant material for a period of time such that the levels of at least one protease are modulated compared to control cured plant material.
In a still further aspect, there is provided use of
(i) a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to any one of SEQ ID NO:1 to SEQ ID No: 80;
(ii) a polypeptide encoded by the polynucleotide set forth in (i);
(iii) a polypeptide comprising, consisting or consisting essentially of a sequence encoding a protease and having at least 95% sequence identity to SEQ ID NO:81 to SEQ ID No: 160; or
(iv) a construct, vector or expression vector comprising the isolated polynucleotide set forth in (i),
for the modulation of the expression or activity of one or more proteases in tobacco during a tobacco curing procedure.
The curing procedure according to this aspect of the invention can be selected from the group consisting of air curing, fire curing, smoke curing and flue curing.
Modification or modulation of protease activity during curing can be through (further) up-regulation or down-regulation. Modification or modulation can be through genetic engineering using, for example, certain promoter sequences that are (at least) active during such curing. Modulation can also be through, for example, mutagenesis as claimed above of such sequences and/or their regulatory region, resulting in either up- or down-regulation, or complete knock-out, of the protease activity encoded thereby under the respective curing conditions.
In another embodiment there is provided the use of at least one of the 16 gene sequences SEQ ID NO: 1 to 16 (see Table 2), and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 16 sequences that are up-regulated in all three types of tobacco during curing for modifying the flavour of cured tobacco.
In another embodiment there is provided the use of at least one of the 12 gene sequences SEQ ID NO: 30 to 41, and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 12 sequences that are up-regulated both in air-cured Burley and flue-cured Virginia, in an Oriental type tobacco to modify the flavour of said tobacco during curing.
In another embodiment there is provided the use of at least one of the 6 gene sequences SEQ ID NO: 17 to 22, and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 6 sequences, that are up-regulated both in air-cured Burley and sun-cured Oriental, in a Virginia type tobacco to modify the flavour of said tobacco during curing.
In another embodiment there is provided the use of at least one of the 3 gene sequences SEQ ID NO: 42 to 44, and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 3 sequences that are up-regulated in both flue-cured Virginia and sun-cured Oriental tobacco, to modify the flavour of a Burley type tobacco during curing.
In another embodiment there is provided the use of at least one of the 17 gene sequences SEQ ID NO: 45 to 61, and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 17 sequences, that are uniquely up-regulated in air-cured Burley, to modify the flavour in a Virginia or Oriental type tobacco during curing.
In another embodiment there is provided the use of at least one of the 19 gene sequences SEQ ID NO: 62 to 80, and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 19 sequences, that are uniquely up-regulated in flue-cured Virginia, to modify the flavour in a Burley or Oriental type tobacco during curing.
In another embodiment there is provided the use of at least one of the 7 gene sequences SEQ ID NO: 23 to 29, and sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of said 7 sequences, that are uniquely up-regulated in sun-cured Oriental, to modify the flavour of a Burley or Virginia type tobacco during curing.
As certain gene sequences are only up-regulated in one or two of the three tobacco types (as defined according to tobacco type and curing method), certain gene sequences can potentially be used to modify or modulate protease activity during curing such that the outcome with respect to leaf chemistry (for example the metabolite content of the cell) and properties of the obtained tobacco leaf cell are changed such that, for example, an air-cured Burley tobacco acquires certain characteristics of a flue-cured Virginia-type tobacco or sun-cured Oriental tobacco upon curing. This can be done, for example, by modulating the expression of one or more of the gene sequences that are up-regulated in one or two of the tobacco types and not in the other tobacco. For example, 17 gene sequences are uniquely up-regulated in air-cured Burley, 19 in flue-cured Virginia, and 12 in both types of tobacco during curing. By selectively modulating, in flue-cured Virginia, one or more of the 17 gene sequences that are only up-regulated in air-cured Burley, the leaf cell composition of the flue-cured Virginia tobacco upon curing can be altered towards a more Burley type. Using a genetic engineering approach this can be achieved using, for example, a promoter sequence that is active under the curing conditions of the targeted tobacco type. Promoter sequences of use therefor are, for example, the regulatory sequences driving the expression of the gene sequences listed here. Using a mutagenesis, genome editing or engineering approach, the mutated gene sequence can be active under the curing conditions of the targeted tobacco type.
In one example, a regulatory sequence is mutated such that the gene sequence downstream is active under the desired curing conditions. For example, by selectively modifying or modulating the expression of one or more of the 19 sequences that are uniquely up-regulated in flue-cured Virginia in an air-cured Burley type of tobacco, the leaf cell composition of the Burley type tobacco upon curing can be altered towards a more Virginia type. Also, by selectively modulating the expression of one or more of the 12 sequences that are up-regulated both in air-cured Burley and flue-cured Virginia, in a sun-cured Oriental tobacco, the leaf cell composition of the sun-cured Oriental tobacco upon curing can be altered such that it acquires Burley and Virginia characteristics.
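The grouping of SEQ ID NO: 1 to 80 by up-regulation pattern, and the selection of candidate sequences for transferring characteristics from one tobacco type to another, can be sketched as a small lookup. The ranges and patterns are taken from the embodiments above; the data structure, the function names and the rule of matching the donor pattern exactly are illustrative assumptions.

```python
# Up-regulation pattern of each SEQ ID NO range, as stated in the text.
# AC = air-cured Burley, FC = flue-cured Virginia, SC = sun-cured Oriental.
SEQ_ID_GROUPS = [
    (1, 16, frozenset({"AC", "FC", "SC"})),
    (17, 22, frozenset({"AC", "SC"})),
    (23, 29, frozenset({"SC"})),
    (30, 41, frozenset({"AC", "FC"})),
    (42, 44, frozenset({"FC", "SC"})),
    (45, 61, frozenset({"AC"})),
    (62, 80, frozenset({"FC"})),
]


def upregulated_in(seq_id):
    """Return the set of curing types in which a SEQ ID NO is up-regulated."""
    for first, last, types in SEQ_ID_GROUPS:
        if first <= seq_id <= last:
            return types
    raise ValueError("SEQ ID NO outside the range 1 to 80")


def transfer_candidates(donor_types, target_type):
    """SEQ ID NOs up-regulated in exactly the donor types.

    These are the sequences that could be modulated in the target type
    so that it acquires characteristics of the donor type(s).
    """
    donors = frozenset(donor_types)
    if target_type in donors:
        raise ValueError("target type must differ from the donor types")
    ids = []
    for first, last, types in SEQ_ID_GROUPS:
        if types == donors:
            ids.extend(range(first, last + 1))
    return ids
```

For example, `transfer_candidates({"FC"}, "AC")` returns SEQ ID NO: 62 to 80, the 19 sequences uniquely up-regulated in flue-cured Virginia, and `transfer_candidates({"AC", "FC"}, "SC")` returns the 12 sequences SEQ ID NO: 30 to 41. The seven group sizes add up to the 80 sequences listed.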
In one embodiment, one of the gene sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to one or more of the listed sequences, is up-regulated. In another embodiment more than one of the gene sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such sequences, are up-regulated. In another embodiment, one or more of the gene sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such listed sequences, are down-regulated. In another embodiment one or more of the sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such listed sequences are up-regulated, and one or more sequences listed in SEQ ID Nos: 1-80 or sequences comprising, consisting or consisting essentially of a sequence having at least 95% sequence identity to such listed sequences are down-regulated.
As curing conditions determine the ultimate leaf cell chemistry, such modification or modulation affects the way a consumer experiences a product made from such leaf material. Hence, the invention also provides tobacco leaves and products comprising such leaves, obtained according to the methods claimed above. Such products include but are not limited to chewing tobacco, tobacco sticks, extracts obtained therefrom and other smoking articles comprising such leaf material or a material derived therefrom.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1. Expression of CYP82E4 (AGD93125.1 / GI:444237502) increased after 48h curing in the three main tobacco types. SC, sun-cured; FC, flue-cured; AC, air-cured.
Figure 2. The expression of 80 senescence-activated protease genes increased in the three main tobacco types.
Figure 3. One APA 1 tobacco gene (SEQ ID NO: 68) is only expressed during Virginia curing.
DEFINITIONS
The technical terms and expressions used within the scope of this application are generally to be given the meaning commonly applied to them in the pertinent art of plant and molecular biology. All of the following term definitions apply to the complete content of this application. The word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single step may fulfil the functions of several features recited in the claims. The terms "about", "essentially" and "approximately" in the context of a given numerical value or range refer to a value or range that is within 20%, within 10%, or within 5%, 4%, 3%, 2% or 1% of the given value or range.
The term "isolated" refers to any entity that is taken from its natural milieu, but the term does not connote any degree of purification.
An "expression vector" is a nucleic acid vehicle that comprises a combination of nucleic acid components for enabling the expression of nucleic acid. Suitable expression vectors include episomes capable of extra-chromosomal replication such as circular, double-stranded nucleic acid plasmids; linearized double-stranded nucleic acid plasmids; and other functionally equivalent expression vectors of any origin. An expression vector comprises at least a promoter positioned upstream and operably-linked to a nucleic acid, nucleic acid constructs or nucleic acid conjugate, as defined below.
The term "construct" refers to a double-stranded, recombinant nucleic acid fragment comprising one or more polynucleotides. The construct comprises a "template strand" base-paired with a complementary "sense or coding strand." A given construct can be inserted into a vector in two possible orientations, either in the same (or sense) orientation or in the reverse (or anti-sense) orientation with respect to the orientation of a promoter positioned within a vector, such as an expression vector.
A "vector" refers to a nucleic acid vehicle that comprises a combination of nucleic acid components for enabling the transport of nucleic acid, nucleic acid constructs and nucleic acid conjugates and the like. Suitable vectors include episomes capable of extra-chromosomal replication such as circular, double-stranded nucleic acid plasmids; linearized double-stranded nucleic acid plasmids; and other vectors of any origin.
A "promoter" refers to a nucleic acid element/sequence, typically positioned upstream and operably-linked to a double-stranded DNA fragment. Promoters can be derived entirely from regions proximate to a native gene of interest, or can be composed of different elements derived from different native promoters or synthetic DNA segments.
The terms "homology, identity or similarity" refer to the degree of sequence similarity between two polypeptides or between two nucleic acid molecules compared by sequence alignment. The degree of homology between two discrete nucleic acid sequences being compared is a function of the number of identical, or matching, nucleotides at comparable positions. The percent identity may be determined by visual inspection and mathematical calculation. Alternatively, the percent identity of two nucleic acid sequences may be determined by comparing sequence information using a computer program such as ClustalW, BLAST, FASTA or Smith-Waterman.
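The "mathematical calculation" of percent identity mentioned above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by one of the programs named above) and are supplied as equal-length strings with "-" marking gaps. The function name and the choice of denominator (all alignment columns except those where both sequences have a gap) are assumptions made for illustration; alignment programs differ in how they count gapped positions, so their reported values may not match this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap are ignored, and
    the denominator is the number of remaining alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column: not a comparable position
        columns += 1
        if a == b:
            matches += 1  # identical residue at a comparable position

    if columns == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Three of four aligned positions match.
    print(percent_identity("ACGT", "ACGA"))
```

Under these assumptions, a sequence would meet an "at least 95% sequence identity" threshold when the returned value is 95.0 or greater.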
A "variant" means a substantially similar sequence. A variant can have a similar function or substantially similar function as a wild-type sequence. For a protease, a similar function is at least about 50%, 60%, 70%, 80% or 90% of wild-type enzyme function under the same conditions. For a protease, a substantially similar function is at least about 90%, 95%, 96%, 97%, 98% or 99% of wild-type enzyme function under the same conditions. For example, wild-type protease sequences are set forth in SEQ ID Nos: 81-160. The variants can have one or more mutations that result in the enzyme having a reduced level of protease activity as compared to the wild-type protease. The variants can have one or more mutations that result in their protease activity being knocked out (i.e. a 100% inhibition, and thus a nonfunctional polypeptide). Variants can also have increased activity, leading to a more active protease enzyme function.
The term "plant" refers to any plant or part of a plant at any stage of its life cycle or development, and its progenies. In one embodiment, the plant is a "tobacco plant", which refers to a plant belonging to the genus Nicotiana. Preferred species of tobacco plant are described herein.
"Plant parts" include plant cells, plant protoplasts, plant cell tissue cultures from which a whole plant can be regenerated, plant calli, plant clumps and plant cells that are intact in plants or parts of plants such as embryos, pollen, anthers, ovules, seeds, leaves, flowers, stems, branches, fruit, roots, root tips and the like. Progeny, variants and mutants of regenerated plants are also included within the scope of the disclosure, provided that they comprise the introduced polynucleotides described herein.
A "plant cell" refers to a structural and physiological unit of a plant. The plant cell may be in the form of a protoplast without a cell wall, an isolated single cell or a cultured cell, or as a part of higher organized unit such as but not limited to, plant tissue, a plant organ, or a whole plant. The term "plant material" refers to any solid, liquid or gaseous composition, or a combination thereof, obtainable from a plant, including biomass, leaves, stems, roots, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, cuttings, secretions, extracts, cell or tissue cultures, or any other parts or products of a plant. In one embodiment, the plant material comprises or consists of biomass, stem, seed or leaves. In another embodiment, the plant material comprises or consists of leaves.
The term "variety" refers to a population of plants that share constant characteristics which separate them from other plants of the same species. While possessing one or more distinctive traits, a variety is further characterized by a very small overall variation between individuals within that variety. A variety is often sold commercially.
A "type" of tobacco is defined by origin and curing method. Flue-cured tobacco, which accounts for 40% of global production, is also known as "Bright" and "Virginia" tobacco. It is used almost entirely in cigarette blends. Some of the heavier leaves may be used in mixtures for pipe smoking. Some English cigarettes are 100% flue-cured. Flue-cured leaf is characterized by a high sugar: nitrogen ratio. This ratio is enhanced by the picking of the leaf in an advanced stage of ripeness, and by the unique curing process which allows certain chemical changes to occur in the leaf. Cured leaves vary from lemon to orange to mahogany in colour.
Burley is a light air-cured type derived from the White Burley, which arose as a mutant on a farm in Ohio in 1864. Burley is used primarily in cigarette blends. Some of the heavier leaf is used in pipe blends and also for chewing.
Cured burley leaf is characterized by low sugar content and a very low sugar to nitrogen ratio (high nicotine). This is enhanced by high nitrogen fertilizer, harvesting at an early stage of senescence, and the air curing process which allows oxidation of any sugars which may have occurred.
Maryland is another light air-cured type. It is used to some extent in American blended cigarettes and to a greater extent in certain Swiss cigarette blends.
Maryland tobacco is extremely fluffy, has good burning properties, low nicotine, and neutral aroma.
Dark air-cured tobacco encompasses a number of types used mainly for chewing, snuff, cigar, and pipe blends. Most of the world production is confined to the tropics.
Oriental tobacco gives a mild smoke with very characteristic aroma. Resins, waxes and gum exuded by glandular hairs (trichomes) furnish the aroma. Nicotine is low, averaging around 1.0%.
Dark-fired tobacco is used in the production of snuff, chewing tobacco, and pipe blends. Dark-fired leaves are subjected to smoke from smoldering wood during the early stage of curing. The type of wood used is very important in determining taste and grown. Cured leaves are very dark in color and are long and heavy bodied.
The term "modulating" may refer to reducing, inhibiting, increasing or otherwise affecting the expression or activity of a polypeptide. The term may also refer to reducing, inhibiting, increasing or otherwise affecting the activity of a gene encoding a polypeptide which can include, but is not limited to, modulating transcriptional activity.
The term "reduce" or "reduced" as used herein, refers to a reduction of from about 10% to about 99%, or a reduction of at least 10%, at least 20%, at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, or at least 100% or more of a quantity or an activity, such as but not limited to polypeptide activity, transcriptional activity and protein expression. The term "inhibit" or "inhibited" as used herein, refers to a reduction of from about 98% to about 100%, or a reduction of at least 98%, at least 99%, but particularly of 100%, of a quantity or an activity, such as but not limited to polypeptide activity, transcriptional activity and protein expression.
The term "increase" or "increased" as used herein, refers to an increase of from about 5% to about 99%, or an increase of at least 5%, at least 10%, at least 20%, at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99%, at least 100%, at least 500% or at least 1000% or more of a quantity or an activity, such as but not limited to polypeptide activity, transcriptional activity and protein expression.
The term "control" in the context of a control plant means a plant or plant cell in which the expression or activity of an enzyme has not been modified (for example, increased or reduced) and so it can provide a comparison with a plant in which the expression or activity of the enzyme has been modified. The control plant may comprise an empty vector. The control plant or plant cell may correspond to a wild-type plant or wild-type plant cell. For example, the control plant or plant cell can be the same genotype as the starting material for the genetic alteration that resulted in the subject plant. In all such cases, the subject plant and the control plant are cultured and harvested using the same protocols for comparative purposes. Changes in levels, ratios, activity, or distribution of the genes or polypeptides described herein, or changes in tobacco plant phenotype, particularly reduced production of proteases, can be measured by comparing a subject plant to the control plant, where the subject plant and the control plant have been cultured, harvested and cured using the same protocols. The control plant can provide a reference point for measuring changes in phenotype of the subject plant. Changes in phenotype can be measured at any time in a plant, including during plant development, senescence, or preferably after curing. Changes in phenotype can be measured in plants grown under any conditions, including plants grown in a growth chamber, in a greenhouse, or in a field. Changes in phenotype can be measured by determining the expression or activity of proteases identified herein in SEQ ID Nos: 81-160.
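As an illustration of how the terms "reduce", "inhibit" and "increase" defined above relate to a comparison against a control plant, the following minimal sketch computes the percent change of a measured quantity (for example an expression level or an enzyme activity) relative to the control and reports which definitions it would satisfy. The function names are hypothetical, and the cut-offs are an assumed reading of the lower bounds stated in the definitions (at least 10% for "reduced", at least 98% for "inhibited", at least 5% for "increased"); the word "about" in those definitions is not modelled.

```python
def percent_change(subject: float, control: float) -> float:
    """Signed percent change of the subject measurement relative to control."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return 100.0 * (subject - control) / control


def classify_change(subject: float, control: float) -> list[str]:
    """Return every term whose assumed threshold the change satisfies.

    Assumed thresholds: 'reduced' at -10% or lower, 'inhibited' at -98%
    or lower, 'increased' at +5% or higher.
    """
    change = percent_change(subject, control)
    labels = []
    if change <= -10.0:
        labels.append("reduced")
    if change <= -98.0:
        labels.append("inhibited")
    if change >= 5.0:
        labels.append("increased")
    return labels


if __name__ == "__main__":
    # Subject and control are assumed to have been cultured, harvested and
    # cured using the same protocols before measurement.
    print(classify_change(subject=50.0, control=100.0))
```

Because an inhibition is also a reduction under the definitions, a single measurement can carry both labels in this sketch.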
DETAILED DESCRIPTION
In one embodiment, there is provided an isolated polynucleotide comprising, consisting or consisting essentially of a polynucleotide sequence having at least 95% sequence identity to any of the sequences described herein, including any of polynucleotides shown in the sequence listing. Suitably, the isolated polynucleotide comprises, consists or consists essentially of a sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity thereto.
Suitably, the polynucleotide(s) described herein encode a protein with protease activity that is at least about 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, 100% or more of the activity of the protein set forth in SEQ ID NOs: 81-160.
A polynucleotide as described herein can include a polymer of nucleotides, which may be unmodified or modified deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Accordingly, a polynucleotide can be, without limitation, a genomic DNA, complementary DNA (cDNA), mRNA, or antisense RNA or a fragment(s) thereof. Moreover, a polynucleotide can be single-stranded or double-stranded DNA, DNA that is a mixture of single-stranded and double-stranded regions, a hybrid molecule comprising DNA and RNA, or a hybrid molecule with a mixture of single-stranded and double-stranded regions or a fragment(s) thereof. In addition, the polynucleotide can be composed of triple-stranded regions comprising DNA, RNA, or both or a fragment(s) thereof. A polynucleotide can contain one or more modified bases, such as phosphothioates, and can be a peptide nucleic acid. Generally, polynucleotides can be assembled from isolated or cloned fragments of cDNA, genomic DNA, oligonucleotides, or individual nucleotides, or a combination of the foregoing. Although the polynucleotide sequences described herein are shown as DNA sequences, the sequences include their corresponding RNA sequences, and their complementary (for example, completely complementary) DNA or RNA sequences, including the reverse complements thereof.
A polynucleotide as described herein will generally contain phosphodiester bonds, although in some cases, polynucleotide analogues are included that may have alternate backbones, comprising, for example, phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages; and peptide polynucleotide backbones and linkages. Other analogue polynucleotides include those with positive backbones; non-ionic backbones, and non-ribose backbones. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, for example, to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring polynucleotides and analogues can be made; alternatively, mixtures of different polynucleotide analogues, and mixtures of naturally occurring polynucleotides and analogues may be made.
A variety of polynucleotide analogues are known, including, for example, phosphoramidate, phosphorothioate, phosphorodithioate, O-methylphosphoroamidite linkages and peptide polynucleotide backbones and linkages. Other analogue polynucleotides include those with positive backbones, non-ionic backbones and non-ribose backbones. Polynucleotides containing one or more carbocyclic sugars are also included.
Other analogues include peptide polynucleotides which are peptide polynucleotide analogues. These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring polynucleotides. This may result in advantages. First, the peptide polynucleotide backbone may exhibit improved hybridization kinetics. Peptide polynucleotides have larger changes in the melting temperature for mismatched versus perfectly matched base pairs. DNA and RNA typically exhibit a 2-4 °C drop in melting temperature for an internal mismatch. With the non-ionic peptide polynucleotide backbone, the drop is closer to 7-9 °C. Similarly, due to their non-ionic nature, hybridization of the bases attached to these backbones is relatively insensitive to salt concentration. In addition, peptide polynucleotides may not be degraded or degraded to a lesser extent by cellular enzymes, and thus may be more stable.
Among the uses of the disclosed polynucleotides, and fragments thereof, is the use of fragments as probes in nucleic acid hybridisation assays or primers for use in nucleic acid amplification assays. Such fragments generally comprise at least about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 or more contiguous nucleotides of a DNA sequence. In other embodiments, a DNA fragment comprises at least about 10, 15, 20, 30, 40, 50 or 60 or more contiguous nucleotides of a DNA sequence. Thus, in one aspect, there is also provided a method for detecting a polynucleotide encoding a protein with nicotine N-demethylase activity member or encoding a nicotine N-demethylase enzyme comprising the use of the probes or primers or both.
The basic parameters affecting the choice of hybridization conditions and guidance for devising suitable conditions are described by Sambrook, J., E. F. Fritsch, and T. Maniatis (1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). Using knowledge of the genetic code in combination with the amino acid sequences described herein, sets of degenerate oligonucleotides can be prepared. Such oligonucleotides are useful as primers, for example, in polymerase chain reactions (PCR), whereby DNA fragments are isolated and amplified. In certain embodiments, degenerate primers can be used as probes for genetic libraries. Such libraries would include but are not limited to cDNA libraries, genomic libraries, and even electronic expressed sequence tag or DNA libraries. Homologous sequences identified by this method would then be used as probes to identify homologues of the sequences identified herein.
Also of potential use are polynucleotides and oligonucleotides (for example, primers or probes) that hybridize under reduced stringency conditions, typically moderately stringent conditions, and commonly highly stringent conditions to the polynucleotide(s) as described herein. The basic parameters affecting the choice of hybridization conditions and guidance for devising suitable conditions are set forth by Sambrook, J., E. F. Fritsch, and T. Maniatis (1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.) and can be readily determined by those having ordinary skill in the art based on, for example, the length or base composition of the polynucleotide. One way of achieving moderately stringent conditions involves the use of a prewashing solution containing 5x Standard Sodium Citrate, 0.5% Sodium Dodecyl Sulphate, 1.0 mM Ethylenediaminetetraacetic acid (pH 8.0), hybridization buffer of about 50% formamide, 6x Standard Sodium Citrate, and a hybridization temperature of about 55 °C (or other similar hybridization solutions, such as one containing about 50% formamide, with a hybridization temperature of about 42 °C), and washing conditions of about 60 °C, in 0.5x Standard Sodium Citrate, 0.1% Sodium Dodecyl Sulphate. Generally, highly stringent conditions are defined as hybridization conditions as above, but with washing at approximately 68 °C, 0.2x Standard Sodium Citrate, 0.1% Sodium Dodecyl Sulphate. SSPE (1x SSPE is 0.15 M sodium chloride, 10 mM sodium phosphate, and 1.25 mM Ethylenediaminetetraacetic acid, pH 7.4) can be substituted for Standard Sodium Citrate (1x Standard Sodium Citrate is 0.15 M sodium chloride and 15 mM sodium citrate) in the hybridization and wash buffers; washes are performed for 15 minutes after hybridization is complete.
It should be understood that the wash temperature and wash salt concentration can be adjusted as necessary to achieve a desired degree of stringency by applying the basic principles that govern hybridization reactions and duplex stability, as known to those skilled in the art and described further below (see, for example, Sambrook, J., E. F. Fritsch, and T. Maniatis (1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.)). When hybridizing a polynucleotide to a target polynucleotide of unknown sequence, the hybrid length is assumed to be that of the hybridizing polynucleotide. When polynucleotides of known sequence are hybridized, the hybrid length can be determined by aligning the sequences of the polynucleotides and identifying the region or regions of optimal sequence complementarity. The hybridization temperature for hybrids anticipated to be less than 50 base pairs in length should be 5 to 10 °C less than the melting temperature of the hybrid, where melting temperature is determined according to the following equations. For hybrids less than 18 base pairs in length, melting temperature (°C) = 2(number of A+T bases) + 4(number of G+C bases). For hybrids above 18 base pairs in length, melting temperature (°C) = 81.5 + 16.6(log10 [Na+]) + 0.41(% G+C) - (600/N), where N is the number of bases in the hybrid, and [Na+] is the concentration of sodium ions in the hybridization buffer ([Na+] for 1x Standard Sodium Citrate = 0.165 M). Typically, each such hybridizing polynucleotide has a length that is at least 25% (commonly at least 50%, 60%, or 70%, and most commonly at least 80%) of the length of a polynucleotide to which it hybridizes, and has at least 60% sequence identity (for example, at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100%) with a polynucleotide to which it hybridizes.
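The two melting temperature equations given above can be sketched as a short calculation. The function names are hypothetical, and two points are assumptions made for illustration: a hybrid of exactly 18 base pairs, which the text does not assign to either equation, is handled by the second (salt-dependent) equation, and the hybridization temperature is returned as a range 5 to 10 °C below the melting temperature as stated for hybrids shorter than 50 base pairs.

```python
import math


def melting_temperature(hybrid: str, na_molar: float = 0.165) -> float:
    """Melting temperature (deg C) of a hybrid, per the two equations above.

    na_molar defaults to 0.165 M, the [Na+] stated for 1x Standard Sodium
    Citrate. Assumption: a hybrid of exactly 18 bases uses the long formula.
    """
    seq = hybrid.upper()
    n = len(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if n == 0 or at + gc != n:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")

    if n < 18:
        # Short hybrids: 2 deg C per A/T base and 4 deg C per G/C base.
        return 2.0 * at + 4.0 * gc

    percent_gc = 100.0 * gc / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * percent_gc - 600.0 / n


def hybridization_range(hybrid: str, na_molar: float = 0.165) -> tuple[float, float]:
    """Temperature window 10 to 5 deg C below Tm (for hybrids under 50 bp)."""
    tm = melting_temperature(hybrid, na_molar)
    return (tm - 10.0, tm - 5.0)


if __name__ == "__main__":
    print(melting_temperature("ATGCATGCATGC"))       # 12-mer, short formula
    print(round(melting_temperature("ATGC" * 5), 2))  # 20-mer, long formula
```

The sketch only handles unambiguous DNA bases; degenerate or RNA symbols would need additional handling.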
As will be understood by the person skilled in the art, a linear DNA has two possible orientations: the 5'-to-3' direction and the 3'-to-5' direction. For example, if a reference sequence is positioned in the 5'-to-3' direction, and if a second sequence is positioned in the 5'-to-3' direction within the same polynucleotide molecule/strand, then the reference sequence and the second sequence are orientated in the same direction, or have the same orientation. Typically, a promoter sequence and a gene of interest under the regulation of the given promoter are positioned in the same orientation. However, with respect to the reference sequence positioned in the 5'-to-3' direction, if a second sequence is positioned in the 3'-to-5' direction within the same polynucleotide molecule/strand, then the reference sequence and the second sequence are orientated in anti-sense direction, or have anti-sense orientation. Two sequences having anti-sense orientations with respect to each other can be alternatively described as having the same orientation, if the reference sequence (5'-to-3' direction) and the reverse complementary sequence of the reference sequence (reference sequence positioned in the 5'-to-3') are positioned within the same polynucleotide molecule/strand. The sequences set forth herein are shown in the 5'-to-3' direction.
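The relationship between sense and anti-sense orientation described above can be illustrated with a minimal sketch. All sequences are written 5'-to-3'. The function names are hypothetical, and the sketch makes a simplifying assumption: an element is reported as "sense" when it occurs in the strand as written, and as "anti-sense" when only its reverse complement occurs in the strand.

```python
from typing import Optional

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'-to-3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def orientation_in_strand(strand: str, element: str) -> Optional[str]:
    """Orientation of `element` within `strand` (both written 5'-to-3').

    Returns 'sense' if the element occurs as written, 'anti-sense' if only
    its reverse complement occurs, and None if neither is found.
    """
    strand = strand.upper()
    element = element.upper()
    if element in strand:
        return "sense"
    if reverse_complement(element) in strand:
        return "anti-sense"
    return None


if __name__ == "__main__":
    strand = "AAATGCCC"
    print(orientation_in_strand(strand, "ATGC"))  # element present as written
    print(orientation_in_strand(strand, "GGGC"))  # present as reverse complement
```

A sequence that equals its own reverse complement is reported as "sense" here, since the first check succeeds.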
Recombinant constructs provided herein can be used to transform plants or plant cells in order to modulate protein expression and/or activity levels. A recombinant polynucleotide construct can comprise a polynucleotide encoding one or more polypeptides as described herein, operably linked to a regulatory region suitable for expressing the polypeptide. Thus, a polynucleotide can comprise a coding sequence that encodes the polypeptide as described herein. Plants or plant cells in which protein expression and/or activity levels are modulated can include mutant, non-naturally occurring, transgenic, man-made or genetically engineered plants or plant cells. Suitably, the transgenic plant or plant cell comprises a genome that has been altered by the stable integration of recombinant DNA. Recombinant DNA includes DNA which has been genetically engineered and constructed outside of a cell and includes DNA containing naturally occurring DNA or cDNA or synthetic DNA. A transgenic plant can include a plant regenerated from an originally-transformed plant cell and progeny transgenic plants from later generations or crosses of a transformed plant. Suitably, the transgenic modification alters the expression or activity of the polynucleotide or the polypeptide described herein as compared to a control plant.
The polypeptide encoded by a recombinant polynucleotide can be a native polypeptide, or can be heterologous to the cell. In some cases, the recombinant construct contains a polynucleotide that modulates expression, operably linked to a regulatory region. Examples of suitable regulatory regions are described herein.
Vectors containing recombinant polynucleotide constructs such as those described herein are also provided. Suitable vector backbones include, for example, those routinely used in the art such as plasmids, viruses, artificial chromosomes, bacterial artificial chromosomes, yeast artificial chromosomes, or bacteriophage artificial chromosomes. Suitable expression vectors include, without limitation, plasmids and viral vectors derived from, for example, bacteriophage, baculoviruses, and retroviruses. Numerous vectors and expression systems are commercially available. The vectors can include, for example, origins of replication, scaffold attachment regions or markers. A marker gene can confer a selectable phenotype on a plant cell. For example, a marker can confer biocide resistance, such as resistance to an antibiotic (for example, kanamycin, G418, bleomycin, or hygromycin), or an herbicide (for example, glyphosate, chlorsulfuron or phosphinothricin). In addition, an expression vector can include a tag sequence designed to facilitate manipulation or detection (for example, purification or localization) of the expressed polypeptide. Tag sequences, such as luciferase, beta-glucuronidase, green fluorescent protein, glutathione S-transferase, polyhistidine, c-myc or hemagglutinin sequences typically are expressed as a fusion with the encoded polypeptide. Such tags can be inserted anywhere within the polypeptide, including at either the carboxyl or amino terminus.
A plant or plant cell can be transformed by having the recombinant polynucleotide integrated into its genome to become stably transformed. The plant or plant cell described herein can be stably transformed. Stably transformed cells typically retain the introduced polynucleotide with each cell division. A plant or plant cell can be transiently transformed such that the recombinant polynucleotide is not integrated into its genome. Transiently transformed cells typically lose all or some portion of the introduced recombinant polynucleotide with each cell division such that the introduced recombinant polynucleotide cannot be detected in daughter cells after a sufficient number of cell divisions.
A number of methods are available in the art for transforming a plant cell which are all encompassed herein, including biolistics, gene gun techniques, Agrobacterium-mediated transformation, viral vector-mediated transformation and electroporation. The Agrobacterium system for integration of foreign DNA into plant chromosomes has been extensively studied, modified, and exploited for plant genetic engineering. Naked recombinant DNA molecules comprising DNA sequences corresponding to the subject purified tobacco protein operably linked, in the sense or antisense orientation, to regulatory sequences are joined to appropriate T-DNA sequences by conventional methods. These are introduced into tobacco protoplasts by polyethylene glycol techniques or by electroporation techniques, both of which are standard. Alternatively, such vectors comprising recombinant DNA molecules encoding the subject purified tobacco protein are introduced into live Agrobacterium cells, which then transfer the DNA into the tobacco plant cells. Transformation by naked DNA without accompanying T-DNA vector sequences can be accomplished via fusion of tobacco protoplasts with DNA-containing liposomes or via electroporation. Naked DNA unaccompanied by T-DNA vector sequences can also be used to transform tobacco cells via inert, high velocity microprojectiles.
If a cell or cultured tissue is used as the recipient tissue for transformation, plants can be regenerated from transformed cultures if desired, by techniques known to those skilled in the art.
The choice of regulatory regions to be included in a recombinant construct depends upon several factors, including, but not limited to, efficiency, selectability, inducibility, desired expression level, and cell- or tissue-preferential expression. It is a routine matter for one of skill in the art to modulate the expression of a coding sequence by appropriately selecting and positioning regulatory regions relative to the coding sequence. Transcription of a polynucleotide can be modulated in a similar manner. Some suitable regulatory regions initiate transcription only, or predominantly, in certain cell types. Methods for identifying and characterizing regulatory regions in plant genomic DNA are known in the art.
Suitable promoters include tissue-specific promoters recognized by tissue-specific factors present in different tissues or cell types (for example, root-specific promoters, shoot-specific promoters, xylem-specific promoters), or present during different developmental stages, or present in response to different environmental conditions. Suitable promoters include constitutive promoters that can be activated in most cell types without requiring specific inducers. Examples of suitable promoters for controlling RNAi polypeptide production include the cauliflower mosaic virus 35S (CaMV/35S), SSU, OCS, Iib4, usp, STLS1 , B33, nos or ubiquitin- or phaseolin-promoters. Persons skilled in the art are capable of generating multiple variations of recombinant promoters.
Tissue-specific promoters are transcriptional control elements that are only active in particular cells or tissues at specific times during plant development, such as in vegetative tissues or reproductive tissues. Tissue-specific expression can be advantageous, for example, when the expression of polynucleotides in certain tissues is preferred. Examples of tissue-specific promoters under developmental control include promoters that can initiate transcription only (or primarily only) in certain tissues, such as vegetative tissues, for example, roots or leaves, or reproductive tissues, such as fruit, ovules, seeds, pollen, pistils, flowers, or any embryonic tissue. Reproductive tissue-specific promoters may be, for example, anther-specific, ovule-specific, embryo-specific, endosperm-specific, integument-specific, seed and seed coat-specific, pollen-specific, petal-specific, sepal-specific, or combinations thereof.
Suitable leaf-specific promoters include the pyruvate, orthophosphate dikinase (PPDK) promoter from a C4 plant (maize), the cab-m1 Ca+2 promoter from maize, the Arabidopsis thaliana myb-related gene promoter (Atmyb5), and the ribulose bisphosphate carboxylase (RBCS) promoters (for example, the tomato RBCS1, RBCS2 and RBCS3A genes expressed in leaves and light-grown seedlings, RBCS1 and RBCS2 expressed in developing tomato fruits or ribulose bisphosphate carboxylase promoter expressed almost exclusively in mesophyll cells in leaf blades and leaf sheaths at high levels).
Suitable senescence-specific promoters include a tomato promoter active during fruit ripening, senescence and abscission of leaves, a maize promoter of gene encoding a cysteine protease, the promoter of 82E4 and the promoter of SAG genes. Suitable anther-specific promoters can be used. Suitable root-preferred promoters known to persons skilled in the art may be selected. Suitable seed-preferred promoters include both seed-specific promoters (those promoters active during seed development such as promoters of seed storage proteins) and seed-germinating promoters (those promoters active during seed germination). Such seed-preferred promoters include, but are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); milps (myo-inositol-1-phosphate synthase); mZE40-2, also known as Zm-40; nuclc; and celA (cellulose synthase). Gamma-zein is an endosperm-specific promoter. Glob-1 is an embryo-specific promoter. For dicots, seed-specific promoters include, but are not limited to, bean beta-phaseolin, napin, β-conglycinin, soybean lectin, cruciferin, and the like. For monocots, seed-specific promoters include, but are not limited to, a maize 15 kDa zein promoter, a 22 kDa zein promoter, a 27 kDa zein promoter, a g-zein promoter, a 27 kDa gamma-zein promoter (such as gzw64A promoter, see Genbank Accession number S78780), a waxy promoter, a shrunken 1 promoter, a shrunken 2 promoter, a globulin 1 promoter (see Genbank Accession number L22344), an Itp2 promoter, cim1 promoter, maize end1 and end2 promoters, nud promoter, Zm40 promoter, eep1 and eep2; lec1, thioredoxin H promoter; mlip15 promoter, PCNA2 promoter; and the shrunken-2 promoter.
Examples of inducible promoters include promoters responsive to pathogen attack, anaerobic conditions, elevated temperature, light, drought, cold temperature, or high salt concentration. Pathogen-inducible promoters include those from pathogenesis-related proteins (PR proteins), which are induced following infection by a pathogen (for example, PR proteins, SAR proteins, beta-1,3-glucanase, chitinase). In addition to plant promoters, other suitable promoters may be derived from bacterial origin, for example, the octopine synthase promoter, the nopaline synthase promoter and other promoters derived from Ti plasmids, or may be derived from viral promoters (for example, the 35S and 19S RNA promoters of cauliflower mosaic virus (CaMV), constitutive promoters of tobacco mosaic virus, or the figwort mosaic virus 35S promoter).
Preferred promoters include the control elements provided herein, as part of SEQ ID NOs: 1-80, which demonstrate desirable expression during curing procedures in tobacco leaf.
In another aspect, there is provided an isolated polypeptide comprising, consisting or consisting essentially of a polypeptide sequence having at least 95% sequence identity to any of the polypeptide sequences described herein, including any of the polypeptides shown in the sequence listing. Suitably, the isolated polypeptide comprises, consists or consists essentially of a sequence having at least 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or 100% sequence identity thereto.
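The description does not state how sequence identity is to be computed. By way of illustration only, the following is a minimal sketch that assumes the two polypeptide sequences have already been aligned (with gaps written as '-') and that identity is counted over all aligned columns in which at least one sequence has a residue. The function names and the choice of denominator are assumptions made for this example and are not taken from the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are ignored; a gap
    opposite a residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the aligned pair reaches the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQRQISFVKSHFS"   # hypothetical 20-residue sequence
    variant = "MKTAYIAKQRQISFVKSHFA"     # one substitution at the last residue
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

Other conventions (for example, dividing by the length of the shorter sequence, or using a scored alignment tool) would give different percentages for the same pair, so the convention used should be stated alongside any reported value.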
The polypeptide can include sequences comprising a sufficient or substantial degree of identity or similarity to SEQ ID NOs: 81-160 to function as proteases. Fragments of the polypeptide(s) typically retain some or all of the activity of the full length sequence.
As discussed herein, the polypeptides also include mutants produced by introducing any type of alterations (for example, insertions, deletions, or substitutions of amino acids; changes in glycosylation states; changes that affect refolding or isomerizations, three-dimensional structures, or self-association states), which can be deliberately engineered or isolated naturally, provided that they still have some or all of their function or activity as a protease. Suitably, the function or activity as a protease is modulated, increased or reduced. Polypeptides include variants produced by introducing any type of alterations (for example, insertions, deletions, or substitutions of amino acids; changes in glycosylation states; changes that affect refolding or isomerizations, three-dimensional structures, or self-association states), which can be deliberately engineered or isolated naturally. The variant may have alterations which produce a silent change and result in a functionally equivalent protein. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity and the amphipathic nature of the residues as long as the secondary binding activity of the substance is retained. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, valine, glycine, alanine, asparagine, glutamine, serine, threonine, phenylalanine, and tyrosine. Conservative substitutions may be made, for example according to the Table below. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:
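[Table of conservative amino acid substitutions: provided as image imgf000021_0001 in the original publication; contents not reproduced here]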
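By way of illustration only, the groupings named in the preceding paragraph (negatively charged, positively charged, and the listed uncharged residues) can be encoded to check whether a proposed substitution stays within one group. This sketch uses only the residues listed in the text and does not reproduce the Table; the names `SUBSTITUTION_GROUPS`, `group_of` and `is_same_group_substitution` are hypothetical.

```python
# Groups taken from the sentence in the description, using one-letter codes.
SUBSTITUTION_GROUPS = {
    "negatively charged": set("DE"),       # aspartic acid, glutamic acid
    "positively charged": set("KR"),       # lysine, arginine
    "uncharged (as listed)": set("LIVGANQSTFY"),
}


def group_of(residue: str):
    """Return the name of the group containing the residue, or None."""
    residue = residue.upper()
    for name, members in SUBSTITUTION_GROUPS.items():
        if residue in members:
            return name
    return None  # residue is not named in the description's example list


def is_same_group_substitution(original: str, replacement: str) -> bool:
    """True if both residues fall in the same listed group."""
    g_original = group_of(original)
    return g_original is not None and g_original == group_of(replacement)


if __name__ == "__main__":
    for pair in [("D", "E"), ("K", "E"), ("L", "I"), ("W", "F")]:
        print(pair, is_same_group_substitution(*pair))
```

Residues that the text does not list (for example tryptophan) are treated here as belonging to no group, so any substitution involving them is reported as not conservative under this sketch.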
The polypeptide may be a mature protein or an immature protein or a protein derived from an immature protein. Polypeptides may be in linear form or cyclized using known methods. Polypeptides typically comprise at least 10, at least 20, at least 30, or at least 40 contiguous amino acids.
A tobacco plant or plant cell comprising a mutation in a gene encoding a protease as described herein is disclosed, wherein said mutation results in modulated expression or modulated function of said protease. The expression or function of the protease(s) may be enhanced. Aside from one or more mutations in said protease, the mutant plants or plant cells can have one or more further mutations in one or more other genes or polypeptides. In certain embodiments, aside from the one or more mutations in a protease gene, the mutants can have one or more further mutations in one or more other genes or polypeptides - such as one or more other protease genes or polypeptides as described in the Sequence Listing. Suitably, a protease is expressed in the leaves of the mutant plant during the curing procedure.
There is also provided a method for modulating the level of a protease in a (cured) tobacco plant or in (cured) tobacco plant material, said method comprising introducing into the genome of said plant one or more mutations that modulate expression of at least one protease gene, wherein said at least one protease gene is selected from SEQ ID NOs: 1-80. There is also provided a method for identifying a tobacco plant with increased levels of protease, said method comprising screening a nucleic acid sample from a tobacco plant of interest for the presence of one or more mutations in SEQ ID NOs: 1-80, and optionally correlating the identified mutation(s) with mutation(s) that are known to modulate levels of protease. There is also disclosed a tobacco plant or plant cell that is heterozygous or homozygous for mutations in a gene encoding a protease, wherein said mutation results in modulated (enhanced or reduced) expression or function of said protease.
A number of approaches can be used to combine mutations in one plant including sexual crossing. A plant having one or more favourable heterozygous or homozygous mutations in a protease gene that enhances or reduces protease expression or activity can be crossed with a plant having one or more favourable heterozygous or homozygous mutations in one or more other protease genes that enhance or reduce protease activity. In one embodiment, crosses are made in order to introduce one or more favourable heterozygous or homozygous mutations within a protease gene within the same plant.
The activity of one or more protease polypeptides in a tobacco plant is reduced or enhanced according to the present disclosure if the protease activity is statistically lower or higher than the protease activity of the same protease(s) in a tobacco plant that has not been modified to inhibit the activity of that protease polypeptide and which has been cultured, harvested and cured using the same protocols.
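The description does not name a statistical test or a significance level for deciding that activity is "statistically lower or higher". By way of illustration only, the following sketch applies an exact two-sided permutation test on the difference of mean activity between modified and control plants. The choice of test, the function name `permutation_p_value`, the activity values and the 0.05 threshold are all assumptions made for this example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(modified, control):
    """Exact two-sided permutation test on the difference of group means.

    Every way of splitting the pooled measurements into groups of the
    original sizes is enumerated, so this is only practical for small samples.
    """
    modified = list(modified)
    control = list(control)
    if not modified or not control:
        raise ValueError("both groups need at least one measurement")
    pooled = modified + control
    observed = abs(mean(modified) - mean(control))
    indices = range(len(pooled))
    total = 0
    at_least_as_extreme = 0
    for chosen in combinations(indices, len(modified)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding on ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total


if __name__ == "__main__":
    # Hypothetical protease activity values (arbitrary units) from plants
    # grown, harvested and cured under the same protocol.
    modified_plants = [4.1, 4.3, 4.0, 4.4, 4.2]
    control_plants = [9.8, 10.1, 10.4, 9.9, 10.2]
    p = permutation_p_value(modified_plants, control_plants)
    print(p, "different at the assumed 0.05 level" if p < 0.05 else "not different")
```

Whatever test is used, the comparison is only meaningful when the modified and control plants have been cultured, harvested and cured using the same protocols, as the paragraph above requires.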
In some embodiments, the mutation(s) is introduced into a tobacco plant or plant cell using a mutagenesis approach, and the introduced mutation is identified or selected using methods known to those of skill in the art - such as Southern blot analysis, DNA sequencing, PCR analysis, or phenotypic analysis. Mutations that impact gene expression or that interfere with the function of the encoded protein can be determined using methods that are well known in the art. Insertional mutations in gene exons usually result in null-mutants. Mutations in conserved residues can be particularly effective in inhibiting the metabolic function of the encoded protein.
Methods for obtaining mutant polynucleotides and polypeptides are also disclosed. Any plant of interest, including a plant cell or plant material, can be genetically modified by various methods known to induce mutagenesis, including site-directed mutagenesis, oligonucleotide-directed mutagenesis, chemically-induced mutagenesis, irradiation-induced mutagenesis, mutagenesis utilizing modified bases, mutagenesis utilizing gapped duplex DNA, double-strand break mutagenesis, mutagenesis utilizing repair-deficient host strains, mutagenesis by total gene synthesis, DNA shuffling and other equivalent methods.
Fragments of protease polynucleotides and polypeptides encoded thereby are also disclosed. Fragments of a polynucleotide may encode protein fragments that retain the biological activity of the native protein and hence are involved in the metabolic conversion of nicotine to nornicotine. Alternatively, fragments of a polynucleotide that are useful as hybridization probes or PCR primers generally do not encode fragment proteins retaining biological activity. Furthermore, fragments of the disclosed nucleotide sequences include those that can be assembled within recombinant constructs as discussed herein. Fragments of a polynucleotide sequence may range from at least about 25 nucleotides, about 50 nucleotides, about 75 nucleotides, about 100 nucleotides, about 150 nucleotides, about 200 nucleotides, about 250 nucleotides, about 300 nucleotides, about 400 nucleotides, about 500 nucleotides, about 600 nucleotides, about 700 nucleotides, about 800 nucleotides, about 900 nucleotides, about 1000 nucleotides, about 1100 nucleotides, about 1200 nucleotides, about 1300 nucleotides or about 1400 nucleotides and up to the full-length polynucleotide encoding the polypeptides described herein. Fragments of a polypeptide sequence may range from at least about 25 amino acids, about 50 amino acids, about 75 amino acids, about 100 amino acids, about 150 amino acids, about 200 amino acids, about 250 amino acids, about 300 amino acids, about 400 amino acids, about 500 amino acids, and up to the full-length polypeptide described herein.
Mutant polypeptide variants can be used to create mutant, non-naturally occurring or transgenic plants (for example, mutant, non-naturally occurring, transgenic, man-made or genetically engineered plants) or plant cells comprising one or more mutant polypeptide variants. Suitably, mutant polypeptide variants retain the activity of the unmutated polypeptide. The activity of the mutant polypeptide variant may be higher, lower or about the same as the unmutated polypeptide.
Mutations in the nucleotide sequences and polypeptides described herein can include man-made mutations or synthetic mutations or genetically engineered mutations. Mutations in the nucleotide sequences and polypeptides described herein can be mutations that are obtained or obtainable via a process which includes an in vitro or an in vivo manipulation step. Mutations in the nucleotide sequences and polypeptides described herein can be mutations that are obtained or obtainable via a process which includes intervention by man. By way of example, the process may include mutagenesis using exogenously added chemicals - such as mutagenic, teratogenic, or carcinogenic organic compounds, for example ethyl methanesulfonate (EMS), that produce random mutations in genetic material. By way of further example, the process may include one or more genetic engineering steps - such as one or more of the genetic engineering steps that are described herein or combinations thereof. By way of further example, the process may include one or more plant crossing steps.
A polypeptide may be prepared by culturing transformed or recombinant host cells under culture conditions suitable to express a polypeptide. The resulting expressed polypeptide may then be purified from such culture using known purification processes. The purification of the polypeptide may include an affinity column containing agents which will bind to the polypeptide; one or more column steps over such affinity resins; one or more steps involving hydrophobic interaction chromatography; or immunoaffinity chromatography. Alternatively, the polypeptide may also be expressed in a form that will facilitate purification. For example, it may be expressed as a fusion polypeptide, such as those of maltose binding polypeptide, glutathione-S-transferase, his-tag or thioredoxin. Kits for expression and purification of fusion polypeptides are commercially available. The polypeptide may be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope. One or more liquid chromatography steps - such as reverse-phase high performance liquid chromatography - can be employed to further purify the polypeptide. Some or all of the foregoing purification steps, in various combinations, can be employed to provide a substantially homogeneous recombinant polypeptide. The polypeptide thus purified may be substantially free of other polypeptides and is defined herein as a "substantially purified polypeptide"; such purified polypeptides include polypeptides, fragments, variants, and the like. Expression, isolation, and purification of the polypeptides and fragments can be accomplished by any suitable technique, including but not limited to the methods described herein.
It is also possible to utilise an affinity column such as a monoclonal antibody generated against polypeptides, to affinity-purify expressed polypeptides. These polypeptides can be removed from an affinity column using conventional techniques, for example, in a high salt elution buffer and then dialyzed into a lower salt buffer for use or by changing pH or other components depending on the affinity matrix utilized, or be competitively removed using the naturally occurring substrate of the affinity moiety.
Isolated or substantially purified polynucleotides or protein compositions are disclosed. An "isolated" or "purified" polynucleotide or protein, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the polynucleotide or protein as found in its naturally occurring environment. Thus, an isolated or purified polynucleotide or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Optimally, an "isolated" polynucleotide is free of sequences (optimally protein encoding sequences) that naturally flank the polynucleotide (for example, sequences located at the 5' and 3' ends of the polynucleotide) in the genomic DNA of the organism from which the polynucleotide is derived. For example, in various embodiments, the isolated polynucleotide can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequence that naturally flanks the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived. A protein that is substantially free of cellular material includes preparations of protein having less than about 30%, 20%, 10%, 5%, or 1% (by dry weight) of contaminating protein.
A polypeptide may also be produced by known conventional chemical synthesis. Methods for constructing the polypeptides or fragments thereof by synthetic means are known to those skilled in the art. The synthetically-constructed polypeptide sequences, by virtue of sharing primary, secondary or tertiary structural or conformational characteristics with native polypeptides may possess biological properties in common therewith, including biological activity.
The term 'non-naturally occurring' as used herein describes an entity (for example, a polynucleotide, a genetic mutation, a polypeptide, a plant, a plant cell and plant material) that is not formed by nature or that does not exist in nature. Such non-naturally occurring entities or artificial entities may be made, synthesized, initiated, modified, intervened, or manipulated by methods described herein or that are known in the art. Such non-naturally occurring entities or artificial entities may be made, synthesized, initiated, modified, intervened, or manipulated by man. Thus, by way of example, a non-naturally occurring plant, a non-naturally occurring plant cell or non-naturally occurring plant material may be made using traditional plant breeding techniques - such as backcrossing - or by genetic manipulation technologies - such as antisense RNA, interfering RNA, meganuclease and the like. By way of further example, a non-naturally occurring plant, a non-naturally occurring plant cell or non-naturally occurring plant material may be made by introgression of or by transferring one or more genetic mutations (for example one or more polymorphisms) from a first plant or plant cell into a second plant or plant cell (which may itself be naturally occurring), such that the resulting plant, plant cell or plant material or the progeny thereof comprises a genetic constitution (for example, a genome, a chromosome or a segment thereof) that is not formed by nature or that does not exist in nature. The resulting plant, plant cell or plant material is thus artificial or non-naturally occurring. Accordingly, an artificial or non-naturally occurring plant or plant cell may be made by modifying a genetic sequence in a first naturally occurring plant or plant cell, even if the resulting genetic sequence occurs naturally in a second plant or plant cell that comprises a different genetic background from the first plant or plant cell.
In certain embodiments, a mutation is not a naturally occurring mutation that exists naturally in a nucleotide sequence or a polypeptide - such as a gene or a protein.
Differences in genetic background can be detected by phenotypic differences or by molecular biology techniques known in the art - such as nucleic acid sequencing, presence or absence of genetic markers (for example, microsatellite RNA markers).
Antibodies that are immunoreactive with the polypeptides described herein are also provided. The polypeptides, fragments, variants, fusion polypeptides, and the like, as set forth herein, can be employed as "immunogens" in producing antibodies immunoreactive therewith. Such antibodies may specifically bind to the polypeptide via the antigen-binding sites of the antibody. Specifically binding antibodies are those that will specifically recognize and bind with a polypeptide, homologues, and variants, but not with other molecules. In one embodiment, the antibodies are specific for polypeptides having an amino acid sequence as set forth herein and do not cross-react with other polypeptides. More specifically, the polypeptides, fragments, variants, fusion polypeptides, and the like contain antigenic determinants or epitopes that elicit the formation of antibodies. These antigenic determinants or epitopes can be either linear or conformational (discontinuous). Linear epitopes are composed of a single section of amino acids of the polypeptide, while conformational or discontinuous epitopes are composed of amino acid sections from different regions of the polypeptide chain that are brought into close proximity upon polypeptide folding. Epitopes can be identified by any of the methods known in the art. Additionally, epitopes from the polypeptides can be used as research reagents, in assays, and to purify specific binding antibodies from substances such as polyclonal sera or supernatants from cultured hybridomas. Such epitopes or variants thereof can be produced using techniques known in the art such as solid-phase synthesis, chemical or enzymatic cleavage of a polypeptide, or using recombinant DNA technology.
Both polyclonal and monoclonal antibodies to the polypeptides can be prepared by conventional techniques. Hybridoma cell lines that produce monoclonal antibodies specific for the polypeptides are also contemplated herein. Such hybridomas can be produced and identified by conventional techniques. For the production of antibodies, various host animals may be immunized by injection with a polypeptide, fragment, variant, or mutants thereof. Such host animals may include, but are not limited to, rabbits, mice, and rats, to name a few. Various adjuvants may be used to increase the immunological response. Depending on the host species, such adjuvants include, but are not limited to, Freund's (complete and incomplete), mineral gels such as aluminium hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. The monoclonal antibodies can be recovered by conventional techniques. Such monoclonal antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD, and any subclass thereof.
The antibodies can also be used in assays to detect the presence of the polypeptides or fragments, either in vitro or in vivo. The antibodies also can be employed in purifying polypeptides or fragments by immunoaffinity chromatography.
Other than mutagenesis, compositions that can modulate the expression or the activity of one or more of the proteases described herein include, but are not limited to, sequence-specific polynucleotides that can interfere with the transcription of one or more endogenous gene(s); sequence-specific polynucleotides that can interfere with the translation of RNA transcripts (for example, double-stranded RNAs, siRNAs, ribozymes); sequence-specific polypeptides that can interfere with the stability of one or more proteins; sequence-specific polynucleotides that can interfere with the enzymatic activity of one or more proteins or the binding activity of one or more proteins with respect to substrates or regulatory proteins; antibodies that exhibit specificity for one or more proteins; small molecule compounds that can interfere with the stability of one or more proteins or the enzymatic activity of one or more proteins or the binding activity of one or more proteins; zinc finger proteins that bind one or more polynucleotides; and meganucleases that have activity towards one or more polynucleotides. Gene editing technologies, genetic editing technologies and genome editing technologies are well known in the art.
One method of gene editing involves the use of transcription activator-like effector nucleases (TALENs) which induce double-strand breaks which cells can respond to with repair mechanisms. Non-homologous end joining reconnects DNA from either side of a double-strand break where there is very little or no sequence overlap for annealing. This repair mechanism induces errors in the genome via insertion or deletion, or chromosomal rearrangement. Any such errors may render the gene products coded at that location nonfunctional. Another method of gene editing involves the use of the bacterial CRISPR/Cas system. Bacteria and archaea exhibit chromosomal elements called clustered regularly interspaced short palindromic repeats (CRISPR) that are part of an adaptive immune system that protects against invading viral and plasmid DNA. In Type II CRISPR systems, CRISPR RNAs (crRNAs) function with trans-activating crRNA (tracrRNA) and CRISPR-associated (Cas) proteins to introduce double-stranded breaks in target DNA. Target cleavage by Cas9 requires base-pairing between the crRNA and tracrRNA as well as base pairing between the crRNA and the target DNA. Target recognition is facilitated by the presence of a short motif called a protospacer-adjacent motif (PAM) that conforms to the sequence NGG. This system can be harnessed for genome editing. Cas9 is normally programmed by a dual RNA consisting of the crRNA and tracrRNA. However, the core components of these RNAs can be combined into a single hybrid 'guide RNA' for Cas9 targeting. The use of a noncoding RNA guide to target DNA for site-specific cleavage promises to be significantly more straightforward than existing technologies - such as TALENs. Using the CRISPR/Cas strategy, retargeting the nuclease complex only requires introduction of a new RNA sequence and there is no need to reengineer the specificity of protein transcription factors.
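By way of illustration only, the PAM requirement described above can be used to list candidate Cas9 target sites in a gene sequence. The following sketch scans the forward strand of a DNA string for stretches immediately followed by an NGG motif. The 20-nucleotide protospacer length, the function name `find_ngg_targets` and the example sequence are assumptions made for this example; the description itself specifies only the NGG motif.

```python
import re


def find_ngg_targets(dna: str, protospacer_len: int = 20):
    """List forward-strand sites that are immediately followed by an NGG PAM.

    Returns tuples of (start, protospacer, pam), where start is the 0-based
    index of the first protospacer base. The reverse strand is not scanned.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(
        "(?=([ACGT]{" + str(protospacer_len) + "})([ACGT]GG))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the sequence listing.
    example = "TTGACCTGAAGCTTACGATCGATGGAT"
    for start, protospacer, pam in find_ngg_targets(example):
        print(start, protospacer, pam)
```

A practical guide design would also scan the reverse complement and screen candidates for off-target matches elsewhere in the genome, neither of which is attempted here.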
Antisense technology is another well-known method that can be used to modulate the expression of a polypeptide. A polynucleotide of the gene to be repressed is cloned and operably linked to a regulatory region and a transcription termination sequence so that the antisense strand of RNA is transcribed. The recombinant construct is then transformed into a plant cell and the antisense strand of RNA is produced. The polynucleotide need not be the entire sequence of the gene to be repressed, but typically will be substantially complementary to at least a portion of the sense strand of the gene to be repressed.
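By way of illustration only, the relationship between a portion of the sense strand and the antisense RNA transcribed from the construct can be expressed as a reverse complement. The following sketch assumes the input is a fragment of the sense (coding) strand written 5' to 3' using only the letters A, C, G and T; the function name `antisense_rna` and the example fragment are hypothetical.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_rna(sense_dna_fragment: str) -> str:
    """Return the antisense RNA (written 5'->3') for a sense-strand DNA fragment.

    The result is complementary to the mRNA region that corresponds to the
    given fragment.
    """
    fragment = sense_dna_fragment.upper()
    unexpected = set(fragment) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    reverse_complement = fragment.translate(_DNA_COMPLEMENT)[::-1]
    return reverse_complement.replace("T", "U")  # write as RNA


if __name__ == "__main__":
    # Hypothetical six-nucleotide fragment of a sense strand.
    print(antisense_rna("ATGGCC"))
```

Because the antisense strand need only be substantially complementary to a portion of the sense strand, the fragment passed in may be any sub-region of the gene rather than its full length.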
A polynucleotide may be transcribed into a ribozyme, or catalytic RNA, that affects expression of an mRNA. Ribozymes can be designed to specifically pair with virtually any target RNA and cleave the phosphodiester backbone at a specific location, thereby functionally inactivating the target RNA. Heterologous polynucleotides can encode ribozymes designed to cleave particular mRNA transcripts, thus preventing expression of a polypeptide. Hammerhead ribozymes are useful for destroying particular mRNAs, although various ribozymes that cleave mRNA at site-specific recognition sequences can be used. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target RNA contains a 5'-UG-3' nucleotide sequence. The construction and production of hammerhead ribozymes is known in the art. Hammerhead ribozyme sequences can be embedded in a stable RNA such as a transfer RNA (tRNA) to increase cleavage efficiency in vivo.
In one embodiment, the sequence-specific polynucleotide that can interfere with the translation of RNA transcript(s) is interfering RNA. RNA interference or RNA silencing is an evolutionarily conserved process by which specific mRNAs can be targeted for enzymatic degradation. A double-stranded RNA is introduced or produced by a cell (for example, double-stranded RNA virus, or interfering RNA polynucleotides) to initiate the interfering RNA pathway. The double-stranded RNA can be converted into multiple small interfering RNA duplexes of 21-24 bp length by RNases III, which are double-stranded RNA-specific endonucleases. The small interfering RNAs can be subsequently recognized by RNA-induced silencing complexes that promote the unwinding of small interfering RNA through an ATP-dependent process. The unwound antisense strand of the small interfering RNA guides the activated RNA-induced silencing complexes to the targeted mRNA comprising a sequence complementary to the small interfering RNA anti-sense strand. The targeted mRNA and the anti-sense strand can form an A-form helix, and the major groove of the A-form helix can be recognized by the activated RNA-induced silencing complexes. The target mRNA can be cleaved by activated RNA-induced silencing complexes at a single site defined by the binding site of the 5'-end of the small interfering RNA strand. The activated RNA-induced silencing complexes can be recycled to catalyze another cleavage event. Interfering RNA expression vectors may comprise interfering RNA constructs encoding interfering RNA polynucleotides that exhibit RNA interference activity by reducing the expression level of mRNAs, pre-mRNAs, or related RNA variants. The expression vectors may comprise a promoter positioned upstream and operably-linked to an interfering RNA construct, as further described herein.
Interfering RNA expression vectors may comprise a suitable minimal core promoter, an interfering RNA construct of interest, an upstream (5') regulatory region, a downstream (3') regulatory region, including transcription termination and polyadenylation signals, and other sequences known to persons skilled in the art, such as various selection markers. Various embodiments are directed to methods for modulating the expression level of one or more of the polynucleotide(s) described herein (or any combination thereof as described herein) by integrating multiple copies of the polynucleotide(s) into a (tobacco) plant genome, comprising: transforming a plant cell host with an expression vector that comprises a promoter operably-linked to a polynucleotide.
Various compositions and methods are provided for modulating the endogenous gene expression level by modulating the translation of mRNA. A host (tobacco) plant cell can be transformed with an expression vector comprising: a promoter operably-linked to a polynucleotide, positioned in anti-sense orientation with respect to the promoter to enable the expression of RNA polynucleotides having a sequence complementary to a portion of mRNA.
Various expression vectors for modulating the translation of mRNA may comprise: a promoter operably-linked to a polynucleotide in which the sequence is positioned in anti-sense orientation with respect to the promoter. The lengths of anti-sense RNA polynucleotides can vary, and may be from about 15-20 nucleotides, about 20-30 nucleotides, about 30-50 nucleotides, about 50-75 nucleotides, about 75-100 nucleotides, about 100-150 nucleotides, about 150-200 nucleotides, and about 200-300 nucleotides. As discussed herein, the expression of one or more polypeptides can be modulated by non-transgenic means - such as creating one or more mutations in one or more genes, as discussed herein. Methods that introduce a mutation randomly in a gene sequence can include chemical mutagenesis, EMS mutagenesis and radiation mutagenesis. Methods that introduce one or more targeted mutations into a cell include but are not limited to genome editing technology, particularly zinc finger nuclease-mediated mutagenesis and targeting induced local lesions in genomes (TILLING), homologous recombination, oligonucleotide-directed mutagenesis, and meganuclease-mediated mutagenesis. In one embodiment, TILLING is used. This is a mutagenesis technology that can be used to generate and/or identify polynucleotides encoding polypeptides with modified expression and/or activity. TILLING also allows selection of plants carrying such mutants. TILLING combines high-density mutagenesis with high-throughput screening methods. Methods for TILLING are well known in the art (see McCallum et al., (2000) Nat Biotechnol 18: 455-457 and Stemple (2004) Nat Rev Genet 5(2): 145-50).
Specific mutations in polynucleotides can be created that can result in modulated gene expression, modulated stability of mRNA, or modulated stability of protein. Such plants are referred to herein as "non-naturally occurring" or "mutant" plants. Typically, the mutant or non-naturally occurring plants will include at least a portion of foreign or synthetic or man-made nucleic acid (for example, DNA or RNA) that was not present in the plant before it was manipulated. The foreign nucleic acid may be a single nucleotide, two or more nucleotides, two or more contiguous nucleotides or two or more non-contiguous nucleotides - such as at least 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400 or 1500 or more contiguous or non-contiguous nucleotides.
The mutant or non-naturally occurring plants or plant cells can have any combination of one or more mutations in one or more genes which results in modulated protein levels. For example, the mutant or non-naturally occurring plants or plant cells may have a single mutation in a single gene; multiple mutations in a single gene; a single mutation in two or more or three or more or four or more genes; or multiple mutations in two or more or three or more or four or more genes. Examples of such mutations are described herein. By way of further example, the mutant or non-naturally occurring plants or plant cells may have one or more mutations in a specific portion of the gene(s) - such as in a region of the gene that encodes an active site of the protein or a portion thereof. By way of further example, the mutant or non-naturally occurring plants or plant cells may have one or more mutations in a region outside of one or more gene(s) - such as in a region upstream or downstream of the gene(s) that the region regulates, provided that the mutation(s) modulate the activity or expression of the gene(s). Upstream elements can include promoters, enhancers or transcription factors. Some elements - such as enhancers - can be positioned upstream or downstream of the gene they regulate. The element(s) need not be located near the gene that they regulate, since some elements have been found located several hundred thousand base pairs upstream or downstream of the gene that they regulate.
The mutant or non-naturally occurring plants or plant cells may have one or more mutations located within the first 100 nucleotides of the gene(s), within the first 200 nucleotides of the gene(s), within the first 300 nucleotides of the gene(s), within the first 400 nucleotides of the gene(s), within the first 500 nucleotides of the gene(s), within the first 600 nucleotides of the gene(s), within the first 700 nucleotides of the gene(s), within the first 800 nucleotides of the gene(s), within the first 900 nucleotides of the gene(s), within the first 1000 nucleotides of the gene(s), within the first 1100 nucleotides of the gene(s), within the first 1200 nucleotides of the gene(s), within the first 1300 nucleotides of the gene(s), within the first 1400 nucleotides of the gene(s) or within the first 1500 nucleotides of the gene(s). The mutant or non-naturally occurring plants or plant cells may have one or more mutations located within the first, second, third, fourth, fifth, sixth, seventh, eighth, ninth, tenth, eleventh, twelfth, thirteenth, fourteenth or fifteenth set of 100 nucleotides of the gene(s) or combinations thereof. Mutant or non-naturally occurring plants or plant cells (for example, mutant, non-naturally occurring or transgenic plants or plant cells and the like, as described herein) comprising the mutant polypeptide variants are disclosed.
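By way of illustration only, the positional language above (the "first N nucleotides" and the "Nth set of 100 nucleotides") can be sketched as a small calculation. The function names and the 1-based coordinate convention are assumptions made for this sketch and are not taken from the description.

```python
def window_index(position: int, window: int = 100) -> int:
    """Return the 1-based index of the window of `window` nucleotides
    that contains a 1-based nucleotide position (assumed convention)."""
    if position < 1:
        raise ValueError("position must be 1-based and positive")
    return (position - 1) // window + 1


def within_first(position: int, limit: int) -> bool:
    """True if a 1-based position lies within the first `limit` nucleotides."""
    return 1 <= position <= limit


# A mutation at nucleotide 250 lies in the third set of 100 nucleotides
# and within the first 300 nucleotides of the gene.
print(window_index(250), within_first(250, 300))
```

Under this convention, positions 1 to 100 fall in the first set, positions 101 to 200 in the second, and position 1500 in the fifteenth.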
In one embodiment, seeds from plants are mutagenised and then grown into first generation mutant plants. The first generation plants are then allowed to self-pollinate and seeds from the first generation plant are grown into second generation plants, which are then screened for mutations in their loci. Though the mutagenized plant material can be screened for mutations, an advantage of screening the second generation plants is that all somatic mutations correspond to germline mutations. One of skill in the art would understand that a variety of plant materials, including but not limited to, seeds, pollen, plant tissue or plant cells, may be mutagenised in order to create the mutant plants. However, the type of plant material mutagenised may affect when the plant nucleic acid is screened for mutations. For example, when pollen is subjected to mutagenesis prior to pollination of a non-mutagenized plant, the seeds resulting from that pollination are grown into first generation plants. Every cell of the first generation plants will contain mutations created in the pollen; thus these first generation plants may then be screened for mutations instead of waiting until the second generation.
Mutagens that create primarily point mutations and short deletions, insertions, transversions, and/or transitions, including chemical mutagens or radiation, may be used to create the mutations. Mutagens include, but are not limited to, ethyl methanesulfonate, methyl methanesulfonate, N-ethyl-N-nitrosourea, triethylmelamine, N-methyl-N-nitrosourea, procarbazine, chlorambucil, cyclophosphamide, diethyl sulfate, acrylamide monomer, melphalan, nitrogen mustard, vincristine, dimethylnitrosamine, N-methyl-N'-nitro-N-nitrosoguanidine, nitrosoguanidine, 2-aminopurine, 7,12-dimethylbenz(a)anthracene, ethylene oxide, hexamethylphosphoramide, busulfan, diepoxyalkanes (diepoxyoctane, diepoxybutane, and the like), 2-methoxy-6-chloro-9[3-(ethyl-2-chloro-ethyl)aminopropylamino]acridine dihydrochloride and formaldehyde.
Spontaneous mutations in the locus that may not have been directly caused by the mutagen are also contemplated provided that they result in the desired phenotype. Suitable mutagenic agents can also include, for example, ionising radiation - such as X-rays, gamma rays, fast neutron irradiation and UV radiation. Any method of plant nucleic acid preparation known to those of skill in the art may be used to prepare the plant nucleic acid for mutation screening.
Prepared nucleic acid from individual plants, plant cells, or plant material can optionally be pooled in order to expedite screening for mutations in the population of plants originating from the mutagenized plant tissue, cells or material. One or more subsequent generations of plants, plant cells or plant material can be screened. The size of the optionally pooled group is dependent upon the sensitivity of the screening method used.
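The dependence of pool size on screening sensitivity can be sketched as follows. This is a minimal illustration, assuming that a heterozygous mutation contributes one of two amplified alleles per plant and that the screening method can detect one mutant allele among a stated number of alleles; the function names, parameters and sample identifiers are hypothetical.

```python
from typing import List, Sequence


def max_pool_size(detectable_one_in: int, alleles_per_plant: int = 2) -> int:
    """Largest number of plants per pool such that a single heterozygous
    mutant still yields at least 1 mutant allele in `detectable_one_in`."""
    if detectable_one_in < alleles_per_plant:
        raise ValueError("method cannot detect a heterozygote even unpooled")
    return detectable_one_in // alleles_per_plant


def make_pools(sample_ids: Sequence[str], pool_size: int) -> List[List[str]]:
    """Split sample identifiers into consecutive pools of at most `pool_size`."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return [list(sample_ids[i:i + pool_size])
            for i in range(0, len(sample_ids), pool_size)]


# A method assumed to detect 1 mutant allele in 16 allows pools of 8 plants.
size = max_pool_size(16)
pools = make_pools([f"plant_{n}" for n in range(1, 21)], size)
print(size, [len(p) for p in pools])
```

A pool that tests positive would then be deconvoluted by screening its member plants individually.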
After the nucleic acid samples are optionally pooled, they can be subjected to polynucleotide-specific amplification techniques, such as Polymerase Chain Reaction. Any one or more primers or probes specific to the gene or the sequences immediately adjacent to the gene may be utilized to amplify the sequences within the optionally pooled nucleic acid sample. Suitably, the one or more primers or probes are designed to amplify the regions of the locus where useful mutations are most likely to arise. Most preferably, the primer is designed to detect mutations within regions of the polynucleotide. Additionally, it is preferable for the primer(s) and probe(s) to avoid known polymorphic sites in order to ease screening for point mutations. To facilitate detection of amplification products, the one or more primers or probes may be labelled using any conventional labelling method. Primer(s) or probe(s) can be designed based upon the sequences described herein using methods that are well understood in the art.
Polymorphisms may be identified by means known in the art and some have been described in the literature.
In a further aspect there is provided a method of preparing a mutant plant. The method involves providing at least one cell of a plant comprising a gene encoding a functional polynucleotide described herein (or any combination thereof as described herein). Next, the at least one cell of the plant is treated under conditions effective to modulate the activity of the polynucleotide(s) described herein. The at least one mutant plant cell is then propagated into a mutant plant, where the mutant plant has a modulated level of polypeptide(s) described (or any combination thereof as described herein) as compared to that of a control plant. In one embodiment of this method of making a mutant plant, the treating step involves subjecting the at least one cell to a chemical mutagenising agent as described above and under conditions effective to yield at least one mutant plant cell. In another embodiment of this method, the treating step involves subjecting the at least one cell to a radiation source under conditions effective to yield at least one mutant plant cell. The term "mutant plant" includes mutant plants in which the genotype is modified as compared to a control plant, suitably by means other than genetic engineering or genetic modification.
In certain embodiments, the mutant plant, mutant plant cell or mutant plant material may comprise one or more mutations that have occurred naturally in another plant, plant cell or plant material and confer a desired trait. This mutation can be incorporated (for example, introgressed) into another plant, plant cell or plant material (for example, a plant, plant cell or plant material with a different genetic background to the plant from which the mutation was derived) to confer the trait thereto. Thus by way of example, a mutation that occurred naturally in a first plant may be introduced into a second plant - such as a second plant with a different genetic background to the first plant. The skilled person is therefore able to search for and identify a plant carrying naturally in its genome one or more mutant alleles of the genes described herein which confer a desired trait. The mutant allele(s) that occurs naturally can be transferred to the second plant by various methods including breeding, backcrossing and introgression to produce lines, varieties or hybrids that have one or more mutations in the genes described herein. Plants showing a desired trait may be screened out of a pool of mutant plants. Suitably, the selection is carried out utilising the knowledge of the nucleotide sequences as described herein. Consequently, it is possible to screen for a genetic trait as compared to a control. Such a screening approach may involve the application of conventional nucleic acid amplification and/or hybridization techniques as discussed herein. Thus, a further aspect of the present invention relates to a method for identifying a mutant plant comprising the steps of: (a) providing a sample comprising nucleic acid from a plant; and (b) determining the nucleic acid sequence of the polynucleotide, wherein a difference in the sequence of the polynucleotide as compared to the polynucleotide sequence of a control plant is indicative that said plant is a mutant plant.
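The comparison in step (b) can be illustrated with a minimal sketch. It assumes that the sample and control sequences have already been aligned and are of equal length, so that only substitutions are reported; insertions and deletions would require a proper alignment step that is not shown. The function names and the example sequences are hypothetical.

```python
from typing import List, Tuple


def find_substitutions(control: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, control base, sample base) for each mismatch
    between two pre-aligned sequences of equal length."""
    if len(control) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(i + 1, c, s)
            for i, (c, s) in enumerate(zip(control.upper(), sample.upper()))
            if c != s]


def is_mutant(control: str, sample: str) -> bool:
    """A sample is flagged as mutant if any difference from the control is found."""
    return bool(find_substitutions(control, sample))


# Hypothetical sequences: a single G-to-A change at position 4.
print(find_substitutions("ATGGCC", "ATGACC"))
```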
In another aspect there is provided a method for identifying a mutant plant which accumulates increased or reduced levels of protease as compared to a control plant comprising the steps of: (a) providing a sample from a plant to be screened; (b) determining if said sample comprises one or more mutations in one or more of the polynucleotides described herein; and (c) determining at least the protease content of said plant during or after a curing procedure.
In another aspect there is provided a method for preparing a mutant plant which has increased or reduced levels of protease as compared to a control plant comprising the steps of: (a) providing a sample from a first plant; (b) determining if said sample comprises one or more mutations in one or more of the polynucleotides described herein that result in modulated levels of a protease; and (c) transferring the one or more mutations into a second plant. Suitably at least the protease content is determined in cured leaf material. The mutation(s) can be transferred into the second plant using various methods that are known in the art - such as by genetic engineering, genetic manipulation, introgression, plant breeding, backcrossing and the like. In one embodiment, the first plant is a naturally occurring plant. In one embodiment, the second plant has a different genetic background to the first plant. In another aspect there is provided a method for preparing a mutant plant which has increased or reduced levels of a protease as compared to a control plant comprising the steps of: (a) providing a sample from a first plant; (b) determining if said sample comprises one or more mutations in one or more of the polynucleotides described herein that results in modulated levels of the protease; and (c) introgressing the one or more mutations from the first plant into a second plant. Suitably at least the protease content is determined in cured leaf material. In one embodiment, the step of introgressing comprises plant breeding, optionally including backcrossing and the like. In one embodiment, the first plant is a naturally occurring plant. In one embodiment, the second plant has a different genetic background to the first plant. In one embodiment, the first plant is not a cultivar or an elite cultivar. In one embodiment, the second plant is a cultivar or an elite cultivar.
A further aspect relates to a mutant plant (including a cultivar or elite cultivar mutant plant) obtained or obtainable by the methods described herein. In certain embodiments, the "mutant plants" may have one or more mutations localised only to a specific region of the plant - such as within the sequence of the one or more polynucleotide(s) described herein. According to this embodiment, the remaining genomic sequence of the mutant plant will be the same or substantially the same as the plant prior to the mutagenesis.
In certain embodiments, the mutant plants may have one or more mutations localised in more than one region of the plant - such as within the sequence of one or more of the polynucleotides described herein and in one or more further regions of the genome. According to this embodiment, the remaining genomic sequence of the mutant plant will not be the same or will not be substantially the same as the plant prior to the mutagenesis. In certain embodiments, the mutant plants may not have one or more mutations in one or more, two or more, three or more, four or more or five or more exons of the polynucleotide(s) described herein; or may not have one or more mutations in one or more, two or more, three or more, four or more or five or more introns of the polynucleotide(s) described herein; or may not have one or more mutations in a promoter of the polynucleotide(s) described herein; or may not have one or more mutations in the 3' untranslated region of the polynucleotide(s) described herein; or may not have one or more mutations in the 5' untranslated region of the polynucleotide(s) described herein; or may not have one or more mutations in the coding region of the polynucleotide(s) described herein; or may not have one or more mutations in the non-coding region of the polynucleotide(s) described herein; or any combination of two or more, three or more, four or more, five or more, or six or more parts thereof.
In a further aspect there is provided a method of identifying a plant, a plant cell or plant material comprising a mutation in a gene encoding a polynucleotide described herein comprising: (a) subjecting a plant, a plant cell or plant material to mutagenesis; (b) obtaining a nucleic acid sample from said plant, plant cell or plant material or descendants thereof; and (c) determining the nucleic acid sequence of the gene encoding a polynucleotide described herein or a variant or a fragment thereof, wherein a difference in said sequence is indicative of one or more mutations therein.
Zinc finger proteins can be used to modulate the expression or the activity of one or more of the polynucleotides described herein. In various embodiments, a genomic DNA sequence comprising a part of or all of the coding sequence of the polynucleotide is modified by zinc finger nuclease-mediated mutagenesis. The genomic DNA sequence is searched for a unique site for zinc finger protein binding. Alternatively, the genomic DNA sequence is searched for two unique sites for zinc finger protein binding wherein both sites are on opposite strands and close together, for example, 1, 2, 3, 4, 5, 6 or more basepairs apart. Accordingly, zinc finger proteins that bind to polynucleotides are provided.
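The search for two unique binding sites on opposite strands can be sketched as below. This is an illustration under stated assumptions only: the recognition sequences are hypothetical, uniqueness is checked only within the supplied sequence rather than across a genome, and the default spacing of 1 to 6 base pairs mirrors the example spacing given above.

```python
from typing import List, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(seq: str, motif: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def find_site_pair(genomic: str, left_site: str, right_site: str,
                   min_gap: int = 1, max_gap: int = 6
                   ) -> Optional[Tuple[int, int, int]]:
    """Return (left start, right start, gap) if `left_site` occurs once on the
    given strand, `right_site` occurs once on the opposite strand, and the
    two sites are separated by min_gap..max_gap base pairs; otherwise None."""
    genomic = genomic.upper()
    left_hits = find_all(genomic, left_site.upper())
    # A site on the opposite strand appears as its reverse complement here.
    right_hits = find_all(genomic, reverse_complement(right_site.upper()))
    if len(left_hits) != 1 or len(right_hits) != 1:
        return None  # not unique within the supplied sequence
    gap = right_hits[0] - (left_hits[0] + len(left_site))
    if min_gap <= gap <= max_gap:
        return left_hits[0], right_hits[0], gap
    return None


# Hypothetical 9 bp sites separated by a 5 bp spacer.
example = "AAAA" + "GAGGCGGTA" + "CCCCC" + "TGCATCAGC" + "AAAA"
print(find_site_pair(example, "GAGGCGGTA", "GCTGATGCA"))
```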
A zinc finger protein may be engineered to recognize a selected target site in a gene. A zinc finger protein can comprise any combination of motifs derived from natural zinc finger DNA-binding domains and non-natural zinc finger DNA-binding domains by truncation or expansion or a process of site-directed mutagenesis coupled to a selection method such as, but not limited to, phage display selection, bacterial two-hybrid selection or bacterial one-hybrid selection. The term "non-natural zinc finger DNA-binding domain" refers to a zinc finger DNA-binding domain that binds a three-base pair sequence within the target nucleic acid and that does not occur in the cell or organism comprising the nucleic acid which is to be modified. Methods for the design of zinc finger protein which binds specific nucleotide sequences which are unique to a target gene are known in the art.
In other embodiments, a zinc finger protein may be selected to bind to a regulatory sequence of a polynucleotide. More specifically, the regulatory sequence may comprise a transcription initiation site, a start codon, a region of an exon, a boundary of an exon-intron, a terminator, or a stop codon. Accordingly, the invention provides a mutant, non-naturally occurring or transgenic plant or plant cells, produced by zinc finger nuclease-mediated mutagenesis in the vicinity of or within one or more polynucleotides described herein, and methods for making such a plant or plant cell by zinc finger nuclease-mediated mutagenesis. Methods for delivering zinc finger protein and zinc finger nuclease to a tobacco plant are similar to those described below for delivery of meganuclease.
Plants suitable for use in genetic modification include, but are not limited to, monocotyledonous and dicotyledonous plants and plant cell systems, including species from one of the following families: Acanthaceae, Alliaceae, Alstroemeriaceae, Amaryllidaceae, Apocynaceae, Arecaceae, Asteraceae, Berberidaceae, Bixaceae, Brassicaceae, Bromeliaceae, Cannabaceae, Caryophyllaceae, Cephalotaxaceae, Chenopodiaceae, Colchicaceae, Cucurbitaceae, Dioscoreaceae, Ephedraceae, Erythroxylaceae, Euphorbiaceae, Fabaceae, Lamiaceae, Linaceae, Lycopodiaceae, Malvaceae, Melanthiaceae, Musaceae, Myrtaceae, Nyssaceae, Papaveraceae, Pinaceae, Plantaginaceae, Poaceae, Rosaceae, Rubiaceae, Salicaceae, Sapindaceae, Solanaceae, Taxaceae, Theaceae, or Vitaceae.
Suitable species may include members of the genera Abelmoschus, Abies, Acer, Agrostis, Allium, Alstroemeria, Ananas, Andrographis, Andropogon, Artemisia, Arundo, Atropa, Berberis, Beta, Bixa, Brassica, Calendula, Camellia, Camptotheca, Cannabis, Capsicum, Carthamus, Catharanthus, Cephalotaxus, Chrysanthemum, Cinchona, Citrullus, Coffea, Colchicum, Coleus, Cucumis, Cucurbita, Cynodon, Datura, Dianthus, Digitalis, Dioscorea, Elaeis, Ephedra, Erianthus, Erythroxylum, Eucalyptus, Festuca, Fragaria, Galanthus, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Hyoscyamus, Jatropha, Lactuca, Linum, Lolium, Lupinus, Lycopersicon, Lycopodium, Manihot, Medicago, Mentha, Miscanthus, Musa, Nicotiana, Oryza, Panicum, Papaver, Parthenium, Pennisetum, Petunia, Phalaris, Phleum, Pinus, Poa, Poinsettia, Populus, Rauwolfia, Ricinus, Rosa, Saccharum, Salix, Sanguinaria, Scopolia, Secale, Solanum, Sorghum, Spartina, Spinacea, Tanacetum, Taxus, Theobroma, Triticosecale, Triticum, Uniola, Veratrum, Vinca, Vitis, and Zea.
Suitable species may include Panicum spp., Sorghum spp., Miscanthus spp., Saccharum spp., Erianthus spp., Populus spp., Andropogon gerardii (big bluestem), Pennisetum purpureum (elephant grass), Phalaris arundinacea (reed canarygrass), Cynodon dactylon (bermudagrass), Festuca arundinacea (tall fescue), Spartina pectinata (prairie cord-grass), Medicago sativa (alfalfa), Arundo donax (giant reed), Secale cereale (rye), Salix spp. (willow), Eucalyptus spp. (eucalyptus), Triticosecale (triticale, wheat x rye), bamboo, Helianthus annuus (sunflower), Carthamus tinctorius (safflower), Jatropha curcas (jatropha), Ricinus communis (castor), Elaeis guineensis (palm), Linum usitatissimum (flax), Brassica juncea, Beta vulgaris (sugarbeet), Manihot esculenta (cassava), Lycopersicon esculentum (tomato), Lactuca sativa (lettuce), Musa paradisiaca (banana), Solanum tuberosum (potato), Brassica oleracea (broccoli, cauliflower, Brussels sprouts), Camellia sinensis (tea), Fragaria ananassa (strawberry), Theobroma cacao (cocoa), Coffea arabica (coffee), Vitis vinifera (grape), Ananas comosus (pineapple), Capsicum annuum (hot & sweet pepper), Allium cepa (onion), Cucumis melo (melon), Cucumis sativus (cucumber), Cucurbita maxima (squash), Cucurbita moschata (squash), Spinacea oleracea (spinach), Citrullus lanatus (watermelon), Abelmoschus esculentus (okra), Solanum melongena (eggplant), Rosa spp. (rose), Dianthus caryophyllus (carnation), Petunia spp. (petunia), Poinsettia pulcherrima (poinsettia), Lupinus albus (lupin), Uniola paniculata (oats), bentgrass (Agrostis spp.), Populus tremuloides (aspen), Pinus spp. (pine), Abies spp. (fir), Acer spp. (maple), Hordeum vulgare (barley), Poa pratensis (bluegrass), Lolium spp. (ryegrass) and Phleum pratense (timothy), Panicum virgatum (switchgrass), Sorghum bicolor (sorghum, sudangrass), Miscanthus giganteus (miscanthus), Saccharum sp. (energycane), Populus balsamifera (poplar), Zea mays (corn), Glycine max (soybean), Brassica napus (canola), Triticum aestivum (wheat), Gossypium hirsutum (cotton), Oryza sativa (rice), Helianthus annuus (sunflower), Medicago sativa (alfalfa), Beta vulgaris (sugarbeet), or Pennisetum glaucum (pearl millet).
Various embodiments are directed to mutant tobacco, non-naturally occurring tobacco or transgenic tobacco plants or plant cells modified to modulate gene expression levels thereby producing a plant or plant cell - such as a tobacco plant or plant cell - in which the expression level of a polypeptide is modulated within tissues of interest as compared to a control. The disclosed compositions and methods can be applied to any species of the genus Nicotiana, including N. rustica and N. tabacum (for example, LA B21, LN KY171, TI 1406, Basma, Galpao, Perique, Beinhart 1000-1, and Petico). Other species include N. acaulis, N. acuminata, N. africana, N. alata, N. ameghinoi, N. amplexicaulis, N. arentsii, N. attenuata, N. azambujae, N. benavidesii, N. benthamiana, N. bigelovii, N. bonariensis, N. cavicola, N. clevelandii, N. cordifolia, N. corymbosa, N. debneyi, N. excelsior, N. forgetiana, N. fragrans, N. glauca, N. glutinosa, N. goodspeedii, N. gossei, N. hybrid, N. ingulba, N. kawakamii, N. knightiana, N. langsdorffii, N. linearis, N. longiflora, N. maritima, N. megalosiphon, N. miersii, N. noctiflora, N. nudicaulis, N. obtusifolia, N. occidentalis, N. occidentalis subsp. hesperis, N. otophora, N. paniculata, N. pauciflora, N. petunioides, N. plumbaginifolia, N. quadrivalvis, N. raimondii, N. repanda, N. rosulata, N. rosulata subsp. ingulba, N. rotundifolia, N. setchellii, N. simulans, N. solanifolia, N. spegazzinii, N. stocktonii, N. suaveolens, N. sylvestris, N. thyrsiflora, N. tomentosa, N. tomentosiformis, N. trigonophylla, N. umbratica, N. undulata, N. velutina, N. wigandioides, and N. x sanderae. The use of tobacco cultivars and elite tobacco cultivars is also contemplated herein. The transgenic, non-naturally occurring or mutant plant may therefore be a tobacco variety or elite tobacco cultivar that comprises one or more transgenes, or one or more genetic mutations or a combination thereof.
The genetic mutation(s) (for example, one or more polymorphisms) can be mutations that do not exist naturally in the individual tobacco variety or tobacco cultivar (for example, elite tobacco cultivar) or can be genetic mutation(s) that do occur naturally provided that the mutation does not occur naturally in the individual tobacco variety or tobacco cultivar (for example, elite tobacco cultivar).
Particularly useful Nicotiana tabacum varieties include Burley type, dark type, flue-cured type, and Oriental type tobaccos. Non-limiting examples of varieties or cultivars are: BD 64, CC 101, CC 200, CC 27, CC 301, CC 400, CC 500, CC 600, CC 700, CC 800, CC 900, Coker 176, Coker 319, Coker 371 Gold, Coker 48, CD 263, DF911, DT 538 LC Galpao tobacco, GL 26H, GL 350, GL 600, GL 737, GL 939, GL 973, HB 04P, HB 04P LC, HB3307PLC, Hybrid 403LC, Hybrid 404LC, Hybrid 501 LC, K 149, K 326, K 346, K 358, K394, K 399, K 730, KDH 959, KT 200, KT204LC, KY10, KY14, KY 160, KY 17, KY 171, KY 907, KY907LC, KY14xL8 LC, Little Crittenden, McNair 373, McNair 944, msKY 14xL8, Narrow Leaf Madole, Narrow Leaf Madole LC, NBH 98, N-126, N-777LC, N-7371 LC, NC 100, NC 102, NC 2000, NC 291, NC 297, NC 299, NC 3, NC 4, NC 5, NC 6, NC7, NC 606, NC 71, NC 72, NC 810, NC BH 129, NC 2002, Neal Smith Madole, OXFORD 207, PD 7302 LC, PD 7309 LC, PD 7312 LC, 'Perique' tobacco, PVH03, PVH09, PVH19, PVH50, PVH51, R 610, R 630, R 7-11, R 7-12, RG 17, RG 81, RG H51, RGH 4, RGH 51, RS 1410, Speight 168, Speight 172, Speight 179, Speight 210, Speight 220, Speight 225, Speight 227, Speight 234, Speight G-28, Speight G-70, Speight H-6, Speight H20, Speight NF3, TI 1406, TI 1269, TN 86, TN86LC, TN 90, TN 97, TN97LC, TN D94, TN D950, TR (Tom Rosson) Madole, VA 309, VA359, AA 37-1, B13P, Xanthi (Mitchell-Mor), Bel-W3, 79-615, Samsun Holmes NN, KTRDC number 2 Hybrid 49, Burley 21, KY8959, KY9, MD 609, PG01, PG04, P01, P02, P03, RG11, RG 8, VA509, AS44, Banket A1, Basma Drama B84/31, Basma I Zichna ZP4/B, Basma Xanthi BX 2A, Batek, Besuki Jember, C104, Coker 347, Criollo Misionero, Delcrest, Djebel 81, DVH 405, Galpao Comum, HB04P, Hicks Broadleaf, Kabakulak Elassona, Kutsage E1, LA BU 21, NC 2326, NC 297, PVH 2110, Red Russian, Samsun, Saplak, Simmaba, Talgar 28, Wislica, Yayaldag, Prilep HC-72, Prilep P23, Prilep PB 156/1, Prilep P12-2/1, Yaka JK-48, Yaka JB 125/3, TI-1068, KDH-960, TI-1070, TW136, Basma, TKF 4028, L8, TKF 2002, GR141, Basma xanthi, GR149, GR153, Petit Havana. Low converter subvarieties of the above, even if not specifically identified herein, are also contemplated.
Embodiments are also directed to compositions and methods for producing mutant plants, non-naturally occurring plants, hybrid plants, or transgenic plants that have been modified to modulate the expression or activity of a polynucleotide(s) described herein (or any combination thereof as described herein). Advantageously, the mutant plants, non-naturally occurring plants, hybrid plants, or transgenic plants that are obtained may be similar or substantially the same in overall appearance to control plants. Various phenotypic characteristics such as degree of maturity, number of leaves per plant, stalk height, leaf insertion angle, leaf size (width and length), internode distance, and lamina-midrib ratio can be assessed by field observations.
One aspect relates to a seed of a mutant plant, a non-naturally occurring plant, a hybrid plant or a transgenic plant described herein. Preferably, the seed is a tobacco seed. A further aspect relates to pollen or an ovule of a mutant plant, a non-naturally occurring plant, a hybrid plant or a transgenic plant that is described herein. In addition, there is provided a mutant plant, a non-naturally occurring plant, a hybrid plant or a transgenic plant as described herein which further comprises a nucleic acid conferring male sterility.
Also provided is a tissue culture of regenerable cells of the mutant plant, non-naturally occurring plant, hybrid plant, or transgenic plant or a part thereof as described herein, which culture regenerates plants capable of expressing all the morphological and physiological characteristics of the parent. The regenerable cells include but are not limited to cells from leaves, pollen, embryos, cotyledons, hypocotyls, roots, root tips, anthers, flowers and a part thereof, ovules, shoots, stems, stalks, pith and capsules or callus or protoplasts derived therefrom.
A still further aspect relates to a cured plant material - such as cured leaf or cured tobacco - derived or derivable from a mutant, non-naturally occurring or transgenic plant or cell, wherein expression of one or more of the polynucleotides described herein or the activity of the protein encoded thereby is modulated. Suitably the visual appearance of said plant (for example, leaf) is substantially the same as the control plant. Suitably, the plant is a tobacco plant.
Embodiments are also directed to compositions and methods for producing mutant, non-naturally occurring or transgenic plants or plant cells that have been modified to modulate the expression or activity of the one or more of the polynucleotides or polypeptides described herein which can result in plants or plant components (for example, leaves - such as green leaves or cured leaves - or tobacco) or plant cells with modulated levels of proteases.
In another aspect, there is provided a method for modulating (e.g. increasing) the amount of protease in at least a part of a plant (for example, the leaves - such as cured leaves - or in tobacco), comprising the steps of: (i) modulating (e.g. increasing) the expression or activity of one or more of the polypeptides described herein (or any combination thereof as described herein), suitably, wherein the polypeptide(s) is encoded by the corresponding polynucleotide sequence described herein; (ii) measuring the protease content in at least a part (for example, the leaves - such as cured leaves - or tobacco or in smoke) of the mutant, non-naturally occurring or transgenic plant obtained in step (i); and (iii) identifying a mutant, non-naturally occurring or transgenic plant in which the protease content therein has been modulated (e.g. increased) in comparison to a control plant. Suitably, the visual appearance of said mutant, non-naturally occurring or transgenic plant is substantially the same as the control plant. Suitably, the plant is a tobacco plant.
In another aspect, there is provided a method for modulating (e.g. increasing) the amount of protease in at least a part of cured plant material - such as cured leaf - comprising the steps of: (i) modulating (e.g. increasing) the expression or activity of one or more of the polypeptides (or any combination thereof as described herein), suitably, wherein the polypeptide(s) is encoded by the corresponding polynucleotide sequence described herein; (ii) harvesting plant material - such as one or more of the leaves - and curing for a period of time; (iii) measuring the protease content in at least a part of the cured plant material obtained in step (ii) or during step (ii); and (iv) identifying cured plant material in which the protease content therein has been modulated (e.g. increased) in comparison to a control plant.
An increase in expression as compared to the control may be from about 5 % to about 100 %, or an increase of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 % or more - such as 200%, 300%, 500%, 1000% or more, which includes an increase in transcriptional activity or polynucleotide expression or polypeptide expression or a combination thereof. An increase in activity as compared to a control may be from about 5 % to about 100 %, or an increase of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 % or more - such as 200%, 300%, 500%, 1000% or more.
A reduction in expression as compared to a control may be from about 5 % to about 100 %, or a reduction of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 %, which includes a reduction in transcriptional activity or polynucleotide expression or polypeptide expression or a combination thereof.
A reduction in activity as compared to a control may be from about 5 % to about 100 %, or a reduction of at least 10 %, at least 20 %, at least 25 %, at least 30 %, at least 40 %, at least 50 %, at least 60 %, at least 70 %, at least 75 %, at least 80 %, at least 90 %, at least 95 %, at least 98 %, or 100 %.
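The percentage increases and reductions recited above are relative to a control. A minimal sketch of that calculation follows; the function names are hypothetical, and the measured levels are assumed to be in the same arbitrary units for sample and control.

```python
def percent_change(sample_level: float, control_level: float) -> float:
    """Signed change of the sample relative to the control, in percent.
    Positive values are increases, negative values are reductions."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (sample_level - control_level) / control_level * 100.0


def meets_threshold(sample_level: float, control_level: float,
                    threshold_percent: float, increase: bool = True) -> bool:
    """True if the increase (or reduction) is at least `threshold_percent`."""
    change = percent_change(sample_level, control_level)
    return change >= threshold_percent if increase else -change >= threshold_percent


# A sample at 150 units against a control at 100 units is a 50 % increase;
# a sample at 25 units is a 75 % reduction.
print(percent_change(150.0, 100.0), percent_change(25.0, 100.0))
```

On this definition a reduction cannot exceed 100 %, whereas an increase can (for example, a threefold level corresponds to a 200 % increase), which is consistent with the ranges recited above.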
Polynucleotides and recombinant constructs described herein can be used to modulate the expression of the proteases described herein in a plant species of interest, suitably tobacco. A number of polynucleotide based methods can be used to increase gene expression in plants and plant cells. By way of example, a construct, vector or expression vector that is compatible with the plant to be transformed can be prepared which comprises the gene of interest together with an upstream promoter that is capable of overexpressing the gene in the plant or plant cell. Exemplary promoters are described herein. Following transformation and when grown under suitable conditions, the promoter can drive expression in order to modulate (for example, reduce) the levels of this enzyme in the plant, or in a specific tissue thereof. In one exemplary embodiment, a vector carrying one or more polynucleotides described herein (or any combination thereof as described herein) is generated to overexpress the gene in a plant or plant cell. The vector carries a suitable promoter - such as the cauliflower mosaic virus CaMV 35S promoter - upstream of the transgene driving its constitutive expression in all tissues of the plant. The vector also carries an antibiotic resistance gene in order to confer selection of the transformed calli and cell lines.
In a preferred embodiment, a promoter and regulatory sequences are derived from one or more of SEQ ID NOs: 1-80. These regulatory sequences can be used in conjunction with cognate or non-cognate expression sequences to increase expression of said sequences in a tobacco plant during the curing procedure.
The expression of sequences from promoters can be enhanced by including expression control sequences, including enhancers, chromatin activating elements, transcription factor responsive elements and the like. Such control sequences may be constitutive, and upregulate transcription in a universal manner; or they may be facultative, and upregulate transcription in response to specific signals. Signals associated with senescence and signals which are active during the curing procedure are specifically indicated.
Various embodiments are therefore directed to methods for modulating (for example, increasing) the expression level of one or more polynucleotides described herein (or any combination thereof as described herein) by integrating multiple copies of the polynucleotide into a plant genome, comprising: transforming a plant cell host with an expression vector that comprises a promoter operably-linked to one or more polynucleotides described herein. The polypeptide encoded by a recombinant polynucleotide can be a native polypeptide, or can be heterologous to the cell.
A tobacco plant carrying a mutant allele of one or more polynucleotides described herein (or any combination thereof as described herein) can be used in a plant breeding program to create useful lines, varieties and hybrids. In particular, the mutant allele is introgressed into the commercially important varieties described above. Thus, methods for breeding plants are provided, that comprise crossing a mutant plant, a non-naturally occurring plant or a transgenic plant as described herein with a plant comprising a different genetic identity. The method may further comprise crossing the progeny plant with another plant, and optionally repeating the crossing until a progeny with the desirable genetic traits or genetic background is obtained. One purpose served by such breeding methods is to introduce a desirable genetic trait into other varieties, breeding lines, hybrids or cultivars, particularly those that are of commercial interest. Another purpose is to facilitate stacking of genetic modifications of different genes in a single plant variety, lines, hybrids or cultivars. Intraspecific as well as interspecific matings are contemplated. The progeny plants that arise from such crosses, also referred to as breeding lines, are examples of non-naturally occurring plants of the invention.
In one embodiment, a method is provided for producing a non-naturally occurring tobacco plant comprising: (a) crossing a mutant or transgenic tobacco plant with a second tobacco plant to yield progeny tobacco seed; and (b) growing the progeny tobacco seed, under plant growth conditions, to yield the non-naturally occurring tobacco plant. The method may further comprise: (c) crossing the previous generation of non-naturally occurring tobacco plant with itself or another tobacco plant to yield progeny tobacco seed; (d) growing the progeny tobacco seed of step (c) under plant growth conditions, to yield additional non-naturally occurring tobacco plants; and (e) repeating the crossing and growing steps of (c) and (d) multiple times to generate further generations of non-naturally occurring tobacco plants. The method may optionally comprise, prior to step (a), a step of providing a parent plant which comprises a genetic identity that is characterized and that is not identical to the mutant or transgenic plant. In some embodiments, depending on the breeding program, the crossing and growing steps are repeated from 0 to 2 times, from 0 to 3 times, from 0 to 4 times, from 0 to 5 times, from 0 to 6 times, from 0 to 7 times, from 0 to 8 times, from 0 to 9 times or from 0 to 10 times, in order to generate generations of non-naturally occurring tobacco plants. Backcrossing is an example of such a method wherein a progeny is crossed with one of its parents or another plant genetically similar to its parent, in order to obtain a progeny plant in the next generation that has a genetic identity which is closer to that of one of the parents. Techniques for plant breeding, particularly tobacco plant breeding, are well known and can be used in the methods of the invention. The invention further provides non-naturally occurring tobacco plants produced by these methods. Certain embodiments exclude the step of selecting a plant.
In some embodiments of the methods described herein, lines resulting from breeding and screening for variant genes are evaluated in the field using standard field procedures. Control genotypes including the original unmutagenized parent are included and entries are arranged in the field in a randomized complete block design or other appropriate field design. For tobacco, standard agronomic practices are used, for example, the tobacco is harvested, weighed, and sampled for chemical and other common testing before and during curing. Statistical analyses of the data are performed to confirm the similarity of the selected lines to the parental line. Cytogenetic analyses of the selected plants are optionally performed to confirm the chromosome complement and chromosome pairing relationships. DNA fingerprinting, single nucleotide polymorphism, microsatellite markers, or similar technologies may be used in a marker-assisted selection (MAS) breeding program to transfer or breed mutant alleles of a gene into other tobaccos, as described herein. For example, a breeder can create segregating populations from hybridizations of a genotype containing a mutant allele with an agronomically desirable genotype. Plants in the F2 or backcross generations can be screened using a marker developed from a genomic sequence or a fragment thereof, using one of the techniques listed herein. Plants identified as possessing the mutant allele can be backcrossed or self-pollinated to create a second population to be screened. Depending on the expected inheritance pattern or the MAS technology used, it may be necessary to self-pollinate the selected plants before each cycle of backcrossing to aid identification of the desired individual plants. Backcrossing or other breeding procedure can be repeated until the desired phenotype of the recurrent parent is recovered.
In a breeding program, successful crosses yield F1 plants that are fertile. Selected F1 plants can be crossed with one of the parents, and the first backcross generation plants are self-pollinated to produce a population that is again screened for variant gene expression (for example, the null version of the gene). The process of backcrossing, self-pollination, and screening is repeated, for example, at least 4 times until the final screening produces a plant that is fertile and reasonably similar to the recurrent parent. This plant, if desired, is self-pollinated and the progeny are subsequently screened again to confirm that the plant exhibits variant gene expression. In some embodiments, a plant population in the F2 generation is screened for variant gene expression. For example, a plant that fails to express a polypeptide due to the absence of the gene is identified according to standard methods, for example, by using a PCR method with primers based upon the nucleotide sequence information for the polynucleotide(s) described herein (or any combination thereof as described herein).
Hybrid tobacco varieties can be produced by preventing self-pollination of female parent plants (that is, seed parents) of a first variety, permitting pollen from male parent plants of a second variety to fertilize the female parent plants, and allowing F1 hybrid seeds to form on the female plants. Self-pollination of female plants can be prevented by emasculating the flowers at an early stage of flower development. Alternatively, pollen formation can be prevented on the female parent plants using a form of male sterility. For example, male sterility can be produced by cytoplasmic male sterility (CMS), or transgenic male sterility wherein a transgene inhibits microsporogenesis and/or pollen formation, or self-incompatibility. Female parent plants containing CMS are particularly useful. In embodiments in which the female parent plants are CMS, pollen is harvested from male fertile plants and applied manually to the stigmas of CMS female parent plants, and the resulting F1 seed is harvested.
Varieties and lines described herein can be used to form single-cross tobacco F1 hybrids. In such embodiments, the plants of the parent varieties can be grown as substantially homogeneous adjoining populations to facilitate natural cross-pollination from the male parent plants to the female parent plants. The F1 seed formed on the female parent plants is selectively harvested by conventional means. One also can grow the two parent plant varieties in bulk and harvest a blend of F1 hybrid seed formed on the female parent and seed formed upon the male parent as the result of self-pollination. Alternatively, three-way crosses can be carried out wherein a single-cross F1 hybrid is used as a female parent and is crossed with a different male parent. As another alternative, double-cross hybrids can be created wherein the F1 progeny of two different single-crosses are themselves crossed. A population of mutant, non-naturally occurring or transgenic plants can be screened or selected for those members of the population that have a desired trait or phenotype. For example, a population of progeny of a single transformation event can be screened for those plants having a desired level of expression or activity of the polypeptide(s) encoded thereby. Physical and biochemical methods can be used to identify expression or activity levels. These include Southern analysis or PCR amplification for detection of a polynucleotide; Northern blots, S1 RNase protection, primer-extension, or RT-PCR amplification for detecting RNA transcripts; enzymatic assays for detecting enzyme or ribozyme activity of polypeptides and polynucleotides; and protein gel electrophoresis, Western blots, immunoprecipitation, and enzyme-linked immunoassays to detect polypeptides. Other techniques such as in situ hybridization, enzyme staining, and immunostaining and enzyme assays also can be used to detect the presence or expression or activity of polypeptides or polynucleotides.
Mutant, non-naturally occurring or transgenic plant cells and plants are described herein comprising one or more recombinant polynucleotides, one or more polynucleotide constructs, one or more double-stranded RNAs, one or more conjugates or one or more vectors/expression vectors.
Without limitation, the plants described herein may be modified for other purposes either before or after the expression or activity has been modulated according to the present invention. One or more of the following genetic modifications can be present in the mutant, non-naturally occurring or transgenic plants. In one embodiment, one or more genes that are involved in the conversion of nitrogenous metabolic intermediates are modified, resulting in plants or parts of plants (such as leaves) that, when cured, produce lower levels of at least one tobacco-specific nitrosamine than control plants. Non-limiting examples of genes that can be modified include genes encoding a nicotine demethylase, such as CYP82E4, CYP82E5 and CYP82E10, which participate in the conversion of nicotine to nornicotine and are described in WO2006091194, WO2008070274, WO2009064771 and PCT/US2011/021088 and as described in detail herein. In another embodiment, one or more genes that are involved in heavy metal uptake or heavy metal transport are modified, resulting in plants or parts of plants (such as leaves) having a lower heavy metal content than control plants or parts thereof without the modification(s). Non-limiting examples include genes in the family of multidrug resistance associated proteins, the family of cation diffusion facilitators (CDF), the family of Zrt-, Irt-like proteins (ZIP), the family of cation exchangers (CAX), the family of copper transporters (COPT), the family of heavy-metal P-type ATPases (for example, HMAs, as described in WO2009074325), the family of homologs of natural resistance-associated macrophage proteins (NRAMP), and the family of ATP-binding cassette (ABC) transporters (for example, MRPs, as described in WO2012/028309), which participate in transport of heavy metals, such as cadmium. The term heavy metal as used herein includes transition metals.
Examples of other modifications include herbicide tolerance. For example, glyphosate is an active ingredient of many broad-spectrum herbicides. Glyphosate resistant transgenic plants have been developed by transferring the aroA gene (a glyphosate EPSP synthetase from Salmonella typhimurium and E. coli). Sulphonylurea resistant plants have been produced by transforming the mutant ALS (acetolactate synthetase) gene from Arabidopsis. QB protein of photosystem II from mutant Amaranthus hybridus has been transferred into plants to produce atrazine resistant transgenic plants; and bromoxynil resistant transgenic plants have been produced by incorporating the bxn gene from the bacterium Klebsiella pneumoniae. Another exemplary modification results in plants that are resistant to insects. Bacillus thuringiensis (Bt) toxins can provide an effective way of delaying the emergence of Bt-resistant pests, as recently illustrated in broccoli where pyramided cry1Ac and cry1C Bt genes controlled diamondback moths resistant to either single protein and significantly delayed the evolution of resistant insects. Another exemplary modification results in plants that are resistant to diseases caused by pathogens (for example, viruses, bacteria, fungi). Plants expressing the Xa21 gene (resistance to bacterial blight) with plants expressing both a Bt fusion gene and a chitinase gene (resistance to yellow stem borer and tolerance to sheath) have been engineered. Another exemplary modification results in altered reproductive capability, such as male sterility. Another exemplary modification results in plants that are tolerant to abiotic stress (for example, drought, temperature, salinity). Tolerant transgenic plants have been produced by transferring acyl glycerol phosphate enzyme from Arabidopsis, and genes coding for mannitol dehydrogenase and sorbitol dehydrogenase, which are involved in the synthesis of mannitol and sorbitol, improve drought resistance.
Other exemplary modifications can result in plants with improved storage proteins and oils, plants with enhanced photosynthetic efficiency, plants with prolonged shelf life, plants with enhanced carbohydrate content, and plants resistant to fungi; plants encoding an enzyme involved in the biosynthesis of alkaloids. Transgenic plants in which the expression of S-adenosyl-L-methionine (SAM) and/or cystathionine gamma-synthase (CGS) has been modulated are also contemplated.
One or more such traits may be introgressed into the mutant, non-naturally occurring or transgenic tobacco plants from another tobacco cultivar or may be directly transformed into it. The introgression of the trait(s) into the mutant, non-naturally occurring or transgenic tobacco plants of the invention may be achieved by any method of plant breeding known in the art, for example, pedigree breeding, backcrossing, doubled-haploid breeding, and the like (see, Wernsman, E. A., and Rufty, R. C. 1987. Chapter Seventeen. Tobacco. Pages 669-698 In: Cultivar Development. Crop Species. W. H. Fehr (ed.), MacMillan Publishing Co, Inc., New York, N.Y. 761 pp.). Molecular biology-based techniques described above, in particular RFLP and microsatellite markers, can be used in such backcrosses to identify the progenies having the highest degree of genetic identity with the recurrent parent. This permits one to accelerate the production of tobacco varieties having at least 90%, preferably at least 95%, more preferably at least 99% genetic identity with the recurrent parent, yet more preferably genetically identical to the recurrent parent, and further comprising the trait(s) introgressed from the donor parent. Such determination of genetic identity can be based on molecular markers known in the art. The last backcross generation can be selfed to give pure breeding progeny for the nucleic acid(s) being transferred. The resulting plants generally have essentially all of the morphological and physiological characteristics of the mutant, non-naturally occurring or transgenic tobacco plants of the invention, in addition to the transferred trait(s) (for example, one or more single gene traits). The exact backcrossing protocol will depend on the trait being altered to determine an appropriate testing protocol. Although backcrossing methods are simplified when the trait being transferred is a dominant allele, a recessive allele may also be transferred. In this instance, it may be necessary to introduce a test of the progeny to determine if the desired trait has been successfully transferred.
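To illustrate why marker-assisted selection can accelerate recovery of the recurrent parent genome, the expected genetic identity after repeated backcrossing without any selection can be estimated. The following is a minimal sketch under the textbook assumption of unlinked loci and no selection, in which the expected recurrent-parent fraction after n backcrosses is 1 - (1/2)^(n+1); the function names are hypothetical and the calculation is not part of the described method.

```python
def recurrent_parent_fraction(n_backcrosses: int) -> float:
    """Expected fraction of the genome derived from the recurrent parent.

    Assumes unlinked loci and no selection: the F1 (n = 0) carries 50 %,
    and each backcross halves the remaining donor contribution.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(target_fraction: float) -> int:
    """Smallest number of backcrosses whose expected fraction reaches the target."""
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target must be in the range [0, 1)")
    n = 0
    while recurrent_parent_fraction(n) < target_fraction:
        n += 1
    return n


if __name__ == "__main__":
    for target in (0.90, 0.95, 0.99):
        print(target, backcrosses_needed(target))  # 3, 4 and 6 backcrosses
```

Under these assumptions, reaching an expected 99 % identity takes six backcross generations, which is the baseline that marker-based selection of the most recurrent-parent-like progeny aims to shorten.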
Various embodiments provide mutant plants, non-naturally occurring plants or transgenic plants, as well as biomass in which the expression level of a polynucleotide (or any combination thereof as described herein) is modulated to modulate the protease activity therein.
Parts of such plants, particularly tobacco plants, and more particularly the leaf lamina and midrib of tobacco plants, can be incorporated into or used in making various consumable products including but not limited to aerosol forming materials, aerosol forming devices, smoking articles, smokable articles, smokeless products, and tobacco products. Examples of aerosol forming materials include but are not limited to tobacco compositions, tobaccos, tobacco extract, cut tobacco, cut filler, cured tobacco, expanded tobacco, homogenized tobacco, reconstituted tobacco, and pipe tobaccos. Smoking articles and smokable articles are types of aerosol forming devices. Examples of smoking articles or smokable articles include but are not limited to cigarettes, cigarillos, and cigars. Examples of smokeless products comprise chewing tobaccos, and snuffs. In certain aerosol forming devices, rather than combustion, a tobacco composition or another aerosol forming material is heated by one or more electrical heating elements to produce an aerosol. In another type of heated aerosol forming device, an aerosol is produced by the transfer of heat from a combustible fuel element or heat source to a physically separate aerosol forming material, which may be located within, around or downstream of the heat source. Smokeless tobacco products and various tobacco-containing aerosol forming materials may contain tobacco in any form, including as dried particles, shreds, granules, powders, or a slurry, deposited on, mixed in, surrounded by, or otherwise combined with other ingredients in any format, such as flakes, films, tabs, foams, or beads. As used herein, the term 'smoke' is used to describe a type of aerosol that is produced by smoking articles, such as cigarettes, or by combusting an aerosol forming material.
In one embodiment, there is also provided cured plant material from the mutant, transgenic and non-naturally occurring tobacco plants described herein. Processes of curing green tobacco leaves are known by those having skills in the art and include without limitation air-curing, fire-curing, flue-curing and sun-curing as described herein.
In another embodiment, there is described tobacco products including tobacco-containing aerosol forming materials comprising plant material - such as leaves, preferably cured leaves - from the mutant tobacco plants, transgenic tobacco plants or non-naturally occurring tobacco plants described herein. The tobacco products described herein can be a blended tobacco product which may further comprise unmodified tobacco.
The mutant, non-naturally occurring or transgenic plants may have other uses in, for example, agriculture. For example, mutant, non-naturally occurring or transgenic plants described herein can be used to make animal feed and human food products.
The invention also provides methods for producing seeds comprising cultivating the mutant plant, non-naturally occurring plant, or transgenic plant described herein, and collecting seeds from the cultivated plants. Seeds from plants described herein can be conditioned and bagged in packaging material by means known in the art to form an article of manufacture. Packaging materials such as paper and cloth are well known in the art. A package of seed can have a label, for example, a tag or label secured to the packaging material, or a label printed on the package, that describes the nature of the seeds therein.
Compositions, methods and kits for genotyping plants for identification, selection, or breeding can comprise a means of detecting the presence of a polynucleotide (or any combination thereof as described herein) in a sample of polynucleotide. Accordingly, a composition is described comprising one or more primers for specifically amplifying at least a portion of one or more of the polynucleotides and optionally one or more probes and optionally one or more reagents for conducting the amplification or detection.
Accordingly, gene specific oligonucleotide primers or probes comprising about 10 or more contiguous nucleotides corresponding to the polynucleotide(s) described herein are disclosed. Said primers or probes may comprise or consist of about 15, 20, 25, 30, 40, 45 or 50 or more contiguous nucleotides that hybridise (for example, specifically hybridise) to the polynucleotide(s) described herein. In some embodiments, the primers or probes may comprise or consist of about 10 to 50 contiguous nucleotides, about 10 to 40 contiguous nucleotides, about 10 to 30 contiguous nucleotides or about 15 to 30 contiguous nucleotides that may be used in sequence-dependent methods of gene identification (for example, Southern hybridization) or isolation (for example, in situ hybridization of bacterial colonies or bacteriophage plaques) or gene detection (for example, as one or more amplification primers in nucleic acid amplification or detection). The one or more specific primers or probes can be designed and used to amplify or detect a part or all of the polynucleotide(s). By way of specific example, two primers may be used in a polymerase chain reaction protocol to amplify a nucleic acid fragment encoding a nucleic acid - such as DNA or RNA. The polymerase chain reaction may also be performed using one primer that is derived from a nucleic acid sequence and a second primer that hybridises to the sequence upstream or downstream of the nucleic acid sequence - such as a promoter sequence, the 3' end of the mRNA precursor or a sequence derived from a vector. Examples of thermal and isothermal techniques useful for in vitro amplification of polynucleotides are well known in the art. The sample may be or may be derived from a plant, a plant cell or plant material or a tobacco product made or derived from the plant, the plant cell or the plant material as described herein.
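The length ranges given above for primers and probes can be illustrated by enumerating every contiguous stretch of a target sequence that falls within one of them. The following is a minimal sketch, assuming the about 15 to 30 nucleotide range as default and using the first 40 nucleotides of SEQ ID NO: 1 as the target; the function names are hypothetical, and GC content is reported only as one commonly inspected property, not as a selection criterion stated in the specification.

```python
from typing import Iterator, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_oligos(target: str, min_len: int = 15,
                     max_len: int = 30) -> Iterator[Tuple[int, str]]:
    """Yield (start, oligo) for every contiguous window of an allowed length."""
    target = target.upper()
    for length in range(min_len, max_len + 1):
        # range() is empty when the window is longer than the target.
        for start in range(len(target) - length + 1):
            yield start, target[start:start + length]


if __name__ == "__main__":
    # First 40 nucleotides of SEQ ID NO: 1.
    target = "ATGGCTCTTCGTTTCTCTTTAATTTTCCTATTTTCTCTTT"
    oligos = list(candidate_oligos(target))
    print(len(oligos))  # number of candidate windows
    start, first = oligos[0]
    print(start, first, round(gc_fraction(first), 2))
```

In practice candidates would be filtered further, for example for specificity against the rest of the genome, which this sketch does not attempt.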
In a further aspect, there is also provided a method of detecting a polynucleotide(s) described herein (or any combination thereof as described herein) in a sample comprising the steps of: (a) providing a sample comprising, or suspected of comprising, a polynucleotide; (b) contacting said sample with one or more primers or one or more probes for specifically detecting at least a portion of the polynucleotide(s); and (c) detecting the presence of an amplification product, wherein the presence of an amplification product is indicative of the presence of the polynucleotide(s) in the sample. In a further aspect, there is also provided the use of one or more primers or probes for specifically detecting at least a portion of the polynucleotide(s). Kits for detecting at least a portion of the polynucleotide(s) are also provided which comprise one or more primers or probes for specifically detecting at least a portion of the polynucleotide(s). The kit may comprise reagents for polynucleotide amplification - such as PCR - or reagents for probe hybridization-detection technology - such as Southern Blots, Northern Blots, in-situ hybridization, or microarray. The kit may comprise reagents for antibody binding-detection technology such as Western Blots, ELISAs, SELDI mass spectrometry or test strips. The kit may comprise reagents for DNA sequencing. The kit may comprise reagents and instructions for determining at least the protease content. Suitably, the kit comprises reagents and instructions for determining at least protease content in plant material, cured plant material or cured leaves.
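The logic of steps (a) to (c) can be mimicked computationally: a primer pair yields a product only if both primer binding sites are present in the sample sequence. The following is a minimal in-silico sketch, assuming exact matching of both primers on a single-stranded template written 5' to 3'; real primers tolerate some mismatches, and the function names and example sequences are hypothetical.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the predicted amplicon, or None if either primer site is absent.

    The forward primer must match the template strand as written; the reverse
    primer must match the opposite strand, so its reverse complement is
    searched for downstream of the forward primer.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start == -1:
        return None
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start == -1:
        return None
    return template[fwd_start:rev_start + len(rev_site)]


if __name__ == "__main__":
    sample = "TTATGGCTCTTCGTTTCTCTTTAATTTTCCTAGG"  # hypothetical sample sequence
    product = in_silico_pcr(sample, "ATGGCTCTTC", "TAGGAAAATT")
    # A non-None product corresponds to "presence of an amplification product".
    print(product)
```

A return value of None corresponds to the absence of an amplification product and hence to no evidence for the polynucleotide in the sample.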
In some embodiments, a kit may comprise instructions for one or more of the methods described. The kits described may be useful for genetic identity determination, phylogenetic studies, genotyping, haplotyping, pedigree analysis or plant breeding, particularly with co-dominant scoring.
The present invention also provides a method of genotyping a plant, a plant cell or plant material comprising a polynucleotide as described herein. Genotyping provides a means of distinguishing homologs of a chromosome pair and can be used to differentiate segregants in a plant population. Molecular marker methods can be used for phylogenetic studies, characterizing genetic relationships among crop varieties, identifying crosses or somatic hybrids, localizing chromosomal segments affecting monogenic traits, map based cloning, and the study of quantitative inheritance. The specific method of genotyping may employ any number of molecular marker analytic techniques including amplification fragment length polymorphisms (AFLPs). AFLPs are the product of allelic differences between amplification fragments caused by nucleotide sequence variability. Thus, the present invention further provides a means to follow segregation of one or more genes or nucleic acids as well as chromosomal sequences genetically linked to these genes or nucleic acids using such techniques as AFLP analysis.
In one embodiment, there is also provided cured plant material from the mutant, transgenic and non-naturally occurring plants described herein. For example, processes of curing tobacco leaves are known by those having skills in the field and include without limitation air-curing, fire-curing, flue-curing and sun-curing.
In another embodiment, there are described tobacco products comprising plant material (such as leaves), suitably cured plant material (such as cured leaves), from the mutant, transgenic and non-naturally occurring plants described herein or from plants produced by the methods described herein. The tobacco products described herein may further comprise unmodified tobacco.
In another embodiment, there is described tobacco products comprising plant material, preferably leaves - such as cured leaves, from the mutant, transgenic and non-naturally occurring plants described herein. For example, the plant material may be added to the inside or outside of the tobacco product and so upon burning a desirable aroma is released. The tobacco product according to this embodiment may even be an unmodified tobacco or a modified tobacco. The tobacco product according to this embodiment may even be derived from a mutant, transgenic or non-naturally occurring plant which has modifications in one or more genes other than the genes disclosed herein.
The invention is further described in the Examples below, which are provided to describe the invention in further detail. These examples, which set forth a preferred mode presently contemplated for carrying out the invention, are intended to illustrate and not to limit the invention.
EXAMPLES
The following examples are provided as an illustration and not as a limitation. Unless otherwise indicated, the present invention employs conventional techniques and methods of molecular biology, plant biology, bioinformatics, and plant breeding.
Example 1
A 48 h time-point following the start of curing was selected to screen for curing-activated genes based on Affymetrix data, essentially as described by Martin et al. (2012) BMC Genomics, 13:674. In brief, exon candidates from genomic DNA and from EST contigs were joined and the genomic candidates were cleaned for redundancies (98% threshold). This resulted in a set of 312,053 exon candidates, 12,925 of which were represented by ESTs but were not included in the genome assembly. Data sets were verified as described by the manufacturer (Affymetrix). In addition, quality checks included probe-level models, Normalized Unscaled Standard Error (NUSE) and Relative Log Expression (RLE) plots, and the analysis of detection above background (DABG) results as described by the manufacturer.
As the exon array design had no mismatch probes, summarization was performed using the Robust Multi-array Average (RMA) method. A total of 272,342 probeset expression values were generated, and DABG P-values were computed to assess the significance of the signal obtained for each probeset. This involved the background probes that are spread over the chip. These random probes have a varying GC content. Quality checks involved a combination of Affymetrix Power Tools (APT) and Bioconductor packages, for which the Tobacco Exon Array (TobArray520623F) cdf environment was created. Once the expression values were available, differential gene expression analysis was performed using moderated t-statistics in linear models (LIMMA).
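RMA as commonly described combines background correction, quantile normalization across arrays and median-polish summarization. The following is a minimal sketch of the quantile normalization step only, written in plain Python for illustration; it is not the APT or Bioconductor implementation used in this Example, it breaks ties by input order rather than averaging them, and the function name and input values are hypothetical.

```python
from typing import List


def quantile_normalize(arrays: List[List[float]]) -> List[List[float]]:
    """Give every array the same empirical distribution.

    `arrays` holds one list of probe intensities per array, all of equal
    length. The k-th smallest value of each array is replaced by the mean
    of the k-th smallest values across all arrays.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of each rank across arrays.
    reference = [sum(a[k] for a in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Two hypothetical arrays with three probes each.
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
    # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After this step each array has identical sorted values, so remaining differences between arrays reflect the rank of each probe rather than array-wide intensity shifts.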
Example 2
Differential expression. The tissue samples were sequenced using RNA-seq; reads were mapped to the genomes of the 3 varieties using Tophat2. Previously published gene models were used as the basis for the differential gene expression analysis. Expression changes during curing were calculated using the Cuffdiff2 software based on the mapped reads. Genes were considered up-regulated if their expression levels increased significantly during the first 48 h of curing, and not if the change was insignificant or if expression decreased. Tobacco proteins were identified by a BLAST search against a database of transcripts for the 3 varieties, and equivalent genes in the 3 varieties were identified by a mutual best BLAST hit search of the transcripts of the 3 varieties Burley, Virginia and Oriental (e-value cutoff 1e-80).
The data (Figure 2) shows the number of senescence-activated genes in the 3 cured varieties.
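The mutual best BLAST hit search used in this Example to identify equivalent genes can be sketched as follows. This is a minimal illustration, assuming that BLAST results are already available as (query, subject, e-value) tuples for each search direction; the gene identifiers are hypothetical and the code is not the pipeline actually used.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, e-value)


def best_hits(hits: Iterable[Hit], evalue_cutoff: float = 1e-80) -> Dict[str, str]:
    """Map each query to its lowest e-value subject, ignoring hits above the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        if evalue > evalue_cutoff:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                     evalue_cutoff: float = 1e-80) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    a_best = best_hits(a_vs_b, evalue_cutoff)
    b_best = best_hits(b_vs_a, evalue_cutoff)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)


if __name__ == "__main__":
    # Hypothetical transcript identifiers and e-values.
    burley_vs_virginia = [
        ("burley_g1", "virginia_g7", 1e-120),
        ("burley_g1", "virginia_g9", 1e-90),
        ("burley_g2", "virginia_g9", 1e-85),
        ("burley_g3", "virginia_g4", 1e-20),   # fails the 1e-80 cutoff
    ]
    virginia_vs_burley = [
        ("virginia_g7", "burley_g1", 1e-118),
        ("virginia_g9", "burley_g1", 1e-95),
        ("virginia_g9", "burley_g2", 1e-83),
    ]
    print(mutual_best_hits(burley_vs_virginia, virginia_vs_burley))
    # [('burley_g1', 'virginia_g7')]
```

Requiring the best hit in both directions is stricter than a one-way search, which is why burley_g2 is not paired with virginia_g9 in the example.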
Example 3
The protease genes identified in Example 2 were analysed for membership of known protease families. The results are set forth in Table 1.
The 80 curing-activated protease genes were found to belong to 21 different protease families. In the table: AC, air-cured; FC, flue-cured; SC, sun-cured. AC+FC+SC, up-regulated in all three types of tobacco; AC+FC, up-regulated in air-cured and flue-cured tobacco; AC+SC, up-regulated in air-cured and sun-cured tobacco; FC+SC, up-regulated in flue-cured and sun-cured tobacco; AC, FC and SC, up-regulated only in the respective tobacco type.
Protease coding genes: AC+FC+SC AC+FC AC+SC FC+SC AC FC SC
Alpha/beta-Hydrolases superfamily protein: 1
Aspartic proteinase A1 (APA1): 2 1 2 1 *
CLP protease/crotonase family protein: 1 1
Cysteine proteinases superfamily protein: 3 2 3 1 2 1 2
DegP protease 3: - - - - - 1 -
Eukaryotic aspartyl protease family protein: 4 1 1 1 3 2 2
FTSH protease 8: - - - - - - 1
Gamma-glutamyl transpeptidase 4: 1
Heat shock protein 101: 1 1 - - 1 1 -
Lon protease 1 & 3: - - - - - 3 -
Metallopeptidase M24 family protein: 1
Papain family cysteine protease: 1
Peptidase M20/M25/M40 family protein: 1 1
Protease-related: - - - - 1 2 -
SAG12: - 1 - - - - -
Serine carboxypeptidase-like: 1 1 1 1 1 1
SERPIN: - 1 1 - - 1 -
Signal peptide peptidase: - - - - 1 - -
SITE-1 protease: - 1 - - - -
Subtilisin-like serine endopeptidase family protein: 3 1 - 1 3 -
Ubiquitin-specific proteases: 1 3 2
Total: 16 12 6 3 17 19 7
Table 1.
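The grouping used for the columns of Table 1 can be sketched in Python as follows. This is an illustrative sketch only: the gene identifiers and the per-type sets of up-regulated genes are hypothetical, and the function names are not from the source. The column totals are copied from the last row of Table 1 and sum to the 80 curing-activated protease genes.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional, Set

CURING_TYPES = ("AC", "FC", "SC")  # air-cured, flue-cured, sun-cured

# Column totals from the last row of Table 1.
TABLE1_TOTALS = {
    "AC+FC+SC": 16, "AC+FC": 12, "AC+SC": 6, "FC+SC": 3,
    "AC": 17, "FC": 19, "SC": 7,
}


def curing_category(gene: str, upregulated: Mapping[str, Set[str]]) -> Optional[str]:
    """Return the Table 1 column label for a gene, or None if it is not up-regulated anywhere."""
    labels = [t for t in CURING_TYPES if gene in upregulated.get(t, set())]
    return "+".join(labels) if labels else None


def count_categories(genes: Iterable[str], upregulated: Mapping[str, Set[str]]) -> Counter:
    """Count genes per column label, ignoring genes that are not up-regulated."""
    categories = (curing_category(g, upregulated) for g in genes)
    return Counter(c for c in categories if c is not None)


# Hypothetical gene identifiers and up-regulation sets.
upregulated = {"AC": {"p1", "p2"}, "FC": {"p1", "p3"}, "SC": {"p1"}}
summary = count_categories(["p1", "p2", "p3", "p4"], upregulated)
total_genes = sum(TABLE1_TOTALS.values())
```

Because each gene receives exactly one label, the seven columns are mutually exclusive, which is why their totals add up to the number of genes.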
Example 4
APA1 is encoded by a single gene in Arabidopsis thaliana and by four genes in tomato. The gene activated in flue-cured Virginia tobacco (see Table 1) is close to APA1-Tomato-1. Two gene copies from both ancestors N. sylvestris (S) and N. tomentosiformis (T) exist in N. tabacum. Affymetrix data confirmed activation of the S form (upper panel), but apparently not of the T form (lower panel), during Virginia flue-curing.
Example 5
Table 2 illustrates the differential up-regulation of SEQ ID NOs: 1 to 80 in the three tobacco types: air-cured Burley (AC), flue-cured Virginia (FC) and sun-cured Oriental (SC).
Table 2
SEQ ID NO: AC FC SC AC-FC AC-SC FC-SC AC-FC-SC
1 X
2 X
3 X
4 X
5 X
6 X
7 X
8 X
9 X
10 X
11 X X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
SEQUENCE LISTING
SEQ ID NO: 1
ATGGCTCTTCGTTTCTCTTTAATTTTCCTATTTTCTCTTTTCTTAACGACGTCGTTATTGTT
GTCCGTTAACGGCAACATTAACGGCGGTGAAGATGACGATATTTTGATCCGTCAAGTCG
TAGGCGACGACGACGATCACTTGTTAAACGCCGATCATCACTTCACGATTTTTAAGAGG
AGGTTCGGCAAAACCTACGCGTCCGATGAGGAGCATCATTACAGATTCTCGGTGTTCAA
GGCTAACTTGCGCCGTGCAATGCGCCACCAGAAGCTTGATCCCTCCGCCGTTCACGGT
GTGACTCAGTTTTCCGATTTGACTCCGGCCGAGTTCCGCCGGAATTTTCTAGGAGTTAA
CCGTCGGCTCCGGCTTCCTTCTGATGCCAATAAAGCTCCTATTCTTCCTACTGAGGATC
TCCCTTCAGGTTTCGATTGGAGAGATCACGGTGCCGTCACGTCAGTAAAGAATCAGGTA
CTAGTATATATCAATGTTTGTGTAAAGTTTATCTTTTTTTGGATAGGCGAAGTGTTCGTCA
TTAATGAATAATTACATAATTTCTATTTGTATCGATTGAAAAACTAGGGTTCATGTGGCTC
GTGCTGGTCATTTAGTACCACTGGTGCGTTAGAAGGTGCCACCTATCTTTCTACAGGGA
AGCTTGTAAGCCTCAGCGAGCAACAACTTGTGGACTGTGATCACGAGGTTTGACGTTCT
TCCTCTTTATCTTAGCTTAAAATCATGAATATATTGTCAATAGAGTTACTGTTTTTCTTTTT
TCTTTTTTTCTGGGACGTTTGAATGTGTAAAATAATTTTCGCTGTGGTGTGTCACAGGAT
TTGGTCCATAGCTGTCATCTTTTTCTAGTTAAAGAAAATTGATAGCGTGAAGGACACTAA
CCGCATAAATTAAAGTGCTTTCTCTGATTCCGTCTCACTTTAAAGTTTAAGAACCCGTTT
GGCCATGAAATTTCTTTTTTTTTTCCGTAAAATTTAACTTTTCTTCTAAATCAATGTTTGGC
CATCAAATTTTTTATTTTCACTTGAAGATAATTTTACAATTTTTCAAAAATTTGAAAAACTT
CAAAAACTGTTTTTCAAAATTTTGAATATTGTTGTTGATGTAAAAAACAGACACTAATTTA
TAAGAGTAATCTCCTCTTCTTTGTTTGGTGGATGGCCAGGGGTGGGGACTGGGGACCC
ATCTTAAGGGAGCGGAGGAAAAGTTGTTTTATTATTAGTTTATGGCTGGTTATGAAATTC
AACTAATTGATACTCTGAGGATAACCACGGACAAAATTGTTTGGATGATGAGGAAATCG
CATCCAAAAATTGTCTGCATCTGAATATACTTTTAACATTACTTGAAGTTTCAAGTTTAAG
CTCGTGTATGCAACGTGGTGGGAGATGTACAAGGATAAATAGAAAGGCGTTGAGTTATT
GAGATAGGTTTGTAAAACTCTTCTTAAATTTTCCATTGTTTGATTGCCATTATATAATCAT
TTGTATAATTTCCAACTTGGAAAAAGCTGTTCAAACTCAAAATAAGGTTTAGGCTTGAAC
TTATTGCTATTTACGGTGTCTGCCATTTTATAATCAGAAATGGGATTGAATACAGAGTTA
ATAAGACCACTGACTCGCCTTATTTACCTCACTCGTCTCAGATGAATTTTATACTTCCAA
ATTTCAGTGTTCCCCATCTCCCTGAAAAATGTATAATTTGGCCTTGCATTTATCTGCAGT
GTGATCCAGAAGAAAAAGATTCATGTGACGCAGGGTGCAATGGTGGCCTAATGAATAG
TGCCTTTGAATACACTCTGAAAGCTGGTGGACTTATGCGAGAAGAAGATTATCCATACA
CTGGCACCGATCGTGGAACCTGCAAATTTGACAACACCAAGGTTGCTGCTAAAGTTGCT
AACTTTAGCGTTGTCTCCCTTGACGAAGAACAAATCGCTGCTAATCTTGTCAAGAATGG
TCCTCTCGCTGGTAAATAGTCTCTCAAAACACTTTTCAATTTGCCTATCATTATGCTTCTT
CTTTGTCCTTACTTGATATTGTCAAAGTATATACTTGGATTGTCATATTTATGCACTGGAA
TGTAAAAGGTATTTACACAATTAAGTCACTTATTAGGTAATTACAAGTAACTATTTTGATA
AGTTTTAATTAGTAATGTGTTAAAATGATAATTAACTTGCTATTTAAATTCACTGATAGCC
GTAACAAAATCTTTTAACTATTAATATATATAATATAAATATTTGTTTTTTAATAAACAACA
AATATTATTTGTGAAAGATCCAGTTATGTAGCTTGAAACTACATTTTGGGATTTTGAATTA
TGTACTACTCTTCTTATGCTAATGGTTTTCAATTTTTCACTGATGTAAACTTCTGAAAGCA
TTTTTGTTGCTTGGCTTGCAGTGGCGATCAATGCAGTGTTCATGCAGACATACGTTGGC
GGAGTTTCCTGCCCATATATATGCTCTAAGAAGTTGGATCATGGTGTCTTATTAGTTGGT
TATGGTACTGGCTTTTCTCCCATTAGAATGAAAGAGAAACCATACTGGATCATCAAGAAC
TCATGGGGAGAGAAATGGGGTGAAAACGGATACTACAAAATCTGTAGAGGCCGCAATG
TTTGCGGAGTGGATTCAATGGTTTCAACAGTTTCAGCTGTTAGTACCAGCTCACAC
SEQ ID NO: 2
TTAAGCTGCTTCAGCAAATCCAACTCTGAGTTTGCCATAATCGAAGACTGTGTGATATC
GACCCATGAAAACATCACCCAAGATCCTGTAACCAAAGGAATACCATAGAGAACTCAGT
G AACAAAAG AACTG CAG G CTCAG GTTTAATTGTGCTGTAGCTCTATAGTTCG G ATTAAA
CTTATATTTGGATTAACTGCATTGCTGATATTTATCTCTAAAACATAATTATAAACTAAAAT
AGAGAGAACATATAAAGATAACTTTACCAGAGTGGTCCGCGGGGAGGAGGAATGTCCA
AGCCAGTGAAACCACTAATACACTGTGCCTTAGCACCCTCGCCCACCTTGAGTATGTAC
TGATCACGGGAAAAGAATCCAAAATTAGAACATAGATCAATCTAGGTCAGCAATCTAAA
CAGACAACTGAAACAAGTAAAAGGGAACCACACATCGGATTACAACTTCACTCTTTCAA
TCTTG AAAAAATTTGTTG AAAG G GTG G G AAAAGACTAGAGTGATAGTCTAGTAG AG AAA
AGTTTTG CATATG GTCAAG GG GTTTG GTGTATCACTTG G ATTTTTTC CTTTGTAAGATGT
GGTCTATCCTGAATTATTCAAAGCTCAACCTCTTTATGTTACTAAACCACAAAACAAACA
AATTCAGAAAAAATGCAAATGATCAAATTGATTTGGTGTACACTTGATGAATTTCTTCTTT
GTAAGATTTGTTCTATACTGAATCCTTCAGACATAAAAAAAAATATTTTTTTTTTGGGGGG
N CG G CCTGAATTAG CAAG GTCAG CAAGTAATACACTTCCATAAAAATAG CAAAGG GTAA
CTTTTTCACGGCACAAAGATCTTATGCAGGTTTTCTTAGATTACTTAGCTGGAAAATGAG
ACATCTAAATTTAAGTAAAGTCGAAATACTCACAACCTCAAAATATAGAAGTACTTCTTGA
TGACAACAAACATCTACTTCTCTGTAGAAACTGAAAACCTTAAACACTAGAATCGGTTTT
GTAATATGACAATTAGTTGTAATGCCACAAAAGGACTCTATGATGAGCCACTTAATTTTT
TCTCTCTTTGACAATGTTGAATTAGAAGAGGAATAGCAATGTTTATTACTGTCAAAGACC
ATTATAAAGCATACCTCCTTCGGGACGAGGTCAAAAACTTTGCCACCAATTGTGAAAGA
GACTGTAGGCATTGAAGAAAGCTTTCCACAGTCAACAGCTGATTCCCCCAATGGGCTTG
G GAG ACG CTCG CAAAG CTGTC AGTCAGTCAG CAATCTTTTCAAG AAAAG AAG AAAATTG
CAGTGACAGATGTTTTACCTCATTCACATAGTTTAATATGCGATCTTGAGTCTGGTTTTG
TCTCAGTTGATTCTCCATCCATATGACCGCCATTTCACAAGCAGAGCACATACCATCCT
G CAGTCCTGTG GATCTG CCAGCTTTCTC GTCTACAACACTCTCAATTCC CATACTG CAG
AAAAAAGGCCACAGAATTATTCAGTATTTATATCAACATTATGGATTAACCAAATGCACT
ATATGTTCACATGAAGGGGCAAAGAGAGCCTGAAGACTAACCTAACTCCGCGGTTTCCA
TCGAAAGTGCATACTCCAACCTGTGAGCAAATCTTCTTTGGATGTGCCTGTTTAAATACT
GACCAATTAGATAACCGGGAAAGGCAACTAGATTGCCAAGTGTCATTTTGCTGTAACTG
CACAGGAAACTGCATATCAAACAAATGAAAATGCAGTTACACAGTTGAATGCTCACCTC
TGCTAACAGCAAATCCATGATTGTCTGCCCGTACTGCTCCACTACAGATTTGCATTGTT
GGCTAGCAACTCCAGAGGCTCCAATGGCTTGATTAATCATAGTGATTATGGTCTGTACG
GAAAGGGGTGGGTTTAAGATTGCTCAACCTTGGAAGTGTTTTAATCGTACAATTGTAGA
G AC AAAAG G CAG C AG ATTTTTACTTAATTTATATTGTC AAC ATTTC C AAG C C AAC AG GAT
AAAACTTGG CTACAGTTTTCG G GTTG G ATAATTTTCTTTTCAAATAG AAG AGG G GTAAAT
AAATAAGTCGACAGAAGACCAGGACTACAGCAGAAGTAAAAGCATCATCCTCATTGAAA
C GTAAT AAAAG C AAGTAAC AC AAAAC AAC AAGTAC CTACTG AG G CAG CTTTAAAC ATATT
AAACTG AAAG AC AG G G AAG AAAAAG C AG ATTTAC AG ACTTC G G C C C AGTG ATAAG CTAA
GATGGTATATCCAAAGGTAACCTCAGAAATGAAAAACCAATTTCACTACCATCCTCTCTG
TGATGAAAAATTAAAACACAACACAGATCAGATGATGGATTCGTGCTATTTAACTCATGA
ATCTTAG G AAAATGTTACTTTTCTTGCTG AGCTGTTG AAG GTT CAAAG GAACAAG GAAAT
CAATAATCGAATTGCGGTTGACTTTGATGATGCAAACAAATAACAAAAACATAACAAACA
AGCGATATGTCCCAAATCAAGGCTATAGATAATACCGTTGGACCAGCCAAGAGAGAAGT
CCCTGAATCCGCTATTGCAGAGCACCCACTTTCACAGTAACCTACACCCGATATGTTCA
TGTTAAAAACTCAAAGGAAGGGAAAATTCTATATCCAGGCACATAGCCTTCATCTATATT
CCCGAAATTCGGCAATCCAATTCAAAAGGTACATAGCAAAACATACCAGTAGCTTTACC
CTCGATAAGAACATCACCCATATCAAACTGCCAATCATATTACTAAGATCAAGATAGCAT
TTGTACAAAAAATGAACATACATAGTATCGAATTGACCGAATGACAAACCTGCCAATAAC
CTTTGTGTGTGACTGGGACATAAGTGATTTCTCCCTTATAGTGATTAGGATCAACCCCA CCAAACACGATTTCTCCGCCTTGTTCTTCCTCTGTATTTCGGTTGAGCCAAAATGAGAA
GACAGGATCCTTGATAAGACCCTGTTGGACCATGTTGTACCTGGAAAAGACAGGAGAT
GCTGCCCAGATGAATGTCAAATCAAATTTAAACAGAAAGAGACATCCAGCCTATCCTGC
ATTTATGGAAATCTAATCCTTCAATGTGTTAAACCTCTTCTGGAAAGGAAATTGTCTAGA
GCTTTAATTTGGTTTGTGGGAAAGAAATAGAGCAAACTAAATACCGCCCACGTACCAAA
CTGGAACAGCATTGCCAACTGAAATCTCCTGGAATCCAAGACCCAATATACCGTCAAAC
TTGGCTACCAAAAATGTCACGCTGGGTTCTCTGGTTGCCTCAATAAATTCCTAGTACAT
GAACACCTTGAGATATAAGATTTCCACTTTCAAGAGATTTAAAACAAGTGAGGAGCCTCA
CTAACCTGATCTGTTACAACAAGGTCACCAACTTTGACGTTGTCTTGACTGAAGAATCCA
GAAATAGCTCCACTACCATACTGAATTGCAGCAGACTTCCCTGTATGGCAAATCAAAAA
TTTATCACGAACTAAATCACATTAAATTACAATGCCAAATACGATCTCAGTCTTGTGGAA
ACATTCGATAAGATCTTAATGTTGTTCATTAAGGTAGGAGTGACCTAGTGTTCTTAAAAG
CAAAACGTGCAAAAAAATAAAACAAGGTCCACGGACTTGCATTGAATTGCGAGAAGTGA
AG C G C AAAATTAAC AG G AAAC AAG AAAATATC G AAC ATTTATG AATTTACTCTAC C ATAA
AAATTAAACTGCAGGTTAAATAACTAATTTCGGCATCCAATATACAATAATCCCAGTATTA
ATTCAACTCCTCAAAATTGAGATTCAAAGAAGCAACCAATTCTAGTTGGAATCACTTTGT
GCACCATTATTTGAAGCGCAACTTCTCTAAAGCGCATGGCTTCAGCAATGAAGTGATAG
CCCTTGCTGCATCGCTTCATAACTTTAAACGACCAAGCAATGGCTTTCAATAACACTGG
AGTGAACTCACCTGGCAAACCCATTCAACTGCTATAGACTGTTCAATCCATTTTCTTTGA
GCAACATATATAATTGTAATAGAACAAAAAATAAAGAATAACTAGTTCTCTGCGGAAAAT
TTCTTATGTCACAGACCTACATGGATACAAACCGAGTTAATAGGGAAGAAGAAAGACCA
TCTAAAAAG G CATTG CATAG GTTAAG ACTTAAGACTATAC AAG GTG CAACG AAAAGG CA
CTAATCGCAGAGAGATATAAGGATATTGATGTTTCTTTTCCAAAACCTCACTAGTTACAG
TAATATACTAAGAAACACAACATAAACATTAAACAGCCTCGTTTTATGTCTTAACAGTCAA
CTACATGTACTCGTCAATTAACCTTTCCAAGGGAATCCCTTGATGCACCGTGAGAAACA
CATAAGGACAATACAAAAGATGTTCCATAATGAACAAGATGGCACGTATTCTAAACAATA
ACAG G CATTAG AAGG AAG CATATGTTTCATGC AG CAATAAACAAG CAAATGGTAGAG AG
AAACAATTTGCATCAACATACAGAAATGGAAACATAACATACCATTCTTCTTATAAGTACT
TG ATTCG CTTG ATTTG AACTTG G AATGAAAG AAACAG G GAACCTACACCAAG ATAG GCA
GTCATCAAATTTTACATCACTCAAGATGGATGTACAATGCTATGCTTTGTATCATTTGCAT
GTATAG AAG CTTAC AG AG AAATAG C ACTTC G AC G AC G G C AC C C AC AAATTC G AG CTAC C
AGTGTCAAAGATTACAGTGAACTTCTGAGGTGGAGTGCCTACACCAATCTCCCCAAAAT
ATTGAGCATCCATATAGTTCTTCAGTGCTACAATGTCTGTATCCTCAGAGTCCCCGAGTT
TACCACGGAAGTTATACTTCCTAATAGACGCCCTCAAAACGTCCCCTTCCTTTGACTCAA
TGCGTGCAGCAAGCCGGTTATTTTGATCAAATTTCATTTTTTTCAAGCCAATTCTCATCA
AGCCATCATTGGATGAGGAGGCCAAAGGAAAGAGCAGTGCTGAGAGAAACAGGGCAA
CAAGAAATACTTTTGCTCCCAT
SEQ ID NO: 3
ATGGGTTCTTTCCTCTGTTTCTCCGTCATTGTTGTTCTCCTTGTTCTTCAGCCATGTTTA
GCCAAGAAAGTTTACATTGTTCACATGAAAAATCACCAAATACCTTCTTCTTTTGCTACC
CATCACGATTGGTACAATGCTCAGCTCCAATCTTTGTCCTCTTCTTCTACCTCTGATGAA
TCATCCCTTCTTTACTCTTACGACACTGCTTATTCTGGCTTTGCTGCTTCTCTTGACCCA
CATGAAGCTGAACTACTCCGTCAATCTGATGATGTTGTTGGAGTTTACGAGGATACTGT
TTATACACTCCATACAACAAGGACTCCTGAGTTTCTGGGGTTGAATAATGAGCTCGGCC
TTTGGGCTGGTCACAGTCCACAGGAACTCAACAACGCTGCTCAGGATGTTGTTATCGG
AGTTCTTGACACCGGCGTTTGGCCGGAGTCGAAGAGCTATAACGATTTCGGTATGCCC
GATGTGCCGTCGAGGTGGAAGGGTGAATGTGAATCGGGTTCCGATTTCGATCCGAAAG
TACATTG CAACAAAAAG CTG ATAG GTG CTCGTTTTTTCTCCAAAG GTTATCAAATGTC GG CCTCTGGCTCGTTCACGAACCAACCTAGACAGCCGGAGTCACCTCGTGACCAAGACGG
TCATGGCACCCACACATCCAGCACCGCCGCTGGTGCACCTGTGGCGAACGCTAGCCTT
CTCGGGTACGCTAGTGGGGTCGCGCGTGGTATGGCACCTCGAGCGCGTGTAGCTACG
TACAAGGTATGCTGGCCTACTGGTTGTTTTGGTTCTGATATTCTAGCTGGTATGGAACG
TGCTATTTTAGATGGAGTTGATGTACTTTCATTATCTTTGGGTGGTGGATCGGGTCCTTA
TTATCGTGATACAATTGCTATTGGTGCTTTCTCTGCTATGGAAAAAGGAATTGTTGTTTC
CTGTTCAGCTGGAAATAGCGGTCCAGCTAAAGGCTCACTTGCAAATACAGCTCCTTGGA
TCATGACCGTTGGTGCTGGTACCATAGATCGTGATTTCCCTGCATTTGCTACTTTAGGT
AACGGGAAAAAAATTACCGGAGTTTCGTTATACAGTGGAAAAGGAATGGGTAAAAAGGT
AGTTC CATTAGTTTACAG CACAG ACAGTAGTG CAAGTCTTTGTTTG CCG GGTTCACTTG
ACCCGAAAATGGTCCGAGGGAAAATAGTGTTATGTGATAGAGGGACAAATGCGAGAGT
AGAAAAGG GTTTAGTAGTG AAG GAAGCTG GTG GAGTTG G G ATG ATATTG GCTAATACG
GCGGAGAGCGGCGAGGAATTGGTGGCGGATAGTCATTTGTTGCCGGCGGTAGCTGTA
GGTAGGAAATTGGGAGATTTTATAAGGCAGTATGTAAAGAGTGAAAAGAATCCGGCCG
CCGTGCTCAGCTTTGGTGGGACGGTGGTGAATGTGAAACCGTCGCCGGTGGTGGCTG
CGTTTAGTTCAAGAGGGCCCAATACTGTAACTCCACAGATTTTGAAGCCCGATGTTATT
GGGCCTGGAGTTAATATTTTGGCTGCTTGGTCTGAGGCTATTGGGCCCACTGGGCTTG
AAAAGGATACCAGAAGGACCAAGTTCAACATCATGTCTGGTAAGTATTACCAACAACGG
CTAGTTTCTTAATTTAATCTTTTTCATGCTTAGCTTAATTATGGCCTTAATTATATTTTTAT
TAG ATCTC G C AATTATTAATACTAAC C GTAC AC ACTTAAAAAG G AAAAG AG G AAC G C GTA
GAATAAAGACACCTGTGGGTGATCTGGAATTATGTACTATGCACATTCCTAAACTTTAGA
GGGGTTCACATGTGTAGCATTGATAAGTTAATCCTAAATTACATTAGTTATAATTAAATAT
TAATGCAGTTTCCAAGAAAATAGATGGACTAAAATTTAGACTTATTTGTATGATGTGACG
TGTGGAATTAAATTTAAAAACTGCCCAAGCCTATATCAAATTTATGGCTAAAATAGCAAG
AAACGTCCCTTTAATAGGCACAGAAGAAATCCAAGAGGGGCTCGCTGTAGGAGTGTTA
AGAGTTTCGATATGAACAAGGTCTAGAGAAGAATTTATTAATTAATTTCAATAATATACGC
TAATGGTATTTGAAAACAATATATTGTAATTTATCGTAACAAGTTACTAATTTCGCTTATTA
TAGACCATTATTGTGAAGTTATTTCTATAGATAAGCCAATAGCATAAAATTCATCCGTCG
GAATGTGCAAGGTGTAGTGGTAGGAGTGCTACTCATGATGTGACAAGTGCATGTCACG
GGGTTTGAATCATAATGCAAACAAAAGCCTGATATGTTAGTGAAAAATGATAGAGGGAC
G GGTTCATTATTC ACACAAAAG CTTG ATATTTAAGTGAAAAATG ATAG G GGAACG AGTTC
ATTATCCAAAGAGTTTCGAACCTAACCTTACACCATGGCCCTTCTTGGTATAATTTACAC
TAGTTTTATGAGGCTCCTTTTTGTCTCACAAATTTGTGGATCCCAATCTTACACTTCTGG
GTCCACAAAATTGTGAGACAAAAAAGTTACCTCATACAACTAGTCTCGGTTGTGACTAGT
TGTATGAGACACAAAATAAAATTTTCCGAAAAAGTAGTATGGTCTGTCTTCCGCTATGAG
ACTAGTTGTTTGAAACACAAAATAAAATTTTCCGAAAAAGTAGTATCGTCTGTCTTCCAC
TAGTGGGTCCTGGTCCCCTTGGAATCCCAGATTATTGGTCCCTACATAACTATAAAGGT
CATAACCTTATCATGGATTTAACATCAACCCTTTGCCCCATCTGAGCACTCTGGACCTAC
CTTAATCACTTTATTGGCTGGAAATAAGTTGATGAACTTTTTGAATTTTTCTTGAAAAAAC
AACAACAAAAAACCACTTGTGATCCCACAAGTGGGTCCGGGGATTAGTGTGTTATAAAG
AGGATGTTTCGGATAGACTTTCGGCTTAGGAAAGATCAATAAAGTAGTAGAAACAAGCA
ATAACAATAGCAAAATACTGAATTTTTCTTGAAAATCCTACACAAATCTCATACTTTGAAA
ATTGTATTTTGTTACATAATTTGATCATTTTTCACTTCGAACTCTTGTAGGCACATCCATG
TCCTGTCCTCATATCAGTGGCCTAGCTGCACTGCTGAAAGCAGCACATCCTGAATGGA
GTCCAAGCGCGATCAAATCTGCACTTATGACGACTGCCTATGTTCGCGACACCACCAAC
TCTCCTCTCCGCGACGCTGAAGGTGGCCAACTCTCCACTCCTTGGGCTCATGGATCAG
GTCATGTTGATCCCCATAAGGCACTTTCCCCCGGTCTAATCTATGATATTACCCCAGAG
GACTACATCAAATTCTTATGCTCCTTGGACTATGAGTTGAACCACATACAAGCCATTGTC
AAGCGCCCGAATGTCACTTGTACTAAGAAATTTGCAGATCCTGGGCAGATTAACTACCC
TTCATTCTCAGTTTTGTTCGGGAAATCAAGGGTTGTTCGTTACACCCGTGCAGTCATCAA
TGTAGGAGCTGCAGGATCCGTCTATGAGGTGACCGTTGATGCTCCCCCGTCTGTTACT GTAACCGTGAAGCCATCAAAACTTGTATTCAAAAGGGTAGGAGAGAGGCTGCGTTACA CCGTTAC ATTCGTGTCAAAGAAG G GTGTTAACATG ATG AG AAAG AGTG CATTTG G CTCC ATTTCTTGGAATAATGCTCAAAACCAAGTTAGGAGTCCAGTTTCATATTCCTGGTCACAA CTATTAGAC
SEQ ID NO: 4
TCAAGCATCAGCACATCTTGTTGGTGCATATCCCAGTCTGGACCTTTTGGTGTCATATAA
GATATGAAAATTCTGCTGCTGATAGTTTCCAATTATCGACAAAGCAGATCGAGGAGTCC
CTAAAACTGCCAAACAAACGATATCCTCTGGTTCGAGTTTGATAAAGTAGTTCTCTACTG
GAAAATTCCATACAGCTCCATCACCAAACACGATCCCAAACGAGGGAAATTCCAAGTTC
TTCACACCAGACACATTGTAACACGGATTCAAAATAGGAAAGTCTTGTACAATGGGATAT
CCCTTAACCTTATTGACAAATGCCTCTTTTATAATCTCATAAGCAGGATCCGCGAAATAA
CTCAATGTGGTACCTGAATCAATGATTGCACCACCAAGACCTTCTAGCGATAAATTCCA
CGTCTCCTCGGGTATATTCAGTACCTCTCCTCCAACTATGACAGACTTTATCTGCACATA
GTAGAATGTTTCCACTTCTTTGCCTCCAACCAATGAAGTAAAATTCAACTGTGGATGTTT
CAAAAGTTCCTTATCTTCACCAAAAATCAACTTACTACTAACACTAGAATTGCTATTCCTA
TCAACAAGACAATACGAAAACGAATGACCATATAAAGATTGAAGCTGAGAAGCAAACGA
AAGCGGCCCTCTCCCTAATCCTAACAAACCAGCAGCACCATGAAATAATCCTCTATTCC
AATGACCACAACCAAACATCACATTTTCCACCTTCCTAAATTCACTCCCACTCGTCGTCG
TGAGGTTAACAGTAAATGTCTCTAGCGCGAAATCGCCAGTAGTATTAGAACTATCACCA
TACCAATAGTAATAAGGACAAGTTTGATTCTCGGATTTACAAAGCTGAGGAGGATCAGG
GGATGTAACAAATTTACACCTAGGATCATGACAACTTATATTTCTAAATGAAGTAGAGTC
TTGAGGATTATAATGAGGTCCATTTTGTTCAAAACAATCAAAACAAGGAACACATTGAAT
CCAATTAAGATCACTACCAGTATCAAGAATTAAAGAAAAATGCTTAGGTGGTGTACCAAC
AAACACATCCATAAAATACTCACCAGAGCCAAGGCTTACACCTGACTCCAAAGTCGCCA
TTAGTTTGCCGGAAAGTTCATAAGATTCCAGCGAAACTGCTGCCGGAGCAATCACAGG
CTTATGTTTGTCCACATGTTTTTCATTACTTTTTGCAAGTCTTGAATTGTAATTCTGATTTT
TCTTCTCAACAATTCTTGTATGGAGTGTCTGAATTCTGCTTAAATCCCTTGCTCTTGACT
CAAAGACTGAATCCTTAGCCTCAATTTTTTTACCAGCTGATCTGTGCCTTAACTGAAACT
TTACAGCTTCTTTTTTCTGGTTTCCAAAAATGGAAACTTCTTCATTTTCTCCATTTTTTACA
TCAACACCATCTACTTCTTGAGCTATTGAATGGGTTTTTGATTTTTGAGAAACTCCATAG
TTG CAATCTG AGTCAGCTGAAG AAG AAACAG CATTAAAG CTTG G ATG GTTAG GG AATTC
AATACCCGAAACACTAGAATTTAGATTTCTGAAGCTGTAAAATCCTCCACAGGCAACAAA
ACCAGAGGAAAACAAGAATATAAACAACAAAATGAAAAGAATGAACTTTGTCCCCAT
SEQ ID NO: 5
TTACATTGAGGCATCAAGGAAAGCATTAGAGACATCATCTAATTCCACATTCAGATTTTT
GGCTGAAGGCAATCCAGCCACCACATTATGTTCAATGCCACATTCATTTGTTCCTCTTCT
GATCTTGAAGTAACCATCCTGTTATCATAGAGATAGACATACATTTAGACAAGAAGCTTA
TACAAATAGAGTTTAACTTTTGTGTATTGATAGTTTTCAGTTTGTTTAATCATTCAGGCTA
AGATGACTTATGCAACTGTCTAAAGTAATTCTAATTTAGTAACTTTAAGAGTGCAGTAAAT
AACTTGCTTTAACTTTTAAGATACACAGATAGTGTAAAAATTATTTACGCTGTCAGTATAG
TTAGCG CATCTACAACAAGG AGTTAG GATGTC GATGGTTTAAG G AAATTTGTG CTCATA
TATGAAAGCATAAGGAGAAAATAGAGTACATACATCACCCCAGCCTCTGTTCCAAGAAT
TAGCAATAAGCTGCAGAAGATAATTAAAAAAAAAGGTGATGATTAGATTTTGATAAAATG
ATATGATTTTAGATAGAAAATAAGACACCACAACTTGAACACATCAACATACCCAATAGT
CCTCTCCCTGCTCACTGGTTCCCCATCCGATAAGCTTAACAGCATGGCCTCCCATACTT TGCCCTGTTACATGCTTGTAAACTCCAGACTTGTAGTGAGCAAAATCCTGTGATTGACA
AAAAAGTTTTAAGTCATTGGGTTAGCGGAAAGCATTGCTAAGAAAAAGAGAAAATAACAA
TTATAATCAACGAGAACAATCAAATTCAAACAAGCATTAGTTTATACTAGAGAAACTGAG
AAACTAACAATTATAACCAGATAACTTCTCATATTGTGCTAAGCATTAAATTATCTCATTT
ACAAGTATTAATTTGCAAGTAAATCTCGGACAACATAAAATGAGAAGGTACATCGGTAGT
TAAAGTTTCTCAATTATCATAATAGTTACTATCAGGTCACTTAAAAGATCATTACATGTAT
CACATTATG GG ATTGG CAGGTAATTGTTTCTATGTACACATAC CACAG GACCAG CAT AT
AAGTTTACCAGCTCAGACAATGTCATGTGTGAAAACTATATCAAATAATTTTAGTATCAG
AACTTAACCCTCCCCCTTATCCTCATTACCTCGTAGACGGTAAAAGAGACCTCGACTGG
TCCATTTTTGTAAATTTCTGTCATGATACTGTTGGGATCATGGTGGATCCTGTATGCATT
GACACCATAATGCTTTGATTTCCCCCATAGTAGAATCTCCTTCACACACTTCCTCTGACA
CTTTGGGGTGGGATATCCTGGTTCACAACCAGGGTGGGAACATCCCTCATTATCAAAGT
AAG G GTCACACTGC CAG AAAAAAC AACTTTATTAGTGATTG ATCATAAAGATCCACG GT
AGCTAATGGTTTTAGAGGAAGCTGTAATCTCTTTTGGTGAAAATAAGTCACCATTTACCT
CTTCTGTGACCACACCCCTACGGATAAAGTATCGCCATGCTGTTATTGGATATCCACCA
TCACAACCACTCCCACATAAAAAGCCACAGCATGCTAACAGATCATTTACAGACAGAGA
GATATTCTGCATTACACATTAGAAGTTTAACATCAGTGACCATAACTACAGAAATAGATT
CACATACGTTTTGTGCTAGGTGAGATGGTTTCCACATGCTTAGACCATGAAAAAGAATC
ATGAGGCTGGCACGTGAGAGCACTTGCTGAAGATATATAATTAAGTACATAAAAATGTG
TCATCTAAAGCTTTTAAATGAGGCGGTCACACAGTTCAACAAAATTTACCAAGTTATGAT
G G ATAC AG AAAC G AT CAG AC AG C G ATTC AAC AG C AC C AAAAG C C C AAC AAG AAC C G C A
ATGTCCCTGATCTGTGACGCAAATTTGTCCCGGTGATTGATGTGCAAAGACGGAAAGCA
TTAGGATCACTATCATAGAATTATAATTCAATAGTAGTAAGCAAGAACAAGAAGAGACTG
ACCCAGAATTCTTCCGATAGTACTACATTGAGGCCAAGCTTTTCGTGCATCAAACTCTTT
TGGTAGCTCCAAAAGTTTTGGATGAGTTAGAATAGGAATTCCCTCCAAATCACCTTCTCT
TGCGGGCTTAACTCCAAGAAGGCGCTTAAATTGCGAAACCTGTGAAAACCAAAGAAAAA
CATAATTATAAATCGACGTTTTCGGTGTATGGCTGCAAAATAAGCAACTGTTATTGATGG
TTAGAGAAAGCGCTAAGCCATATGATGATCTGACCGTGAAATTCGAGAATCGAGGGTTG
AATGCAGCTTTCCACCCAGCTTTGGCATTTTCATTAACCTCTTTAATGATTGATTCCTAC
ATGATTATCCAAAAAGCTCTCTTAGTTTTAAATTTGAAGCAACAAGGGCAATAACATCTT
CTCTAAACATGAAAAGAAGAATATACCTGAAGGATTGCAGATTCAACTTTAGCTTCAGAT
ATTGGCTGCTCTGCAACAACCTGTTTTCCATAAGAACAAAGAGATTCTACTCATCAAAAG
ACATCCTTAAAGCTTTTAGGAAACAGAGCTGCAACTCCAGGAACAAAAGCATGACACAA
TGAGTGACAAACGAAGAACTTCGGGCTCGTTTGGTACGAGGGATAAGGGATAATTAATC
TCGGGATTAAATTTGAGATGAGTTTATCCCATGTTTGATTGTAGTGTTATTTTAATAATTA
TGGGAGGGTGGGATAAACAATCGCGGGATAACTAATTTCGGGATAATTAATCCTGCGAA
CCAAACAATCCCTAAAGGTTTCACTTTAATCAAGATGAAACTCTTCCACAACTTTTATTTT
CAACATTATAATACTATTAGCCTGGAAAATTAATCAAAAGTTTGTAGGAAATTCATCATAT
GTCTAAAGCACTATAACGTAGAGGAAAAAGAATCATAGAACAAGCAGAAATTGTAATTA
GTCCATTATTTCTCCTCCTTCTTCTCCCCTTTCTGTATTTCTTTGTGAAGCAATACTTCCT
CTCATGTTATTATATTTCGACAAGTAAGTTAGCTAACACATAACATAAGTAATTTGCATCA
AACCATATAATTAACTTCAGAAACATGTGTATACTTCTCTTTTCTCATTCTCACTAGGTAA
TGAGAAAATCATTAAAATTTGCTTCTACTCATGATTTCTAGTCAACGCTTAACTAAAGCAT
AAGAAGTCCAAAATACCCAACAATATTTGATCTTTCTGAAGAAACACAAAAAGGCTAATC
CTTGTGTTCATCAAAAGCTATACAAATCAAATCAATACGCTAAATCCACCTAAAACAAAA
TCATCAATTCAATAGGCAAGAACTACCCATAAGACATACTCCTACTGTGAAAGGTTCAAA
GAATG AAG AAAC AAAC CTG C AATATAAG GAT AAAC AAAG C AC C AAAAAG C AAAG G AGTT
G CTAAAG ACTTC AG G GTC AAG G C C AT
SEQ ID NO: 6
ACATTAGTCCTCCATACTTCTTTCTATCTTCTTCTGTCAGTCGCATCTCCCGGCGACTGT
CTCCTCCTCTCCATTTTTCCTTTCTCTTTTTCCTCACCGAGATATTTTCCCTATAAACAAA
ACACCGTAAAAATCATCTCCTCTAATTTCCTATTTTCCCCATTTTTCCAAATGGGTTCTTT
CCTCTGTTTCTCTGTCATTGTTCTTTTCCTTGTTTTTCAGCCATGTTTTTCCAAGAAAGTT
TACATTGTTCACATGAAAAACCACCAAATACCTTCTTCTTTTGCTACACACCATGATTGG
TACAATGCTCAGCTCCAATCCTTGTCCTCTTCTTCAACCTCTGACGAATCATCACTTCTT
TACTCTTACGACACTGCTTATTCTGGCTTTGCTGCTTCTCTTGACCCACATGAAGCTGAA
CTACTCCGTCAATCTGATGATGTTGTTGGAGTTTACGAGGATACTGTTTATACACTTCAT
ACAACAAGGACTCCTGAGTTTCTGGGGCTGAATAATGAGCTCGGTCTTTGGGCTGGTC
ACAGTCCGCAGGAACTCAACAACGCTGCTCAGGATGTTGTTATCGGAGTTCTCGACAC
CGGTGTTTGGCCGGAGTCGAAGAGCTTTAACGATTTCGGTATGCCCAATGTGCCGTCG
AGGTGGAAAGGTGAATGTGAATCGGGTCCTGATTTCGATCCGAAAGTACATTGCAACAA
AAAGTTAATCGGTGCTCGATTTTTCTCCAAAGGTTACCAAATGTCGGCTTCTGGTTCATT
TACGAACCAACCTAGACAGCCGGAGTCACCTCGGGACCAGGACGGTCATGGGACTCA
CACATCCAGTACCGCCGCTGGTGCACCGGTGGCGAACGCTAGCCTTCTCGGTTACGCT
AGCGGGGTCGCGCGTGGTATGGCACCGCGAGCGCGTGTAGCTACGTACAAGGTGTGC
TGGCCTACTGGTTGTTTTGGTTCTGATATTCTAGCTGGTATGGAACGTGCTATTTTAGAT
GGCGTTGATGTACTTTCTTTATCTTTGGGTGGTGGATCGGGTCCTTATTATCATGATACA
ATTGCTATTG GTG CTTTCTCTG CTATGG AAAAAG GAATTGTTGTTTCCTGTTCAG CTG GA
AATAGCGGTCCAGCCAAAGCTTCACTTGCAAATACAGCTCCTTGGATTATGACCGTTGG
TGCTGGTACCATAGATCGTGATTTCCCTGCTTTTGCTACTTTAGGTAACGGGAAAAAGA
TTACCGGAGTTTCGTTGTACAGTGGAAAAGGAATGGGTAAAAAGGTAGTTCCCTTAGTT
TACAGCACAGATAGTAGTGCAAGTCTTTGTTTGCCGGGTTCACTTGACCCGAAAATAGT
CCGTGGGAAAATAGTGTTATGTGATAGAGGGACAAATGCGAGAGTAGAAAAGGGTTTA
GTAGTGAAAGAAGCTGGTGGGGTTGGGATGATATTGGCGAACACGGCGGAGAGCGGC
GAGGAATTGGTGGCGGATAGTCATTTGTTACCGGCGGTAGCTGTAGGGAGGAAATTGG
GTGATTTTATAAGGCAGTATGTGAAGAGTGAGAAGAATCCGGCCGCCGTGCTCAGCTTT
GGTGGGACGGTGGTGAATGTGAAACCGTCGCCGGTGGTGGCTGCGTTTAGTTCAAGA
GGGCCCAATACTGTAACTCCACAGATTTTGAAGCCCGATGTTATTGGGCCTGGAGTTAA
TATTTTGGCTGCTTGGTCTGAAGCTATTGGGCCCACTGGGCTTGAAAAGGATACTAGAA
GAACTAAGTTCAACATCATGTCTGGTAAGTATTACCAACAACGGCTAGTTTTTGTCATAA
TCTTTTTATTTATGCTTAGATTAATTATGGCCTTAATTATATTTTTATTAGATCTTGCAATT
ATTAATACTAATCGTACACACTTGAAAGGAAAAAGAGGAACATGTTTAATTAGTGCGTAG
TG ATCTG G AG CACATG CCTAAAGTTTAG AG G GGTTCACATGTGTTG CATTG ATAAGTTA
ATCCTAAATTACATTAGTTATAATTAAATATTAATGCGCTTCCAAGAAAAAGTTGACTAAA
TTTATCATATATTTCCAAATTTGTTTTGAAAAATATGATTTTGGTGAAGTTTGGCTTGAAG
ATGAAAATGTGTTTGGACATCAATTTTCAAAACATATTTCCCAAATTTATTTTGGAAAAAC
ATGAAACATTTCTTATACCCACAAGTTTAAAAAACTATCACAAATATCCAACGGTACCATT
ATCAATAACATTCATTATATTATCTCAAACCATAATCCTGAATATAAATAAATTTGGCACA
ATATTATCATTTTTATAATTAACTATATGATACACTATTAGATGATCGAGAATACGAAGCA
ACATCGTTTCAAAATAATAAATGAAAAATGGTGGACTCTTTTATATAATACAAAAGTTTGG
AATAATTTTTAAAAAATATAATAATGATATTTTGACCCAAAACCAACATGTAGTCAAAATC
TATGACCAAACATGTGTTTGCCAAATAAAACCCAAATTTATTTTGACAAAATATATGGCC
AAACGGGGCTTAGTTGTATGATGTGTCGTGTGATATTAATAAAAGAACTGCCGAAGCCT
AT AC C AAATTTATG G CTAAAATAG C AAG AAAC GTC C CTTTAAC AG G G AC AG AAG AAATC
CAAG AGG G GG CTCG CTG GTCTAG AG AAAAATTTATTAATTAATTTCAATAATACG CTG AT
GGTGTAAAAAATATTGACACCATCAATATATTGTAATATATCGTAAAAGTTTATTAATTTC
ACTTATTATAGACAATTATTTGAAGTTATCTTTATAGCTAACCCAATAGTGTAAAATTCAT
CCGTTGGAATGTGCAATATGATGTTTGTTTTCAGCTTTTGTGCAGTAGTGATTTTAAATA
GGTATTACTTGGAGCTTTTGTGCGATGTGACAAAGTGCATGTCACAAGGTTAGAGTCAT
AATGAAGGCAAAAACATGATATTTAAGTGAAAAGTGATAGAGGGACGAGTTCATTGTCC G CAC AAAAG CCTGATATTTAAGTTAAAAAAAATAG GTG ACCAG CTTG ATATTTAAGTG AA
AAAGGATAGAAAGACGGGTTCATTATCCACCGAAAGTCGAACCTAACCTTTTGCCATGG
CCTTTCTTGGTCATCAAAATAATTTAGGAGACTACCTAGGAAAAGTAGTATGGTCTGTCT
TCCACTAGTGGGTCATAACCTTAATATCATCCCCCTTGCCCCGTTGAGTACTCTGGACC
TATCTTAATCACTTCATTAGCTGGAAATAAGTTGATGAACTTTTTGAATCATTCTTGAAAA
TTCACAAATTC GAACC GTG G AAACAATCTATTACAG G AATGC AGTCTAAGTCTTCGTACA
ATAGACCCTGTGGTCCGGCCCTTATAGCAGGAGCCTACTGCACTGGGCTGACCTTTTT
CTTTAAAATCTTACAGAGCTCAAAATTTGGACTTTGTACTGTTTCGTTACATTATTTGATC
CTTTTTGTACGTCAAACTCTTTCAGGCACATCCATGTCCTGTCCTCATATCAGTGGCCTA
GCTGCACTTCTGAAAGCAGCGCATCCCGAGTGGAGTCCAAGCGCGATCAAATCTGCAC
TTATGACGACTGCCTATGTTCACGACACCACCAACTCTCCTCTCCGTGACGCTGAAGGT
GGCCAACTCTCCACTCCTTTCGCTCATGGATCAGGTCATGTTGATCCCCACAAGGCACT
TTCCCCGGGTCTCATCTATGATATTACTCCAGAGGACTACATCAAATTCTTATGCTCCTT
GGACTATGAGTTGAACCACATACAAGCCATTGTCAAGCGCCCGAATGTCACTTGTGCTA
AGAAATTTGCAGATCCCGGGCAGATTAACTACCCTTCGTTCTCAGTTTTGTTTGGGAAAT
CAAGGGTTGTTCGTTACACCCGTGCAGTGACCAATGTAGCAGCTGCAGGATCCGTTTAT
GAGGTAGTCGTTGATGCTCCCCCATCCGTTCTGGTGACCGTGAAGCCATCAAAGCTTG
TGTTCAAAAGGGTAGGAGAGAGGCTGCGCTACACCGTTACATTTGTGTCCAACAAGGG
TGTTAACATGATGAGAAAGAGTGCATTTGGTTCCATTTCTTGGAATAATGCTCAAAACCA
AGTTAGGAGTCCAGTCTCATATTCCTGGTCACAACTATTAGAC
SEQ ID NO: 8
ATGAATCCTGAAAAATTCACCCACAAGACTAACGAGGCCCTTGCTGGGGCACACGAGC
TAGCACTATCCGCAGGGCATGCTCAATTTACGCCTCTGCATATGGCTGTGGCCTTAATA
TCTGATCACAATGGTATTTTTCGACAAGCGATTGTCAATGCTGGTGGGAATGAAGAAGT
AGCTAATTCAGTGGAGCGGGTATTGAATCAAGCGATGAAGAAGCTACCTTCTCAGACAC
CGGCTCCAGACGAAATCCCACCTAGCACTTCCCTTATCAAGGTGTTACGACGAGCACA
ATCGTCG CAGAAGTCTTGTG GTG ACAG CCATTTAG CAGTG G ATCAGTTGATTTTAG G AC
TGCTAGAAGACTCTCAAATCGGAGATCTTTTGAAAGAAGCTGGGGTGAGTGCATCAAGA
GTGAAATCAG AG GTAG AG AAACTTAG AGG AAAG GAAG G AAG AAAAGTGG AAAGTG CTT
CAGGGGATACCACATTCCAAGCACTCAAGACTTATGGCCGTGATCTTGTGGAACAAGC
AGGAAAGCTTGATCCCGTGATTGGTAGGGATGAAGAAATTAGAAGAGTCGTTCGGATTT
TATCAAGGAGGACGAAGAACAACCCGGTTCTTATTGGAGAGCCTGGTGTGGGTAAAAC
AGCAGTTGTTG AAG G GCTAG CACAG AG G ATTGTAC GTG GTGATGTTCCAAGTAATTTAG
CTGATGTTAGGCTTATAGCATTGGATATGGGAGCGCTAGTTGCTGGAGCTAAGTACAGA
G GTG AATTTG AAG AG AG G CTG AAGG CTGTG CTGAAAG AAGTTG AAG AAG CAG AAG G GA
AAGTGATACTTTTCATTGACGAGATACATTTAGTCCTTGGTGCTGGTCGGACAGAAGGG
TCTATGGATGCTGCTAATCTGTTTAAGCCAATGCTAGCCAGAGGTCAATTACGGTGTAT
TG GTG CAACTACACTTGAAGAGTACAGG AAGTATGTTG AG AAG G ATG CTG CATTCG AGA
GGCGTTTCCAGCAGGTGTATGTGGCTGAGCCTAGTGTTACTGACACTATTAGTATTCTC
CG CG GGTTG AAG GAGAG GTATG AAG GG CATCATG GTGTTAAAATTCAG G AC AG AG CTC
TTGTGGTGGCTGCCCAGCTCTCATCTCGGTACATTACAGGTATCTATACTTTTGCTATTT
TTACATAGCACCTTGTTTTGATGTCTTTTCTCCGTCAATAACTAAGCATGTATATGCACTA
CTTTTTCCTCGTGCATTTCATTAACTCTATAAATCAGAATGGGACTTAGATTCGGTTAAG
CGAATGAAGGTGAATTTTAACCTAAAATGTTATGGTGTCGGAGCTATAGATGTATATTTG
TCTG GTACTAAAATGACTTCTTG AAG CAGTAG CCAG AATTTTG ATTCATTTAAGCAG GTA
GGGCATGAGACTTAATTAGCATATCATTGTCTGCACTTCCTTCTGGACCTTTACCAGTGT
ATGAGTTGTTTTTGTGTTACAAGCTGCTCCCCATCTGGATAATGGTGGATTAAGACTTAT
ATGATTGTCAGAAGTGTACTAAAACTTCTTGAGGATAATTAAAAATTGCTCAAATCAAAT CCGTAGCTCGTTTTCCACTGTCAGTTTTTGCAAAATGCTTTTTATGTCTGTGTCGTGACA
AATTAAG CAGTCAG CCAGTTAAATTTTG GCAGTTTGG CATGC AAATTGTCTTTG CTG CAC
ATTTCAGGTGCAAAAATCACTAACCTCTTTGTATTTTCAGGTCGACATCTGCCAGATAAG
GCTATTGACCTAGTTGATGAAGCTTGTGCAAATGTTAGAGTTCAGCTTGATAGTCAACCT
GAGGAAATTGACAATCTCGAAAGGAAGAGAATTCAGCTAGAAGTTGAACTTCACGCTCT
CGAGAAGGAAAAAGACAAAGCTAGCAAAGCACGTCTAGTAGAAGTAAGTATTATATACT
ACCAATGCTTTTACTGGTAATTGCTCTATTTTCTAAAAGATATGTTAAGAATTATACTGAC
TCGAATTATACTGACACTGGTCCAGGTGAGGAAAGAACTTGATGATTTGAGAGACAAAC
TCCAGC CCTTGATG ATG AG GTACAAAAAAG AG AAG GAAAG GATAGATGAG CTTCG CAG
G CTCAAG CAAAAG CG CGATG AG CTCATTTATG CTTTAC AAGAAGCTG AAAGG AGATATG
ATCTGG CTAGG G CAG CAG ATCTG AGATATG G GG CAATTCAAGAAGTG G AAACTG CAAT
AGCAAATCTTGAGAGTACCTCAGCTGAGAGTACAATGCTAACAGAGACTGTGGGTCCT
GATCAGATTGCCGAAGTTGTGAGTCGCTGGACTGGTATTCCGGTCTCAAGGCTTGGGC
AGAATGAGAAAGAGAAACTGATTGGTCTTGGCGATAGGTTGCACCAAAGAGTGGTCGG
G CAAG ATCATG CAGTTAGAGCTGTTG CTG AAG CCGTGTTAAG ATCTAG AGCTG GTTTAG
GAAGGCCACAGCAACCAACTGGTTCATTCCTTTTCTTGGGGCCAACTGGTGTTGGAAA
G AC GG AG CTCG CTAAAGCTCTTG CAGAG CAG CTCTTTG ATG ATGATAAACTGATG ATCA
GAATAGACATGTCCGAGTACATGGAACAACACTCTGTTGCCCGGCTGATTGGTGCTCC
ACCAGGGTAAGTTTGAATCTAATTCTTTTCTTTTAATGTCATGTCATATTATTACAGTATT
CAATCACAGATTCTCATGTGTTCCACATCTGCAGTTATGTTG GG CACG ATG AG G GAG GA
CAACTTACTG AAG CTGTTAG GAG G CGG CCTTACAGTGTTGTG CTCTTTG ATG AAGTTGA
GAAAGCCCATCCTACGGTGTTTAATACATTGCTTCAAGTTTTGGATGATGGAAGGTTAA
CAGATGGTCAAGGCCGCACAGTTGATTTCACCAACACCGTGATTATTATGACTTCAAAC
TTGGGAGCAGAGTATCTGTTGTCTGGATTAATGGGAAAATGTACGATGGAGACGGCTC
GTGAAATGGTAATGCAGGAGGTAAATAGTCTCAAACTAGTAACTTCCCCTTTGCTGATA
AAACTGGAAGAATACAGTGAAATAGTTTACCTTATTAGCTAGAATGACAACTGTTTACAT
GTGTGTATGCTTTGTGATAGGTGCGAAAGCAGTTTAAGCCCGAGCTCCTGAATCGGCT
GGATGAGATTGTTGTGTTTGATCCCCTGTCCCACGAGCAGTTGAGGCAAGTATGCCGC
TACCAGATGAAAGACGTTGCACTACGGCTGGCTGAGAGGGGTATTGCATTGGGTGTTA
CTGAGGCAGCTCTAGATGTCATACTCTCAGAGAGTTATGACCCGGTAAGTGTTATATCT
TGTAATCTAGTCCAATATTTTAGGATTATTTTGCGAACTTGTACTTATTGTGGTGATCATG
G CATTCAG GTTT ATG GTG CAAG ACCTATTAG GAG ATG GTTGG AGAGG AAG GTGGTG AC
CGAGCTATCCAAGATGCTTGTGAAGGAGGAGATTGATGAGAACTCAACGGTTTACATAG
ATGCTGGTGTCGGCAGGAAAGATCTAACCTACAGGGTGGAGAAGAATGGAGGTCTTGT
GAATGCTGCCACCGGGCAAAAATCTGATATATTGATTCAGCTTCCTAATGGTCCCAGGA
GTGATGCTGTCCAAGCAGTCAAGAAAATGAGGATTGAAGAAATTGAAGAAGACGAAATG
GAAGAT
SEQ ID NO: 9
TTATGTAAATGCTTCACGTTGCTGTGTAGGTAGCTCCAGTTCAGGCTCAAATGCTGTTA
GCCGAGAAAAATACTGATCTACCATGTCATCTGTTACTTTATCCAAGCTTGGAGGATCC
CACTG CACGTTTTAACATAAATCAAG AAACTCTCTCTAAG GTTAAAGTTG ACTCTTAG GA
AAATTCCTCCAGGAAGGGGCTCATAATTCATAAAAATAGCATATTAGTTCGCAATAATAT
TGATTTACCTTAGGTGCAAAATCTCTATCCACAAGCCGTGCACGAACCCCCTGATAGGA
AGATATTAGTACACTGTGATGATGATCTGTGGAATTTAGTACTGAGACTTTTTTTAGCAG
CATATATTACCTCACAGAAGTCATTAGTTATTTGTCCAGAGAAAGCTTGTACTGACATTC
GATACTCACGAATTAGACACTGGTCCAGAGTCTGATGTCTGCCTTCGCGTATCTGATCA
GAAGCATTATTGGAAGGTCATCTCACAATGCCCTGGTGGCAGAAGGGAAAAGGAGAAC
ATTCTATCTAGAGAAACTTACAGATCTCAGTGAAACCTTCAAGCTCAGTGGGGCTGTTT CTTGTAGTTTTCGCAATGTTGAAACACACCATGCATCTTGCTTCTTGGCTGCCTCACTTT
CCTGGCAACATTTAAATTTCTTTTTAAAAATGTAATCACACCTGTCCTTAAAAACTTCCAC
ACATGTAGCAGACTTTATATGTTTACTTAAATACTTTACTGTCAGTATTTTAAAAACAAGA
GAAAATCCATCTCGCTCGAAAGAGCACCCAGAGGTCGTAGATGCAAACGCAATGACTC
AGCTTGTACTATGGTGCAACAAAGTGCAGCACTTAAAGGCTTCTCCGTGCAATTTATAC
GTGATGTTAGAATCTCCACAATATGCTTTGCCCAAAGAATGATAGATATAAGTTAGTGTG
ACAACAGCCCCAGGACCTATAGGTACTTTTTGAACCTCAAGGAGCAAACAAAGAGTATG
CTGAGCAATGATCATTGCATTGTTTCAGATTTTTCAAATAGTGGCAACTTGTGATGGCCG
AGGTATTAAAAGCAAAGCAATCCGATGCTGAACATTATTGCATTGATCAACTAAAATATG
TTTCCACTGGCACCTAAGGGGGTAGACTGTAGCCAAATAGTCAATGAAGTCGGTGAAAA
ACACGGGAGACTAGGTTTCAAATCTAGCAAAGAGTCGGAGAGAGTCAGAGACAGAAAG
GCTAGGTGATTTCTTCCTATCTGCCTAAAATTTGGTAGATAGCAAGTACCTGTTGGAATA
GTCGAGGTGCGAGCAAATTGGCCCACACCACCGTTATAGAAAAGAATAATGTTTCTACT
GGCCACACTTTGATTGTATGCCCATCCGATCGAGTTTCTTACCAGGTAATAGCCTTCAC
ACCCCAGGCTAGTGAATTAAGATGTGCTATCCTATAAAGACTTACTTCGTCCTCCACTAA
TATGATCGTATCTTGTTACTGGAGTCGAAGATGGTATGTTGGCGAGGCTTCCACCTTCT
GAATACCACATGTTTAATCCCTGTTCTTTCTTTTCTCACTAGAGATATTAAGCAAATTAAA
ATCAATTAACATCAGAATTTTGTATCTTTCAGAGACCTTTTCAGTATATTTGGGAAACAAG
TTACACACTCAGGAACTATATGCTCACCAAAGCATCAATAATTTCTTCGACTGTGTCATG
GCTGAAACATTTATTGAGAGTTTCAATCCTAAAGAAAGAGAAAAGATGGATTAGATTAAG
CTATCAAATACTCATTGATCTGCAAGCAAGATCACACAAACTTGATAAGAATTTATTTTTA
TAAAAACAAAAGGTTAAAGAACAAGACTAAATACAAAAACCAAAAGATCCATGTGAAACT
ATGACAAAAATTATCATGAACCACAAAGTTTTAAGAATGTAGTGTGAAGTTTTAGAGAGC
TGAAGTTTGTTCCACTAGTTGATGTATTTTATATAACTCATGGAAAAGATGGTGCCACTA
CCAGTTTCTCTATAATGGAAAAAACAATTTCGCCACTTACAAATCTAAGTGAAGCTAAAT
ATGAAGAGCTAATTGCAGGATTTTGTGTTTACTGAATTAAACAATTTACCAAAAATTACCA
ATTTGATTCCTCGAAGAAGAAAGTAAGAGATTGATTTGCAATTCTTTCTTTTTGAGTAGTA
TGAAAAATGTTATGAAATGAGGGCAAGTGTCTCATCCAATTAAACCTGTAACTTGCTTTC
CAACAG CAG GTAAAATATTGAAGCAAG G CTG CAG CGTGTCCACTTATG ATTTCAATTAG
CCAAGGGATTCTTTCGATGTTCTATAAGAAACGTGAACGGGATTCCCCGAAGATGTAGG
TAGGGAATCCGCTGACTGGTGTGGTATATATGTTTGGTTAACAACTAGAAAAGGTGTTT
CAAATCCAAAAGCAGCCCTTAACATTAAACGGTAATATGTATCAGTCCACCCCCTTTCAA
ACTGTAG CAG G AACTAATATATTTATG GACATTCCAATTTCCATATTTAG CAAG GATG AC
AGGTACCTGTGAAGTACACTTGTTGGATCTGGATGGACAATCTCTCCACAATTTTCAAG
AGATCTTTCAATCACTGAAGGATCATCAGTCATCAATTTACCAAGTTGTTCCTCAATTAA
GGGAAGCTTCTGAGAGGAGTGGAGAAGGCAGAGAAGAAGTTTTCAGAGCAATAGTCAA
ATAACCTAAGTAGGCTATTGGTACCTAGGTGCAAAGCAAATTAAGCAAAGGTAGTAGAC
TTACTGCACTGTGTAAGTAGTGAGTAGCAAGCCCACAGGATATCATTTCTGCTCCATTG
ATCTTGTCTCCAGTTAGGGCCAGGTACTCTCCTGCAAATTGAAAACATTTAAATAGCTTT
CTTCCATGCTGATTTTTTCTTTCATTTACGCAGTAGCAAATGTCCAAACAGATGAGAGAC
ACAGAGAGAGGCAGCATACTAAAAACATATTCCACTTGAACCACAAACTAGTATAGCAT
AAGATAGATAGGGTTTACATGAGCTGACATGCGTGCATCAGGTTGCTTAACATTTATAT
GCCATAGAATATGAAGTCATCAAAGAGTCAGCCATGTAAATGCAGTGCTACATGATCTA
GCATGAGTTATCAGTTATCTACAAAAAAAAGGCTTGTAACCCACTAATTTTAGTCCGCCA
CCATTTAAAGCTAAATATTAAATAGATAGAACCACATATTAACATCACCGTATGAAGTTAA
ATTTTTAAAAAAAAAGGCAGTCCAGACAAATTCCGAACATAGGTAGAATTCAAAGTCATA
ATCTCCAAAATACATGAAACAGGAGGAGGAACACATGCTTACTTCCCAATGAGTTAGGG
AAAAATAAGATGAAGTAAAAAGAACATCACGTTAGTACAAATATTTTGATAACAAGCATA
TAAGAAGGGAAACGTCCAGATATGACATGCTTATTTTTTCCCCATAGAAGATGCGCATC
TCCCCAGAGAAGAGGAGATCGGGGAAGGGGGTGTGTGTGTGTGTGTGTAGAGAGAGA
GAG AG AG AG AG GGGGCGCGGG G ATC AG AAAAGTAGTTTC C AACTAG ATAATTC ATC AT CAAAGACGATCAGCTACATGACATGATAGAAGTTACTGATGGATCCAAACCACAGCATG
AGATCAGCAAAAGAATATACCCAAATAACCAGGGAGGTGTGAAAGGTAAAATGACGCC
CCAGCATCGGGATGGTAACCAATCAATGTTTCTGGTGTGGCAAAAACCTACATTGATAC
AAGACCCAATACCCTTGTAAGAACGGTGAAAAATCGATCTAACAGCACTTAAAACACCT
AAACCATCACCTCGTGCACCGGCCAAATTTGATAGACCTCTTTGACAATTAAGTTCGATA
AATTTTTTATTTCCTGCACATTCCTTTAGTTCTTGAGTAATTTAAAATCAGGTTATAGCATT
TCAGTAAGACCAAGATATCTTAAATTTTTCAGCACTTTAAACGCTAAGCCTGTGGTTATA
AGTCATATATAACCGAAATAAACAAGCTTCATCACGAAAAGTCAAAAGAATTGATAATAT
ACAATTGAAATTACTATTTAACTTATTGGTACAAAGAAACTTCATCAATATAGAGCATTCA
ATTTAAGAAGATTAAAAGATGAATTTTATCAAATTAACTCGGTACGAACTCATAAATAGAT
AATAATAACTG ATAAG G C C AAAACTATC AATAAAG G AAAAG C AAG AG AAG G G G C AG AG G
AAAAATTTGCTGAATAAGTTTGTAGCACTAGGATTTGAACTTCTCTCTGAAGAGAAATAA
TTAAGATTTAAAGAACAGAAAATTCATGATATGTGAAGTCACTAAGCTGTATATAAGAAT
GAGCACAAATGGAACTTCACATTACTTAGCAATGTAGTCTAGCAGTTCTTGAAGTAGGA
GAATTTATTCTGAACCAACAAATGAAAAGCTTAAACAAATAAAGCATGACTAATCTTTTCC
ATACAGTTTTCTCAGTTGCAACACGGAAAGTTCCAGGAATTGAGATGCCAGCCCCACCA
CCCATGGTAATTCCATTCAAAAGAGCAACCTGCATATGGCATCTGAGGATTAATATCTAT
ACTTGGGAACCGCATCCAGGGACCAGCTAATGTCAACAAAAGTATGACAAGTACTGAAT
ATGCAATGACTGAATTCCTTCAACATGGAATTGAGGAGGTGAGAGCGAGAAGTAACTCA
AATCCGAAAGTGCGCAAAAAGCTCTAATCGATAAAAAGATACAAAGAAATATGGATGAT
AACTACTGATAAATGCACACCGGGTTTCTGAACTAAAAAAAGACGTATATTATGATTAAT
AAAAGAATGAAAAGACCATTATGGCTGCATCTTGTCAAAGCAAAAATGATCACATGATTT
TAACAAAATAAACAACTTCCAAAATTGAGGAGATATTATTTCATGGGCAAAAAGGAACAG
ACAAACTATAGGTCTAAACAGGTGACCACTTTGTTGAGCATCAACAACAGCTTATCCCA
AATCTTTCATTTCAGAAGTCAAAGAACACAGCAGCAAGAGAACTATGGATGAACTAAGA
AAATGGAGTTGAATATCTATGAATAATTGGATCAAATTTGCATTTGTCCTAAGGAAGTTC
TTTAAGCTTATGAATCATAATGTTAATGTGACTGATTATGTTTTCCTCAACAGGCATCGTT
ATGAAGATTGTATAGCAGCCAGCATGATAAAGTAGTGTTGCTACCATTTCAAGTTTATGA
ATAAGATCCAATCAAACTTTGGCAAGACACATCTAACCTCTCTGAATACCATTTTAATTC
AGAAAGAAGTGATTGGTGTAATTAACTGTCGAGATGCTGTCTTTGCAGCAGCTTAGTCA
CAGTAAGATGAGAGAGAATCCAATTAACAGAACAAGCGATCTTCTAATAAATCCAGATAT
TTCTATAAGAGTATCTTTAAAACTGCCAGCATAAAGTACAAAGGTGTTGAAATTTCAATA
AGCAATGGCGAATATAGGATGAAAATGTTGATTCAACATCTAATAGAACTTACATATATT
TTTTAGAAATTCTTAGGTTCCAACTACAGAAGCATATTTTATTATGCGGAAGTCTAGACG
CACAGTTAGATCACAACATAAAAGACATGTAACTACAAAATTTATAAGACGCTGGCCTCT
ATCAAGGTTTAGTAAATACAACAAAAGTTCCTGATACATATATAAGAGGCAGAAAACAGA
AGAAAATTCAAGTCAACAGTTCTTTTAGTATCTGATGTTTGAACAATAATGATACTTACAT
GTGGCTTCAAGAGTGTGCCGACAACGTATACTAAGTTATTTATTGTCCAACAAAAATCTT
TACAGTCTTGAAGATTCCCTGCATCTATCCAATACAAAAAATATTACACAAAGAATGGAG
AAGATCAAATTAAAATTGAGGTAATAGTAAATTGGTAGCATGAAGATGCCGGCTGCAGT
TTTTAAAACGATGTAGAAACGTTTTAATTGTTCCCAAAAATAAATAGATATCAATACCCTA
CTCAATAAAAACAACAGAACATCCAGCTATCAGCACAAATAGCTTAATTAAGATAATTCA
CATCTAAACAACTTCTTCACACCCCAATAATACAATTATCGACAATGTTTAAATATTTAAA
ATTCCTTTAACCATGTATCACTTGCAACATAGCAGAATATGAACAAGTTCGCCGTGAAGT
ACAACAGTCAAAATGACACAACATACCTTGTTTTAACAAATTATAAATAGTGACAATGTC
ACCCCCAGCAGAAAATGCCTTGCCACTTCCCTTCAAAGTTAAAGAATAACTGAGTTGCT
ACTCAGTAGATTAACATATGTAACTTCACCTTAAGATAATCCAGAAGAACAACAATAGGA
AAAGCCATAAAATAATTCAAACAATAATTTACTTAACAAATTACCTTCAATACCACGAATC
CAATATCAGGATCATCTTCCCAATTTTTGTACAGCTTTAGCAACCTATCCACCTAAACAA
CAGTATCATGCAGAGTTTTTATTTATTATACAAGGTGGAAATTAAGATTGCCAACCAACT
GGAATATCAAGATTCCCCAGTGTCATGTAGGGTTCATTCCCAAAGCCAAAACAAAATCA
ATTCAGCATAAAAACACGATTTTGATGGTTGGATATTAATTAAAACACGATTTTACTTGGT
GGGTATCAATAAAAAAAATCCAAAAAGAGAGCAATAGATTTCAAAAATGATCTTCTTGTG
CGTACACACTGTAGATTTCAAAAATGGCAAAAACAGAAGCAAAAAGTAAAGGTCTTTAAA
AGGCAGGAAACTAACAACTGAAAAATTGAGAGCATTTAACGCATGGGGTCTGTTAAGGA
TTGCTGTTCTCGAAGAAGCTTTTCCCTCCACTAACACCTGAAAAATGCAAACTCATATTT
TATAACAAAATGGAAATTTTGATAATGAATCATAAATTGGATTGACAAATTTTCTTTTAAA
AAAAATTCAGAACTCACAGTGCTTTGGGATTCATCAACAAGGGCATTGGTAGAGACACT
GCAAAAGCTTCTGGAGTGAGAAACCAAGCGCGAATTCTGCAGTAAGCGCCTCAAAATA
CTTGCTGATTTGAAGCTCTGCAT
SEQ 10
ATGGCCTTGACTCTGAAGTCTTTAGCAACTCCTTTGCTTTTGGGTGCTTTCTTTATCCTT
GTATTGCAGGTTTGCTTCTTCATTCTTTGAACCTTTCTACAGTAGGAGTATTTCTAATTAT
GGGTAGTCCTTGCCTATTGAATTGATGATTTTGTTTAGGTGGATTTAGCGTATTGGTTTA
ATAGCTTTTGATGATTTTACAATTTATATGGATTACCCTTTTCGTATTTCTTCAGAAAGAT
CAAAGATTATTGGGTATTTAGGACATCCTATGCTTTAGTTAAGCGTTGACTTGAAATCAT
GAGTAGAAGTAAATTTTAATAATTTTCTCATTACCTAGTGAGAATGAGAAAAGAGAAGTA
TAGACATGTTCCTGCAGTTAAATATATGGTTTGATGCAAATTACTTATGTGTTAGTTAACT
TACATGTTTCTATATATAACATGAGAGGAAGTATTGCTTCACAAAGAAATACAGAAAGGG
CAGAAAATGGACGAAAAACAATGGACTAGTTACAGTTTCTGCTTTGCTCTATGATTCTTC
CTCTACGTTATAGTGCTTTAGACATATGATGCATTTGCTACAAATTTTTAATTAATTTTCC
AGGTTAATAGTATTATAATGTTGAAAATAAAAATTGTGGAAGTGTTTCATCTTGATTAAAG
TGAAACCTTTAGTTCTGCGTTTGTGACCCACTGTGTCATACTTTTGTTCCTGGAGTTTCA
ACTCTGTTTCCTAAAAGCTTTTAGCTTGTCTATTGATGAATAGAATTGATGTGTTCTTATG
GAAAGCAGGTTGTTGCAGAGAAGCCAATATCTGAAGCTAAAGTTGAGTCTGCAATCCTT
AAGGTATATTCTTCTGTTCATGATTAAAGAAGATGTTAGTGCCCTTGTTGCTTCAAATTTA
AAACTTAAGAGCGCTTTTTGGATGATCGTGTAGGAATCTATCATCAAAGAGGTTAATGAA
AATGCCAAAGCTGGATGGAAAGCTGCATTCAACCCTCAATTCTCGAATTTCACGGTCAG
ATCATCATATATCTTAGCGCTTTCTCTAACCATCAACAACAGTTGCTTATTTTGTTGCTAT
ACACTGAAAACATGCATTTATAATTATGTCCATCTTTGGTATTCACAGGTTTCACAATTTA
AGCGCCTTCTTGGAGTTAAGCCCGCACGAGAAGGTGATTTGGAGGGAATTCCACTTCT
AACTCATCCTAAACTTTCGGAGCTACCAAAAGAGTTTGATGCACGAAAAGCTTGGCCTC
AATGTAGCACTATCGGAAGAATTCTGGGTCAGTTTCTTCTTGTTCTTGCTTACTACTATT
GAATTATAATTCTATGATAGTGATCCTAATGCTTTCCGTCTTTGCACATCAATCACTGGG
ACAAATTTGCATCACAGATCAGGGACATTGCGGTTCTTGTTGGGCTTTTGGTGCTGTTG
AATCGTTGTCTGATCGTTTCTGTATCCATCACAACTTGGTAAATTCTGTTGAACTGTGTG
ACCACCTCATTTAAAAGCTTTAGATGACGCATTTTTATTTACTTATTTATATATCTTCAGC
ATACTCTCTCATGTGCGAGCCCTGATTCTTTCTCATGGGCCAAGCACGTGGAAACTATC
TTATATTAGCACAAAATGCTTGTGAAGTTTTCACTATAGTTAATGTCACTAATGTTAACTT
TTAATGTGTAATGCAGAATATCTCTCTGTCTGTAAATGATCTGCTAGCATGCTGTGGCTT
TTTATGTGGATCCGGTTGTGATGGTGGATATCCTATATCAGCATGGCGATACTTTATCC
GTAGGGGTGTGGTCACAGAAGAGGTAAATGTTGTCTTATTTTCACCTCAAAAGAGATTA
CAGCTTTCAGTAAAACCATTAGTTACCGTGGATCTTTATGATCAATCACTAATAAAGTTG
TTTTTATTCTTGCAGTGTGACCCTTACTTTGATAATGAGGGATGTTCGCACCCGGGTTGT
GAACCAGGATATCCCACCCCAAAGTGCCAGAGGAAGTGTGTGAAGGAGAACCTACTAT
GGGGGAAATCAAAGCATTATGGTGTCAATGCATACAGAATCCACCGTGATCCCTACAGT
ATCATGACAGAAATTTACAAAAATGGACCAGTTGAGGTCTCGTTTACAGTGTACGAGGT
AATGACGATAAGGAAGAATGTTAAGTTCTGATCCTAAAACTATTTGATACAGCTTTCCGT
ACATGACATTATCTGAGCTGGTAACCTTATATGTGGTTGCCTACCTATCCCAAAATGAGA
TACATGTAATTATTTTTAGGTGACCTATAGTGTAACTGTTATGATAATTGAGAAACTTTAA
CTACCGATGTACCTTCCCAATTTATGTTTGCCCGAGATTTACTTGCAAACTAATATCTGT
AAATGAGATATTTAATGCTAACCACAAGACAATATCAGAAGTTACCTGTTGTCGTAAAAC
TGCATCATCTCTTTCTCGGTGCAAGTAGATTTGTTTAGATTTTGTTTGTTGTCTTTGATCA
TAACTGTTATCATCTCTTTTTCTCAGCAATGCTTTCCTCTAACCAATGAGTCAATTTTTTT
TATTTTTTTTTTGTCAATCACAGGATTTTGCTCACTACAAGTCAGGAGTTTACAAGCACG
TAACAGGTCAAAGTATGGGAGGCCATGCTGTTAAGCTTATCGGATGGGGAACTAGTGA
ACAGGGAGAGGACTATTGGGTATGTAGATGTGTTCAAGTTCTGGTGTCCTGTTTTCTAT
TTAAAAGCATATCTTTTTGTCAAAATCTAATCACCTTATATATCATCTGCAGCTTATCGCA
AATTCTTGGAACAGAGGCTGGGGTGATGTATGTCCTTAAATTCATCCCTATGTTTTCATA
TATGAGCAAAAAGTCCTTAGACATAGGCATGCTAGCTTCTTGTTGTTGATGCACTAACTG
GCACATCAATAAATGGATTTCAACTTATATAAACTAACAACGTAAACAATTTTTGCACTAT
ATTTCAACTGGTAAAGTTATCTCTGTGTGACCTATTGGTCACGGGTTCGAGCCGTGGAA
GCAGCCACTAATGCTTGCATTTGGGTAGGCTGTCTACATCATACCCCTTGGGGCTACG
GCCCTTCCCAGGACCCTGCGTGAACGCGGGATGCCTTGTGCACCAGACTGCCCCTTTT
ATATTTTAACCAGTTAAGGCAAGTTATTTACTGCATTTTTTGAAGTTACTCATTTAGGATT
ACTATAGAGAGTTACATGCCGTCGTATGTCATTTAACCTAATGATGCAAATAAATTGTAT
ACTATTTTAATGCACAGAAGTTAAAGTAGCTTCTTCTCTAAATGAATGTATATCTCCAATA
TGACAGGATGGTTACTTCAAGATCAGAAGAGGAACAAATGAGTGTGGCATTGAACATAA
TGTGGTGGCTGGATTGCCTTCTGCAAAAAATCTGAATGTGGAACTTGATGATGTATCTG
ATGCTTTCCTTGATGCCTCAATG
SEQ 11
CTATACCATCATACCCATGTTGGAATGTGCCACTCTGACAACAAGTGGAGATGTTACCG
ATGTTCTTTTGTTCCTCCATGTCAAAGAACCAAATACATATCCCTGTGTTGGTGCAGCCA
CCTTGAAGTTCACTGTAAAATTCATCTTCTGGTAATATCTAGTGAAGGCTAATCTTCGGG
GAACCACAGTGACATTGACACCCGTAGGTGCATAGACAACTGCCTTGTAAATGCTTCTT
GCTTTTCCCACGTTAGTAACAGTTCGAGTTACTGAATATGTGCTTCTGAGGTTTGGTATT
GTGATGGAGGGATAATTTAGTCCATTTGGTGATGCAAAGGTTTGATCACAGGTGCTATT
GTCCCTTGTAATCAGATGCAGAGATTTCTCATCATAACCAATTGAACAAAGAAATGCTCT
GTAATCTGCTGGCTGTGCATCGTATATAAGACCAGGATCCAGGACATTCGTAGGGTTAA
CAAAGCCAGAACCAAAATCAAATGGAGTAGCTCTCTTCCCTTCAGGATCTACTATTATG
GGTTTGTGATGCTTATCTGACAGTTTAGCTGCATTAATATTGATGAGATTAGTGCATTGA
AGGCTTGAATGAAAGAGTTAGATTATGTAAAAGCTTTTATTCTACCTGTCGTCATGATCG
CGGATTTAATTGCAGAGGGAGACCAAGATGGATGCACAGCTTTTAACAAGGCAACAACT
CCTGTTATGTGGGGGCAAGCCATAGAAGTTCCGGATAGTACATTGAAGTTCAACTTAGT
AGAAGCTGCTGGAGACCATGCTGCCAGGATATTTAATCCAGGAGCTGCAATATCAGGC
TACAAAGGCAAATCAGCCAGGAAATTACTTGCAAGAAAAAGCCAAATCCTCAAATAAGG
TAAGAACCAAAGAAAATGCAAAGTAACAAAGAACTAATCCACAACCACATTGACGATCC
AGGAAGTAAATATCAGGATGCAACATAAAACTTTGTTGGTCTACATGACAAAGGCAGAG
AGAGATCATTGTTGAAAACAAGTGGCAGTTGAAATTAAGTCCCTATATTACTATTTTTAG
CGCACAAATTACCTTCAAAATTTCTGGTGTTACAGAATTAGGACCTCTTGAAGAAAATGC
TGCTACTCGAGGAGCAGGTTGAGCTCCCAAAACGGTTCTAGCAGAGAGAATCCTTGCC
ATGGGGAGGCTGTTAATATAATAAGATGTGAGATTGGGAAGAGATGATAGTTAACAACA
TCAAAAAGTCAAGAAGAAGGAAGTGAACCGTGTATTGTTAATGTAAGCTAGGATCTTGT
TTCCAATCTTTTTCCCAACAGTTGCTGCAGGAATGACAAAAGGGATGGCCACACCCTTG
TCTGCGTCATCTATAAGGATCATCCCAACTCCACCGGCTTCTTTAACTATAATGCTTTTC
TCCATCTTTGACTCACTTGAGCTTCCAGCATGTAGGCACACAAGCACCTTCCCTTTGGC
CTTAGTTCTATTCAAAGAACTATCTAAGCAATAACTAGTGGAAAAGAGAGGGAAGGAAA
AAAGTAAATTAAGATAATTGTCAATGCATACATATCTACATTTAACAAACAGCGAAAGTA
CCTTATGATTCACCTGGATTGATAGGGAGTGAAGTATCCAGCATAAGCTTCAGAAGCAG
GTATGATTCTTGTAGATGTATTCATTTGAGATAAGCTAAGACTTTCACCCTTCCAAATATC
CACAACTTAATTAGAAAATAGAAATTGAATAATAATAACATGTAAACTTGTCGAAAACTG
GGATTACCTTGAGCCGAACTCCATTTCCTAGTAAAATATCAGAAGTAAAATCTCTATCAG
TTGAACTGGCTGCAACTGTGATCATCCAAGGAGCTAAATTTGTGGCTGAACCAGTGCTG
CCTTCATTTCCAACTGAAGCCACCACAAGTATTCCGCGGCTAACAGCATGATATGACCC
CACAGAAATGGCATCATTGAAATAATCTCCTTGGGGAGCATCAGGGCCCAAAGATAGA
GAAATGACATGAACCCCATCTCTAATTGCATCATCAAATGCAGCCAATAAATCAACATCA
TAGCAACCAGAACTCCAGCAGGTTTTATACACTGCTATCCTGGCCATTGGGGCACCACC
TCTGGCTCCTCCATTTGCCAAACCTTTGTAATTCATATTAGCTACGTAACGCCCCGCTG
CTGTTGAAGCTGTGTGACTCCCATGACCAGAACTGTCCCTAGCAGACTTGTAAAACATG
GTCTTCCCATTTTCTTCTTCAGCTTCATAGCCACTCATATAATATCTTGCCCCAATTATTT
TCCTACATAAGACAAATCATATTGCACTTATCATCTAAATAACAAAAGAAGAGATGGTTG
CCAATCAAAAGAAAACCTGTTGCATATAGAGGCATTGAATGCTTCTCCTGATTGGCATT
GTCCTTTCCATCCAGCTGGCACTGGAGGCATGTTGGTATCACTAAAACTTGGAGACTCA
GGCCAAATTCCTGTTTCATTAAATATCTTAAGAGCTTAACCTCAAGTTCTAATTAGCTCG
AAAAACAAGGGAAACAGTGGAGCTGGGGACCAGGTTGAGGATAAACTGATAAGGTTGT
GAACAGAGATAATATTTGCATTGAAACATGTCACTCTAATGTATAGTGGCTCTTCCCATA
AAGTAACATTTACAAGTAGTTCAAGCACACTGTTAGGCATAAATGCAAATGGCAAAATAT
GGGAAGAGGTGAGAAAGATGAACTGGGAAGCTAAGAAATTGCATAAACTTGAGTTTTAA
AAAAATCTAAGCAAACATATTTCATCATTCGAAAATGATTGAAACAAGAATTGATTGATGA
AAGGAACTACTTTCCTCAGGTTCAGCCATATGTACCCAAATGACAAACTTAGCACTTTTG
CAAAGTCATGTTATTGTACTCTTCTTAAGAAAACTAACAGAGACAAGAGCCCTTTTAAGT
GACAATACATTAAATGAAGGGACCAACACTAACTTGGTTGGTGCATTCCATCATTAAAC
GATCATATCTTTCACCTAACTCGGAAAAGATTGCTAGAATTTAAGATAATTAAAGCAAAA
GGAACAGAGAAACCACCTGTATCAATGAAACCAATGATTACATTAATTTGGTTCTTGGTA
GAAAAACCTGGAATTTCCATTGTTTCATCATCACTGAGCCCCATAAAATCCCATGAATGA
GTTGTGTGTAGGCTCCTCTTAGTATTTGGAAACACGGATACCACTCCAGGCATTTCTGT
TTTATTTTAACATTAAAGACAAATTTCTCAGTACTTATTCATATCATTACCTTAAAGAAAAA
CATTTGGCATGGGCTTACTGGATATTTCAGAAGCCTGTGCCTCAGTCAACTTGGCTGCA
AAGCCTTTAAAACCATGCCTATAACTATATACATGTGAAGTCTTGGCTTGTTCAATGCTG
TAAAGTTACAACTTCAGTTTTTTTGCTAAAAGCAACCAGTGTAAAACCCAATGAACCAGC
TCAAAAAAGGGAAAAACCCTCATAGCTAAAGGTAACAAGAACTGACCTTCCTTTATGAAT
AGCAGTCAGCATTTGATGGTTTTGCCTCAAAATCTCATCTGGGTGTTCATCACTATCTTT
GCTTCCCATGTACACCACATATAACTGGAAAAAAAAAAAAACCAAGAACACAACTTTACT
AACTATTCATCATTTCAAAACATCATTCAAATATACACCCTCAATAACAGCACATAAAAAC
CCATATCAACATACAGACTACAGAGCCAAAGTTTATTTACCTTGGAAGAAAAGCAGAGG
CTAATATCTCCAAGAAAAACACAAAGAAAGAGTAAAAGAAGAGTCTTTTTTAGAACACCC
AT
SEQ 12
ATGGGAGCAAAAGCATTTCTTGTTGCTATGTTTCTCTCAGCACTGTTATTTCCTTTTGCC
TCCTCATCCAATGATGGCTTGATGAGAATTGGCTTGAAAAAAATGAAATTTGATCAAAAT
AATCGGCTTGCTGCACGCATTGAGTCAAAGGAAGGGGATGTTTTGAGGGGGTCGATTA
GGAAGTATAACTTCCGTGGTAAACTGGGGGACTTTGAGGATACAGACATTGTAGCATTG
AAGAACTATATGGATGCTCAATACTTTGGGGAGATTGGTGTAGGCACTCCACCTCAGAA
GTTCACTGTAATCTTTGACACAGGTAGCTCGAATTTGTGGGTGCCGTCGTCGAAGTGCT
ATTTCTCTGTAAGCTTCTATACATGCAAATGATACAAAGGATAGCATTGAACATCCATCT
TGAGTGATGTAAAATTTGATGACTGCCTATCTTGGTGTAGGTTCCCTGTTTCTTTCATTC
CAAGTACAAATCAAGTGAATCAAGTACTTATAAGAAGAATGGTATGTTATGTTTCCATTTT
TGTATATTGCTTCTCTCTACCATCTGGTTGTTTATTGCTGCATGAAACATATATATGCTTC
CTTCTAGTGCCGGTTATTGTTTAGAATATGTGCCATCTTGTTCATTTTAGAACACTTTTTG
TATTGTCCTTATGTGTTTCTCACGGTGCATCAAGGGATTACATTGGAAAAGTTAAATGAT
GAGTACATGTAGTTGACTGTTGAGACATAAAAAGAGGCTGTTTATGTTATGTTTCTTAGT
ATATTACTGTAACTAGTGAGGTTCCAGAAAAGAAACACCAATATCCTTATCTCTCTCTGC
GATTAGTGCTTTTTGGTTGCGAGTTGTATAGTTTTAACCTCTGCAATGCCTCTTTAGGTG
GCCTTTCTTCTTCCCTATTAACTAGGTTTTGTATCCATGTCTTGGCATGTCTGTGACATA
AGAAATTTTCCGCAGAAAACTAGTTATTCTTGATTTTTTTGTTCTATTACAATTATATATGT
TGCTCAAAGAAAATGAATTGAACAGTCTGTAGCAATTGAATGGGTTTGCCAGATGAGTT
CACTCCAGTGTTATTGAAGGCCATTGCTTGGTCGTTTAAAGTTATGAAGCGATACAGCA
AGGGCTATCGCTTCATCGCTGAAGCCATGCACTTTAGAGAAGTTTGCACTTCAAATAAT
GGTGCACAAAGTGATTCCAACGAGAATTTATTGTTTCATTGAACATCAACTTTTAGAAGT
TGGATTAATGTTGGCATTTTCACAAAGTGATTCCTGTTGGGTATAAAAATAATATTCACG
GTATTAGTGATAAACGCGGAACACTAAGTTATGCTTAAATCAGTAAGAATAAAAATGCAG
CAAAAATGACACCAAGATTTTACCTAGAAACCCTTCTGAATAAGGGAAAAACCACGGCC
AAGAAGAGCAACTGATATCACTATAGCGAGGATTTTACACTGTGTAGTAACGAGTACGA
ATACTCCTAAGACCACTACACCCTCAAAAGAAATAAACACTCTTTTGCTTTTTCACCTCA
CTACAATATCTCTCACACTCTATTTTCTTTACAAACTATTTTCTTATAGTTTATGGAATACC
TTGCTCTCTCTTTTTTCTCTCTTTGTTGGTGTGTAGAAATGAGAGTTAAAGCTCTCCTTTT
ATAGCCAAAACCTCACTCTCTAAAGCCTACAATATTTGACAATTTGCACTCCCTTTTCAC
AATTTCAACAAGGTTGGCTACCAAACCAAACCAAATCAATAAAATTGTCTACCAAATCAT
ACCAATTCAACAAGGTTGGCTACCAAACCAAACCAAGTCAACAAAATTGGCTACCAAAT
CATTTGAATGAGATGGAAACCATATCAATCTCCCCCTCCAGTCTCATTCATCTAGAGGA
GGTAGCACTGTCTTCTAGTCTGAGTGCATGCCGACAAGTTCTTTGCATAGCTCGAACTT
GTCTCTTGGTACCACCTTGGTCAACATATCAGCAGGATTTTCACTTGTGTGAATCTTTTT
GACCTGAAGTGATTCGTTCTCCACTTGCTCACGAATCCAATGATATCTGACGTCGATGT
GTTTTGTTCTCGCATGGTACATGGAGTTTTTGCTTAGGTCTATTGCACTCTGACTGTCGC
AATAGACAACATACTCCATCTGTTGCAATCCAAGTTCTTGAAGAAATCTCTTGAGCCATA
TCATCTCTTTGCCAGCTTCAGTAGCCGCAATATACTCTGCTTCAGTTGTAGATAGTGCG
CACTTCTGCAACTTCGACTGCTATGATATAGCTCCCCTTGAAAAAGTAAACAAATATCCG
GTAGTGGATTTTCTGTTATCAAGGTCACCTGCCATATCAAAATCTGTATAACCTTTCAAA
ATTGGATTTGATCCTCCAAAACATAAGCATTCACGAGAGCTTCCTCTTAGATACCTGAGT
ATCCACTTTACAACTTCCCAATGCTCTTTTCCTGGATTTTCGAGAAATTTGCTAATAACAC
CGACTACATGAGCAATATCTGGTCGATTGCATACCATTGCATACATCAAACTTCCGACA
GCAGAAGAATAAGGAATCTTGGCCATTCTCTCTTTTTCCTCCGTTGTTGTAGGACACATC
TTCTTACTCAACTTCAAATGACCAGGAAGGGGTGTGCGAACCAACTTAGCACTTTTCAT
ATTGAAGCGCTCCACTACACGTTCTATGTACTTCTCCGGTGACAAGTAAAGCTTTCTTTC
GTTTCTCGAACGAGTAATTCTCATGCCCAAAATCTGCTTAGCATGACCCAAGTCTTTTAT
TGCAAAAGACTTATTCAACTGTTTCTTCAACTCGTCAATCTTGGATGCATTCCTGCCCAC
AATCAACATACCATCCACATATAGCAAGAGGATGATAAAATCATCATCAGAAAATCTTTG
TACAAATACACAGTAATCTGAAGAAGTCTTCTTGTAGCCTTGCTCCCCCATAACAGACTC
AAACTTCTTGTACCACTGTCTGGGAGCTTGCTTCAATCCATATAGACTCTTCTTAAGTTT
GCATACAAGATTTTCTTTACCTTTTGCATTGAAGCCTTCAGGTTGTTCCATATAAATCTCC
TCTTCTAAGTCACCGTGAAGAAAAACAGTCTTCACATCCATCTGCTCAATCTCCAAATCA
AGATTGGCAGTTAAACCAAGAACTGTCGGAATGGAGGACATTTTCACGATAGGAGAAAA
TATTTCGTCAAAGTCAATACCTTTCCTTTGACCAAATCCCTTGACAACCAATCTAGCTTT
GTATCTGGGCTTCAAACTATGTTCTTCAGCTTTAACTTTGAACACCCACTTGTTCTTCAA
AGCTCTCATGCCCTTAGGCAATTTCACCAACTCATAAGTATGGTTCTCATGCAGAGATTT
CATCTCATCTTGCATGGCTTCAATCCATTGATCCTTGTGCTCATCTTCTATGGCCTCCGC
ATAACATTCAGGTTCTCCCCCATCAGTGAGTAATACATATTCATTGGGTGAATAACGGG
AGGAAGGAGTACGAGGTCTAGAAGACCTCCTGAGTGGAATATCTAACTCGTCCACAGC
TTCGTGAGTAGGAGCATCTACCTCATCAACATTAGCATTGTTATCACCATCACCATCAAT
ATGCTGATCCAGAATATGGTTCTGGGCATCACCATCATCATTGAGCCCACCAACGTCAT
CCACATTTGTATGAGGAACTTGATCAAGATTAACTAAACCTTCAGAACTTGAAGATTTTA
GTTTCTCCGCTTTGTTAATATCTTCAATGGTTTGATCCTCCACGAAGATAACATCACGGC
TTCTCACGACCTTCTTCTCAATTGGATCATATAACTTGTAACCAAACTCATCAAGGTCAT
AACCAATGAAGATGCATTGCCTTGTCTTGGCAGTTAATTTTGACCTCTCATCTTTAGGCA
CATGTACAAAAGCTTTGCAACCAAACACTTTCAAGTGGTCATAGGAAATATCCTTGCCAT
ACCAAACTTTGTTTGGAACATCACTTTGCAAAGCAACCACAGGGGAAAGATTAATAACAT
GTGCGGCGGTCAACAAAGCCTCACCCTAAGAGGAATTCGGCAACTTTGCTTCAGAAAA
CAAACATCTGACTCTTTCCATCAAGGTCCTATTCATCTTTTCTGCTAAACTATTAAGCTGA
GGAGTCTTAGGAGGAGTCTTCTGGTGTCTGATACCCTGTTGTTTGCAGTATTCGTCAAA
CAGTCCACAATATTCACCACCGTTATCAGTACGAATACACTTCAGCTTCTTTCCAGTTTC
TCTTTTAGCTGAAGCCTAGAACTGCTTAAAGACACCCAACACTTGGTCTTTAGTCTTCAA
GATGTAGACCCAAAGTTTCCTTGAGCAATCATCAATAAAGGTAGCAAAATAAAGTGCAC
CACCCAAAGTCCTTGTCTTCATTGGACCACATACGTCTGAATGCACCAACTCAAGCAAC
TCGTCTTTCTTGAAAGAAGATGAGACTGGAAAGAAACTCTTTTTTGTTTTCCAGCCAAGA
AGTGCTCACATTTTTCTAATTTTGCACTTTCAAAATTTGACAACAATTTCTTCTTGGCTAG
AACATTTAGTCCTTTCTCGCTAATGTGGCTAAACCTCTTATGCCATAACGTTGAAGAGTT
ATGGCTCTCAACGGCATTCACCATATCAACACAGGTAGAGGTCGTAGTCCAATATAGAC
CACGACGCTTTTCCCCACGAGCCATAATCATGGAGCCCTTAGTGAGCTTCCACTTTCTA
GCACCATTGGTACTGACATATCCCTCATCATCCAAAACACCAACAGAGATCAAGTGCAA
ACGAACATCAGGTGCGTGCTTTACATTGTTTAAAACTAGTTTAGTTCCAATACTAGTTTC
CAAACAAATCATTCCAACACCAGTCACCCTAGATAAGTTCCAAAGTCACCCTGAGTATA
GGATGAGAAAATCCTTCCTTGATGTCACATGAGATGCGGCACCACTATCCACAACCCAG
CTTGACTCATCACAAGCAATATTTATCAAATCCGCATCAAGGACAATAACAAGATCTTCT
GTAGTGACGGTGGCCATACGATTGCCATCTTCTTTCTGTTCTTCCTTGTCTCTATTCTCC
TTTTTCAAAATCCGCAGAACTTCTTTGTGTGCCCTTTCTTCCCGCAATGATAACACTTAA
TATCTTTAAGTCTGCTTCTGGATTTGCTTCTATTATGTTCTCTATTTTGAGAACCCCGATT
CTTGCTTCTCCCCCTAGAGTCAGTCACCAAGACATCTGATGGGGAGGAACCTTGAGATT
TTCTTCTCATCTCTTCATTTAAAAGACTGCTTTTGGCAAGATCCATAGAGATCACACCAT
CCGGAGCAGAATTTGATAATGAAGTTCTAAGAATTTCCCAAGAACTTGGTAGGGAACCA
AGTAGAAACAGGCCTTGAATTTCTTCATCAAATTTAATGCTCATAGCAGATAACTGGTTC
ATGATCCCCTGAAAATTATTCAGATGATCTGTCATCGCAGAACCATCATGGTATTTTAAA
CCCAACATCTGCTTTATCAGAAACATCTTGTTGTTTCCAGTTTTCCGAGCATACAAACTT
TCAAGGTGCTCCCATAGGGTCCGAGCATGTGTCTCCCCAGAAATATGGTTCAAAACATT
ATCGTCAACTCACTGTCTAATAAAGCCGCAAACCTGCCTGTGTAACAGATTCCACTCTT
CATCTGATTTATTATCAGGCTTTACAGTGGCGAAGACAGGTTGATGAAAATTCTTGACAT
AGAGCAAATCTTCCATTTTGCCCTTCCAAATGGCATAATTTGTGCCATTCAAAGTAACCA
TTCTACTAGTGTTGGCTTTCATCGTTTATCACAAATACAAATACTATTTATTATGAGACCA
AAGTAATTCTTTTCTGATGTGGAAGTTCAGACTGTGCTGCAACCACAGAGCATACTCAN
NNNNTATTTATTATGAGACCAAAGTAATTCTTTTCTGATGTGGAAGTTCAGACTGTGCTG
CAACCACAAAGCATACTCAAACAGAACCTTGGCTCTGATACCACTTGTTGGGAATAAAC
CCCGTAAAAATAATATTCACGATATTAGTGATAAACGCGGAACACTAAGTTATGGTTAAA
TCAGTAAGAATAAAAATGCAGCAAAAATGACACCAAGATTTTACGTGGAAACCCTTCTGA
ATAAGGGAAAAACTACGGCCAAGAAGAGCAACTGATATCACTATAGCAAGGATTTTACA
CTGTGTAGTAACGAGTACGAATACTCCTAAGACCACTACACTCTCAAAAGAAATAAACA
CTCTTTTGCTTTTTCACCTCACTACAATATCTCTCACACTCTATTTTTCTTCACAAACTATT
TTCTTATAGTTTATGGAATATCTTGCTCTTTTTTCTCTCTTTGTTGGTGTGTAGAAATGAG
AGTTAAAGCTCTCCTTTTATAGCCAAAACCTCACTCTCTAAAGCCTACAATATTTGACAA
TTTGCACATCCTTTTCACAATTTCAACAAGGTTGGCTACCAAACCAAACCAAGTCAATAA
AATTGGCTACCAAATCATACCAATTAGTAATGGTGCACAAAGTGATTCCAACGAGAATTT
ATTGTTTCATTGAACCTCAACTTTTAGAAGTTGGATTAATGTTGGCATTTTCACAAAGTGA
TTCCAACTAGAATTGGTTGCTTCTTTGAACCTCAATTTTGAGGAGTTGCATTAATACGGG
GATTATTGTATATTGGATGCTGAAATTAGTTATTTCAACTGCAATTTGATTTTTATTGTAG
AGTAAATTAATAAATGTTTGATATTTTCTTGTTTTCTGTTAATTGTGCGCCTCACTTCTCG
CTATTCACTGCAAGTCTGTGGACCTTGTTTTATTTTGTTGCACGTTTTGGTTTTAAGAAC
ACTAGGTCACTCCTACCTAGGGGTGTCAATGGATATTAGAAAACCGACTTAACCGACCG
AACCGTACCGTACCGAACCGATTTTTAGGTTTCTTTTAAAGAAACCGTAGGTTTTTATAT
AAATCTATAATCGTACCGATAATTAGGGTAGGTTTTTTATTTTATAAAAATAAACCGAAAA
AATACCGAACCGTACCGAATAAGTTTTACATATGAAAAATATATTCATATAGTAAGTTTAA
AACTAGTAAAGTATTAAATTTTTCATTGGGTCTTGGAATTATGAAAACTGTTACAAGCCAA
TAAGTAATTAAACTCAAAATACTAATTCCTAAAACN N NNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNN CTTTTATATAATTTAGATTTATCTTTTTAAATATTTAATATAGACTTTA
TTCTTGAGTCCCAACTTGGTTAATATCTTTCCACTCGTGTGATTTATATCTTCTTTGCTTT
TACTTGGTTTCTTTTCGTTGGTGTCGAATAGTTGTGTATTTATACTCTAGCCATCTTTCAT
GTTTTTTAATTCATTATCCTTTAAACAGTAAAAATGTCTAGAGAGTTTCGCTAAGTCCTAT
AAAAGAACGTACGTTATTGCATTCTATTTTTACTGGTGAATTTTATATGACATTTAAAAAA
TACCGAAAATTAACCGAACCGTACCGATACCGAAGTGAAACCGACATGATTGGGACGG
TTTCGAAAAGTCTAGTTTTGGTTATACATAATAAAATAACCGAAAAATTGGTATGGTATAA
ATTTTATAAAATAAGCCGAACCGAACCATTGACACCCCTACTCCTACCTTAATGAACAAC
ATTTATAACTTATCGAATGTTTCCACAAGACAGAGATCGTATTTGGCATAGTAATTTTAAT
GTGATTTCATTCGTGATAAATTTTTGATTTTCCATATAGGGAAGTCTGCTGCAATTCAGT
ATGGTAGTGGAGCTATTTCTGGATTCTTCAGTCAAGATAACGTCAAAGTTGGTGACCTT
GTTGTAACAGATCAGGTGAGTGAGGCTCCTCACTTGTTTTAAGTGTCTTGAAGGTAGAA
ATCTTACATCTCAAGGTGTTCATGTACTAGGAATTTATTGAGGCAACCAGAGAACCCAG
CGTGACATTTTTGGTAGCCAAGTTTGACGGTATATTGGGTCTTGGATTCCAGGAGATTT
CAGTTGGAAATGCTGTTCCAGTGTGGTACGTGGACAGCATTTAGTTTGCTCTCTTTCTTT
CCCACAAACCAAATTAAAGATCTAGACAATTCTTTTTTCCAGATGAGGTTTAACACATTG
AAGGATTAGATTTCCATAAATGCAGGATAGGCTGGATGTCTCTTTCTGTTTAAATTTGAT
TTGGCATTCATCTGGGCAGCATCTCCTGTCTTTTCCAGGTACAACATGGTCAAACAGGG
TCTTATCAAGGATCCTGTCTTCTCGTTTTGGCTCAACCGAAATACAGAGGAAGAACAAG
GCGGAGAAATCGTATTTGGTGGGGTTGATCCTAATCACTATAAGGGAGAAATAACTTAT
GTTCCAGTCACACAGAAAGGTTATTGGCAGGTTTGTCATTCCGTCAATTCGTTACTATGT
ATGTTCATGTTTTGTACAAATGCTATCTTAATCTTAGTAATATGATTGGCAGTTTGATATG
GGTGATGTTCTTATCGATGGTAAAGCTACTGGTATGTTTTGCTCTGTACCTTTTGAATTG
GATTGCTGAATTTTGCGAATATAGATGAGGGCTATGTGCCTGGATATAGTCTTCCTTTGA
GTTTTTAACATGAACATATCGGGTGTAGGTTACTGTGAAAGTGGGTGCTCTGCAATAGC
GGATTCAGGGACTTCTCTCTTGGCTGGTCCAACGGTATTATCTATAACCTTGATTTTGGA
CATATCGCTTTTTTGTTATGTTTTTGTTATTTGTTTCCACCATCAAAGTCAACCGCAATTC
GATTATTGATTTCCTTGTTCCTTTGAACCTTCAACAGGTTGCTCAGCAAGAAAAGTAACA
TTTTCCTAAGATTCATGAGTTAAATAGAACGAATCCAGTGTCTGATCTGTGTTGTGTTTT
AATTTTTCATCACAAAGAGGATGTTAGTGAAATTGGTTTTTCATTTCTGAGGTTACCTTTG
GATATACCATCTTAGCTTATCACTGGGTCGATGTCTGTAAATCTGCTTTCTCTTTCCTGT
CTTTCAGCTTAATATGTGTAAACCTGCCTCAGTAGGTACTTGTTGTTTTGTGTTACTTGC
TTTTATTACGCTTCAATGAGGATGATGCTTTTACTTCTGCTTTAGTCCTAGTGTTCTGTC
GACTTATTTATTTACCCCTCTTCTATTTGAAAGAAAATTATCCAACCCGAAAACTGTAGC
CAGGTTTTATCCTGTTGACTTGGAAATGCTGACAATTAAATGAAATAAAAATCTGCTGCC
TTTTGTCTCTACAAATTCAGGGGCGAACGTTTACCCAAACGGTGTCATCCGACACCGCT
TGGTCGAAATTTTTTACTGTATAGACATATATATGTGGGAAAAAACAGTACGCAATATAA
ATTATAAATGACCATTTATGCGTGTAGCTGTCTCTGGTATAATGGTCAAGTGCTGTTTTT
TCCCTACTTGACTTGATGTCGTGGGATTGACCTCAACGGATGGCATTTTTTTATTTCAAA
TTTTTTAGCATGGTCCTTTTAAAAACTAGAGTTTATATAGAGTTGAACCTTCAATATCATC
TTAAAAAAACAACTAAATCTGGGGGACATGACGGCTTCCACGTTATAAGTGATAAAAAAT
TTATTTGAATGCATAAAGGAAGCTGTTTAAGCATACATAATATAATTTAGAATAATAATTT
TTAAAAAATTATTCGACACCGCTTACTAAAAGTTCTGCGTACGCCCTTGTCTACAATTGT
ACGATTAAAACACTTCCAAGGTTGAGCAATCTTAAGCCCACCCTTTTCCGTACAGGCCA
TAATCACTATGATTAATCAAGCCATTGGAGCCTCTGGAGTTGCTAGCCAACAATGCAAA
TCTGTAGTGGAGCAGTACGGGCAGACGATCATGGATTTGCTGTTAGCAGAGGTGAGCA
TTCAACTGTGTAACAGCATTTTCATTTGTTTGATATGCAGTTTCCTGTGCAGTTACAGCA
AAATGACACTTGGCAATCTAGTTGCCTTTCCCGGTTATCTAATTGGTCCGCATTTAAACA
GGCACATCCAAAGAAGATTTGTTCACAGGTTGGAGTATGCACCTTTGATGGAAACCGCG
GAGTTAGGTTAGTCTTCAGGCTCTCTCTGCCCCTTCACGTGAACATATTGTGCATTTTGT
TAATCCATAATGTTGATATAAATACTGAATAATTCTGTGGCCTTTTTTCTGCAGTATGGGA
ATTGACAGTGTTGTAGACGAGAAAGCTGGCAGATCCACAGGACTGCAGGATGGTATGT
GCTCTGCTTGTGAAATGGCGGTCATATGGATGGCGAATCAACTGAGACAAAACCAGAC
TCAAGATCGCATATTAAACTATGTGAATGAGGTAAAACATCTGTCACTGCAATTTTCTCC
TTTTCTTTGAAAAGAATGCTGACTGACTGACAGCTTTGCGAGCGTCTCCCAAGCCCATT
GGGGGAATCAGCTGTTGACTGTGGAAAGCTTTCTTCAATGCCTAAAGTCTCTTTCACAA
TTGGTGGCAAAGTGTTTGATCTCTCCCCAAATGAGGTATGCTTTATAATGGTGTTTGCG
AGTAATAAACTTTGCTATGCCTCTTGTAATTCAACATTGTCAGAGAGAAAAATTAAGTGG
CTCATCATAGAGTCTTTTTGTGGCATTACAACTAATTGTCATATTACAAAACCGATTCTAG
TGTTTAAGGTTTTCAGTTTCTACAGAGAAGTAGATGTTTGTTGTCATCAAGAAGTACTTC
TATATTTTGAGGTTGTGAGTATTTCGATTTTACATCAATTTAGATGTCTCATTTTCCAACT
AAGTAATCTAAGAGAACCTCCGTAAGATCTTTGTGCCGTGAAAAAGTTACTCTTTGCTAT
TTGTATGGAAGTGTATTACATGCTGACCTTGCTAATGCAAGCCGCTTGAGAAGGCCCAA
TGAATCGAGATTTAGATGTCGTCCCCCCCACCCCCCGGGCCCCAAAAATAATAATAATT
TATATCTGGATGATTCAGCATGGAACAATTCTCTACCTTCATAGACAGGGGTAAGGTCT
GCGTACACACTACCCTCCCACACCCCACTTGTGGATTCCACTGGGTTGTTGTTGTTGTT
GTTGATTCAGCATAGAACAAATCTTACAAAGAAGAAAATTCATCAAGTGTACACCAAATC
AATTTGATCATTCACGTTTTTTCTAAATTTGTTTGTTTTGTGGTTTAGTAAACATAAAGAG
ATTGAGCTTTGAATAATTCAGGATAGATCACATCTTATAAAGGAAAAAATTCAAGCGAGA
CACCAAATCACTTGACCATACGCAAAACTTTTCTTTACCAGACCATCACTCTAGTCTTTT
CCTGCCCTTTCAACAAATTTTATTCAAGAATGAGAGTAAAGTTGTGATCCGATATGTGGT
TCTGTTTTACTTGTTTCAGTTGTTTGTTTAACTTGCTGACCTAGATTGATCTATATTCTAA
TTTTGGATTCTTTTCCCGTGATCAGTACATACTCAAGGTGGGCGAGGGTGCTAAGGCAC
AATGTATTAGTGGTTTCACTGGCTTGGACATTCCTCCTCCCCGCGGACCTCTCTGGTAA
AGTTATCTTTATATGTTCTCTCTATTTTAGTTTATAATTATGTTTTTTTGAGATAACTATCA
GCAATGTAGTTAATCCAACAATAAGCTTAATCCGAATTATAGAGCTACGGCACAATTAAA
CCCGAGCCAGCAATTCTTTTGTTCACTGAGTTCTCTATGGTATTCCTTTGGTTACAGGAT
CTTGGGTGATATTTTCATGGGTCGATATCACACAGTTTTCGATTATGGCAAACTCAGAGT
TGGATTTGCTGAAGCAGCT
SEQ 13
ATGACTTTTTTCAGGTCGTTCTTATTCTTTCTTCTCACCTTATTTGTTATTTCATCTGCACT
CGACATGTCCATCATTAGTTACGACGAACAGCACGGCCAGATGGGGACAACACATCAT
CGTACTGACGATGAAGTCAGAGAATTGTACGAATCGTGGCTTGTTAAGCACGGAAAGAA
TTACAATGCCATCGGAGAGAAAGAGAGAAGATTTGAGATTTTTAACGATAATTTAAGATT
CATCGACGAGCACAACGCTGAGAACCGCTCATATAAACTTGGGTTGAATCGATTCTCTG
ATCTTACCAACGAGGAATACCGTGCCATGTTCGTAGGTGGACGGTTGGATAGAAAGAC
GAGGTTGATGAAGAGCCCTAAAAGTAACCGTTACGCTTTTCAGGCCGGCGAAAAGTTG
CCGGAATCCGTTGATTGGAGAGAGAAAGGCGCCGTTGCCCCTGTTAAAGATCAAGGCC
AATGCGGTGAGTTTTTTTCTTCTTCAAAACTTTCCTACTATAAAGGAAAGCTCTGCTCTTT
ATCGTAAACATGTACTTTTGTTTTGTCTGCTTACGGAGTGAGACCAAGAGGAAGAGTTT
GGATAGATTGTTGAAAGGAGTCATATGTAGGTCAAAAGTTTTTGATTTTTAGGTTGTTTT
TTGACCTATGTTGTCGTCTTATACGGTCAATGATCTGTTATTGGGTAACTAATGATTCTG
TTTTCATGTTTATTTCAGTCAACAAATTGGAGAATAAATTAATTGCTGCTCTGTCTGGTAG
TTAATCTTCATGATATACACCTAAAGCTTACATCCTGATTTAGTATTTGGTGTCTCCAATT
GGAATGTTTATTTGCTTTGCTAGTGTTTCCTCTCTCTCTCTCTAGGGTAAATATAAAAAG
ATCTAAAATTTAGAGGTACCTGGTGTATATCTTAATATATTCCATGTACAAACTTTAAAAA
ATTATTTAAGCTTCCCCTAATTTGTTTAATACGCTGATAAGGGGTAATCAAAAAGCATAA
AGATTAGATTGAACGGACACAGTATATATTTTGCTTTTGCAAGTTGATCAGTTTCTTTCTC
CATTCTAAATCGGAATCGACCAGAATTTAAAGCGGTATAACTTAAGATTAAGCCATGAAG
ACATATTTGGCTATTCTAGGTGTTATAAATTTTAACCCAAGTGTCCTAGGGAATTGATGT
TTAATCTTGCTTTGATTATGACGAAACCCATATCTCGATTGGTTAGATATCAGTATATCTA
TGTTATGTATAGAATCCTCGTTTGAAATTTGAGATTTTCTTATGAAGGGAGTTGTTGGGC
ATTCTCAACGGTTGGCGCTGTTGAAGGAATAAATAAAATTGTAACGGGTGAATTAATTAG
TCTGTCAGAGCAAGAGCTTGTTGATTGTGATAGGAGTTATAACCAGGGATGTAATGGCG
GTCTCATGGATTACGCCTTTGATTTCATCAAAAATAACGGTGGCATTGACACTGAAGAT
GACTACCCTTACCATGCTCAAGATGGCACTTGTGATCCATACAGGGTAAGTAATTAACC
ATACTATCAAGAAAACATCCAAATATTAATTATGTACTATTTCAGAATGTAAGTCTATATA
GCAAGTAATTAATAGTATTTGCTGACAAAATTTGGTCATTCAGAAAAATGCCCGTGTTGT
CTCCATTGAAGGGTATGAAGATGTTCCAGAAAACGATGAGAAGTCGTTGATGAAGGCA
GTGGCAAATCAACCAGTTAGTGTTGCTATTGAAGGTGGTGGCAGAGCTTTCCAGCACTA
CTCTTCGGTATGGTGGGCGGATCTTGACTAATATATCCTTCTGAATATATATGTTATTTG
TGTCTGAACTCACTGGCCCTAAATTCTGGATTCGTTATTGCATTTTAGTATGCCTGTGTC
CCTAATCTGCAAACACGGCTGCATTGTGCCTTGTTTTACTACTTAAAGCTAGTATACTCA
TTTACCCTTCCAATTTTTATCAAATCATGCAGGGTGTTTTCACTGGATATTGTGGAACGC
AACTAGACCATGGTGTAGTTGTAGTTGGCTATGGAACAGAAAATGGCGAAGATTACTGG
ATTGTGAGGAATTCATGGGGTGCTAACTGGGGAGAAAGTGGTTACATCAAGCTTCAGC
GCAATTTCGCTAATTCTACAACTGGAAAGTGTGGAATTGCAATGCAGGCATCTTATCCT
CTTAAGTCTGGCGCAAATCCTCCTAATCCTGGTCCATCTCCTCCTACTCCTGTAACACC
ATCAACTGTTTGCGATGAGTACTATAGCTGCCCACAGGGCACTACTTGCTGCTGCATTT
ATCAATATGGCGAATACTGTTTTGGCTGGGGATGCTGTCCTTATGAGTCTGCTACCTGT
TGTGATGATAACTACAGCTGCTGTCCCCATGATTATCCTGTATGTGATGTTGATGCTGG
CACTTGCCTTATGGTAAATATTTTTTCCCTCCCATTCTGCTTTTTTCTCCTTTATAATAAT
GATCGTCAATTTCACTTATTACGTGTAATATTCTACCAGCACAGGATTAATTAGATAACT
CTGTCTACCAAAACTTTGGCAGATATTTAAACCTTCGTCTTCACTCGTTTATTGACCGCT
AGACCCACGTACAGATTCAACCTTTTATAGGTTTAATCATCAATGCAAGACTACTTATCA
CAATCTTTTTTCTTTTTATGTGACAGAGCAAGGACAATCCATTAAAAGTAAAAGCATTGA
AGAGAGGTCCAGCTAGAGTAAACTGGTCAGGGATGAAATCTAACAGGAAAGTGAGTTA
CGTT
SEQ 14
TCATGAAGAAACAATGATCAAATAATAGCTAAAAAGGGAAAACAGAGCCATCATAAGTT
GGCAAATGTAGGAATTGAAATGTGCTGGTGCATGGTTTATTGCAGGTCTAGATGACGGA
ACAGATGGAAACGAAGTAGCGGGTTCATTTCCACTTCCATTTCCCTTGGTGGCCTCTGG
CACCACACTGGAGGGCGACGGCGCTTCAGTAGAATTACGTTTGTTCACTGGCAGAGTT
GTCGATTTGTCGTTGGATTCTCTAGAATCATAACCTGACAAGAATGGTTTCTAGTTAAAA
TATGGACAGGTGTGCACACTAAAAAGGTCATACTCATGAATGCAAACTCACAATCGGAT
GGTTTCCAACCCAAAACCATCTTCTCTCGATCAAAAACCACGCGATAGCCTGTCATAAA
ATTTTCTGCAAAAAGATCCATGTATTAGTTTTCTGTCATAAATTCCAAATACAGAGACAAA
ATCGAAGTAAATCAACAAATAGCTCAGATTTTTGACTATGTAACCAGTTTTACTAGTTGG
TTATGGACTTGATTCTTACTAATTTACTTTGTCACATTCACATTTCCAATAATATATAAGCT
AGAAGTATGTAAAAAACTTGATAGGAACCAAAACTTCTAAAAGTTGGTAATTGTGAGATC
ATTAGCTGGCATGGTGCAAGTTATATTGCAAAATTCCATGGATAGAAACAAAATCTACGA
GCAAGCAACTGATAGATACTTACGTCCAATGATGTTGACATCCCCACTTTTCACAACAG
CTAAGCAAAATGCGCGAGAACCATCCTGATGATCATGACTGGGTATTAGAATATTCTAA
AAGAGAACTTCTGTAATAAAGAAGGAGCAGAAACCATCTTACCTGGAGCGAGAGCATAA
TTATCGGATCGAAAAGAAAAAACTGGTTGCCGCCTTTCATTGTCAAATTTAAATCAGGAA
CTTCGAATGTAGTTTGATTTGCACTGCGTATGTAGCATGTTAATCCTGAGAGTTCAAAGG
ACATCAAGAAAGTAAAAGTGATGAAGATTATAAAAGATGGTTCACCTTAGCCCGTAGCA
GTATTCAAAAGGAATTTCGCCATCAGGTTGAATACGTAGCTGTTTTGCTTGAGAATCAAA
CTGAAAAAAGAAACAAAATATCTTCAGTTTCACAATACAAAGAACAAACTCCCAACTAAC
AAATCATACAGTCAGCCTGTCACTCACGTTCTCTGTAATGACTTTGTAAGCTGGGTCGTT
CAAGTATGTGAATGAGGTGCCAGAGTCAAAAATGGCTGTGAAATCAACATCAGTGATCT
TGTTTCCCACTGTTATTCCTGTCAAGCTGATGTTATAGGTTGGGCTGCAATTCAAGGAA
GGATAAGAGTGAATACATATTTGGTCGAATTCTCCTGAATCAAGCCAGGAACAGAGGCA
AACGCTAAACCTAGAATATCAAATTGACTGCTTACTGTAGTTGATCAAGATTGAGTGGTG
TTTCTCCTTGGTCTGGACTCCCTTTATCTCCAAACACTATTCTTCCAATACCATCAGGGC
CAAAGCACATGGAGAAAGAATTTGCAGCAAGACCTTTACTTGCTAACATGCTCGGAACA
GATATACTTTCCAAGCCAAGTCCAAATAGACCATTAGGAGCAGCGCCACTTAAAAATGC
ACCGGTTTGTCTTATCCCACACCTGAAATTGAACAAGTAGAAAATTAACATCATGGTGGA
TAAAGATTGCATCCAAATTACACTGCATTTCTCTCAAACCATACCCACCCTAGAGCAATT
GGAGCCTCAACACTTTTTTGTTGAGCATTATCTGTCTCTAAGTGCAAGATGTCTTCCACC
AGTACCCCTGATGATGAGGTATTATTGGAGAGATATGCAACTCCATAAGCACATGCGTT
TTGTGAAGATAAGCATCGCCTCCTTTGTCCACACAGAGTGCCGTTGCAAGGAACAATCT
GACCCGTTGACGACGTATTAGGGCTGTAAATATTGAGATTTATTCGCTGTTTCAAAGCA
AACAAAGGATTATAAGTGTCAAATAATACATAAAGAAACATTAAACGGAAAAGGCAAAAG
AGCAAAGAAAATTTTGTTCCTGAAGGTGAAATCCTGTGGAGTATGAAGTAATTGAGCTA
GTAAGAATCTCAAAATTCTTCAATGACAAACAAGCCACGGAACAATATGGATCAGATATT
TCATTTCAGAGTCAGTATATTACATGCATCAAACTCACTTTGCAAATATTGATTATGTTGA
ATTTTGCGCTAGTTATGTATCTTCCTTAACAGAAAAACAATTTTCTCAATACATTCTCCAC
CCCATATCTTGTTAACTAAGAGAATATATTATTGTCATAATGACGGAAAAAGACAGTTGA
AGAGAAATCCACTGTGTTACAGATTCAGCCATTGTGTTATCAGCAGGTTCCTTTCTGAAA
AAGGTTATCAGACGAGGATGATCATCTTCCTTGTAGCAGAAGGCAACATACAGACGAG
GATGATCATCGAAAAAGAAAAACTCTTTAAACATTTGAAAGTAGAAAGAAAAAGGTACTA
GAATAAAGCAAACATACTCGTCCAGAGCGTGTCTCGAGGGCGCGCACACAATTGCTGC
AATCACAGGGTAGCCAAAACAAGTCACTGCCAGTGTCAAGTGCCACCAGAAATGATAG
CCCAGGAGTGCCCACTGTCACATTTGCATAATGCAAACTGCAAATTGGCAAAGAGTATT
AGTCACACACCTTAAGAAGAAAAAATCACAACTACAGATACTACATATTTTGCATTCAAC
TCTATCTTTATAACATATAAATTACAACATGCTACTGCAAGGAATTTTCAGAACAATTCCT
TGTCTAGAGAGGAGATAAGTGGCAGCAGAGGAACGGAAAATCAGAAAAAAAAAAATGG
AATTTATTTCCGGAGCATGGAATTCGAACTAGAAGAAGACTATAATTAAATTTAGAGTCA
GTACTTTTAATATAGGAGTGAAATCGCCAAAATTCCAGTCCGATGAAACACACAAACAG
AATTAAAGAACAGAAACAGGCCTAAATCTTTCTTTTTTTTGTTTTATCATATTTTCCTCCA
CATGAATCTCGTAAGAACTATTAATGGTACATGGAATTTATTTAAGTTAGGTAACCTATTT
TTCCTGAACTGACACATCCAACTAAACAGACAAAAACAAACGCAAAGCTCAGTCAACTC
TAACATCACACTAAACGGACAAAAGTAAAAGACAGTAAACAAGAATTTCCAAAAACGTAC
TAGTAGTTCACAATCAGGAATAACAACAAAAATATTTTAAAAAAAAAATAGAGCAGCAAA
TAAACAACTGCAAAAGCAATCAGAAAAGAAAAATAGAGTGAGCTTACAATCCCAAAGAA
CTGAGGCGGAAAGTTTCATTTCCTCCGGAAAAAGAGAGAGGAGTGGGATTAGTTGTAT
CAGCAAGGCGGCGACCTTTGATAAAGCGATCACGCTGAGTCCAAGCTGAATAATACTC
AACACTTCCCTTCTCAGGCAATCCATGAAGGTCCAAAATACCCTTCACCGGATCCGAAT
ACCGGTGATGGATATCAAACCCGAACGTCCCAAACCCATCGCTGCTCTGCAATTGCAAT
CCCAGAATCGCCAAGAAAATAATAGGGGCAAGGAAAAAATTAAAACTTGTATAAGAATT
AGCCAT
SEQ 15
ATGGTGACAAAGTTTAGTATTTTTATTTTGGTGGTGTTGTTGAGGTTATTTTCATTTGGTT
CTGTAGCCTCAAGGGAAATTCACAATTCTGGTCTTAATCTGAATTCTAGTGCTTCTGGTA
TTGAATTCCCTCAACATCCAAGTTTCAACTCAGTTACTGCTTCTGGAAATTCAGATTGCA
GTTATGGAACATCCAAGAAATCAACAACCACCCATGTAATAACTCAAGAAGAAAATAGAT
CTGATGAAAAAGAAGATGAAGATTTAATGGTATCTAAAAACCAGCCAAGAGAAGCAGTC
AAGTTTCACCTAAGGCACAGATCAGCTGGTCAAAATATAGAGGCCAAAGACTCAATATT
TGAGTCCACAACAAGGGACTTAGGTAGAATTCAGACATTGCATACAAGGATTGTAGAGA
AAAAGAATCAGAACTCTATTTCAAGGCAAACAAAAAATAGTGAAAAACCTACACAATCTT
CTTCATTTGAATTCTCAGGCAAGCTCATGGCAACATTAGAGTCAGGTGTAAGTCATGGT
TCAGGGGAGTATTTCATGGATGTTTTTGTCGGTACACCTCCTAAGCACTTCTCTTTGATT
CTTGATACTGGTAGTGATCTTAATTGGATTCAGTCTGTTCCTTGTTATGATTGTTTTGAAC
AAAATGGTCCTCATTATGATCCTAAGGATTCTATCTCTTTCAAAAATATAAGCTGCCATG
ATCCTAGGTGTCACCTTGTTTCATCTCCTGACCCTCCACAGCCTTGCAAGTCTGAAAAC
CAGACTTGCCCTTATTACTATTGGTACGGAGACAGCTCGAACACGACTGGTGATTTCGC
GCTTGAGACGTTTACGGTTAATCTCACAACCCCTAGTGGGGATTCAGAGATCAAGAAGG
TGGAAAATGTGATGTTTGGTTGTGGACATTGGAATAGAGGCTTGTTTCATGGTGCTGCT
GGTTTGTTAGGACTTGGTAGAGGACCGCTTTCGTTTTCGTCTCAGCTTCAATCTTTATAT
GGCCATTCTTTTTCGTATTGTTTGGTTAATAGGAACAGCAATTCTAGCGTAAGCAGCAAA
TTGATTTTTGGTGAAGATAAGGAACTCTTGAAACACGCGAATTTGAACTTCACTTCACTG
GTTGGTGGGAAAGAAAATCATTTGGAAACATTCTACTATGTGCAGATAAAATCAGTTATA
GCTGGAGGTGAAGTGCTGAATATACCTGAGGAGACATGGAATTTGTCTACAGAAGGTG
TTGGTGGAACAATCATTGATTCAGGAACTACTTTGAGCTATTTTGCAGAACCAGCATATG
AGATTATAAAACAGGCATTTGTTAACAAGGTGAAGCACTATCCTGTTTTAGAAGATTTTC
CAATTTTGAAACCATGTTACAATGTTTCTGGAGTGGAGAAACTTGAATTGCCTTCATTTG
GGATAGTTTTTGGTGATGGAGCTATATGGAATTTTCCAGTAGAGAACTACTTCATCAAAC
TTGAACCAGAGGATATTGTTTGTTTGGCAATGTTAGGAACTCCTCATTCGGCCATGTCG
ATAATTGGCAACTACCAACAGCAGAATTTTCATATCTTATATGACACCAAAAGGTCAAGG
CTGGGATTTGCACCAACAAGATGTGCTGATGCC
SEQ 16
TCACATAGGAGCAAGATGACCTTCTTTAGACAATTTATCTTGCATCCACCTCTGAAGCAT
TTCCATTGCTGCCTTAGGCTGATCCATTGGAACCATGTGACCTGCATCATGGACCTTAA
GGAAAGTTAAAGGTCCATAGTTTTTTTGAACTCCTTTCTCTACACCATCTACTGCAAAAG
AAACTTGTGTGGCTTTTCCAAAGGCTTTTTGCCCTGTCCATTTCATTGCATGCACCCATC
TCGAATTCCCTGTCCATATAACAAGAAATAGAGTTTTATAATATTATTGTTAGTTGGTAGC
TTTGAATTGACTTATAAACAATATGATAGAGTTCAACTTCTATATTTTGACAGACGTAAAA
GATAACTTGAAGTAACTTTTGTTAAAAGTAGAATTACTAGCATGAAAAAATAAGGTAGGT
TAAAATACACTAATATAGTATGAAAATCCTCTTTTGTGTATATAAGTTAAATCCATATGAT
AATATAAAGCTTACCAAGCCAATTGCAGATAAGGTCATATTCCCCAGCATACACTAGTAG
CTTGATACCATCCTCAAGGAGTGAAGGAATTCCCAATTCAAGATTCCTCATCCAGTCCA
ACTGCATTGCCTGGTAAACTTCAGAGCTACATGAAACAAACTCAATATCCCCAACACCA
AGAGCCTTTTTAACTTGTTGATCATTGAGGAAAGTTTCCATTTTGGAGAAATCATAGCAT
AGATCGCCCTCACATCTCTTCCGCACATCATAGTACTGCAACTCGGAAATTACAAAATTC
ATATGACTTTAACTTTTGTATACTGACAGTGTAAAAAAAACTTTATTCTATCACGTCACTT
AAAGGATTGTAACTATAGATACCCGTTCATTATAAGTGAGATCAGTAATTTGGAAAATAA
GACAAGTAATATGTTGTACATCATTAGCTCAGAGAAAATGAGATTGGTCTTTCTTACGTT
TTTGTCACCAGCAATGTCCATAATCTTGTTGAAGATGCTTGTACAAACAAGATATGCAGC
CATGCAAGCAGTTCCGCCATCTTTTCCTAATATTATTTGCATAGAAGAAAGCTAATGTAA
AGACTAGCTGCTGCTATATGAAGAAGGAAAAGGACATTGAGGATAAACAAAATGAATTA
CCACAAAGCTTAATTGCTAGTTGACATTTTGGATATGACTTCTCTATGGCATTGTAATCA
GATTTTTTGATCAATTTCATATCCAGAGCATAGTCAGTGTAGGCTTTGTATTGAATTTCTG
GATCAGTGAGTCCATTACCAATAGCAAATCCCTAAAAAAAATTGTACTTTGTTAAGTCAT
TGGCATGACGACAAATTCAAATTAAACCTAACTAAAGGTAATTACTGGATAAAGAAAAAG
GGATGATATATGGTAAGAGTTAGAAATACCTTGAGATTTACGTAGATTCCTTCTTTATTTT
TGTTTCCTTGGTGAACCCGAGAAGCAAATGCAGGAATGTAATGCCCAGCATATGATTCT
CCAGTAATATAGAAATCATTTTTTGCATACTGTGGATGTGCCTTGAAGAAGGCCTATCAT
CAAAAGAATTTGAAAAAGTTTGAATTAAATTTTATTAATTATATCAGTTAAACTTTAGAGAT
TTATCACGAGCTAAAAAAAAGGAATGAAAGAATAGGATCAACCTGCAAGAAGTCATAGA
GATCATTGCTTACGCCCCTTTCATCGTGACGAATATCATCATCGTTTGAACTATAACTGA
AACCAGTTCCAGTTGGCTGATCGACGTATATAAGATTTGAGACCTGTAAAATTGCAATTT
ATCATATGTTATCATTCTTCAACTAACAAAGGAAAGTTGCATGTTTGATTATAGGATTTAA
CCGGTGTAAACGATTTTTACACTATTGTTATATTTTAACATGTTGTAACATGTTGTATTCG
TCCCACTTAAATAAAGTGAAGAGAAGCGTAGTAGTCATTGATGTCAATAAACGTTGAACT
ACTTTCGAATTTTTGAAATTCTACAAGTCACAGCTAATGAACAACAAGTGTTAAAGAAAA
AAATGCTAGTAGGTAAAAAGGTATTTTGCATGATGGAGAAAGGTTGAATAACAAATAAAA
ACATGGAGGGAATTCTTTTAGATTTTTACCATATTCAAAAGATCTAACTGACGTTTCTTGA
GAAATTAATTGGGTAAAATAAAAAGAATAAACTGAAAAAAAGAGAGGAAAAAACAAAAGA
AAAAGCAAAAGGAAGAAAACAAGAACCTTGTCCCAGCCGAAATCATTCCAGACAAGAGA
CATGTTATCTGCAATTTTGAATGGTCCATTTTCATAAAACACAGCCAATTCACTGCTACA
TCCTGGCCCTCCAGTTAGCCATATAACTACTGGATCATTCTTCCTGCTCCTCGATTCAAA
GAAAAAGTAAAACATCCTGCCAAAAACAGATAATTTAGCATTAATTAATAATACCCATAAA
TTCATTTTTTTACCAAAATGAAGCAAGAAGAACATTTAATCCAATTCAAACCTTGCATCTT
TAGTATGTGGAAGACGATAATAACCAGCGTGATGACCCAAGTCTTGAACTGTAGACCCA
GAATTACCAACATAAGATAAATTCAATTTCTTTTCAAAAAGTCTCTGTTCAGTAACTGCTG
CAGAATCCCCTGTTGCTGCAGCCTTGTTGATATCATGCTTAGGGAATAAATTAAGCTGT
CTGATTAGCTTTTCTGCCATTGTTAATGGGAATTTTGGAGTAGAAGATAGGAAAAAATCA
TCATCATTAGAATTTAAAGTTGATGAGAAAGATAAGGAAATAGAAGCAAGAAGCAGAGT
AAGAAAGAGAAGAGAGAAAGATGAAGGCAT
SEQ 17
TCACACGACACTTTGGGGTGGGATATCCTGGTTCACAACCCGGGTGTGAAATCCCTCA
TTATCAAAGTAAGGGTCACACTGCCAGAAAAAACAACTTTATTAGTGGTTGATCAAAAAG
ATCCACAGTAGCTAATGGTTTTAGTGGAAGCTGTAACCTCTGTGAGGTGAAAATAAGTC
AGCATTTACCTCTTCTGTGACCACACCCCTACGGATAAAGTATCGCCATGCTGATATAG
GATATCCACCATCACAACCACTCCCACGTAAAAAGCCACAGCATGCTAACAGATCATTT
ACGGACAGAGAGATATTCTGCAATACACATTAAAAGTTTAGCATCAGTGACCATAACTAC
AGAAATACTTCACAAACATTTTGTGCTAATTAAGATAAGATGGTTTCCATGTGCTTGGCC
CATGAAAAAGAATCAGGGCTCGCACGTGAGAGAGCATGCTGAAGATATATAAATAACTA
AATAAAAATGTGTCATCTAAAGCTTTTAAATGAGGCGGTCACACAGTGCAACAGAATTTA
CCAAGTTATGATGGATACAGAAACGATCAGACAGAGATTCAACAGCACCAAAAGCCCAA
CAAGAACCGCAATGTCCCTGATCTGTGATGCAAATTTGTCCCGGTGATTGATGTGCAAA
GATGGAAAGCATTAGGATCACTAAAATAGAATTATAATTCAGTAGTAGTAAGCAAGAACA
AGAAGAAACTGACCTAGAATTCTTCCGATAGTGCTACATTGAGGCCAAGCTTTTCGTGC
ATCAAACTCTTTTGGTAGCTCCAAAAGCTTTGGATGAGTTAGAATCGGAATTCCCTCCAA
ATCACCTTCTCTTGCGGGCTTAACTCCAAGAAGGCGCTTAAATTGTGAAACCTGTGAAT
ACCAAAGATGGAGATAATTATAAAAGCATATTTTCATTGTATAGCTGCAAAATAAGCAAC
TGTTATTGATGGTTCGAGAAAGCGCTAAACTATATGATGATATGACGACAAGAGGGGGT
TGCTCCGATGGTCAGCATCCTCCACCTACGACCCCAGGATTGTGGGTTCGAGTCACCA
AAGGAGCAATAGCTCCAACAAAGAGGATCACAGGGGAATCAAAAGGGGAGGGGAATTT
TAAAAAAATGATGATCTGACCGTGAAATTCGAGAATCGAGGGTTGAATGCAGCTTTCCA
TCCAGCTTTGGCATTTTCATTAACCTCTTTGATGATTGATTCCTGCATGATCATCCAAAA
AAGCTCTCTCAGTTTTCGAATTGAAGGACAAGGGCTATAACATCTTGAATCATGAAAAGA
AGAATAATGTACCTGAAGGATTGCAGATTCAACTTTAGCTTCAGATATTGGCTTCTCTGC
AACAACCTGCTTTCCATAAGAACACATCAATTCTATTCATCAATAGACAAGCTAAAAGCT
TTTAGGAAACAGAGTTGCAATTCCAGGAACAAAAGTATGACAGTACTGTGACAAACAAA
GAACTAAAGGTTTCACTTTAATCAAGATGAAACGCTTCCAACATTTCTTATTTTCGACATT
ATAATCCTCTTAACTTAGGAAAATTAATAAAAAATTTGTAGCAAATGCATCATATGTCTAA
AGCACTATAACATAGAGGAAGAACCATAGAACAAGCATAAATTGTAACTATTCCATTATT
TGTCCTCCTTTTTCTCCCCTTTCTGTATTTCTTTGTGAAGCAATACTTCCTCTCATGTTAT
ATATAGAAACATGTAAGTTAGCTAACACATAAGTAATTTGCATCAAACCATATATTTAACT
TCAGAAACATGTCTATACTTCTGTTTTCTCATTCTCACTAGGTAATAAGAAAATCATTAAA
ATTTATTTCTACTCATGATTTCAAGTCAACGCTTAACTAAAGCATAAAAAGTCCAAAATAC
CCAACAATATTTGATCTTTCTGAAGAAATACAAAAAGGGTAATCCATGTAATCATCAAAA
CCTATATAAATTAAACCAATAATCTAAATCCATCTAAACAAAGAAATACTCTTACTGTAGA
AAGGTTCAACGAATGAAGAAACAAACCTGCAATATAAGGATACAAAAAGCACCCAAAAA
CAAAGGAGCTGCTAAAGACTTCAGAGTCAAGGTCAT
SEQ 18
ATGTTCCGACTAGTAATGGTGACAAAGTTTAGTATTTTTATTTTGGTGGTGTTGTTGAGG
TTATTTTCATTTGGTTTTGTAGCCTCAAGAGAAATTCACAATTTTGGTATTAATCTGAATT
TTAGTGCTTCTGGTATTGAATTCCCTCAACATCCAAGCTTCAACTCTGTTACTGCTTCTG
GAAATTCAGATTGCAGTTATGGAACATCCAAGAAATCAACAACCACCCATGTAATAACTC
AAGAAGAAAATAATTCTGATGAAAAAGAAGATGAAGATTTAATGGTATCTGAAAACCAGC
CAAGAGAAGCAGTCAAGTTTCACTTAAGGCACAGATCAGCTGGTCAAAATATAGAGGCC
AAAGACTCAATATTTGAGTCCACAACAAGGGACTTGGGTAGAATTCAGACATTGCATAC
AAGGATTGTAGAGAAAAAGAATCAGAACTTTATTTCAAGGCAAACAAAAAATAGTGAAAA
AACTACACAATCTTCTTCATTTGAATTCTCAGGTAAGCTCATGGCAACATTAGAGTCAGG
TGTGAGTCATGGTTCAGGGGAGTATTTCATGGATGTTTTTGTTGGTACACCTCCTAAAC
ACTTCTCTTTGATTCTTGATACTGGTAGTGATCTTAATTGGATTCAATCTGTTCCTTGTTA
TGATTGTTTTGAACAAAATGGTCCTCATTATGATCCTAAGGATTCTATCTCTTTCAAGAAT
ATAAGTTGCGATGATCCGAGGTGTCACCTTGTTTCATCTCCTGACCCTCCACAGCCTTG
CAAGTCTGAAAACCAGACTTGCCCTTATTACTATTGGTATGGAGACAGCTCGAACACGA
CTGGTGATTTCGCGCTTGAGACGTTCACGGTTAATCTCACAACCCCTAATGGGGATTCA
GAGATCAAGAAAGTGGAAAATGTGATGTTTGGTTGTGGACATTGGAATAGAGGCTTATT
TCATGGTGCTGCTGGTTTGTTAGGACTTGGTAGAGGACCTCTTTCGTTTTCGTCTCAGC
TTCAATCTTTATATGGCCATTCCTTTTCGTATTGTTTGGTTAATAGGAACAGCAATTCTAG
TGTAAGCAGCAAGTTGATTTTTGGTGAAGATAAGGAACTCTTGAAACACCTGAATTTGAA
TTTCACTTCATTGGTTGGTGGGAAAGAAAATCATTTGGAAACATTCTATTATGTGCAGAT
AAAATCAGTTATAGTTGGAGGTGAAGTGCTGAATATACCTGAGGAGACATGGAATTTGT
CTACAGAAGGTGTTGGTGGAACGATCATCGATTCAGGAACCACTTTGAGCTATTTTGCA
GAACCAGCATATGAGATTATAAAACAGGCATTTGTTAACAAGGTGAAGCGCTATCCTATT
TTAGATGATTTTCCAATTTTGAAACCATGTTACAATGTTTCTGGAGTGGAGAAACTTGAA
TTGCCTTCATTTGGGATAGTTTTTGGTGATGGAGCTATATGGACTTTTCCAGTAGAGAAC
TACTTCATCAAACTTGAACCAGAGGACATTGTTTGTTTGGCAATTTTAGGAACTCCTCAT
TCGGCCATGTCGATAATTGGCAACTACCAACAGCAGAATTTTCATATCTTATATGACACC
AAAAGGTCAAGGCTGGGATTTGCACCAAGAAGATGTGCTGATGCC
SEQ 19
TTACAAAGGTTGCTGAGCTATCCATCTTTTAAACATATGAAAGCTCTCTTCACGTTTGTA
CTCAGGAGCTGTGTGCCCTCCTCCCTAAAGAGAAAAAAAGACAAGAAGGAAGAACAAA
AACATCTGAGAACTGTGAAAATGTGAGCACAAGACAATTTTACTATCGAGTGTGAGTAC
AAAATTCTAACCTTTACTGTTGCATATGTCATATGATTAGAGAAAGATCTTGTGTAACTG
CACATCAAAGCTATAAATTTAGAATTTCATAACAATTATATCGTCTATCCAACTGGAATAG
AGATCCAGAGAAATTAAAATGGAGAATCATACCCTGCAACTTGACCATCAATTGTCCAA
GGGCGCCAATCATCAATGATAGAATAATTTAGATACTTTATCCATGCTTGCGTCGATTGG
AAAGGAACAACCATGTCATGATCACCACTGGGCAAAAAAAAAAAAAAAAAAAAAAAACTT
TGTTATGTATATAATAACTTCAGCTAAATCTTTTTTGCATAAAGCACAAATTAGAGAACTA
CAAACATTGGTTTAGGTTTAAAGTATCAAACCTGTATATGAGTGATCGATAACCTTTGGA
ACTAAGGTTAACATGGTAAGGTATACTATTCATGAAAGTAACTCTATAAGTTGTACCCAT
AATACTTTGCCTACATCTCGCCCACGCTCTTCTTATAGTTCCCTAATCAAAGAAGAACCA
AAAAGAAAAGAAAAAAAGCCTTTTAATATCTTGCACCTTAATGCCAAGCAACCTTCACTA
TGTACAAGTACAAACAATTGAAATACCTTTCTAACATGGAGAGCCTCTTGAACACTAGG
GTCATTTGCCCAATGATTGGAGAGCTTGCGTGTCGCAACCTAAAATCAGAAAGCATATG
TATATCTAAGTATCCTACATTTATCTGTTTATCCCGAGAAATCATGAAAAAGAACCGGGC
ATATATGCCTCTTTTAACATAAATAGGTATATATCTTAAAATTATTAAACAATATTCAAACT
TTATTCGGTGTAAAGGCATTTAGTTGCTGCAATATTCAAACTTTATATAGTAATATTTGAG
GGACTTACACGACTTTCTCGACAAATGAAGTCATCATGTTTTAGAAAGATAAAATCCTCT
TCAAGAGATCTCCTTTCACCAGACAATTGGCGTGGGTTTGGCGATTCTGAGTCCGTTCC
ACAAAAAGGCTCTAGTATTTGCTGATCATTTATACTGCTTACAAGCTACAAAGTTGCATT
ACATGAACATTGTGGTATTACCTAATTACAAACTTAGTACCAATTCAAAACATTAAAACAA
GAATATGTAAGAAGGAATATAACCTTACCTTCTTGAACATTTTAAAGTTTTCTAAACATAG
TTTATTGGTAGGATCAATGTTTCGGCAATCGCCTTTGCAAGTCTCCTTCAGTGACTGATA
CAACAAGATCCAAATTATCATTTTTGTGCCAAACTGTGTTGGTATGATACAAAGTTAATAT
ATGAATTGAAGTAAGAGATGAACCAACCTCATAAAGTTCATTAGATATTAGTCCCATACC
ATGACAGAAAGGAATTTGGTAATTGCTTTCTTCAGGAAATGTTAGCGGATTGCCAAGTG
AATAACCCTGAAAGGATATTAATTACCATTAAAAATAATAATTTTTCATGTATGAGATATA
TTTTAAATAAAAGGAAAGGAATAGAAAACCTTAAGGTTGATTAGTGGCTTTTTGCCTGCT
TCAATTCCTGCATTTTCATATCACTCTTGAAGTGTCAACAGAGAACACTTTGGCTAGAAA
TGTTTTGAGAATCTTCTCTTCAACCAAGTGAGGCTACACATTCAGTTTTGAAGAGACAAC
AAAAGAATGTAGTGTTCTATATATCTTATCTAATATCTGTGGAAGTAATGATCAATAATAA
CAAAGCATAAACTTGGCTTAGAGCCCTAGGAATGGAGAACCTTTGATCAATAGAGTCAC
TTTACTCCCCCTAGTGGACCAATTCAGCATTAATTGCCAGCGGGCTTCAAATAACGAAT
GGCTAAACCGGAAAAATAATAATAACAAAGCATACACTAAAGAGTGATGGGGTGATGAG
AACGATCACAACCGCACATTATATGAAAATACTAAATGACACTTACCATCTGATATTAGT
TGAACAATAACTGGAACTGTAATGCCTGAATATGAGTCCCCAGAAACATAGAAAGGGTT
GGAAATGAATTCTGGATGATTATTGAACCACTGCAAAATAAATCACAACTTATTTCGCGA
TACATTTTTGTTATATTTGGTTGCTATTTAATTATTACTGTCTTCATTTCTTCACGGTTCAT
TTGATCAATCCACCAAGATACACATATATTAGGATTTAGGTGGGTGGAGGGAATTCAAA
TACGTATATATTGAGCCTAAAAAATTTATATCCTAGATCCTCTACCAATGAACAATGCAAA
ATGAGAAGTATATTGAGGAATACATGAAAATTCAGAGTTAACTCTGCTTACTAATTCAAT
CTAAAAAATACTCAGTTTACTAATTCATATGTATCTAACTAAATGAACTTGAAATGATTTT
CAAATACCTTTAGTAGAAATTCATAGACCTGGTCGCACGCTTGTAGATCAGTACACTTG
GATGCCGCTGAAGTTGTTGCATATGAGAACCCAGTATTTACAGGCTGTTCCAAGAAAAG
TATGCTCGCAAACTACCAAGTTTAAGGAACACATTAATTTGATGATCTAATGTTAACCAT
AAGAGGATAAAAGAGAACATGAAGTGTGGATTAGAATATATATGCAGTTTGTTGACAGC
AAAATGAAGTGTGCAATAGAAAAATAACATGCTCTTCAGTGTTACCTTTGTCCAGGAATA
TGGAGTTGAAACAAGAATTGGTAGGCTCCCATTGTATGCCTTCTGACCAAAAGCCAATG
GCCCTACAAAAGAGAATTCAGAAGTTAATTTTCTCCAACTATGAGTTACACGTAATACCA
AGCTTTTCCACACCTATATATTTACTACTAATTGACTAATTAGAAACATCGACTAAAAGTA
AATTGCTTTTGTGATACTAACAAAAGCATTTCTCAAATGAGTTAAGTTAATCACAAAAGGT
TAAACTCTATTTGACAAAAATTACATTTGAACACAAACAATAATGGTAACTGTTGGGATAA
AGATAACCTCCAGACTATGATTACATATTTAAGGTGAGTTACTGATAAACATACTTGATA
ATACAATAGTAGTATAACTAACATACTATCATAGGTTAAATAATATTTTATAAAAAATATTT
ACATTGTCAGTATATACAATATAAATATTTACCTACTTCATACGCCACACCCGTGAAGGA
TGAGCAACCAGGCCCTCCCGTTAGCCATAGCAAGAGTGGATCTTTTTTAGGGTTGGATT
CTGATTTGACAAAGTAATAGAATAGTTGCACTTCCTCGGATTTGCCAACTCCAATATATC
TAACAATTGTATTCAAAATACATCACTTCAACAAACTTGTTTTACTACTCCACTATATATG
TAGCCAGTATGTTCTGAATGAAGTAAATTACCTAAGAAAAAGTTTAGATTCTTTTTATACT
AATTGATCTTTTGATCAAATACAAGTTAAAATTCAAAGGGTGGTAAAATTAACGTACCCA
GTCTCAAGATAAAAAGGAAGAGGGCCATCAAAACCAGGAAGAAACTCAACAGTTGAGC
TATTCTGAGGAAGACTTTGTACATATTGTAGAAAGAGAGTAAGAGGAAGAAGAAGATGA
AACAATAGTGGCAGGCGAAAACCAGACAT
SEQ 20
TTAACCAGCTAGAGGATTCATCACACTGCCAATGAATAACACCGCACCAGTAGATTCGT
CTCTTATAAGGAATAGGAATGGATGGTCCGCAACAAAATCCATTTCCTTCTCAATAATCA
AGGACATGGTCATTATTACAGTAGCGGTAACAGCTGCAGCTTCGGTTCCTTCCTCATTT
ACCTCAATGAAAGACTTGTGAAAAACCTGTGAAACAGACAGGTTCTGAGGCATAGGAGA
ATCAACCATCTCAGTGAGGCTACCACCACAAAAAGGCAACGTGAGGCCGAGTCCCTTT
AGAATGTTGGAAGCTTCAAATCCAAAAGTTATTTTAAATTTAGGGATAAGAAACTTGCGC
GCTCTAACTTTTCCATATGGAACATGGTTATTTAAAAATCCTGGTTCTAAGCTGATTTTTT
CCAGTAAAGCAGGTAATCCATCATGGGCATCTGGGAGAATGAAATACATACAGAAGCG
ACGCGTATCCGTGCCTTGTTTATAAGGAAGCCTCAATATTTTAAAGCAATCAAACGCTG
CTATGTACTGCTTCTTCTTGCTAGTCATAAATGGTGCTTGAATAGACCCTCCATTGAGGA
GATGGAAGTCATGATCTTTCGTTTCTGACACATCGAACTTCTCATTCCATTCTCCTTTGA
AATATAGTGCATTGGACAAGATCAGCCTTGTCATGTTGTTCACTGCATCGCGAGGAAGA
ATCTCTTTGATAAGACCATTTGTCTCCAT
SEQ 21
TTAGAAAAAAAGCCAATGCTTCTTTCTGCGCACTCTAGCTGGAACCTCTGTGGCACATT
ATGTTCAAAGTAAAGCAAAAATTAGTATTCAAGAATAGCTATGACAAAAAATTCTGAACT
CAGAAATAGTTAAAGCAAGAAGACACTTACCTCCAATTTCGTATAACGAATTGTAGTGAA
CTTCGCTCCAGAAACTCAGCCACAATTCTGAAAATCAAGAATAACAAGTCAAAAGTTTTG
TCTTCAAGAGATTAAACATATCGAGAAGAAATGTTCGTCAGCGCCACACCAACACATTT
CCGAACAAGTGATTTGGAATTTATTTACTTGTTAAGATAAGAGTTAATCTGATCAGGTTT
ACTCAGTCTTTCCTAAGATCACCATGGATGTGTTTGTTTATGAAAATATAATCAGCAGAA
CATCATCAAACATGTTAAAGGGGAGCTTTGGAGCAACGGTAAAGTTGTCTCCATGTGAC
CTATAGGTCACGGGTTCGAGCCGTGAAAGCGGCCACTAATGTTCGCATTAGGATAGAC
TGGCTACATCACACTCCTTGGGATACATCCCTTCCTCGGACCCTGCATGAACACGGGAT
GCCTTATGCACCGGGCTGCCTTTTTTTAATCATTAAACATGTTCAGCATATTCAATTTCTT
GAGAAAACAATTTTAGCATAACAAAAGAGAACTATTAAAACAGGAGCGACGATCTGATG
TTACTTTATTGGCAGCGGATCAAGTTTATCGTCTGTAACAAATAACGCTTAATAAGTGCT
TGCATCAGTAAATTGTGTGTTTCACGTGAAGCAAGACTCTGGAGATAAATTCTCCTGATA
GCAACTAATACTGCTCTTAGTAAGAGCATCAGGAAACTCTGTACAAAGCTTACATTCACT
AATTGTAAACATACAAATGATCCACAGTTCACAAATGACAGCGAAGGATTCTTGTCATTA
GAATCAGCCTCGTTCAATGATGCAACAATGTCATTACAAATTAAAGGTGTTTTAGTTTTA
CTATTTAAAAATTAGTTAGTACCAATTCACTTCGACATACTTCCACTCACAATCTATTGTG
TGTGGTACCAAGCAATACGATGATACTTTCTCCTAAATGAGAAAAGCACATCCATTTGTC
AAAATAAACAAAATGATTCATTTTTGTTTTCCTTTTTTTTCATTTTTGTTTTCCTTTTTTTTT
CATTTTTGTTTTATATATGTATTCTTATAATAAAGTGGGAGCATGCTAGAGAGTTCAAAGT
AGGACCATGCTACAAAGTTCAGAAGAATACTTTTGGTTTAGTGTCTCAAACAAAACCAG
CAAGTACTATTATTTAAATTCTGGAATTTATATCATAATATCATTATTTTAGAGTTATTTGT
AAATTTCAAGTATTTATTTTATTTTTTGAGTTTAAAAACTCAAGCAGGAAGTTAAAGAACT
AATAATCAGAAGTATTGATTGTGCCATGATCATTAGATACAAATAACATAAAATGTATGTA
CCCCTAGAAGGTTGTATGTCCTTGGGAAGAATATCAATGTAGCCGTTGTCTCGGAAAGA
CGTGACCAAACATATTTTTACCCCAAACTGCAAACAAAATACAAGCAAACACAAGTACCA
TCAGTCACCCAAAACGGGAAAGCAAAAAAATAAAAATGCAGAGGGCATAAGGGCGAAA
GGCTGATTAACAACACAAATATCTAAAGCAGTTCCACAAAGGTAAAAGAGCAAGAGAAT
GGTCATATTGCGGGAAATTCTTTAGGTAGTAGATCTCCTGCTTTTAAGTTGCTTTCGCGT
TTGATGCTATGTTGAACTTCAAATATATTCAAGTTTGAACCCATAATTTCTAAAGTGTAGC
AAATTTAGTGGTAAGAACCTAAAAGCTGAACCCACCAAACTTAAATCCTGAATCCGCCTT
CGTATTTGATGTCTTAGACATTTGTATGCCTTCTAAGCGTAAATGTAGGTTGACAAGATG
AAACAGGGCAAATTTAACTTTTTTTCTCATTTCTGCTCTAATATCTACAGTTAAATTTCAG
TAACCTACATGCAACACAGGCATACTGCACAAAAAGTCAGGCTAAAAAATAAGAAGATG
CCGGCGTTTGTAAGAAAGTAAGAGGATTATTAGAAATGTTCTTGGTTATTTTCTCCGACA
GCCTTGGGCAAGAATAAAGCTCTAAATTTCACCGTCAAAGAGTTCTAATCAGGAACCAT
TTTAATACATTGAGGGGTTTTCTCTTAAGGAAAAATAATTTTCATAAAAGCCAAATGGTTA
AAGGTTAACAGCTGAGCCGAATTGCCTTAATACTCATAACTCTTAATGCTACCATATTAT
ATACCAACACCAACTCAAAAATTCACTTCCTAAAATTTTTCAACTTTTTCACAGGAAACTA
TTCCACTATGTGTATTATGCAATGTTAGATACTATTAACAAAGTTGCTTCATTGTTTTTTTT
CTCATTTAAAAAGCTTAAATTTGCAATGCAACAGTTCCACTATCCAGGCAATACACCTTT
ATTGACAGTATAATTTGTTGGCTTTTCTTGTTCGTAGTGTTCAAATGCTGTCTTACATACT
AACAAGTGTACCGTATATATGCCTCAAAAGCTAGTTGAGGACAAAAGAGAAACTTTCAA
CTTACTCTATCGGCAGCGGCTTGTAGAGTGACATGATCTCCCCACTCTCCCAACCTGGT
TAGAGCAAGAATATGTTGGTAATATATTGAGATTTCCCGCAATAAGAAGCACTGAAAATT
GACCAGACAGCTTACCTCTTCATTTTCCTCAAGTAGCTTTTGTATCTCATGGGCACATAA
CCTTCATATAACTTTCTAAAGCGCTTTAGCTGAAATGGATTGCCAACTAGGATAAGAAAC
GTGTAGGACATAAACTAGACATAGGCGTCAAACTCAAGCAAGTATAGGACTCTGTTGAT
GTTTAATAGATCCAGAAATTAGGTTAAGCAGTTGAGCTGATGAGTCTCAAGCAAACACC
GGGATATTTTATTTTTAGTACTCTATTATTATGCACAAGTTACTTTAGTTCTACTAAAACT
CACGAAAACGAATATAAACTGCAGGATTTGTTAATCTAGGATGAACACAATCCTCAAACA
TAATCATTATGTAATCTCAACTTTTCACCCATCCTGTTTCACAACCAAAGGCAAGAGATG
AGAAGCAAATATTGAAAAAATGCAAACAGACCTGTTTAACGACCTCCTTCCTTACATGCT
TATGATACTCTGGATTATGATACAACTGATCCGAAAGGGCCCGAAACTACAATAAGCCA
AGTAGACAGTCATATAGCAGGATTAATCACCTGTGAATTAATAAGTAGCAGGTGGGTCT
CAAGTTCAATGAGAGTTTGTCGACTAGTCAAAAGAGACCGACCTGGCAATTTCCATCTC
CTTCAATTTGCATTTCAGCAAGACCATATGTCGCTAACCTGAAACAAAGAAGTTACAAAT
AGGATCCGAGTTGCCAAAAGGCGAAAAACATAGAAAAGCACCAAGAAATTTCACCCTTT
AAGTGCTAAGTGCAATTAAAGCATGAATTAAGCTATATAGAAAGACACTAATGAAGAAAA
AGATAACCTATGAGCAGAAACTATAATAATTTGGACAAAGAATTGAAAAACTAAAATTAA
TCAATAAATATGAAGTAAAAATCATGTCAATTAAGTTCCATAATACTGGATTCACAAAATA
AGTCAACCAATGACCTCAAGCTGAACAACTATAGCATGCTCAGTAGTCAGTACAATGAT
GGATAACCTGCTAGAGAGCCTCCCATGGTCTAGTGTGGCATCATTTGGGTCAGGTATCT
CCCCAATTACCCGTGGAGTGTGCTGCAAGGATAATAAGACTATAAGTAGCATGTAGAGA
TAGTGCAACATGGGTTTTCATGGGTTAAAGCTCAAAAAGGAATATATAGAGAAGCAACC
TTTGGATACATACATGTGGTCACATTATACTCATTTAATTTGAAAGAGTTGCCCCCAAAA
ATGTAAATTTAGATAATTGTTCATAAAAAGGTCAAAAGTTCCCCGGATTTCTAAACTTCTT
GAACATTCTCCTTCTTCCTCTATGGATAGGGTGACGTGACGCGCAATGGATCAGTTAGA
AAGGAGAAGAAACTGATCAAGTGTCCCCTTCGCAACAACCTTTTTCCCTTTCTCGCATC
CTTGGTCAAGCTTGAGATTTTCTTCTATCTGGATTTAGTTACCCGAACCTATGCAAAGCA
AGCCTACACTCTAGCTGGGGCCAAGCCATCTTTGCCTAAGTTCCATTACCTCATTAACT
CGATAAAGCTGGCATTAAGAGAAGCTTGGTAGCATAAAGTGTCAAGGAGGTGGTTTTCG
CCTATCTATAAGGACAGTTGGAGTTGATGGTCCTGGTTTTGTTACAGTAAGGGATGTTG
CAGGAATCACTTGCATTGATGGAACAGAAGCTCTTCCTGCTCTTGTATTCGACGAAGAA
TAGGCCCAAGGGACATCAATGGACAATCGCCAAGCACATGGTTTAGCTAATTCAATAGC
CTTCGGAGAAAAGTGATAAGTACCATCGAACAAGCAAATCACTTCTGGATTATTAACCTT
TAAGGATCCACTTATGGTGAAATTTTGGAATGATGCGGGGATAACCCAAGGGAGTCAG
CTAAACAGAGACGGAAAATATCAGTACCATTCTGGTCGGCGTATGCTTGTTGAAGAGAA
CAAGCTCACTTCTAGCTAGCTAAAGGGAAAGAGAGATACTATAGAATACCTAAGTCAAT
TTATCATTGAGCTGCAAAGATCATAGCAGACACTACGAATTAATAGCAAATCTCGGCAC
AAACCTCTACATGTCAAGATAGTGAGTCCATCAGATCACAAAAGACTTAATACCAGATAC
CTTTCTAGATAAAGGGGTAAAATCACTTTCTTTCTTCCACAAGATCTCCATTTAAAGAGA
CTATCAACTGTACTTAAATGTAATCATATCAACTGTACTTAAATGTAATCATACCGGGATT
GAGTCCAAATGAGAAAGTCTCCTCCCAAGTTTATTGCCACCATATTTGAGAGCATTTTCT
TCTTCTTCAGCTAAAATTCTTGCAATGGTATGATCATCCTCAGTACCGTGTGAACTACTA
TTCAAACTAGATGTAGTAGAACTCGAGCTTGCTCTTGAATTTCCATAGGATTCATTCAT
SEQ 22
CTAGAAAGGGTAAATGAGACCTCCGAACTTCCCAGAAAATGCTTCCTTCTGAGGCTGTT
GCACTATAGTTGCATTGCTCATTCGCTGCATTTAAAACAAATTAACTGTGAAAACTACAG
TAGCAAAAGGTTAAAGAAAACGAACATGAATAGCACGTCAAGAGAAATTGGCTTTGCTT
TAACGGTTATTTCATCTCTGTCAACAATGAAATGGCAATGAGTGAACCTTTTCAGAATAA
GTTGGCTTATCTCATTAATGAGAGACAGATAACAAGAGATGTCTCCTCTAATCTCTAAAT
TGATATTTCATGTTGTATGGATCCTAATAGGATGAGAAATGCATCAAATACAGAAAGGAA
TGGCAGAAGTGGAGATATACCTTGAGGCAGAGGTTCCGACTTGTGTCACAAATAGGAT
AATCATGTGGGCAACAGTGGCGTCCATCTTTGCAACACACTGCAGAATCCAACCCACAA
CATTTCCAAGAAACGCAAACTCCAAGAAGCCTCCACCCACAGCAGCAGGTTTCACCTTG
ACCACATGAGGTAAACATACTGCATTTGCTTGGACCAGGAGATGGAGGAGATGGTGGA
TTTGGGCTACTCTTAGTTGGATATGAAGCTAGCTTATTGATCCCACATATCCCTTCTTGA
TTCCCACTATTACGCTGCATGTGCATATAACCATTTATTCCCCAGCTTGTTCCCCATGAA
TTTTTTATAATCCAGTAATCAACTCCATTTTCAGAACCATAGCCCACAATCAGTACCGCA
TGATCAAGTACTGTAGAACACGGTCCAGTGAATATCCCCTGCAACACAAAGATAGCACC
TTTATATTTCTCTCCGACAAACAATTTAACTGATTAGGAGATTGGTAATTTGGAGATGGA
AGATACCTTTGAATATGATTGAAATGCTCTCTCACTGCCGCATATCCCAACACTCACGG
GTTGATTTGCCACCGCCTTTAGAAGCTTGTCCTCATCATATTGGGGAACATCAGTATATC
CATCAATGGTTACAACACGTCTTTGTAGCTGCAAAGTCGACAAGTTAAGCCAAGCAATC
ATATGTTAACGTTGTTCATTATGTTTTATCTGGAATAAACTTGTCCTAGGTTCTCTATATT
AATTATGAAATCCAGAAGCAGGAGGGCATATATAGGACAAAAAAAGATTTAGTACAAGA
ATAGGAGCAAGAAGATGAGAGAACAATTATGTACCTTGTTTTTGTTGCATGTTCCTTCTC
TTTCATTAAAGGGGTAATCCTCTTCAGTGTCAATACCACCATTCTTTTTGACAAATTCAAA
AGCATAGTCCATCAATCCACCTCCACAGCCGTCATTGTAACTTTTGTCGCAATCAATTAA
CTCCTGCTCAGAGAGACTTACAAGAGATCCAGTGACAATCTTATTGATACCTTCGATTG
CTCCAGTGGCTGAGAATGACCAGCAAGCACCTGGGAAAATGAAACAGAAGTAACTGGT
TTTAGTTACAGAAGCTAGTTGCTGAGATTAAGTATATGGAATGACAATAGAATGACAGTG
TTGTGGACAAGGGCAAATTTGATTCATATTATCTCAGAACAAATTCACAAAAAGGCTAGA
TCTTCACTTCCGTCCTATATTCAGGCTGACATTACCAGACATATCTACAGAAAATAATTA
CTTGAGAAACATAAAGGCAGTGATAAATTTTAAGAATAAACTATACTTAGTGAGAATTGT
GTGCAGTCATAAAAGTAACAAGTCTAAGTCCCTGAAGCAAATTCTGCATTGGGAGGAAA
GTATATTTCCGCGTATATGACAGCCAAATTAGTTGCTATAAAACATCACACTAGTATGTG
ACTCAATATTGACAGTAAAATTATAAAACATGTTCCTTCGATTGCACAAGCAAGTAGAGA
ATCATAGGGTACAATTGTGTACACAGTTCCAAAAACAAGAAAGAAGAGCTAAAACAATG
AATTGTGAGTCAATGATTCAATGGTCGAAAACAGGACCAAGAAATGGATCAGTGGATAT
TTATATTTATTTCTATTTTTAAAACTTAAAGGGCATGATGTGAAGACTGAAGCTGGTAATC
CAGTTTTGATTGTGATGCACATGAATGGATGTGGAAAGTAATATTCTCTGGAAGACAGA
AGACCACTAACCACCTCAGTTGCTCAGACCAAGATAGTGAGTAGATCTCCCCTAATCTA
TTCAACAAAGTTCATTGGAAAGAAACAAACAATGAAATGGCGGATCTCCGAGCAGTCTG
GTGAAGTTTATGGTCCAGTGGTTAAAAAGAATGACAACTCAACCATATTTTACTCCTCCG
ATGTGCTCCGTTCATGATCTTTTTAATATTTCTCGCTAATCCGCTAATCAATAAAGAATGA
GATACTGTATCAGTATGTCCTATTATTGTTGTCTTCCAGTCACTCTGAAGAAATGATTTTC
ACATACATAGAGACAAAAATTGAAAGTAAGAAACAACAACAACAAACCAGTGGAATCAC
ATAAGTGGGGTCCGGGAGTATAATGTGTACGCAGACCTTAGAGGTTGTTTCTGATAGAC
CCTCGGCTCAAGAACAGTGAGAAAATTGAAAGTAAGAAACAAACAGTATATTCATTCCTA
ATCAACTCATGAAAGGACGAGCTCATGAGACTAAGTTTCAACAACAACAACCATATGATT
GTTTATTCCACTTCATCTTGATTCCAATACCTAATAATTTGTCTTTTGGGGCAACTCAAG
GGTTCCTAAAGCTAAGAATTCTCTAAATCTCACACTTCTCCTTATACAAACATTCAAATCC
TAACCAAACTGAAAGTGCTCCTGTCTAATACTGATGAACTAAAGTAAGTGCTGAGGCTA
GGTTTCAATGAAGTAAATTAGTCCTGAACTTCAACCTGTCAAATAATACAAGGAAAAGCA
AAAAAGGGTAGCTCCCAGACAAGAGAAAAAGGCAAAACTAACATCACAAGTTTCCATTG
TCATTTGAGAAAAAAATCATCAAAATCCAAACTTTGTAAAAATTTCTAATGTTGGCTCTAC
TATGCACAAGTTATATATCCTCCACATAAATGAAATCACTATAAAGATACAACTAAAAGAT
AACGCAATAAACTGAGCATACCACAACTGCCTTGATTCTTGACTTTAGTAACAGCTCCTT
TCTCTCTCCAATCCAAAGAAGAAGGAATATCAACAACACCAACATCATTAAAAACTCCAG
CAGAAGACGACCCAGTTTTCAATCTAATAAAATCATTAGCAGAAGAGGACAAACCCAAA
AAAGAGTTCTTGAATTCATGGTGAGTGAGATCAGAAAAGGCATTGAGATTAAGGGTATA
AGTGGAATTCCCCTTACTATTATGCTCTATAATATAAGCATAATTTTCTTCAAACACCTCG
AGTCTGTACACCCTTTCTTGTTCAGAAGAATATGTCTTTCCATTTTGCTGACACCAACTT
TCAAAAAGATCAGAAATTGATGAACAAGTGCAAATTGGTCCTTGAAAAATTAGAAGTACA
AGAACCAAAGATGGACATAACCAACTCAT
SEQ 23
TTAATGCTTATTCCAGAAACTCCACTTCTTCTTCTTCTTCTTAAAGTCAAATGGCAGGAA
GTCTGGAATGGAGCATCAAACTACAGTATTAGAATAATATGATAAGGGTAGTGTGTACG
CCGGTGGCGGACCCAGGATTTTGTGCAAGCGGGTTCAATCTTAGAAGTATATAACTTTA
GTTGTAAAATAGTAGTTGTCAAGTGGGTTCAAATAAAATATTTAAACAAAATTTACGCAG
CTTTAATCCTAATTTATACATATATACAGTATTAGTTTTTGATGCTTGCCACCACGTGCGT
CCACCACTGTGTACACATACCCCTTACCCCCTACCTTGTGAGGATAGAAATCTAAATGA
AGCAAGAGCAGCAAACTTCCGGTCAACTCTCAATGTTCATCGCTTTGTCCATAAGCATG
TGATAAACAAAAAGTGTTATTCCGTAATGCCCATGGTAACCCCCCCCCCCCGGGGGGG
GNGGTTTAAACATGTAATTAATCAGATATAGGCCAATTAATAATAGTTGAGCGACCATGC
TAAAACCACGGAACTCCGGAGTACCTAACCCCCCCCCCCCGGGGGGGGGATGTCCAT
GCTAAGACAAACTAAACAGAAACGGGACATAAAAGTACAAGCAACTACCTCCTTGAGGA
TAGATTGAGTTGTAGTGCACCTCTGCCCAGAAACTCAAGTATATGACTGCAAGAAGAGA
AAAATAAGAGAAACAATTAAATTGGTGTGAGCGAATGAATATCAAGACTTCAAAACACCG
CTCTTACAAAGTGCACATGCACAAAGAAATTGCATTCATACTTTAATTTCTCTTCCAAGC
AATCTCAAGATTTGCTTGCCCACACTTGGCTTCATAGTATAGGTATGATACAATGGCATA
GAATAAATGACATGCATACATAACCAATATAAAGCTTGCCCCCATTCATTAAACTTACAA
CCGTCTCATAATCATTCAATATGTTAAAACAGACAAATTCCGGTCTCTAAAAGGAGAATG
TGAATGTCAAAGCATCCATGTTATGAGATGGAATTTAGATTTCAAAAGAGCTAAAACGGA
CGACTCTTCAAAAATCAAAATCTCCTTCTCATGAAACGCAAAATCGAATTTGCTTAAGAT
TGTCCTTAAGGGTTCATAGTCATCCATTCATCCCTCCTTCCCCTTGCGCAATTTTTTGGT
CAAGGCAGGCCGAGGTACTAACTTTACAGTCCAAAGATCAAAAGTACTATTTGCATTCT
TCACGACTCATGTAATAAACTTATGTTGTCTCTTTAACTCCAGTGGTCCTACTTTATCAG
AGTCGTTATCTGATTTTGGACTTCTGAAAAAGTTTGATACAAAGATCGTACTAACTTTTC
CATAGTTGGCACAAATTTCAAGAATCAAATTCATCCAGTAAATCAGGTTGCTCTGGTACC
AGCTACTTCTATAATTTAATTTACTACTATTACTACAATATGCATAATCAAATTATCTGCTT
CATCTCCATGTGTAGCCTGTGTCGTCTGAAACGCCAATGGGGGAGTACTAATTTGGTG
GTTACGAATTATCATACTATCTCCCCTTTTTGAATTGTTGAATTTGGTCCTGAAAAATGTT
GCGGTTTTGGCTAAAAGTCTAAAACTGCATTGCAGAAAGTATCAATAGAACGACATAAG
AACTCGATAATGGTTTCTCAGTTAGTGGAATTACAGCTGAGGAAAGCATCTTTAACCGC
AAACTGGAAATACGTACAGCATTAAGCGACTCGTGACTTTTTATTTGAGACACATGGAAA
TTGAGAAATAGGATCTATGTCACTCCCACTTCCAAATATTTTTGTAATAAAAACTTGTTCA
ATCGCATTTTGTGAAGTAGAGGATATTACAGAAAGGTAAAAGCAATTCAAAGTTTGAGAA
CTAGCCTACCTCTGTTTGACTTCTGATTCTTCGGAAGAATCTCGATGTAACATGTATCCT
TGAATGACGTTATAACAAGAATTTTCACACCATACTGCCCACAAGACAAAAAGACAAAAT
CAATGTGCAACACAAGTAGGTTCATTCACAAACCAATGCATCCTAGTCTAGCATCATCA
CAAATAAAATCTTCATAAAAGGAGCTGCGCATATACAATAAATAAAAAATGCATCATAAC
CACTCAAAATGGAGAGTGAAAGAAAGAAATAGCAAAATAGAGGCACATGAATTAACAAA
AGCTAGTAAAGCACCCAATGGAGGCACTATACCAGGACATCCAAATTATGGTCTGGCCA
ACAATAGCTTAAGCTTCTTATATTCCAAGGTTAAAAAGTAAACCAAAGTAATCAAATGGA
GAGAAAAAACCAAGGAAGCAAATAAGGGGAATAATCAATACCGAGTCAGCAGCAGCCT
GCAACGTAACATGATCGCCCCATTCCCCACTCCTGGTTCAACCAATATATAAGCAGATG
TTGATTTAAAGGAGAAAAGATAATGCAGAATTTCAAAATGGAGAAACAAAGGTCTGAGA
ATTACTTGGACATCCTCGTCAAGTACTCTCCATACTCCATTGGGACATATCCCTCATACA
TCTCCGGATGATGTTGAAACTGACAAGACAAAATGATTTTAAAAAACATACTCAGCTAAA
ATGTATGTGAAATAATTCAAAAATGAAATCGAAATATCATAAAGAAGATTATCTTATTCAC
TTAGACAAGCACCTGGCTGACTACTTGCTGTCTGACAAATTTGTGGTGCTCTGGTGTAC
GATAGAATTGATCTGATAAAGCACGGAACTGGAACAAAAGAGGACATGTGAATAGATGT
GCATTAGAAAGAAATAGGATGGAACCTTATTTCAACAAACTTAGAATCGGAGGCGAAGC
CTAATCCACTTGGGAAAAATAGGCAAAGTTCCCATACCCCATAAAACTAAAGAGTTGAG
AAAAAAATTGAATTATCCTTTTGTCAAACTACTAAAAATCCACAAATATTTTTAATAAAGTT
GGAATGGCAAGCAGCCTGACAATACGTCACTTGAGTTTCATATTCCAATTTTTTTAAAAA
TCCTATTATGAACCACTTCATTCACCATTTCACTGTCACAAACACCAACAACAAGTTTCT
ATCAGAGCCAAAGAGTTAATTCCAAAGTAAGGAATACTGATAAACCGTCATCAAACATTA
TGTTTTTTGTTTTTCATTCCCTTTCTTCTTTGAACCAGAGAGTAAGACCTCCATACCACCT
AGCACCTTGATATACTGAGATTTTTCATGAGACAAACTTAAAGAATTGGGGATCCTTCTT
TTGTTTATGGTTAAAAATTATGCTACAAGTAGTTTAAGGAAAGGGAAAACAATTTTTTTCT
TTCACTAGAAATCAAATCATGAGCGTCTCTAGACAGTTTGATATCATTACTGCAAGACAA
ATCAACCAAGTAATACAGTAACCTGGCAGTTGCCATCTCCTTGCACTTTGTGCTCCACC
AAGTCAAATAATTGCAATCTGAAGCAAGAAAAACCATAGCATGGGATCCTTAGCAAAGA
ATAACTTGCAGCAATAATATTACCATTCAATTGCATTGGCAAAAATATCAACTTCACAGG
AAACTTAGGTGGCACAAAACCTAATAAAAAACACAACAATATCAACTTCATAGGAAGCTT
AGGTGGCACAAAATCTAATAAAGAAACACAACAAAAGAAAATAAAATGAAAAACCACATG
CATTGTCCTGTGACTTTAAGAAAACAAAGGGTTAATAGTATCTATTGGAATGTGCTTCAT
AAGTGTTTATTTACAAAGCACAAAATACTAGTTGGCTCATTCCACACCAATTATTTTCTGT
GCTTGTCTCGGCTCCCATCTCTCCCATTTGATTCTCGTTTTCTCAAAATGCTTGAGGGGT
CAGATGTTACTTTTCGAATAGCAGGCAGTAAGAACCACCAGTAAAAGAATGAGTTAATA
AAGAAAGAGAAAACCTAACAAAATAAAAATAAAGAAATGTTTCTGGAGACACAGGACCC
CTATCACAATAGAGGTATGCATTTCTCACAGTGAAGCTATATTTCATTACCTTGAAAAGT
GCTCAGAATTTGGAGAGAATTAAGCAAAGCACTATCATAAGATCATATGTCATTAACTTG
TTCATACACAATTTTTATCTACTTGAAAACCTAAGTCGGAGGGATCAGGAGATACCTATT
TAGCAGCCTTTGATGATCAGAAGTTGCTTCATCGACTGAAGGTATGTCCCCATTTATTCT
AGGAACATGCTGCATGGTTAATAACACAAATTGTTATAAGAAACTAAAGCAGCCTCAAG
AAAATGGCCATAGGTGCAAAGCACCACACATGTCCTCGTACACAAAGTGAATTAGTGTT
CTCAATATACTAACAGACACTACACTTACAGGAACAGCACTCAGCTGGTTTATTCTCTTC
CCTACTTCTCCATCAAGCTCAAATTCATCTTGTATTTCCAATGTGTAGGTATACTCTTCTC
CATCGTATGATCTGTCTCCAGGACTCGAACAAGAACTTGAAGGCCCTACATCATCAGCT
TCTAGGCTGGTGTCATGCCCTGTGAATCATATATAATTAGCAAATGTTTAAACTCAAGGA
ACATCACAGAATTGAAAACAAGAAATGTACCAGCATAATACTCTCTTGGAGGAGTATGC
CAATGTTGTACACCAGTGGAGGCTTGCAAATACTGCTCGTCTGCATGTGAAGATTCAGC
ATCTTCTGCGATGGACAACTCTGACAAATCTTCTTGTAGAACATGAGCAATAGCCTCATC
ATTGTCAACATTGCAATATGATGTGTGATAGTGGTTTTCTCTGGCATATTGTTCATGACA
TATCTCAACGTCATGCTTTCTACCATCACCGTAATAGTTGGAACTAAAAAGTTGGTCCAC
ATCGAGAAAACTAAGAACGCCACGAGCAGCTTCAGATTCCGGCTCACACAT
SEQ 24
ATGCCTTCACTTCTTCAAATTTTCCTTCCTTTGTTTCCATTCTTTTTCTTGGTTTCTTTCTC
AGTTTCTCACGGACCCTTTTTGCCAAAGGCCATTATTCTTCCTGTAAACAAAGATCTGTC
AACTTTTCAGTATGTTACTCAAGTTTACATGGGTGCTCATCTTGTTCCTACCAATTTAGTT
GTAGATCTTGGAGGTTCATTTCTCTGGACTAATTGTGGCTTAACTTCTGTATCTTCAAGT
CAGAAACTTGTCCCCTGTAATTCACTCAAATGCTCAATGGCTAAACCTAATGGTTGCACT
AACAAGATTTGTGGTGTACAATCAGAAAATCCTTTTACAAAAGTGGCTGCAACAGGGGA
ATTAGCAGAGGACATGTTTGCTGTGGAATTCATAGATGAGTTAAAAACAGGTTCAATTG
CTTCAATACATGAATTCTTGTTTTCTTGTGCATCAACTACTTTGTTGCAAGGTCTTGCTAG
AGGTGCCAAAGGAATGTTAGGACTTGGAAATTCAAGAATTGCATTGCCATCTCAGTTGT
CTGATACATTTGGTTTCCAGAGGAAATTTGCTCTCTGTTTGTCTTCTTCAAATGGTGCTA
TAATATCTGGTGAAAGTCCTTACTTGTCACTTTTGGGTCATGATGTTTCAAGATCTATGC
TTTATACACCTTTGATTTCATCTAAAGATGGTGTTTCAGAAGAGTATTATATCAACGTTAA
ATCCATCAAAATTAATGGCAAGAAACTGTCGTTAAACACATCTTTGTTTGCAATGGATGA
AGGTGTTGGAGGGACAAAGATTAGTACAATTCCCCCTTTTACCACCATGAAAAGCTCAA
TTTATAAGTCATTTATTGAAGCTTATGAGAAATTTGCTATTTCCATGGAATTGAATAAAGT
GGAAGCTATAGCACCATTTGAGCTTTGCTTTAGCACAAAGGGGATAGATGTCACAAAAG
TGGGGCCAAATGTGCCAACTACGGATCTTGTGTTGCAAAGTGAAATGGTTAAGTGGAG
GATTTATGGGAGAAATTCAATGGTGAAAGTAAGTGATGAAGTGATGTGTTTGGGATTCT
TGAATGGAGGGGTGAATCAAAAGGCTTCAATTGTTATAGGGGGTTACCAGTTGGAGGA
TAATCTTTTGGAGTTTAACTTGGGAACTTCTATGCTTGGATTTACTTCTTCACTTTCAATG
GCAGAAACAAGCTGTTCTGACTTTATGTTCCATTCTGTATCAAAAGATTCAGCTTTTGAT
TCT
SEQ 25
TTAAGAAGAATGAGAAGTAAACTTATTTGTTGAATTTAAGAGGTAAGAATATGCAAAAGT
AGCATGAATTGCAGCACCAATTGGAAGGACATCCTCATCAATGATGAAATGTGGATTGT
GTGGAGGGTAAATAGCACCAATCTTTTCATTTTTTGTTCCCAAAAGGAAGAAGGAACCA
GGAACTTTCTCTAAAAACACTGCAAAATCTTCACTTCCCATGAAGCTAGGTGCTATTTTG
AAACTCTCTTCCCCAACAATCATTTTTGAAACTTTTCGGGCATGTTCGTATATTCTCTCAT
CGTTTATTGTTGGAGGAAGTGTTGGATTTTCTCGACCATCAAAGTCAATCTCGACCGTA
CATCGATGTACTGCTGCTTGTGCTCGTATCACCTGAAAATTTTACCAATAAAAAGTTTAA
TTACCAAATATTGAATATAATAATATGTTCTAAAAATAACATGGATGTCTATTCCTAATTAT
TAGCAAGTTATTTCATTCTCCCTAGTTGATTAGTGAATTACTGAAAGGTTATGGCGTTCT
GATATTGTTAAATGTACCACTTATTTTATTGAAAAGTTATATTGCATCTCTAAAGAACTGA
AAAGTCATTTGACCTCTTGGCTCGTGACCCTTCTCAAAAAACAGTTCTTTGGCTATAAAT
AAGATTATTTTGTGTTGAATGAATATATCAAGCAACTTGAAAATATTAAAACCTCTTTCCC
GAAACAAATCCTACAATTTCCTCAAGTACTCGTCTTTTCAAACAAAGTATTAATGAAAAA
GAAACGTAACTTGTTTGAATAAATAAAATTTGCACATATAAACTTTGTAAAGAGACGTGT
GGGACTTTGAATATTGGTCAAAGTATCCAAGATTTTTGTTCTAAATTACAAGTAGTTTAC
CTCTTCAATTCTTTTCCTCAAACCGTAGAAACTCTTCTTACTGAATGCTCTATAGGTCCC
GGAAATTGTAGCTAATTCTGGTATGATATTAAATGCATGCCCCCCTTCAATCATGGCAAC
AGAAACTACCTGAAATTTCAAAAACTAAATATATAAGAATTGATAAAATAAAAATTTAAAA
TTTGTTTGAATAAGTTTGGAAAAGAAAATTGTTAATCTCTATGTTTCAAAAAGATTGTTCT
AGTTTGACTTGACACAAATTTTAATAAGGAAGAAAAGACTTTGAGATATGTGGTCCTAAA
TAAACCATATCATTTGTGTGACTGTAAAACTTTTGAAACTTGTGATCTTAAACTTACTATA
ACATTTGTGTAACTATAAATGCTTCTAATAAAAAAAATATTAAAATTTGTCAATTTTTTTGA
AACAGACCAATAAATAAATAGTGTCAATGCTTTTGAAACGGAGGTAGTACCTGGGATTC
AAGAGGATCAGTCTCTCTAGAGACAATACTTTGCAAACTAATAACAGAAGTAGAAGCAG
CCAAAATTGGATCAACAGAATCGTGTGGAACAGCAGCATGACCTCCTTTTCCTCTAATT
GTAGCTTTAAAGCTTCCACATCCAGCCAAGAATTCACCAGGCCTAGATGCAACTACTCC
ACTTTCATACTTATGAACTAAGTGCATTCCAAAAATGGCTTCCACATTTTCAAGAACTCC
TTCTTCTATCATATCTTTAGCCCCATGCCCTCGTTCTTCAGCTGGTTGAAAAATTAACAC
CACTGTTCCCTGCAATATTATTATACACGGTAATTAAATTCATTACTTCAACTAATCCATT
AGCTTAGAAGTATGTATTTAGAGCTTAATTAAGGGTTTTATTTAACCTGTAAATTGTGTCG
GAGTTGTTGTAATATCTTGGCAGCACCAAGAAGCATGGCAGTATGGGCATCATGAGCA
CAAGCATGCATTTTTCCATCAACTTTGCTCTTGTGCTCCCATTTCGCCAATTCCTACATT
AGAAGAATTCAACTTTGACTCACGACTCTTTTATTGATCAAATTATTCACTTTATAGATTT
TTGAAGAATTGATTAATCGAGAATAAATATAGAGTCCTACTGTAGAGGCATATTATATGA
TATTGACCTCTACAACTTATAAAACCCGACTTATGATCTTTATTTTCTTTTTCTGTTGTTTA
GATCACAATTGATATTTGATGTTCAAATTAAATGTTTTAGCGGTGTAATATTATTACTTAT
GGTACTTTCGGCCATCCTATCCAATTTTACTACTAGGAAAATAAAAAACGTGTTGACCCT
TTATTCCACAACATATGAAACTAAAAGTAAAAAGAGATGGTCACCATAGAAGAAAACTAG
CTAAAGTATATACCTACGAATTGAAGTGTTTTCTCTTTCCCAATGAAGTTCCAAAATTCAA
GAATCTCTTTGTTTTAGGTATAATTAAGCTGTTTCGAACTCTATACTTAATTCAATATTAA
GAGAGATCTGATTTATTACTTTCCTTTCATGGCTTAAATATTACCGCCGCCGCCGCCATT
TCTGACAAAAACGGAAAGTAAACTGCCGCAAGTAATTTCTTCTTCTGCCATAGTTAATTT
AGTCGCCCACAAAATTAATAAAATGACTCAAATTTACTGCCTACACCCTAGTTCCGACCG
AATACAACATATAAATGATCCCCGTGCTGTTGTCATCTCGAACATCCTTAATAACAATCT
CCAAAACCATTAATGAAATACAGACAAAGGTAAGAAGTAAATTTGAAGATATATAGTACT
ATATTAGGCCTATAGATCTACCTCCTAAACTCCACAAACTGTTTAAAGTGAATAAAACAT
TTAAAGAGTTCATATCAATTTTTTTTGATATGAAGAGTTATCCGTGGTTATAAATGAACTA
AACGTGATACTAGTATAAATATTCTTACCGTTTTTTGTTTTGAATATAATTGCAGGGTTGA
GAAAATTTCCAAGCAGAGACTACTAACCTGAATAGGCAAAGCATCCATGTCTGCTCTGA
GAGCCACAAATGGCGGCTTACCGGAGCCGATGGTGGCAACAACTCCGGTCTTAGCCA
CCGGCCACCGGTACTTTACTCCCATCCGATCAAGCTCCTCTCTGATCAAACCACTCGTC
TTAAATTCTTCATAAGCAAGTTCTGGGTTCTCGTGAATTTGTCTCCTTATTTTCATCATCC
ACTTCACTGTCTCCGTAGCATTTGCTAATTTTGTAATATAATCTTTCACGTAACAGTTTTG
ATCTACCAAAAACGGATTCAAGCACTCATCATCGCCGTGACACGAAGGAAAAACAATGA
ACATACATACAAGCACCAAAATTAGAACTTCCTTAGCACCCAT
SEQ 26
ATGAAACTGAATCCTTACTCATGGACAAAGGTAAGTACTTGATTGTGAATTATAACTGTA
TTATGTACATAAGGTCGCTGCACAACACAAAATGTTGAAAATAAGATGGAATTATTAGGT
GGCAAGCATTATTTTCTTAGACTTACCAGTAGGCACTGGATTTTCCTATGCAAGAACTCC
AACAGCTTTACAGTCATCTGATTTACAAGCAAGTGATCAAGCATATGAGTTCCTTTACAA
GGTAATTAGATTCTTCACGAAATTATTAGTTAAATGTATTTTCTCCTTTGCCCCTCAATGT
TGTTCAATATGTAGTAGAACAGTCAATAATTTTATGTTGTTTGCAGTGGTTCCTTGATCA
CCCAGAATTCTTAAAGAATCCATTGTATGTTGGCGGCGACTCATATTCAGGGATGGTTG
TTCCCATCATTACTCAAATTATAGCAACTAGTAAGACTATATTTTCCCTCAAATAGTTGTG
AAACAAGTAATGGCAGCCTAAGGTAGTAAGGTGTTCTGTTCTTGTACTATAACATTTTGT
GGCCTTGTGATAATGCAGAAAATGAGATGGGAATAAAACCTTTTGTGGATCTTCAGGTT
TGTCATTTTTCTTGTATATATTCTCTTTTCCCTACGGATAAGCAGACGGATTACATACCAA
CTCAGAATTTGTAACGAAATTGTTATGAGAATGTCACGACCCAAGCCCATAGCATGTATT
GTCTGCTTTGGGCCTAGGCTCGCACGGATTTGTCTTTCGGGCTACGCCACCTCGAGCC
CCAAAAGCGCGTGCACCATGTGAACTTGTGTCATACCTTATAAAGTTCATCACTTTCCTC
TATTATTCCGATATGGGGATTCGTCTAAGGTGACATGTGCACCGCTTATTCAGAAGTTT
GGCAGCCTAGAAGCTAGTCAGTCCTACTTAACTTGCCCTCATCAGCCCCCTCCTTCATG
GGCATCACACAGAATCAAAAGTCACTGTAGAATGTGAGTTGATTTGCAAAATGTATGAC
CTGATATCTCTCGTCAAGTGGTTTCAGGGATATTTACTCGGAAATCCATCGACTTTTAAA
GGTGAAAAGAATTATGAGATTCCATTTGCTTATGGAATGGGACTTATTTCTGATGAACTC
TATGAGGTTGGTTTTCCTTTGGTGTTATATAGTACAGTCAAACCTTTCTATAATAGCTACA
TTTGTTCCGATATTTTTTGGATGCTATAATGAAGTGTTGTTATAGAGGATATATATTAGTA
TAACATAACATACAAAATCGGCTCCGAGAAAAACTTGGCTTTATAGTAAATGACTATTAT
ATATGGATGCTGTTATACAGAGGTTTGACCGTAAGATCTTAAATATCCTCCAGTTATGCG
CTTTAATTTAGTTTGCTTACATTGTCCTTAGAACTAATTGATTTCCCTTTCTCAAATAGTC
CTTGACGAGAAATTGTAAAGGAGAGTATCAAAACACTGATCCAAGCAATACACAATGTTT
GCAAGATGTTCATACTTTTCAAGAGGTTGGATCCTATTTTGAGGAAAATCAAATATCATC
TGTTTGTTTTATGATAGGTTCATTAACATACTGACCTTATGCAGCTTCTGAAAAGAATTAA
TAATCCCCATATTCTGGAGCCCAAATGTCAGTTTGCTTCACCAAAGCCACACCTATTGTT
TGGCCAAAGAAGATCTCTTAATGTGAAGTTTCATCAACTTAACAATCCTCAACAACTCCC
TGCGCTAAAGTGTCGCGTGGGTACTCATCAACAAACTCTAGCATTCTTTATGCTATTGAT
TTTTTGTTTCACTGAGATACTTACGAGAATTTACAACTTGCAATTGATTTAGAATGATTGG
TACAAACTTTCTTCTCATTGGGCTGATGATGGCCAAGTTAGAGAGGCCCTCCATATCCG
AAAGGTACGTTAGTTCTTGTTGGAAGGGGAACCTTGGAGCAACGGTAAAAATATCTCTG
TGTGATCTATAGAGCACGGATTTGAGCCATGAAAGCAGTAATGCTTGCATTATGATAGG
CTGTCTATATCACACCCTTGAGATGCGGCCACCTTGCATGAATGCGTGATACTTTGTGC
ATCATGCTGCCTTTTTTTTTGAAGAACAACAAAATTTAACAAAGTGTGCTACACAAAACTA
AAAATATGATCAATTTGATTACAGGGAACTATTGGAAAATGGGTGAGATGTGCAAGTTTG
CAATACCAAAAGACAATCATGAGTAGCATACCATATCATGCAAACCTCAGTGCTAAAGG
TTACAGATCTCTTATATACAGGTTGAGTAAGATTGTTGTGTTTGCAAGATTGGAATAACT
ACATAAATAGTTGAAGATTATTATCTCTGTGAAACTATTTACTTAGTTTTCTATGTTTTTTG
AATTAAGCAGTGGAGATCATGACAAGGTTGTTACCTTCCTATCAACTCAAGCATGGATA
AAATCTCTTAACTACTCCATTGTTGATGATTGGCGACCGTGGATCGTTGACAATCAAGTT
GCCGGGTTAGTTTATGATGAAAACATTGTACGCTAGTCATAAGCTCTGTCAAGGTATAG
AAGTTAAACTCATTTTTTGTCTTTTGCATGATTGTAGTTACACGAGAAGTTACTCAAATCG
GATGACATTTGCCACAGTAAAGGCAAGATATCTCTTTCACTTGCTTTTCTCAGTTAAGTT
TGAAGATAAAAAATTTTGTTAAATAGTTGGTGTTTAAATTGCACTATTTTGTTACAGGGAG
CAGGGCATACTGCACCAGAGTATAAGCCTCGTGAATGTCTGGCCATGCTCAAAAGGTT
GATGTCTTACAAGCCTTTG
SEQ 27
ATGTGTGAACCGGAGTCTGAAGCAACTCGTGGGGTTCTTAGTTTTCTCGATGTGGACCA
ACTTTTCAGTTCCAACTATTACGGCGATGGTAGAAAGCATGACGTTGAGATATGTCATG
AACAATATGCCAGAGAAAACCAGTATCACACATCATATTGCAATGTTGACAGTGATGAG
GCTATTGCTCATCTTTTACAAGAAGAATTGTCAGAGTTGTCCATCGCAGAAGATGCTGA
ATCTTCACATGCAGATGAGCAGTATTTTCAAGCCTCCACTGGTGTACAACATTGGCATA
CTCCTCCAAGGGAGTACTATGCCGGTACATTTCTTGTTTTCAGTTTTGTGATTTTTCCTC
GAGTTTAAACATTTGCTAATTTATATATGATTCACAGGGCATGACACTGGTCTAGAAGCT
GATGATGTGGGGCCTTCAAGTTCTTGTTCTAGTCCTGGCGACAGATCATACGATGGAGA
AGAGTATACCTACACATTGGAAATACAAGATGAATTTGAGCTTGATGGAGAAGTAGGGA
AGAGAATAAACCAGCTGAGTGCTGTTCCTGTAAGTGTAGTGTCTGTTAGTATATCAAGA
ACACTAATTCACTTTGTGTACGAGGACATGTGCGGCGCTCTGCAACTTTGGCCATTTTC
TTGTCACTGCTTTAGTTTCTTATAACAATTTGTGTTATTAACCGTGCAGCATGTTCCTAGA
ATAAATGGAGACATACCTTCAGTCGATGAAGCAACTTCTGATCATCAAAGGCTGCTAGA
TAGGTATCTCCTGATCCCTCCGACTTAGGTTTTCAAGTTGACAGAAATTTTGTGTATGAA
CAAGTTAATGACATATGATCTTATGGTAGTGCTTTGCTTAATTCTCTCTCAGATTAGCAC
TTTCCAAGGTAATGAAATATAAGTTCACTGCGAGAAATGCATACCTCTATTGTGATTAGG
TGTCCTGTGTCTCCAGAATCATTTCTGTATTTTTTTTAGGTTTTCTCTTTCTTTATTAATTC
ATTCTTTTCCCGGTGGTTCTTACTGCCTGCTATTTGAAAAGTAACATCAAACCCCTCATG
CATTTTGAGAAAAGAGAATCAAATGGGAGAGATGGGACCCGGGACAAGCACAGAAAAT
AATTGTTGTGGAATGAGCCAACTAGTATGTTGTGCTTTCTAAATAAACACTTACGAAGCA
CATTCCAGTAGATACTGTTAACCCTTTGTTTGCTTAAAGTCACAGGACAATGCATGCGGT
TTTTCATTTTGTTCTGTTTTTTTATTAGGTTTTGTGCCACCTAAGTTTCCTAGGAAGTTGA
TATTGTTGTGTTTTTTCATTAGGTTTTGTGCCACCTAGTTTCCTATGAAGTTGATATTTTT
GCTAATTCATTTGAATGGTAATACTATTGCTATAATAACTTATTTTCTGCTAAGCATCCCA
TGCTGTGATTTTTCTTGCTTCAGATTGCAATTATTTGACTTGGTGGAGCACAAAGTGCAA
GGAGATGGCAACTGTCAGGTTATCATATTACCTGGTTGATTTATCTTGCAGTAATGATAT
CAAACTGTCTAGATGCGCTCATGATTTGATTTCTAGTGGAAGAAAAAAACTGTATTCCCT
TTCCTTAAACTACTTACAGCATAAGTATTAATCTTAAACATAATGTTTATCAGTATTCCTT
CCTTTTGGAATTGTTCTGGTAGAAACTTGTTCTTGGTGTTTGTGACAATGTCTTAGCTTT
CTTTATTACTTTTTAGTTATGCTTGAAAACAGTGGAAACAGTAAAGTTATCTCCATATAAA
GTTGTCTCTGTGTGACATATAGGTCATGAGTTTGAGCCGTGGAAGCAGCCATTAATGCT
TGCATTAGGTTAGGCTATCTATATCACACCCCTTGGGTGAGGCTCTTCTCGGGACCCTG
CGTGAATGTGGTCGGGACCCTGCGTGAATGTGGGATGCTTTGTGCACTGGGCTGCCAT
TTTAGTTATGCTTGAAATTCTCAACTTTTTAATTTTCATATTTGGTTTTTACTTGTCTATTC
TTTCCATTAGCCTTAAGCAGTTGCTCACTGTTCATCATATTTCATTTAAGTTTGTGAAGTG
TGTGAGACCATATACAATATTGCTGAATTATGATATACATTGGGGATTGGCAATTTCATT
TAAATTGAATTCTTTAGTGATTAGTTCAATAAAGTCACAAAAAGAAAATCGGACTTGAATT
ATTGATTTGGGAGTTATTTAATTATGAAATGAATACTAGTAAGAAGCGAGTCAAGAAATT
TGAGACTGAATGTGAAAATTGGATGGAAGATGTTCACGGAGAAAAGCTGATTAATAGTA
ATGTTGGTAAAATAGGAAGGGATTAGAACTCGGATAATGAATGTAGAGCGAACTACAAA
ATATAAGAAGTTGAGAGTTCGGATGGAGTTGGGGGGATGGGTGGTGAATGGAAGTGGT
TCATAATAGGATTTTGGAGAAAAACTGGAATATGAAAACTCAAGTGATATATCGGCAGGT
TGCTTGCCATGCCAAGTGCCAACTTTATGAAAATTATTTGTGGATTTTCAGTTAGTTTGA
CAAAAGGACAATTCAAATTTTTTCTTAACTCTTTAGTATTATGGGGGTATGGGAACTTAG
CCCGTTTTTCCTTTCTGTGAATTAGGTTTCACCTCCGATTCTAAGTTTGTTGAGATAAGG
TTCCATCCTATCTCTTTCTAATGCACATCTATTCACCTGTCTTCTTTTGTTCCAGTTCCGT
GCTTTATCAGATCAATTCTATCGTACACCGGAGCACCACAAATTTGTCAGACAGCAAGT
AGTCAGTCAGGTGCTTGTCTAAGTGAATAAGATTATCTTCTCTATGATATTTCGGTTTTC
ATTTTTGAATTATTTCACATACATTTTAGCTGAGTAGGTTTTTTAAAATCATTTTGTTTTGT
CAGCTTAAACATCATCCAGAGATGTATGAGGGATATGTCCCAATGGAATATGGAGAGTA
CTTGAAGAGGATGTCCAAGTAATTCTCAGACCTTTGTTTTTCCATTTTGAAATCGTGCAT
TACCTTCTCTCCTTTAATTTACATCTGACTTTTATATTGGTTGAACCAGGAGTGGGGAAT
GGGGCGATCATGTTACGTTGCAGGCTGCTGCTGACTCGGTACTGATTATTGCCCTTACT
TTGGTTCCTTGGTTTTTCTCTCCATTTGATTACTGCCTTTTGGTTTGTTTCTTAACCTTGG
AAATAAGAACCGTAAGCTATTGTTGGCCAAACCATAACTTGGATGTCCTCATATGGTGC
CTTCTTTGGATATTGTAGTAGCTTGTTAATTGCAAGTTGATGGTATGTAGGAAGTAACAT
GCTTCTATAGAATTTGTGATCCTGTAGTTTTTCATGAGTATGTGTTAATCCTTTATTTTGT
AGTGTGGAAGAAAATGTGTGTTTATGTGCCTCTCTTGCTATTTTTTTCTTTCAGTCTCCAT
TGTGTGGTCGTGTTGCATTTTTTATTGTATATGCTTAGCTCCTTTACGAAGATTTTGCTTG
TGATAATATTAGATGAGGATGGATTGGTTTGTGAATGACCTATTTGTGCTGCACATTGAA
TTGTCTTTTTGTCTTGTGAGCAGTATGGTGTGAAAATTCTCGTTATAACGTCATTCAAGG
ATACATGTTACATCGAGATTCTTCCGAAGAATCAAAAGTCAAACAGAGGTAACTAGTTCT
CAAATTTTGAATTGCTTTTACCTTTCTGTAATATCCTCTACTTCATAGAATGTGATTGAAC
AAGTTTTCATTAACAAAATATTTGGAAGTGGTAGTGGCATGAATTCTCAATTTCCATGTAT
CTCATATACAAATACATGTGTCACCGAATTGTGTACGTACTTCCAGTTCGCAGTTAAGGA
TTTTCCCCAACTTTAATTCCACTATGTGAGCAACCCTTACAAAGTTCTTCAAACATTCTTT
ATTGGTTATTTCTGAAATGTGGTTTTAGACTTATAGATAATACCAGAATATTATCCAGGG
CCAAATTTCAACAATTCAAACAAGGAAAGATAGTATGATCATTCTTAACACCACTGTAGT
TACCCCCCATTGAATTTTCCGATAACGAGGGCTACACATGGAGATGAAGGAGATGATCT
GATTATGCATATTGTAGTAATAGTAGTAGATAAATTTATAAAAGTAGCTAATACTAGATAA
CCCGATTTATTATCTGAGTTTGATTCTTGAAACTTGTTCGAGTTATGGAGAAACTAGTAT
GGCCTTTGTATAGGACCTTTCCGAAAGTCCCAGATCCGATAACGATCTGATAAAGTAGG
ACCACTGGAGTTATGAGACAACATTAGTCTATAATATAAATGATCAGAATATTGCAACAG
AAAACATCAGTTGTCTCTTTCCTCTCTTCGATAGAGGCAAAGGAGATTGAATCTAATTGA
TTCCGAAATGCTTCATTGGATATTCAATAGTTAAATCGAACTATTCATCTTGCAACTCTGA
AAACAGACGTGCTATACATGCAGTATAAGAGCAACATAATTAACATACACTAGGTTGGA
GGTTTACTTATCTATGTTTAGGTGGTCGGTTTATGTGCAAGTTTTCCATTTTTCAACAATT
TAGAGTTATCAGAGCTACTTATAAGACATGATACTTTTGCTGTATTTAACTTTTTTTGTAA
AGTTCAGCAAGAGTTTTTGCTAGTCCGAGTGGAAATAATTATTTGTTGACTACCCATTTT
CCCTTTTTACTTGAGAAAAAGATTGAGAGGGGGGAGGCAGAGCAGCATCATGAGTCAT
CTGGAGAATGCAAAGAGTACTTTTGATCTTTGGACTGAAGAGTTAGAACGTTGGCCTTC
CTTGATAACAAAATTTTCAAGGGGAGGGGGGATTAATGGATGACTATGAACCGTTATGG
ACAATCTTAAGCAAATCCGATCTTGGGTTTCATGAAAAGGAGATTCCCAAGGGTTGACC
AGTTTTGGTTGTTTTGAAATCTAAAAGATGGACTGATGAGCATCCTTATTGCCTTTTTAG
AGACCTAAAGTTGTCTACTTTAACATATTGAATGATTATGAGACAGTTCTAAGTTTAATGA
ATGGGGACAGCTTTATATCGGTCATGTATGCATGACATTTATTCCATGCCATTATATGAT
ACCTCTACTGTGAAGCCAAGGGTGGGCACAACTTAGCAATGGATACTTAAATCTGGAGA
TTGCTTGGAACATCTTTGTGCATATGCACTTTGTAAGAGCGATGTTTGAAGTCTTGATAT
TGATTCGCTTACACCAATTTAATTGTTTTGTTTCTCTCGTTTTTCTCTTGGTGCAGTCATA
TACTTAAGTTTCTGGGCAGAGGTGCACTACAACTCAATCTATCCTCAAGGAGGTAGTTG
CTTGTACATTTATCTCCTGTTCCTGTTTATTTTGTCTTAGCATTGAGAGTTGAGTGGGAG
TTTGCTGATCTTGCTTAGTTTAGAATTCCATTCTATTATCATATTATTCTAATACTGTAGTT
TGATGCTCCATTCTAGACTTCCTGCCATTTGATCTTAAGAAGAAGAAGAAGAAGTGGAG
TTTCTGGAACAAGCAT
SEQ 28
TCAAGAATCAAAAGCTGAATCTTTCGATACAGAATGGATCATGAAGTCAGAACAGCTTG
TTTCTGCTGTAGACAGTGAAGAAGTAAATCCAAGCATAGAAGTTCCCAAGTTAAACTCC
AAAAGATTATTCTCCAACTGGTAACCCCCTATAACAATTGAAGCCTTTTGATTCACCCCT
CCATCCAAAAATCCCCAACACATCACTTCATCACTTACTTTCACCATTGAATTTCTCCCA
TAAATCCTCCACTTAACCATTTCACTTTGCAACACAAGATCCATAGTTGGAACATTTGGC
CCCACTTTTGTGACATCTATCCCCTCTGTGCTAAAGCAAAGCTCAAATGGTGCTATGGA
TTCCACTTTAGTCAAATTCACGGAAATAGCAATTTTTTCATAAGCTTCCATAAATGTCCTA
TAAATTGAGCTTTTCATGCTAGTAAAAGGGGAAATTGTACTAATCTTTGTCCCGCCAACA
CCTTCTTCATCCATTGTAAACAAAGATATGTTTAAAGACAGTTTATTGCCATTAATTTTTA
TGGATTTGACATTGATGTAATACTCTTCTGAAACACCATTTTTAGATGAAATCAAAGGTG
TGTAAAGCATAGATCTTGAAACATCATGACCCAAAAGTGACAAGTAAGGACTTTCACCA
GATATTATAGCACCATTTGAAGAAGACAAACAGAGAGCAAATTTCCTCTGGAAACCAAAT
GTATCAGACAACTGAGATGGCAATGCAATTCTTGAATTTCCAAGTCCTAACATTCCTTTG
GCACCTCTAGCAAGACCTTGTAACAAAGTAGTTGATGCACAAGAAAACAAGAATTCATG
TATTGAAGCAATTGAACCTGTTTTTAACTCATCTATGAATTCCACAGCAAACATGTCCTC
TGCTAATTCCCCTGTTGCAGCCACTTTTGTGAAAGGATTTTCTGATTGTACACCACAAAT
CTTGTTAGTGCAACCATTAGGTTTAGCCATTGAGCACTTGAGTGAATTACAGGGGACAA
GTTTCTGACTTGAAGATACAGAAGTTAAGCCACAATTAGTCCAGAGAAACGAACCTCCA
AGATCTACAACTAAATTGGTAGGAACAAGATGAGCACCCATGTAAACTTGAGTAACATA
CTGAAAAGTGGACAGATCTTTGTTTACAGGAAGAATAATGGCCTTAGGCAAAAAGGGTC
CATGAGAAACTGAGAAAGAAACAAAGAAAAAGAACGGAAACAAAGGAAGGAATATTTGA
AGAAGTGAAGGCAT
SEQ 29
TTAGGCCTCAATCAGTTCTCTAATTGGTTTGCTGTTTATGTTAGCTGATGGGATATTAGT
GAATTCAGAGAGTATAGCTCTGAATTCATCTCCTGTAAGAGTCTCCTTTTCTAGCAACAC
ATCCACTAATTTGTCGATTGCCTCCCTGTTGTTCCTTATGTGGTTCTTTGCAATTTCATAT
GCTCTCTCAATTATGTGCCTTACCGATGCATCAATGTCTTCTGCTAGTTTCTCTGACATT
TGATTCCTCGCCAGCATTCTCAGCACCACATCACCACTCTGTGTTGCTGGATCTGTTAA
CGCCCATGGTCCTATCTCAGACATCCCGAACATTGTCACCATCTGCTCATGATATAAAC
ATTGGCAAGTTAATACTTGTGTGTATTCGAATATGTTGTTCTCTTTTAATGTGGTGCAAC
AAGATGATGTGTTAAGTAAATACCTGTCTTGCTATTTGAGTTATTTGTTGCAAGTCTCCG
GCTGCACCAGTAGTGATTTCTGCTTCACCAAAAATTATTTCCTCTGCTGCTCTACCTCCT
AAGCTTCCAACTATTCTAGCAAAAAGTTGCTGCTTAGATATCAAGGTTGGATCTTCACCA
GGAATAAACCATGTAAGACCGCGAGCTTGCCCTCTTGGGATCAATGTAACTTTCTGTAC
TGCATCATGGCCAGGGGTCAATGTCCTATAAGCACAAGGACACATTCTTTAGTACTGTG
TCTTTTGATTACAAATAACAAACTGAAAAGATTGAATACTTAGACAGCTTCTTAAATTTGT
CCGGTTTTTTCATCTAAACACCTTGTCTAAGGGCCTGATATATTGAACACTTGATGTTAG
TTGAAAATTCAATAAGGAGCAAATTACTCCCTTTTTTTGCTTCATGTATAATCTAGTATAA
ATGAAAATAATGAGAGGAAAGAAATGATTGTTAACTTACGCGCAGACACCATGTCCAAC
TTCATGATATGCTACCAAAATCTTGTTTTTGCCATCTGTCATCTTGGTTCCTTCCATTCCA
GCAACAATTCTATCGATGGAATCATCAATCTCTTTCGAGGTAATCTTATCTTTTCCTCTTC
TTCCAGCTAGAATAGCAGCTTCATTCATGAGGTTTGCAAGATCTGCACCACTGAATCCT
GGAGTTCTCATTGCAATAACACTTAGAGACACATCTTTATCAAGCTTCTTGTTGTTACTA
TGAACCTTCAATATTTCTTCCCTTCCTCTTATATCAGGCAGTCCAACACTTACCTAATAAA
ATGAAATATCAATATAAGTGAAGTGTATTCTGGAAACTGTATAATACACCTCATTTTATTG
GAATTTTACAATCAAAATCTCATTTTATACCTGTCTATCAAATCTTCCAGGTCGAAGCAAA
GCTTGATCAAGAATTTCAGGCCTATTAGTGGCAGCAATGACAATGACTCCAGTGTTTCC
AGTGAAACCATCCATTTCAGTGAGAAGTTGGTTAAGTGTCTGCTCTCTTTCATCATTTCC
ACCGCCAATACCAGTTCCTCTTTGCCTCCCAACAGCATCAATCTCATCAATAAAGACTAA
ACAAGGTGAATTTTCCTTTGCCTTGTTGAATAAGTCCCTAACTCTAGAAGCTCCCACACC
AACAAACATCTCAACAAACTCTGAACCAGAGAGAGATAAGAATGGAACCTCTGCTTCTC
CGGCAATCGCCTTAGCTAGCAATGTCTTCCCTGTCCCTGGTGGCCCTACTAAGAGAACT
CCCTTTGGTATCTTTGCCCCAACTGCTGCAAACTTTTCTGGGGTTTTCAAGAACTCAACA
ATCTCTTGAAAATCTTGCTTTGCATCATCTACCCCAGCCACATCATCAAATGTTACTCCT
GTATTTGGTTCCATCTGGAATTTTGCTTTGCTCCTGCATTATTCACAAACAAATACTAGTT
ATTAGTAGTTGTTGAAGATTACATCACTAGACATAATGTTCAATCTTGATCATGTTTATGG
AATTTCTATTATAGCATACTGTTGGGTTTCTTAAAGAGATGGAAATGATTGAAATTGTCTC
TCCTAAGTTTTATTAACTATAGAGCGATTTAAATAGCCAACTTGAAAATAAAATACACAAA
TTTATAAAATATTGAAAAACCTAAAATATCTCAACAACCTAAAATATCTAACCGAAATTTA
AATTCAAACAAAGTAGACTACTTTTACCACTAAAAATTACTCCTTCTATTTCAATTTAGAT
GATACAATTTCCTATTAGTACGTTCCAAAAAGAATTATACATTTCTATAATTGAAAATAAT
TCAACTTTAAACTCTTTATTTTATCTATTTTAACCTTAATAAAAAACTTTTATAACTACACA
AATATCATGCCCCCCACAAAGCTTTTACCTCTTAAACTTTTTCAAAAGTCTTCTGTTTTTT
TTTTTTAAACTACGTGCCGAGTCAAACTAACTAATTTAAATTTAAACCGAGGAAGTATTAT
TCTAGTAAATTAACAGTAACAGAAGCTATATACAAGACATACCTTCCTAATCCAAAAGGC
AGGTTTGGCCCTCCAGGAGTATTTGAAGAAGAGGTTCTCAACAGCAAAGAGCCAAGCA
ATATCAATGGAAAAGCTAAATTCCCAAGTAAATCAAGAAGTGGCCCTATGACATTCATTT
CAGGGAGATGAGCAGCAAAATCTACATCCTTCTCTCTAAGTTTTCTCACCAATTCTGGT
GGCAATCCTGGCAACTGAACTTTAACTCTCTGGACTTTGTTAAGAGCAGGATTGAATAT
CTCAGCAACAGCACTACTCTCAAAAAAATCAACTTTTTTCACAGCACCTTCATTCAAGTA
TTCCAAGAATCTTGAATATGACATTCTACTTGAAGTTGCTTCAATTGGTGCTTCAGTTTCT
GCTCTTGCTGGTTTAGCCAAAGTCCCTGCTACAAGGCTCAAACCACTACCACTCAACAG
CTTCCTCCTATTTATTCTGGTGTCTGAATATGATTTTTGACATGGGGTTTCTTTACTAAAG
ATTTTAGGATTGTTAGTATCCTTAGAAAGATCTTGGGATTTGCATAGGGGAAATTGAATG
ACAGACAAAGAAAGGGCAGGGGACATTTTCAT
SEQ 30
TTAGGCAGTGGGATAAGAAGCGTCCATAGCAAGTCCACAAAGGCCTTCTTTCTCATGAA
CATCCCTTTTGATGCGCATATATCCACTGTCACCCCATTTACTGCCCCATGAATTCTTTA
TAATCCAATATTTTGTACCGTCAGTTGTTGCACCATATCCCACTGCTGTAACAGCGTGGT
TAAGCCAAGTGCTGCATGATCCACTGAATACACCACTTGAATAGAACTGGAAATCGAAG
CTACTCCCGTCTATTGCCACCGAAACAGGTTGATTAGCCACTGCCTGCAATAGAGCCTT
CTCACTGTTCGCTGGCACATCTTCATATCCTGTAATAAGAGGCGTAAGTCATAATTTCAA
GCTTATGGATTCGGAATATTTATCGTTTGAAGTTGCTGGGCTAGATCATAATTAAACCAA
CTCACCCAATTGAAGATTTGTTCTACCCCTTATATTTTTATGGGCTTACCTGTAATTTTGG
CTGCTGAAAGAGCTGACTTTTTCTTGTTGCAGACACCATCTTCTCCTTTGTATGGATAGT
TTACTTCTGTTGTGAGGCCCTTGTTTTTCAGGATGAAATCAAAGGCAGTGTCCAAGAGT
CCACCGCTGCAACCTTCGTCCTCGCCTTCGACATCACAGTCTACAAGCTCTTGCTCTGA
TAAAGGGATCAACTCTCCTGTTTTCAGTTGGTGTAGCCCTTCCAT
SEQ 31
ATGGGATGCCGCATGAAATTCTTGAATGTGGTTTTGGTGGTGGCGGCGGTGATGGCTG
CTGCCGCCGCCGTGGCCTTCGGAGCTGAGAAATTGCCGGCGGGAGTGCTTAGTTTGG
AAAGGATTTTTCCTTTGAATGGGAAGATGGAGCTGGAGGAGGTTAGAGCAAGGGACAG
AGCTAGGCATGCTCGAATGTTGCAGAGTTTTGCTGGTGGTATTGTTAATTTTCCTGTTGT
CGGTTCATCTGACCCTTATCTTGTCGGGTAATTACTTTGTTACGACCAATTTGATAAGAT
TATATTTGTGATGTTTTTAGTGTTTTCTTCCTTTTTCTAATGTGGAGTTATATTGCTATATT
TGCTATATTTTATTTGGTATGATGACGATGATATGGCTTGAGCTTAAATGGAGAAGTGAT
GATTGGTATAGCGGACTCCAACTTGTTTGGGACCGAGGCGTTGTTGTTGTTGAGTTTTG
TTTGGTTAATTTAGTCATTTTTTGGAAAGTTTGATTCTTTATGATGTTAAAACTTGGAACT
TTTGGTGAATGTATGGAAGCTATGGACTATTTGATGTGTTATTAACGTCTTATGAATTTG
ATCTCATGAATTCTGATGTAAATTTTGTTTTAGTTTGAGGGTAATTGATTTTAAGTGTATT
AAAGTACTTGTAACACAATGAATTTTGGTGTGCTGTTTTTCTTTTCTAGGTGCTTTCTTGT
TAATTATCGGTTGGATGGTGTTGTTGATGGTAGTGATAGTTCTGTTTGCTTTATCTTGTA
TCGTTTCTTGTTCAGGTGCAAAATATTTAGGTACAATTGAATGATGAGTTTCGTGTTGCT
CTGAATATGAACATTAGCTATATCAGTTTGGCATTTGCTTTCTGTTATTTGTGGATGAGG
GATATTCTTATTGATTTGACTATCAATTTTGTTGACCATCGTCTCTTTCTCTCTCTTCTAT
GCTTTTGGTGTATTTGTAGCCTTTATTTTACAAAAGTAAGACTGGGAACTCCACCAAGAG
AATACAATGTGCAGATCGACACTGGCAGTGATATCCTATGGGTCACATGTAGTTCCTGC
GATGATTGTCCTCGGACAAGTGGACTTGGGGTAACTCATCTTCCCTTCATCTTGTTATTA
CTTTTTTAGTTTCTTGTTTAAAGTGTGGTGAAGGAATAAACTGTTACGTGGGTGCAGGTT
GAGCTCAACTTCTATGATGCTACCATCTCGTCAACTGCTTCTCCCATTTCTTGTGCAGAC
CAAGTGTGCGCCTCTATAGTTCAAACTGCCTCCGCTGAGTGCTCTACGGAAACCAATCA
GTGTGGTTACTCCTTTCAATATGGAGATGGGAGTGGCACAACTGGCCATTACGTAGCC
GATTTACTATATTTTGACACAGTCCTGGGAACTTCTTTGATTGCCAACTCTTCAGCACCG
ATTATTTTTGGGTGAGTTCTTATTTTTTAAATACCCCTATATCTATACTTAAAATTTCATTA
GAAATAGTTGTGGGTCATTTGAACCAGAAATATCTTTGGCCCAATTTACAAAAAAACCAT
GTTTGTTTACTCAAAGCTTATACTTGGATATGATTTAAAACAGGTGCAGCACCTCTCAGT
CTGGGGACTTGACCAAGACGGACAGAGCAATTGATGGGATATTTGGGTTTGGTCAACA
GGGTCTTTCAGTAATATCTCAACTGTCTTCTCATCGGATTACTCCTAAAGTATTTTCACAT
TGCTTGAAAGGAGAGGGAAATGGTGGAGGTATACTAGTCCTTGGTGAGATTTTGGATC
CGAGAATCGTATATAGTCCCCTTGTTCCGTCACAGTACGTATTGTTACAGTACAATGAA
GTTTCTTTTCTTGCTTATGACGAATATAGAGATTTAATTGTTTTCATCTTTAGTGTGCCTT
GTGCTACATGATATAAAACAGTTGTGTTCTTTATAGTTTGTGATCCAGCTTGAGCATGTG
AAATATACCTCTCATGCGCTACATCCTGATTTTATTGAAATTTCGTCACTATATTATTGGT
TTTGCATCTACAGATATATAGTAGTTGGGTCTTGGGAAGATGACATCAATGAAACTTTAC
TTTGTACATATAAAAAAGGGCAGCCCGGTGCACAATTTTGAGTGTTATATATATATATAT
ATATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNN NTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTG
TGAACATGTCGATATATGTGTTTATTTTTTATGTTTTTCTACATTATTTGTTTTCTAACTAT
AAGCACGGGGATTGGCTGGGCACTAGGAAAGAAAATGGTTTGTAGCAAGCTTGATTGT
ATCCGCTTTCCACTTTTGCAGGGCGCATTACAATGTATATCTGCAGAGCATTGCTGTTAA
TGGACAGTTGGTGCCTGTTGATCCATCAGTGTTTGCGACATCTGGCAATCGAGGAACTA
TTGTGGATTCTGGTACAACTTTGGCTTATATTGCCACAGAAGCTTATGATCCCTTTGTCA
ATGCTGTAAGTTCCTACATTTTGCCAATTTATTTACTCCCTCCGTCTCAACTTGAATGTC
CAATTGCTCTCTTTGTCTATACCAAAATTATATCCACATCTCCTAAATATAAAAACTGACC
ACATAACAACAAGAACAACTACGCCTGTTGGGGTTGGCGAAAAAGGGCAGATAACTGTT
GATCAAAACCCCTGAAAAATGTTTTATCATTCATCTGCAGATAACTGCTGCTGTTTCACC
ATCAGTTAGGCCAATCATCTCACGAGGAAAACCGTGCTTTCTAGTGTCCTCGAGGTTCA
TCCTTGTATAATTCAATAGATATTTACTTTGAGCTTTTAATGACAAAAATGTCTTTACCCA
ACTGCTTTGGTGACTCGAATATGACACCCTGTTTTGTTCTGAAAACAGCATAGCAGAGA
TATTTCCCCCAGTTTCTCTAAACTTTGATGGTGGTGCATCGATGGCTTTAAGACCATCAG
ACTACCTTGTGCATATGGGCTTTGTCGTGAGTACCAGAATCTGTATTGTGTTTGAATGTC
TTCTTGAAGTCTCATCTGAGACTATATCAACAATGCTATATGCAGCACTATTGTCTTTTTG
ATGAACTACAAAGCTAAGTGACTCAATTGAATGTTTCTAACAGGAAGGTGCTGCTATGT
GGTGCATCGGCTTTGAAAAACAGGATCAAGGTGTAACAATTTTAGGAGGTTGGTTCCTT
GTTTACTACAATATTTATGCCTCAACCCCAAGTGGGTTCGCTGGTGTATTTATCAACATA
TCTTCTAGGACTATGGTTACATCTAATATCTGCTGCATCATTACGTGTGCTAGCGTCCTT
AATGGGTGCTTAAGCACTCCATCACTCAAACCATCTAAAAAGGACCTAACTATACCTTGT
CCTGTATTAATTTAAAAAAAAATATAACTTGGCGAGTGTCTGCCTAGATGGAATGTACCT
TAGTGATTGCATGCTTAGATAAAGTCATCATTCCTGCAGATATCGATGTTGATGAAACAC
GGCCATCCATTTTCTATTAGAATATTAGGTCATTGATTTGCAATTTGAAAATTGTAGCTTG
AAGATTGCAGTCCCTTTGTCTTTTCTAATTTTGCATCCGTTCTTTTATTGTGATTGACCAG
AGGATAGGTATACTTGTAGTACCAGTAATTCGGTTAGATATGTTCCGTAGCGTGCCAAG
GAATTTATATTTCCTGTTTTGCCTAGTTTTTCGAAGTTTATGTCAAAGTTACTTGCTTTGG
ATTCCGTGCGTGGGACAACTATATATCTTGTAGAGGCAGTGATTGTTACTTTGATGTTGA
AAGACAGCAGAGGGTGGAATTCATAGAAAGTTAATACGTGGAAAGGGGTGTTAGACTG
GATACATGACTGCAGTCCTAAATCAAGGTTGGAGTTGCATTTAATATGGCTATTCCTAAA
GGCTAAAAGACACAGTCTCATCAAGATTGTATGTTGAGAACAAGTCTACGAGAGTGCGC
TCTGACTGTTCTCAGAGATAGGTGCCTCAGACTCTAGGGATTAAGCTAACTACGTGTTT
TGAGTGGTGTTTTGTTCTTTCTTTTCCTTTTTCACTGCTGGCTAACACAACTTGAAGACTT
GAATTCCAAACTGCTTCAAGTTTTAGGTCTATTATTCATGCCTTCTAACTTCTTAGAGTGT
TGGTTCCCTAGTCCTCTTTTTTTTCCAACTCGTGCATGCACGCTCATGCCCGTATACGTA
CATGCAAGACATCTACAGTTGTAAACACAATTTATGACCAAATATTAACGGAAAGGTATT
AATCCCTTTCTATTTCTTGTTTGTGTGTGTATGTGTAAACCAAGCACAATGTTATGACCA
ACATTAAGTGAAGAGAATTAATCTTCTTTTCTTTTGGCCCATGTCCGTGTATTGTATGCG
CATTGAATGTGTGAGCCTTTCCTTGTTTATCCTGGTTATGAGTTAATCGGTAGCATACAT
GGTGAGGTTTCAGGATGCATGATAGTTGCAGATGATAGTGTGAATTAACAAAATTAGTC
AAAAGCAGCTCCATGGTGAAACACTTAGCAAGGTTTTGGAATAAGTGGAAACAAGATAC
TATCCAGGCAATGCAAGTTTACTCCGCCTAAAAAGTATTAAGGTGAAATGATATTAGATT
ATTATGGTTCCTAAATGCTGGCAGTCAAGTTATTTAAGCTCAAATCTTCCAGGAGAATAG
AGAAATGTCGTGGTAGATGAGGAAGGAACACATAGAACTAAAATATGCTGGTTGACATG
AAAGAGTGTTGCATGAGTGTTATTAGACAGAAAGATTCCTAGAAAATTGAAATATAAGTT
AGTATAACGGTTGTGAGATCAACAATAATATGTGAGAGTGAACATTACATGTCCGCTAAT
TGAATATCATAAAAATGCGGATGTAAAGATGGTTGTGCAATCATATAAGATTGAACATGA
TTAGAAACAATCACTTTTGTTGAAGGGTGCAAGTAGGGCACATATAGAGCATGAAAAGG
ACGTCACATGAGATGGTTTGGCCATATCCTATTAGTCCGCCATATGAACCGGTTATTAA
GTGTGGCAATATTGTGTTTAAAGTGCTGAAAGGAACGAGGTAGACTGATGACTACATTC
AAAAATTTGTCTCAAATGACTTAGAATCTCATGGAGTCAATACGGCTTGAACTATAAACA
AAACCTTATGGAAGAAATGGATCCATATAGGCAATACTAACTAGTTGAAATAAGGCTTAC
TTGATTTTACTCTACTGGGTGCAGTATTTGTCAGGAGTCTTTATAATTTGGTTAGAGACA
TGTTTGTAAGTGTTGGTATAAGTTAGAGATTTAGAGAACCTCTAGATTTAAGAGAACCCC
TTGTCTTAAAAAGATTATGTACTATCGGAGTGAATTATTTCAAAAAAAGAAGATTATTTAC
TATCAGAGTGAATTGGGTTCAAATAGAGCAGAATGGCCCCAAATGATATTAGACCTCAA
CTAGTTTGGGACAGAAGTGTAGTTGATTTATTGATATGCATCTTATGTGCAAATTTTATTT
TAGTCATGCGTGTGCCCTTGAGCTCCTTCCTCTTCCCTCTTCCCTCCTAGTCATTCTACT
AAATTTGGCTATTCAATTAGTTCTGGCTATGCTTTGGCATTTGAACCATGTATTCCTTGT
CAGTTGACTTATTGCATTGCTCTGCATCTTCATTTGTCGAGTATTCTGGATCCATAGATA
AACAAGAATCTAGGTTGGTCTAACTGTATTTTTCTTGATTTTCAGATCTTGTTCTGAAAGA
TAAGATCTTTGTGTATGACTTAGCTCGGCAAAGGATTGGATGGGCAGATTATGATTGTA
AGTACCTCTTTTCAAAAATGAAGTCAACATTTGCTTTTGTCTCTATTTTGCCCCCTTTTCT
TTTGGGGTGGGGAGGGGGTTGTTTGTTGTGACATGACTTACTTCTTCCCTCTGATTTTC
TCTCTATTTTTAATGGTTTTCATGATTAATGTTTTTATTATAGGTTCATCATCTGTGAATGT
GTCTATAACCTCTGGCAAGGATGAATTTATCAATGCTGGACAGTTAAGCGTGAACCGTG
CATCAGGCAGTTTGCTGTTCAATCCGCGGCACACTAGAACTATATTTCATCTGCTATCG
TTGGTTCTGATGATTGGTTCCCCATTTTTAACT
SEQ 32
TCAAGCAACAATAGGGTATGATGCGCAAGTTGCAATACCTACAAACATATCATTAGGAT
CAGTTATTTACCTCATCAAAATCATGACTACATTGACTAGAAAGTTTTTATTATTATTTGT
GTCGATCTAAGATGAACTCAACTTGAACTCGTATATAGAATAGCATTATAGTTCAGATTT
TGTGGCCTCTTTTTTTATGGTCCAAGTTGAAAGTTATTTTTCATTAATTTATCCTAATAGA
ATAAGCTTTCACAATCCGAAAGCAGTGTGCATCCAGTATGCAGAAAGGGGGAAAGGAG
ACTTGATGATGTGGGAGAAACTTACCACACATGTTCTTTCCCATCTCCATTTTGAAGTAT
CCATTGTCACCCCATTCAGCTCCCCACGAGTTCTTGATCAGCCAATATGGAGTACCATT
ATCAACACCATATCCCACAGCAAGAACAGCATGGTTCACATCCTGTCAATGCAATAGAA
AATGTTAATGTAGTATTATGTCATTGTCTCGCATATTTAGTGAAGGTCGGACCATATACT
TTGTTGATGTCATTGATATTTAACAAAAGAAGAAAAAGAATATGTGCCTCTTCATTGCAG
CAAAATCAATCATGTCTCAGTGCCAGATTAGTTGGGTTGGGTGGGGGTATGAATCCTCT
GTGTACACTGCTCATCAGACCTTTTTATTCCAATACTACATAATCTGTATATTAAGGCAG
ATTAGGAGTTTCTAAAAGTAGAGACTATTTCAATCTTTCCTATCATCCAGCATCTTTTTTT
GGTTGACAAAGCATGGTATCTTTCCTATCAGTATTATGACACTCGAATTGTTAAAAATCA
ATAAACACAGAGAGAGAGAGACACACACACACATTACGCATAACTTAAAGATAATTGCA
GACACCCTTCTCGACGAGTAATCAAGCACTGGGTTATAACTAACTACTGGGTACAATTC
TTTCCCCTACGGTGAGTTACATGGAAAATGTGCTTCTTCATCGACAAAGAATAGCAGATT
AAAAGGGCAATATCTCAAATTCATCGGAACTAAAAACTCACCTGGGGAGTGTTGCCACA
TACGGTGCTGCTGTAAATTCCACCCTTGTACTGTTTGAAACCTTTTACCACCTGATAAGC
TACACTAACCGGTCTAATAAATGCAATCGCGTATTTTAGTTCATCTTCAGCACCCTGTAA
TTATAAATGTTTAATTGTAAGGAATCCAACCAATGCCCATTGACCTTTTCAGGTCAAATA
ACGCATATTAAAAGATCCACTCAACTGAAAATCATCAACAATGAAACACTAGAACTATGA
AAGGTTTTTCCCGTCGGAAAACTACAATAATCTTCCTTTAGTGTGTCAAATACAAGAAAT
GATAAGCACTGATGTAAGCAGTTAAAAGCTGATCCAAGACATTGCTGTTCCAGAGTAAA
GAGGTATCCTACCTTGGTAATATTAACAGAATCAACGACTTTAACAGCAACATTTTCTGA
CGAGAATTTGCATACACCAGCCTTTCCAGCATAAGGATATTCTTCTTCAGTGTCAAGAC
CACCACTGTATTTAATGTACTCAAAAGCCTGTGATGGAAGCCCGCCATGGCATCCAAAG
TTATTAAAAGCTCCAGCACAGTCCAAGAGCTGCTGTTCAGACAGAGAGATGTTCTTCCC
AAATGCCTGGGCATATGCTGCCTCCAGAGCACCAGTAGTGCTGCCAAAACATCATGCG
CCGACTCCGTCATTTTTCAACTTCAGAAACCCGAATAGAATGGAAAGGATATAGAATTTA
CATGTTGTTAAGAATAAGTAGTTTAGGCTTAACGCATAGTTGATTAACGATTGATAGGTA
GAATGCCATTCCAAACTGAAAATTTAGGAACCTAGTGACATGATCCTAACTACTCCTTAC
CTGAATGTCCAGCAAGACCCGCACTTGCCCTGCTTCTTCACTGGGCTTACTATCCCTGC
TTCCCTCCAGTCTTTCTGAAATGACACATTCGGAGATCAAAATTAGGAACAAGAGCTACT
TTTGGTAACCAACAATTCGTTCACTTAGCACATGCTCTTCTAGGCAGCGGCTGAATGGA
AGGCAACTGTTCCTGAACTTACTACTACTTTTAGCTTGGAAAAGAAGAAAAAGTAAATAT
ATATAGATAGACAATACACACACACGTTTTTCTAAACACACATATATACCAAATTTTTATT
TCATTCAGATACAGTGGATGCAGCAGTACTCTGCACCTCTTGACTTAAAATCCTACGTC
CAGCGTTGCTCCTAGATGCAACAGCTATCTATGTTGTCATCTCAAGAAATTCCATCATCT
TGAAAGTAATAAAAGTTAAAACATAACAAAGTGATAAAAATCCAATGAAAGAAGAATGAC
AGCAATATTAGGCTGCAAAATGGACAAGACTAAGCATTTATGAATTTTCCTTTTAGGAGA
ACATACAAGATAGAGGCACAAAGAAAGCACTGAAAGCTAGATTTCCAATAACAGGATTT
CCGGTTGGAGTGAATGGATGCAAATGGTAATGATTTTTGAGAAAGTATCATAAACAACT
GCAGAAAAAGATATAATGGAGTTCTGATTGCAAATACCGTCTCTGGTAGGTTGACATTA
GTGAGCTGAAGATCGCTCTTTGTGGTAGCGGAACAGTTTTGAGGAGCTCCTAGCCTTTC
TCTCCTAAACTCATCCCATGTTAGGTCAGTAAACTCTGCCAAGAAGCATAGTCATTGTCA
AGAGCAAAATACAGCAAGCAAAATATGCAGTGCCTTCTTGTTTCTTTATTTTGTTTGTTCT
TCTCCTTTACCTTGACCTTGGTTTTCATATTTTTATTTTCCCTTTCTTCACTCACGTACAA
GAAATCTGCTAGGAGTTCTAATTAGAAGTAACAAAAACATTAATCTACTACATTTTGAAC
CACAAATCAAGTTTTTGGCGTCGTTGAAGTCTTAAGAAGGCATCCACCTTAGCATCCGC
ACCTGGGCACAACAAGGTTGAGGGAACAGAGGGTGGACGCGCAGCATCCACCCCAGC
ATCCGATGCTGAGAAGATCAAGATGAGGCGGACGCGAAGCATCCAACTCAACATTAAT
CCCTGAAGCTGATTCGGGAAAGCAAAAGGAAAACTTTGGCCCATAACTTTTGGACGCAA
TATATAAGCCAAAAACGGCTCTTTTAGGTCATCGAACACACTTTTTGAAGGGGATTCGA
CCTAGGGAGAGCAAGGAGCCGCCGTGGAGGCCGAATTTCATCTTCTTCCGCCAAACTT
AGTAATTTTTATGTTTCTTTGTATGATTTGTTGTTTGGCTACCATGTCTATGTGGAGCTAA
ACTTCACGTTCTAATGTTCTGGTTCTTTCATGACTATTGTTATTCGAGTTGATTTTCGTTT
CTTGATTTATCATATTAGTTTATTTATTCAATCCTGCGCTTAATTATTTGATTGCTTGATCA
CCAATTAAAACTATCTACGAATCTAGAATTGAACTCGAAAGTGTGAATTCTAGATTGCAT
ATAGGATTAAATAGAGCAAGTTCTTGAACCTGGGTATCGGGGAACGGATTTGCGGTTAG
GATAAACATATATACCCGATTGCCTTGCTTGGTTGATTTACACGAATTTCAAATGCGTTC
TTGTTAGTTCTAATTCCATAGACATATTGGCGTTAGGTTAGCTTGAATAGACGAGTAAGA
ACTCGAGAGATTCTTATGAGCAATATTAACACTGTCAACCAATAAACTAGATAAATTAGT
TAGTCAATTCAATTGAAGAATACAATAGGAATGTTAGATAACTCATAACCCTAGATCGTT
TTCATTACACTGATAATATAAAAATCAGCTCTTCCTTTGTTCAGAGTTCATTATTTATTTTC
TTTTTAGTTTAGTTACTTTTGCATCACTACTTTTGGGTTTAATCCTTGTTTAGATAATTAAC
AAGTCCTCATGGGTTCGACACTCTATCTTATCACTTTATTACTTGACGACCGCATATACT
ATACAAGTCAACTTTATGTATCCACTCTATTCAGATCATTTCATGAACTATAAGTAAGAAG
AAAAACCAAAACGAAAAAGGGCAAATTGTCCATAAAGGCATATATTGTCCAGCCTATAG
CTAATAGGAAACCATATAGTATACACAAAGGCAATTACCATTGACACCAAGTTTGTAGGA
AAGTCGTTGCTTGTTGTGAGACCTAATCATCTTCAGATTGTCCAAGTATATCTCAAACCT
TTGCTTGATCTCCTCAACTGAGTCGTATCTCTTCCCATACCTTTTAAAGTTAACAGATAT
CCCAAAAATAAATAATTTTTAAGTAAAAAGGAAAACAAAGCTTATTCATTCAAATAGGAG
GAAATTAGATGAATCGAACTGACCTGCGAACAAAGCGAACGAAGGAGAGAGCACGGCG
CGTTTGGCCGATGAGTTGGAGAATTCCATTCTCCAGCTCCTGCAAACCGTCGGATACTA
CTACTTGCCTGATCGGATTATCATCGTCAAACGTCAACGCTCCGCCTTGTGCGGCGGC
GATTGAGGTCGCGATGAGTAATAGTAATAGTATAATCGAGGCGCGAGTCAT
SEQ 33
ATGAATCCTGAAAAGTTTACTCACAAGACCAATGAGGCACTTGCTGAGGCACATGAACT
AGCTATATCAGCAGGGCATGCTCAATTTACCCCCTTACATATGGCACTGGCCTTAATAT
CCGATCACAACGGTATTTTCCGGCAAGCTATTGTGAATGCTGCTGGTAGTGAAGAAACA
GCTAATTCAGTTGAAAGGGTATTCAAACAAGCCATGAAGAAAATCCCTTCTCAAACACC
AGCACCTGATCAAATCCCACCTAGCACATCACTGATTAAGGTGCTCCGACGAGCTCAGT
CGTTGCAAAAGTCTCGCAGAGACACCCATTTGGCAGTTGATCAGTTGATTTTAGGCCTT
CTAGAAGATTCCCAAATTGGTGATCTTTTAAAAGAAGCTGGGATTGGTGCAGCAAGAGT
GAAATCAGAAGTAGAGAAACTTAGGGGAAAAGATGGCAAAAAGGTTGAAAGTGCTTCA
GGGGACACTAATTTCCAAGCACTTAAGACTTATGGTCGTGATCTTGTTGAACAAGCAGG
AAAACTTGATCCTGTGATCGGTAGGGATGAAGAAATTCGAAGAGTAATTCGGATTTTGT
CGAGGAGGACGAAGAATAATCCGGTGCTTATTGGTGAGCCTGGTGTTGGTAAAACAGC
AGTAGTTGAAGGGCTAGCACAAAGGATTGTTCGAGGCGATGTCCCGAGTAATTTGTCT
GATGTTAGACTTATAGCATTGGATATGGGGGCATTAATTGCTGGAGCAAAATATAGAGG
TGAATTTGAAGAGAGGTTGAAGGCAGTGTTAAAGGAAGTGGAAGAAGCAGAAGGGAAA
GTGATCCTTTTTATTGATGAGATTCACTTGGTTTTAGGTGCTGGTAGGACTGAAGGGTC
TATGGATGCTGCCAATTTGTTTAAGCCAATGCTTGCTAGGGGCCAATTAAGGTGCATTG
GTGCAACAACTCTCGAGGAGTATAGGAAGTATGTCGAAAAGGATGCTGCGTTCGAAAG
GCGTTTCCAGCAGGTATACGTGGCTGAGCCTAGTGTTCCTGACACTATTAGTATCCTTC
GTGGGTTGAAGGAGAAGTATGAAGGGCATCATGGTGTCAAAATTCAAGATAGAGCTCTT
GTGGTGGCAGCCCAGCTTTCGGCTCGATACATTACAGGTATGTCCTTTTTTGGATTGTC
ATTGTATTTTATGAATTTTACCTTTGATCTTTAATCGAGTAAAGATGCCACTACAGGAATA
TAGCAATGTATGTAATGTTGAAATGTGATGTGTCACACGTTTGTATTGTGGTTGTCAAAA
CATTTCCTAAAATTTTGAGGAGATAGTCCCTTTCCTTTATGTCTATGCAGGATGGATGTG
AATCTAGTTTTATACTTAATTTAGCTGAATCACGTCCCATTTGAATGATAAAGTTATTTTC
TGCTTCATTGTGCTTTTCAAGGTGATAACCTCTAACCTTTGGTTTGTAGTTTCAGACTTAT
AAAAGTATGATTGGTGCGTGCTCACCTTAATTGATTGGATGGGATTATGTGTTTGCTCTC
TATTAATACGAATTTTCTTTAAAGCTTTTTCTCTCCCTTGCTATGGAGAATTGCTACTGTT
GTTTTGCGTATCATTTGCCAGTTTGCCATAATTTTGTGCATATAGGGATTACTAATCTGT
GAATTTACGTTCAGGTCGTCATTTGCCAGATAAGGCTATTGACCTAGTTGATGAGGCTT
GTGCAAATGTAAGAGTTCAACTTGATAGTCAACCTGAAGAAATTGATAATCTTGAAAGAA
AGAGGATTCAGCTAGAAGTTGAACTTCATGCACTTGAGAAGGAAAAGGACAAGGCTAG
CAAAGCTCGACTCGTTGAAGTAAGTATACATCCCGGAAATGCTTTGACCTATAATTCTAG
AACCTGTGTAGGAAATGTGGACAAATAACGTAATTACTATTTCAGGTGAGAAAAGAACTT
GATGATTTGAGGGACAAACTCCAGCCTTTGACGATGAGGTATAAGAAAGAGAAGGAAA
GAATTGACGAGCTTCGCAGGCTCAAACAAAAGCGTGATGAACTCACGTATGCTTTACAA
GAAGCTGAAAGGAGATATGATCTTGCTAGAGCAGCAGATCTTAGGTATGGGGCTATCC
AAGAAGTGGAAGCTGCTATAGCAAATCTCGAGAGTAGCACAGATGAGAGTACAATGTTA
ACTGAGACTGTTGGACCTGATCAAATCGCGGAAGTAGTCAGTCGGTGGACTGGTATTC
CTGTGTCAAGGCTTGGTCAGAATGAGAAAGACAAATTGATTGGTCTTGCTAATAGATTG
CACCAAAGAGTGGTTGGGCAGGATGATGCAGTTAGAGCTGTTGCTGAGGCTGTATTAA
GGTCTAGAGCTGGGTTGGGAAGGCCACAACAACCAACTGGTTCATTCCTTTTCTTGGG
ACCAACTGGTGTTGGAAAAACTGAACTTGCTAAGGCTCTCGCTGAGCAGCTCTTTGATG
ACGACAAGTTGATGGTCAGAATTGACATGTCCGAATACATGGAACAGCATTCTGTTGCC
AGGTTGATTGGTGCTCCACCAGGGTAAGGACCCTTTAACTATTGATAGGATAAAAGAAC
AAATCATACTTTTACGAGTAAACTGTATCTGCCATAATGAGATTGTGGATTGCACCTTTT
GTAGAACTCTGTAGCCTCATATTTGTCTAGGTACTTAATAGTTTTACGTCTGAAGTGATG
AATGCTGAACATGTTATGTGTGTGCAGTTATGTTGGACATGAGGAAGGAGGACAACTCA
CTGAAGCTGTGAGGAGGCGCCCTTACAGTGTAGTGCTTTTTGACGAAGTGGAAAAAGC
TCATCCCACTGTATTCAATACCTTGCTCCAAGTGCTGGACGATGGACGATTAACAGATG
GCCAGGGTCGTACCGTTGATTTTACTAATACAGTCATCATTATGACCTCAAATCTAGGA
GCAGAGTATCTCTTGTCAGGATTAATGGGCAAGTGCACCATGGAGAAGGCCCGCGATA
TGGTCATGCAGGAGGTAAGCTAGAACAGCCTATTTTCTGCTAATTTTCTGAGCATTGTTT
CCTAGTTTACATCTTTATTTGAGGAAGGATTGTTCACATATATCTTTTTGTGACAGGTGA
GGAAGCAGTTTAAGCCTGAGTTATTGAACCGGCTAGATGAGATTGTAGTGTTTGATCCT
TTATCACACGAGCAGTTGAGGCAAGTATGCCGTCACCAACTGAAAGATGTAGCAAGCC
GTTTAGCTGAGAGGGGTATCGCCTTGGGCGTTACCGAGGCCGCGTTAGATGTCATACT
TGCTCAGAGTTATGACCCTGTAAGTATCACCATCTGGTATTTCAACCTGACATTTCATGG
TGATTAGACTAGGGTCTGAGTTGAGATACCAACTATGCAGATTTTTGCATTTATCTTGCT
GTGGCGGGTTACACTTGTTTTTTCAGTTGCTAATTTCACTTATTATGGAAAATTATTTGTA
GTTACATTTTAGGTGATCTAACATTCTAAAAATTATCTTAGAACCGTTGGCGTATAGAAG
CGAAATACTTTTGACAATTGATTGTGCTAACTTTTGTTACAATTACATCACAGGTTTATGG
TGCAAGACCTATTAGAAGGTGGTTGGAGAAAAAAGTGGTAACTGAGTTATCCAAGATGC
TCGTGAAAGAGGAGATTGATGAGAATTCTACCGTCTACGTCGATGCTGCATCCAGTGG
GAAAGATCTAAGCTACCGAGTGGAGAAAAATGGAGGGCTTGTCAATGCTGCCACTGGG
AAAAAATCTGATATATTGATTCAGCTCCCTAATGGAGTGAGGAGTGATGCTGCTCAAGC
AGTGAAAAAGATGAAGATTGAAGAAATAGTAGACGAA
SEQ 34
TCACTTGCTTTCAGGTATGATACTAACAAGGAGACATACTATGCCAGTAACAACAGGGC
TCGCCACACTTGTGCCAGATAATCGTTTACAACGTGTGCTGATTTTGGATCCCATAATCT
CACGCCCATATGCAACAATGTCTGGCTTTACACGACCATAACTGTCAAGGCACAGTACA
GTTAAGAATCTATGGTTTGATAGACGAGTGTAATTATTTAACTGTTATTAGAAAAGCAAG
ATCCTAAAATATTCAAAGAAAAAGAAAGAAAAAAAAATGAAGCAAAGAACCATAATTTTG
AACTTAATTCTTTTTCAAGAAAAAAAAGAAGCAACAATAACAACACCAATAACAAGCCCA
GTATTTTTCCACAAGTGGGGTCTGGAGAGGGTGGGAAGTACGTACCCTTACCCCTACC
CTAGAAGGACAGAGAGCTTGTTTCCGATAGACCCTCGGCTGGAGAATGGATGACAAAA
ATAATGGCAACAATAAGGAATAACAACAAGATAAAAATACTGAAGCCAAGAAAGCAGCT
AAACTCTAGGTAATAATAGCAATCTATGAATAAAAGGATATCATACTAACACTGATGCTA
GCGAACTGGGAAAGACAAAGAGATACGTTCGACTACCTACTAGCCTTCTACCCTAATTC
TCGACCTCCACACCCTCCTATCTAGGGTCATGTCCTCAGTCAACTCCAGTTGCGCCATG
TGTTGTAACCTCGCCCCAAGACTTCTTAGGCCTGCCTCTACCCCTCCCGATACCCATTG
TGGCTAACCTCTCGCACCTTCTAACTGGGGTTTCTATACTTCTCCTCTTAACATGCCCGA
ACCATCTCAACCTCGTCTCCCGCATCTTTTCCTCCACCGAAGCCACTCCCACCTTATCC
CGAATGATTTCATTCTTAATACCTAGTATGCCCACACATCCATCTTAACATCCTCATCTC
AACTACTTTTATCTTCTGGACATGAGTGTTCTTGACCGGCCGACACTCTGTGTCATACAA
CATAGTTGGTCTAACTACAGCCTTGTAGAACTTACCTTTAAGTCTCATCGGCACATTTTT
ATTACACAAGACACCGGTAGCGAGCCTCCACTTCATCCATCCAGCTCTGATACTGTGTG
TGACATCCTCATCAATCTCCACATTACCTTGTATTATAGATCTAAGGTACTTAAAACTACC
TCTCTTAGGGATAACTTGTGTATCAAGCTTCACCTCCATGTCCGCTTCCCGGGTAACGT
TGTTGAACTTGCACTCCAAGTATTCCGTCTTGATCTTGCTCAACTTGAAACCCTTAGACT
CTAGAGTCTGCCTCCAAACCTCCAGCCTCTCATTAACACCACCCCGACTCTCGTCAATC
AGAACTATGTCATCAGCAAATAACATACACCATGGCACCTCTCCTTGAATGTGTTGCGT
CAATGCGTCCAAAAAAAAAAAGAAGCAAAGAGCTTAATTGTGACTTTTTTCTATTTCATG
TTTACGGTTCATCTTTCTTCCTTTCCTTTTTTCCTTAGAAGCTGAGTGGATTGTACAAGA
GGCATTCAACAGATGTCATGCTCCTATTCATCCATAAAGTTTTTGCCATTTTCACCCATC
ATTTTCCACTCAGCAGAATTTTACTCGAAGCATCACAACCATGGATAGAATAAAGCTCAT
AGAATGCTCGTTTGTTTCACAAGAGCTGATTTTTAAAGGCATCTTTTTTAAATGAAGTTG
GTACATCCAACACACTCATTGACTTCCTATGTGGTCATATAGTAGAACACAACTTTAATA
CAGAGAAGGAGAGGGCAGAAAAATAAAGAATACAGACTATACTCCATTGCAAAGAGTAA
TACATAGCAAAGAAAGGAGAAAGAGATACCCATGAGGAATCTCCCAGGTACTCATTCCA
CGTGAAGAAAATGAGGCTAAATGATTACTTTGATCAATGGCACCAACACCAATAACATC
ACTTTGATCAGCAGGATTGTTAAGAGTACCATAAAGTGGTCCATCATTTCCAATAGCAGA
AACCATGATAATATTGTTGGCAGTAAGCTCCCAAACCTAGCGGAAAATATTGATTACGTC
CATCAATATAAAGCAACTATGAGGAGAGACTCCAGAGAGTAAAGGTGTTAAGACAAGTA
GAAATAAGACTGAATATATGCATGCTAAGTTTAGCAAGCATGAGAAGAGTGAAGTTGAG
GTAAGATTAGATAAGATTCCCATACCTAAAATGCACACAATTCAGATATCTAGACTCATT
TTCCATGAAATTAGTATGATCCGTGAAGATATCACATCAAATTTGAATAGAATTGGTTGT
AATGGAGGTGCGCTACAGAGGGTTATGTGATAAGAGGATACCTATCAAAGTTAAAGACA
AGTTTATATTGGGGTAAATGTTTGGCTTTGTATTAAAGATATGAGCATCACCAAAATGCA
TATGCAAGATGGGTATTATAAAAGTTTCACAAGATAAGAAATGATCAAATTTGACAGAAC
ATAAAGGATAAAATAAGAATGTCGTTTGAGATGGTCTCATCATGTCCTAAATAAATCTCC
AAAGGCACTGGTCCTTAGATGGAAACCATGATGATTGGAGGTGCTAAAAGAGATGTATA
CCTAAAATCACATGGAAGGAAGTTGTCCCAAAAGACTTATAATCTCGTTGAATTCATACA
GACTCAAAACAGAACACAACAGAAGCAAAAGTCTGTTATACGCGATATCGACTATTAAG
AATCAAGGTGTAGTCATGCTAGTGCACTTACTTTAGGTCCAATGTCTACTAGGAATCTTT
TTAGTCTGTCTGCACTTCTGAAGTTTGTATGTCAGTAGAAAAAGAACCTCCAGATTTATA
GATATCCAAACTACTGAATCTTTGATGACTCCAAATGGAGCAGAATGGATGGTGAGGAT
TCATTAGCCAACCCAACTAGCTTGGAATTAAGGAGTAATTATTCGTGTTGTTGTACATCT
CATCAATATAAAGGTGAAAAGTTCTGCTAATGTTGTTTCAGGTCCTGCTGAAAGTAATGT
TAATTTCAGTGAAATGACAGGCTTAAACAACTCCGAATCTCTTTTACAAAATTGAGTACT
AGATATAATATACAACCCTTGTGTTTAAGACATCCATGACATAGTTCAGCTTGCAAAATT
AATAGATCTCATGAAACAAACGCCCCATTAAGCTCAAGAAAGCCAAGTAAAATCCATGC
ACTGCATAAGAAATATAAGATTACATGCTGCCAATTATAATCAAACTTCTAATACTTCCGA
GACCACATATTATACAGAAACTTAGACAATAAGGGGTTATGGAACAACAGCAAGATCAT
TTCAATGCCTATGCTGTAGACAAAATGCAATCCAGTATCATACCACATAAAAACAATAAA
GATATAAACCAATAGATAAGTGACCTCACCTTTTCCACAAAAGGGAGATCCAAATAATCA
GGTCCACCTATGCTCAAATTCAGAACATCCATGTTGGTTGCAATTGCGTAATTAAATGCA
TCGAGAAACCACGATGTGTAAGAGACCTGCAATAAACTGAAGAGCCACTTCTTATAATG
CTAAATTGGTCATTACAAGATTGATCTTTTTATTTCTAACTTTTTTATAGGTCGCCTAGCG
TTGTCCTTGTCTGTAACAGTAGCTTTAGTACATGAGTTAGTGTTATTTATGTATTTTCGTA
TTCCTTGACTTATGTGATTACTTGTCGTTGCTTTCGTTCCGGCCTTCTAATTGCAATACT
CAGTTTTAGTTTTGTTCCTTTGTATTTTTTGCTTCGGTTTTCTAATTGGTGTGCTTGTTGC
TGCTCTTCCTTTTATCTTTCCTAAACCAAGGGTCTTCCGGAAATAACCTCTGCCTTCTTG
AAGGTAGGGGTAAGGTCTGTGTATGTACTACCCTCCCTAGACCCCACTTGTGGGATTAC
ACTGGGTCTGTTGTTGTTGTTGTTGATAATGATGGTGTCAAAGCAAAACTTGTCTCGACT
ATTCCAAGGATACCTGCAACCTCCCACTAGCACAGGTACCGGGTATCTCAACCCACCAA
GGCTTAGGCAGATGGGTAGATATCACCTAGCATTTTTTATCTAGGCAAGGATTTGAACC
ATAGTCTCCAAAATTTTAACCCACTTCATTGAACGCTACCCAACACCCTTGGGTGCTACA
AGATTGTTCCTTTTTGTGTGAATAGACTCTCTTTCAAACCCCAACATCAAGGATTCAAAC
CCATCGAACCCATGATGTGCGTCTAACTCACACATCACTTGTTGCGCTCTTACCACTAC
ACCAAAGCCCTGGGGGTGAATACTCCATATCATCTTGATTGTCCTTACTTGCGGATATG
GTTTGTAGCTACTGAAATCAATAGATTGCACAAAGCCAATGATAGGTAGCTTAATTGGAA
AAACTGAGGTCCAAGAACGATAAGGTCTTGGCCTCAAAATCTCCATGAAAGGAAAATAG
TATTTAATGTGTGCCTTATGTAAGTAATTTTGTTTTCCAATTTACTATTAGCAACACTGTT
ACTTGTTATAGTATCTCAAACAGCGCATACAGTTTTTATAATATTTCAAACTGCTTATACC
TCCCCAATAGTGGGCTAATACTAAGTATGGGCTTCCTTGAAAACAAAATAGGAATTATAG
ATTGCACATAATTCGCAGACAAGTTCCTGGCTTTTCTAAAACATAAAAGTAAACAATGTC
CTCCCTCCCCAATCCCCTGAAAAATAGTTGCAATCTTATCACTAAAGTCATAATAAGATG
GCAGAAGAAATATTATATGTTCAATAACATAGCATGTAACATGGACTCCACCACTAAATC
CAATCAGTGGGTTTGGCCGCTGGAAAGGGTATGAGAAAACTGTGTATATTTAGGTGCAT
ATTCTTCCAAGATATTGTAGCTTAATGATGAGAAGTTAAAGCTAGCACAAAATAAGGTGC
AGAAGCAGAACTTGTCATTTACAGAGACTAGGCAGTCTAAAGTATTTTTTCTTCCATTCC
AGAGGACTTTTCACTAAAAACTATGACTGCAAGAATTTGCTATATTAGGTTCACCACTCA
TGAGGTGGATGTGGCACACTCTACTAGCAGAAAACTGGAAGGGAACGGGGGAAGGAT
CTTAACACATCAAGTATTTGCTTTGCTGCAATTAACAACGAAAGGACCGTTTGATCATAG
GAATCATCATTAGCACTCAGCAAGAAGCAGACTTGTATAAAACATCAGTACAATAAATTA
GAGGCAATAATCCAAGACATCAGATTGTTGAAGATCTTCAAGTCTCAGCTTACTTAAACA
GTTTAAGAAAATAAAGCCCCGTCACCCCCCAAAGAAAAGGAATTGGAATACTCGTTCAA
AACAATCCATTACCTGTGCATCTGTAAATACATGGAAAGCATAGATTTCCGCATCTGGA
GCAAAACCGAGGCATTCTTCATCCTGACCAGCAATAACACCAGCTACAAATGTCCCGTG
TCCAACATTGTCATTCAATGTATCTTCGTTGGTCCAATTTGTGCGTTCCTGAAATTTAAG
GTACCAAGCCCACTGTCTGTTATTAATGACGTAAAGATGGTACGAAGATATATACATTG
GCATATATGTAAATGTCAAGCCTCAAACCACATTGGATTCAACTAAAGCATTTCTTTGTT
ACAGAAAATATATTTTTAAACGACAAAAGACAAGATACCTTGATATTACGAAAATGTGGG
TGATCTGCACGGATGCCTGTATCAAAAATTGCCATTTTGACCTTAGCACCAGTATGCCC
TTTTGACCAAAGCTCATGTGCCCCAAAGAGGGATGTGACTCGAGATTTCTGCAAAATAA
TCATACAGGACAGACTGTATATCAAGAAAAATACAGCGAACAACAACAATATGTTCAATC
TAAAAAAGAAATAGAAAATAGAGCAGCAACATGAGACCCAGACACTCAAAGAATGCAGA
CCATGTTTTTCAAAAAGGAAGTATGCCCTGACAACCTTGAGGAAATAAGAATGACAAATT
ATAAACCTATTAACACTCTCCGATCTGAATGTTACGCTCCAACGGCCAACCAAGTTAAA
GTGAGCCACTTAATGATCTAAGGATATATAATCCATTCTCGTTAGAGGTACTTTAGTGTC
TTGAAAGACTAAAAAAACAAAACATGGTCATATATCCAGTGCAAAAGAGAATATTGGGG
CATAGCAGAGACAACTTGTGAAATTATATGGATCACAGGGTCGCACAAAGATTAAACTT
TATAGTGATCAGTAAGGTGCAGCTTCTGTGTAATAATCAAGTTTCCCTTCATATTGCGTT
GAATTCAGTGTGTCATGAAAGATATAAAAGAATATTATCAATTGTAACTTCGTCACAGAG
AAGATAATCTCAGAAGTCATTTTCACAAGTTCGTGAAGTCGAATGCTTAGATTGTAGATA
TATCCACGAGTTCTTCACTAGTCCCTGAATTAGTCACATATGTAACAAGCACTAGAAAGG
GACTGTTAGATAGTTACGGGAAAATAGCTAAATGTAAATACTTATATTTATTATAAGTGTC
CCACTTCGGGAAAACACAGGTATAGATATTATTACATTGTCAAGTGTGCCCATAAAAGG
AATCAGTTGTAAGATATTAGTCTTCAAGCATTCTCTAATCTTTCTCTTATTTTTCTCACTAT
GGAGTCTCAGCATCATGACATATTAATGGAATAACAGACACTTGATTAGACCAAAATAG
GAAAAGGACAAAGACAAAAGGGAACTGAAAGAGATTAATTTCCTTTGAACATATACCAT
GCAATAAAGTTGCAACTATCATATGTCATGAATGCAAAGAGAAGAGTTGCATACATTCCA
ATGATGTAAATTATCAGTAAACGCATACAAAATAAAACACAAATAATCAGTGGTCTTGCC
TGCATCAATAGATGTCTGCTCCAGCTAATTCTCATTATACTAGTGTTGGCCACCGCATAG
TTTTGACCTTCACTGAAGGACATAGCAGTAAAAATCTTTCCTGGCCTCTTCTTCCCATTG
GCAAAAGCCCCATTCTTCTCACTCTTCTCTTCAAGAACTATCCTTTGATAGCTCAAATCC
AACGAAACGTCTTTTACAAGATTCATTTTTCTGAACTTTTCTAGCAAGAGTTCTTTCATTG
ACTCGTCGATTTCCACCAATCCAAAGTCAGTAGGAAATCTCGCAGCCGGATTTTTCCGC
TCAATCCATTGCCAACCCTTAAATTTCAAGTTGTTTTGAAGATAATTCCAGTGATCCTCA
GGTTCCTTATAATGATAGAATCGAACAATATAATTTCTGCTATCAGATTGTTGCTTCTGG
TCATGTTGGCACTCATCGGAGCTACTAGAAATTAATGGCTCAGACTCTATTGGTGGGTT
GAAGCGGATGAGTGTATATACCGGAAGGAAGGGGACAAGTGAGAGGGTGAAGAATGA
TTTCTTAGGAGCTTCAGGCAT
SEQ 35
TCATATTGAAGCGACCAAGTCTTCAGTCTCAGTCGTCTGCTGAGAAAGGGTGCCTCCAA
TCCACCTCTTGAGCATCTCTAAGGCAACTTTAGGTTGATCCATTGGAACCATGTGTCCA
GCATCGTGAACCTGTTATAAGACACCAAAACAGTTAGCTCAAACATCCATCAGTAGAATT
TGAACAATAACATCGACAAAGAAAGGCACCTTCAGGAAACTCAGAGGCCCATGGCTTTT
CAACAATCCAGCTTCAGAACTGTCAACTTCAAAAGGAACATCGGGAGATGCTACAAACT
CTTTCTGACCACTCCATTCCATAGCCTGAACCCATCTTGAGTTACCTGGTAAAAGAGAG
GCGTTATAATATCCGAAATATTTATGTGAAAAGTTTCCATCATTAGGCTTAGAGTTGAGT
CAAAGCTTACCAAGCCAGTTGCAAATAAGATCATATTCTCCAGCATAAACAAGCAACTTT
ATTCCATCCTCGAGCAAGGTTGGAATGCCAGCCTCAAGATTCCTCATCCAATCAACAAG
CATGGCCTGGTACACAGTAGTGCTGCATGAGACAAACTCTATATCCTCAACTCCAAGAG
CCTGCTTAACAGAGTGCATATTCAGCAATTTCTCCATGTTTGAGAAGTCATAGCAGAGT
GCTCCAACGCATTTCTTTCTGATGTCGTAATGCTGCAGGTGAAAGCTCAATGGATCAGA
AATATGGTTAATCAGTCATTTGTTCCAAACTTTGGAAGGCATGCCAATGACAATGTGACC
TCTTGAGTAATTTAACATATTCAACATGAAATGATATGGAGTAGCGATTTAGAAGAAATA
GATTTCTGGGATCATTTCTACTCTTTCTGAGGCTAGTAACACCTATTTCTCTACGAAGTA
CAGAATAAGTGTAATAGGCACTAGAATTAGTAGAAAACGGGAAACAGACAGAAAGGGCT
GATAACTTACATTGATGTCAGCCCCAGCACGTGCACGAACAGCAGAGAATATAGAATTG
CAAACAAAATAGGCAGCCAAGCAAGAGATTTTCCCATCAGTACCTAAAAGAATAAAAAA
GACAGAGAAACTGAGAAGCAAAACAAAGTACAGAGAATTGATTTGCTGTGGTCAAAGAA
CATCCATTTACTTCATCAGCTCCTTCCCCTTTTCTTTTTCACCCAGGGAAAGCCCGAATG
AGTCAAATGATATGGAGGAAAGAAAGATAGTAAACAGTAAATTAAATAATGTACCACAAA
GGTTTATTGCAACTTCACAAACTGGAAGTATTTTGTTGATACGATCATGATCAGACTTTG
AAATTAATCCCATGTCCAATGCATAGTCAGTATACGCAGCGTATTGTATTTTGGGATCTG
TAAGCCCATTCCCAATGGCAAATCCCTGTTCAAATACTTTCAATGAAGTAAAGACACATG
ATTAAGGAAATAAGAATTCAATAACTGGGAAAATGAGGTACCTTTAAGTTTATATGTATT
CCTTCTTTAGCCTTGTTTCCCTTGTGTACTCTAGCAGCAAAAGCAGGAATATAGTGCCC
AGCATATGATTCTCCAGTTATGTAGAAGTCATTCTTTACAAGCTCAGGATGCTCTTCAAA
GAAAGCCTACACCATTATTAATGATCATACCAACACAAACAAGTCAGGATCATATTATCT
CTGTTACGCATCAATTTAGAAAAATGCTAACTGTGTCACATAAAAGAACGAATCTGAAAT
AGCCAATGTGTCACAAATGCTCAAGAGAATTCATACCACATTCGGCCTTCAGAATTTGC
AAAGCAGAAAAATACAACAATAAAAGCAACATATAACAATATTTCCAACTAGAAATCTGT
TGAAAAATTCACGTTCCGAATAGGTAATGTATAGTCTTAAGGCGGCTAAGCCAAGTTCT
GCTAAATATTCTGGTTTAAAAGCTGTTATACATGCTAACAAAATGCATCATGAGGGAAAC
AACTGACCAACAAGTTACCCAGCCAAAATTCAGGGATCAGCTGCAGTTTGTAGTAAAAA
ACAGGAAAACCAGCCCATGAAGAAGGGTATTGAATACTGCAAAAAGGTTGAGGGACAG
GGTTTTCTAGCAATGTGATCACATCTTTTGCCCCTAATGCATTTGGCAGATAAATGGAGT
CAAAATATTTTAACGCCTCCATTTTTGTTGGGATCAGAACCCAGGCATCAACCAACTATC
ATTTCATAAGCACAATATAAGACTCAAGTCCTAGTATATGACATCTCTCCAATTATCTATA
TGGTAAAAGTATTAAGTGACCATGTTTCTTTTGACAAGAGTGGGTTGCTCTAGTGGTGA
GCACCCTCCACTTCCAACCAAGAGGTTGTGAGTTCGAGTCACCCCAAGAGCAAGGTGG
GGAGTTCTTGGAGGGAGGGAGCCTAGGGTCTATCGGAAACAGCCTCTCTACCCCAGG
GTAGGGGTAAGGTCTGCGTACACACTACCCTCTCCTGACCCCACTAGTGGGATTATACT
GGGTTGTTGTTGTTGTTCTTGTTGTTTCTGTAAGGCTCGAGTTCTAGCATTAGAAATCAG
CCTTTTGAGCTCCTGTAGACCTATTGTACTGTACCCCGTCTTTGTATCACATGTACACAG
GTGATCAACACACAGACAAGAGAACTGACAAGCAAACCGCACTCTGAGAGTCCGAGCA
ATTACTTGAAAGGGATCCCAGACAACCCTTACTGGCACCAGTACTTTAATTTGTCACCA
CTCACCAGGCGGCCAATTCTAGACAGGTCAGCAAGGCGATAAACAGAGGTGTCTTAAT
CATCACAATATCATCCCGCACACAGGAGTGGGGATCAGCAAACACTTCTTTTATAACTG
AGAAACTATTGTTTCCAGCATTGAAACTGTGGATAATGAGCCTGCCCCTGTACTTTCTTC
AATTTTTTTCAGAACAGGATTCAAACTACTGACATGCGCCTACCACACATCCCATGTTCG
AAATTGAAACCAAAGCTCTGGGGCACTGGAGATTAGCAAATAGGCTAGGTGATTATAAA
GATATCATTCAAGAGTTCTCCTACTAATTCACGGTACTTTCTACAAACCCCTCCCTCCTT
CCACAGTTGATCACAATAAGCTTGACTACTGACGTATATGTCAATACCACAGCCTCTGT
GAGATAGAAAAGCTTCCATAATGACTACTTGAAAGGAGACCAAGGGGGTTTAGAAAGTA
TTTATCATTCTGTAAGCTACTGCAACAATAATGATTTTACTTAACGGAAAGGAATGCCAT
AAATGAATTTGTATTCTTGAGGATGTTCACAACCAGGACTGAAGTTGCTTCCACCCCTA
GCTACATTCTTTATCCGTATTAAGGAAAAGTTTACCATCCTTTTTTTCCAGGTGAAATGTT
TTATTGGCCTTTAGAAGCAGGACAAATTGTCCAGTGCAAGCCTCATGAATAACATACAT
GAAACTAGGAATTGATAGGTGAAGAAAATATAGGAAACCATCCATATCATTAAGTTAAAC
AATTGACTCTCTGTCATCATACGACAATGATTACAACGGTTGATTCCAGAGGAATAAATA
TGGTTCCAAGTTGCTTTAGGGGTTTAATTTCACACAGAACAAACCTGTAGGAAGTCATA
CAAGTCGTCGCTAACACCTGCTTCACTGTGACGGATGTCATGTCTGTCAGAACTGTAAC
TAAAGCCAGTACCTGTAGGTTGGTCCACATAGATAAGGTTTGATACCTGCAGAAATAAC
ACGTCAACATCATTTCTTCAGGNCGTCAACATCATTTCTCTAGGTTTGGAAAATCTATTA
AGGTATTCTAGTGTCCTTGTAGGAGAAAAAGGAAAATCGATAATAAAAATGAAACATCTA
CTTTACAAGGAACAAATGTGGAACACAAGGCAAACTTGACACTCTAGGAGTCAGCAATA
AAAGACCCAAACCACAAAAACCAAAACTCAAGATCTTATGGAACATAAAGCACTTTCCTC
TCTGCATTCTTGAATTGCCGTGCAGGTAATTTTTTTCAGTAATGAGAAAAAAGAACATTA
AAACAGGCAGAAGCATGACATGGAATTAGGGAAGTGCAGTATCAGAGGTCTAATGAAA
AAAATATGGCTGACATGTTTCCTATGCAAAGCATTAAGATTTCAGTAAAACACAAAGCTC
CTTCCGGGAAAAAAAGTTTTCTTGCAGTCTGGTTGATGATAACTACATAAATGCTGAAAG
TGTAACTATCTAAAGGCTAAAAAGGACATTTCTCCTGACAGATGTATCATGTGCAAGAAA
GAAATGGTCCCCACAGATTTTCTTCACATTGATGAAGTATCAATTTGTGCCTGTGAATTG
GCATTGCATTCATTTCAGGTTGGTTACTGGTACATCTCGAAGAAGGATAATGGTGAAGC
TTGGATTATCAAAGCTTAATCACAAACCAGTATCCTTATACTTATGATTCTTCGATTTTAC
ACAACATTGAACTAGATATAGACATGTTATTAACTGTTTTATCTTGATTTCCTTTTCAAGT
AATCTTTTCAAAACTAAAAATGATAATAACAAAGAGGAACATCCACAACGAACAAGCTTT
TTCGCTAATTGGCACTTTACAAATAAATTAGCAGGCTGCAAAACTACTTATCATGAAAAC
GACTGGATAAAGGACATGCAAAAGATGTTGCCAATGCAAAATCTAATAATCTATACGCC
CATGCACCATTATACTCTTAACTTTCATGTACCATTGAGTAAAAGCAAGAGAAAGTCTTA
TACTATCGAGCTGAATCATCTACTAAAGAGCAAAGAAGGTAAATTGCTACTATTGCTTAA
GCAAAGCTGTGTTTACATACCACAATAAACTGCTGTTAAATAGCATAAACCATTAAGACT
ATAAGGTGGTCTTAAAGATAGTCAATTACCTTGTCCCACCCATATTCATTCCGCACAAGT
GACAAATTATTTGAAATAGAGAAAGGTCCATTTTCATAGAAAAGGGCCAACTCACTGCT
GCAACCAGGCCCTCCACTCAACCAGATGACAACAGGATCGTCCTTACTACCGCGTGAT
TCAAAGAAGAAATAGAACAACCTACAAAATGAAAGATAACACAGTTTCAAGCTAAATAAG
AGCCCAACCAACAATAAGAAGAAGCTAGTTGATACGTTAGACAGGAAGACGAAGATATA
AAGCCTTATTACTCCGGAGATGTATATTTTGGTAAATGCTCACGAAAGACAAATATTTAA
ATGAAACAATGTTGCAGCATTCCTTTCAATTCTTACAACTGATTTGGAAACTTCTTAAGC
GCACTGATTGAGTTAAACAGATAAGTGGTCAAACTGTAACAATCTGATGGTTCCCCACA
AAAGCTACATGTAATGACATTTGAAGGTAAATAAATCAAGCAGACAGAATATTTCTCCAT
TAATCAACCAATCAAATAATTCCAAATTAATTGGGTTGACTAAATGAATACTCTTTAACAA
TTCTGCTATATTTAAATCCAATTTGTTGCAATATTAAATAAATATCCTTTGAATTCATTTCC
ACACCAGTTACAACCGTTTTATTCGAATTTCCGAAGTGGAAAGTGAATTTGTAGGAGGA
TAGAGGGGAATGCAGGGGTGCACTGCACGATAGAAGACTACTGAGAACTTGGCTACAT
TTTGTAGACAACAAAAACCTATTACCCCAGTACACCACAATGCAATGCTGAATCCGACA
GTATTTTAACATAAAAGAGAAAAGAAAAAATTGAAACTATAATTTGGTTGGATGCCTATG
ATAATGCAAATCGGAACAAGAATCACCAAAAACCAGGAAGTGTTTGAAGCAACAAATAT
GGTACTAGTCTGTTCCAATTTGATAAACTTTCTATTTTTGGGACGTCCGGGAACATTTGA
CACCAGTTTGCTAGTATCTTAACAATTTTCAAGATACTCTAATTCATTTCTTTGAAAAGAA
TAGACATTAGGCCTAACTCAACATCAAAAGCTAGCTCATGAGGTGATTGATTGTTCATTC
AATATATAAGGAGACAACAGTCCACTCACTCCACCAATTTGGGACACTTTAATATTTCCA
CACGACGAGGCCGTGGACAACTGGAGTGTGGACAGCATAACTTGCGACCCCAACATG
GAGAAACACAATGTCGACCTTACCCTACCACAACCTCAAAAACTAGCTCACGAGGTGAG
GAACGGAGGATTGCTTGATACCATAGGAGACAACACTCCATTCCCTCAACCAATGTGAC
ACTTCAACATTATGTGTTATTTTTATCGAGTTACATTTTCTACAGTTATCGAAATATGTTTC
TTACAATACTACATCGTACATACTCTAATAATTAAATCAAATCAACATCTTAAATTCCGTG
TTCAACTAAACACTACCATAAAAATCGAGACAGAGATTGAAAAAAAAACTTAACCTAGGG
TTTGAAAAGGTACCTAGCAGCATGAGAATGCTTAATCTTATAATAACCGGCATGATGCC
CCAAATCTTCAAAGGATATAACACTCGAATTCGTCAAATTAGCGAAATTAAATCGCTTCT
CGACGATTCTAGAAGCAGCGGTGGGAAATGGATCCCGATCGACAATGTTATCGGATTC
TTTCGGGAACAAATTTAGCTCGTGTATCAACTTCTCAGCTTGCTTCGATGCTAACTTCGA
AGATATTGAAACCTTCGCGAATGAAGAAGGAGAAAAAGCAAGGAGAAGAACAAGAGAG
AGAAAAAGGGAAAGTGAAAGCTTCATTTGCGCCAT
SEQ 36
TCAAGAAGAAGGGGTCTTCTTTCCCTCATTTCCGAAGGTTCCAATAGGTTTGGCAAGAA
TGTGCTGCATTGCTTTGACTCCTAGTGGATTGTTCTTGCTCTGTACCAACAACATTTAAA
ACACAATCACGTAAGTAAATGAAACAACCATATCCTTCTGAACAAAACTGTTAAAAGATA
AACCTTGGCCTAACTAAAGGGCAGCACGGGGCACTAAGCTCCCGCTATGCGCGGAGT
CCGGAGAAGGGCCGGACTACAAGGACCTATTGTACGCAAACTTACCCTGCATTTTTGC
AAGAGAATGTTTCTAAAGCTCGAACATATGATAGCTACTTTACCAGTTAAACCTTGGCCT
AACTCAAACAAAAATCTAGCTCATGAGATGAGAATTGCCAAAGAACCCATTCCCTTAATC
AATGTGCGACAATCTAACAATAACAAGCAGAGATCACTTACAATTGAGCAGGTGCCTGC
TTTAACATTGCAGACAGGATAATCATGTGGGCAACAACTGTTATGGTCTTTACAGCAAGT
AGCTCCTTCCATGGGACAACAACCCCAAGCAAAACAGTAGTTATAATACTTGTAGACAC
AGCAGCACGTTGTTCCGGCTGGGCATTCGTTATAATCATCACATTGAGTGGGTGGCTTG
ACTGGTGATGGAGGAGATGGAGCTGGTTTTGGGGGGTTTTGGCCTGTCTTTACAGGGT
AAGAAGCAATTGTAGCAATACCACACAAACCTTTGGGGTTGCCAATGTTTCGCTGCATC
CTGAGGTAACCATTTTCTCCCCACGAAGCACCCCATGAGTTCCTCACGATCCAATAATC
CATGCCATTTTCACTACCATATCCTACTGCAACCACACCATGGTCCACTGCTGCACCAC
ATTTTCCGGTAAAGATACCCTGGAGCATAGGCGGATCCAAGATTTAAACTTAATAGGAT
CAACATTTAAATTTTTTAATACTGAACTCATTGTGACTTTGAAAATACAGAAATATTTGTT
GAATCCGTGTAAATACTGGCTAATTCGATCAAAAAGATAACAACTTTATTGACAATAGGC
AAATCTGAAGTAATACCGATTTATAGTGCTGGAAGTCTTTGCCGCCAGCTTCGATAGCA
ACGCTGACGGGTTGACCCGCGACGGCCTTTTTCAGTGCCTTTTCATCATTAGCAGGAA
CATCTTCATACCCGTCGATGGTGACAACCTTGGCATTTTTCTATAATTGTTTCAAACAAA
ATATTAATAGTCATAACTTGTATTTAGTTCATTTCCAATAATCCCTTTTAAGACAGAGAAC
ACCATACTGACCCTTGCTTGATCGCATTTTCCATCTTTGGCTTTGTAGGGGTAGTCTTCT
TCAGTGTCTATTCCTCCATTTTGAATGACGAATTTAAAGGCATCGTCCATTAGACCCCCT
TGGCAGCCTTGGTTATCGGCAGTATCACAATCTACCAGCTCTTGTTCAGATAACGAGAT
CAGATTACCTGTCATTATCTTGTTTACTGCTTCAATTGAAGCAACTGCTGAGAAAGCCCA
GCAACTCCCTATTTCATTTCCCAATTGTAACACAATTAAGACGAAAATCATTAGTTGCTTT
ACGTAATATAAAAATATCATTATTTCTTCAGCATTTATTTATGCATAAAATGGAAGATAAT
CCAATCAATCAAGGCTGCATAATTTGCCAAAACAATTGCACAAAAATACATACTGATGAA
CATAGTGTTTTGTTCAAGGGAAAAAACATTACTTATATTGCCATAATTAGATGCGGAAAT
TGGAGCAAAAGTCCTGCTAACCTAATTTATCATACTTAACAACAACAATAACAAATCCAC
TGAAATCCCATCGTGTGGAATCTCCTTACCCCATCTAGATAAAGTATAGAGACACTTTCT
GACAGACCCCAAAAGATACAAAGGTAAAGAGGAAAAAAGGCCCACGAAAGGTAAAAAG
CAGGGAATTAAATAACAATAAACAGCAATACCAACAAAGCAATCAGAAAATTTAAAAATT
AAACATGCTTAACCTTAGCTAAAATTAACAATAATGAAGGGTCAACTGAACTTGCCATTG
ATCTTAGGTTAAAACTTATGACACTGTGATCATTTGAATCCCTTACCCTAGCAAACTACT
AAATTTTCCGAATTTCCATCAAAGCCATCAGTAAAGAAAGCATTGACAAATGTTAAACAG
AAAATAAGTATTAACAAAAAGGAAAACAAAAACAAGTAAGAGGTGCAGAGATGAAGAAG
GCAGAGTACGAAAGGAAAATACCACATTGTCCTTGATTTTTGACGTCAACAAGAACACC
TTTCTTCCTCCAGTCAACGGAATCCGGCAAACTATCTCCGACCTTGGGGGCATAACGGT
CACTTTGGGTATACGACAACCTACTACGACCATCGGGCTTAGTACCCAAGTAGATGGAC
TTGTACTCCTCGTTGGTCAAATCTGCAAACTGAGTCAAACCCAGCTTGTAACTTTTTTCA
GGCGCAGAGTTCTGTTCATCGATGTATCTAAGGTTGTCCTTAAAGATCTGGAACCGCTT
GTCCTTTTCTCCTAAGGCGTTATACACTTTTTTATGTTCAACTAGCCAAGATTCATACAAA
GACACGATTTCATCGTCTGTTCGCCAGACCGTTGACTCGCCGTTCGTGTGATGTTTTTC
GTTGTAGCTTATAATGGACATGTCCTCCGCCGATGATGTTACGGCGGAGAACATGAGC
ATTACGAGTATGGAGATGGAGAGGGTGGAAGTATGAATCGCCAT
SEQ 37
TCAGGAAGGCGAAACAGCAAGAGGATTCAACAAAGTGCCCACGAACAAAACAACACCA
GCAGTCTCATCTTTCACAAGGAACATGAAGGGATGATCAGCAACAAAGTCTATCTCCTC
TTCAACCTTCATCATCGAGCACCCGAACATCATTGTAGCAACAGTAACAGCTGGAGCTA
CTGGAGCCTCCTCATTTACTTCGATAAAGGCCTTGTGAAAAGCTTTTGCAGCTGGAGCT
TCTGCACCTTCCTCATTTACTTCAATAAAGGCCTTGTGAAAAACGTTTGCAACTGCCAGA
GGATAATTCTCGCCCACCATCTCAGTGAGACCACCTTTAAAAGGTAATGTGAGCTCGAG
TCCTTTTAGAACTTCTAAAGCTTCAATCCCCAAAGATATTTTGAACTTAGGGATAAGAAA
CTCGTGCACTTTAACTTTTTCATATGGAACATGGCGATCCAAAAATCCAGGTTCCGAACT
AATTTTCTCCAGTAAAGTTGGTAATCCATCACGGGCATTTGGGAGATACACATACATGTT
GAGAAATCGCTTGTCCTCGCCCTGTTTATAACGAAGCCTTAACACTTGGAAACCATCAA
AGGCCTTCACGTATTGCCTTTTTTTGCTGGTCATTAAGGGTGCTTGAACAGATCCTCCAT
TAAGGAGATGGAACTCATGGTCTTTTGTATCTGAAGCATTCAACTTTTCAGTCCATGCTC
CTTTGAAATATAGTGCATTCGCTAAGATCAGACTTGTACCGCTATTGACTGCAACAGGA
GGAAGAATTTGTTTGATAAGACCATTCGTTTTCTCTTCAGCCCACTTATTGACTTCACCA
GTAACCTCATCACCCTACATAAGAAAAATTAAAACAAACAGAGAACATAGAATCAGCAG
GCTAGTAACAGGATAGAGTTGAAAGTAAAAGAAACAAAAAAGATGTCCTATATGACATTA
GATTCGTTTGTTTGTAATCTTATTCGAAAAGCATAATAAACGTTTCTAATGTGCTTACTAA
TTTGAAAATATTCTGTTTACCAACATGCCCATAATATTTTGTTATACATTATAATACTCCCT
TTGTTTCGTACTTCGAGGGTCAAACTTTTCAATTTTAACCGTGAATTCGAACATGAAATTT
TAATTTTTGACATAAAAGTCACATATTTAGAGACTGTAAAAGTAATATAAGTCATTTATAA
GGAATATAAGAAAAATCGCAGTCAAGAAAAACTCGACTCTCGAAATCCAAAAGGTGTCA
CATAAATTGGGATGGAGGGAGTATCATGTAATTTCAAACATCAATTTCTTTTCTAGTGAG
TGAAGGATTAAGACTAACCTTGTTCCGAAAATCAACAGAAGCCGAAGCAGCCTTATAAA
CATTGTCCATAACCTGTTTGAAAGAATGCTTAAAAGACAAAGATTGGTCAACCCAGGCC
CAATTAGTGACAGACAAACGAGGACCTCCCATGGGGCTGCCATCGGCTAAGACGTCGG
TGATGACCCGAGAATAAACAGAGTTAAGTTCTTCAACAGAGTTGAATTTGAGAAAAGCC
AACAGTTGATCCAATGTGGAACCACTGGAGCCTGCTGCAATAAGAGCAAAAATTATTTG
AATGGAGACCGGGGAAAACACCATGTTTGCGTTGTTGGACTCGTCCTCGTCGGCTTTA
AACTTGCTGAAGAATACATGCTTTGAAAGAATCAATGGAACATCCATCATTTCTGATGAG
GGAAACTAAAATTGAAACAGAGAACGTAAATCCATAAGATGAGGGAGACTGAAACTGAA
ACGGAGAACAGAAATCCATAATTTCTGATGAGGGGAACTGAAACTGAAACTGAAACGGA
GAACGGAAATCCATAATTTCTGACGAGGGAAACTAAGGTACCTTCACAGTTGAACTTGA
CAGCATGGAGAGAATGCTGATGCATATAGAACTAACAGTCATGGTAGGCGACCACGAG
TCATACAGAATATCTGCAAGACAGAAAATAGTTGTCAACTGCTGCTTTTATTAGATGCAT
TCTAACTTTTTTCATGAGTCCGGTAGAAAATCTTAATTATTAAAAAAAAAAACTTATTCTG
GAAGATGAGAGGAATAACTGTGAATGAAAGGCGTTTTGCTTTCACTTTATTAGGCAGGT
AACTATAGAAAGTTTAAGAGATTAATAAGCAAGCCATAGGCATAGAGCAGAAACCAAAA
TTATGCAGGAAAAGAAACTAGTATGATCTCTGCAATTCTTAGATGAAATGACAAATTTGA
CGAGACCCACAGACAAACAAAAAACAGATCCAAAGTTTATTCTACTAGGGCTAAGAGTT
CTGAATCAGTAAACAACTGTGTTTTTTAATCATCCGTCAACCATCATTGACTATTACTCCA
TAATTAGATACGAATCAACAAATATTATCGGGACATGAAAGGGGAAAAGAATCAATAACA
AAAACACCAAGACCAAGTAAAAAGCTCACGGATATAGCGAGACAATCTACCATCATACC
CATCATCAACTAAGTTGGCATAGAGCATTTCCAAAACTATAATACCTGTCATCTAATCTTT
ATTAAGAGTTCATATCACAAATCATCCATAATTGCCAAGGATTTTCGTTAAATGCTTCATC
TAATGCATTCTAAATCAAGCCATTAAGCTCGAAGAGTAACCAAATAAATAAATTTCTGTT
GGTGATAAGAAATAGAGGAGACTTGAGCTGTTGAATAAATACCTAAACAAATGTGCCCG
TCCCTATAGATATCAGGGTGCAAAGGAGCTGGAGGCAAAAATATCACCTGCATCAATGT
AGTCAGCCCTAATTAGTATGATAGGAATTTCAATCAAGAACTTTAACAAGAAACCACATC
TTATATCCACCTGGGGAGCTTTAATAGGATAATGTTCAGGGAATTCGGCTTGAAGCTGA
TAAGTTTCATTAGCATACAGCGTCCCAGGAGCACCATTCACTTCAATTAACCACCTTTTC
AAAAATCAAAAGTTCAGTAATAATAGCATGGGACTTCGAATTCAATATTAGAAATTTATTG
GAAAACGACAAAAAACTTAACAACAAACAGGAAAAATGAAGACCTTTGAAGATAATCGG
AGGGTTCAAGATTGAAGCCAGACGGGGGATTGACCTGCCAGTTCCCTAGCTCTGTATG
GAGTCGATTGCATGCCAT
SEQ 38
CTGAAAGTTGGTTCCTTTTTTTCTTCTCTTATTTATTCATGCAATAAAGCATCTCCAAACT
TCTATTCTTATTCATTCTCTCTGCTTTCTTGCTTCATCGAACTGGTGAGTAGTTGTTTTGT
TTCCTTTTTCTTCTATTTAAGAAAAAATTAACTCTCTTTTTGTCGATATATTTTATCCTTTTT
TTTTCTTTTTTCTTTTTTTGTTTCTGTGGGTATTAGAGGTTTTGTGCTTCATTTCATATATT
TGTCTCATGATTTTACTACTTTCAAGGTTGGGCTTTTTCCTTAGCAAGAAAGAACTATTTC
TGTTTATGTTTCATTTTTCTTTTGGACCTTGGTTTTCTGTTCTCGAGGATTGTATCTGTTA
AAAATTGAAGTACTTTTTTTTCCCTTCATCTTTTTAATTGATGTTCTGTTTAGTGTTATTTT
CACCTTTTATGGCATTTAGCAATGTTTGTGCTTTGACGGGTTGCTGTTATAAACATAAAT
TTTGGGAAAATAATTACCAGGTAAACTTGTTATTATGCAAGTGCAATTTGTGTGCGTGTG
GTGGTTTTGTTGCTAGGGAGCAAGGCATGTGATTAGTGATAAGAGGGTTAAAAGGGGA
GTAGATAAACAAAGCTCCACTTTTTAGGCTATTGTTTTTACTTGGGTTCTTCCATTTTTTA
TTATAGCTTGATGAAGTAATATGTAGCTTATAAATTTCCCAGAATAAGAATCATCTCTTGC
CTTAGAAAAAATAATTTACCAGTAAGAGCAGAATATATGGTAGGATTCATCCACTCAACT
CCAATTAGTTTGTGACTGAGGCAAAGTTGATTGAGTGATCGATTGAGTTTAGTCTCATTA
GATTGTCATTTATCCATTAAAAACATGCAGCAGGCATAACATGAGTGATTTGATCTTCTG
AGCATTTTCTCTTGTTTGTTGAATTTAATATATCTTCACTAATTGCTTGGCCTAAATTTTAT
TAACTCAAAGTGATGATTTGCCTAGGTCAATATGGGAGCAAAAGCTTTTCTTGTCACCAT
TTTACTCTCATCGCTGTTATTTCCTTTGGCCTTGTCTACGTCAAATGATGGCTTGGTTAG
AATTGGACTGAAAAAGATAAAATTTGATCAAAACAATCGACTTGCTGCACGCGTCGAGT
CCAAGGAGGGCGAGGCTGTGAGAGCCTCTATTAGGAAGTATAATAACTTCCATGGTAAT
CTTGGGGCCTCTGAGGATACAGACATTGTAGCACTGAAGAACTATATGGATGCTCAGTA
CTTTGGGGAGATTGGTATAGGCTCTCCCCCTCAGAAGTTCACAGTCATCTTTGATACTG
GTAGCTCTAATTTGTGGGTGCCTTCATCAAAGTGCTACTTCTCAGTAAGTTATTTTTTTC
CTTAAAAGAATGCATAATAGAGAAAGCTAGTATTGGCTACATAATTTGATGATCATCAAT
ATTTATGTTTCTCTATGTTTGTGCAGGTTCCCTGTTTTTTCCATTCCAAGTATAAGTCAAG
CCAATCAAGCACTTATAAGAAAAATGGTCTGTTTCTGACCTTTGTCTATATTTGATAATTG
CAACACGACACGTGCTTTTCTCTTATACTTGTTATTTATGCTCAATGCTTGCTTGTAAGA
GAAAGCGTTCCATTATTGGCATTATACATGACATGTCTTAGGTTTTGAGATCAAAACTAT
TAACTCTGCTACCAACTTAGGATTTTTTTAAAAAGAAAATAAAGGAAACCCTCACCATTTT
TATTGTTGTCATCCAATTATGTGCCTTGTATCAAAGTTTTTTGTTGAAAAATATAATTTGG
CAAGTTTATGTTGTTGGCTTTCCCTGCCAAAAATGTGCTAATGTTATCTCTCTGATTTTTT
TTACTCATGATTTGCAATAAAAGCTTGTGCCTTTTAAACTGTTTTGTCTATCAAGGAATCT
GTTATGCTGGAGTTCCTTTATTGAGTTTTGATATCTATCATAATTTACTTTCCTGGAAAAT
TGATGTCTGCTGTGTGTTTGATATGACCTTTGAATATTCTTCTCTGTCGTTGAGTTGGTC
AACGTGTTCAATTGGTTGTTGACCTAAGAACCTGTTCATCCAAACCTTTTTCTGTTTAATA
TGCCATACAGGGAAGTCTGCTGCAATTCGTTATGGTACTGGAGCAATATCTGGATTTTT
CAGTCAAGATAGCGTTAAAGTCGGTGACCTTATTGTGCAAAATCAGGTGAATGTGGCTT
CTCACTTCCTTTTTTTTAATTTTTTTTTATGTTTCTTGAATATATGGTCTCTCATCTGTCGA
GATTGTTAATGACATCAGGAGTTCATTGAGGCAACAAGAGAACCCAGTGTGACTTTTTT
GGTAGCCAAGTTTGATGGTATATTGGGTCTTGGTTTCCAGGAGATTTCTGTTGGAAATG
CTGTTCCAGTATGGTATGTGGGTTTATTTTGTTTGCGTTCTCTTCTTTCCAAATGTTTCTT
CAATTTCCTATTAACCAAGTGCGTGCCTTGTGAATTTCATTATTATTGAAATGATTTTATC
TTCTGGATTGCAGAATTTCATGAACATTTTCTTCTATATAAAGTTTTAAGTGATACCGGTC
TTGACGGTTTCTTCTGTGTTTTATAGGTACAACATGGTCAAACAGGGTCTTGTCAAGGA
GCCTGTCTTCTCATTTTGGCTCAACCGAAATACAAAGGAAGACGAAGGGGGCGAAATT
GTGTTTGGTGGGGTTGATCCTAACCACTATAAGGGAAAGCACACCTATGTCCCAGTCAC
ACGGAAAGGTTATTGGCAGGTAAATATCCCTATATCTTCGGAAGATTGATGTTTTGCTTT
CTGCAACTGTTTTCTTACTCTTCAGAATATAATATGCAGTTTGACATGGGTGATGTTCTG
ATTGATGGTCAAGCTACTGGTATGTTACGTTACTTCCTTTTCTATTTTTTTGTGTGTGGA
GATTTCGAGGATATTGATGAGAGCACTTTCCCATGATTTCCCTGCTTTTTCGTTGTATTG
ACATACTGAATAATGTAGGTTACTGTGACAATGGATGTTCTGCAATAGCGGATTCTGGG
ACTTCTCTCTTGGCTGGTCCAACGGTATTCTCCAAAGCATATTCCACTTTTTGTCCCTAT
TATTCAGCTATTTTCAATAGTGAACTAGCTCAGAATATTTTTTGTACCTTCTTGTTCATGT
GTAGCTTCAACAATCTTCGAGCGATGAATAGGTTTAGTTTTTGGTTGGAATATCAGTTAA
ATAATAATCAGCCATTCCTTTGAACTTTTCTCGTTTTTTCCTTTTCCTATTCAAAAAAAGG
ACGACGGGAAGTGCAGTGGAATTGATGTTCATCCCAGTATCAGGACAAACTACCTTGTT
GATTGTCATACCTAAGAAATGTTTTTTTTTAACTTTTGCCTGTTGTTTCTGTCTTATTAAAT
TAATGCAACTTGAGAACTGCTTCTTTCTTCTCATCTTTAAGGCATGGTTGACAAATATGA
TACAAGGAAAAAGCTGCAGCTTTATTTGTCTAGACAATTGCAGTAGTGAAATGCTTTACT
ACTACATTTTCTAGTTCTCATCACTGTATCCTTCCTCCTCTATCTTGCAGACTGTTATCAC
TATGATTAATCATGCCATTGGCGCCTCGGGGGTTGTAAGCCAACAATGCAAAGCTGTTG
TTGAACAGTATGGACAAACAATAATGGATATGCTTTTAGCGGAGGTGAGCAATTAATTAT
TTTAGTTGATAGTTTGTTTTTGTTTTTACCAATAGTTTTCCGTGGTATCTGCAAAGAGGGT
GGTTTCGTGCTACTAGTTGCCTTCCCAATATTCTGATGGATTGGCGTCTTAACAGGCAC
ATCCAAAGAAGATCTGCTCGCAGGTTGGGTTATGCACCTTTGATGGAACTCGTGGCATT
AGGTTAGGCTAATCATTTCTTTCCTAACCTTGGCCAATCATTTGATATGTTAAATCCTATT
ATAAAATGTGTGCTGAGTGGATTTATGTCCTCCACGTGTAGTATGGGCATTGAGAGTGT
TGTAGATGAGAATGCTGGCAAATCTTCAGGACTGCATGATGCTATGTGCTCCGCTTGTG
AAATGGCGGTTGTCTGGATGCAGAACCAACTTAGACAGAACCAGACCCAAGAACGCAT
CTTGAACTATGTGAATGAGGTAAATAGCATCAGTCACATGCTTTCTCTTCTCATCTTAGG
TTAGATTACTGACCATCTTTAACAGCTTTGCGAGCGACTACCAAGCCCAATGGGACAAT
CTGCTGTTGATTGTGGAAAACTTTCTGGCATGCCTAGTGTTTCCTTCACAATTGGTGGC
AGAACATTTGACCTCTCTCCTGAGGAGGTATGTCTGATATCAATCTTCTGTAGTATACAT
GGTGTCTTCTCAACTTGTAAATGGCTTTTGATTCTTCTGAACGACGTGGTTGGTTGTAGA
ATCCTTTTGTCATGTTTCAGTTTGGCAGTTCAATTCTTTTTGGTTTTCACTAGATTAGCTA
GCAAGGTGTTACGCTGCTTTCAAGAGAAGTACACTTGTCTTGTAGAAAATTTCAACCAT
GACAGCTAAGTGTAGTTTGGATAATTAATGATATTGAATGTGTCGAGCTTCAATATCAGT
TTCTTTGCTTGATAAGTTAACTTATGATTGGATAATTAATGTCATTGAAGTGTGTCGAGCT
TTGATATCAGTTTCTTTGCTTGGTAAGTTCATATGATTGTACTAAGCTTGCATGCTTGTCT
TGTCACCAGTACATACTCAAGGTGGGCGAGGGTCCTGCTGCACAATGTATTAGTGGCT
TCATTGCCTTGGATGTTCCTCCACCCCGTGGACCTCTCTGGTATGTTTTCTTTTCGTCTT
AACACACGTGCAGATTCTGTTATTCTAGAAAAGTTATACCAGCTCCCTTTTGATAATGCT
GTTTGCTTATGGCTTTGGTGGTGCAGGATCTTGGGGGATGTTTTCATGGGTCGATATCA
CACCGTCTTTGATTTTGGCAAACTTAGAGTTGGATTTGCAGAAGCAGCT
SEQ 39
TCATATGGCTGCAGGTCTTCCATCTTTCCTAGGATCACTTACCGCCACAAGCATCCCAT
GAAAAACCCCGTTTTTATATTCTTTCCCGCTCCTTCTTCCCAACTTTAAATGAGAATTTG
GAAGGTTTTGAACAATTAGCTGACAGATGGCTCCTCCGTTATGTGCTTCGAGTTGATGA
CCCCTCTCTTCCAAGAAATGCTTTTTCTCATCCGATAGCTCAATGTGATCGCCATCGATG
CATGTCCAGTTCTCATACAGAACTACATTCGGAATTAGCTGCAAAAAGAGGAACACGAA
TCAATACTCGGGGTTTCAGAATTCGTCCGTATCAAGAAATACGTGAATTGCTCAGTAGT
GCTTCGCCACTACGCTAGCTGGCGGAAATAGCTTAATGGTAGAGCATAGCCTTTCCAA
GGCTGAGGTTGAGGGTTCAAGTCCCTCCGCTCCTGGCTTCGTCGTTTAGTGGTAACAA
GTTCCGTGCATAAGCCACTTTAGAGATAGGTGATCCTTAAAAATACTCCCTCTGTTCCAC
TTTATGTGAGCCTTTTCGGAGCACGAGGTTCAAATTGACCAATTTTCCTTGTGGATTGAG
ACATAGAATTTTCAAAAATTACTACATAAAAAGTACTATAAGTCATACTAATAGTTAACAA
TTCAAAATAAGAAAACTTTGTCTGACTCCCTAAATAGTAATAGATTCACATAAAGTGGAA
CAGAAGGAGTAATATACATTGCTGATCAGGCAAAGGGACCGACCTCGTGGTAGACTCT
TGGACTCTGCACTGCTGCTAAAGGATCCATTCCCAAGATGAAATGGTTGATGAAAACCT
GGACCACCGCGGGGATTATTTTCATGCCACCACTACCACCAATTACACCAGCCAACTGA
TTATCCTGTTAACAAGAAGAAGAAAGAATCAATAAAAGATATACTAAACAACAACTACTG
AGGAAGTTCGGATTGTCCATATACTAAGCAGTTAAGAAAGATGACGCATTGCTCAAAGG
ATCGTTTCCTGGATTGGATACCTTGAGAACAATGATTGGAGCCATGGACGACAACGGTC
TCTTTTTTGGTTGAATAAAATTAGCCGGGGCAGGAGGGAGTTCATCAGGGGATATCTCA
CTAGGTGTTGAGAAATCTCCCATTTCGTCGTTGAGTACAATACCAGTTGATGGAGAGAG
CACACCGGCTCCAAATGGATAGTTTACTGTGGTAGTTACTGATACAGCATTTCGATCAG
AATCTACAATACAAAAGTGACTTGTTCCGTGATCTCTTAGCTGACTCCACCTACAAATCA
AGGACCAAAACCGCTTTAGAGCCATTGGTGTGCAATCAATAGGCTTGTTAGTTCTGGGC
TCAGTCTGAAATACTCCTTCTGTCCCAATTTACGTGGCGGTGTTGGATTTCGAGAATCA
AAAAAGTTTTTCTTTGACTGCGATTTTTTCATAAGCCTTTTAAATATTTTGATTTAATTATT
ATTGTGACTTATAGTACTTTTTGCGTAGTTTCCAAAGATTTAAATTTTATTTCAAGACTAA
AAGATTCTATGTCCAAATTCATGGTCAAAGTTAACTTATTTGACTCTCGAAATTCACAAAC
CGCCACATAATTGGGACGGAGGAAGTAGAATTTATCTAACCTGGGCATATAGTATTCAG
GAGGAAAGGTGGTATTGTCGAAAATCTTCTGTCGAATTGCTTTGGCAAAAGATGGGGAA
AGCATGTCTGATACAGTTTTGCTGATATTTACAAAGTCGGGATCACCGAGGTCCATCCG
AAATGCAAACATGTGTTTCATCGCCTCAATTAGTCGATGCAGACCTAAAGAACCTTCTG
CAGCATTATAGCTTTCAAGGATTTTAAGAATCTGCATTTCCCCAATCGCAATGGTTTAGT
CAGACAATTGCGTTGTATTAAGTTTAGACATAATTGAGGAAGATTGGAACCAATACGTCT
ACGTCTGTGTTATTAGAAGGAAAATAGCAGTACAAGCTAAGTTGGTTTATATACATACCA
GAGAAATCCCCAGTGTTCCACTGGACGGAGGTGGCATTCCAACGATGGTGTAGCCCAT
AGCATTAACGGTAACTGCTTCTGGAGTTTCCACTTTGTAATTCCTCAAATCGTCCATTGT
CAAAATTCCACCCGCTTTTTTCACATCTTCGACAAGCTTTTCACCAACCTCTCCATTATA
GAATGCTTCAGGCCCTTGTTCAGCAATAAGCTCTAAGCTGTGGCTAAGTTTTACATTATG
GCAAATATCACCTGCCCGTAACAATTTCCCCTCTGGTGCAATTACTTGTCGTAAACCAG
GATCTTTAAGTATCAACTTTGCTTTTGACGCAATATGATGTGCAAGATATGGAGCAACCA
CGAATCCATCTCTAGCAAGTTTAATCGCTGGTTGAAATAGGGTCTTCCACGGCAACCTG
CCATGTTTTGACCAAGCGGCGTGAAGACCAGCTAACTCACCGGGAACTCCCATGGACA
ATGCTCCCTCTAACTTGGATTTTCCATTATTATCATACATGTTCTGCTTGAGGACAAACA
AAACAAATTCAGAACCTGCAACTTCGTGTTGTCTATGTTTATATACTAAATACTAATTCCA
ATTCCCAATAAGAGCAGAACTGCAGTTTTCTCATGTAGACAGCTACAACAGCTATTACTA
CTATGCCTCAATTCCAAGCAAGTTGGATCAGCTACATGAATCCTCACTGTCCATTTCGCT
TCATTAAGCCACAGTTTACTTATGCCGATACAAATTAAAGAATTTAACTTCTATACACTAA
TAACCTAATTGTATTTTACAATATCAGTACTTCAATCCAAAGCAAGTTGGGATCGATTATA
CGAATCCTTAGTGTTCATGTCTCTCCATTTGAGCCAAAGTTTACTTATGTCGATACAAAT
TAAAGAATATAAACGTAATACACTAACAACCTAATTGTCTTTTTACAATATCAGTACTTCA
ATCCAGAGCAAGTTGAGATCCGCAATATGAATCCTCAATGTTCATGTCGCTCCATTTAA
GCCACAATTTACTCATGTAGGTACAAGTTAAAGAATTTTAACTTATATACACTGATAATAT
AATTCTTTTATACAATATCGGTGTATTTAACTTGGTGTAATAGGCAGCGTGTCTTATTTTC
CACATTAGTATTCCCACTATTTATGGACGATTGCATGTAATTTTCGGTTAGGTGACCTGA
TACTGTAAAAATTCATTTCTCCAAATTAAATAGTTATTTGATGATGTTTGGATGAGTTGAA
ACTTGAATGAGAAAAAGCAATGCTAAACGAGTGAATTACGAACCTGTGAAGCAGCTAAA
GGAGCAGTTTCCCTCATATCAATAGCTTGAACTTCTGATGTTGATGAAGATCTAACAACC
ATAAAACCTCCACCGCCAAGTCCGCTGGCCATTGGATTGACAACTCCAAGGCAAAGTG
CTGTGGCAACTGCAGCATCAACAGCATGTCCACCAATTTTAAGCATGGATATACCAATT
TCCGAGCATCGACCATCATCAGCAGCAACAACTGCTTGCTCCGATTCAACAACGTCAGC
ATTTTGCTGCAGTTTTCCATTATATCTCTCAACATCTCCGATTAGCCAAATACCAACGTG
TCCATGGTGTCTAAGGCCTATAACTAAATTAAGATGGAGGAAAAAAAGGAACTTAAAAAT
CAAATGACAAAGAAGTGCAAATTGCAGGTGCTAAAATAATTGTGTAAGCTCGAGCAACT
TCTTGATTCTTATTTCAGAAGATAATTAGACAATCGATGTTAAAGGTAGGAATATACATG
CCAATTCTTTATATTTTTTATAATGGAAATATGCCCGAGAGCTAATGGCGCATTGTTCGA
AACTCAATGGATAGTGGGCCCGCTCCTCTAATTCTCACTTAAAGTAGGATTTTTGTCTAT
GACAATGTTCCACTGCTAATTTTGGTTTAAAAAAGTAGGAATGTACATGCCAAATTTTATT
TTTTGATAACCGACAAGCGATTATCAGAAAAGTGCAACCCGGTGCACTAAGCTTCCGCT
ATGCGCGGGGTCCGCAAAAAGGCCCTTGTGGTCTGGCCCTTTCCTGGACCCGTTGCAT
AGCGGGAGCTTAGTGCACTGAGTTGCCTTTTTTGGTAACCGACAAATCCCAGGGTCATT
AGCGTATTGTTCGAAACTCAACGGATAATGGGTCCGCTCCTCTAAATTCTCACTTAAATA
CTAGGATTTTTGTGTATGACAAGGTTTGAACCTGTGAAATGCGCACTCACACATCACAA
GTTGTGCTTTTACCACTAGACCAAAACCCCACAAGCTTGTACATGCCAATTTGCCAGGT
CAAATATTATCCAATTCAAGAAAGCCTGTAAATTTGACAATTTTTAGGCAAAAACCCAGA
ATTTGATTTTTTTTTAAAATATGTCCTTACGATAAAGGGTCAGCAAAATTTTACGCTCTAC
GGTAATTACTCAACAAATTCTTTCACAATTTCAAGATTTATAACATAACACTTCACTTCTC
TTAAGTAAAGCTAAAGAGAGAAATTCCTGTTAGGATCAAAAAGTCACGTGTCATGCGGA
AGCTAGTAACACAAATCTTGAACGACGATAAATCAAACAACAAAAGAGAAATATACCAAA
AGAGACACAAACATTTAACGTGGTTCGGTCAACTGACATACGTCCACGGCGGAGATGA
GCAATCCACTATATATAAAAGAGAGTTCAAAATATCGAGATAACAACCTCACGAAGAGG
CAAACACAAGTGATACACTAACATTTGTCCCGTAAAATTCTCCCCCTAAACACGACTCTC
AAACCTCATATGGCTACATCGTGGATGTTAGAGATAAAGTTCAATCTCTATAAGTTAGGA
TAGAAATCTCTATTAGTTAGGATAGCTATGTTCTGTTAGCTATATTTTAGGATATATGATT
GTTCTATTAGTTACCTTATCTCCCTAGTCTTCTATAGTGTGTTGTAGACTGTTGTATATAT
ATTCAACTATGTACTCAATAGAAAATCATCGAATTCTCTCAACATCATCTCTCATAATGCT
ACTGAATGGGAAAGAAAGATCTCAATTTATAGAAGTTCAAACATTTTTCTACCAGAAAAG
GGACTAGCCAACTATGGAAGCATTATATTTTCCTTCTAGGAAAAGAAAAACTGAATTATG
GTAAATATGTTGTTCTTTCCTCCGTGAAATAGGAAAATCAATTATAGTAAAAAAATCTAGA
CAAACACGTAACAATTCCATAATCATGGTGTTAATTAACTTCATTTTTCATAGCTTTTTAA
AGCCCAATTAACGAAATTCCTACAGAATTCAACTGAATATTCTGTTAACAGAATTGCAAA
TACTAAGAAAACAAAGAAGAAGACAAAAAGTCAAAGGTGAAAAACTCACATGAAATGGC
AGTGAGTGCAAATAAAAAGCAAAGAGCAAAACTCCATTTCTTCCTTCTATTGAAAGTAGC
AGGAGAAGGGTCCAACAATGGAGCTTCTAAATTCTGTTTACTCAT
SEQ 40
TCAGTTGTGTCCTGTCAAAGGATCTACTTTTATGCTTGTGGCAACAATTGGACTTCTAAC
CACATATTTACCTTCAGTTTCTACCCAGCTCAAAGAACCATAAACCACAATATCATCCAT
AACTAATGGACCTTCTATTCTTAGCTTGTAGCTCAACTTTTCATACTTCTCTTTGAAAACC
AACTTTTCAGGTACAAGATTAACTTTAAATTTACCCATTGTGGTCAATTTTGCTGTGTATA
CTGACATACCATCTCCAATATTAGTCACGGTCCTCTGGAATTCTTGTATCCTTCTAGGAT
CCGACTCGCTGCTGTTCCCATTGAAAAATCCAATGAAAGATGGATAGTTTAAGTCCAAT
GATGGGTTGGAGCAAGTATAAGATGAGGATCTTGTGATGGTTTTTATTTGTTTGGATGT
GAAGTTCAGAGCACAGAGAAGATTGACATAATCTTGTGGTGTCGCATCATAGATAAGTC
CAGGATCTAGTGCCTTGTTTGGATCGATATGGCCAGCTCCCATGGCTAGAGGAGTAGC
AGCAGCATTCTTACTACCTGTTGAGATATAATATAATTAAATGATTAAATATATGCTCTCT
CTAACATATAAGCTTACTTGATTCAACATAGTATCAGAGCATGCAAGAGGTCCTAGGTTC
AAATCTCACCGCCACCAAAAAAGTCATAAAATAATTCCAAGTGTTTGGTCCATGAAAAAA
AATCAAACTTTTAGATGAGATGGTCACACAATTCAATATTACCTATGTCTCGGATGGGAC
TTTGTGTGTTGTCCATCGCATTGGAAGTGGTCATCATGGCAGATCGGATGGCTGCAGG
GCTCCATTCAGGGTGTGCGGCTTTTAGAAGTGCTGCTACACCAGAAGCATGTGGACAT
GACATTGATGTACCAGATATAATATTGAAGTTACTAAAAAGTTTTCCTGAGGTAACATCA
GTCACTGGTGATTGTTGTGGCCATGAAGCTAGTATTAAGGCACCAGGAGCCATGAGAT
CAGGCTTGAGGATACTTGGACAGCTCGGTGACGGTCCTCTTGAGCTATAGGTAGCAAC
TTTTGGTGCTGGTTTAGCACCAATATGTGTCACTCGGAATTCAAGTTTTCCTTTAGGTGC
AGAGTTGCTCTTAATGTACTCTAGAACTTTATCACCCTCTTGTAAGTTCAAGAACACAGC
CGGGAATTCGCTTTGGAGGTAGAATTCCAAATCAGTTATATTAGTTATGAAGACAGCCC
CAGCAACTTTTGAATTTCTCACATTGTACACATGCTCACTGACCGAATCATTCTTGTCAA
GGCAGACAACAATATTGTGTGCACTTTTTTGCAGTTCCTTGTCATCTTGGCATTCAACAT
AGACAATGGAGCTTTCACTTGAACTAGAATTCCCAGGGTAGAGCGATAAGCCAGTGACT
GAAACTCCATTTCCAAGAGTTAATGCGCCAATAAATTCGCGGTCAACTGTGCCAGCTGC
AACAGTTAGCACCCAAGGTGTTCCATTGTGCAAAGTCTCATAATAAGGCCCTTCATTTC
CTGCAGAGGTGGAAACAAATATACCTTTCTCCAATGCAGCAAATGCGGCAATTGCCACA
GGATCTTCGTGTAGTGGAATCGCGTCTATGCCTAATGACAAGGATAAAACATCTACACC
ATCTGTAATTGCTTGATCAATTGCAGCAAGAACATCAGACAAGTATACACCTTCTTCCCA
TAGAGCCTTGTACATAGCCACATGAGCCTTTGGTGCTATGCCAATAGCAGTGCCGGTG
GCATAGCCAAAATAAGATGCACCCTCGACATAACTTCCCGCAGCTGTGGAAGAAGTGT
GAGTTCCATGTCCATCTGTATCTCTAGCAGAATTCATTGAAATGTTAAGATTTGGATTGT
TGGCAAGTAGGCCTTTATTGAAGTAACGAGCGCCAATGATTTTCTTGTTACACAAAGAG
GAATTGAACTCAATGCCACTTTCACATTCTCCTTTCCATCTTGATGGTACTTCACTAATC
CCATAATCACTATAGCTTTTACTCTCTGGCCATATTCCAGTATCAACTAAGCCAATTATG
ATATCTTTACCATAGTCGGACGTTGGCCATACACCAGACTCAGAGTTTAGGCCAAGGAA
TTGGGATGTGTGAGTTGTGTCAATTTTAACTGACATATCCTTAATTGAAGAAACATAACC
TGGAGAATTTTTTATGGCTTCAAATTCAGAAGGAGAAAGACTTGCACTAAAACCATTGAT
GGCATTAGTATAAGCATAGACTAGTTTTGAGGACAAGAATTCTTTGTGATTTGTACTACT
GTCTGATAAAGAAGCAAGTGTTGTCAAGTACCAATTATGATGGCTAGCAAAAGCTTTTG
GCATGGCTGACAAATCCATATGAATGATATATGTTTCTGGCTTTGCTAGTGAAATTATAG
AAATAAAGAAGAAAAGCAACCAAATACACAAGGTAATATGACTTGCCATGTTGAGTAATA
TATTGAAGGAGGATATTTTTTTTAACAT
SEQ 41
ATGGAATTTTACCAAAAACTGGCAACATGTTCTCATTTGTCGCTTTTGTGCTTCATCCTC
TTACATTCCATTCAAGTTCAAGGTAGCTACTTTGATCAAGAATATGGTAAGCAGGTACTG
AGCTCAGCAATACAAGATAAAGATTGGTTAGTATCCATAAGAAGGATAATTCATGAATAC
CCAGAACTCAGATTCCAAGAATATAACACCAGTGCTCTCATTCGTACTGAACTTGATAAA
CTTGGCATTTATTATGAATACCCTTTTGCCAAAACTGGTCTTGTTGCTCTAATTGGCAGC
AGTTCTCCTCCTGTTGTTGCTTTACGAGCTGATATGGATGCCCTTCCTCTCCAGGTTCAT
ACACAATTTTTTTACTATCAATCAATTATACCTCAATCGTCAATTAGTTGGGCAGTTATAT
GCAGTTCGGAGCTAGGTTGTTCCCTAAGGGGAATCAACATATAAAGAAGTAAAGACGAA
AAAGCCACGGAGATTCAATATATAGTGTATATACAAAAAAAAAATAAAAAAATTGACCTA
TTTACCCTGTGTAATTTTCGACCCAAAGGGTATCAGTTAACTCCCCTTGGATAAGGTTGC
TCTGCCCCTAGTTATATGAATCTTCTTGTATCTAATTGAGAGGATTCAATATAGTTAAATT
ATTTATGCACCGGTCGTCAACCTAGCACAATCCTCCAACTTTATTTGAATCTGCAACTGG
CTATGCTTTGTGAAGCTTAAATAGGTGTAGTTAGAAAGAAATATTCTTAATAGTGTGCAT
ATTTAGTTATGGAATGTCTCTAACATTATTCTCGAGTGAATATAACCATAGGAGCTTGTT
GAATGGGAGCATAAGAGCAAAGTTACTGGCAAAATGCATGGATGTGGACATGATGCCC
ACACGGCGATGCTTCTTGGCGCTGCTAAGCTGCTGAATGAGCGAAAGGACAAACTTAA
TGTAAGTTTGTTAACCTTACCCACTTCACTAATGCTGATTCATTTGGAATGTAATTTGTGC
TTGTGTGATTCTTTAACAAAAGATTTTTTGCACAATGTTGACCAATGACCAGATTGTCTT
GTTCTCAGAAGTAATAATATTAGGTTTGCGCTATAGTGATTATGCTGATCATTTTATCCG
TTGTGCTTTGACTTCTTATCTAGGTTTGCATGTACACTAGGCCTTTGGAGCTTATTCTAA
AAGGGGGTATTTCTTAAACATAGAGGACTGTAAGAAGATAGATGAAAACATTCTTTAATA
GAGGGGGGTATTAAGTGTACTTTGTCGAGATAATGAAAGAACAGACTCAAAAGGAATAG
ACCAAAAAGGATATCTTTTTGCTTTGTTATCAGATTTAGTTCACTTATTCACATGTCTCCC
TCGGAACAGTCCAAATTTCATAGCAGTGTCGCAAAAAAGGAATAGTTGTGCTGTTTGTT
ATCGATGATGCTTCTTAACTTGGATATGACCATGTTATTCTTTGATTCTTTAAATCTGAAA
CTTGGATCGTCCTTCTGTGGGTGACTAGCAGTGCCTGTGGGTAATCATTTTTGCCTTTT
CCTTAGATGAACATATAAAGTGATTTTGCCCATTGAACATAGTTGTGACCATTCATGATT
CATCAATTGTCTCGATGTGGAGAACCTAGCCCTCTGATCCTCCATGGCTTGCGAGTTCA
CATCCAGATGAAGCAACCAGAGAAACTAATTCAGGCATGACGAGAAATTTTCCGGTCAA
GAGAGAGGATCGATCAGAACCTGTTGAAGGAAATGGTAGATGACGGAGCATTGGCCCA
AATCAATTTCTCTCTGGAACCACGAAAAAGAAGCTGAGAAGACCGATAACTTCTATCTAC
ATTACAATAACAATACATGGCTGCATGTATAGGAAACGAGGAAACCATGAATGTTTTTTT
GAATTCTTTTTTGCTTGACCAATAAAAAGGAATTCAAGACTGAACCACACTTTCTAATTAC
TTGTTAGTCTGTAATTGTCTGACTGATACTATTAGATATTTCTTTTCAACTTTATAAGAATA
CATTTGTCACATGACACTCGTAAAGCACTGTTCGAATTGACTTAATCTGTTTTTGCCCTT
TGTGTGGCATCATTCATTATCTATCCATTCTTGGGGTAGTCTACAATAGAAAGTTGATTT
GTTGCTTGTCTCTATTTTTATTTTTTGAACCCGAAAAGGGAACGGTAAGACTTGTTTTCC
AACCTGCGGAGGAGGGAGGAGCTGGTGCATATCATATGATCAACGAAGGGGCTCTAG
GTGATGCAGAAGCTATATTTGGAATGCATGTTGATTTTAAAAGACCTACAGGGAGCATC
GGTACTAGTCCTGGGCCGATTTTAGCTGCTGTTTCCTTCTTTGAGGCAAAAATAGAAGG
AAAAGGTGGGCATGCTGCAGAACCCCATGCTACTGTGGATCCAATACTTGCTGCATCAT
TTGCAGTTGTGGCATTGCAGCAGCTCATCTCAAGAGAAGTAGATCCCCTTCATAGTCAA
GTATGTAGCCTAATCTCAATTAGAAGTATAAATCTTTGGTTTACACACACACAGAGACAC
ACAGACACATAATTATGTAGGTACATATATTCCCTTCAGGAACATTTCTTGTTTTAGAAA
GCAGTATAGCATTTGAGACCTGAAGCCTCATTGACAGTTAAGCTGACTGAGATTGAAAT
TCTCATTTCTGCCTGAAGGTTCTTTCTGTTACTTATGTCAGAGGTGGATCAGCATCAAAC
GTAATTCCGCCTTATGTTGAATTTGGGGGAACTCTGAGGAGTCTTACAACTGAAGGCTT
GCTTCAACTTCAAAAGAGGGTGAAAGAGGTAGGTTGCTTACATGAACCTTTGACTGTTG
TTGACTATCAACATCTGCACACTAGATTGTCTGCCAGATGTCTTCAACATGTAGTTTTCT
GTTAAAAAATTTAGTGATTTTTTTGAGTGATGTTTAATAGCCTTAAACTGAGCCTTCTTAG
GTACTGAGAGCTACGTAATCAAATTAATAAGATTAAGGGTGAATAATTCTCGAACACGTG
TTCACATGAATATAGAAGTCTCAGCTGAATGAATGATATAACTTGTGGTCTGCTTGCAAT
TTTCCCATGAAAATGCCATGTAACTCTAGCATTCATAACTGATCATCTTTCCCTGCTTTG
CTTCTCTTTCTTTGTCAAAATCAATTTTATGCCTGTCCTCAACATAGAAGCTTATCATTTT
TATTATTGAATCCTCTATTTCTATTTCGCATTGTTGAATTAGATGCTAATCGTCTTCAATG
TCAAGTATTGCGGCAAGATCTTACTAATTAATGTGAACAGAACCTAGATTTCTTGTGGCA
ATTTTGTGCATTTGTAACACATATTTACATGGAGCCTGCAGGTAATTGAAGGACAGGCT
GCTGTGCATAGGTGTAAGGCGTACATTGACATGAAAGAGGAGGATTTCCCAGCATATC
CAGCTTGCATAAATGATGAGCGCTTACATCAACATGTAGGGAGGGTTGGCAAACTCCTG
CTTGGTTCCGAGAACATCAAGGAAACTGAAAAGGTTATGGCAGGTGAGGACTTTGCCTT
CTATCAAGAATTGATCCCTGGAGTTATGTTTCAAATTGGAATCAGAAATGAAAAACTGGG
CTCTACCCACGCTCCACACTCCCCTCACTTCTTTCTCGATGAGGATGTCCTGCCAATTG
GAGCAGCGTTGCACACAGCCATAGCAGAGATGTATCTGAATGATTACCAACATCCCATT
GCGGTT
SEQ 42
TTAGATTTCCTCAACTCGTCTATAAAATAGGACATAGGCGGCCGAGGTTTTGAGCTTGT
CCTGGCTGATGGGATACACATGGCTGTCATCGAAGTCATACCACCGATCAGCACCTTG
CTAGATTATTAGAAGAAAAAAACACAAAGTTAGAATATCTGGATTAAACTGGGAAGACTG
TAAAAGTCTGAATATTTGACCTTCTTGCATATTCTTCATGATGAAGAAATAAAAACAATGA
AGATGCATGACCAAAGTTAAATATATAATAGATGCACATATGTGCAATATGTGATTAATTT
AGATGCAGATGATGCATTGGAACTAAAAAATACATCAAAGGAACACTCACTACGAAATA
GGAAGTTTCTATATTTGCCCTTGGAGGTAGGTTTCAGTGACCGCCCACCACCCACCCTT
CTCCCCAAGTACTGATCTTTAGATGCCAATTTTATCAGGTATAAAAGAATCCTATTTACC
ATAGAATAGAAACAAAACCAAGAAAAAGAAGCAAGGTAATCCAAGTCGCAGCACCACTT
ACATGAACAAACGCAGTGTAGTGACCCCCTCCCATGCTTCCATAATGGTTGCTAATTGC
ATAAAGCATATACCGGTAGGAAGATTTGCCATCTTTGTAGGCCAAATATGAGGATAAAT
CAAGATCATGAGTTGGGAAGTCAACATACGTCTCCAACTTGTTCTTCAGAAACCGGTTG
TACGAGAACCTCTTCAGGTGGATGACCAGAATCTCCGGCAGTCTCCAAAGATCCAACTT
TTTAGTAGCTTGGCGATGCTGCTTGCATGCAGGGCAGTACCTAATGTTGGATATAAGAC
AAAGAAAGTTAGGCAGATCATATTCATACTTCCCAAGCGAATGACACAAAATTAGATGAT
AAGATAGAATACTAGAAAGATTTCATGAAAGAAGTCTTTTCCCCCATGAGTGGCCCCAG
GTAAATAGCTGAATTTTATTATCTCAATTACTGTCTGTAGTTGCTATGAATATACAAGAAA
CAAAGAGCAACACAAAAACTATTTCAGAAGCACAATGTGCAGAAAATCAATAGGTGTTA
CATAAGATCATCAGATGCTGTCATTGTTTCTTTAACATAACTGAACTAGAAGTTGGAAGG
GTACTACTCATGCCATGTTGCAAAGTCAAGGTCCAATTTGGTTGAGGGGAGGCTAATAA
GAATGGTTTATCCACAATAAGCAATCACGACGTGGTGGAAAATATTGACAGAAATGAAA
TAAAATGGTATAATTGGAAAAATACAATTATGTAACTACAAAAGTTGAGGGTTTATAATAT
TAGCATAAACCAAAGAAGAAGCAAAAAGGGAAGCATTACGAGAAATCTCTGAACATTCA
TATGTTATGAATAGTGAATACATAGCTTGCATATGAATGGTGTACAATCACAGAAGTGAG
GGAGTTGCATGCTTACCACATATCTTCTGGCCCTAGAGGCTCTTCCTTCAGAAATGCCT
CAAGACATTTATACAGAGAGACAGATTCTTGTGGTCTTTTGGCAAAAAACCCAGATTTAA
AAACTTCTGGCAGTGAGCTGAAAAGGCCTGTATTGTACTGTTCAAGCATTTTAGGTGAC
CAACTTACAAGTACATTTAACCGTCCAGAGATATCTGTGGACTGTAATGGCTCATTCATT
ACAATCTCGGAGCCTTTAAAGGTTGCCTTATCATCTGATAGGTAAAATTCAAAGTCCATG
TCTAAAGGTTCGGCAGTATCTTCTTCAGCAATGCTTTCTGGAACCCCGTTAACTATTGAG
TTGCCAGGTTCCATGTCTGTGCTGACTTCTGAATCTGTACATACTTCAGTAGCACTTCTA
TCACAGTTAAGAGTAGCACTTCTATCACAGTTAAGATTATCTGCTTGGGCTGTAGTGTG
GACTAAGAATGGTGTAAGTATCTGTAGATAAAGACTACGGATATAAGATCCTGTAAGAA
CTCTACTATGCGCGGCAAGCGGAATTCCAAATGTCTTCATATTTGAGGTCAGCTTTCCG
TATATGTAATGCCT
SEQ 43
CTAGCTAACCTGGTTTTCACTAGCCTTGTAATACAGCATTTTCTGCACATAAAACATCAT
GTATCCTTGCGCAGCTCTCACAATACTCTCGCTCACCTGAGTTATCCAGGCATCGTCAC
ATTTGTACCATTGATTGCTTAACCTCAGATATGTTACGTAATGACCAGCATCAAGTTTAC
CAGTATGGGTGATGACAGCAAACAACTCAAATTCCGAGGACGATTCACAGGACGCATC
TTGCTCGTCCCCGTCAAAGGAGAAGATTCTATTTCCAAATCGACTCCTCAAGATAGATG
AAGAGAGATAAGGCGACATGTCCAAGGAAAAAGGAAACTGTAGGTAGTGATCAACCTT
CCTTGACATTTTCTTAATCACAGAATGCTCAAACCTTTTGATATGGAAGCAAGAAACTAA
AGGCAGTTTTCTTATGGACATCTGTTTAAGAGATTCCTGTCTCACTTGACAATGTTGGCA
GAAGAACTTCTGATCAGAACCCAATTTCTCAGGTCTTGTGAAATGATCTAAGCATCCCAT
CAACGTAGAAATTCGACCGTTTTGGCTAAACTTTCCAGATTCTGCCTCCTTTTTGTGAGT
ATTATGAGACTTCTTTGATGTCATCTTGGCGGAACTCCCCTGGCTCAGTTCCAAGTCCA
AGGAGATGTCTATACATGGATCATATGTAGTAGATGTGAAGCCACAAGCTGTACACATG
ACATCAGACCGCAAGATCCCAGAAAATACTCTATGAGCAATGCAACAGTCTCCGCTGCC
TGCACAAAAGAAAAAAATTAGTCAAGATTAGATTACAGATGACAAAGTGCAATGTGTACT
GCTCCTAATAAGTTACATACAGTAACATAGTATGCACAGCTTAAAGTGCTCCTCAACACA
TTTCATAAAAAGCAAAAGTCCAGTTATGCCATAAAGACAAAACAACAATGGTCCTATGAG
GAATGGCAACCAACAGAATGAGTAGCTACAAGCAATGGAGTACAATTGACTTTTGTAAA
AAAAACCTCAGTAAGAAAAGTATCCGCAAGACACTTTATTGCCTCAAAAGGTCTATTATT
CCATTTACACAAAACCATTCAAAATCATAGATCCTTGTATTCATTATTTAACTGCAACAAC
CAGGTGTTAACTAAATTTTGCAAACTCAGGGTAGCCCTGTACTAATCCTCTAATTAAGAG
GAATAAGAATAGGTGTTAACTCTCTCAATTAAAGTACGTGTGACTTAAGCGGGAATCAAA
TCAGCATATTTACAGTGGCAGAAATGAATTGTTTGCTAAGGAGTTACCCTACTAATGATT
GCTCTGCTAAAGAAATTTGAGGTACCGAGGAGGTAGACTCACAAGAATAATAAGACACA
CAGGGGCTTATAGAAAAAAGTAAGCAAAAAGTTGTCAACCTAGTAAAGACTTCAAGCTT
TCTTGAAGGTTGCGATCAGCTCACTCGAGAGTAATACCCTAAAACATGGTAAAAGTGCG
AGTTAATATGAGAACTATTGTACCAGAAGAAAACTCCTATGCTGAAATCAATAAGACTAA
CCAAACTGAGACTTACTGGGAAGAGTAGGAGGAAATCTTTTTCCTTTTGAGAAACATCT
CTAAGCCTGACAGCGAATATCAACTGGTAAATCGCCTATGGCAAAGAAAAGACCTAAAC
CATAACCTGCATTCAAATATTTCTATCTTTTCTCAGTGACAACGGAAGTTGGGATTGCCA
TGAGAGATGAGATGCTAGAACAACAAAGATAAGCCTATCAAGTAGGCCCTGAGTGCATT
TTAGTTAGACGTTACTTGACATCACAAAGATGCTTGAACACACTATTCTACTTCCTACAG
AAATTACTTTTTCCACCCCCTCTCCACCAACAAAAAAGTTCAAAAAATTCACCTACTGAA
GTACTTGACCTTTGCAAGTAACTATACAAATTTCAGTAATCCACTTATTTGAGGTTTTTAC
TGCACTAGCCTCCCTTGACTATGTTAGATTTATGTTGCTTTATTAAAATATGCATAAACAA
TGTTGCCAATAATTTTCACAGCACAATAAAAAATATTTAATCTCAATGATTGGTCTAATTC
GGAAGAAAAGGAAAAAAGAAAGAACAAGAAACTAACTATGCAAGGTGATGGGGGGAGA
AAAGATGGGTAACTAATAGATACTCTAACGCAACAAACAAATATACCTGGACTCAACGC
CTTTCCTTTATCGTTCTGCATCCTTTCATGAATCCCATCAAGCACGGAAATGAAAAACTC
ATGAGCATCCTGCTGTTCATAACTTGCAAGATTTGATGCATGCTTCCACCAACTGTTCCA
AAGAAAGGTTTACATCCATCAGAATCAGATACAGAACAGACTGCAATTAGATACTCAAAA
AAAAAAAAAAAAAAAANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNCAATATAACAAATTACAAATGTATTATATCACAACCAAGTTACACACC
ACTAGCTAATTTTACTTCAATGAAGGGATAAGGGTGGTACATATAACCCCAACAATAAAA
AATTAATCTTGTCCCTACATCATTTGCAAATAAGGGCACAACCAAGCTAAATAAGGAATC
TTTTACCAATATATCATTATCCTTAAAAAAGAATTATAAGTATACAACAACTATGCCTCAA
TTCCAAGCAAATCTGGATCACCTATACGAACTTCTCACCCAAAAAAAAAAAAGAACGAGT
GCACATTATAAGAAAGCCAAACATGATCTCAACAATCCAAGAAAACATGATCAAATGCA
GACCTGTAGAGGAACTTTGCAGGACTAATAGGGGTCCAATCGCCGGAGAAAACAGCAG
AAAACATTGCATCCAAATCACAAGCTAAACACAGCATTGTTGAGTTCTTATTCCCATTAT
CACTACTACTCCTTGTTATAACACTGTTATTCTTTCGCTGGCAAAAATATCTGTTATGCTT
GTCACTCAGAAAGTAATTCCTCAATGGTGGTGTATGAAGCAATGCTTGAAGCACTGAAT
TCATAAAACACGTGTTTCCAAGATTGTTAAGGCCCCTCAAACCCCATTGTACTTCTGGG
GTTGTTGAGTCATTGCCGAGCTGACTCGGCAACGGACTCGAATTCCCAACGATCAAGA
CCTGCTCTTTCACATCAGGCGTCCACGGCTTATACTCCACGCGCCTCCTCTTGCGCGT
GTTCTCCGGCTGCGGCGGAGGGTCTTGTATTGATCCGATCACGGTAGCCTCCGTCTGC
GCTAGTACTACGGCGGCGTCGAAGTCTCGATCGTATACCTGGTCCCTACACCCGCAGC
AGAACAGCTCGGCCCTGTCGATGTCCACGGCGATGCTGTGCAGCGAGGGATCGGAAG
CGTTCCCCACCGGATGCGACGGCGCGTGCACACGGCAGAATACCGCGGCGCACGTGA
CGCAGGCGTACAACCGCGGCGGCGCCTGTCCGCACGCACCACATCTCACCAGCTCAT
TCGGCGGCTCACGGCGGATGGACGCACGCCCCAATGGCCGGACCTTCACGCAGCCCC
GGAAGTTGAAAAATGGGTTCGACCCAACTCTGGATCGGAGCTCCGACAAATGCTCGCA
TGTACCACTATCGGTATAAACAGGCGACCCGTTTTTATTTTCGGATCCGGGTTGGATCT
TACCGATTGACAAATGGATCTTAAGACTGTTCCTAGACAT
SEQ 44
TCAAGAACTGCTCTTCCTTCCTCCATTTATGAAGGCCTCAATTGGTTCGGCATGGATGT
GCTTCATTGCCTTCACTCCCAGCGGGTTGTCCTTGCTCTGTATCAACAAATGCAATCAC
GAGCATTGTTTAGAAAGAATGGCGAGCAACATGCTACAACCAGTTAAATTACACTGATA
ATATCAAAATTCTTTGGACGATCAATGTATAGAAATCAAACTTCTCTTTAGAAATAATAAG
TGCGAGTCACTTACTATTGAGCAGGTGCCTGCACGAATATTGCAGACAGGATAGTCGT
GTGGGCAACAACTGTAGTGATCTTCACAGCAAGTGGCTCCTTCAAGAGGGCAACATCC
CCAAGAAAAGCAGGAGTTATAGTATTCAAAGACACAACAGCAAGTGGTACCCTCGGGG
CATTGAGCATAATCATCACACTGAGTAGGTGGCTTGATTGGAGATGGAGGAGAAGGTC
CAGGTTTAGGGGGATTTACGCCTGTCTTGACTGGGTAAGAAGGCTCTATGGCTAATCCA
CACAAACCTTTAGAACTGGCAACGTTACGTTGAACCCGAAGGTAACCTTTCTCTCCCCA
GGAAGCACCCCAAGAGTTCCTGATAATCCAATAATCCATACCATTCTCACTACCATATCC
AACAGCAACCACGCCATGGTCCACTGCAGTTCCACATTTTCCAGTAAAGATACCCTGCA
AAATATATGTATTAGAACCATATAGCTACTAGCCAATAAGAACTTCATTATAATAAGATTT
AAATATAGAATTGTAAGTAGTGATAATTGGTAAATCATAGTTTTTGAATTATAACAGGAAT
CTATTAACTCATGCAGCCAGCGTTATCTTTTGCTACATTCTTTGGTCGAAACAAAACTAT
TATATCTGATCGAAAGGTAACGTATTACTGGAACGACGATGTCTTTCTTTGAATTCTGTG
ACTAACTAAAAGACTTCAGATTTTGCATGTTCACTTGTATTTATTACTTTCAGAAGAAAAA
GATAGGTCAATGCGAATAAAAACATTACCGATACATAGTGTTGAAAGTCGTGGCCACCA
GCTTCAATGGCAATGCTCACAGGTTGATTTGCAACAGCCTTTTGCAGTGCCTTTTCATTA
TTTGCAGGAACATCTTCATACCCATCTATGGTAACAACCTTGGCATTTTTCTGCATGACA
CAAATGTTTCAATAAATAAATATTCAACTCACTACATAAATCAATATATAAAGTGAATTTA
GTCATTTCAAGAATGCAAAATTTACCCTTGACTGGTCACATCTGCCATCACGGCCTGTG
TAAGGGTAGTCTTCCTCAGTGTCAATTCCTCCATTTTTGATGATGAAATCAAAGGCATAG
TCCATAAGACCGCCATTGCAGCCGTCATTGTAGGAAGTATCACAATCCACCAGTTCTTG
CTCCGATAGTGAAATCACATCTCCGGTGACTATCGAGTTCACTGCTTCAACGGAAGCAA
TTGCAGAGAATGCCCAACAGCTCCCTGGTTCATTAACCGATAGATTAATACACAAATATT
GAGCCATACAACATAAATAATGAATTGAGAAATTTGAACTCAGTGATTTATAAAAGAAAC
TGATATGCAGGCATTAATAAAGAATACACGTCAGTGTCACATGGTATTTAGAAGGCCCG
TCGGGTCACTTCGTGTAACATTTGAGCACATTAAGAGGTCGTTTGGTAGAATGTGTTAG
AGAAAATAATGCATGCATTAGCTTTGTGTATTAGTAATGCTTTGTTTGATACACTTTTTCA
ACCTATGTATAACGGATACAAGCATTAGTTATACAGTCTATTTGGTATTATCCTATGTATA
GCTAATGCATAGAAAACCATGACATTAGCTATATCGAGGCTATTAATACTTGCATTAGCA
TGGTTAAAGACAAAATTATCCTTAAAGTCCCTTAAGTAAAGAATATGGAGGGCATTTTTG
TAAACAATTAAATATCTAAAAAATTATGCAATGCATTTTAATTTTTAATACACCACACCAAA
CAATGCATAAAAAATAATCTCTGTATAACTAATGCTTGCATTACAAACCCCTGCATTACTA
ATGCACCTTATTTAGCATTATTCTTATACGCCCTACCAAACGACCCCTAAATGTTGTAGC
AACTAGCAAATATTCTAGTGCCAAAACTCAGCATCAATTGGTCAGCATTTTCTATCATTA
TCAAGTTGTGCTGTTAAGCTTTTCACCCTATAATAATTGCTCATGCGTATAGAAAGATGG
GGTCGGGGAATAATGACTAGGAAAAAAGAAAAGATGGGGTCCTGGAATAATATGATTAA
AAGGGAAATTCGGTGCCCTAAGCTCAGGGTATGTGCGAGGTGCGAAGAAGAAGTGGAT
CATAAGGATTTATTGTACGCGACCTTACCCTGCATCTTTCTATACACCAAAAAATGTAAA
GAATTGTTACATATTACTAGGTAAAATTTGATCTATAACAAATAACTCTTCATATATATAAT
ACCGGCATAGGACTATTTAAACAATAAAATAACAAACTGTTACACCCAATTAAATTACAC
TAATACTATAAAAAATTATACTATCAAAGTATAGAATTTAAACTCGAAGGAAATTGTAAAA
ATAGCACGGTATAGCCAGTTTTCGGATTGGTCATTCAAAAATAGTCAGCGTTTATCAAGT
CAATGAAAAATAGCCACTATTTTGCTGCAACAAAGACCGATCCAACATAATATACTGGAG
TTCGGTGCAACTGTGTATGAACTACAACATATTATGCTGGACCGATATACTTTGTTAGCT
CCAGTATATTACTGTAGCACCGGTGCTCCAAACTCCAGTATATTATGTTGGACCGGTAT
ACTTGATGGAACTCCAGTATATTATGCTGGAGTTCTAGTGCGCTTATGCAGATAGAGTT
CCAGCATACTTATCCTGGAACTCCAGTATAATATGTTGGAGTTCAAGTATACTTATGCTG
GAACTCCAGCATAATATACTGGCGTATTTTCTGAGTTTTAAACAGTGTTTTCGCTCAAAT
TTATCTTTACATAAAAAGTGGCTAAATTTCGATTACTTTTAAAATTTGGCTATTTTTGAAC
GACCAGCTATTTTTTATTTTCACACAAGGAAATCATAAAGGTATCCTACGTTTATGCGTA
GACACACAAATTGGCTTTCTACTCATAGAGGACTAGAACTGAACGCCACCATACTATTG
ACATGAGAGTTTATTTAAGTGCAATGAAAGTGTAAAGAACAGAAAAAATATTTAAATCTG
ATATAATAACATAGAAAATAATTCTTGATATCGATTCTAATCTAATAAAATAGGAAATAAT
AAATAACTAGTTATATTTAGAAACAATTTAACTTTAACGTTTTCATTTTATCTATTTTATCA
TTAGAGAGAAACTTTTATAATCACACGAATGTTACCCACAAATCTTTTGCCCTTGACCTT
TTAGGACCACATGATCAAAAGTTTTCTTTTCTTCTTTTTTTTTAAAAACTTTATATCAAGTC
AAATTATATCATTTAAATTGAAACGGGTAGAGTATTTATATATTTTAGCACAATAAGGCAC
GTATGATTTCCTTTGTTTGTCAAGTTCGTAGAAATACTTTCTACATATAAATTAATTATGG
TAGTGGAATACCAATAGCCTATCTCTATATGCTTTAATAACAAATTAAATCAGGAAAATAT
CATCTAAAACGCCGTCTAATTAATTATAGATCCATAACCCAAAAACCGGAGATAAACGAA
AAAGACAAAGCCATTGCTAATTAACTTACCACAGCTTCCTTGATCCTTAACTCCAACAAG
AACACCTTTCTCTCTCCAATCAACTGAGTCCGGCAAGCTATCCCCAACTTTAGGAAGAT
ACCGATCGCTTTTGTTTTTCAACAACCTGCGACGATCACTGGTCTTAGTACCTAAGTACA
TGGACCTGTACTCCTCGTTGGTCAGATCAGCAAATTTGGTTAAACCAAGCTTGTAACTC
TTGTTTGGAACGGAGTTTTGTTCATCGATGTATCTTAAGTTATCTTTAAAGATCTGAAAC
CGCTTGTCTTTTTCGTCTAAGGCGTTGTACGATTTTCCATGTTCGAGTAGCCATGACTC
GTACAAGGACATGACTTCATCGTCCGTTCGAAAGTGTTGGTTTTCGTCGTAGGTTAAGA
TGGACATGTCGGAAGCGGAAGATAAGGTGGAGAAGAAGAAGAAGAAGAGGAGGAGAA
GTAGGGATATGGATATGGTGAGAGTGGAGCTATGAGTTGCCAT
SEQ 45
TTAGAGTTCATCCTTAGGTGCTGTTGCAGGAGACCCTGTGGGGTTGCTTGATGAAGTCT
TGATGGGGTAGGATGGTTGCATTGCTATACCACACAATCCCTCTTCAGCATCAATCTCG
CGTTGCATCCTAATGTATCCTTTTTCTCCCCATTCAGGTCCCCACGAGTTCCTCACAATC
CAGTATTTGGTTCCATCAAGGGTTGTGCCATAGCCCACAATTGCCACACCATGGTCCAA
CTCAGTACCACAGTCTCCGGTGAATACACCCTGCCATATTTACAGTTCGTAAATGTTTAT
ACCTAGTAAAAAACTTTTTAACTTGAGATAAATGGTCTATCATCATTTACCTCAGAGTAGA
ACTGGAAGTCAGAACCTGAAGCTTGTATAGCTACAGAAACAGGCTGGTTGGCTACTGCT
TTAAGTAGGGAATCCTCATCATTAGGAGGAACATCCTCATATCCGTCAATTGATACCACA
GGAGAATTCCTCTGCCAATTCCATAAAATTCATGCACGTGGATTAGAAACAAGACTGGT
TCGATCTGACAGACTGACACCCTACAGATGTAACAGAATCTTACCTTTTGAATATCACAC
TCGCCACCTTCAGCCATGTATGGATAGTTCTCTTCAGTATTGATGCCTCCCTTCTTCTTG
ATGAATTCAAATGCCATGTCCATCAACCCTCCATTGCATCCTTGGTTTTGACTAGTGTCA
CAGTCAACAAGTTCTTGTTCTGATAAAGATACTAACTCATTTGTTTTGATTTGGTTTATCC
CCTCTACTGCAACGACAGTTGAAAATGCCCAGCAACTTCCTATAACAGGCAAAAGGTCA
GTTTCCATCAGCTATAATATTTTGAAAGAACATATCATATGGTTTACCCTTATGTTATTAT
GCTAGAGGTGTAAAGCGGCATAGAATTAATGACATGCTACTCTTTTCTTACCACATTTGC
CTTGGTCTTTGACAGGAGTAACAGCACCCTTCTTCCTCCAGTCAACAGAGGGAGGGAC
ATCTTCCACATTGGCGTACATGAAAGTTCCATTTGCTCGTGAAGCTCCAAGAAAAGAAC
GATGATGCTTAATCTTGGAACCAGCATAATGGTGTCTGAATTCATGGTTAGTCATGTCTG
CAAACTTGTTCAATTTCAACTTATAAGGCTTATCCTTCTTGTTGAAGTTGTGAACATAGTG
TACATTAGCCTTGAACACATTGAACCTCTTGTCTTTCTCATCAAGGCTCCTCGATACAGT
GTGATGGCTTCTCCATCTCTCATACAACTCCCACAATTTTTCCTCAGTTTCCAACTCCTT
CTCGTGGAAATCGAAACTCTCCCCAAGCCTAAGTACCAAAGCCAAAGAGAAAAGAACCA
GAAATAACTTCTTCAT
SEQ 46
AAAACCAACCTGTGAGACATTAACATCCAACTCTTGGGCAATGAAATGGGCAAGTTCTG
GAATGCGAGGCTTCAACTCTGAGCAGTTGACACTTAGTGACAAGTAAAATGTAATGAGC
CCAACTTTAATTTCCACTGCAAGTCGACATAAAAGATGAATGTGATTACAACCATAAGTC
TTTGTAATGGAATTATCTAATTTCAATAGCCATCATATCTGCACCGAAGCCTAGCTCAAG
TTTGTGAGAATAGATGTAGTGAAGTAGAAAAAGGGGACTAATACTTGCAAAACTAAATTG
AAATCTTGAAAAGTTTTACAGCAGATAAAAGTCAAAGCATTTGAGATTATGCAAACCATT
GAAGAGGTACATCAAATTGAAATAATACAAAACAGGGCTATGTTTCAACAATGCAAACA
GGAAATATTAGGCAGGAAAAATTTTGCGATTCTGTCATTACTTTAAGGTCTTGCCACAAA
TTTCTCATGCTTGTCGTTGTCCAGTCACAAAATTCACTAGAAATTTGACAATTGATTACTA
TAACTTAGTGGATGGATTTTCAGATAGTCGGTATATGGTCAATGCATGTTCACTTGGTAT
CAGTTGTCGTAGTCCTTAGAAATAACTTTTTGGTCCCTTGATTATACCATATTTGTACTTT
AGATCCCTCAACTATTCAGCTTTACACATTAAGCCTACAATTTAACGAACTTTACAGATG
TAGTCCAATAATAAACAAAACTAACTAACCCACGACATATTCATATTTCAAGTCTCCTTTT
TTAATAATACAAATTTCAAAAGGAGCATTAATGTTGTAAAAGTACCGTTGCCTACTAAAAT
ATCCCAAAAAGATGAACACGCTGCTTTGGAAATGGAACGCACCAATCACACGAGTTGC
GGATTAACAAAATCTAAAAATTCTAGTTCTAATTAAGATCTGACTAAATCTGCAAACTCGA
CAAATTAAAAGGCAGATGTCAAAGTCGGAGAGTTGGACTAAAAGTGGAAACAGGGGGA
TAACAGGGGACCAAAGATTTTCTCAGTATTCCTTGAAGTTATTATAATAAATTTCCAGTTT
AAGTAAATCTTTCTCAAACTACAAGAAGGCTGAGAATGCTGTGACATCAGTACAATTTTA
TGCAGGCAGTTGCTTCTTAGAAATTTTAAAATACCAGGGAAACAAGATGATTACAAGACT
AATTTCAGAGAAAGGTCAGATGTCAACTTGAATAGGATTATAAACAGGGATCTTTACACA
AATAGCCGGCTATATTCATGTTTACTTTTTCTAGCCATATACACAGATTATACATTGATGA
TACACAATTATGCACATATAATACATAAATTATGCATTCACACAAATACCAGCATTCTGGA
CATAAGAGACAGAATGTTGATTGCCCAAAAATGATCTAATCGAAGGCAACACATCAAAA
TCAGCATGATGCAATTCTAATTTTGATTCTCATTTTCATAAAAGAAACCACAAACCATTAT
AGTTCAAAATTGAAGGGAAAATTGAAAAGGGAATTGTATTAATCTATTAGAAAACAGAGC
TAGTAGAGATGCGAAAAATGAAACCAGTTCAGGTTATGACAGCATTCTAATGGGCAGAG
TAACCTTACAGTAAAAATACTATGTGAAGAAAAGCTGTCCCTTCATACCAGGTGTGTTGT
ACCCAGGGGGTCCACTAGGAGCTGATGAAGGAGACAGGTGCGCACTGGAGTTTGTGT
TATCCAAGCTTGAGACCGATGGTGAAGGTGGAGATGGAGGGGATAAATTGAGTCTGTC
CCATAACTCAGAACAATTAGTTTTCCAAAAACCAATTCTTTTATTTTCACGATCGTAAGTA
ACAAGAGTGTTGCGAACAACGATTCCTGTTGCAATTAAAAGCACGCATGTTACAGCACC
GAAGACCTAGGACGATGAACTTTGTCAAGTAGTTAGAAATACCTCCAAGAAGACTAGCT
GGATTCTTTCCATTCGGGAAAATTCCTAGGCAATAAGCACCACGTACTTTGAAGTGCTTT
ATAGCATACAAGTTACAATGAAGAAGGGGGAAATTAATCAATACTAAATCTTATAAAGAA
AAAGAAGCCAGTAGATGTGACGCCAAAATAGTGTCGGCAGATAGTTAAGAAAGAACTAA
AACAGAATAAAATGCCTACCTGAAACAAGTAATTTTCAGGAGAGAGAGTTAGTTTCTTTC
CATCGCTGAATACCATATCGACACGCGGAAAGTTCTTTGAGAGTTCTGATATGTTGCTG
GAATATATAAGACATTAAATAATAAGGGTAACACCAACAATAGGAAAAAAAAACAAATTT
AGACAAGAAAATCATAGACGTCTAATTAATTTTGGAACTCTCCTTATAAATACCTTCCAG
CACCAGAAAAGCAGATATCTTTAAAACTAGGATCTGGCCCTTCAATCTGTTTTAAAGAAT
GAAGCTCTTTCACTACCTGAAAAGAAAAGTTGAAGAGGTTAATCCCAAAAAGGAAAATA
CTTAAGTTTATCTAAAGATAATCGCCACAAGTTACATACTGTATAAATTCACAACCAGTG
ACTCCACTTTTCCTTCGCTAAATGTAACATAGCAAAAGCAATGAAAGACAACATCAAAGA
TATGCAGAAACTTGCAAGGAACTCTGTATATGTAGATAATTACTACCTGCTAACATCATT
GGAAGAAAATGTTGTGTGGTGTATTACCAAATTCAGAAGTTTAAAACAGGAAAAATCAAC
TATTATAAGTGGTTAAGCAATCTAATTAGGTCCTTTAATAGCAGAGAAACTAACAACAAA
AGAAGGGATAGAGGCCATAATATAGCCTTCCTTTCTTTTGATAAAGTACATAGCCTCCAA
CTCATCACAGGAATAAGTTAGGTGGCAGAGTTATAGGATCACAAGATACCTGGTTCTTA
CTTTTCTTCAAAAACATGTTCTAGATATCAAAGTGTTAGCTCTCAAACTATTATTGACCAA
GATGTTTTAGTTTCAGTCCCACTCATGTACAGGAATCTCCAACTATATATTTGATCCACG
TCACAATAGGACATGATGGTTTGTTTCACCAATCAGTGGTGTAAATAGAACTTTGAATGG
CAAGAAAAATTTAACAGATTGAGTTACATATAGAAAACACATTATTCAAATGAATATCTTT
TCACATCTAATTTCTTCTAGGACATTCTGCTCATCAATAGCGTGCTATTCGCCACTAGTC
ACTTTCTCAGACAGAACGAATCAAACAAACAGACAGAAACATTGGCAGATATGTAGATA
AACTATATCTTACAGCATTCTTGAAAGCTGCAAATGCTGCTTCTGGAAGGTACGCATAG
GTGGTACCACTATCAAGTATAGTCCCATGTTTTCCACCAAAAACCCGTGGATTTAGGTTT
AGCGGCTTCCCAGCGACATGTATCTCCTTCAGGTCAATATTGTAGTACGGGCTGTTGCC
ATCCGAGACTAAAGTAAATGCGTAAGATTTATCAAGTGTATTTGCTAGGAGGAGACCTA
ACATCACTGTGTCTCTAAACAATTGTTTCCAATACATACCTGTGACCAAAATCTGATTTG
GTAAAGGCCATGTCAGCAGGGGGTTTTACTCCACCAAGAACCATTGCCCCGCCACCAA
AATCCATCCCTCCATAGCACAAGGAGAAAGAATCACTAATTACATGTTTTTCAACAAGTT
GATCAACTATACTAAGATCACCTCGGCCCAAACCCATTATACCATCAGCACGTTGGCTG
TAAAGATCACCAGTTTCCGCAATTTCACATCCAAAAACAGCTCGTTGTGGTGCAAGCTC
ACTTAGATTTCCAAAAGATATGATGTCCTCTCCAAGCAACCCATAACTTGCACTCATCTC
AGCGTACCGTCTCTCATAAATACATTGCTGCCTCTTATGGTCGCAGGGACAAGCCTTAT
TGCATTTCACAGATTGATAAGTGCTTGACATTTCCGGCTGAAACTTAGGATCCTAAATAA
GACAACATACAGAGTACCACCGATCAAAAGACAAAAATCAAATGCCAAGCAATTCTAAG
TGCTATATTAAGCATTTCTCACAAAATTAATAGGCTGACTCAAAGTAAAATTACTAGTCAT
GAAGTTTCTTAAGTGCTGATATTTTCTCCATTAGGAGATCTTTTATGATTACAAGGCAAAT
TGAGCAAGAATCACATTTAAAACATCATGAAACTATACAATGGATTTGTACAGCTTATCA
ACAAAGAGAGGCTTAGAACTATTTGTATCCTAGTTAGATTGGTTTGTTCTTGTTTACCCT
CTTTCCCTTAATACTTACAAATAACTGCATCCTACTAAGCGATTTCCTCACAAACAAAAA
GTATATTAAGTAATGTTTATGAAATAGCACCTATACATAAACAGTTTCAAATTTTAATTTC
CATATCAAGCTATCAAAACACACTAACTGCAAAATTAAGATAAATATGTAACCTTTAATGT
TTATCCAGAAAAAGAAAAGAGGGAAAAAACCTGATGGTTGCCACACTTTTTACACTCAG
AGCAAGGGACATAGGTAACTGTACTCCCTGTATCAACAATAAGAGCGAACTTCTGCGGT
GGTGTTCCAATCCAAATATGAGTTGTATAGTATCTGCATCACAAGTTACTTGGAATCCAT
TCAACAAAATAAAAATTACTAATAAAGAAATTAGTCTGAACTAGAGGTATATAGAAATATA
TTCCACAGTGAAAACGCAAAATACGCATGTGAATCAGCAGCCAAAAGAGTTAGTAACAG
TGAATTTAAATTTTCTGAGCAAAAGCTACGAATTTGAACCCGTTGAGGAGGAGATCATCA
TGGAGAGACATGCGAGCGCTGGCAGGACTTTTCTGGAGGTGGCGACGGGAGATTTCC
GCACGGCGTGAAGTGTCTTTCGGAGGAAAGAGCGGCAGCAGCATGGTTGTGTGACGG
CTGCCGTCGGCCGGCGAAGGGAGGAAAACGGAGCTGCCGTTAGTAACATCAGATAAT
CGGAAGCCGGAAACGACACCGTAATGGATCAACAGAGAGATGATCGCGAGAATAACGG
TGAACTGTGGCCGTGCCAT
SEQ 46
TCAATATTGTAGTACGGGCTGTTGCCATCCGAGACTAAAGTAAATGCGTAAGATTTATCA
AGTGTATTTGCTAGGAGGAGACCTAACATCACTGTGTCTCTAAACAATTGTTTCCAATAC
ATACCTGTGACCAAAATCTGATTTGGTAAAGGCCATGTCAGCAGGGGGTTTTACTCCAC
CAAGAACCATTGCCCCGCCACCAAAATCCATCCCTCCATAGCACAAGGAGAAAGAATCA
CTAATTACATGTTTTTCAACAAGTTGATCAACTATACTAAGATCACCTCGGCCCAAACCC
ATTATACCATCAGCACGTTGGCTGTAAAGATCACCAGTTTCCGCAATTTCACATCCAAAA
ACAGCTCGTTGTGGTGCAAGCTCACTTAGATTTCCAAAAGATATGATGTCCTCTCCAAG
CAACCCATAACTTGCACTCATCTCAGCGTACCGTCTCTCATAAATACATTGCTGCCTCTT
ATGGTCGCAGGGACAAGCCTTATTGCATTTCACAGATTGATAAGTGCTTGACATTTCCG
GCTGAAACTTAGGATCCTAAATAAGACAACATACAGAGTACCACCGATCAAAAGACAAA
AATCAAATGCCAAGCAATTCTAAGTGCTATATTAAGCATTTCTCACAAAATTAATAGGCT
GACTCAAAGTAAAATTACTAGTCATGAAGTTTCTTAAGTGCTGATATTTTCTCCATTAGG
AGATCTTTTATGATTACAAGGCAAATTGAGCAAGAATCACATTTAAAACATCATGAAACT
ATACAATGGATTTGTACAGCTTATCAACAAAGAGAGGCTTAGAACTATTTGTATCCTAGT
TAGATTGGTTTGTTCTTGTTTACCCTCTTTCCCTTAATACTTACAAATAACTGCATCCTAC
TAAGCGATTTCCTCACAAACAAAAAGTATATTAAGTAATGTTTATGAAATAGCACCTATAC
ATAAACAGTTTCAAATTTTAATTTCCATATCAAGCTATCAAAACACACTAACTGCAAAATT
AAGATAAATATGTAACCTTTAATGTTTATCCAGAAAAAGAAAAGAGGGAAAAAACCTGAT
GGTTGCCACACTTTTTACACTCAGAGCAAGGGACATAGGTAACTGTACTCCCTGTATCA
ACAATAAGAGCGAACTTCTGCGGTGGTGTTCCAATCCAAATATGAGTTGTATAGTATCT
GCATCACAAGTTACTTGGAATCCATTCAACAAAATAAAAATTACTAATAAAGAAATTAGTC
TGAACTAGAGGTATATAGAAATATATTCCACAGTGAAAACGCAAAATACGCATGTGAATC
AGCAGCCAAAAGAGTTAGTAACAGTGAATTTAAATTTTCTGAGCAAAAGCTACGAATTTG
AACCCGTTGAGGAGGAGATCATCATGGAGAGACATGCGAGCGCTGGCAGGACTTTTCT
GGAGGTGGCGACGGGAGATTTCCGCACGGCGTGAAGTGTCTTTCGGAGGAAAGAGCG
GCAGCAGCATGGTTGTGTGACGGCTGCCGTCGGCCGGCGAAGGGAGGAAAACGGAG
CTGCCGTTAGTAACATCAGATAATCGGAAGCCGGAAACGACACCGTAATGGATCAACA
GAGAGATGATCGCGAGAATAACGGTGAACTGTGGCCGTGCCAT
SEQ 47
ATGGGAGCAAAATCTTTTCTTGTCGCCTTTTTCCTTTCATTGCTGTTATTTCCTTTGGCCT
TCTGTACATCAAATGATGGCTTGGTTAGAATTGGTTTAAAAAAGATAAAATTCGATCAAA
ACAACCGACTTGCTGCACGCGTCGAGTCCAAGGAGGGGGAGGCTTTGAGGGCCTCTT
TTAGGAAGTATAATAATCTCCGTGGTAATCTTGGGGCCTCTGAGGATACAGACATTGTA
GCACTGAAGAATTATATGGATGCTCAGTACTTTGGGGAGATTGGTATAGGCAGTCCCCC
TCAGAAGTTCACTGTCATCTTTGATACTGGTAGCTCTAATTTGTGGGTGCCTTCATCAAA
GTGCTACTTCTCAGTAAGCTTTCTATTACATTTTTACTGTCATAAAACATAACAGAGAAAG
CTAATGTTGGCGTATGCATAATTGACGAGCATCCATATTTATGCGTCTCTGTATTTATGC
AGGTTCCATGCCTTTTCCATTCTAAGTACAAGTCAAGCCAATCAAGCACTTATAAGAAAA
ATGGTTTGTGTCTTGACCTTTGTCTATAGCTGAAATTGCTGCATGAAAACATGCTTTTCT
CTTAAACTTGTTATTACGCTCAATGCTTGCTTGTAAGAGAAAGTGTTCAATTATTGCGTTT
TGAGATCAAAACTGTTAACCCTGCTCCCAACTTAGGAGATTTAAAAAAAAAAAAGAAAAT
AAAGAAGACCCTTACCATTCTTATTGTTGTCATCCAATTATGTGCCTTGCACCAAAGATT
TCTGTTGAAAAATATAACATGCGAGATTATGTTGTTGGCTTTCCCTCCCAAAAGATGTGC
TAATGTTATATCTCTGATTTTTTTCTTTCAATTATTGGCAATAAAAGCTTGTGCCTTTTGAA
CCGTTTTGTCTATCGAGGAACCTGTTATGGTGGAGTTCCTTTATTGAGTTTTGGTATCCA
TCATAATTTACTTTCCGGGAAAATTGGAGTCTGCTGTGTGATTGACATGACATGATTTTT
GATTATTCTTCTCTGTCTGCTTTCTAAGTTTCTACATTCTCGGTAGAGGTAAGATATGCG
TACTATCTACCCTCCCCGGACCCCACTTATGGGACTAGGTTTTTTTTGTTGTTGTTGTCG
TCATCTACTTTCTAAGTTGGTCAACGTGTTCACTTGGTTGTTGACATAAGAACCTGTTCA
TTCAAACTTTTTTCCTGTTTAATATGCCATACAGGGAAGTCTGCTGCCATACGTTATGGT
ACTGGAGCAATATCTGGATTTTTCAGTCAAGATAGCGTTAAAGTTGGTGATCTGGTTGT
GAAAAATCAGGTGAATGTGGCTTCCCACTTTGTGTGTGTGTGTGTGTGTGTGTTTTAAA
ATGTTTCTCGAGCATATAGTCTCTCATCTTGTTAATGACATCAGGAGTTCATCGAGGCAA
CCAGAGAACCCAGTGTAACTTTTTTGGTAGCCAAGTTTGATGGTATATTGGGTCTTGGT
TTCCAGGAGATTTCTGTTGGAAATGCTGTACCAGTATGGTATGTGGGTTTATTTTGTTTG
TGTTCTCTTCTTTCCAAATGTTTCTTCAATTTCCTATTATCCAAGTGCGTGCCTTGTGAAT
TTCATTATTACATTGAAATGATTTTATCTTCTGGACAGAATTTCATTAACATCTCCTTCTG
TATAAAGGTTTAAGTGATACTGGTCTTGACAGTTTCTTCTGTGTTTTATAGGTACAACAT
GGTCAAACAGGGTCTTGTCAAGGAGCCTGTCTTCTCATTTTGGCTCAACCGAAATACAG
AGGAAGATGAAGGGGGCGAAATTGTGTTTGGTGGGGTTGATCCTAACCACTATAAGGG
AAAGCACACTTATGTCCCAGTCACACGGAAAGGTTATTGGCAGGTAGATATCCCTATAT
CTTTGGGAGATTGATGTTTGGCTTTTGCAACCGTTTTCTTACTCTCAGAATATAATTTGC
AGTTTGACATGGGTGATGTTCTGATTGATGGTCAAGCTACTGGTATGTTATGTTACTTCC
TTTTCTATTTTTTTGTGTGGAGATTTCGAGGATAAGATGAGAGCACTTTCACATGATTTC
CATGCTTTTTCGTTGTATTGACATACTGAATACTGTAGGTTACTGTGACAATGGATGTTC
TGCAATAGCGGATTCTGGGACTTCTCTCTTGGCTGGTCCAACGGTATTCTCAAAAACAT
GTTCCATTTTTTGTTCCTCTTATTCAGCTATTATCAATAATGAACTGTCTCATAATTTTTTT
TGTACCGTCCTGTTCATGTGTAGGTTTAATTTTTTCGCTGGAATATGAGTTGAATAATAA
TCAGCCATTCATTTGAAGTATTCTCATTTTTTCCGTTTCTATTCAAAAAAAAAGGAGGATG
GCAAGTGCAGTGATATTGATATTCATTCCAGTATCTGGACATACTTCCTTGTTGATTTTC
ATACCTAAGAAATGTTTCTTTTTACTTTTGATCTGTTGTTTCTGTCTTCTTTGTGTGCTCTT
CTTCTTTATTAGGAAAAAAATTGTGCATCTTGAGAACTGCTTCTTAATTGTTTTCTTTTAT
GGCATGGTTGACAATATGATACAAGGAAAAACTGCAGCTTCTTTTGTCTAGACAATTGTA
GTAGTGAAATGCTTTACTACTACATTTCTAGTTCTCATCATTCTTCCCTGTATCCTTCCTC
CTCTATCTTGCAGACTGTAGTCACTATGATTAATCATGCCATTGGCGCCTCGGGGGTTG
TAAGCCAACAATGTAAAGCTGTTGTTGAACAGTATGGACAAACAATAATGGATATGCTTT
TAGCAGAGGTGAGCAATTATTTGTTTTAGTTGATAGTTTTTTGTTGTTTTTACCAATAGTT
TTCCGTGGTATCTGCAAAGAGGATGGTTTCATGCTACTAGTTGCCTTCCCAATATTCTGA
TGCATTGGCGTCTTAACAGGCACATCCAAAGAAGATCTGCTCACAGGTTGGGTTATGCA
CCTTTGATGGAACTCGTGGCGTTAGGTTAGGCTTCAGACCCTTTCTTTCCTCGCCTTGG
CCAATCATTTGATATGGTAAATCCTATTATAAAATGTGTGCTGAGTGGATTTATGTCCTC
CACGTGTAGTATGGGCATTGAGAGTGTTGTGGATGAGAATGCTGGCAAATCTTCAGGA
CTGCATGATGCTATGTGCTCCGCTTGTGAAATGGCGGTTGTCTGGATGCAGAACCAACT
TAGACAGAACCAGACCCAAGAACGCATCTTGAACTATGTGAATGAGGTAAATAGCATCA
GTCACATGCTTTCTCTTCTCATCTTAAGTTAGATTACTGACCATCTTTAACAGCTTTGCG
AGCGACTACCAAGCCCAATGGGACAATCAGCTGTTGATTGTGGAAAGCTTTCTGGCAT
GCCTAGTGTTTCCTTCACAATTGGTGGCAGAACATTTGACCTCTCTCCTGAGGAGGTAT
GTCTGATATCAATCTTGCGTAGTGTACATGGCGTCTTCTCATTTTGTAAATGGCTTTGAT
TTTTCTGAACAAAGTGATTGGTTGTAGAATCCTTTTGTCATGTTTCAGTTAGGCAGTTCA
TTTCTTTGTGGTTTTCACTAGATTAGCTAGCAAGGTGTTACTCTGCTTTCAAGAGAAGTA
CACTTGTCTTTTAGAAAATTTCAACCATGACAGCTAAGTGTCGTTTGGATAATTAATGAT
ATTGAAGCGTGTCGAGCTTTAATATCAGTTTCTTTGCTTGATAAGTTAACTTGTGATCGG
ATAATTAATGTTATTGAAGTGTGTCGAGCTTTGATATCAGTTTCTTTGCTTGATAAGTTCA
TATGATTGTACTAAGCTTGCATGCTTTTCTTGTCACCAGTACATACTCAAGGTGGGCGA
GGGTCCTGCTGCACAATGTATTAGTGGCTTCATTGCCTTGGATGTTCCTCCACCCCGTG
GACCTCTCTGGTATGTTTTCTTTTCGTCTTAACGCACAAATGCGTGGATTCTGTTATTAC
CAGCTCCCTTTTGATAATGTTGTTTGCTTATGGCTTTGGTGGTGCAGGATCTTGGGGGA
TGTTTTCATGGGTCGATATCACACCGTCTTTGATTCTGGAAAACTTAGAGTTGGATTTGC
AGAAGCAGCT
SEQ 48
ATGGTTGTTGCATTTGTGGGCATAGCCAAGTCTATCGGGCAACAATGCTTGAGGCGATC
AAAACCCTACTCTTACTCTTACTTCTCCAGCTATGTTCGTTCCTCAAATTCTAAGTATGG
ACTCCAAAATTGGCAATTTCAGAGTCATAGAACTCTAATTTTACAATCGGCTTCTGAATC
CGTCAAATTAGAAAGACTCTCCGATTCCGATTCCGGTAATTCCACTCACTTTGTCCTATT
TTACGGCGTACTGTTACTATTTGGGGAATCAAACTTCTTTTAATTTTGGGTACAATTGCT
TTCTGGGGTTAATTAACAGGGATTTTGGAGGTTAAATTGGATAGGCCCGAAGCGAGAAA
TGCAATAGGGAAGGATATGTTAAGAGGATTACAGCAAGCGTTTGAAGCCGTGAGTAATG
AACGTTCAGCAAATGTTTTGATGATCTGCAGTTCGGTTCCCAAAGTGTTTTGTGCTGGA
GCTGATTTGAAGGTATAACAGCTTCTCTTCATGTTGTGTTTTTAGGAAAAAATGAGGCAA
AAAAAAAACTTTTGAAATCTTGTGCAGAGTATAGGACACACTATTTGGTTAACAAAAAAT
GTGTAATCAAAGGGTTCGAAGTCAGATTAAATAACAAAAATTATGGGTAAATATTAAAAA
TTTTAAAACTTTTAAAAAATCATACTCGTTCAAATTATATTAAATTTTTTCATTTCTCCTCTT
TGCTTCTTCTTCTCCTTCTTTCTTCGTTCTCCTCCTTAATTTCTTATTTCTCTTCAACTTTC
GTTGTTGTTGCTGCTGGTTCGTCCATGTCATCTTCTTCTTCTTCATTTTATTTTTCCATCT
TCGTCTCATTTTCATCTTTCTTCTTTTTTTCATTTTTTCTTTTCTTTCTAGTGGTTTAAAATA
TACGAGAAAAAGAGAAAAATATGAAATTTTACAAAGTAAGTATTTTGCAAAATACCCTCG
AGATATTTTGAGACATACCCATAAATGAGTATTCTACTGAAACATCCCCGGGTTGTTGTT
TGAAACATCCTACGGGATATTTCTTCTACTTCTTTCTTTTTTTCTAGTGTTGTTCTCATTG
ATCTTTTTTTCACAGATGTTTCAAACGTTCCCACAGATGTGTGCAGATGTTTGAAACATT
TCAATAAATGTTTGAAACATTTCTTTAGATATTTGAAACATCTTTCCTAAATGTTTGAGGA
TTAGGAGTGGCGGGGATAGAGAGAGGAGCGTTGCGAGATGGAGTAACAAGGGGAGGA
GGAGGAGTGGCGGGAGGAGGAGGAGGGTTTTTTTTTTTGTAAATAAGAAAACTTTGGG
GGTTTTAAAAATAGTGTCATTAACCCTAATCACAGAAGTGTCCCTTTTACCCCATTCTTA
ACACTTTTGTCTTAAAAAGTGATATAAGTTTTGAAGTGTCTTAAAAATTTAAATGCCCCGT
GTTTTTAGTTGTAATATGTTTTTTCTTGCAAGCAGTATCACAACTTGTAGATATTACATCT
TCCTTTTCGTTATGTATCCTCTTCTCTTCATTGTGCCAACCATTGTATTTGTGTTTTCAAG
CAAGAGACAGATCCAAGATATGAACTTTATGTGTTCGAGTTCTAAATTCTGCCTAGTCCA
TTTGATTTACGGGGTTTGAAATCTATATTTGTACAAGTTTTGGTGAGTTTTTTAACACATA
TATGTTTCGTTGATTCTCTCTGTAATGTCTTGTTGTCTTCTAGTATATTTTTCTTTTGTATC
GCGTGACATGTTGACAAGAAAAACAATAAAATTTCACAAGCTAGGAGCATCTGCCAAAC
ATACAACTTTCAATGTTGAAAAGTTTCTCTGTGGTTTGATCTCTTCGCTTGTGCAGCCTC
TGTGGGACTCAAACATCTTCACAGATTCAATTTTCTTTTGTGGGGACTGCATTTATGATT
GGCATGGCATATGATTGAATAGCTGATGTTTTTGGTGTCAGATGTTGATTTACGCACTTT
AGTAGTCTTTCTACTGTTTGTTATCTGTGCTGTATTTTGTACCTACTGATAGCTAATTCAA
TTGATTGGCATTTGCTACATAGGAACGAAAAACTATGATTCTTTCTGAAGTCCAGGATTT
TGTAAGCACTTTGAGATCAACTTTTTCCTTTTTGGAGGTACGTGATTTTTATTGATGTTTT
GTTTAATATATTAAAGATCATAGTGTCTTAAAGCTCAAGAAGAAGTTTTTTTTATCAATTT
ATGAAAAGCAGAAGTTATCAGTTTATAATCTTCTGAATTCTTCCTTCAAAATGATTGGTTC
ATGAGATCATACTATGTCTCGTTTCTTCTTCCTCACTTATTACTTTATCAACCATAAAGTG
GTCCAGTTGTACATATCCTCTTCTTTTCTTTCACTCTATTTGAGTAAATATTTTCTTGCTC
AGGGTCTTCATATTCCTACAATTGCTGCCATTGAAGGTATAGCATTGGGTGGGGGGCTT
GAAATGGCGATGTCTTGTGATATCCGTATATGTGGTACGTGCTTTCTTGCACTTCTGGG
TGTACCATATTTTCTCCTTCTTGCTCTCTAGTTTGATAATGTGTTAACAGGTGAAGATGC
AGTGCTGGGCTTGCCAGAAACAGGACTTGCTGTAATACCAGGGTAGGTATGCCTTAATT
ACGCATTATATGTTTGCTTATGCAAAATCCCAAAATTCTTTGAAGGATGTGTTAGCTATG
TGTGGTTTATTTCTAAATTTATCAGATCAGTGGGACGCATTTCCACTATCTTTTTGTCACT
TTGTAATCTTTTACTATTCAAAGGTTTCCAACTTCAGAAGTTGCTATAAATACTCTGTATG
CATAGATGATCTTCTTAATGGTATCCTCTTATTTCATACTGGCATTGTGCAGAGCAGGAG
GAACACAACGGCTTCCTAGATTGGTTGGAAAATCAATTGCAAAAGATATAATATTTACTG
GCCGAAAGATAAGTGGGAAAGATGCTGTATCAATAGGTACGTGTATGACTTGTCAGAGC
TCATTTGTCAAGAGACAGGACTCCTTTGTCTTTCCAAGTTCTCTCTTGTTAATATAAAGAT
AGCAGTGATGTCAGCACTTCATTACAAATTATGGGTTAACAGTGTCCTCCAAGGTTTAG
GCAGATAGAAAGAAATCATCTAATTTTTGCTTCTGCTGTAATTTTGGACCTTGATCTCCT
ATGGTTTTCTTTTCCAATTTCTTAGTGAATAATACATTGTATGCAGGGCTTGTCAATTACT
GTGTTCCTGCTGGTGAGGCTCGCCTCAAGACACTTGAACTTGCTAGGGATATTAATCAG
AAGGTTAGACTTTAGTTATTGAGATAAAGAGGATGTGATGTATTTATCCAGTGTGCCACC
CATATGACTTCCAATTGCAATTTAGTCACGAACAAAGAAGAAACATAAAAGAAGTCCAAC
TCTTCCTATAAACAAAATGATTTCAAACTGTACTGTACATAGATAATTGTAAAGATTCGTT
AGCAGTAACGTGTACTCTTTTGTACCCTTTTCACCTTTTATGAGTTATGCACCCTCTTTT
GTACCCTTTTCACCTTTTATGAGTTATGCACCCTCAAGGCCATGAAATATGCTTGTCACT
GGATTTTCTTTTCTTTTGTGTGTGTTGAATGAAGTTGAGGCTCTTGTTTGAACTTTTTATT
GTCATCCATGGACCTTAATTTAATGGCATTTACTAATCCTATGCTTGTTTGTTTTCCACTT
TGTGCACTGCACCTTCATTTTTTGTGACAAGCTTTGTTTCTTGCTTTTGGTCTTTTTCTGT
CTTGTTTTTTCTTAGGTGGAGGATCATTGACTATTGCATAGTTCCTTGCTTTGGTTTCTTT
GTTTTCCTTTTCCCCTTCTTTTTCGATTTTTAGCTATTTTATGGCAGGTTCACATAGAAAA
CAAGTGTTACTCTATTGTTTTTCTTTCCCTTTTTTCTCCATGCATCTTATAGAAACCGAAG
CTTAAGGTTTCTCCACGCAAGCTGCAACGTTCTGTTTTATAAGCTTCATATATTTCTGGT
TTTCATGTAGTATGAAATGATATTGAGTGGGATTATTAGGAAGCTGAGACAGATTGAAAA
GAACAAGTAAAAGCCACATTGGTGATTCCCTCATGCTTTCAACCTTAAAAGGTCATTTCA
ATGTCCAGGGTCCAGAAAAGGGACTCACTCTATTCATGTTCTATAAGAATGGAGTAATC
CACTTGACAAGTTCTGGTGGTACTACTTCTTCATAAGGTTTTATTATACTAGTCAGGGCT
GTGTTGAAAGGATATGGTAGCATCAACTAAGTCCAATTGTTTCGTATTGTATAGAGCCTT
CCTTTTTCCTTTTCCTTTTCCCTTGTTGACTCTGCTTTTATACCTCCAAATGGAATGGTCG
TAAAGCTTTTGCTCTTTATTCAATAAGCTAAAACTTCTGATGAATAAATTTGGTTTAAGGA
TGCACGAGGAGTGAAAGTAAAATAAATATTGATGAAGGTTTTGCTAAAGATGCTCTTTTT
TATGCTCGGGTTTTGCATGTCAACTGACATATACCTTATCAATGTCGACTGACATATATT
CTCTGACAGGGTCCGGTGGCGTTAAGGATGGCAAAATGTGCTATTGACAAAGGAGTGG
AGCTAAATATGGAGTCAGCCTTAGCTTTAGAGTGGGATTGCTATGAACAACTGTTAGAC
ACAAAAGATCGGTTAGAAGGCCTTGCTGCATTTGCCGAGAGGAGAAAACCTAGGTATAA
GGGTGAA
SEQ 49
CTAGCAGCAACCAGCTATAGGAACAAATGTGTCAGCTCGAAGTGACATTGGTTGGCAG
CTAACATCCTCACAACTCTTGTGGTAAAGCATCTTTTGTACATAGTACATCAGATAGCAT
TGTGAGGCCCTGACAACTTCTTCATCGACTTCAGTAATCCATGCATCGTCGCATTTGTA
CCATTGGTTTCTCAAACGCAAATAAGTCACATAGTGACCTGATTCTAACATCCCTGAATG
TGTGACCACAGCGAAAATTTCAAATTCCGTAGAAATATCTGATTCATCACCGTCGAATGA
AAAGATTCTGTTCCCGTATCTCTTTCGTACAATTGAAGATGATAAATATGGTTTCATGTCT
AAAGAAAAAGGAAATTGCAGGTGGCGGTCAATCTTTCTGGACATTTTTCGGGTGGGAGA
ATGTTCAAAGCGTTTTATATGAAAAGATAGCACCAGCGGAAGCTTCTTGATGGACATTT
GTTTCAATGCATCTTGCTTTTCCTGACAATTTTCACAATACAGTTTCTGATCAGATCCCAA
CTTTTCTGGTCGTGTGAAGAGGTCCAAGCAACCTACAAGAGATTCATTTGGCTTACTCG
ACTTATTAGCAAAATCCTTTGGGCTGGAGTTGCAGCTATTCAAGTCAAGAGAAATGTCC
ATACAAGGATCATGAGTTGTTGAAGTGAATCCGCACGATGTACATGTGACATCAGATCT
CAAGAGCCCATAGAAAGTCCTATGAGCAATACACTGGCAATCTCCATTATCTGTTTAAG
AAATGTTATTTTTATCAGAGAAAAAGAACTAGAGGAAGCCACAAATCTTCAGTTCTCAAG
GTTATATATGAATCATCCCTCTTCTGATCAATGCATTGACAGCAAATGGAAGTACAGGCC
AAAATTGATATCCCAGAAAGGAAATTGCGCATGTACATAGACAAGATCTGTACTATATTT
ACTCACTGCGTAGCATAATGTCCTACATAAACAAGAAGAATACCAGCCATTCTCAGCAA
AAGATACCTTTGGTTGCCAAACTAGCTTTCCCCTCTTTATCATGGATCCTGTCCATAACT
GAAATGAAGAACTCATGAGCATCCTGCTGCTCGTAGGTAGCAAGATTTTCTGAATGCTG
CCACCAACTGCAAAAATAAACAACATTTCAATTAAAAGAATCACTACATAAGTTTTCAACA
TGTCAACGTCCAATAAAAGTTAACATTTTCCACGTTTAAGCATCCAATCTAAATTAAACAA
ATGATGATATCTTAGGTGGACATAGCAGCATAGAGAAAATTTCAACAGATTTATTTTATC
ACGATAAGAGAATAGCAATCTTGCTTATTGTCTGATTTTAGCGGTAATGCACCAAATCTT
GTTTATCCAAAAACTTCAAAGTGAAAACCTATGCAGTTAGCGCTAAAAATTGTAAACATT
ATTTACAATTTTGCAGGCCATGTTATAATCAAATATCCAGTATCAATACATTAAATGGTGA
GCACAGATAAAAAGCAATTAAGAGATAATGACAGGAGATAAATCCTTTTAACACTTACAG
ACCTGTAAAGAAACCGAGCTGGACTATAAGGGGTCCGATCACCAGAAAAGACAGCTGA
GAAGATAAGGTCAATATCACAAGGCAGGCACAACCGATCCGACGACATCTTTCTGCAAA
TATCTCGGTTATGCCTATCGCTAAGGAAGTAATTTCTCAAAGGGGGTGCATGAAGTAAC
ACTTGCAACACAGAGTTCATGAAACAAGTATTCCCCAAATTGTTCAAACCCCTTAATACT
AAAGGAAAACATGATTTCGACTTCTGATCCCTCCTTAAAAACAACGTCTTCATATTCTTT
GAATCTAAATCCATCCCAAAACTCAACCTTCTCCTCTTGCTCAACCTCAACTCACTCTCC
ACAACCCCAATTTCTGTTCTAGGAAACCCCATTATGTGTTTACACATCACAACCTTATCA
AAATCAGGATCATACACCTGATCACAACACACTGAGCAATAAAGCTCAGCCCTTTCCAT
GTCTACTGAAATCTCATGCCCAGCTTTACACTGACTATGCAAAAGGGCATGGTTTGATT
CAGGTGACAAACAACATAACACTGATGAACAGATCAAACACATGTAAAATCTACCCTCAT
GTCCACTACAAATACTACATCTAGGTAGCTCTGATTTGGATATTTCTAATGTGGTCCTAC
CATATGGGGTTGTCTTAAAACATTCTTGGATCAAACTATACCCACTCATACCATTTTTCA
CCTTGTAATCTGCAAGATGCTTACAGGGCTTTGGATTTATATATAAAGAGTTACTTGAGC
ACAT
SEQ 50
ATGAAAGAACTTCATTCTCTAAGAGAGATCGAAGGGCCTGACCCGAATTATAAAGATAT
ATGCTTTTCTGGTGCTGGAAGGTAATAAATTAACTATAGTAATGTTAGATCATTAACTTTT
TCTTTCCTTTATTTTTGGTGTTGTTTCTTGTATTGAATGTCTTGTATATTGCAGTGACATC
TCAGAGCTCTCAAAATCATTTCCTCCTATCGACATGGTATTTAGCAATGGAAAGAAACTA
TCTCTCACTCCTGAAAACTACTTATTCAGGGTAGGCATCTTAATCCACATGGTTTTTATC
TTACTACCTTGCCTTTAGAATCTGTATCTCCTTTTGGCTTCATCTCTCCTAGTGGCTACA
TTTTTTGTCTTTGTTTTGATAAGTGGCTCCATTTTCTCTCTGTGTTTCTATATTGACTAATT
CTGCCCTTTTGCTTACTGTAACTGATTGCTATAAAGCACTCAAAGGTGCGTGGGGCTTA
CTGCTTGGGAATTTTTCAGAATGGGAAGGATCCAACTACTCTTCTTGGAGGTATTTGTC
ATATATATCTTTTAGAATCTTGGGAAAGTTCATCTGCCAAATTCTTCAGTTGTATAAGCTG
TAACATGCGTGCCTTTGCTTTTAATTGCAACAGGTATTGTTGTCCGCAACACTCTTGTAA
CCTACGATCGTGAAAATGAAAGGATTGGTTTTTGGAAAACCAATTGTTCTGAGTTATGG
GACCGACTAAATTTATCTCCTTCACCTCCACCTCCACCATTGCCCTCAGGCTTGGACAA
CACAAACTCCAGTGCAAATTTGACTCCAGCACTGGCACCTAGTTTACCTCTGGAGCATG
CACCTGGTACGAAGAAACTGTTCTCCTATCTTTTTGTCACCATTAGTATGCCTTTCAGTC
ATGCTTTTATCCAGTTTTGTAGTGGAACTGGTTTTATTTCAATTATTCTACCGGAAGGGG
GGAGCCTTAGAGCAACGGTAATGTTGTCTCCGTCTGGCCTATATGTCATGGGTTCGAG
AAGTGGAAGCAGCCACTATTGCTTGCATTAGGGTAGGCTGTCTACATCATACTCACACC
CCTTAGGGTACGGCCCTTCCCGGAACCACATCAATCCGAGATGCTTTGTGCATCGGGC
TGTCCATTATAATTCCGCCAGGCTGTTTTGCATCATTTCCCCCTAATATTTTTAATCCATT
TTGGTTTCTGATTTGCTATGCTGGTTTTTTGCTATATCGCCTAAGATTAGGTTAGCTTTG
ATGATTTCACATCCTTTCTTTGATTAAGGTCCATGAATGTTCCTGTGTCTCCAAATGTCA
GCTTTCAAAATGACATTTGAGCTTGCGTTTTGTATTGTTTCATCAAGTTTTGTATTCATCT
ATCTCCTTAGCATTCCAGAGTTCCTGAGAAGCACTCGCTAGTAAAGACGTATTCTGATG
TCATGACATTTTTAACCTTGTTGGAGTTTGGACCCAAACAAACTTTTTTTACAGAAGGAA
ACTATAATTTTAAGGAGTACAACAGTTGCTGTATATAGAACATGGTGAGTAACTCCACTC
TTGAGATGCCTTCTCTTCACTGAAGTCAATTTCTAAAAACCTCCGTGCTTGACACAGATT
TGTTGGTATAGATTTGTGCTCCAGAGATGCAGATGGGTCCAGCGAATTTTTCACTAGAA
TTTTTCTTTTTTTTCTCACCACCTGTCCAGCATAGTGCTGTCAAAAGTGACAAGCTTAGA
AAAAAACCATGTGCTTGGTGGGGCTTTAACTGCAACATGCTATAAAAACGTTCACTATAA
ATGTAATGTAAATAAATAACCATAAACACAAAATAGCAATTGTTGGAAAAATTGCAATTTA
GTGAAATACACGAGGTGTCAATCAAGTTCAGATCATATTGTAAGTCTTGATTCAAGGTG
CATTTTTAAATTAGTAATTTGGATCAATGATTAGTTTTCTTAGACTATAACTTTCAATTTTC
ATACTCGTAAGAACTGATATACTCATATATAATATAACGTTTTTCTTAATAACTAATAAAT
GCTTTCCTAACTATATTTATTTTTGTGCTATCCTATTAACAACAGAGCCTGTGGATGTTAG
GCACCCACTTTAAGGCCTTTTTCCTCGCACTGAAGCCCTACTTTTAAGGTTTACTGTCAC
GACCCAAAATTCCACCTTAAGGATCGTGATTTCACCTAGTCTCTAAAACTAGGTAAGTC
GATCACTTACAACAGTTAAACCATTAAAACATGATATTATGAAGCGGAGTTTAATATAAAT
GCGAAAATAAAGGTGATACAAGCCAACACGGCGTTAATCACAACAAATCCCCAAGACTA
GGTAATACAGAGTCACGAACTCTAACTGAATACATAGAAATATTTCAAAACAAAGATACA
ATACTGTTCTGGCAGATAATTGACAGTATAAAGATAAGGAAAGACTACAAGGGACTTCG
ACGATCAAGCAGCTCTACCTTGAATCCTCGTGATCAAAAAGCTAACTCTGCCTAGGTCC
TATGCCTCCAACACCTTGATTTGCACAAAATGTGCAGAAGTGTAGTTTGAATACACCATG
GTTGGTACCCAGTAAGTATCAAGGCTAACCTCGATGGAATATTGGCGAGGTTCAAGTAA
AGACACTCACTAGTCAAATAACCTGTGAAAAATATCAAAAATGGGCAAATGGAATAATAA
CATAAAGTCATAACTGTAATCTCTTCAAATTAAACGATACCTATTTAGAATAATTAAAGGT
CCCGTTCTGACAATAAGCCATCAAATAGAATCACGCACACCCGGCACCTCGTACCCACA
TTAACAATCACCCTCGCACGGCAAAGGCCTCGTGCCACAACATAAGATATACCTCGCAC
GACGAAGAGCTTGTGCCACAATATAAGTCACAACCGCATGGATAACTCATATGCCAATA
TCACAATCCGCCTGGCGTGGTCACATGCTCAATATCACAATTCGCCCGGCGTGGTCAC
ATGCTCAATATCACAATCCACCCGGTGTGGTCACATGCTCAATATCCCAATCCGGCTGG
CATGGTCACCGGCTACCTGTCCAAATGTACATGATCAATGGACATCAAGTTTCATACTC
CTGGACTGATATTAATGACATGTTATGGTATATGCATGTGCAAGTGTATTATCACAGCTT
AAATCATCTAAGTAATATCAGAGACACCAAGTGGCACATTAGGAACAACACAACAAATC
ACGTAATATGTATGACACACACAAGGAAGTCAAAAGCAACAACCAGAATACTCCTCTTT
CATCAACAACATGCCCCCAGGCCATCACATAACATCCCCTTATTGCCACCCTTATGTCA
CCACGTTGACAATATCATAATAGCCACCCGTATCGCTCCGCCTAGGCAGTATATCAATA
GCCACCCGTGTCACTGCGCGCAGACAATATATCAATAGCCATCCGTGTCACTCCGCAC
AATCAACAACAGTGAATTGTCATCCTTGTGCTCCGGATAACAACAATCGATCCACACAT
GTCCACATATGCCACAATATCACAGGATAGTAGTATTAGAGATTTATCACGATACAAGCT
CACCACTCATCAACAAAGTGCACAAGGACATATCATTAATATAGAATTGCTGAGGGGTA
TTCAACATTTAAGCATGAAAGCTACTCAAATTAACAAGAGTCTCACAAGCGCCCAACTTG
GCCAAATAAGGAATTAAGATCCTAAAACATGATTTGTACATGGAATATAAATAACTTAAT
GTCAAAAATAACTTGATGTCATAAATAAAAGCCATAGGAAACGATTCTGAATAATAAAGC
TTCTATCTTGAACAAGAATAAAAAGTAATCCCAAAAAGTCAACCCCGGGCCCACACCGT
GGAATCCGACAAAACTCACAAATTCCGAACACCCGTTCAAATACGAGTCCAACCATACC
AAAATCATCCAATTCCGGCCTCAAATCGGCCTTCAAATCATCAATTTATGTTTTAAAAAA
GTTTTTACTATGATCTCCAATTTCTCCCATTCAAATCATCAATCAAACACTAAAATTGAGA
TTGGAATCATGAGAATAAACAAATCCGAGTAAAAAATACTTACCCCAATCCAAATCGTGG
AAATTCCCCCAAAATCGCCCAAATCCGAGCTCTATAACTCAAAATGTGATAAAATAACCA
AAACCTTTGAAATAGAGTACTTATAGATCTGCTCCAGGTAAACCCTTCTCAATTGCAGGA
CCAGCTTCGCAATCGCAAAGCACAAACTTAACTGACCACAGAAATACCCTTCGCGTTCG
CGGTACATACCTCGCGAACGCGATGCATGGCTGAGCCAGACCTACGCGAACGCGGCG
TAGACCACGTGACCGCGAAGACAATACCACCAGCTCCCAGTTCTTCATCGCGAACGCG
TCATTGCCATCGCGAACGCATTGACCAAGCCCCACAAAGCTACGGGAACGCGACCCTC
CAGTTGTGAATGCGAAGAGGAAAAACACTCAGCTCCAATCATACACTGCGCGATCGCG
GTTAGCCCCTTGCGATCGCTAAGAACGTCAGCAACAACAGAAAACCAGCAACACAACAT
GAAGGAAAATGGTCCGAAATCACCCCGAAACTCACCCGAGCCCCTCGGGGCCCCGTT
CGAACATACCAACAAGTCCCAAAACATAGACAAACCTACTCGAGGTCCCAAATGACACC
AAACAACATCAAAACTACGAATCACACCGCAAATTCAAGCCTAATGAACTAATGAACTTT
CAATTTCCAAAACTCATGCCGAACCATACCAAATCAACTCAGAATGATCTCAAATTTTGC
ATGCAAGTCCCAGATGACATAAACGGACCTATACCAACTCTCAGAACCGCAATCCGAAC
CCGATATCAACAAAGTCAACTCTCGGTCAAATCTATCAACCTTCCAAACCTTCAACTATC
CAACTTTTGCCGGTTCAAGCCAAAACAACCTAGGAGACTCCAAATCCACATCCGGACAC
ACGCTAAATCCAAAATCACCATCCAGACCTAACAGAACCATCCAAACTCTGATCCGAGA
TCAAATACGCAAAAGTCAAACTTGGTCAACTCTTCCAATTTAAAGCTTCTAAAATGAGAA
TTATTCTTCCAAATCAATCCCGAAATGCTCGAAAACCGAAACCGACCATACACGCAAGT
TGTAATACATCATATGAAGCTACTCACGACCTCGAACCACCGAACAGAAATGCAAATGA
TCAAAACGACCGATCGGGTCGTTACATTTATGTATGCTTCAAATGAGCATTCAGTGACA
CTGTTCAGCAAAAGGAGAAACTCTACTAGCCACTTGTAGCCACCTCCAGGGACCCTCTC
TGTCTCGGCCATGGATTACTTTGAGGAGTAATAGGGCTTCTCCAAGCGGAAACATTCCA
CGCATGCTGTGATCCTCCATGTTTTCTCTGCTAATCTTTGCTACTTTTTCTGGCGGTCCA
ACTGGTTGATCAATTCTACTAACACCATCCGAATGAGCCCTCAGGAATTCATCTCCCTA
CTTCTTTGCCAGATACAAGGAAGGCGTTGTCCTCATAGGGTTGGCCTTGGCATAACTCT
CTGATAATTTGACACGTTGCTTTAATTTGGTAACAATTTACAATTTTGGGGGTGCTGCAC
TTGATTCCATTGAGTTACACCATTTCTCATATTTAGGAATGGTCCTGTGCAAGATTGAAA
TGTACTGAACTCAGTTTTTCCCTGTGCAGATATTTATGATTGTTATTATTATTTCATCTTT
GACCTGAATTGGCAGGGAAAATCAAAATTGGACTCGTATCATTTGATATGTCACTGAGT
GTTGATTACTCAGCATTGAAGCCTCGTGTTCCAGAGCTTGCCCATTTTATTGCGCAAGA
GTTGGAGGTTAACGTCTCACAGGTAGTTTTTGCATGACCCAAAGTTGTGTCAGTCTGAT
GTAATCTAAAACTGTATATCCCATTTTCTTTAAGTTACTTAACTGTATTTTAATTTTGTTCA
ATATGATATGTCACTTATTGGAAGATACCTTGCAGGTTCACTTAATGAACTTTTCGACAG
AAGGAAATGATTCCCTCATTAGATGGGCCATCTTTCCTGCAGGATCTGCAAACTACATG
CCAAATGCCACTGCAACAGTAACTCTAAACATCTAGAATATGTGAGGACTATTTCTTGAT
TGAAGAACCCTTTATTCATCATTTACCTATTTGCAGGAAATAATAAACCGGTTGGCTGAG
AATCGTTTTCATCTTCCTGATACATTTGGAAGTTATAAATTAGTCAAATGGGACATTGAA
CCCCCACCAAAGAGGTATAAAAGCTATCTCCATTCTTTGCATGTTCATAAAATATTGAGT
TCTGCTGTACAAACTTTTAGCATCATAGCATTACTTATAAAATTATTCTGAATTGTCAAAA
CAAATGTGCCTTTTCTTTTCAAAATGCAAAATAAATCTCCGCATTGCATTTCAGATGGGA
AAAACATGACACGCATCTTTTCATCTTGCCTTAAACACATGTTTGTAAGTTACATTCTAAA
TTAGGAAACGTGAATGAGTCTACATTGCATCGCACCAGTTCGACTGCATATTCCAAGGA
TAATGATGAATAGGTGATGACTTTCGTCTCCATTTTTCATTGTTTCAATTTTTCTCAAAGT
TTCTTACTTGGATTGGTGGATAAAGTGGCAAAGCCAGAATTTTTTTATAAGGGATTCGAA
AATACTAGAATGTCATAATTGAGATCTGAACTTGTGACTTGAAAGCAACTTTTGAATCCT
CTTTGCTACTAAACTAAAAAATTTCCCCTATGGCAAGGAGATTCAATAGCTTATATATAA
CCAAAAAACTTCATTTTTACCCTATTCGCATACTATAATTTGAAATGTTTTTGGTCAAAGT
TTAATTTGCTGCATCTCAAAATCTTAATAGCAAAATATTACCTTAATTAACTCTAATGTAA
AGAGATTGGATAACACACCACAACAATATTTTTGGTAGGTGAATATTACTTTAATTTTTTT
GACAATTAATTGAGATAGGAGTTCTTGATATTTTTTTTTTGGTTTTGGTAACTATCAAGTT
GTTGGTTTGATATGGTTTCATTGCAACGATTTAGGATACGATGGCAGCAAAATTACCTTG
TTGTAGTGTTTGCGCTACTAGTTGTCCTGATAATTGGATTATCAGCTTCTCTGGGATGGT
TAATTTGGAGACGAAGGCAAGAAATCCCATATAATCCTGTTGGAAGCGCTGAAACACAT
GAAAAAGAACTCCAGCCGCTAAAT
SEQ 51
ATGGTCACAGGTCTGAACTTCCGCCATAACTTTCTCGTGTCAGCTTTACTATACTCTAAA
TTTGAACTATAATACCACATTGATGTGAAAATTCACACTTAGGTATCGATTTTTTAACACA
GAGATTTATTTTGTGTTCATGCTTTGGTTTCAAGTATTGGAGAACCTCGTAATCGTTCTC
TATAAGCTTCTGGTTTAACAGATCCTAATTTTTCTTAGAAGCTCGAATTATTTTGTATTGG
AATGAAATGAACCTGAATATTGTGGACGATACAGAGGAATTATTGTGGTATAGTTGATTG
ATTGATTGATAGTCTTAAGTAAGAAAAAGACCTATTGGAGATTATGGTGAAGCTTATACA
GGAGCAGCTGGACTTGGTTTTCACAATATCTTTTTTGTTAAGGTTAGAATAAACCTGCTA
AAATTTTTTACTTATCAAAATAAATAAATAAACTTGCTAGAATTTTTTTCAAGTTGGTGATT
GTTTAAGTTTTTTCGATTGTTTTTTCCTTTGGTAAAAACGTTTTTGGCAAGAACTATATTTT
GAAGTTGTGGTTTGAGAGTGTTTGTCAAATAATCTTTTCAAACAAACTCTCTTTTCAAATA
TCCGAACATCTTCAACTTCCACGAAATAGGTGACAACTGGATTAAATTTGGGGGGGGG
GGGGAGTGGTTGATGGTGTATTAAGTTCAAACATCATTTATCTTTTTCTCTAGGAGCAGA
TTTTTAAATCATTATAGATACATTGCTTGATGTATGTTTGAGAAATACCATTGGTGTTTCA
TTTAGGCATCATCACTGTAATAGTTTGGTTAATGTTTTGTTAATTCATCATGGTGGTTCAT
TCAGACAGCATCATTCGGCTATGATGTTGATGTTGATTTGGTTACACAAGCAACTCATTG
GTGGAAACAGTTTTCTAGGATACCCCTTTTATCATTTTCATTTGATTGTACCTCCTGTTTA
TTTTTGCACTTGGACAATTACGGGCTACAATTCTCTCCTTGCAAATCTGGTGATTGGTTG
CAGTAAGTGTAAAGTGGCAAAAAGAAGTCTATCCTGCTGTGGAAATTGACACTAGCCAG
CCTCCATATGTTTTCAAAGCCCAGCTGTATGATCTAACAGGGGTACCACCTGAAAGGCA
AAAGATAATGGTCAAAGGTGGTTTGCTTAAGGTATAAAATTTCGTTTCACTTAGCTTGTT
ATGCCATTTTTCACTTTGCAAATAAAGCACAAAACTCATTTGTGTTTTGAGGAACGCTGA
AATTCTCAGCATCTGATGCTTTGCTTGTTAATTTTGTTGTTAACTCTTCGGTTATTTCTAT
GGTTGTTTAGGACGATGCCGACTGGTCGAAAGTAGGAGTAAAAGAGGTACACGGCTAC
TCATTGAATTACTCTCTATTTTTATGCAATGAAGTGCCAATTATCTAGAAGCATCTGTTAT
TTATTATTTCCAGGGTCAAAGGCTGATGATGATGGGAACTGCAGATGAGATTGTGAAGG
CCCCCGAGAAGGGTCCTGTTTTTGCTGAAGATTTACCTGAAGAAGAGCAAGTGGTTAAT
GTAGTAAGTTTTTTGACACTGATGTTGTTGCATCAAATCGAATGATCCGGAGATGTGTGA
TTCCTTATGTTTAACTGCTTACATAGTTAGTCTTGTCTCATATGCTGTACTTATACCAGCA
CTGGATCCCTAGTAGATTTATTGGTATAACTTTACCGCAATTGCTTTGTTCATTTTTTTTC
AAAAGCAGTTGCCTTTTCCAACTTCTACATGCAAATAAGCTTTAATATATAATTCTCTATT
CTTTTTCCGCTGGCACAAGTGATTTTGTGGATGCCAAGCGCTTGTCGAATGCGTTTCTT
GTTCCGCTGGCACAAGTGATTTTGTGGATTCCAAGCGCTTGTTGAATGCGTAATTTCTT
CATTTACACATTATGAATCGGCCCTTCCCAAGACCCCGCACATAGCAGGAGCTTAGTGC
ACTGGGCTGCCTTTTTACACATTGTGAATCAGATTACTATGTTGTTTTAGAGTCCTGTCT
AAAAGAACTGCTAACTTTTATAATGGCAAGGCTTAGTTTTGTACTTTTAATCAGTAAATG
GGTGATGAGAATTTTTATAATTTTGTTTCCTCCAGGGTCATTCTGCTGGATTATTTAATCT
CGGAAATACATGCTACATGAACTCCACAGTACAGTGCTTGCATTCAGTTCCAGAACTGA
AGTCTGCTCTAACAGAGTGAGCATTTGCTTCTTTCATCCTTTCCTTCATTTTTGGGAGTC
TTTTGGTTTAGGCTTTTTTTTGGTCCTTTTGCTTTAGCCTTGATTTCCCAAAACTTGATCA
AATTCAATATGGTTGCTTTTAAGTCTAGTTCTGAAAATATTTAGGTCTATTTTGATTGCGT
TGCACTTTTTTGGTTAGGCAATTCGATCTATTTGCACCTAATCCGTAATTCCTGTTTTTGC
TAAAATTGATAATTCTGATTTTACTGTTTATATTTGTCAAATCTATTAATCATAATTTAACT
TATATAATGTGTCGCGTTGTATACACCTAGATAGTATGTATTTACGGAGACAAAGCGGA
GAAAACAGTAATTAGTAGAGGAGACATAAATTATCCTGTTTTAATTCCTATATATCCTCC
CTTATATAAATATGGACTCGTTTCTCGGCATGTTCTCCTTTGGATGAAATCAATCCAAAA
TGTAATCCACTTTGAATCAATTTGGACTCCGAAACTGTGGATCTTTCCCGAACATTATCA
GAAAAAAGATCAAAATGGCTCCTGTTAAATACCAAGTGTAGGAGTTCCAAAAACAACTC
CGTTAGGTACATTTCTTTTTGTGTTCCTGAGATTCTGAGTTTATTTATTCTTCCTGTTAGG
TATAACCAGCTTGGTAGAAGCAATGATTTGGATCACTCATCTCATCTCTTGACAGTTGCA
ACAAGAGATCTGTTTAATGACCTGGATAAAAATGTCAAACCAGTGGCACCAATGCAATT
CTGGACGGTACTTTATTTGTCTTTATTCTACTCCTAATATTTTTGGTTACGACTTAGTATT
CCTGACTTTGTATTCTTAGAAAATGTGTTTGGATTTCGAACAAAGTTACCATACCTTTGA
AGGAGAATACGTATGCTGAGTAGGAGATAGTGTTTGCCAATAATTTCCTATTGGCAGAC
TTCAAAATATACTCGTTTAGCGTTGAACACTGAACTCGATATATTTTGTCATGACTTTTGT
GTGCAAATGGATTTGCTCTTAGAGAGCAAGGATGAAGTACTTTGTATAGGATGCGAATA
GAATGACTTAAGCTTGGGCCTGTTGTCTACATGGTCAAAACTTTGTGCATTATCATTCTT
GCACAAGGTTACTTGGATTTATATGAGAACTATAAATGTAGCCTATGTTGATATGTTTGT
CTTTTTAGTGTTTTCGTCACATGAGCTACTCGGGCATCAACATTTGATTAGGTTTATGTT
CACAGGTTTTGCGGAAGAAATATCCTCAATTTGGCCAGCAGAGCAATGGAGCTTTCATG
CAACAGGTTCCAAGCTTACCTAAGCTACCACAATGCTTCCTTGTTATTAAAAAAAAAAAA
GTGTACCACTATTGCAATTGCTATATAGAGGTCCTACTGACATGTCCTGGATAATAACAG
GATGCTGAAGAATGTTGGACGCAACTACTTTACACCCTTTCTCAGTCTCTTAAATCACCG
AACTCTAGGTACTACATCTCCTCTCGGGATATTTCTTGCAGATGAAAGTCCCTTTTCTAA
ATAATTTCCATGTTTTGTTTCGCTAGTTGTTTTCTTTGTTTCAGTTTGGACATATGGTCCC
TATTTTGTAAAAATGTGAAGGAAAACTCTCCTATATATACATTGTGCTTCTTTATGTCATG
ATTGTGACTGCTCTTTATCTCGGTACTTGCAGTGGAAGTCCGGATATTGTGAAGGCTCT
CTTCGGTATTGAGTTTGACAACAGGTATTTCTGCAGTCAAATGTTGTTTACCTTCCAGTT
ATTCTGTTACCTTATCCCCTTTGCATAGAGTTGTTCTGCACCTAAATATTATAAGAGGCA
TGTGAACTTACTGCTGTATATGTATTGAGATAGGAAGGAATGCAGCTAGTGGTCCTAGG
ATGTAGGATGTTCCCTGTTCTGACTTTGAGTATCTTCTGGGCAACCTGATGAGAATCAA
CATCCTCAACTTTTACTCTGTCATATTGTGAATCATGTAGTTGACAATAAGAGATGAATTA
CTGAAGTTGTTTTGAAAGTTGAAGCTAAAAATCATGTTTATGTTGACTTCTTTTAGTTTCT
CCTACTGTTAGTTAAGTGTACTATAGTGCTACTAGTGTGTATTTGTATTACTGCTAATGA
AAGGTCTGGCTGGTTTATGCCGTTCTAGGGTATTTTCAACAGGCTGTGCTTTTCTGCTG
TAGCCTAGACCTCTGGGCTAATATTTCTTGTCTAGGACCTGACCTACTGCAATGAGGTT
GGGAAGATCCCAATGCCCATCCCAGAGTTCTCATGTGCTCAAATTCATCTACTGATATA
CAAAATTTAATTTTCGATAGTGTGGGAAGCTGTTTATCATTCATGTCTGACTTGATCTATA
CTGTTCTGACTGGATTATGGTGGTTGTGCTAGAGTCCACTCTGTGACTATGTTCCATAT
CTGATATTAACTGCTAGATACTGAAGAAAAATGACTTAACTCTGCCCCTTATTCTCATGG
TACTGATATGGACCAGGATTCATTGTGCTGAAAGTGGTGAAGAAAGCACAGAAACAGAA
ACTGTATATTCCCTTAAATGCCACATTTCACAGGAAGTGAACCATTTGCATGAGGGTTTG
AAACGTGTAAGTTCGGTTCTTTTCCTCCTTGTATGTCCCAACTTCTAACTTTAGTCTTGTT
TCCTCCCAAATGTTTCATATTACTGCTAAGTTCTGTCTCAATTTTTTCTGTTGTGCCAATC
CAGTAATCATCCAATTTGATTAAGAGGACAGTCCCAAAGTGAAAAATGACGTATCTAATT
CATAGAAATTCCTTTGGTAGCTGTAACCTTTAAGGATAACTAACAGTTAGTCCTGAAATG
GTTGGTTGGATGGAGAAACTATTATATAAGATGCCTCACGGGCGGCACATTGGGGGGG
GGGGGGGCTTTTNGGGGGGGGTCTTTTGCTTTAGAATTTTTTCATGCCATGATTGGACC
AAATGTGGGGCCTGCATGTTGAGGTTCATGCAATAGTTTCACATCAGAGGTAGGCTGA
GCCATTGGAGCCGACTCATAATGCTTTTGGTGGGAAAAGATGCTTTGGAGATTTTGTTT
AACTTGGTGCAAAATCGTTGATGATGTTTATACTGGAAATCTGGTATTTATCCCCTCAAA
ATAATTTAAATGCACTCATATGGCCATTGCTTTTTCTGACAGGGTCTGAAATCAGAACTG
GAGAAGGCGTCTCCGTCACTTGGACGGAGTGCAGTTTATGTGAAAGACTCCCGAATCA
ATGGCTTGCCAAGGTATTAACTGGCTCGATTAAATTCCATGGCGATGTAGCGACATATG
TATGATCCGTAGCTTCTGCATATAGACTATCTTAATCCACGCCTTTCACATAACAAAAAT
GCCTTTTGATGTGTTGAAGTAGTTCACCTCATTTTTGGCATCACTTCTTTCTCATTCTCCT
TTCTTTTTTTTTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNGGGGCTTTTAATTCTCACACTTTTCTGGATAGTTT
CTTTACTATTTGGTGCTGATTATTGACTGGTCTATGTCAAAATCTTCTTCTATTTTTACTA
TTTCTTTTTCATTTGATGAATAGATACTTGACCATTCAGTTCGTCCGGTTTTTCTGGAAGA
GGGAATCAAATCAAAAGGCAAAGATTTTGCGGGTATGTTGACTTACTCATCTTCCTTTTC
ACATTGATCAAATAGTTGGTCCTCCTTGAAAATGTGTCGGAAGAGGGAAAAAGGATCAG
GTTACCATGTCCTGAACCGAGAAAGGGGGATGGAGAAAGGGAGTTCACTGTTTTACTTT
GTTTTGTGAGGATGAGCGGTTCTGTCATGTTTAGGTGCTCTAAAGTCTCGCCTATTCTC
TCGATTGTCAAATTCTGAACTGTATATTTTAACTTTAGGCTGGTTTGTGATGGTGTCGGT
GTAGGAGTCTTGGTCTATAATCCCTTCCCAATAAAAAATTAACTGAAACTTCTCTTAAGT
TTCTAATTGATCTTGGAGTAGGTGTAACTTGGGCACAATTACAAAGAGGAGTTCTTATGC
ACCAATGACTAGATTTCAGCTACTAGAAGTAAGAAGTAAGAGAAGGATGTGATATTAATA
CTTTGCTATTCTAGTGAGATAGTATGTACATAATATTTTTTTGAAAGAGAGTTCTGGTGA
GAAGCTGATATGTTTTTTTTTCTTTTTTCCTTTGTGTTGATACAAGATCTTCTTAACAATCA
AAAATATCTGAAAAGCTTTTTCCTGGATTCGGCCATTCGGGATAATAACCCACCCATTGC
CATAAGTTCTGGGATCTATAAAGTAGTTGTACTGGGTGTTTCAGATATCTTTGTGTTTGT
GAGAATGCACAGGAGAAATCTATTCTATTGATTAAAGTGTTGTACAACCCTATTTATATA
CAGTAATTACATAATAATAGGTATCTACTTCCCGATGTGGGACACTATAATACGAGAACC
AGTAAGAGACTTAGTGAAAATATCTGCTATGCTAGCTAATCATTCTACTTTACAAACTTT
GTAACAATATCTCCTGAGAGTATCTTTTCTCTGCCAAAGTGACAGTTGATCTCAATGTGT
TTAGTCCTCTCATGGAACACCAGATATGACGCAATATGAAGAGCAACTTGATTATCACA
CGCCAGTTCCATCGTGCTGATTTCTCCGAACTTCAACTCCTTAAGCAACCGCTTGATCC
AAACTAGCTCACACGTTGCCACAGCCATGGCCCGATATTCGGCTTCGGCGCTAGATCA
AGCAACTACATTTTGTTTCTTGCTCTTCCAAGACACCAAATTACTTCCTACTAGAACGCA
ATATCCAGACGTAGAACGTCTATCAGAAGGTGATCCTGCCCAATCAGCATCTGTGTACC
CAACAATCTGCTCGTGGCCTTGATCCTCGAATAGTAACCCTTTGCCTAGAGCTGACTTT
ATATACCGAAGAATGCGAACAGCTGCAATCCATAACTGACTTACAACACTCACCGGAAA
AGAAATGTCAGGTCTACTCACAGTGAGGTAATTCAATTTGCCAACCAACCTCCTATATCT
TGTAGGATCTCTAAGAGGCTCCCCCTGTCCAGGCAGAAGCTTAGCATTCGGATCCATA
GGAGTGTCAACCGGTCTACAGCCCATCATTCCCGTCTCCTCAAGAATGTCTTAAGGCAT
ACTTTCGTTGTGAAATAACAATACCTGAGCTAGACTGAGCGACAGTCTCGAAGTGCTGA
AAGAGTTGCTGCTTCAGATTAGTAATACCATCCTGATCATTGCCAGTAATAACAATATCA
TCAACATAAACCACCAGATAATACACAGATTAGGAGCAGAATGCCGATAAAACACAAAG
TGATCAGGTTCACTACGAGTCATGCCGAACTCCTGAATAATTGTGCTGAACTTACCAAA
CCAAGTTCGAGGTGATTGTTTCAAACCATATAGTGACCTGCGCAATCGGCATACAAGAT
CACTAGACTCCCCCTAGCAATAAAACCAGGTGGTTGCTCCATATAAACTTCTTCCTCAA
GATCACCATGGAGGAAAGCATTCTTAATGTCTAACTGATAAAGAGGCCAATGACGTACA
ACAACCATGGACAAGAAGAGACGAGCAGACTTTAGCCGCGGGAGAGAAAGTGTCACTA
TAATCAAGCCCAAAAATCTGAGTGTATCCTTTTGCAACAAGACGAGCCTTAAGACGATC
AACTTGGCCATCCGGGCCGACTTTGACTGCATAAACTCAACGACAACCAACAATAGACT
TACCTGAAGGAAGAGGAACAAGCTCCCAAGTGCCACTCGCATGTAAAGCAGACATCTC
CTCAATCATAGCATGTCGCCATCCTGGATGAGATAGTGCCTCACCTGTAGACTTGGGAT
AGAGACAGTTGACAAAGATGATATAAAAGCATAATGAGGTGATGACAGACGATGATAAC
TTAAACCGACATAATGAGGATTAGGATTAAGAGTGGATCGCGCACCTTTCCAAAGTGCA
ATCGGTTTACTAGGAAGAGGCAAGTCCGCAGTAGGAGCAGGATTAGGTGCAAAACGTG
AATCAGCTGGGCCTGATGCTGGCTGCGGACGACGATGATATGTCAAGAGTGGTGTTCC
TGTGGCGGGGAATCTAGGAGGAGTCTGTGGCGAAAGGTGAAGGAGGAGCTATAGTAA
GCTCCTTAAAGGTCGATACAGGTAAGACCTCAGATATATCAAGGTGGTCAGAAGAGGTA
AAGAAAGGTTTAGACTCAAAAAATATGACGTCAGATGATATAAAGTACTTATGAAGATCA
GGTGAGTAACAACGATATTCCTTCTGAACACGAGAATAACCAAGGAAGACACACTTGAG
AGCACGAGGAGCTAACTTATCTTTCCCAGGGGCTAAGTTATGAACGAAGCAAATGCTCC
CTAAAACACGAGGAGGAACAGAGTATAAGGGTGATTGGGGAACAATACTGCATACGGA
ATCTGATTCTGGATGGGAGATGAAGGCATCCGTTTAACCAAATAACAAGCTGTGAGAAC
TGCATCAGGTGGTCAGAAGAGGTAAAGAAAGGTTTAGACTAAAAAAATGTGATGTCAGA
TGGCATAAAGTACTTATGAAGATCAGGTGAGTAACAACGATATCCCTTCTGAACACGAG
AATAACCAAGGAAGACACACTTGAGAGCACGAGGAGCTAATTTATCTTTTCCAAGGGCT
AAGTTATGAACGAAGCAAGTGCTACCAAAAACACGAGAGGAACAGAGTAGAAGGGTGA
CTGGGAAATAGTACTGCATACGGAATGTGATTCTGGATGGGAGATGAAGGCATCCGAT
TAACCAAATAACAATCTGTGAGAACTGCATCGCCCCAAAAACGCAACAGAACATGAGAT
TCAATGAGAAGTGTGCGAGCAATCTCAATGATGTGCCTATTCTTTCTCTCTGCAACCCTA
TTTTACTGAGGGGTATAAGAACAAGAGGTCTGATGAATAATTCCTTGAGAAGTCATAAAC
TGCTGAAATTGAGAGGATAAATATTCTAAGGCATTATCACTGCGAAAAGTGCGAATGGA
AACACCAAATTGATTTTTAATTTCAGCACAAAAATTCTGGAATATAGAAAACAACTCAGAA
CGATCTTTCATTAAGAAAATCCAAGTAATCTTGAATGATCATCAATGAGACTAACAAAAT
AACGAAATCCCAAGGTTGAACTGGCTCTACTAGGACCCCATATATCAGAATGAACTAAA
GAAAAAACAGACTCTGCATGACTCTCAATACTACGAGAAAAGGTTTGGGAATGTTTTCC
GAGCTGACATGACTCACACTATAATCTAGATAAACTAGACAAACTAGGCACCATCCTCT
GAAGCTTGGATAAGCTTGGATGTCCTAAACTTATGTGAATTAGGTCCGGAGGATCTGTA
GCTAGACATGTCTTGGAGGAATTGAGTGAGTTAAGGTAGTAAAGGCCTTCTGATTCAAG
TCATGTTCCAATCGTCTGTCCCGTACTACGGTCCTGCATAATAAAAGAATCATCAATAAA
ATATATACCACAATGGAGGGCACGAGTCAAATGACTAGCAGATGCAAAGGACAGCCAG
GGACATAAAGAACAGAATCTAGAGTGACAGAGGGTGGGGGATTTGCTTGTCCAACTCC
TTTTGCTTTAGTTTGAGACCCATTGGCTAAAATAATAGTGGGAAGAGACTGTGAATATGC
AATATTTGACAAAAGTGATTTATTACCAGAGATATGATCAGAAGCTGCTGAGTCCACAAC
CCATTATCCAAGAGTACTAGACTGGGAAACACAAGCAAAAGAATTATCAACAACAGAAG
TATCAGTCTGAGCAATAGAGGCTACTTGTGGAGATGTCTGCTTGCTCGATACGGAGGG
AACTCATTATATTCCCCTTCAAATAAAGAAAATCCCTGGTTACCTGTAGTCTCGGTCTGA
GCAACATAAGCATTTTTGAGTGGACGACCTTGTAAAGAATAGCACACGTCACGAGTGTG
TCCAAGTTTATGACAATAAGAGCAATTGGGCCTAGATCTTCCAAAACGACCACCTCCTC
GTCTATTCTCCATAGTTTGAGATACCTGATTGTCCACTGACTGGGATACGAGAACAGAT
GAGTCAAGTGTCTGTGATGAGCTTACTAGGTGACTTGGTATTGCAGCAAGGCGAAGTAA
TTGAGAGAATAATTCATCAACATTGGGGACAGTCGGACTAGCCAAAATCTGGTCACATA
CTGAATCAAGGTCATTAGGGAGTTCAGCGAGTGTAAGAACTAGAAACATCTTCTGTCGT
TGCTCTTGTTGCTTTTCAATACTAGCAGAAACTGACATCAATTGCTCAAATTCTTCCATG
ACTGCCTGTACTTGTCCTAAGTGTGTAGACATATCCAATTCCTGTTTCTTCAAGCTTGTC
ATTCGCGATATTACATCATAGAAACGAGATAGTCATTAGTGTATAAAGTACGAGCCTTCT
CCCAAACTAAATAACATGTCTGGAATGGATGGAACAAAGGCATCAACTTGGAATCAATA
GATCGCCACAAGATACTACATAACTGAGCATCGACCTTCTCCCAAAGTGTTTTGGCCTT
TTCATCACCGTCGCTAGCCCTTTTTGTTAAATGATCTTGAACTCCTTGACCTTTACACCA
CAACTCGACAGACGAAGCCCAAGCTAAGTAGTTTGAACTTCCCATTAAAGGTTCTGAGG
CAATCATAATACCATAACTTCCAGAACCCGTGTTTTTAGACCCAAATACATCCACTCCCA
AAGACATTATTGGATTGAAAAGAGATCTAGCAAATTAGCACCAAATAAAACAAAGAATCA
ACTGTGGTTGCCGAAAAACTGCCGGAAAAAATACTGTAGTTGCAGGAAAATTTTCAAAG
TGCTCGGAATCAAAAAATAAAAATATGGGAAGGCTCGGAATTGCAGGGCGATCAGACT
GTTCTAAAGAAGTTTTCTGAAAAAATGGACGGAACGGGCTCCACGCGCCGGCGCGTGG
AGTAGATCTTGCCGGCGAAAATTGTCTTCGGGCGGCGCGTGAGGCGGAGTCTGACGG
AGTTGTTTGCTGGGGTTTGGTCGCCGGAGGTTGGGGACCTTATGGTGGTGTTGGTTTT
TGCACAACACCGATGGAATTGGTTTTGACGAAAAAATAGCCCTAAAAGGTCACCGGGAT
GAAGCACGTCGACGACTGGGTTTTCATTCCCGGATGTTTTCTCACTGCCGCTCTGATAC
CATGTGAGAATGCACGGGAGAAAAATCTATTCTATTGATTAAAGTGTTGTACAACCCTAT
TTATATACAGTAATTACATAATAATAGGTATCTACTTCCCGATGTGGGACACTAAACATG
ACTAACTACTTAACAGTGTTGAACATGGGTAACCCAGGGGGGTCTTCACCTAATTCCAA
AACTGAAGTGAAAAGAGGAAAAAGAAAGCAGGCTTGTGAGGTTCCGCGTTCCTCTCCA
TCTGATATGACCTTTCTATATATATATATTCTTTTAAATGATGCTTCCCGGCTAGCTTATG
CGCACCTCGATTATTCTATTTAGTACATGCTACCTCCCATCAGAACATGCACAAGGTAAC
TCTGTCCATCAAGGCTTAGGAAAATAGAAGAAATCACCTACTCTCTCCGTTCCAATTTAT
GTGATCCTGTTTGACTGGGCACAGAGTTTAAGAAAAAATGAAGACTTTTAGAATTTGTG
GTCCTAAACAAGTCAAAAAGGGGCCTAGATTATTTGTGTGGTTATAAAAGCTTCTCATTA
ATAGTAGAATTGTAAGTTTAAGCTAAATTGTTACCAAATTTAGAAATGGGTCATTCTTTTT
GAAACGGACCAAAAAGGAAATAGGTTCACATAAACTGGAACAGAGGGAGTAGTATTTTT
TGTTTCCATTGGGATTTGTGATTGAGATCTCATGGTTATGTATGCATATGTTGTGATGTA
GTTCCATATCTTTACTGTTTAGATGGTTGACAGGGAAATAGTAAGTTCTTTTTTAACTTAA
TTACGAAAATAAATTGTCTTCTTTTATTTAAGCTTATGTGACATTATTTCCTTTTTAGTTTG
CTTATAAAAGAATTAACCCTTTCTAAGTTTGGAGAACTAAATGTTCTCATTTTACGCTTAA
TGGCAAGCATTTATAGCCACACATTCGTTCTGCATATTTAAAACCTGAAGTTTCAAAAGT
CTTATTTAACCATAGCATGTTTAAGACTACAAGTTTCAAAAGTCATTCTTTTTTCCTTATTA
AACTTCCTGTGTAGTTAAACAAGGTCGCAATAAAATGAAATAGAGGGAGTACTATATTTA
AGAATGGTATCATACTTGGTAGTTTTTCTCTTCTCGTCTCTCCTTTTTTGGGTAGGGGAT
GCATCAAGCTGTAGGTTCAATTGTTTATAACTTTTAAATAGCTAAAGAACTTGCTCTAGT
AGTGATTTCGGGGGTAAGATAATCTTTGTGGTTAAGAATCCATTGAATATGGGAATAATA
AAAAAAAAAAAAGAAAAAGAATGCATTGGATTAGAGATCACAACCATTTTTAGCCGATAT
TGGCTGCGATGTGTTCTGGCTAATTTTTTCTTGAAAGGTAACAGAGGATCGGTTCTGGC
GAATTAACTAAGGTTCATTTTAAATATCACCCATATCCAACCTAGCCACCCATCTGTCAC
GTATAAAACTTATTGTGGACAAAAATAGAATGGTCACTTCCTTTTCCACCCACCGCTAGT
TGCGACGTGACAACAAGGAGTTGTAAATGTGATGTGATGGATTTTAGTTATAGCAGGAA
CTTAAATGATAATCTAGGAAGCTACAATTTTGCTATGCTTCAAACAACAGTCATATCTCG
TATTAAAACTCAGCAAGTCTTGCTTTTTTAGTAATATAGGTTTGCTAATATTTCAAAATCC
TATTTTATATTTTCTGCATGTATTGGATCTCTCATTGCTTTAAGATATAAGAAATGGTAAT
CTTAAACTATGTCTATGCCATCAGAAAGTGGATTACCCGCTGTCGTTGGATGTATATGAT
TTTTGTTCGGAAGACCTTCGCAAGAAACTGGAAGGTCCTCGCCAGGTACTGTCTTTTTC
CCATTGATCAATGTCTTTTAAGAAATGAGGAAAGACCAGACCCTCTTTGGCCCCTCTTTC
TCTTTCTTGTTCTGTTATTACATGACTCTAAATTTGCTGCTAGGTTTTGAGGGATGCTGA
AGGTAAGAAGGCCGGTTTAAAAACCAGTGAGAAAACTTCAAGTTCAACTGACGGCGAC
GTTAAAATGACTGAGGCTGAGGTATGAATTAATCTTTGTAATGTAGGAGTGACTTAAGG
GGATAAAGAGGGACCTTTCGGGCCCACCTATGGCGGATTGTACAAGGTGGTTTAAAGT
GGGAAATTAAGAATGTCTATAAGTAACTTCTGCCTTTTCTCCTTTTTTTTCCTATTGTTTA
TGCAGGAATCATCTAGTGGAAGTGGAGAAGCGTCTAAAACAACCCAAGAAGGTAGAGA
AACACCTCCTTTTCTTGATAACTTGATGACTTGATAAACATATGCTGCTGCTGTATTTTAA
TTGGTAACAATGTCTGGCATTAAAATTGTAATATTTGGGAGAGAAGTTATTGTCATGAAA
TTACCTTCCAACTCACATATCCTTTTCAATGCTTTAAATGAAACTCTGTTAGTTAATTGTC
AGATATAATTCTGCAGTAATATTGCGGTTCCATGAGGTTTTACAGTTCTTATGACAACAA
TGTTCTGCCCTGGGTATTCAAACTTCTTTTCACAAAGTCACTGTTAGTATCTTTGATTACA
AGCGATTGACCTTCTATTAACAATTTTGGGATCCCATAAAGATTATTAACTTGGATCAGA
TTTATTCCTTTTTAAATTACTATATGTCCCTAGACACCGGCGGACGTTTGCCAAGTTTCT
TTTGAAGGGGGCGCCTTTTAATTTTTAGAACTATGGAAGATCCTTAAGGTTTAGCTCTGC
TGATACAAGTACATTTTAATTTGTTTGGACATTTGTTTGTATGTAGAAAATGTAGAGAACC
TCTAGCATAGATAACCCCGTACTTGCCTTTGAATTTATATAACTACGAAAGATTCTAAAG
GTTTAGCTCTGCTGATACACTTACATTAAGTTGGTGTTTTCTCGGAGTCCTTTAATTTGTT
TAGACATTTATTTGTATGTGGAAAATGTAGAGAACCTCTAGCATTGGATAACCCCATAAA
TTGTCTTAAAAGAAAATTTTCTGAGTATTGGAATGAACTAGGGCCCAGATGGAGCAGAA
TGAATGTCGGGGATTAAATAAGGGACTCTAACTGGTTCTGGATTGGGAGCAGTAGTTGT
GAATGATCAATTTTATCAACAGTTTAGTGTTTCTGATATATAAGGGGAAGTTAGTCTAAG
AGCTCAATTTTGGAATTCTGTTATGAATGAGGAGTCGCAATGATTAGTGCTTTTTTTTTTT
TAATGATGCCCGATTAGTAAACTCCATTTCAGGTGTTCTGCCTGAGAAGGAACACCACT
TGACTGGAATATATGATTTGGTGGCCGTGCTGACTCACAAGGGAAGAAGTGCTGACTCT
GGGCATTATGTTGCCTGGGTCAAGCAAGAAAACGGTCAGTTTAACTGGGAAGAGATTTT
GTTCTAGTAATCGTTGCTCTTGGACTACCATCTGATACAATATATTGAAAATCTCTTTGTA
AACCACAGGAAAGTGGGTTCAATTTGATGATGACAATCCAATTCCGCAGAGAGAAGAG
GACATCCCTAAACTTTCAGGAGGTGGTAAGTGAATCACTTGTGTATTACGTCTTCGGCA
AATTTTCAAAGTCTGGCAAGCATATCCTTTCTTATAACAACAAGATGTAAAGCAGATGGA
ATATTTTGTTGCTTGTGTGCCTGAATGTGTTTTTCGTTCTGTCAGTTTATAGAAGTGCTTT
ATTTTTGGTTTCAGGTGATTGGCATATGGCTTATATTTGCATGTACAAGGCCCGTGTTGT
TCCCATG
SEQ 52
CTATTTCACTTGATGCAAGGAAAATTGATTACTCCTGGCACTACGAAGAAATGTTTGGTT
AAAAGGACAAAGGTCAGTGCAAAAACCAGAATAGTTCATTGCTACACTAGGTCCAACTA
GCCAGATAGATGCACACAATTTGCGGCAGAGATAACTATAAGACACCACAGTGCTTAAC
TGCCTTTTAACCAAAGACCAAAAACACTCATGAAGAAGAAAAATGACAACCTTTTTATAG
CTATGGACTGTCCATTTCATACGATTCCAGTTGTTAATACTTTATGCATGAGTCAAATAG
AATTTGCATACAAAAAAATAGGTCCATTTGCAACAAATCCGAGTATAACTGAATGAACAG
ATGAGAGCCATTAAAACCTTAATGTCAAATCCTACAAAACAATTGGATCATCTCCACAAT
GCATGACGTAATTCATCTCATGCTGTAAATATATCAGTCTGTGGTTAATATGAAAGTTAT
AGATTAAAGATTCACAAGACACAAATGTGCCTCCTGAACTTCCTAGAGCAGCATATCAG
CTAAAGCAAAGAATAGAAATACACTAATACAGAACAGCAAAGAGAGACTTAATAGACCT
GAGTTCCATATTTTCAATGCGGATACGTTCAACATATAAGTCATGATCTTCTGCTTCACT
GTTGATCTCATCGTCTTTTTTGGGATCCTTTTTGCGCCGTGAACTTGATCTCCTCGAAAA
TAGTCTATCAATCCTGGCAGCATAAGAACCTAAGAGTGCTAAGACACAATGCATTCACT
AAAACATTGAAAAAAGGCCCAAAACAGAGGCAAAGTGGAGATGAACTTTAGAATTCTAT
CTTCACATTTTCACCAGAAGGGTTCAGAAAGTTATAGGAACTTCATTTCAAGTACAATGC
AAACAGTAAACAGGTTTTTACTATACTAAAGTATCTAAAAGTCACTTACCTTGTTTTTCTC
AAAATCCATGCCCAAGTGCGTCCAATAATCATGTCCACTTTTTCCATGAAAAAGTAATCA
GTAGTCCTATCAATTCCAATGCCGCGTCGAAAGATTACATACTGGAAAGAAAAGGTACG
TATATAGAGATTCAGTGGCATGATCAAACTCTTGACTCTCATCCTTTTCCCTCGAGAAAT
ATATGTTCATATTACACATTGATATACTTAATAAATTTCGAGGTAAAACATTGATATACCA
TATAAGTTGAACAGCTCAAATACAATATAATACATCCATGATACTATTTGCAGGTTCAAAA
AGATATAAGTGACAACACGTGCACAAGTCACTCTTTACCTTTGTTGAAGCTTGTGTGCA
CTATTACAGGTTTGAGGAGCAGACCTTGAACAATGATTAATCTTTTGAAAACTAGCAAGA
AGATGAGAGGATAAAGATTCCAGTTTGTGCTCAGTCATAACACACTTTTTTGAATCCGTG
GTGAAAATAAAAATCTTTCATCAAAGAAAAAGAAGAAAGGGCCAGTAAGCACAATATAAC
TCTAAAGTTCTTCGCAATAAATGATCGCATACTTTTGCTTGTGTTTTATTGGAGTGCAAAT
TCCTTCTTGCCAAAATCATGGGCAGTAATGGACATCCTGACACCTAAGAGCTAATGTTG
TATAGGGACAGTGGTTACTTCAAGATTCACTTTAAACAGCCAATCATACTTGCTGCAAAG
CAACAATGAGATAGGGAAATAAATGGACACCATATGTGGATATCAAGGATTCAAGTCGC
ATTCTTGAACCCCAAACAATGCAAGGCACCAACAAAAAGCAGATTGGGAAATATTCAAT
TTGGTGAAAATTTCATAAATACAACCAGAAGCTACACTTGTCGACCTAGCATCTGGGAG
CCATTAACTTCATACCAAACGTTCCAGCAAAAGTAAAGGCCTCATTTCAAATTGTAACTT
GAAGCTACACTAAGAGATATACCTTGTCAGCAAACTCTGGAAGGTCCTCATGAGGATGC
TCTGCAAAATACTTCTCTAAAAGTTTTTTGTCAAGCTGCATCCAGGAAAAGGTAGAGTGA
ATCGCATAGCGAACATGACAAAATGGAACTATATCAAGAGAGCCAATAATTGTCGGAAC
CAAATTATATGCGGAAGCAGCTGGTTAAGAACCTTAGATTCATCAACAGTGATTGGAAG
ATTTAGAAGATACTGTCCTGAATGTGCAACATCAATCTCTTCATCACTAGCTATTTTAAAA
TTGCTTTTGTGCATAATCTGCATCAGCATAAGAGAAATGATGATAGAAAGTGGAGAATG
AAATAAGGGTGCCAGAACACAAATGTCATACCCATTTCCCTTAATCCTCCAGCTAGATG
TGGAAAAGCATTCAGAATTCACTGCACTTAGAGTTTGATGCAAGTTTAAAATCCATTACA
GGCTTGGCCTCAATAGCATGATAATGCAGTGCAAAATAACGGAAAACGACTTTAGCGAT
AAATGAATAATATATAGCCGTTCCCCATGCTTGAAAATAAAGAAATATCCCCTTCCATTT
CGATTTTGCAACAAAACAGCATAAATATGCGAAACTTTCTTATTGTAAGGCTACTCCTTA
TGATTAAAAATGGGAAGCATGTGTTAGAAAAAACAAAGTAAAAGAGAAAAAAAACAGGA
AAATGAAGGACGAAGCAACTCCCACTCCCTGAAGACAAAAACCTCGACCTCTTACCTTT
CACCTCACTGGTTGTCTCCCTCCCCTGAGAAAAAACAATTATTCGCTATGTCCCAATTTA
TGTGATGCACTTTCCTTTTTAGTATGTCCCAAAAAGAATTATACCTTACTATATTTAAAAA
AAAATTAAAACTTTCCATTTTACCCTTAATGAGATAATCTATAGCCACAAAAATATCTATG
ACTTGTTTAGACCACAAGTTCCAAAAGCTTTCTTTCTTTCTTAAATTTTGTGCCCAGCTAA
ACAACATCACATAAAATGGGACGGAAGGAGTAGTTTTTCCTCTCAATTTAATCCAATAGA
ATTTTCTTCCATTTCAGTGGAGATCTTACACTTAATAACTGATGCAGGATTATATCTTTTG
TTATATTTTTATTTCCTAGGCTCGGGAAGGAGAGGATAGCATGTTTTCACCTCCGCTGG
GGATTTTCTTTTGAAGGTACATAAATGTCAGGGCCCTCATACAAGGGTTGTAGATCGAA
AGTATTCTAACTATCTTCTTTTTTTCGGGTAAATTGTTGAGCCATGGGTACACACATACA
CAAATATATAGAGAGAGATATCTTCATATAAGCAACTGCAGACTATAAATATGCACACAC
AGAAGACAGATATGAGATTGGATTTTTTATCAAATTTAAACTGGAACACCAATATCCCGA
CTAAAGATACGCGGATACAATAGAATATAGCAGCTGATGGTGTTACTGAAAGGTCCAAC
ACGGGTGCTCAAAGTCCACTTGGGCAAATGCGGCTAGGGAGTCAGAAAGAATGAGATG
GGCGACGGACTTGCCCTACTGTCAGAGCTAGTGGTGTGTTCTTGAGATGAAGGCTTAG
CATTCACCACCTTCCACGTTTCCGACTTTCTCTTATTTGAACTTTTTAGATCAGTCAACTT
GAGTAAGCAACTTGGGCCAAGGGATGAAACTCAAGTCTCTAGCTCTCAAGTATACTGAG
TAGGCAACTTGGGCTTGCATGTTATACATATTTTTTTAACTAGTATACTGAGTAGGCAAC
TGGGGCTTGCATGTTATATAATGAATCCAAACAGCCCATATAGCTGGCTCAGGTACCAG
AGGAAAACAAACTTTAGATGAAGCTGCAGATTTTTTACCTGAAACAAATATGTCAAGAAA
TTCTGTTCAAGAATGTCAATCTCTTCTGGAGATAACTTCTGTTGTTCCAATTTCTTAGCC
CCGTTCACAGGATCAAACAAAGAGTATAATTGCTGCAATCAGACAAATATTATTGATCAA
AACACAATATCAACATCAGCATATTATATTCATTTGCAGCAGACTGATGAGAAAAGATGT
CTGATCAAATAAAAAGTAGCATAAGATTATATTACCATGAGATCCTCAAATTGTAGAAGA
TACCAAGCATGAATTGTGTACTCAACCCTCTTGCAAAGCTTCAGAAATTCAGCCCGGTC
AGAACTATGTTCTGGGCAAGGAGGAAGGTCGGATCATTAGAACCTCAAAAAAGACTTTT
ACTTAAATAACCAATCATAAAAAAGCACTAAAGGGACTAGAAGCACAGATATTCATGCAG
CATTTTACAGTTTCAAGAAAAGTAGATTTTCTGAAGAAGAAAAAAAGAGTGGAAATGAGC
TGCCAAGACGCAGACATGATTTACGTATATCCAGAAGGAAAACTTGTTTCTGTTAAAAAT
TCTAAACATATAACACGTCAAAGCTTATCAGTGAAGGAAGGGTTTTGAAACCTTTTGGG
GATTTTGATCCATGACCATCAAGTGTCTCCGTAAAATTCACATGGAGGGTGTACAGGGT
CCCCCCTTCTTTTGAACTGGGGTCTCCTCCTTTTGAAATGTCCAGCACTAATTTACTGCG
TTGTTTTTACTGTGAGTAATACTGTTACTTATCTATCCAAAAAATTACATAGAGGGAAGTC
GTTTCAAAAGCTTACAATTTCTTGAAATCCATGAGGACTTGATGAAAAGATGGAAGCCTT
TCTATTTCGGTTAGAGTTTATATGCTCGAGTGACCGTGTAAGGAGAAGTGTAAGATTTA
GACAATTCACGGTTATAGAGAAACCCTTATTTGTCCTAAAAGACAATTTATTGAGTAGTG
GAATGAAACAAGCCAATGGAGCAGAATGATAATGAGAAAATCACATAACCAACTCCAAC
TAGTTATGAATCAGGGTGGAGTTGCAAAAGTAAAATCTCTATTAAGCATCTAATTTCTTA
AATTCAAGAGATTGTGAACTTCTGAGTTGAGTGAAAGTGACTTACTCTAAGAAAGTATTT
AATGATAAAACTCGCAATTGAGATAAGTCAAATGGTAAACCTTGGAAAATAATCTAATTT
AATCTGGGAAAATATCTAATTCACACAACAGTAAATCTAATTAATCTGGGAAAATAATCA
AGGTCAAATTGTCTGCTTTGTCACAATACAATTAGAGTGCCTCAGAAAACAACATTAAAT
TTTGCAAAATTAAACCATTAAGGGGGTTGCTGTTTACACGAAAAAAGCTAAAACTCAGAC
GTCACCAAAACTATGAGATCGTGTAAAGTTAGTAACATTTTAGGTTTAAATTCATGGCAC
GCAATAATTGCTTGACATTCAGGAGAAAGACAAAAAAGTGACTGCCGTTCCCTATAGAG
TCTAGCGTAGAAAGTTGTATTATTTGCTTCAAAGATCTCCTCTTGGAGAAAGAAGCATTG
TTTCCTTACACTGCCATTTTCTATGGTCTAGTATAGGAAGTTCTATTGTTTGCTTTAAAAG
TCTCCTCGGATTATAGCAGAGTTGTACTTGTGCTAGTCAAATTTGGTAGTTTTTTCCACA
GAAGTAACAGAAAGTGAAAATATCAGAGATATTATCCAATTAAATTAGAGGAGAAGAATT
GTTTCAAAATTCAAACAACTGATCGACCACAAAATAGAGAGGGAAGAAAGAGTAGCCGG
TTGTTTGGGAGCATCCAATTTTTACCTATAAGGTCGGCGAGGGCCATGATGAGCCTGG
GTTTGAGGACTGGAATAACAGACTCGCGCTCCAAACGGATCACCTCTTTCTTCTTCTCC
AT
SEQ 53
TCACCGGTTTGTGACAACTGGATTTCCGTTCGCATCAGACAAATGGATACTCTGGGTTG
ACTTGTTAGCGACAACAGGATTTCCAGACTCATCTAAATTAATTATAGCAACATCACCAG
GCTTGAGATCCCCAGAAAGGAAAGATTCACTCAGGAGATCTTCAACCATTTGAGTAACA
GCCCTCCTAAGAGGGCGTGCACCGTAGTTTCTGTCAAATCCTTGTTGGCATATAAGCTC
CATTACTGCTTCTGACACCTCCAAGCTTATTTCCAATGAAACAAGCCTAGCCCTCACCTC
CTGCAGCATCAGGTCTAGTATCTGGAGCATCTGAAAGGGAAAAAACAAAGCAGTTACTC
GCAGTAGCCGACTGCTTTCTAAAGAGATGTGCCAGAATAAAAGATCACCCCAATGTACA
TTCAGTTATCAGATCAATGCAACTTTCCAAATCAAACAAGAGGTATATTATACGAACCTG
GGGCTTCTCTAGAGGACGGAATACTACTACTTCGTCTAGCCTATTCATCAACTCAGGGC
GGAAATATGTCTTGAGCTCTTCCATCACTATTGCTTTCATACCAGCATAGGAGGCTGCT
GATTCATCATCAGCAAGCAAGAAGCCAATAGTATTCTGTCTACCCTTTACTATGGCTGTA
GAACCCACATTAGAAGTCATCACTATCAGGGCATTCTTAAATGACACTCTTCTTCCCTGT
TGAATCAAGTTTCATAATTATCAACGCCAATCTTCCAAAACAAGGTTTACACCCCATATT
GTGGAACATTCTAGAACAATAAGCCAGATGTAAAAACGGATCTAACCTGAGAGTCTGTT
AGGTGACCATCTTCAAACAACTGAAGGAGAATATTGAATATGTCAGGATGAGCCTTTTC
AATCTCATCTAGCAGCACTACAGTGAAGGGCTTTCTTCTGATAGCTTCAGTAAGTGTTC
CTCCTTCTCCATAGCCTACATAACCAGGAGGCGATCCAATTAACTTGCTCACAGTATGC
CGCTCCATGTATTCACTCATATCCAATCTTAGCATGGCAGATTCCTGTGCAACACAAAAA
GATACTTCACTGAGTACATAGAAATTCACAAAACCAAGTGATTGTTAAACTGAAACAGAA
CCGAAGATAGCAAATATAACTCTAGCAATTTGATGATGAACTTCTAAGAAAAGTAGATGC
ATAACTCTTGATGATGAAATTCTTTTTGATAAAATCCTAGCCACAATTTTCATCTAGAAGG
TTGAACAGAGAATGTCGCTCAGATGAATTTGATTTGATTCAATCTCTGCACTGAAGTCAC
AAACCGTCTACGACCAAGAGATCCAAATTCAGTGTGATGACAAGATTAAAAGCAGCGCT
AACTATATGACTATGATCCATTATCAAAGCTCACTTAGTACTTTTCATTTATTTTGCCAAA
TTACGTGCCTCCACCATAGGTGATTAGAGACTCCTTCTGTTCTTTCTTTTGGGGCGGGC
TGAGAAGGGGTGGAAGGTGCAAGGATCTAGTAAATGTAACAGACTTATCAGATATATAT
ACAATGGTTGACTTCATTTCCCATTGAAATGGATGAAGGAATAATCTGATCCTGGCAACA
GGGAAAGAGATTTGAAATAAGCCAGTAATAGAGCACTACCTAGTACCTAGTATGTCTAA
ACTTGAAGTACTTATTAGTGCCCACCAATAATCAGAAGTCGTGCTCCAACAAGATTAATT
GGTGGTATTTCCAACGCTACATTTGACCATGCCGCAATAGCCATAGAAACCTAGAATGG
ACACACAGCACAGCCTCCGGTACCCTTTCCTTCTCTCCTTTTGTTTTTATTATGTGTGTC
CGACCACCTAAAACACAGTCACCACATCTTACTATCATAGTATAATACTTTCTTTACCCG
AATCACCACTACCAACAACATAGAAAATTCCCAAGAACACCAGAACCAAAAAAAGTCCT
GAGAACACCAGTACTATCACCAAAACCTATCCATTGTCACTGAAGTGGCTGGATTGCAT
AACTTCAGTGGCAAAACATGGTGACTGTCTGGTGAAACAATTAAGGCATCTAGAATGAA
AAGATGAAGCACATTTCTTATCTTACATAATAATTCTTCTCAAATTTAGACAAACTAAAAA
GAGCAAGATTGTGTTTGTGCAAATTATGCTGCCAGAACTCTTGGTCAACACGATTCAATT
TCAGAGTTCCACAATTTCTACTCAATTGCTTAATCTGGAGACGCATCTTTGGAGGAATAA
TGCAAAACAGCTCATTTTATTAATACTTACAGAACCAAAATAAGACGCTGCCAAAGCTTT
AGCTAGTTCAGATTTTCCAACTCCAGTAGGACCACAGAAGAGCATTGCCGAAATTGGTC
TATTTGGGTCCTTAAGACCAGTTCTAGATCTCTTAACAGCCCGACAAATGGCTGCAACA
GCCTCATCCTGACCAACAACCCTTTTTTTAAGCTGCTCATCAAGACCAACCAGAAGCAT
TCTTTCATCAACAGTAAGCTGCTTAAGGGGAATGCCTGTCCAGAGTGAAGCAACTGCTG
CTATTTCCTCAGGTCCAACTACCGGAGGTCTAGAGTAAAAATCAAAGGATTTAAGTTAAT
TTACTTCTGATTTAAATTTATGACTTCAGACAATCTTTATTCTACTATAATGCCCTTGTAG
GAACATGGAGGCAGATACGTACTCATCTTCATCAGATGTAGAAGGTGATGCTGGCTGTA
AATGAAGTTCACTGCCATCATTCAAACGAGATGCATCATCATTTTCTGTCAGCTTGCTTG
CCAAGATCTGTTTATGGAGGGGGCACAGGTAATTGTATTTAGACCACATAAGGAAGTTC
AAATTTTTCAAGACGTGCAAAATCTAGAGAGTTTGTATAACATTGAGTACTATTACCACTT
CATGCATGGCTTGAACAGCTCTAATCTCCTGCCAATAATCACTTGGTGATTGTGAGAGT
ACAGATATCTGCTGTTCCTTTCTTCTTTTGTGAGCTTGCATACGAGATTTACTACCAGCC
TCATCAATAAGATCAATAGCTTTGTCAGGAAGATACCTATCCGGTATATATCTTGCTGAC
AGTTGCACAGCAGCATTTATGGCTTCCAAACTGTATATACACTTATGATGTGACTCATAT
TTCTCACGCAATCCCAACAGTATCTGGACAGCATCCGCCTATATAATTTAGGAAACAAG
AATATCTGTTAGGGCCTTAGACACCCACTTGAGCATACACATGCATAGAGTATTATTCAC
CTGACTTGGTTCATTAATCAAGACAGGCTGGAATCTTCGGGCAAAGGCCTTGTCCTTCT
CAATATGCAATCTGAACTCATCCATGGTGGTAGATGCAATACACTGTTACAAGAACAAAA
TTTAAGCGCAGAGTCCATAAAAGCCAACATTATGAATGCAACTACATGGAGTAATGCAT
GGAACCAGAATAGGAAATTACCTGCAGTTCGCCCCGCCCAAGTGCTGGCTTTAGCAAA
TTAGCAATGTCAAGACCAGAACCCTTATTTCCCCTTCCAACTGTACCAGCACCAACAAG
GATGTGGACCTCATCTATGAATAGAATGATATTGCCTGCAAATTATGTCACAAGCTAAGC
CAATGATCTAAGAATTTGACCAAATTTTACTTCCATTCTATGCTCACCTGACTTTTTGACC
TCCTTAATTAATGTAGTCACACGCCCCTCTAGTTCGCCCCTCTCCTTTGCACCTGAAATG
AGTAGGCCAATGTCTAAAGACATTACCCGCTTTTTCTGTGTATATAGCAAATAACCACTA
GTTAAGTAGGTGCAGCAGCTAGTGGAACAAGAGTATAGGAGATGCATATTAAATTACAA
AATAGTTCAAAGAATCCCTGGAAAAAAAATGAATTCATTGAAAAGCCCACCATTAAAAAT
GCAGGAATATTTCCCTCAGCAATGTTTATCGCCAGCCCTTCGGCTATCGCTGTTTTCCC
AACCCCAGCTTGACCAAGCAGAATAGGATTGTTTTTGGTTCGACGGCAGAGAATCTCGA
TAATTCGCTGAACTTCAATCTCTCTGCCAATTACTGGGTCTATAAGGCCCTCACTCACAC
GGGCAGTAAGATCTACACAGAATTGCTCCAGCGCATTTTTCTCTGCTGGAAATGAAATG
CATTTGAGACAGCGTGCCAAACCAAACACTAACAAGAGAGAGGAACTGGCCATACCTTT
TGCTTTCTCAGCGGATCTGTCGATAGTTATTTTTCCAGGAAAGGATTTCTCACGCGACC
TTTTGAATGAAATTGGCTCTCTACCATCTTTAGCAAGCTCTCCTTGAAGCCTGGAAACTG
CCTCAGCTGCCAAACGATTTACATTTACTCCTAACCTGAAATTAGGTCAGTCGACCATG
CAAATCTTAATTTCTTATATAAATAGCGTGTAATAGATGACAGAAAGATAATCATATTGCA
AGAGAGAAGGCATATTAAACTCAATCAGGAAGGTGAAGAAGGATCCTATTCCCAAGCCA
GTAGCATTGTAAGATAAATTATGCAGCAAAAAGGAAATCCGTAAGCATTTGTTTATTTAT
TAACAAGTACATCAGTATGCTTGGTTTAGAGCACAACATGTGCTCATGCATAGCGTTCC
ATCAGAAACAGAAATTGCTTATCAACAGGAGGTCAAAATATTAAGCTAGGGTTGTCAGG
CGTACAACTTGAAAAAGAATATGTACACATACATGTACACAACACAAGAAGACACCAAAA
AAAAAAGAGAGGCAGATACCATTCATAAGTTAAGAACAGAATACTAAGGTAACAACAAC
AATCCAGTGTATTCCCACAGAGGATAAGATGTACGTAGCCTTACCCCTACCCCGGAAAG
GCTGAGAGACTGTTTCCGGTAGACCCTTGGCAAACCAAAATATCTCTTGGTATTTTGTAT
ACGCTTGGGAGGGAAAACAACACCAAATGGAATGTAAAATATGGCAAGCAAGGATAAAT
GAGTGAAGGATAAATTCTACCAATAAGTTCATTGCTTCCTAGCAACAGTGGGAGACAAC
ATCTCATTCCTCCTCAATTGTAGAAGCTGTTTTGGACTCAAAATAAGCAACAATAGCTAA
TCTATTCATGATCCCAAGATTGTTTCAGCTTACCAAAAGCCTATTACTTTTGTTCTTAATA
ACCTCATCATCCACTACTAGAGGCAGATGTAGTGCTTGAGTTAAAGGTTCATCTAAACC
CATTAACTTTGGTTCAAACTTGTATGTATGTTAAAATATGCACCAAATAAGTACAAATAAT
ATATTTCAAACCAAGAATGGGCTGCGGAACGCATATTCAAATCGTTGATCTGCCTCTAA
CTACCTACTTGCTCTACCAAAAAAACTAGCAGTTCCTATACTGAAAACAAAACTACTTGC
CAAAAAAAGTTAAGAAAAAAAAAAGCATTCTTGACTGACTATCTGTAATACCTCTTGAGC
ACACGAGTGGCGTTACCATCATCAACAGTAAACAAACCAAAGGCCATATGCTCGGGAG
CAATAAAATTATGCCCCATGGTCCTTGAATACTCAACCGCAGCCTCAAAAACGCGCTTC
GTACTTGAAGAAAACGCCACATCAGTAGCCGACGTAGCAGAACCGGAGTCCTGAGAAG
CCAATTTTTCTTTATCATCCTCCACGTCATCATGCCATATGCTCCGAACAGCTTCGCGG
GCTTTATCAATTGTTATTCGAGAACCAAGGAATCCACCAGGGCTACGATCCTCTGCGAT
CAGACCCAGCAAAAGATGCTGTGTATACACCATATCTTTGCCCAAAGCCTTTGCTTCTTT
TTGAGAAAACATCACAGCTTTGATTGATCTCTCAGTAAATCTCTCGAACACTCCAGAGAC
AATATACAAAGAGCGCTTGATTTTACGAGGAATTGAGCTGCAGGGCCTATGAGAAAGG
GAAATTCCAAAAAGGGACGAAGTAGAACTGCTAGTACTACAAGCAGCGGTAGTGGCGG
TAGTAATAGTAATATGAGAGGAGGAAGAAGGGCAATATGGGAAAAGCGAAAACACGGT
TTGACATCTCTTGTGAGGGTACACAGAGCCATAGCGACGAAGCTGAGGATTGAAGCTG
ATTGTTGAGTTCACAGAAAGTGGAGAAGAACACGTTAATTCCAT
SEQ 54
ATGAAGAATATCGAGCGTCTCGCAAATGTTGCTTTATTAGGTATGGTTTCTTTTGATTTT
GATTCATACATTATATCTTTTGATGATAGCTGAATTGCATAGTATACTTGTTTGATTTGTG
CTTGTGAAGTTCAGAAAGTAAAGTAAATCCTGTTTGATTTATAGCTTCGTTTTTTGCCCC
TTAGTTTGTGTTGGTTACTCATTCAGATCATTTTTCCGCTAGATAGATTGCTAAAGCTTTT
GATGCTAATTCTTTGTTGTTAATTGGAAGGAGGCTACCCTTCCAGGGTAGGGGTAAGAC
TACGTACATCTTATCCTCCCCAGACCCCACTCGTGGGAATTCACTGGGTTTATTGTTGTT
GTTGTTGTTAATTGGAAGGACTCAATGTAGGGAAAGGTGCTAATTATTGTGTAGTTGGA
ATTTGAGGTGTGGTTGATGGTTACCCTAAATATATCTATCCAGCATGTTGGAGTAGATTC
TATAGCGGTTGGAATGAATATTCAAATCCCCTTGGGCAGTCACATTACTACTGTTACCC
GCTTTCCTTTATGTCACAGTAGGTTCCACTTCCACAGTTCCAGTTCAATCGGTAGACAAA
GATGGTCATGTGGGTTCTTTATATCAGTTTTAGCATTTTCTTATATGTTGGATGTTTGTTT
CATCATATTGCCTTTTTGAGGACATTTCACTACGTAATAGCAGCTATGCCGTTCTTGGAA
ATTTACAATGTACGATTATTTGGTCATGGCAATTTCACATCACTTTCCAAATTTTATGTTG
ACGCAATTACCTTGAAACTCTTGCTTTTTTGGTGGATTTCAGGTTTGAGTCTGGCACCAC
TGGTGGTGAATGTGGATCCAAATGTAAATGTCATAGTAACAGCTTGCCTTACTGTCTTTG
TGGGATGCTACCGTTCTGTCAAGCCTACTCCACCTTCAGTATATCTTCTGTACTCCAAGT
TGCAGCTTCCCTTTTTCTTAGATCTGTTTTGATGTCACTTAAACATATTCTACTGCTGTTT
TCCAGGAAACAATGTCTAATGAACACGCAATGAGGTTCCCCTTGGTTGGAAGTGCAATG
CTCTTGTCATTGTTCTTGCTTTTTAAGTTCCTGTCAAAAGACCTGGTTAATGCCGTATTG
ACATGCTACTTCTTCGTTCTTGGCATTGCTGCACTTTCGTATGTTCTCTCCGTATGGATC
ATTCTGTGATGCTTAATATTTTCTATAACAAGTTCTTGAATAGTAGTTTTTCTGTGGGTGT
ATTGGATGTCATCTCTTTCTTTGTGTCTTTGCAGGGCGACATTGTTACCTGCTATCAGAC
GATTCTTGCCCAAAAAGTGGAATGATGATCTCATAATATGGCACTTCCCATATTTCCGCT
GTAGGCACCACCTTTCTTGTCTCTTTTGAAATGCCAATTGATCCTTTAGAATCCTTGGGC
ATACAGATCTCATCTTAGTTATTTTGTTTCGTCTTTTTTCAGCTTTGGAGATTGAGTTCAC
AAGATCTCAGATTGTTGCCGCAATTCCTGGAACCATCTTCTGTGTTTGGTATGCTAAACA
GAAGCATTGGCTAGCTAACAACGTTTTGGGCCTTGCCTTTTGCATTCAGGTTTGTCGGC
ATATCCATCCAAGTTACATTCTCATTCTTCAGGATATCTCAAAATGAAAAGTTGTGTAAAA
TAGTATTATTAGTACAATGGTAATATACAATTTTGGATATTTCAAAGTGAAAAGAGTATCA
TATAAATTGGGATAGAGGAAGTACTAAGACACTTGAATGAAGAGATCATATTTCATCACT
AAAAAAGTTGCACTTATCTGTCCATACATGTTCTCGTAACCAAGCATGGTTGCTCTTTAA
TACCAGATGCAAAGGCTACCCGCCTTTATATCTAGCATTTAAATCCACGATAGCACTTGA
TGGCTTCCTCTTTAATTTGTTTGATAACTAGAATTCTCCCAGGAGTTGGCCTACATTTATT
AAACTATGGGAGTATAATAGGCCTCCTCTATCATGCTCCCACTAATATAGCGGCTCCTT
GTTAGTGATAGGGTTCTAACTCATGACGTGGACCCATATTCTGACATTGCGTCATTACAT
TGAACCGGAGCCCCAGGGGCTTACTATTTGTGATTTTCTATTTATATATACATTGAGTTA
ATGGAGATTTTTGCAAGGGAGAAAAGGTTTGATCCTCTCTTATGTCATGTCTACATCAAT
GATTGATATTGATTTTCCCATTGCGATTTTGATTTTCAGGGTATTGAAATGCTTTCACTTG
GATCATTTAAGACTGGCGCCATACTATTGGTAAGAAAGAAAATTTGTTTTCTAATTTCTAT
CTGTAATTATACATGGCTGACAGCTGTATTCTGTTTATGTGTTTCGCCTTAAACTACATAT
TGCTTGTCTTTTTGAATTTGATGCTAACCACATATCTCTTTATTCAAGCGGGAAGAGAAT
TTCATGAAATGAGCTATTAATGATTGTTATTGTTGAGCTATTAATGATTTTACATACAAAA
ATACATAACATTTGCATGGATTATCCCTAATTGCAGAGTTTTTAGACATTTTTGAGGTATT
CTTTTATGTTGGCATTTTTGCTTGTTTATGCAACTATTTATATCCATTAACTTGTAGCTGA
TGTTGAATGCACATGGTTTTCGAGAATGCAGGCAGGACTTTTTGTGTATGACATCTTCT
GGGTCTTTTTTACCCCAGTGATGGTCAGTGTTGCCAAATCTTTTGATGCTCCTATCAAG
GTGTGCATACTGATTTTCTCATATAGCTATTTCTTTTGAATTTTCATTTCATGCCTTTATTA
GTTACAGAGTCCTGATTATAACTTCGCTTTCTCTGCAGCTTTTGTTCCCCACAGCAGATG
CTAAACGCCCCTTCTCAATGTTGGGTCTTGGAGACATAGTTATCCCCGGTATAACCTCC
ATTTGCGTGAAAACTCCATTCACTTTATGTGGTTAGAACAGAGAGGTTTAGCATTTTGCC
TAGCGGAGGGATCCTCCACCTCAAACCATGTGGTTTGGGGTTTGAGGCAGTAGGGAAA
CGGTGGGAAAAGCCACTGTTGGTCCCTGGAGGGGAAAAAATGGGGGTGGGTGGGGG
GATGAGGTTTACCATATTAAAAATGAAAAGCTTACTTTTTGTGAGTAGCTACTTGAATGT
ATTTTTCTGTTTCTTACACATGCTTATTATTAGCTTTTGCCATGATGCTGTATTTGTTTTCA
TTTTTCAACTTGTTTTTTGCTTGAATAGATACACTTGGTAACATTTGATCACTTCAATCAT
GCAGGTATTTTTGTTGCATTGGCCCTCCGCTTTGACGTTTCCAGAGGGAAGGGGCCCC
AATACTTTAAGAGTGCATTTTTAGGATACACATTTGGTTTGGCTCTTACCATATTTGTTAT
GAACTGGTTTCAAGCTGCACAGGTTGGTGAATCAAAATAAAGCTTTTACACTTTATTTCT
CTTGCTAGAATTGCAGCGCCCTTATGTTTGACTTGGCCTTTGTTTTTTCCAGCCTGCTCT
GCTATATATTGTTCCAGCAGTGATTGGATTCTTAGCCGTACACTGCATATGGAACGGGG
ACGTGAAGCCTGTACGTTTTTTCTTTTGACAATCTGTTTCAACTTCATCCACTTGCTAACT
TTACTGTTTATGTTCTTTATATGCTGTTTACTACCATTTTAGCTACACACTATTTGTAGATT
ATATTTTCTAGGAGTTAATAGATATGAGAAAATGCATCTTATGGTTTACCTTTAATTCATC
CAAAGAAAACATGCATGTCATGATTTGTTTGGAAACTATGGACAATAAGTTAAAGGTAGG
GAAGGGAAGTTTTCTCGTTTCTTCGAGTTGGAAGCAAAAAAGGACGATCAAGAACTTCT
TTTTCGCCTAACCTTTGATAGAGAGAGTACAAACCCAAACCTTCATTGCCTTTTCTAGTT
TATACGGATACAGAGTTAACGAAATTTTCGTTTATGGAAGTAAGATTGGGGTCTATTCTC
AAGTCGTAGAAAATTACATTTTCGTTTTGATGCTGAGTTTGTATTCTTGATTTTCTGTTGA
GCAGTTGTTGGAGTTCGACGAGGGAAAGACGAAAGGCGCTGAAGAAGCCGATGCCAA
AGAAAGCAAGAAGGTAGAA
SEQ 55
CTAATTTTGTTTGAGATCTCTATAGTACTCTTCTAACCACTTTTGAATAATAGCCACTTCT
TGCTTCCTTTGCATGATCAACCAGCCAGGGTCATTCTTTGTCTCAGAACGAAAGTCAAC
ATGGTGTGCACCTAATTAAATAAGAAAAATTAAAACATGAAACTAGTAATAAAATAAATCA
ATGTCTCTAATCAACAGCAATCGTTTTTCATTATGCTTTAATTCAATGAATATCATGATAT
ACAAAAGCATATATTACACAAATAGCCGGTCATATAGCCGATGTACATAGATTATACAGT
AATTATATATAGTTATACACATTTTATATATGAATTATACATAAATTGTACATGCGCTAGTT
ATTTTTAATTTAAGAAATCAGATCAGTGGCTATTTGGGTTAACTCTTCGTACCTTTTTGAG
TTACGAGTGCCACAATGCTAGCTGATATATTTTTCAGCACACTGTCACAAAAAGAGAAA
GAAAAAAATCAATTTAATAACAATAAACAAAGACTTATTCAAGATTTAGAGTCTATATGAG
TTTATAATATAAGTCGAAAATAATAGATTCAATTAAAACAACAGGACTTTACCATCACAGA
TAGATATAGGACCACATTTAGTGGTTTTAATTGTATCTATCTATTGTTTTCCACTCCAATT
ATAAACTCATATGGACTCTAAATCTTGAATTTACGAGTCCTCACATTATTATATTTTTATTT
TTAAATTTTGAACTCACCCTCCTCTGCTCCATGGATCTTGCATTCCGTTAGAGAATATCA
TATTACTGCCAAATCTCTTAAGAACTTGCTCAATTCTCTGAAATATAAACATATTTTTTTTA
ATTTCTTATTTTCGTGAATAAAAAAATATGTAAAGAGAATTTTCATTTGTTAACAAAAAAA
GAATGTAAATTAGAAAACTTACATAGCCACCAAATTCAGTAGTGATCCAATGTGGTCGA
GGCTCTACTCCATATTTCTTTTTGCAATCTTCTTTGAATTCCTTGTAACTATAGGAAGATG
GAGGAAACATGCTTTCATTTGAACAAGTCATTGGCATAACCATCTCTGTACATGCCTTAC
TCAAAAACCATGGAAAGAAACACACTAAAATTAGGCATACATTAGACCAAGATTTCAAGA
ATCATTTAATTTTAATCACACCATAACCAAACTAAAGTTTAATATAAGTACCAGTGGCGAA
TCCAACAATGCATTTACGGATTCGATCGAACTTAGTATTTACAGTATAGAAAAATTTGTAT
ATACGACAACAACATATCAAGCCCACCAAGCAGGGGCGTACGCATGAATTTTTGTAAGT
GATGTCAAAATTTATAGAAGAACGGGAATTATAACTTTAATTATCATACTTCTAGACAAGA
AATAACTTTAATCTATTGGTTTTGTTGAGTCTTAAGTTAGAAAAAGTCCAGAATTAAATTC
TACTTCATGACAGTTTTAACCACATAGATTCTCTAAAATATTTGCAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAGGAAAGCAAAAGAAGCCTGTCAGTTCTCAAATTCAC
AACTTACAATTGAACAGAAACAACCTTCTTTTTTCGTNAGCAAAATCAAGCACATAACCA
ATGCATCAAGAAATAATTTATGTCAAGTGGTGTCATTTTTTTCTACTTATCTGCTTCTTCA
CAGTATTAACACATATATTTAGTAAAAAAATTCGACGAAGCGGTGTCGCGTGACACCGC
TTCGATACATCTGCATCCGCCCCTACCACCAAGTGGGATTTGGGGGGTGGGTAGGACA
TATGCAAACATTATCCCTACTTTTGTGAAGGAAAAAATATTATTTCCGGTAGACCCCCAA
CTCAAAGAAAGATAAAAAGAGATAGAAACAACAAATAGTAACAATCACAAGATAAGGAG
ATAAGGTGATAGAATAATGAGAGATAGAATAACATTTAATAGAAAAATCTGCGAATAAAA
AGCTACAAAAATACCATACTAAAAAAATACGTATAGATAGAGTTAGATATATTTTTCAAAA
TCTACACGTTTAAATTCTGAATCTGCCACTAATAAGTACAAAGAATAATAAATGGAATAG
AATAAAAGGTAATTAAGGTAAAGAATTCTAAACCTGCCAATCCCAACCACGAAGACCAT
GAGCATCATCACCACCTTCTAAATTGAAACATTTTTCTCTTTTTGTATAATTGTAATATAA
ACTTGCCGCAGCAAATGCCCGGCTGATTTTGGAAGCTCCTTTTGGTAATCCATCAATTA
TCTTGCACATCTTAATTTAAAAAAAAAAAAAAGTTTAAGAATTTTTTCACTCGACAATATA
ATTTTTTTACACAATTAGTCATTTATAAGATAATTACAAATAAATTTTTATGATAAGTATTA
ATTGATGAACTGATAAAAATGATAAATAACCTGTTATAATATGTTAAACTACACTAATCAT
GTACAAAATCTCCACATTGTCATTGCATTAAGTCTTATGTACAAATGGTATAAAAAATATT
TATGCCAATCAGATAAATTATAAAGTACCTACAAGTAACCTTTGTGATAAACATTAATTGG
TAAATGGCTAAAAATTGTATCTAGCATGTCATATCAGATTAAACTATACCCACTTTTGTAA
AAATATTTATCCTGTCAATTAGGGGCGGACCTACGTGGGGAGGGGTCACTAGACCCCG
TCAGTCTCGACAAAAAAACTGTATATAAATTTCATATATATCTATATATACAGTAAAGACG
CCTTAAATACTTTATGCGCCCCCCTAAAAGCACAAAAACTGGACAGAGGCACTGGTTTG
TAGGAGTGCTTAATCCAAGTTCGAATCTATGCTCCTACAACTTATATTTTTTATTTTATTT
TTAAAGTGGTGTCGCCGTAATACTCAAATCCCAGGTCCGCATCTGTTATCAATGTATATC
AAGTTCAACAAAAGTTGATCAGGGGCGAAGCTATATATCCCAAAGGGTAGTCAACTGAC
CACCCTTCCTCGAAAAATTTACTTTGCGTATATAGGTAACATATTAGGTTTTAGAGGTAT
ATAACATATATGAATACTCTTTATTAGAGAATTTTTTCCACTTCTTTAAGTTTGAACACCC
TTGAGCCTTGAGAAAATTACTGATTTCGCCACTGATGCTGATAGAAAAAAAGGAAATGTA
ATTAAATGCAATGATTTTGAAACCAATTTAGTGCTAATTAAGCAGAATTGTCTACCTCTTG
TACTGGATATGCTGGCAATGGCATCATAAAGTTGGCTTTAGTAGGATAATTCACCATTG
CTGTATATACAAAAGCTTCCCATAGCCAATCTCTAGCTGAATAAACTGAATGTAAACCCC
TGAAATATAATATAAAACAAATCATTTTATTGAATCAAACTTGACCTTGTAATTTGCTACA
ATAAAAAATTATCATGTTTCTATTTTTATATTCAAATATACTTACTTGCAAGTTCTGAAAAG
TTTACTAACTTCAGTCAAGCCTTCTTCATGTTTTGATAAAGCATCCAACTCTGTCCAACTT
CCCTTTATCACTCTATAACAATTCAAGCTTACCTCCTATTCGTGAGGGCAAATAAATAAA
TAATAATAAGGTCTCTTATCCAAGAAAGATTTTTATGAATATATCCTATTCGAGATAAAAG
ATAATTACAGTAATCGCTCATAATAAAGTGAGATTAGTAATCTGCAAATAAGTCAAGTTA
CTTGTTACAACATGTTGAAAATATTGAGTGTAAAAATTTATTTACATTGTCAATAACGGTG
GATCCTGATGGGTTCATCCTCTAGAGTTCAAATAATTTAAGGGTTTGTTTGGCCATGATT
TTTTTTTTTACTTTTTTTTTGGAATCAGTGTTTGGCGATGAAAAATTCTAACATTTGAATTT
CTAAATTTTTTCGAATTTGAAAAACTTCAAAAAACTATTTTTCAAGATTTTCACTTCAAAAC
ACTTAAAAAAATTTAAAAACAACCCCAAATTATATTCATGTCCAAACACAATTCTAATTTT
AAAATACCATTTTCAACTTGAAAACAAAAATTACTTGTTTAAGGAATTTCACAATTCTTAT
GTCCAAACACCCACAAATTTGTTGCGCTTTTAGAAGTAGAGTTTCAAAATTTAATATTTAT
TAAAATTTACAATCTTTTTGCATTTCTTTATAAACCTAGTAAATAAGGCGATATTCGTCCC
AGTGTATATAAATTAAATCCTATTATAAAAGGAGCATCTAGGTGCATGAATGTGACCATG
TAGGATCAGAACATTTTAAACAAAGGGGCTTAATAGATTGTAAAGGCTTTACTATTACTA
ACGACAAGAGCGGGTTGCTCCAGTGGTGAGCACCCTCCACATTCAATCAAGAGGTTGT
GAATTCGAGTCACCCCAAAAGCAAGGTTAGGAGTTCTTCGAGGAAGGGAGCCGAGGGT
CTATCGGAAACAACCTCTCTATCCCAAGATAAAAGTAAAGTTTACGTACACACTACCCTC
CTCAGACCCTACTAGTAAAATTTTACCTGGTTATTATTGTTGTTGCTGCTTTACTATTACT
AACCTTGAAATCTTGGGAAACAGCATCATAAAAGCTTGACCATGGGGTGATTTTGTCAA
ACTGCAAGATTGGTGCTGAAGATGCCACTGCACCTATTGCTATATGTGGGTACTTCAAT
CTAAACCAAGAAGCCAACACTGGTTGACCAGAAAAAACAAAGTTAAAAGAGGAAAGGAA
AACAAGTATTTAAGAGTGAGAGCAATTTATTAATTATTGGATACTACTCGAGAAGAATGA
ACCTAAATAGCCGTTCACTCAACTGCTTAAACTAAAATAGCTGACGGATGTATAAAATAG
GAAAATTTTATTTTGTGTCTCAGACAACTTACACTAGTTGTATGAAGCATTTTTTTTATCT
CACAGTTTTGTGGACCCGAATTTTTGTGGACCTAGAAGTGTAAGATTGGGGTCCACAAA
TTTATGAGACAACAAAGAGCCTCGCACAACTAATGTCAGTTGTGTGAGGCACACAATAA
AACTTCTCTATAAAATATATAATTCATACTCTTATATAATACACCATCCGGTCCACTTTCA
TTGATTTTTTGACTCTTTTCACATATATTAGAAAATCACATTTTAGCATTAATTCACAATGA
AATTGACCATATTAACCTTATTTTGTTCCTTGAAAATATAACAAATGCTCCTATGCTCTTT
ACTTCAAATGCAACTTTAAAAAAAAAATTAACTTATTCTTAATATCTGGAAAAAATCAAAT
ATTGTGGACCACAAAAAAAATTAAAAATTCAATTAAAATTGACCGGAGAGAGTATATGCA
TAACTATGTATAATCTATATATATCGGCTAGAAACAAACAGTAAATTGAACTGGCTATTTG
TGTAAAGATTCCTACTTAACAAATGCAAAAGTGGAAGAAAAGTTCTGTTTATTTGAATAAT
TGAATGCATCTAATGCTAATGCTAAATTCATACAAAAGAGAACTTTCCATGAACATTTAG
CAACCATAGAATGTAATTATCATTGATTCACATGGATTGGACACTCAATAAGTCAATATG
TCCACACATGTAATGTCATGTCATTTCCATCTATCATTATGTCAAGGCAAAAAATTAGCT
AAAAGTTAAAACTTTTTCACTTATATTATTACTTTTCTTTCATTACTTTTTTTTTGTTTGTTT
GTGTGGTGTTCTACTATATTAGTGGCAGTTTGGACATAAGAATTGTAAAATTTCAAAAAA
AAAAAAATTAACAAAATTTAAGTAAAAATAATATTTGAAAATTAGAGTTGTATTAGAATAT
GAACATAATTTAAAGCTGCTTTTGATTTTTTTTATGAATGATTTGAAATGAAAATTTTGAAA
AACAGCTTTTTGAAGTTTTTAAAATTTTCGAAAAATTCCAAAATTCAACTTCAAGTGAAAT
TTAAAATTTGCATGGCCAAACACTGATTTCGGGAAAAGTGAATGTTTTTTATGGCCAAAC
AGTTCCTTACTTACTTCCTCCATAAGAGCCACCAAAAACCACAACCGGTGATGATTCAG
AAGAAAGATTCTGCTTTAAACTCCTTATTAGAACAGCATAATCAGCCAATGCTTGCTGTG
AATTCAAGTATCCCAAAGTCTTTGGTGACTTGTAAGATTTCTTTCCAAATGGCATTGAAT
CCCCATAAAACCTATGCTAAATTATTACAATACAAAAAACCATTATCAATTTCATTCCCAA
CAAAGATAAATAATAATAATAATAATAAAATATAAAAAAGGTTCAATTTTACCTTTCTATAT
GAAGTATCTTAATATTACAATTCATTATACTTTGGGCCACTAATATCTTATTTTTGGAAAA
AATTCTTGTATTTGTCTTGATTCTAACGAAGTTCCAACTTGAAGTATAATAGATGGTAATT
TTAAATCATAGTGAATAGCTGGATAAATTTGGATTTTTTCTAGTAGTATTTTGATACGTAG
AATCTACCAAATCAATATTGGAGTTTCATTAAACGTAGTATAAATACGATTCGATTTAATA
ACGGCAAGAATATAAATAATCCCTTAAATAAAACGAAGTGTAAAACTAAAATACTTCGTA
TAGTACAACAACAATAAATTCAGTGTAATTTCACATGTGAAGTTTGGGGAGGATAGTGTG
TACGTAGATCTTACACATATCTTGGGAAGATAAAGAAGTTGTTTTCGATAGACCCTCGG
CTCAACGAATAGTGAAAACAAAGTAACAAACAGTAGCAACAACAACATAATATGAACAAA
AGGCAAAATACTTCGTATAGTATAGGAGTAAATTTAAATATTTTTCTCAAAAATAAATACT
TCAAATAAAAAAACATTTCAAGATTATATACATACTTCAATGAAGACTAGAAGAGCATGA
AACTTAGGAGCAATATCAAGCATAAATCCAGTATTTGCAGCAAACCAATCAATATTTCCT
TCATTTCCAGTGTAGACAAAGATAGGGCCTCCTTGTTTCCAATAATTATCATTTATGAGA
TATTTCTGTTTAAAAACTTTAGAACTCTTTGGTAGAAAAGTGAAATGGTCAAGAATTTGA
GGAAAGTAATGGACTTTAAATGGTATTTTTGACTTGACATGTTGTTTTTCTAATGAAGATT
GATAAGTTCCAGGTAGATAAATTGGCTTAATTTCTCCAACTACAAAAGAGATAATAAACA
GTAAAATCAAGAAAATGAAAGAAAAATAAGAAGAAGAAAAAGCCAT
SEQ 56
ATGTCTCGTTTCTCACTCCTATTGGCTCTCGTCGTCGCCGGTGGCCTTTTCGCCTCCGC
ACTCGCCGGACCGGCGACCTTTGCCGATGAGAATCCGATCAGACAAGTCGTTTCTGAC
GGTTTACATGAGCTGGAGAACGCAATTCTCCAAGTCGTCGGCAAGACCCGCCATGCTC
TCTCCTTCGCTCGCTTTGCTCACAGGTACGATGATCTCTACATGGAAATGAGATTTTTTT
TTGTTATTTGCTTATTAATAGTAATTGTTTTATTTTGAGTTTAAGTTCTATATATGCAGTGA
GCATATTTTTTTTTTTACATAAATAAGATAATAACAAATAAATCACTTAATATGTATTAGTT
GGTAATGATAGTGTAAAAAAATATTATACTGTAATGTGTATATAACTTAAATCTTTTTATTT
TTGGGACGATATTTAAGGTATGGGAAGAGGTACGAGTCAGTTGAGGAGATAAAGCAAA
GGTTCGAGGTATTTTTGGACAATTTGAAGATGATTCGATCGCACAACAAGAAAGGACTA
TCATACAAACTCGGTGTCAATGGTATAATTAATATTATGGCATAACGCTAAGGCCCTGCT
CTTTTCCTTTTTTCTCTTTTGCTTAAGTGGAGTCTTAATTTGTTGATTTGGAGGTAACAAG
TTATAGTTTTGTGGTTCCTTTACCGGAATACTCTTTGTTTTTATCTTCAGCTAAGGTAACA
GATTAAGGCGTAATTATAGTTATTATTGTAAAATAAGGTAATTTTTATTTAGAAGCTTCAA
AATTAAGTACAAGCAATTGAATACTACTCTTTGTAAGTAGACTTTGTATATATGTTTTTAT
TTCATTCTCTTTTTTCATTTGGAGAGATGTGGACAAATTAAAATTATAATATAATGCGATA
AAACATATGTTCCACTACAGTATCAGTATGGTATTTATAGTTTGCATATTTTATTAGTAAT
TAATTGGTCTAGTGCCTTATCATGTGTAGATATTTCATTCATATTGTGTGGCTAGTGGGT
ACCCTTTCTCTCTCCAATCAAAAAACTTTTTTTAAAAGCTCAATTCAAAAGCTTTTCTTCA
TTACAACTGATCCTGCTTAAAGACTAAAAACAATCTAAATTGAATTCTTAATTCTTCTCTA
TTCATTCATATATGGACATAAAAACAAAATCACAGTACATGGAAAGAATATAAGCACCTA
AGCATTGGACTGCCCAAATGAAAAGTTTTTGCAACTTAATCTAGTTGTGCATAGATTCAA
CAACAAAAAGTAAAGAAATAAGTATGCATTTTATGCTTCTAAGTTCTAGTATATATGGCC
CTTATTGTTTATCGATTATTATGTTTCATGACAGAGTTTACCGACCTAACATGGGACGAG
TTCCGGAGAGACAGGTTGGGGGCAGCTCAAAACTGTTCAGCCACCACAAAGGGCAATC
TCAAAGTCACTAACGTTGTTCTGCCGGAGACGGTATATGCACTCAGAACTCCTCTGTAT
CTATTTCTGGAGTTAGTGATCATTAGAGTTAAACTACTTTCTGATGATTTATTATTTCCAG
AATTGTGGAGTGCTCTGAGTTTAATTATGCTGTAACTATAGAAACACTAACTAAAAAGAT
CTTGAATAGGTATCCTACAACAATAAATAGAATCCTCATAAGAAATACCACTAGATCGAG
CACCAGTCATGATTTCATATCTGGTAAAAATCTTGGCTAATTGATCGAAGTGGAGTAGAC
TAGCGAGCATGTACTGAGCTAATGCACAATTGGTTGCAAAAAGAAGTTTTTTCTTTCCTA
ACCGAAATTTCCAATTTCGTAATTATAGAAAGACTGGCGGGAAGCTGGGATTGTCAGCC
CAGTCAAGAACCAGGGCAAGTGCGGATCTTGCTGGACATTCAGGTAAGAATTAGTTAG
AATCTCACATCATTGGACTCTTAAATTGTAAGTCTTGAAATTGCACTCTTAAGCTGAAATA
TAACGGAGAAGGCACTTGGCAGCACTACTGGTGCACTAGAAGCAGCATATAGCCAAGC
ATTTGGGAAGGGAATCTCTCTATCTGAGCAGCAGCTTGTGGACTGTGCTGGAGCTTTTA
ATAACTTTGGCTGCAATGGTGGGCTCCCATCACAAGCCTTTGAGTATATTAAATCCAAT
GGTGGTCTTGACACTGAAGAAGCATATCCATACACTGGCAAGAATGGCTTATGTAAATT
CTCATCAGAAAATGTTGGTGTCAAAGTCATCGATTCCGTCAATATTACCCTGGTATGATA
TCTCTTTCCTCCAGTATGCAACCAATCTTTGCCAGTGTTAATATCCAACCTTAATGGTCA
ATAAGGATTGGTTAAGTTCCTTACATACGTGTCATTACAGGGTGCTGAAGATGAACTAAA
ATACGCGGTTGCATTGGTTAGGCCCGTTAGTATAGCTTTTGAGGTGATAAAAGGTTTCA
AACAATACAAGAGTGGTGTTTACACCAGCACCGAATGCGGCAACACTCCCATGGTAAGT
CATCTGTCCCTAGGAACGTGATATGCAAATATATTGACATAGTTACCTAAATACAGGGG
AAAGCTACAGCCGACCAAGGGTCGTCAGTTGAACACCCTTCACTTCACTGTCGTGCATA
TATTAAATCTTGAACACCCTAAGTGAAATTTATAACTTCGCTAAATAGGCATATACACAAT
ATTACAAACATTGTGTGTTGCATTGGCAGGATGTAAACCATGCTGTTCTTGCTGTGGGT
TACGGTGTTGAAAATGGTGTTCCCTATTGGCTCATCAAGAATTCATGGGGAGCAGATTG
GGGTGACAATGGATACTTCAAAATGGAGATGGGAAAGAACATGTGTGGTATTGCCACTT
GCGCATCCTACCCTGTCGTTGCC
SEQ 57
ATGGAGAAGGAACACAAATACTCTTTGTTTCTCACAAAGTTGAAGTTGTTTTTTCTTGTTA
CATTAAGTACTTTCCATGGCCTTAGCCATGGCTTCCAAATGGATCAGGCACGTACATTA
ATGTCTTGGCGTCGTTCTAAAATGCATGCTCAGACAACTACTTATGCTACTAATGAGGAT
GAGACAGAAAACTTAGTATTTTCCGATGAAAAACATGTCGGAAATATGGAGGATGATCT
TATTAAAGATGGTCTTCCAGCGCAGCCTTCAAATGTGATGTTTAAGCAATATGCAGGATA
TGTTAATGTTGATGTAAAGAATGGAAGAAGCCTTTTCTATTACTTTGCTGAAGCTTCTTC
TGGAAATGCTTCTTCAAAACCTCTTGTTCTTTGGCTAAATGGAGGTAAATTATATGTGTT
GATGATTCTTTCTCAACTTAATTTTGTCTTACTAATTACTCATCTTCTCTTAATTCTTTTGT
CATGCACCTAATTTGATTAAGTACTCATTTGTTTTGTTTCGATTTAATCTAACTTACCCTT
TATGCACATATATTCTGCATCAAATTAAGTTGAATATTACTCCACGTGTTCCACATTATAT
ACTTAACATTTTTTTTTTCCAATCTATTTTACACATTTTATATATTTGAATACTTTTTTAACT
TTAGACATTTCAATTTACCCTTAATAGTATATTCTTGTAGCCGATCAAATATCTATGAGAT
ATTTTTGAAGTCTTTTTTCTTCCTAAATCAAGTCAAATGTTAATGTATAAAATAAAGCAGA
GGGAGCAATAACTTTCGTTTTATGTTTGTAATTTTTCTTAATAGTGATACTCATTACTCTC
CCCGGTCCACAATAAGTGACTATTTTACTTTTTTATTTTGGTCAAAAATAAGTATCCATTT
ACCTAATCAATAAGGAATTAATTTTATTTTTCTAAAATTTACCCTTATTTACATATTCCAAC
GTGTCAAGGAAATAATTAATTAAGGTTAATTTAGTGAATATATTTTTTTTCTCTAAGAGTT
CGTATTTCTTTAATGGATGTGCCAACTATAAAATGGTCACTTATTAGGGACCAGAAGAGT
AACTCTTTATTATTTGAAATTTTGATTTCCCAAAAGGTATAAATGGTCCGAGGAACATCTA
ATTTTCGCTCATTTGCAAGAAAGTGGTTCATAGACAAATGGAGTTATTAAGTGGGGACG
CGCAACAGAGAATTATTGGTCACTTTATTCGTTTTGTTCGCTCTTTTTCTTTCTTTACTTC
TTTACTAAAGTAGAAGAGAAAAAAAGGAAGTTTAAAAAATTGTTAGTGTATATGAAGATA
AGAGCTGTCATTTTCTTCGGCTATTGAATGACGAATAAAAGCACAATTGGGTACAGGTC
CAGGATGTTCATCATTAGGATTCGGGGCCATGCTAGAGCTTGGGCCTTTTGGTGTAAAC
CCTGATGGTAAAACCCTTTATTCCAGAAGATTTGCATGGAACAAAGGTACATTTCATTTG
CTAAACTAATATAGACCTACTTATAATTAATGAAACTAATTTCTCAAGAAATAAGACAACT
ATTTTTTGTAATAACGTATACTCTATCTTTTTCAATTTATGTGACAACATTATGTTAAAGGT
GTCACGTGAGTAGGAAGCCAGCTGACACTTAGTAGGCAAAGAGTCTGTTAGATTAGTTG
TTAATTATACAATTAGAAATTAGTTAGAATCAGTTGGATTACATTGTATATGTATGTGTAT
AGACGGTTATTCAATACAACAGTAATTTTCTCATCTTCTCTTTTCCTCTCTAAGCTGCGAT
CTCTCTTAGCTCAATCTAGAAGCATCCACGACAGATGTTTGGCATGGTATCAGAGCTTT
GTGCGATCATTGCTCTCGTCTAATTCTCCTCTGAGTTCATGTGAACGAAACTTCAACTCG
TTCGTCTTCATCTCCTTCCCTCGAGAACATCGAATCGACGATGACGACGGAAAAAATTG
ACCACATTCATCCTCTGTTTGTGCATCCCTCAGATACTCCAAGTTTCATGTTGATTCCAG
TCCAACTCACTGGATCTGAGAATTACGAATTATGGCGGAGATCGATGAAAATTGCACTT
AGGCAAAACGAAAGTTAGGGTTCGTCAATGGCACACCCACTAAGGATCAGTTTAGGTCA
GAGCTACATGAAGACTGGGAGACATGTAATGCGATTGTGCTCTCGTGGATTATGAACAC
AGTATCTCCAAATTTACGTACTTAGTGGAATTGTGTATGCTTTTAATGCTCACCTAGTAT
GAGAAGATCTAAAGGAGATGTTTGATAAGGTGAATACGATGAGGATCTTTCAATTTCATA
GAGAAATTGCTACAATTTTCCAACGAACAGATTCAGTGTCCATGTATTTTATAAAATTGA
AGGAGCTCTGGCTGAGTATGATGCAATGGTACCCTCAACAAATTCGAAGTAGTATGCTG
ATCATCTTCAGCGGCAGAGGCTATTACAATTTCTAAGTGGACTGAATGATTCCTATGCTC
AAGCTAGAAGACAGATTCTAATGAAATCAGTAGAACCTACTTTGAATCGGCCTTATGCTC
CAATTGTTGAAGACGGAAGTCAAATGAGTACATCGGGAACTTTATCACACATTGGGCTG
AACTCAATAGCCGAGGAAAATGACATTACAACATTGTGGAGCTCAGCAGTAAAATGAGG
TTCAATCAAGAAGAACAAAAGGAATTACAGTATATTTTGTGAACATTGCAAGATGAAAGG
ACATAGTAAAGAAAATTAGTACCAGCTCATTGGTTATCCGACAGACTTTAAAGATAGAAG
AAAACAAGGAGCACCTACTGGTTACCAATGAGCACCTATTGGTCACCAAGGAACAATTG
AGGAAGATGCAATGCAAGGGAAACAGGTATGACTGTAGATTTTGGGAATCTTTATGCAG
GCATATCATATGGGGACAACAACATATGCAGGTGCAAAGGCAAGGGACTCATAATCCT
GTACACATGGAAGATGCTCAATCTCAGGGACAATCTTAGGGATATACAGGTGGTGTCAC
TGTTATATTTACTCCGGAACAGTATAGTCAAATCTTACAAATGCTCAACAAAGATTATGTT
CCAGAAACATCAGCTAATATGGCAGGTACTATTTGTTCTTTTCTGGCTAGTAAAACCGG
GCACAATTGGATAATGGACATAGGAGCAACAGATCATATGGTATCTACTCCTTAAATGTT
ATTTGATTTGAATGACTATGCTAAGCAAGGCTCACTGTTGCATTTACCTGATGGAAAAAG
TTGCCTATTAGTTATGTTGGTAAATGTAGATTGGCACAAGGGGACATCAGGGATGTGTT
GTGTGTACCAGACTTCAAGTTTAACTTGTTGTCAGTGGCTAAACTAACTAGAGAATGCA
GTGTTTCATGTCTTTCTATCTTGATTTTTTTCTGATGCAGGACCTTCACATTGGGAAGGT
GAAAGGGACTGATAGAATGCACAATGACTTGTACTATTGGAGAAATAATATAGAGAATA
AGATACCACAATCATTGGCTACTACTTTGACTCAATCTGCAGCATTGTGGCATAAGAGG
TTGGGGCATGTTCATCATAGAATACTACAACAAATGAACTTTTTTAAAGATATCAAGACA
AATACTGGCAGAACTTGTTCTATATGTCCTTTAGCTAAGCAAACTAGGCTTTCTTTTCCT
CAAAGTACTAGTGGAACTACTACACTGTTTGAGCTAGTTCATGGTGATGTATGGGGTCC
ATACAATGTACCTACATATGGTGGTCATAGATTCTTTCTTACACTTGTAGACGATTGTAG
CAGGATGGTCTGGGTTTTCTTGTTAAGGTTGAAGAGTGATGTCTCATTTGTATTAAAAGA
TTTTATGTCATTAATAAAGACACAGTTTTATAGTTCAATCAAGGTTTTCAGAAGTGATAAT
GGTACAGAGTTATTTAACTCACATTGTATAGATTTGTTCAGTGGTGCATGAATTGTACAT
CAAAACTCATGTGTTCATACTCCACAGCAGAATGAAGTTGTTGAACGAAAGCACATATAA
TTTTTTGAGGTAGGAAGAGGTTTCAGGGTTGCATTCCTCTAACTTTCTGGGGATTATGT
GTTCAGAATGCTGCGTATCTGATTAACAGGATTCCATCCACTACTGTGGCAAGAAAGTC
ACCATTTGAGGCATTCTATAGGAGGAGTCCTAACCTACAACACCTAAGGGGTGCTTATG
TTATGCCATAAGTGTGGGTGCCAAAAGTGACAAATTTGGAGCAAAAGCAATCCCAACAG
TGCATATGGGATACTCTACCACTCAGAAAGGCTATAGGTTGTATAACACAGCCAATAAA
CTGATCTTTGTCAGCAGGGATGTTTCATTTAGAGAAGATATATTTCCCTTCAAGTCCTCC
TACTATCAACTTAGACCACCTAATCTTGTGGAGTATTGGAATGGTCGCCATGATCCCTTT
GTTCTTGAAACTACTATTGATGCAGCTCCATTGGAGACTTCATCTATAGTTGAGCCAGTC
TTTGTCCCCTCTAGTCCTTCTATTCCTACTTCTTTGAATTTAGGAGACTCTACAGCTGGT
GTCTCTGAGAATGCTACTACTGTATCAGTCCCTGCTGCTAGTACTGATTCTCTCATTCTT
AGTAAGGCTCCTTATGATAATGTAGCAGATATTACTGTTGCTCCAGATTCTTGAGAGCTT
ACAGTCACAAGAAAGTCATGCAGAACCTCCAAGACTCCTAGTTGGCTTAGTGACTATGT
TCATAAGGGGTCCAAGCCTCTATCACATGCTGTAATGGGCACAAGTTATCCTTTATCAG
TATATATGTCATATCCTTCACTTTCAGACCCCTATTACAAGGTCATTTATAGCATCTCATC
TGTGAGGGAGCCTGATACTCATGAAGAAGCTCTTTATGATCCACAGTGGGTAGTAGCTA
TGCAACAAGAACTGCAAGCCTTTCAAGACAATCACACTTGACAGCTGGTTAATATACCT
CCTGAAAAGAGAGTCATTGGTTGTAAATGAGTATTCAAAGTCAAATACAATGCTAAAGGT
GAGGTGGATAGATACAAAGTTCGTTTGGTAGCCAAGGGATATACTCAGCAGGAGGGGT
TGGATTACTAAGAGACTTTTTCTCCTGTGGATAAGATGGTCACTGTGAGGACTATCTTAT
CCTTGGCTGCAATGCATGGTTGAAGGTTGCATCAAATGGATATATTCAATGCATTCCTC
CAGGGTGATCTTGTAGAGGATGTTTACATGGTTCTACCTCCTGGTCTTCTAGGACATGG
GGGGGAGGGNNNNNAGGGGGGGGGGATGTAGGAGAGTATGCAAGCTACATAAGTCTA
TGTGTGGCTTGAAACAAGCCTCTCGACAGTGAAATCTTAAGCTTTGTGAGGCACTTCTC
TCCTCAGGCTTTATTCTAAGTCATCATGACTAGTCCCTCTTCACTCAAAGATCAGGGAAT
GAGCTGTTCCTCATCCTAGTTTATGTGGATAACCTCCTCATCACATGTTCTTCTCCTTCT
CTCATTCATGCAGCTAACTCATGCTCCATCAGCATTTCAAGATCAAGGATCTGGGGGAG
ATGAGATACTTTCTTGGTCTTGAAATTGCAAAGAGCACAATGGAATACTAGTATGTTAAA
GAAAGTTTGCACTAGCTCCTTATTGCAGACTTATGAGTGGCTGCTTCTAAGCCTACTAG
CATACCTACGGAGGTCAATCAAAGGTTCACTAGTGAATAATTTGATCACAACTATAAGAC
TGAGGGCAATACTGATGAGTTGTTGTCCGATCCTACTGGCTATCAGAAACTAGTAGGGA
AGCTGCTATACCTAACAATGACTCGACCAGATATAAGATACACAGTGCAGAACCTGAGT
CAATTTATGCATAAACCAAAGAGATCACACGTGGAAGGGGCTCTAAGGGTGGTGAAGT
ACTTAAAGAATGCACCTGGTTTGGGCATCTTGTTACCTTCTAAGCCATCCTCACAACTTA
CAGTCTACCGTGATGCAGACTAGGCCAATTGTCCCATGACAAGAAGGTTAGTTAGTGG
CTTCATAGTCAAGCTGGGAGACTCCTTGATTTCTTGAAAATCAAAGAAGCAAAGTACAG
TGTCAAGAAGTTCAGCAGAGGCATAATACAGAAGTATGGCCAATGCAATTGCAGAAATA
GTTTGGCTCATTAGACTGTGTGAGGAACTGAAGGTGAAGCTGGAGTTGCCTGTTAAACT
ATATTGTGATAGCAAGGGAGCACTTCAAATTGCTGCTAATCCTATCTATCATGAACGAAC
GAAGCACATAGAAATCGACCGTCACTTCATTAGGGAAAAGATACATGAGGGCATTATAC
ACACAGAACATGTGTCCACAAGTTTGCAGCTGGCAGATATTCTAACTAAAGGTTTAGGA
AAGGCGCAATATGACTTCCTATTATCCAAGCTAGGAATGTTCAATTTGTTCATATTACAT
AGCTTGAGGGGGAGTGTTAAAGGTGTCACATGAGTAGGAAGCCAGATGACACTTAGCT
GGTAAAGAGTGTTAGATTAGTTGCTAATTATACAATTAGAAGTTAGTTAGAGTCAGTTGG
ATTACACTTTATATGTATGTGTATAGACGGTTATTCAATACAATAGTGAAAATAATTTTCT
CATCTTCTCTTTTCCTCTCTAAGTTGCGATCTCTCTTAGCTTAATCTAGAAGCATCCATG
ACAGATGTTTGACACATTACTATTTGGGGAGCCAAAGAGGTTCTTCTTTATCATGTGTTT
TCTTAAATGTTTTATAAATATTTTGAATTATAATTTTTTTTATGAATTATAGTACTTTTTATG
TAAAAAAAATGAATTTGTATCTAAATTTACGGTGTAAAGTAAGCTAGCGTTTGGCCATAG
ATTCCCAAATTTGTTCTGAAAAATCTGATTTGGGTGAAGTTTGGTTTGGAGATGAAAATG
CGTTTGGACATCAGTTTTCAAAACATATTTCCCAAATTTATTTTGGAAAAACATGAAATAT
GATTTATACCCACAAGTTCTAAAAACTATCACAAATACCCAACAGTACCATTATCAATAA
CATTCATTAAAAAACTTTGATTCTCGTAAAAACTTTGATTATCAATCACAAATATCCAAAT
TTATTTTGGCAAAATCTATGGTCAAACGGGTATTAAGAACTTTGATTCTCGTGCTATGTA
CCTTGCCCGGAATGGAATTGTACGGGTATTAGAAAAATACATTGTGCTGCCAAATGCAT
TGTACAATAACTATAACTACTTATAGTTATTGTATGTTTTTTCTTTCTTTTATTTATTTACAT
ATGTAATGATGGTATACAGTTGCGAATGTGATGTTTCTGGAGTCGCCGGCAGGGGTTG
GGTTCTCTTATTCCAACACTACCTCGGACTATTCAAAGTCAGGCGATAAGAGGACTGGT
ACACACCGAAAAATCTCGTTAATACAATAGTAATAATTGTCAGTTTCATTATTATTTTTTTA
AAACAATTTTAACAGTCAAAATGATGAAATTTTACTCTTTCATTTAACTCCTCAACTTCAA
TTTCAACTTCACATGCTCTATTCGTCAACACTCAACTCCAATCAAACATTGTGCAAACAG
TTATATTATTATCGTTTGTAGTCTGTAACTATTTTTTAATTTTTTTTAAAGACTACACTTGA
GTATCGTTAAAAACATGGTCAAATCTTTTGGTCACTTAAAGTGAGGCAGAGGTGGTGTC
TTTTCTTATCAAACGAGATTTTTCATTTTTTTATTTATCATTAATTCAGTTATATATTTATTT
CTTTTCCTTAAACTATATTCTTTTTTATTGGGTGACAGCTGAAGATGCATATAGGTTTCTA
GTGAATTGGTTCAAGAGGTTTCCACATTACAAAGGCAGGGATTTCTACATCATGGGAGA
AAGCTATGCAGGTATCTAGTACAGTATCAGTAATTAACAAACGAAAAAATACAAAACAAA
AACTTTTGATATTCTTGACTTATCCTTCTAGTAGTGAGAGCCCTCATTATTGAGTTCTGTC
CAATAAAATTTGTAAGAATTAAGGACCACTGTATCAGAGACAGCTTGTGCATATTTCAGA
CCATTCACAGAAATGTCCTCCTGTACAGTCTCAGCTCAAAGCCGAAATGAGCATTCTGA
TTGAAAATTTTGGCTATATTTGTAATACAATCTTATATAAGTAGTCTATATCTAAAGTCTAA
TATTGACATGACAGCTAATATTGCTGTGACTGCTCATCACAGGATTCTACGTACCAGAG
CTAGCAGATATCATTGTCAAGAGGAACATGTTGCCTACCACAAACTTCTACATCCAATTC
AAAGGAATCATGGTATTATATCATTTAATTTGTTGACCTTTTAATTTGTTTGATCTCTCTG
TTATCAAATCTTACTTGTATACCTAGTGATGAGGGGCGGATTTAGGGGTGCAAGGGTGT
TCACCCGAATCCCTTCGCCGAAAAATTACACGGTATATATAAGAAAAAGTCTGATATTTA
CCTTTATATATTATGTTTTGAATTTCCTTTACACAGCCCAAAAGTCTACTCTATGTCATGA
CATAAATTATTTCTTTATATTGCAGATAGGGAATGGTATAATGAATGATGAAACAGACGA
GAAAGGGACATTGGATTATTTATGGAGTCATGCACTAATCTCAGACGAGACTCATCGAG
GTCTCCTACAACACTGCAAAACGGAGACCGAAACATGCCAACATTTTCAGAACATAGCA
GAGGCTGAGTTGGGAAACGTCGATCCTTACAACATCTATGGTCCCCAATGTTCCATTAA
TTCAAAGAGCAGATCTTCTTCTCCGAAACTGAAGAATGGATATGATCCTTGCGAACAAC
AATACGTTCAGAATTATCTCAATCTTCCTCATGTGCAGAAGGCCTTGCATGCTAACCTCA
CTAACCTTCCTTATCTTTGGAACCCATGCAGGTAATCCAACTAAGTAAATATTATGTATA
GCATATCGATTTAACTTATATATACCGATAGTATAAACAATTTTTACACTGTCGTTGTATA
TGTATTGATTTATATATATACACTGTTAGTGTAAAATGTGTTGTAACAAATAATCTATGTT
ATTTTTCTATTTATAAATTAAATTCTACTTTTTATAAATAATAACTTGTACTTATCTTTTTGG
TCACCTGATAGAAACTCTTTTATATCATCCAATGTGTATTTAAATCTGTTGCGGCAATGAT
ATTTCTTATTTTCAAGATTACAAAATCTCACTCTTTATGTTTAGTTTATGTCACTTTTAATA
TGTAGAAGGTAATTGAACTCATATAAAAAATAGTGTATATGATATGATATGATGATTTTTT
TTCTTTTTTTTTTTTCATTTGGTATGGTAGCAATTTGGATTGGAAGGATACTCCAGCAACC
ATGTTTCCGATATACAAGAGACTTATTGCATCTGGTCTACGTATACTTCTTTACAGGTAA
CTTTATTATGGGCTTATCTTAGACTTTGGTTTATGTTCATGATACAATATTTTTAATTGTTC
GAATAAAGAACAAGTGGATTTGTATTGTTTGGAAACAGTGGAGATGTTGATGCAGTAGT
TTCAGTTACTTCAACTCGCTATAGCCTTAGTGCTATGAACCTTAAGGTGATCAAACCTTG
GCGTCCTTGGCTTGATGACACACAAGAAGTACGTTCTTCGAATATATTTTTTAATGATAA
TTTTATATATTTGTGGTGAGAAATAAATCTTATTGTTTCGTTCTTTGTTTTTTTTTTATAATT
TAAAGGTAGTTTGTATAATTTCTGCAGGTAGCTGGATATATGGTGGTTTATGATGGATTA
GCTTTCGCAACAGTTAGGGGAGCAGGGCACCAAGTTCCACAATTTCAACCACGTCGAG
CTTTTGCTTTGTTGAATATGTTCTTTGCCAATCATTCT
SEQ 58
ATGGCTAATTCTTATACAAGTATTAATTTTTTCCTTGCCCCTATTATTTTCTTGGCGATTC
TGGGATTGCAGTTGCAGAGCAGCGATGGTTTTGGGACATTCGGGTTTGATATCCATCAC
CGGTATTCGGATCCGGTGAAGGGTATTTTGGACCTTCATGGATTGCCTGAGAAGGGCA
GTGTTGAGTATTATTCAGCTTGGACTCAGCGTGATCGCTTTATCAAGGGTCGCCGCCTT
GCTGAAGCTGATACAGCTAATTCCACTCCCCTCTCTTTTTCAGGAGGGAATGAAACTTT
CCGCCTCAGTTCTTTGGGATTGTAAGCTTCCCTCTATGCATTTTTCTGATTGCTTTTTGC
ACTTGTCTATATCTTTATTGTTTACTTTTTCTAGTCATATACATAGATTATATACTAATTAT
ACATAATTATACATATATAATACAAAAATTATACCTTTTAAGTGGTTGGGTGGGCGGCTA
TTTGGGTTAATTCTTCTTCTTTTTTTGTATGTGTGTTTTGTATCTGTGTTATTATTCCTGAT
TGTGAACTAGTACGTCTTTGGAAATTCTTGTTTACTGTCTTTTCCTTTTGTCTGTTTAGTG
TGATGTTAGAGTTGACTGAGCTTTACGTTTGTTTTTGTCTGTTTAGTTGGATGGCAGTTC
AGGAAAAATAGGTTACCTAACTTAAATAAGTTCCATGTGCCATTTTTAACGAGATTCAAG
TGGAGAAAATATGAAGAAGAAAAAGAATGATTTAGGCCTGTTCTGTTCTATAATTCTGTT
TGTGTGTTTGATTGGACTGGAATTTTGTCGATTTAACTACTACATAAAATACTGACTCTTA
ATTTGATTTTACTTTTCTTCTATTTCGAATTCCAAGCTCCGGAAATAAATTCCGTTCTTTTT
TCTGATTTTCCTCTCCTCTGCCGCCACTTAACTCCTCTCCAGACAAGGAATTGTTCTGAA
GTTTCTGGCAGTAGCATGTTGTAATTTATGTGTTATAAAGATAGAGTTGCAAAATCTGTA
GTATCTGTAGTTGTGATTTTTTCTTCTTAAGGTGTGTGACTAAATATTCTTTGGCAATTTG
CAGTTTGCATTATGCAAATGTGACAGTGGGCACTCCTGGACTATCATTTCTAGTGGCAC
TTGACACTGGCAGTGACTTGTTTTGGCTACCCTGTGATTGCAGCAATTGTGTGCGTGCC
CTCGAGACACGCTCTGGACGAGTATGTTTGCTTCATTCTAGTACCTTTTTCTTTCTACTT
TCAAATGTTTAAAGAGTTTTTCTTTTTTTTGATCGTCATCCTCGTCTGTATATTGCCTTCT
GCTACAAGGAAGTTGTGCATACTTCTCTTCCTTTTGTAATTATGAGACTTTCTGATAACC
TTTTTCAGAAAGGAACCTGCTGATAACACAATGGCTGAATCTGAAACACAGTGGATTTCT
CTTCAACTGTCTTTTTCGGTCATTATGACAATAATATATTCTCTTAGTTAACAAGATATGG
GGTAGAGAATGTATTGAGGAAATTGTTTTTCTGTTAAGGAAGATACATAACTAGCGCAAA
AAAGAAGATTTAAACATAATCAATATTTGCAAAGTGAGTCTGATGCATGTAATATACTGA
CTCTGAAATGAAATTTCTGATCCATATTGTTCCGTGGCTTGTTTGTCCTTGAAGAATTTT
GAGATTCTTACTAGCTCAAGTACTTCAACTTGTCACGACCCAAAAATCCCACCACAGGC
GTCGTGATGGCACCTAGTCTCTAAAACTAGGTAAGCCGATTTCAATTACATTTTTGGAG
CCATTTTTTTTTTAATTAAATAAGTAACCAAAACTAACAGCGGAACAAATATGAATGTACA
ATCTCCCAAGACTGGTAGTACTAAGTCACGAACTCTAACTGAATACATGGAATGATCAC
GAGGACCGAATATACAATACTGTTTGATTAAAAACTCCACAGGAGTTCACCTTGAAGAA
CAAAATTTTCTTTGCTCTTTTGCCTTTTCCTTTTAATGTTTCTGCATGTATTATTTGACACT
TGTAATCTTTTGTTTGCTTTTGAAACAGCGAATAAATCTCAATATTTACAGCCCTAATACG
TCGTCAACGGGTCAGATTGTTCCTTGCAACAGCACTCTGTGTGGACAAAGGAGACGAT
GCTTATCTTCACAAAATGCATGTGCTTATGGAGTTGCATATCTCTCCAATAACACCTCAT
CATCAGGGGTACTGGTGGAAGACATCTTGCACTTAGAGACAGATAATGCTCAACAAAAA
AGTGTTGAGGCTCCAATTGCTCTGGGGTGGGTATGCTTTAGTTTTTTCTCTTTATCTTTG
GAAGAGATTATCTTTGGATCTTCTGATGCATTTCTTTATCCGCCATGATTTTTTATATTCT
ACTTGTTCAATTTCAGGTGTGGGATAAGACAAACTGGTGCATTTTTAAGTGGCGCAGCT
CCTAATGGTCTATTCGGACTTGGCTTGGAAAATATATCTGTTCCGAGCATGTTAGCAAG
TAAAGGTCTTGCTGCAAATTCTTTCTCCATGTGCTTTGGGCCTGATGGTATTGGAAGAAT
AGTCTTTGGAGATAAAGGGAGTCCAGCCCAAGGAGAAACACCACTCAATCTTGATCAAC
TACAGTAAGCAAGTCACTTTGATATTCTGGGTTTATCGGTTGCTTCTGTTTCTGGCTTGA
TTTAGGAGAATGCGACTGAATATTTATTAACTCTTACCCTTTCCTGAATTGCAGCCCAAC
TTATAACATCAGCTTGACAGGAATAACAGTGGGAAACAAGATCACTGATGTTGATTTCAC
AGCCATTTTTGACTCTGGCACTTCATTCACATACTTGAATGACCCAGCTTACAAAGTCAT
TACAGAGAACGTGAGCGACAAGCTGACTGTATGATTTTAAGTTGGAGTTTGTAACTTTG
TATTGTAAAACTGAAGATATTTTTTTTCTTTTTTCAGTTTGATTCTCAAGCAAAACAGCCA
CGTATTCAACCTGATGGCGAAATTCCTTTTGAATACTGCTACGGGCTAAGGTGAACCAT
CTTTTATAATCTTCATCATTTATTACTTTCTTGACGTCCTTTGAACTCTCAGGATTAACAT
GCTACATACGCAGTGCAAATCAAACTACCTTCGAAGTTCCTGATGTAAATTTGACAATGA
AAGGCGGCAACCAATTATTTCTTTTTGATCCGATAATAATGCTCTCGCTCCAGGTAAGAT
GGTTTCTGCTCCTTTTATATTACAAAAGTTCTCTTTTAGAATATCCTAATATCCAGTGATG
ATCATCAGGATCGTTCTGGCGCATATTGCTTAGCTGTTGTGAAAAGTGGGGATGTCAAC
ATCATTGGACGTAAGTATCTATCAGTTGCTTGCTCGTAAGATTTTGTTTCTATCCATGGA
ATTCTGCAATATAACTTGCACCATGCCAGCTAATGATCTCACAATTACCAACTTTTAGAA
GTTTTGGTTCCTATCGAGTTTTTTACATACTTCTAGCTTATGTATAATTGGAAATGTGAAT
GTGACAAAGTAAATTAGTAAAAACCAACTAGTAAAACTGGTTCCATTGTCAAAAGTCTGA
GCTATTTGTTGATTTACTTGGATTTTGTCTCTCTATTTGGAATTCATGACAGAAAACTAAT
ACACGGATGTTTTTGCAGAAAATTTTATGACAGGCTATCGCGTGGTTTTCGATCGGGAG
AAGATGGTTTTGGGTTGGAAACCATCGGATTGTGAGTTCGCATTCCTGAGTATGACCTC
TTTAGTGTGCACACCTGCTCATATAATTTAACTATAAACCTTTCTTGGCAGGTTATGATTC
TAGAGGATCCAACGACAAATCGACAACTCTGCCAGTGAACAAGCGTAATTCTACTGAAG
CGCCTTCGCCCTCCAGTGTGGTGCCAGAGGCCACCAAGGGAAATGGAAGTGGAAATG
AACCCGCTACTTCGTTTCCATCTGTTCAATCATCTAAACCTGCAGCAAACCAAGCACCA
GCACATTTCATTTGCCAACTTATGATGGCTCTGTTTTCCCTTTTTAGCTATTATTTGATCA
TTATTTCTTCA
SEQ 59
ATGGCGATTCATACTTCCACTCTCTCCATCTCCATACTTGTAATGCTCATGTTCTCCGTC
GTATCATCATCGGCGGCGGAGGACATGTCCATTATAAGCTACAACGAAAAACATCACAC
GAACGGCGAGTCAACGGTCTGGCGAACAGACGATGAAGTCATGTCTTTATATGAATCTT
GGCTAGTTGAACATAAGAAAGTGTACAACGCCTTAGGAGAAAAGGACAAACGGTTTCAG
ATCTTTAAAGATAACCTTAGATACATCGATGAACATAACTCTGTGCCCGATAAAAGTTAC
AAGCTGGGTTTGACCCAGTTTGCAGATTTGACCAACGAGGAGTACAAGTCCATCTACTT
GGGTACTAAGCCCGATGGTCGTAGCAGGTTGTTAAATACCCAAAGTGACCGTTATGCC
CCTAAGGTCGGAGATAGTTTGCCGGATTCCGTTGACTGGAGGAAGAAAGGTGTTCTTG
TTGACGTCAAAAATCAAGGGCAATGTGGTATTTTCCTTTTACCCTCTGCCTTGACTCTGC
ACCTGTTGTTTTTGTTTTCCTTTTTGTTCGTACTTATTTTCTGTTTAAAGTTTGTCCATGCT
TTCTTTACTGATGGCTTTGATGGAAATTTGGAAACTTTAGTAGTTTGATAAGGTAAGATA
TTAAAATAATCACAGAGTCATGAGTTTTAATCTAAGATCAATTTTAATGGCAAGTTCAGTT
GACCCTGCATTATTGTAAATTTTAGCTTAACATTAAGTATGATTAATTAGGTCAGCACGA
TGAAGTTGACAACTTTTGCTCCAATTTCCGCATCTAATTGTGGCAATATAAGTAATGCTT
TTTTCCCTTGGACAAAACACTAGTTTCCGGAATTGAGCTATTTTATTCAATTTAAAATGAA
AATTTTCTGTTTTAATGTATTAGAACTATAAAGAAACCGAAACATTAAGTAAACTTCGGAT
TGATCTGTGTTTTTCGGGAATTTAGTTGTTAGTGGTCTAATTTTCGGTTTAAATGCAGTT
CTTAATATTGGATAGGCATTTTGGCACTTTTCTTGGCTGTCGCTTCTCTTACCTTAAAATT
AAAATTATGGAGTACCTACCAAGTTCAAGATCTTATGGTTGTAAATTGAATTTGTAAAAG
GGGTTCTTCTTCGTTTGCTCTGAGATCCTTCTTTTAGCTCGCTCCTTAAATATTTACTAAT
CAGTGGTTTGTAGCTCCAACCGAGTGTCTATCGGAAACAAACTCTTTACCCTTCTAGGG
TAGGGGTAAGGCTGCGTCACTTGTGTGAACTCACTGGGTTTGTTGTTGGTCTGTAGTCC
GATATACCCCCATCAAACACCCTTGGAGTTGTTTCACTATGTCTAGTTGTGTCAATTGTT
TTGGCAAATTATGCAGCCTTGATTGATTGGATTATCTTCCATTTTATGCATAAGTAAATG
CTGAGGAAAAAATGATATGTTTATATCACATAAAGCAACTAATAATTTTCTTCGTAATTGG
TGTTGCAATTGGGAAATGAAACAGGGAGTTGTTGGGCTTTCTCAGCAGTTGCTTCAATT
GAAGCAGTAAACAAGATAGTGACAGGTAATCTGATCTCGTTATCTGAACAAGAGCTGGT
AGATTGTGATACGTCCGATAACCAAGGCTGTCAAGGGGGTCTAATGGACGATGCCTTTA
AATTCGTCATTCAAAATGGAGGAATAGACACTGAGGAAGATTATCCTTACAAAGCCAAA
GATGGAAAATGCGACCAAGCAAGGGTCAGTATGGTGTTCTCTGTCTTAAAGGGATTATA
GGAAATGAACTAAATACAAGTTGTGACTATTAATATTTTGTTTGCAGAAAAATGCCAGGG
TTGTCACCATCGACGGGTATGAAGATGTTCCTGATAATGATGAAAAGGCACTGAAAAAG
GCCGTTGCTGGTCAACCCGTCAGCGTTGCTATCGAAGCTGGTGGCAAAGACTTCCAGC
ACTATAAATCGGTATTACTTCAGATTTGCCTATTGTCAGTAAAGTTGTTTTCTTTTAATCG
AATTAGCTAGTGTTTACACAGGCTCAACAAATATTTCTGTATTTTCAAAGTTACAGTGAG
TTCAGTATTAAAATTTTTAAATGTTGATCCTATTAAGTTTAAATGTTGGATCCGCCTATGC
CCCAGGGTATCTTTACCGGAAAATGTGGTGCAGCAGTGGACCATGGTGTGGTTGCAGT
AGGGTATGGTAGTGAAAATGGCATGGATTATTGGATTGTGAGGAACTCGTGGGGTGCT
TCGTGGGGTGAAAAGGGCTACCTCAGGATGCAGCGAAACATTGGCAACCCCAAGGGTT
TGTGTGGTATTGCTACGATTGCTTCTTACCCTGTAAAGACAGGCCAAAACCCTCCAAAA
CCAGCTCCATCTCCTCCACCAGTCAAGCCGCCCACTCAATGTGATGATTATAACGAATG
CCCAGCTGGAACGACGTGCTGCTGTGTCTACGAGTACTATAAATACTGCTTTGCTTGGG
GTTGTTGTCCCATGGAAGGAGCTACTTGCTGTAAAGACCATAACAGTTGCTGCCCACAT
GATTATCCTGTCTGCAATGTTAAAGCAGGCACCTGCTCAATTGTAAGTGATCTCTGCTT
GTTATTGTTAGATTGTCCCGCATTGGTTGAGGGGAAGTGTTGTTGTCTCCTTATATAGTC
TTCGGCAAGTCTTTTTAACAGTTAAGGTTGTTTCCTTTACTTATGGAATCATGTTTTTGTT
GATACAGAGCAAGAACAACCCACTAGGAGTCAAAGCAATGCAGCACATTCTGGCCAAA
CCTATTGGTACCTTCGGAAATGAGGGAAAGAAGAGCCCTTCTTCT
SEQ 60
CTAAGCACTTTCTGCAAATCCAATTTGTGAACTGCCAAAGTCGAAAACTGTGTGATATGC
TCTCAAGAATGCATCTCCAAGAACCCTGCAAAAGTAGAGTTGAATATATCATACAACTG
GATCTTCATGAATATATAATATATTATAACTTATGGCAGTGAAAATAGTCTTACCAGAGG
GGACGTCGCGGATGCGCGTTTAAAGTTGTAAATCCACTAATACAGTGGACACCTTGGCT
GTCATCAACTCTGATAACATACTATTACACAGAAAGGAAGATAGTAAGTGGAAGAAGAG
AAAGCACTGCTATTTAAAACTATAATATGTTACTAATATACGGCCTAAAAACAAGACGCT
AACGGCTTTTCCAATGTACCTTACTGGAAAACCAGTTCAAGTTGTCGAAGCAATTTTAAT
CTACATTGCTCCGATCCTCCAAAAATGCTAACCGCACTCATGTTGGATCCTCCAAAAAG
TGCAATGAGATTTTTGCGGGATCCCAGCAATCAGTGGCGAATCCAGGATTTGAATTTTA
TGGGTTCAATCTTTAAGATTTTTAGTATTGAACTCATTGTATTTTGAAGTTATTGCTTCAG
TACTACTATTTATTAGATTTGACTGAACCCGGTACTAATATGATGCATCTGCCTCTGCCA
GCAACATAAATTTTAATGGGAAGATAAGAACTTTTCTCCGATATTATGTCATCTCCGATA
AGAACAGATGTAATTCTAGCACCAGTTGCTACCTTTAACACATATTGATAATGGAGGAAT
AGTATCAGATAACAACAAACCTGATCTGGAGAAAGGGGAAAAGATTTGTCTCCAATGGT
AAATGTTATATGTGGCAGGGCAAAGACATCACAGTTGATAAATGATTTTCCCCCGGGAT
TCGGAAGCTTCTCACACAGCTTCAGCACAAATTACATGTATTAGTAGTAAGTAACTAATG
AGAACTCGAAAAACAAAAAGACACAACTGTGACATGAATACCTGATTGGCATATTGAAA
CGCTTTTTCTTTTGATCTCTCTTTTCTGATCTCTACTTGTATCCAGAACACTATCATCTCA
CAAGAAGAACATAACGACCCATTATTTGTACAGAGTCCAATTCTGTTGCATACGTTCTCC
GGTTGTAACTGCATGCAGCCAAACAAAAGGATGACATCTAAATAAGAGAACACCTAAAA
TACCCGCACACAACAATTAGAAGTTAAAAAGCAGCTACCCCTGCTATCAAGCGTTCCCA
GATCGAATCCCCATAACTCGAGACAACTTTTTTGCATTCCAAACTAATAATTCCTTCCGC
TCCAATGGCATGATTTATTTGAGTTAAAATAGTCTGAAATGGAGATGGCTGAGAAGAAAT
TGATCTTGCAAAAAAGAATTAAAATCAACAATGGTTTAAATACAGACAGTTGGACCAGCG
ATAAATGATGTCCCTGTATCCACAATAGCTGGACATCCATCCTTACAAAGGCCTGAAAG
GGAAAGAGCAGAATTACACTAGAAATGCATTGTTTTTATATAGTAAATCACTTATGTATAT
GAAGATATTACCTGTTGAATTGCTTCCTATAAAAAGATCCCCTATCTCAATCTGGAATTTT
TGAAAAAAAACTCAGTTTTTTCAGGTTACCAAGTAAACAATACAAATAAAGTTCCATTAAA
CGGAGGCGCAGGTTACCTCCCAATAACCATTTTGAGCGACTGGTACGTATGTATGCTG
ACCCCTGAAGTGAGTCCAATCCATGCCTCCAAAGATAATTTCACCCGCTATCTTAGACG
TAGGATCTCGATTTAGCCAGAATGAGAAGATTGACTTGGTAACCATATGCTGAAGCAAC
ATGTTATACCTGAAATTAACTCACACAAAAGAATGTGGATTTCAATTTCAATGACACAGG
TAAGAAATGAATGAGAGAACAAGCTCATTAAAACTATGTCAGAAAGCATAAATTACTATG
TCAAAAGTTATTATAATATGAAGAGATAATAATTTACCATACTGGTGTGACATTCCTTGAT
GTCGTGCTCTGATCAAATCCAAGTCCTAGTACTCCATCAAATCGTGCACGCAACAATGT
CAAGTATCCCTCCCGTGTTACCTCAGTGAAAACCTGTTTAATAAATTTTATTCAACATGT
AACTTGAAAACATATATATCTACAATTTCAGCTGCAAAGACCGGCACCTGCTGCTTTAAG
ACAGCACCTCCAACTTTCACATTGTCTTGGCTGAAGAATCCATGAACTGAACCAGTGCC
AAAAGGGATTTTGCTAGACTTTCCTATCATAAAAGCCCATGTATAGAAGTGATAAGTATT
TCGATTAAGCTATTCTACGCTTAAATATCAACAATGCTCTAACCAAAATAAAGGTTTGGA
GGTCCATTGGGTAACTACCAATTTTTGTATACGTATTTGATAGTCTTGATTTGTACCTGG
AACGAAGATAACATGCAATCTGCATTGGATAAATCCATTCAAAAGTTAGTTAGTTTCATG
TAGTGAAAATTGTTAATCCACAACTTGACAAAGACTAACCGAGAAGAAACATCTGGAAG
AAGGGACCCAAAGATTGGAACTTCCAGTATCAAACACAACAATGAAGCGTTGGGGCGG
TGAACCAATACCAATCTCCGCGAAGTACTGAACATCATGATAATTTTTGAGGTAAACTAT
CTGGTCATTCGGAGCAGCCAAATTTCTATTGCGACCCCTGAGATCTTTAGCGTAGATTC
TTGCATCGCTTATGCTAGAAAGGTCCAACGATTGCCTTTTTAGCTCAATCCTAACCATAT
CATCAGCATATACGTTGATGCAGGTTATATACCATATTACAAGTGATGCAAGAAGGATTT
TGATCTCCAT
SEQ 61
ATGGCGTCAATTTTCGCTCTTTCATTATTTTTCATTATTATCTCTTTCTGCATCACTTCGA
TCACCATTCCCGTTCAATCCGACGGTCACGAAACTTTCATCATTCACGTTTCTAAATCCG
ATAAGCCCCGTGTGTTCGCCACCCACCACCATTGGTACTCCTCCATCATCCGATCCGTT
TCTCAACACCCTTCTAAAATCCTCTACACCTACTCACGCGCTGCCGTGGGCTTCTCCGC
CCGCCTCACCGCCGCGCAGGCCGATCAGCTCCGCCGTATTCCCGGCGTAATCTCCGT
CCTTCCCGACGAAGTACGCCACCTCCACACCACCCATACCCCTACCTTCTTAGGCCTTG
CTGACTCCTTCGGCCTTTGGCCCAACTCCGATTACGCCGATGACGTCATCATCGGAGTT
CTGGACACAGGTATATGGCCGGAAAGACCGAGTTTTTCCGATGAGGGTCTCTCTCCTG
TTCCTTCAAGTTGGAAAGGGAAGTGCGCTACTGGACCGGATTTTCCTGAAACCTCATGT
AATAAAAAAATCATAGGTGCCCAAATGTTTTACAAAGGCTATGAAGCTTCACATGGCCCA
ATGGATGAATCAAAAGAATCGAAATCGCCAAGAGATACTGAAGGACATGGAACACACAC
AGCATCAACTGCAGCTGGTTCTGTAGTGGCAAATGCTAGCTTTTATCAATATGCCAAAG
GTGAAGCTAGAGGTATGGCTATAAAAGCAAGAATAGCTGCTTACAAGATTTGCTGGAAA
AATGGTTGTTTTAATTCTGATATATTGGCTGCCATGGATCAAGCTGTTAACGATGGTGTG
CATGTGATTTCACTTTCCGTTGGGGCTAACGGTTATGCTCCACATTATCTCCTTGATTCT
ATTGCAATTGGAGCTTTTGGTGCATCTGAACATGGCGTCCTCGTCTCATGTTCAGCTGG
AAATTCTGGTCCCGGCGCTTATACGGCAGTGAACATTGCCCCCTGGATTCTCACCGTTG
GTGCTTCAACTATAGATCGTGAGTTCCCTGCAGATGTTATTCTAGGAGATAATAGAATAT
TTGGTGGCGTATCATTGTACTCCGGCGATCCATTGACCGATGCCAAATTGCCGGTGGTT
TATTCCGGCGACTGTGGTAGCAAATACTGTTATCCAGGAAAGCTAGACCATAAAAAAGT
CGCTGGAAAAATTGTTTTGTGCGATAGGGGAGGCAACGCTAGGGTTGAAAAAGGGAGT
GCAGTGAAGCAGGCAGGCGGAGTAGGGATGATACTCCTTAATTTGGCCGACTCCGGTG
AAGAGCTCGTCGCCGATTCACATCTTCTCCCCGCGACGATGGTAGGTCAAAAAGCAGG
AGACAAAATAAGACACTACGTAAAGTCTGATCCTTCACCGACGGCGACGATCGTGTTCA
GAGGAACCGTGATCGGAAAATCACCGGCGGCGCCACGTGTAGCGGCGTTCTCGAGCA
GGGGACCGAATCATTTGACGCCGGAGATTCTCAAACCGGATGTTATTGCACCTGGAGT
TAACATTTTGGCCGGTTGGACCGGATCTGTTGGACCGACCGATTTGGATATTGACACGA
GAAGAGTGGAATTTAATATTATTTCTGGAACTTCCATGTCGTGCCCTCACGCTAGTGGA
TTGGCTGCGTTACTTAAAAGGGCCCACCCTAAATGGACCCCAGCAGCGGTAAAGTCAG
CACTCATGACAACAGCTTACAATTTGGACAATTCTGGTAAAGTATTTACAGATCTTGCCA
CTGGCCAAGAATCTACTCCTTTCGTTCATGGATCAGGTCATGTAGACCCGAACCGAGCA
TTGGATCCGGGTTTGGTTTACGATATCGAAACTAGCGATTACGTGAATTTCCTATGCTC
CATTGGCTATGACGGCGACGATGTCGCCGTGTTCGTGAGAGATTCTTCTCGAGTGAATT
GCAGTGAACAGAATTTGGCTACTCCAGGAGACCTGAATTACCCGTCGTTCTCTGTTGTT
TTTACCGGTGAGAGTAACGGTGTGGTTAAATACAAGCGGGTGATGAAAAATGTAGGGA
AAAATACAGATGCTGTTTATGAAGTGAAGGTGAACGCGCCGTCGTCTGTGGAGGTGAG
TGTGTCGCCGGCGAAGCTTGTATTCAGTGAGGAAAAGAAAAGCTTGTCGTATGAGATTA
GCTTTAAGAGTAAAAGCAGTGGTGATTTGGAGATGGTGAAGGGGATTGAATCTGCATTT
GGGTCGATTGAGTGGAGTGATGGAATTCACAATGTGAGAAGCCCAATTGCAGTGCGTT
GGCGTCACTATTCTGCGGCATCCATT
SEQ 62
TCACATAGGAGCAAGATGACCTTGTTTGGACAATTTATCTTGCATCCACCTGTGAAGCA
TTTCCAGTGCTGCCTTAGGTTGATCCATTGGAACCATGTGACCAGCATCATGGACCTTT
AAGAAAGTTAAAGGCCCATAGTTCTTTTGAACACCTTTCTCTACACCATCTACTGCAAAA
GAAACTTGTGTGGCTTTTCCAAAGGCTTTTTGCCCTGACCATTTCATTGCATGCACCCAT
CTCGAGTTTCCTGCCCATTATAGCAAGAAATTGAGTTTAGTTGTCAATTACTAGGTTGTT
TTCATCTTTCAGTTTATTGTAACAAAATTATGTTTATATATCACATATAAAATAAAAATAGT
TACCGGAAATATATACTAAATCCGGTCAAAGAGAGATCATATCCATGTAACACAATGTAT
AGGAAGGCCCATTTTTTTTTTGGTTGGTCGGGTAATGTTTTGTTTTGACAATAAGTAGTG
TCACATGGATATTCTGAATGTAAGGTCAGTTGCCGATATGACTATATTATTACATGTAAA
TGTTATACATTTGGCAGCCTACTTGCACTTTCTTAGAACCGGCTTTAACATTTTTTGACTT
TGTATTTTAATTTTCTGTGATACATAGGTATAATTAAGTTATTTTGCTTTACAGAAACATG
CTTAATTTTGTCCATATATGAAATTAGATGTTGGGTCAGAGATGAATGAAGCTTACCAAG
CCAATTGCAGATAAGGTCATATTCCCCAGCATACACAAGTAGCTTGATACCATCCTCAA
GGAGTGAAGGAATTCCCTCTTCAAGATTCCTCATCCAGTCCAACTGCATTGCCTGGTAA
ACCTCAGAGCTACATGAAACAAACTCAATATCCCCAACACCAAGTGCCTTTTTAACTTGT
TGGTCATTGAGGAAAGTTTCCATTTTTGAGAAGTCATAGCAGAGATCACCCTCACATCT
CTTCCGCACGTCATAGTACTGCAACTCGGAAATTACATGACTTTAACTTCTCTACACTAA
CTAACAATGTAAAGAAATTTTTATTGTTACTTAAAAGATAACTACAGATAATAAGTGAGAT
CAGTAATTTGGAAAATGAGACTGCAAGTAGCGTGTTGTACATCATTAGCTCAAAGAAAAT
GAGATTGCGCTTTCTTACATTTTTGTCACCAGCAATGTCCATAATCTTGTTGAAGATGCT
TGTACAAACAAGATATGCAGCCATGCAAGCAGTTCCGCCATCTTTTCCTAATATTATTTG
CATAGAAATAAAGCTAATTAATGTCAAGATTATATTAGCTGCTGCTATATGAAGGAGAAA
GAACTTACAGGATAAACAAAAAATTAAGAATTACCACAAAGCTTAATTGCCAACTGACAT
TTTGGATATGATTTCTCTATGGCATTGTAATCAGATTTTTTTATCAATTTCATATCCAGAG
CATAGTCAGTGTAGGCTTTGTATTGAATTTCTGGATCAGTGAGTCCATTACCAATAGCAA
ATCCCTATACAAATTAAATACACTTGGTTAAGTTATCGGCATGATGACAAATTTAAATTAA
ACCTACCTAAAAGTTTAACTGAAAAAAAAAAAGAATGGTGGAGGAGCTAATGAGTTAGA
AATACCTTGAGATTTACGTAGATGCCTTCTTTATTTTTGTTTCCTTGGTGGACCCGAGAA
GCAAATGCAGGAATGTAATGCCCAGCATATGATTCTCCAGTAATATAGAAATCATTCTTT
GCATACTGAGGGTGTGCCTTGAAGAAGGCCTATCATCAAAAGAATTTGAATTAAATTTTA
TTAATTATATCAGTTAAACTTTAGAGACTTATCACGAGCTAAAAAAAGAAGAATGAAAGA
ATAAGATCAACCTGCAAGAAGTCATAGAGATCATTGCTTACGCCCCTTTCATCGTGACG
AATATCATCATCGTTTGAACTATAACTGAAACCAGTTCCAGTTGGCTGATCGACGTATAT
AAGGTTTGAGACCTGTCAAATTGCAATTTATCTTATGTTATCATCATTCTTCAACTAACAA
ATGAAAGTTGCATGTTTGATTATAGGATTTAACCAATGTAAACGACTTTTACAGTATTGTT
ATATATATATATATATATATATATATATTAACATGTTGTATTAGTCCGGTTACTAATCCCAC
TTAAATAAAGAGAAGCGTAGTAGTCATTGCTGTCAATAAGCGATGAACTACTTTTAAACT
TTTGAATTCTACAAGTCACAACTAATGAACAAGTGATAAAGAAAGGAAATGCTAGTAGGT
AAAAAGGTACTTTGCATGATGGAGCAAGGTTGAGTAACAAATAAAAACATGGAGGGAAT
TCTTTTAGACTTTTACCATATTCAAAAGATCTAACCGACGTTTCTTGAAAAATTAATTGGG
TAAAATAAAAAAAATAAAAAATAAAAAGCAAAAGGAAGAAGACAAGAACCTTGTCCCAGC
CGAAATCATTCCAGACAAGAGACATATTATCTGCAATTTTGAATGGTCCGTTTTCGTAAA
ACACAGCCAATTCACTGCTACATCCTGGCCCTCCTGTTAGCCATATAACTACTGGATCA
TTCTTCCTGCTTCTCGATTCAAAGAAAAAGTAAAACATCCTGCGAAAAACATATTAAAAA
AACACACAGATAATTTAGCATTAACTAATAATACCCATAAATGAAGCAAAAAGAGCATTT
AATCCAATTCAAACCTTGCATCTTTAGTATGTGGAAGACGATAATAACCAGCGTGATGAC
CCAAGTCTTGAACTGTAGACCCAGAATTACCAACATAAGATAAATTCAATTTCCTCTCAA
AAAGCCTCTGTTCAGAATCCCCTGTTGCTGCAGCCTTGTTAATATCATGTTTAGGGAATA
AATTAAGCTGTCTGATTAGCTTTTCTGCCATTGTTAATGGGAATTTAGGAGTAGAAGATA
GGAAAAACTCATCGTCATTAGAATTTAAAGTTGATGAGAAAGATAAGGAAATAGAAGCAA
GAAGCAGAGTAAGAAAGAGGGATGAAGGCAT
SEQ 63
ATGTTAGTTATCAGTGATTGTTATATAAATTCTTGCAAAGCTTTCAACTTTGTGATCAATT
TGCCCGTCATGGGACACTCTCACTCTCATTCTTCTCATTCTCACTCTCACTTTCACTCAT
CTAAATCTTCCGATGATCAAAATATGGATATGGGGGAATCGATCACCACCCAAACAGAC
GTTTCTTTCATGCTCGCTAAGCATGTTTTCTCCAAAGAAGTTAAGGGCGATTCCAACCTG
GTGTTTTCTCCTCTCTCAATTCAAATAGTACTTGGCCTGATTGCGGCCGGTTCTAAGGG
GCCAACTAAGGATCAGCTGCTCTGCTTCCTCAAGTCCAAATCCATTGATGAACTCAACT
CTCTTTATTCTCATTTTGTCAGCGTCGTCTTTGTTGATGGCAGCCCCAATGGAGGTCCT
CGTTTGTCTGTTGTTAATGGTGTTTGGATCGACCAAACACTGCCTTTTAAGCCTTCTTAC
AAAAAGGTTGTGGATAAAGTTTACAAAGCAGCTTCCAATTCTGTTGATTTTCAGTGCAAG
GTTAGGCCTTTATTCGTTTGTTTCATTCAAATCTTGTTTCTTTTGTGCTGGGGTTTAATAT
TCTTTGTTCATGCTGACTGCTGAAAATTGGTTCTTTAACTAGTATAATTGACCCTGCATAT
TACTCTCATCATAAGCCCTCCAAATATATCATATAAAATGGATATACATATAGTAAACTGC
AACTAATTAACTTGGGATTGAGGTATAAATGATTGATTGACCAATCTGACTTTAAAATAAT
GAAAAGTGTTAAACAATTAGGACAGAAGCTATATTGCTTAGCCTCAAGTAGTAACAAAAC
TAAATAATGTCAACGGTTGATACTCGTTTCACAGAATTGAGGCAGTTTAAAGAGTAAAAA
GTATTGGTTGTTAATTTGAAAAGTAAGATGAAAGAGTCACAATTCATCTTCATCAATGCT
TATTGTTTAGCAGGTTAGTTGACTAGTTCGACATTTTACTGAGTGGTAATAAGCTTCTTTT
TTGTAGGTAGCTAAAGAAGCCTAAGTAGTTCTAAGCTCAACTGGATATGTGGCCGTGCT
TAATTTTGTAAAACTTCAGTTTTTGGCCTAAATCTACACCCAATCAGTGCTTAAAATATAC
CATGTAAAGCATCCAAATTCTCACTTACCCCTTGCAAGTACTGTAATCAATCTTCTTACT
GCAAACTCCCTTTGTTGGCTAAAGCATATACGTGTTAATTCTGTCGTATACTCTGTTTGT
CTTGCTAATTGAATAAGGCTGCTGAGGTTGCCAATCAAGTCAATCAGTGGGCTAAAATG
AAGACAAATAATCTCATTAAAGAGATTCTTCCTCATGGAACAGTAAACAATATGACAAGG
CTCATCTTTGCAAATGCATTATATTTTAAAGGAGTATGGAATGACAAGTTCAATGCTTCA
GAAACAAAAGACCATAAATTCCATCTCCTCAGTGGAGGGTCTATTAAAGCGCCGTTCAT
GACTAGCAAGAACAAGCAATATGCAGTAGCCTTTGATGGCTTCAAAGTGTTGGGACTTC
ATTACAAGCAAGGCAAAGATATGCGTCGTTTCTGCATGTATTTAATTTTGCCAGATGCTC
GTGATGAATTACCAGCTCTATTGGACAAGATTAGTTCAGAACCTGGTTTTATAGATCATC
ACATTCCGTTTGAAAAAGCTAAAATGCGCAAGTTTCTTATCCCTAAATTCAAAACAACTTT
TGGTTTTGAAGCTTCCAAGGTTCTAAAGGGACTTGGCCTCACATTGCCTTTCTCCAGTG
GTGGCCTCACTGAGATGGTGGATTCCCCGTTAGCTGGGAGGTTGTTTGTTTCGCAGAT
TTTTCACAAGTCCTTCATTGAGGTAAATGAGGAAGGAACAGAAGCTGCAGCTGTTACAG
CTAGTGTAATAATGACCAAGTCCTTGATAATTGAGAAGGAAATGGAGTTTGTTGCTGAC
CATCCATTTCTATTCCTTATAAGAGACGAATCTACCGGTGCTGTGTTTTTCATAGGGAGC
GTGCTGAATCCTCTAGCTGGT
SEQ 64
TTAGCTTGAGCAAGCTGACTGAAGTTCAACTGCATTTTCATCATCAGTAAGGTCACTAGA
CATGGCATGAGGTATTCTATGGCGTTTCAATATACGCGATGTGGCTATTCTTGCCGATT
CATAGTTCAAAACAATCACCTTCTCATCATCCAAATCAAATCTTACATTCTTTTGGTTACC
ATCTTCCACAAGCTGACGTAAATGTTTCAAATTCAGAACTTCTACGCCATTAACCTTCTT
CACCTGTACAAAACGGTTCAAAATTTCCCCGAAAACAAAGAGGAAAAATAATTATAAGG
CGCCCTTATGTTGTCTTTTAAGGGAGAATGAAGTAATAGAAGCAACAAGTGTTTTACCTG
CAACTCGGCAAGGCGCTCATAACCAGCATTAATATCATCCATCAACACCTAATTAATTGC
AAACGGTCATATTAGCAGCATATAGCTCATTGATCTGCGAGTACAATCCTAAAGGAAAA
ACAGACCTCCGTCTCTATGTATAGGTTAGGAATCATGTTGCGAAGCAAAAGTTAAGGCA
CAAAAAATTGAAGTAAACAAATACAAAGTTTATTATTTATGAAGTTAGCCAAATGACATAG
ATTGTCAAGTAAAATAAGACATGCCTTCCTCCCATGTCAGAAATCTTACTGGAAACATTT
ACAAAAGGTTATCTAGCAGCAAATATCTTAATACGTTAACAATGTCTGCCCATTTGGATA
GACCCGTCCACCATCAGTTTTGTGTGTCTCAGTATGTATTTAGCATGTGAAATTATGGTG
GAACTTCATGGCGTCTTTATGCTTTTTTGTTTTTCTATTTTTATAAGAGATCAAATTATTTG
AGTTTTCTATTTGACAAAAGGATAAGAAACTCTAGATACCTGAGAAAGGATGATGAATTG
TTCACCAGGTTTCTTAGGTAGTTCCCGAAGGGCTCGTTCACACAACCGACGGGGTGAG
GCATTATACCAGTCTTCTCCATACTCGTGAAGGAATGGTTGAGTTAATGGAATAAAGAC
GAGACCAGCAAATATGAAATAACTCGGAAGCTTGTCAAATTGATGAACTGGAACAAGTG
GCTGCAACTGCATAATGTTAACTAGTTGTTAGATACAGTAGTCAGTTACACAGCATCAAA
GAAACCTTTAGATTACCAATCATACAAGTTGATAAAGTGTTAAATCTAAGGCTTAGCAAA
AAGTGCAGTCACCTTCCCAACACCTTACTCCTATGCATCAGATTCCCAAGAGCGACAAA
GCAAAAAATAACCTTGGTCCACTCGCACACCAAAATACTTCTAACAAAACAAAAATTTAC
CTTGCTCCAATACCGCACCAAAATACTTCTAACTACATAAATTATGTGACACCCCCAAAT
ACTTCAAGGCTTCTTTCATCTTTTCTATTTTTCTATTTTTGTAAAAGATCAGATTGTTTAAG
TTTTCCATTTAACAAAAGGATACTTGATGAGAGCAACTATAGATACCTGAAAGAGGATGA
TGCATTGTTCAACCTTCATATTATAACTTGTGCAAATGATGAATTTACTTCTACCGGAGC
GATAATGTCAATAAGTTATTTTACCAGAGTAATTCTAATTGAACAATGTAGATGCAGTGA
TTTCGATCTATTCGTCTGATTACATGCACCCAGGATATGAGCAGAAGTTTGAAAAATCAC
TGGGAGATACATTCGAAACAACAATTAATTATGAAGGGTTAATAGAAGGGTAATTATGAA
GGGTAAATAAAAGTGACTTACAGGATGAAGCGTGATTTTGAAGTCATGCACTTTGCCAT
TTCTCAAGACTTTAAGTTCAGCAGTTTCATTAGGTTTCTTCATAGATACCAGATGGTCAA
ATGTGATCCTCTCTCTGTTTCGGAAAGGAACTGCAGATCATAAGCAATCTCAAAACTTTA
GTCCAGGGAAATAAGGTTGCTAATTTTGATATGCATTTTAGTCAAAATGCAGAAGGGCA
AACTTTTCTAACAGAAAATAAGTTATTCTAGCCTTACAATTTTCATATCCTGCACACCCAC
TTATCTGGTTTCTGGTATAATTATCAGCTATTCACATGGCAAGAGAAGAAACTCAGAATT
AAAAACGACTAGACTCTAGGCTTTTCCATCTCTCAAAAGGAGGTCATTAATTTGTTACTC
ATAGTTGCTCGTTGAACATGAATACTTTTTAGTTGGTGCTGGCCTGCCTAAAAGAGCCT
CTACAGCAAGAACCACATCCATGTTTCATCCAAATCCGTCATTTTACGTCACGAAACATT
TAGAGAAAGAAAAAGGCATGTGCCATAAAATGTAAGAAAAAGCTGTGAAGAAGAATGTT
GTACTAGTTTAACTGTCGTACATAAGAATTACAAGAAATAGAGAGAAGCAAGGCAATAA
GAGAAAGCCTTACCTGTTCCATCATTTGCTATGGGTACGCCATCAAATGAGAGGATTAT
GTCGTCTTTCTTTAATACTCTAGAAGCATCAGAAAGTGGGTTGATTCGGCTAACAAGCA
CACCTGTCAATTTGGACTGCATTTGGAAGTACTCTCGAATTTGTGCATTTTCAGTAGGTT
GGCATGACAAGCCCAGAGAGCAAAACCCAATGTATTCACCCCGTTCTTCTACTCCAGCT
ATAAAATGCTTTATCACAGGAACAGGAATAATGTAGCTGCAAAATCCAAATAGAACTGAA
ACTTTTAAGGCCAACTATGCACACTGTAAATTTCCTTTCCAATTACAACAGTTTTTCTCCA
AGGTAGATTGTTTAATCAAGGGTATCAGTTTAGATTTGATTGTTTGCAGCACTGAGGAAG
AGTCGAAATAGAATGAACTTGAATTCGAGACCGCTATTTGAAATCATAGTGAAATACTGG
AATTTTTATCTCATGTCTAAGAGCTACTAAATGTTCCCACAAGCTAAGCAAATGTTGATTA
AAACTAGTAAATGTCATCAACCAAATCTCCTTATCAACTGCATGTCATCACAACTAAAAG
CTTTCAGGCATTCCCACATATGCCATCCTTTGTAACCCCTCTGAGATGAAAAGAATAATA
TTATGAAGCTAGAGCCAAAGGGCTACAACTCAAGCTTCAAATTTGTGAATGCATGAACA
AGGACTGCGTGAAGGGAAAAATCTGAATATTATGACGAAGAAAAATGGAAGAGAAGATT
TGCAGGAGAATACATGTGTGAAGGGAAGAATAACATGCAGTCAGATTCAGGTAAAGGA
GAGAAAAATCTGAATTTTGTGATGATGCAAGTGGATATTAGAGTATATACCCCATCAATA
TTTAACTAAATAATATATGGTAGACCCAGCAATAAATGATGCAATGGAACTAAATCTGTA
TAGTAGTTCTATCCCTCCGGGGTAGGGGTAAGGTCTGCGTACACTCTACCCTCCCCAG
ACCCCACTTGTGGGATCCTACTGGGTTGTTGTTGTTGTTGTAGTTACAGAGACACAAAG
TACCAAAATAAAATTCTAATTTACCCAATATTCTCTGCACCAGAGAGGTTTTGGAAAGCA
ACTCCAGCAACTTTGTCACCCATAATTGCTGGTCCTCCACTATTCCCTGGATTTATAGCC
GCATCAATTTGTATTGCCAATAGTTGACTAGCGCCGTGTACATATTGCGTAGGTTCTAC
CCTTGAGACAACACCTTTTGTCACGGATATATTATCTCCCCCTAGAAAAAAGCAAAAATT
ATTAGAGAACTCCTCAGGTGAAGGTATTTGTGGCTTACAATTAAGTAAAGAAAAAAAAAA
AGAGAACATAGTATAGTGAAGAAAACAAAAATATAACTAGACGTCAACAAAAGATTAAGA
AGGATCTGCACTATTGAAGACAAGAATCTAGTATATGCAAGCTACAAATATCCAGCCTT
GCACCTAGTTGACACCAGAGAGAAACAAAATACATCATGAAAGTTTCCTTTTCACTCTTC
TGGATCTTATTTGTTCGCTGCTCGTATGAGCCCTCGAAAAGGGTACACCGAATCCGAGA
AAATAGTGCACTCTCTCGCGGAGTATAGCTCACATCCAAACATCTGATTAGGGAATGGG
GCAATGCCCATGAAGCTCTGGCGGAAAGGGAAGGCATGCCAGGCCGTATGCCTATGG
GTGCACAATTCTTCGAAAAAGCGCATGCTACCTCGGAGACCTGGGACCTTGGCTTAGT
AATGAATGAAGGGAAGCTCTTCGAGCTTTCTCCGCCAGCGGCTTATGTAGTGGTCGGC
CTTATAAAGCTCGCTAAGCCTCGCTTCCCTCTCCCCTTCACTTATTAATTAAGTGGAAAA
TAGTCGTCGGCATTCTATAAGCGACTTGACCGAGTCTACGAAGCTTTGCTTTTCTTGTA
GTCGGCCCGTAATGCCTCCCTTCATTTGCTTGCCTCCTTTCAGCTCAGAATGCCTAATG
CCTATTATTTAGTGCTAGAAGCTAACCGCCATAAGCTCACCCTTCGGGTCTCGCTTCCA
GCGCAGGAGGCCAAGCATTCTGCCAGACGTCCCGGCCTGGGAGCCCCGTTCGCCTTT
GGTNGCTTATTAATTAAGTGAAAAATAGTCGTCGGCATTCTATAAGCGACTTGAGCAGA
GGAGACAAGAAAAGGCTCAGTGGTAACTGTTATGAATGTGGAGGAGTGGGGAATTTTT
GTCGTTACTGTTACAATGTTCAGACGGACGAATGCTGCTAGGGGAGTGCAGAATAGAC
CAGCTGAACGTAGTCGGAGAAGCTCTAAAGGTAAGTTCAAGTCGGAAACATTGTGTTTC
CACAAGAAAAGAAGCACATATTTTTTATAGGATTTGATTTGATAAATGGAAGGAAATTGT
CAATTGAATTGGCAGAAAACTATGGTAAAGGGGAGCCCAACGTCCTCAGCAGAGAGGA
ATGGCAGATGCATTCCCCCCNAATGACATTAGCCTAATTTATTTGAACTATAGACCAAAA
AGAGCCCCATTGGAGTGAAATATATATAGTACTCATACTGCTGTTCCCAATTGATTTGGA
TTGAGGTGTAGCTGATTGGTTGATTTCGTGAGAAAACATGAATAAAAGGAGCTCACTTG
TGAATTGTGATTCATCTGCCGTTGTATTTTAAGACACGAAAAGAAGAATGTGAATTTCAT
ATTAACTCGACACCGTATCAACAACTCAGTTATTTTTATTTTTCTGAAAATTATATCAATA
ACTCAGTTATTTCGAATAGGTATAGCATCCCATTGCTAAACATGAATGTTGTATATCACA
CAAACAATAGGGGGGAAAGGGAGGTTCAGAGTTCGATTCAAGTGTATGGGGGGTTGG
GTGTTTACATTACATCTTGAATCAGTTACAAGAAAAATAATTATCTTGAGGATGACCGCT
GATTAAAAAAAAAATTACCTTGAGGATAACCAACAACAGCCACAGCTTCTTGGAGAAAT
GGAACATCACCAAGCTCCAAAGAGTTCATGCCCTCCCAGAATTCTTCACTTTCTACCAC
CAGAATAGCCAAGTCACATTCATGACCAACAGCTTGCACTGTTGCTCTATACTTGGTAG
GAGAACCATGCTTTCTTACAAGTACAAACGTATGATCAGCCACAACATGAGCATTTGTTA
GGATCCTCTTTCCCCGAATAACAAAACCTATAACATTAAGAATGCAAAATCCGAAAGTAA
GCTTTATGTTCTCTTAAATTATTAGTAGGTCAGATACTTGATAAGTTGATATGAGCTCTGT
ATTCTCTTCTTTTAAGAAAAAGAAATTATTACACATAGAGGGTGGGGAAGGGGAAATGG
GGGAGGGGATTACAAGTGGGGAATCGAAACCCAAACTTGTAACAAAAGCAGATAAGAG
AATTAACCTTCCAAAATCGAGTCCCCATAACAAATCAATCAACAATCCTCAATCCTAAAC
TAATTGTAGTTACAACATCAATAATTTCATGCTTTAATCCTAAACTAGTTGGCGTTGGAGT
TGGGACAACAAATCTCTAATACTTGCAGCCCTTGGGACAACAACAAACTATTTATCTCAA
TCAGTGGCGGAGTCACCTTATACCAAGGGGTGTCAATTTGACACCTGACACCCCTTCAC
GGGAAAAAAATATACTACATAGGTAGGTAAAAAAAATTATATATATGTTGACTCCCCTTA
ATTTTTTCGTCTATTTACTTATATATATTTTGACACCCCAATGAAAAGCATGCCTCCGCCA
CTGATCTCAATGAACAACAACTTCATAAAAAGTAGCCTTTTGACAAGGCTTCCTTAGTAA
ATGAAGTGCCAATGTAAGATTTTCACAATAACCAAATGGCTAGTAAAAAAACGGAAGCAT
TACACTGATAGAGAATAACATATTTAGAAAGTAAATGAAAGAGAATAAAAATACCAGAGC
CCGTAGTTTCACGCTGGGACTTGTTCTGCCATGGAAGGAAGTAATTAGGACTACTGGAA
ACAGTGAATATTTTAACTACAGAATCCAATGCTAGCTCTATTGCTAAATAAGCATCCACC
ATTCCACTACTTAATCGCTGCTCCACCGCCGCCACCGGTTCTACTTCCGATGCATTGGA
TTGAAGACTATCATTTTCCTCAGCTCTCTCGACATGAGGTGTCGTAGAAGTTGTTGAGC
TATTGCTGTTATTCAAGGTGGAAAAAAATGAGGCGGAAGCAGAAGTATTACCAACAGTG
CTGCTATAATTGCAGCGCCGAACAAATCGATGAAGCTCTTGACGTCGGTGATGTACCG
GAGCTACATCTCCGGCGATAATAGGGGATTGAAAGTGGAGATTTCTGTTTAAGAGCTTT
CGTGCTGTACGGAGACTTGGACCTATTCGTAACAT
SEQ 65
CTAGCTAACTTGCTTTTCACTAGCCTTGTAATACAGCATTTTCTGCACATAGAACATCAT
GTATCCTTGCGCAGCTCTCACAATGTTTTCGCTCACTTGAGTGATCCAAGCATCATCAC
ATTTGTACCATTGATTACTTAACCTAAGATATGTTACGTAATGACCAGCATCAAGTTTAC
CGGTATGGGTGATGACAGCAAACAGCTCAAACTCTGAGGACGATTCACAGGACGCATC
TTGCTCGTCCCCATCAAAGGAGAAGATTCTATTTCCAAATCGACTCCTCAAGATAGATG
AAGAAAGGTAAGGCGACATGTCCAAGGAAAAAGGAAACTGTAGGTAGTGATCAACCTT
CCTTGACATTTTCTTAATCACAGAATGCTCAAACCTTTTGATATGGAAGCAAGAAACCAA
AGGCAATTTTCTTATGGACATCTGTTTAAGAGATTCCTGTCTCACTTGACAATGTTGGCA
GAAGAACTTCTGATCAGAACCCAATTTCTCAGGTCTTGTGAAATGATCTAAACATCCCAT
CAACGAAGAAATTCGACCGTTTTGGCTAAACTTTCCAGATTCTGCTTCCTTCTTGTGAGT
ATTATGAGACTTCTTTGATGTCATCTTTGAGGAACTCCCCTGGCTCAGTTCCAAGTCCAA
GGAGATGTCTATACATGGATCATATGTAGTAGATGTGAAGCCACAAGCTGTACACATGA
CATCAGACCGCAAGATTCCAGAAAATACTCTATGAGCAATGCAACAGTCTCCACTACCT
GTGCCAAAGAAAAAGATTTGATAGAATGAGATCACAGCTGAAAAAACACAATGTGTACT
CCTAATAAGTTACACAGAGTAACATACTTAATATGCACAGCTTAACGTGCTCCTCGACAC
ACTTCATAACAAGTAAAAGTCCAGTTATGCCATAAAGTTCTATCAGCTATTGCATTACTA
GACAAAACAAAAATCGTCCTATCAGGAATGGCAACCAACAGAATGAGTAGCTAAAAGCA
ACGGAGTACTATTGAAATAAAAGTAAAAAACTCAGTAAGAAATGTATCCGCAAGACACAT
TTATTGCCTCAAAAGCTCTATCTATTCCATTTACACAAAACCATTCAGAATCATAGATAAT
GTGTATTCATTATTTAAATGCAATACCCAGATGTTAACTAAATTTTGCAAACTCAGGGCA
GATAGCCCTGCACCAATCCTCTAATTAAGAGGAATGAGAATAGGTGTGAACTCTCACAA
TCAAAGTACGTGTCACTTAAGTGAGAACCAAATCAGCATATTTACAATGGTAGAAATGAA
TTGTTGCTAAGGAGTTACCCCTCTAATGATTGCTCTGCTAAAGAAATGCGAGGTACCGA
GGAGCTAGACTCACAAGAATAATAAGGCACATAAGGGCTTATAGAAAGGAGGAAGCAA
AAGGCTGTCAACCTAGTAAAGATTTCAAGCTTTCTTGAAGGTTGCGATCAGCTCACTCG
AGAGTAATACCCTAAAACATGGTAAAATTGCGAGTTAATACGAGAACTATTGTACCAGAA
GAAAACTCCTATGCTGAAATCAATAAGACTAACCAAACTGAGACTTACTGGGGAGAGAG
TAGGAGGAAATCTTTTTCCTTTTGAGAAACATCTCTAAGCCTGACAGCGAATATCAACTT
GTAAATCGCCTATGGCAAAGAAAAGACCTAAACCATAACCTGCATTCAAATATTACTATT
TTTTCTCAGTGACAATGGAAGTTGGGATTGCCATGAGAGATGAGATGCTAGAACAAAGA
TAAGCCCATCAAGCAGGCCCTGAGTGCCTTTTAGTCAGACGTTACTAGACATCACAAAG
ATGCTTGAACACACTATTCTGCTTCTGACAGAAATTGCTTCTTCCACCCCCTCCCCACCA
ACAAAAGAAATTCAAAAAATTCACCTACTGAAGGACTTGACCTTTGCAAGTACCTGTACA
AGTTTCAGTAATCCACTTGTTTGAGGTTTTTACAATACTAGCCTCCCTTGGCTATGTTAC
ATTTATGTTACTTTAAAGTTGCTGCCATGTGACCTGGAGGTCACGGGTTCGAGCCGTGG
AAACAACCTCTGCAGAAATGCATGGTAAGGCTGCGTCCGATAGACGCCTGTGGTCCAG
CCCTTCCCCGGACCCCGCGCATAGCGGGAGCTTAGTGCACCGGGCTTCCTTTTTTTTT
TATTAAAATATATATAAACAATGTTGTCAATAATTTTCCCAGTACAACAAAAAAAAGAAAT
CTCAATGATTGGTCTAATTCGGAAGAAAAGGGAAAAAGGAAGTATAAGAAACTAATATA
GGCAAGGTGATGGGCGGAGAAACGATGGGCAACTAATAGGTACTCTAATGCAACAAAC
AAATTTACCTGGACTCAACGCCTTTCCCTTATCGTTCTGCATCCTTTCATGAATCCCGTC
AAGCACGGAAATGAAAAACTCATGAGCATCCTGCTGTTCATAACTTGCAAGATTTGATG
CATGCTTCCACCAGCTGTTCCAAAGAAAGGTTTACATCCATCTGAATGAGACACAGCAC
AGACTGCAACTAGATACTCAAAAAGTCGAAATCCACATCTAATAAAACAAATTACAAATG
TATGTATATCACAACCAAGTTACACACCACCAGTGGCGGAGCCAGGATCTCCGCGAAG
GGGGTTCAAGAAAAAAAAAATCGTAGCTAGTGGGAATTGAACCTATGACCTTTCAAAGA
TTTTGAACCCCCTTGACCACTAAGCTACACTTATGGTTGTGTCAAGGGGGTTCAAAACT
TAATATATAGAGGTAAAAAACAGATTTTGCCTTATATATACAGTGTAATTTTTCGGCGAA
GGGGGTTCGGGCGAACCCCCTTTCGCCCCCCTAAATCCGCCCCAGCACACCACTGTCT
AATTTCACCTCTATGAAGGGAAAAGCGTGGTACATACAATCCCAATAATAAAAAACTAAT
CTTGTCCCTACATCATTTTCAAAGAAGTGCACAACCCAAGCTAAATTAAGGAATCTTTTA
CCAATGTATTGTTATCCTTATAAAAAAGAATTATATACAACTATACCTCAATCCCAAGCAA
ATCGGGATCAGCTATATGAACTTCACAACACACACACACACACACACAAAAAAAAAAAA
CGTGCACATTATAACAAAGCCAAACATTATCTCAACAAACCAAGAAAACATGATCAAATG
CAGACCTGTAAAGGAACTTTGCAGGACTAATAGGGGTCCGATCGCCAGAGAAAACAGC
AGAAAACATTGCATCCAAATCACAAGCCAAACACAGCATTGTTGAGTTCTTATTCCCATT
ATCACTACTACTCCTTGTTATAACACTGCTATTCTTTCGCTGGCAAAAATATCTGTTATGC
TTGTCACTCAGAAAGTAATTCCTCAATGGTGGTGTATGAAGCAATGCTTGAAGCACTGA
ATTCATAAAACACGTGTTTCCAAGATTGTTAAGGCCCCTCAAACCCCATTGTACTTCTGG
GGTTGTTGAGTCATTGCCGAGCTGACTCGGCAACGGACTCGAGTTCCCAACGATCAAG
ACTTGCTCTTTCACATCAGGCGTCCACGGTTTGTACTCCACGCGCCTCCTCTTGCGCGT
GCTCTCCGGATGCGGCGGAGGGTCTTGTATTGATCCGATCACGGTGGCCTCCGTCTGC
GCTAGTGCCACGGCGGCGTCGAAGTCGCTATTGTACACCTGGTCCCTACACCCGCAGC
AGAACAGCTCGGCCCGGTCGATGTCCACCGCGATGCAGTGCAGCGAGGGGTCCGCAG
CGTTTCCCGCCGGATGTGACGGCGCGTGCACGCGGCAGAATACCTCGGCGCACGTGA
CGCAGGCGTACAACCGCGGCGGCGCGTGTCCGCACGCACCACATCTCACCAGCTCAT
TCGGCGGCTCGCGGCAGAT
SEQ 66
TCATAACTTACTGTGCACGAGCTTATTTGAAAGACGTTCAACCTTGCCAAGTTGAGTTGC
GCTCTTCAGCATTGCCTCGCTGTCAGCTATAAACTTTGCAAACTTAGAACCCTTTGTATC
ACCATCACTATCTCCTCGAGGAAGCATTGGCAGATGAAGCATAGAGGGACTTTGCTTG
GAAGATAAAGGCGGAGTTAATGCCCACACAGAAGAAAGCTGTTCATCTGGTTTGTCAAG
ATATTCCAATGATAGATTCTGCATGTCTGTNTAACGGTGATAGATCCTGCATGTCTGTCA
GGAAGAACAGAACAGAAATATCAGACGTTGGTATCGTGCCTACCAAGTGTTATTCGGAA
CTGATTTCTTGAGAGTTGCAAAACTTATTAAGAGCTTGCAGATCAAATGTTACTGCTTTT
ATTCTTCATACGGGAAAAGACCACCCATTATCTGAAAATGGAAGTATCAGGAACAGTCG
AATAAAGTACCTTCAACAAACTTGAAGATGGGCTCTAAAGCTGCACATGGAATGCTGAA
GTTCAAATGTGGAATGACCGTCCCTCCACCATGTCTAGCATTACTGCATCAAAAATAAAA
CTACAAGACATGAAGTTTCAACAAGAAAGATATTAATAGATGGAAAAACTATAGAAAGAC
CATGATATGTGAAGATTAACATAAATGAGATGACTAAAAGATCTTCTGTATGAGATCATT
CAAATTGGACACCATTATTTTTCCTGTATGAAAAGCGTATAACCAATATACATTTTTGGTA
AGGACAATTAATATACATTCAAACAGGAATATCTTTCTTCAAGGACTTCTCAAAGTACTC
CAGGACGCAGTGTACCACAATATCATTAGATTGAACTTCAAGAACAAACACATGTAAATA
TACATAAGCTGAAAAAGAAATATCCTCAATTATAAGCATCCCCAGTTGTCATCCAAAAGT
TAATTACCTTGTTACAAGGGCAATCATGTGTCCTTCTGAATTAACAACAGCTCCACCACT
ACCACCAGGGTGTACAGCAGCCGTTGTTTCAAGCATTGCCGGAAAATGTTCTCCTAAAC
TTGATTGGTTGAGCAGAGACCGCTTGGCTTCAACTACCTTAGCTATTGCACCCACACAA
GCAGATGGAAGGAAGTCTATATTATAAGAAAAAGTAAGACAAATTACAAATACAACTAAA
GAAGTCTAATATAGGTATGAAACATATGAAATAAATATACGTAGTATATCATGTTCAAAAT
GAAAGAACTTAACAAAATTATTCACATGAAAAGCAATTTAACCTCCAGAACAACATGATT
TAGTACTATTGGGCGCACAAAGATAGTCAGTTCCAGAAAATTATGTTCAGCAAAGGTTAT
GGAACAGACAAGTTATCTGTATCAACGAAAAAAGATGGAACAGACAAGTTAAGATTGCA
TCAATAAACAATAGTAGCACTTGCAACAACCTAGCTACTATTAAAATATCCTTGAGATAC
AGCCCGACTCGAATAAGTGAGTTACCAGGAATTTCCTATTCAAAAATCCCATTCTTTAAA
GCTGATCATTTGTACTTGCTTTCACAATAGAAAACATCAATTTAATGCTCCAGAAATTTAC
CTTTTTTCGTGATAACTGATAAGTGACTTCAAAACTCTAGATTTGATTCCCCAATTCCACT
TTGTTAGCATAGGTATTAGGTATATATCATTCTTATGGATGAAGATCTGAATTAGTGCCT
ATGGCTTTTATTAGCCCACGAAAGGAAAACGCTTCTTTTTAATTTCGTCTACCTTTCTCC
TTGTTCTGCTAGCCTTGTTTGAGCCCTACAACAACCTCGCTATTCTTAATCTGACGTGCA
ATTTTTTTTAACCAGAAGATCAAAACACTGACTTGGACTACAAATCAAATTCAGTATCAGT
AAACAATGTCTTCACCTAAAAGATTACCCAGTTTTGAGCCCTCCCGACCAATCTGGTTAT
TATTTCTCCATTGGAAGACACCTCAAGTTCCCTTGCGGAGGCGTCGACATCACCTCATA
ACTACTCAATCAGTCATCAAATGATCATCATTTGGTGAAGGAAGAAACATCAAGTATTCC
AGCAGTAACAAGGAGATGAAAATGATATAATACGACCCAATCCTGCAATTGATTATAATG
ACACTTCAACAAATTCTTAACACAAGAGAAGCAAGGTGGAAGAGAGAGAAATTCAAGAT
AAACAAAGTTTTTGTAGAATATTCTAAAATTTCAGATTTACTGTGATGCGTGTGTCAAAAT
AAAAGTAAAGGCAAAAATATTTTATTTAGACAACAAATCTAAAGCAAGATTTACCACATC
GTGGTCCAAATAGCCCATGTCCGAGAATGTATGCTTTTGATCCAGGGGACGGGCACAT
GAAGTCAGCAGTAATGGGACAGAGCTGATCAGGAACTAGCTCAAGTTGTAGTAATGCA
ACATCCAGAGGTCCTCTGGAGACATGAACTACCTTTGCATTTGTCCATACCCAGGGATC
CATAAAATCCAAGCGAACACGAATGATCCTACTGCCTGTCTTTGCCAGGTTAACTCTAA
AGCTACCTTGCTCATTGTCAACCAAGAAATGCGGAGTTTTTAGTTCCTTTTGAATCAAAT
GTTTATTCCTACGCTGAATGTCAAATTTCTCAACCCCTGGATGCTCAGATTGATCAGAAG
GGATGAGAACTACATCAGATTTGGTGTTATATCCTGAACCGTTTACAGATGTTTTTCCAA
ATCTCCATGGCTCTAGAAGATGAGCATTTGTAAGAAGAAGACCCTGCTTGTTGAGCAAA
ACTCCAGAAGCCCATGCTCCATCATCAACAGTGATAAGGCAGATAGATGTCATTGCCTT
CTCAATCAAGGATGGGGGAACAGGATCTATTAGGAGATGCTCTCGAGTATCATTGGAA
GGTCCATTCCGAATATTATTGGAGGGTGATTCATTTTTAACACTGATTAGGTTTCCATTA
TCAAAATGGATCTTTCTCCTAGTTTGTAGCTCTTCTTTAAGCAGGCTACCACAAGCAGAT
GTAATAGCTTCCCATGGAATCACCATCTGCCACCCAATGCTTCAATTTCAAATTAAAGCA
ATTTATGTGAAGATAGATCTCAGCAAGAACATAGAGCACATTATGAAATGTTTACCTGAA
TTTCAGCAGCAGTAGCCCTTTGTCTGAGTGGCCGAGACAGAACACCAATAAGCTCTGC
ATGTTCCCCTAACACTGGGCTACCTTCCAT
SEQ 67
CTGGATATAAGTTTGAACATGATCCTTCCAGAGAGAACTTGTCTTCAAAACATCCCTTAG
GGTTGATAGAACTTCAAAAGATGTCGCCTTTATAACATCGTCATCCTTGTTGTACGGCTG
TTCCTGCAACATAATTAAGAAAGCAAATTTTAGAAACAATTTCAAAACTGAACATCAAGC
TTTATGAGTTTATCATTGCTGGGTAGAAAGAATGCACAAAAAACTGAGGCATGCCTTGA
GATGATCAACTTTCACTGTAAGGGGTTCTTCACTGACCTGCAAATGAATGCCACTGTTT
CAGTCCTCAAAGTGTTACTCTTGTACATATCACAAAATGGGCTATCGAGACGGGAAAAA
ACAACATGGTGAATAAAGTAAGGAAATCTGACAGTGAACATTCAAGTGTGACAAATCTA
CTGCATAGTCATCAAATTGTTACACTAGTATATTGAAGAAAATTTTTATTGAAACAAAAGA
AATAACTACATAAGGGAATCTGACTGCAAACATCCAACAATAAATAAACCCAGTGTAATC
CCACATGTGGGGGTCCAAGGAGGGTAGTGTGTGCGCAAACCTTACCCTACCTTGTGAA
GGTAGAGAGGTAAACATCCAACAGTATGTATAATCTAATAGGAGAATTATTAATAAGGAG
CTCAGGGGTTGAGAGATTTAAGTCCAATAGAAGAGGATAAGAGGTCCTCAATCACAAAA
TAAAACGAAAAAATAAGAAGACATAAGACCCTCCAATGGCTAAGTAAGCCCATAAATGG
CTGACCATGCTTGAGTACTCCAGCTTTACAGCTAAAGCCAAGAATACGACTCTATACCA
AATTAGAGCTCAAACAAATAGGATTCTATTCAATCTTTTATGTGTTTTATATATGTTTAAG
CCCCAGGGCTTTGGTCTAGTGGTAAGAGTACAGCCCGTGATATGTAGGTTGGGTGCAC
ATCACAGTTCGTGCTCTGGCGCAAACAAAAGCCTAGTATTTAGGTAGACAATGGTAGAA
TGGCGAGCCCATTATCCACCGAGTTTGAAACCATGAGCCACTTACCCTCAGATTTCTCG
GTTATCAAAAGTCTAGGATTTCATTCAAACCAAATGCATCGGCAATCAAACAGATGCTAA
ACTAGCACATCATAGAAGAAACTCAATAGCTTTTCTCTTCTACAACCCCGAAGTGCAATA
AGAAAGACATTCCTATGACTTTGCAGGAAACCAGTTGTCCAAAATTATTACAAAGTGCTA
CTTTCTGTTGCAAACAAGCAATCAAGAATAACTAAAAGAGTGTTTCATTCTATTGGTAAC
ACATCATACACAACTGGGAAAAACAGCTCATTTAGCCTCAAATTGCAACAGACCTCTATC
GCTTGACAACGCAGCTGCTCAGTTATATGAAGCCAAAAGGAGAAAGACTGGGTGTTGG
GACACAAAAGGTAGGAAAGAGAGAAAAGTATACAAGTGCAATGAACCCAAGATGTCAAA
TCGGCAGTCTCAAAATCAGGAGATGATGGATATGTAGATGAAAAATACAACTAAATAGG
ACACAAAGAAAACATATGGAGCTACTTATAACCAAAAAATAACGCAGACATTTACTTACC
ACCTCTGTCATACGTATCCGCCTGTGACCAATAAGAATAACCTGGTCGTCTTTAATACTT
GTTATCTGGCTCATCATAAGTTGCATCAACCTTGATCAGAAACTTCTTCAATAACTTAAA
GCAACAGAATAACACAAGGAACAACAAGATGTACCTGAGCAAGTGTACCAACTTCATGA
AGACGGTTCAACATGTCTTTTCCTTTAAGCTCATAGATGTTTTTTTCTGTATCTGAGGCA
GACACAACATTAGGATCAGTCCCTTGCTCGTCTTTCATAAGGAAAGCGCCAGCATAAGG
TGCTTGCCTTTTTCGACTTTCCAGCAAGGCTGCTAATACCTTGGGATCCTGCAGTTAAAT
TTACGGATTATGTGGAGCCTCAGATCAATTTGTTGCATGCCATTACAGTATAATTTATCT
TTTTCTGGAAACTAACAAATGAGGCTAAAGAATATTACAATTGAGCAAAGACGCTCTCCG
AATTGAAGGTTTTAGCTTTTCAAGCTTACAACTTCAGCCAAGAACTTTCCATCAATTAAA
GTAGAGCATTCTTAAAGCTGAATGCTGGTTACAGACTTAAGAGGAAATAGGGTCCAAAA
CACTAGCTGAAACAGTACAACTCTCAGAAAGTTGACAGTAATTATACACTTAATGTTTAA
AGCTTATATCTATTTTATTTTGAAAATTACGAAGACTAAGACATGGTTTAACCCATCCTAG
ACAAGGTACCATACCTTCACATAAATATGCATATAAAACCCTGGAAATAACGGTCTGTGT
GGAAGTGGCAGTGCTAGAACCTGCACCAAGAAAACAACATGCATCATTATAAATTATAA
AGAATTAAAAAAGTACAGCCCACACAGACTTTATTTATTTTTAGTATAATAAATGACATAA
AGCCCTTAATTGCAAGTCTTACAGGTAAATCTTATCCAAAGATAAAACCATAACCAACCA
AGAAGGCAAGAAATGGGTCGCATCCATTAAATTTACAAACAGATGAATTATGAGCAAAC
AAGAAGCCTCCAAGAAACAGTGTTTTCTTTGCATTCGTTCAGTATCTCTCTTTTTTAGGG
AAGGTTGATAAATACCATTTTTACACTGATCTTGCAACAACTAGCAAAAACTGAAATACA
CATATGAATCTTCCTTTCAAAGAAAAACGAGATAGTAGGGCAACCTATTACTTTATCCTC
ACATGTACTTTCCAATGCTTTACTTCTTAAATAACCTAGAAGCTGTTGATTAAATTGAAAC
TAAAGCATGAAGTAACCCTCTCTAGGCCTCCAAATATAGGATGAAATGTTAAAAATTAAC
ATCCATCAAGGTGCCTTCTGTACGGCCAGAAGTCAGAGGCATGAGGCATAACTGGTCG
AATGTTGACTCTGAAAAGAGCATATGGGAAGGAGAGTATTCTAATTACAATGGACAAGC
AAACAAAAAAAAATTCTTTTGCTGAACAATGTGAATCACCAGCTTCTTGCAGGTGAGAAA
TCATTCAAAATCATTAGGAACTTTCTGAAGCTTCAGTCTTGACATAGAGCTGATCAATTT
GCCTCTACAAGGTGGGAGTTACACATGGTCCAGAGAGAGACAAGGCAGGAAGTACTCA
AGGATCGACGGGTTTCTAATATCAGGAGATTTCAAAGAAACAATCTCAACCAATACGGA
TGGTTCAACTGACAATGTGCTCAAAACAAATCAATCTTCTTGATTATGGGATTCTCTGTA
AAAAAAATAAGCCATTTAAATCTGAAAATTGTGAGACAATAGAAGAAGCAGATGAAAAGA
TAAGTTGATGAATACTTATTACATAATACATGACACTTAATTTTCAGGTGTGAAAAGAACA
ATAATCTTTGATGTTGGACTAACTTAACATTAACATTCTACTAACTAACATTGCCATTGTA
TTATGTGTCCTCGAACTCAAGGTAAGAAGAAGTGAAGAACACCTTGAATTTGCTTTTCTG
GAAAACACGGAATAGTGAACAATAGTCGTAATAGACTAATAGTAAAGCATCACCTAGAA
AGTGAACAGTTCAAATCGTGATCAGTCACCAGAACACCATAATGGATTTTTTCGCTGAAA
GTCACGAAAATAGCCCAGCTAAAGAAATCATTAATATTCAGCATAGACTAAGGGAACCG
ACTCTCAATGACCAGAGAGAACGCGATACCAGGAATCTCAAAACCAACCTTGGTGTGG
CGTAAGGTGGCAGCTAATGTGGCTCGAATTGGCGGCTAAACAGTGACTAGAATAGCAA
TGAAATAGCGTCCAAAATAGATATCTGGAACAATAATCATAATACTGGCCTGAAAGATG
GTAGAAAACTCATTGGAAGAATGGCCAAAATAGTTCTTTTACACTGTAAATGGGACATG
GTTGGGATGGTGGCAGAAACTCAATGGAAGAATGGTCAAAAAGGAGGAATTTTACAGC
TAAAAAGATGACTGAGAAAGAGGCATGATACAGTCAAATGGTCTACAGAAATAGGTATG
AGGTTAACGATCACCAAAAGTGGAAAGAATAGTAAAAAAGTTGACCAATCGGGTAGCAT
AGGCTGCCCCTTATTAAGATCACCAATGGTATGTGATACGATCTGAGAAAATAAGAAAG
AGGATGAATTGGTAACTTGATGAAAATTGGAATACAAGACACATCTATTTGTACCTAACA
ATCAAATAGAAAAGACACATCTATGTGTAGGTGAGAAATACATTAACTTTATGATATCTTT
CCAATGTGAGACTAATTTAACACTACTAAGATAACACACATTAACATTTGAACAAGAACA
AGTGGGTGCTACAAGATGGTCAAGGGATGACAAGAATAATATCACATTGCCGAAAATCC
AAGCTGTGTGATTGGAAAAAAACTTGAAATTACTTACAGATGATCTTAGAAGATGAAAAT
AGAGAGGTATATGCTTGAGCGGAGATGGAGATGAAAGAATTGTTGGAAAAATTGAAACC
TTTAGAAATCATTAGGTGAAGCAAAAGAGATGATTGCAAATGAATTACCCCTAATGGAAG
ACTTGATTAAGGAGTTGGCAGACTGTGCCTTAGTGGAGGAGGTATCAAGGAAGCAAAA
ATCAATAACATTGTGGTTGAAGGAGGGGTACAGAAATACCACATTCTTTCTTCCTCCTCT
TCTTTTTTTTTTTTGGAAGAAAGTGGAATACCACATTCTTTCATTGAAAGGGTAATGCTAA
TTCAAGCTACAATAGCATTAAAAAAGTTGCTCACTGATGAGAGGCTCTCTGACGATCCT
AAGAAGATTCAGGACAAGAAATCTTTAAAAAGACGACTAGATTTTAAAAATTTGTCAATT
GTCATTTCATTAAAATACTAATGCTAATTGAAGCTACAATAACATTAAAAGTGTCGGTCAT
CGATGTGAGGTTGTCTGAGGATCTTAAGAAGATTGATATCAAGATTCTTTACCACTACAG
CCTGATTTAAAAAAGTCTACACTACTATTCTTTCGTCTCAAACGGAATGGCTGGGTTCAA
AATATGAGAATCAAAGCATCTGATAATAAAACCGTTCTTTTTGAGACAAAGAGGGTATCA
AAAGCTGTTCTTTGGAAAATAAAATGTTGAAGACCTTCACAGCATTATATCACGTCAAGG
TTGGGAAGGCCTTTTACTAAAGAATAATTGAGTGTACAAGTCTACGAGAAATATAATCTT
GGCGATCCACCTTTTATATCCTAGAAAGTTAACTTCTTAGCCAACCATCCCTTTTGATTG
CCGGCAGAGTGTTGCAACCAAAGTTAAAAATATTTTCTATATCAAATTGATCCCAATTCT
GCGTCAATGCCAACCACAAAGATTAGAAATTTCATCATCAGCGTCCATGTTTGCATCCAT
ATGTACAAATTTGTATAATAATGAATAAAATCATATCACAACTACCCAAAGAAGACATAAA
ACCAGCACTAACAACAGAAGGAACTCACTTAATATCTTCTCTTTTGGTGCACAATCTGAC
ATGGCCACGCCGTTATAACAAAAAGTTGACATCTTTTCTTAAGGAGTCATTTTACCAGCC
ATCAAATATTGGAATAAAATGTACTTATATAATGATGGAGCAGGAAAATACTTAATTATGC
GCCAAATTATATTAGTTAATAACAATGAAGTGATAAATTGAACTGAGAATGTACATTAGTT
AACAAAAGTCATAAAATTTATTTTAAAGAGACGTGTAATTCTATTGGTGCGGGTGCAAGA
AATTATTATGAACTAACCGTAAGGCAATCTTCAGGCTTAAAAACAGTAGGAACAATAGCA
GCGGAAGCCTTAGAATCAGCATCTCCGCCTTTCTCGGCCGGCTTGGCTTCGGATGCAG
CAGCTTCGGAATTCGATTCGGACCCATCAGTAGAATCCGAACAAAAGAATCGTCGAGAC
AAATAAGGGCCCCTGCGATTCGAACTTCTTAGCGAACCTAAAACCCGAAGCAAGGGCG
TATTCGAGTCAGTGCCACGGCGAACTTGAGGGGTAAATGCCGTTGTGACGGCGTGGAA
ACGATTCTGCAGACATGAGGATGTGAGAGCCTTCAACAT
SEQ 68
TCACGCAGCTTCAGCAAAACCCAGTTGAAGATTACCATAGTCAAATACAGTATGATACA
CCCCCATGAATACATTACCAAGAATCCTGCATGTACATGAATCTTTTCGTATGAAACAGA
TCGAGAAGCTTTTGAATAAATTGAGGTATTCTTGTCTATGCATGCAATTTCTATGACCTT
ATTAAAGCGAAATGGTTGTAAGAATAATAAAAACGTACCAAAGAGGACCACGAGGTGGT
GGCACATCCAAAGCAACAAACCCACTGAGGCAAATGGTAGCAATCCCCTCTCCAGTTTT
CAAAATATACTGGAAACAGAAACAAAGCCACAGTTGAAACAAGGCTGAGTATGCAACAT
AACAATGTAACATATATAAAATTAGGAAGAGAGGCATAACCTGATCTGGAGTCAGGACA
AAATCCTTATCACCAATATTGAATGTAACATTTGGCATGGATGATATGCTATTGCAGTCG
ATTACGGATTGCCCCATAGGACTTGGTAATTTCTCACAAAGCTGTGAAATACCAAACATT
AGGCATTAAGACAATGCTCTGCACTACTTAAAGAGAAAATTCACTCACCTGATTCACATA
TTCTAGCACACTCTCCTTTGTTGTCTTCTGTTTCAGCTGGTTCTGCATCCAAATTACAGC
CATCTCACAGGCAGTGCACAATGGGGCCTCTCCTATGGAACTTCCTTCATTTTCCTTCT
CAACCACACTTCTGATATTCGAGCTGCAAAATTGGTTGTAGATTGTCACATGGTTACACA
ATTCAACATGTATTGTAACAGACATAGATCCTTGGTTCAAATCTCATTACCAACCTTCAT
CATAATGAATTTCCAAGTACTAGGCCCATGAAAAGATCATACTTACACGTAAGAGAGCAT
GTCCAAGATATAATAATAATTGAATAAGATTGTGACATCTTTGACAGCTTAAGTTTTTAAA
CGAGATGGTTACACAATTCAACGTAAACAACATAGCTCTAGAGCCTTGACATATGTTCC
AAAGACAGATTGTCCAGTTCAATTTAGTAACTTCACCAAAACCACCTTAATTCACATATG
CTCCAAAGAACAGGAGAAAAGCATCCATACCTCAAATGCTGAGCTCCATTAAGATAACA
TAAACCTACTTGTAAACAAATTTGATCTGGTGTGACCTGCAGACAATTACAAGTAGCATT
ATATCTACAATGTCACATGCAGTAGTGGCCGGCTCAAATAAATAAAAGAAGAGAAATAG
AACAAACCCCTGATACTAGTAAATCCCAAATCATTTCCCCATATTGAGAAATGGTTTCTT
TACATTCCATGCTCAATACTCCTTCTGCTCCAATGGCATGGTTGACTTGTGTCACAACAG
CCTGTTATGTACTTGCTATAATTACTTTCAAGAATTTGTTCAAGATAACTAGATAGTCGAA
AGGTCTACAACTTCTGCAACAACCCAACTAAAGTAAATCATACAGTTGGACCAGCAAGC
AATGATGTTCCAGAATCAACTATAGCAGCACAACCGCCTTCACAGAAGCCTGTAAGGGA
AAAGTATTAAGACAAAAAGAAATTATTCTTATCTTAGGCTCGTAATCAACAGCAACGCTC
ACCTGTTGATTGGTTCCCAATAGAGAAATCTCCCATTTTAAACTGAAACAGAAAAAGGAA
AAGAACAGATGGGGGGACCGTGAAAAAAGGCAAAAAGGAATGGAAACCATAGTGGAAA
AAACAATTCAAGTACCTGCCAGTAACCTTTCTGAGTCAAAGGAACATAAGTATGTTTATC
CTTGAAGTGTTTTGGATCAACACCACCAAAAACAAGTTCACCTCCCTCTTTTGCATTTAT
ATCGCGATTAAGCCAGAAAGAGAACACAGGCTCCTTTACGAGATCTTGCTTCACCATAT
TGTACCTGCACATATATCATGAATCAAACAGACAAGATTCCAAAAACTGAGAAGAAAGG
AAAACAAGATAGAATTGGTTAAAAACTGAACTGAACAAATCAATTGCAGGCCTTATTACC
AGACAGGTGTAGTGTTTCCAACAGCAATTTCCTTGAAACCAAGCCCAAGTATTCCATCA
AACTTTGCAACTATAAATGTAACACTTGATTCCCGTGTCGCCTCAATAAAGACCTGAAGA
AATTGATGTAAAAAATTCTCATCCATTGTGTTTTCAGAAGAGCAGAAAGGACCATAATAT
GAGGCAGTGATGACTTATTGCCAAGCAAGATTTCACCTGATCCGTGACTACAAGATCGC
CAACTTGAACATTATCTTGACTGAGAAATCCTGAAATTGATCCAGATCCATAGTGGATTG
AACAAGATTCTCCTGGATTAGAATAAACATCAAATATAATCAGAAGCCATCAATAAATAA
CTTCTTAGTCTTTCAATTAATGTGAAAGAAATATAACATTAAACTATGATATGAACATCAC
CTTTTTTTGTGTATGTACTAGACTTCCTTGCCTTGTATTTGGAATGGATCCAGCATGCAA
TCTGGAAATAATTCAAGTTTAAGGAAAAATTCTGTATAAACCGGTAATTCAACAAAGGAA
CAAATAACTAAAAGAATTCAAGTTATATACACTAAAAGTGTAAGGATTTTTGTTACTATCA
GTATAGTTTAACTTCTGATAGAAATAATTAAGTACCAATTATTGTTACCAATTAATGTTTT
CTTATAGAGATTTACATGTAATTACCTTATAAGTGACCTGATTATATAACTAACCTTTGCA
CCGTTAGCACATATAGAACGTAAACATTAACTAAAAAGAAGCACAACTTACAGAGAAATA
ACATCTTGATGATGGAACCCAGAGATTAGAACTTCCTGTATCAAATATGACAGTGAAATT
TTGAGGGGGTGAACCAATACTAATATCTCCATAATATTGAGCATCCAAGTAGTTCTTTAA
GGACACTATATCTGAATTTGTGTCAGATTTCTTCTTCTTCTTCTTCTTCTCTATGTCCTTC
ATCACATGCTTTCCATATCTGTCTTCAAGTCTTGCTACATTGGCTACATTTAAGCTACTG
ATATCTAATTGTCGCTTCTTCAGACTAATTCTTAGCAAACTATCAGAGGAAGCAGGAAAT
ACAAAGCAGGCAATGGCCAATAAAAGGAGAGCAGCCCAAAGATGCTTCCTTTCCAT
SEQ 69
TCAAGATCTTATTACAACATACTTCTATTACAATATCTTTTTCTTTTTGTAATGGCTTTGAT
CCTTGGATGGAAAATACTATTTATCCTTCTTTTTGTGATAATTGGGATGTGTACATCTCAA
GTCACTTCTCGTAATATTCAAGCTTTATCCATGTTAGAAAAGCACGAGTTATGGATGTCA
AGTCATGGACGTACTTACAAAAATGAAGCAGAGAAGGAAAAGAGATTGAATATATTTAAA
GAGAATGTGAAATTTATTGAGTCTTTCAACAATAATGGGACTAAAAAGCCATACAAATTA
GGCATCAATGCATTTGCTGATCTTACTGCAGAGGAATTCTTGAGTTATTATACTACTGGA
CTTAAGTTGTCTAATTCCTACTCTCAAATTCAATCATCATTTAAGTATGAAAACTTGAGTG
ATGTTCCATCTGTTATGGACTGGAGAAAGAGTGGTGCTGTCACTAGAATCAAACATCAA
GGTCAATGTGGTAAGGCACAGTTTCCTATTCAAGAAAAGTTTCATATTCTCTTCTTATTA
AGTGCTGACGTAACTAGTAAAGTTGATGATATGTGACCAGCAGGTCACGGGTTCAAGTC
ATAGAAACATTCTCTTGCAGAAATGTAGGGTAAGGCTGCGTACAATAGACCCTTGTGGT
CCGGCCCTTCCCCAGACTCCGTACATAGCGAGAGTTTAGTGCACTGAGCTGCCCTTTTT
ATTAAGTATTGAGAAAGGATTTAAGTAAAATACTACATACTCCTTTCAAATTTGTGATCTT
AAACATGTTTTATCATTGTATTATAACGGAGTATCACTAAGGTTAAAATGAGAATATTAGA
AGCAAGCATACTAAATATAAAAATACATTCTTTCTGTAATAGACTAAAATGGAAAATAAGA
TATGCATAGAGTACTCTCTTCTTGTCCAATAATGTTGACAAGGCACTTAAATTATGAGTG
TGTGAAGTCTCACATTGGTAACTGAAAAAATTAGGAGTCTACATATAAGCCTACATATAA
GGTTTAGAGTTTTTTTATGGTGTGAGGTCTTTTGAAAAAAATCGTGCGGACTTAATCCAA
AGTGGATAATATCACACTATTCTAAGAGTATCTTTGAGCTGTTTTAGCTCAACAACTCGT
ATCAGATCCCAGGTTCTGCGGACGAGCATAGCGATGGCGACCTGTGGATCGTGGTAAT
AGCCACATGAAACTGGTTCGACGGGGAGACCCGTGGATCATGATCATGGTAGTGAGCC
ACATAAAACTTAGTTCGAGAGAAGGATTATTGGGTATGCAAACAAAGTCTCACATTAATA
GCTAAAAAGTTTGGGAGCCTGCATATAAGGCGTAGAGAACTTTTAATATTGTGAGTCCT
TTTGGGGAAACCGTACAGTTTGGCCAAAGCGGACTATATCATACTAAGTTAAGAGTATC
TTTGAGCCATTTTAGCCCAACAAATCATATATGATAATTTAAATTTGTTTTACACTACCAA
TAATGTATTTGACCTACTTTGCAGTATAGTTACTATTTTTGTATGTTTATCATAAAAGTTAA
CCTTTAAAACAATACAAGTGATATGATTTGTATAAATATGTGCATAGAACTTCCAACTCAT
TAATAAATTGCATGAAATATAGGATGTTGCTGGGCATTTTCAGCAGTTGCAGCCTTAGAA
GGAGCAAACAAACTCTCAACGAACAACTTGATTTCACTCTCCGAACAACAACTGTTAGA
TTGCACCACCGAAAATAACGGTTGCAACGGCGGTTTAATGACCACAGCCTACGATTTCA
TCATTCAAAATGGCGGCATTGCCACAGAATCCAACTACCCTTACGAGGAATATCAAGAT
TCATGCAAAAGCCAAGAGATGAACTCTGCAGTGAAAATCAATCGTTACGAAACTCTGCC
CTCGACTGAATCAGCATTGTTAAAAGCCGTAGCTAAACAACCGGTCTCTATCGGTATTG
CAGTGAATGAAGATTTTCATCTGTACCAAAATGGTGTTTACAATGGAAATTGCGAGGGT
CAAGAACTAAATCATGCAGTTACTGTAATTGGTTATGGGACAGAAAATGATGGTACAAAA
TATTGGTTGATCAAGAATTCTTGGGGGACAAGTTGGGGTGAAAATGGTTACATGAAAAT
TGCTAGAGATACTGGAATTGAAGGAGGTCTTTGTGGGATCACCACTTTAGCTTCCTATC
CTGTTCTT
SEQ 70
TCATAACTTACTGTGCACGAGTTTATTTGAAAGACGTTCAACCTTGCCAAGTTGAGTTGC
ACTCTTCAGCATCGCCTCGCTGTCAGCTATAAACTTTGCAAACTTAGAACCCTTTGTATC
ACCATCACTATCTCCTCGAGGAAGCATTGGCAGATGAAGCATAGAGGGACTTTGCTTG
GAAGATAAAGGCGGAGTTAATGCCCACACAGAAGAAAGCTGTTCATCTGGTTTGTCAAG
ATATTCCAATGATAGATTCTGCATGTCTGTCAGGAAAAACAAAACAGAAACATCAGACGT
TGGTATCGTGCCTACCAAGTGTTATTCGGAACAGATTTCGTGAGAGTTGCGAAACTTAT
TAAGAGCTTACGGATCAAAGATTACTGCTTTTATTCTTCATACGGGAAAAAACCACCCAT
TATCTGAAAATGGAAGTATCAGGAATAGTCGAATAAAGTACCTTCAGCAAACTTGAAGAT
GGGCTCTAAAGCTGCACATGGAATGCTGAAGTTCAAATGTGGAATGACCGTCCCTCCA
CCATGTCTAGCATTACTGCATCAAAAATAAAGCTACAAGACATGAAGTTTCAACAAGAAA
GACATTAATCGATGGAAAAACTATAGAAAGACCATGATATGTGAAGATTAGCATAAATGA
GATGACTGAAAGATCTTCCATATGAGATCATTCAAATTGGACACCATTATTTTTTTCCTGT
ATGCAAAGCGTATAATTAATATACATTTTTGGTAAGGACAATTAATATACATTCAAACAGG
AATATCTTTCTTCAAGGACTTCTCAAAGTACTCCAGGACGCAGTGTACCACAATATCATT
AGATTGAACTTCAAGAACAAACACACGTAAACATACATAAGCTGAAAAAGAAATATCCTC
AATTATAAGCATCCCCAGTTGTCATCCAAAAGTTAATTACCTTGTTACAAGGGCAATCAT
GTGTCCTTCTGAATTAACAACAGCTCCACCACTACCACCAGGGTGTACAGCAGCCGTTG
TTTCAAGCATTGCCGGAAAATGTCCTCCTAAACTTGATTGGTTGAGCAGAGGCCGCTTT
GCTTCAACTACCTTAGCTATTGCACCCACACAAGCAGATGGAAGGAAGTCTATATTATA
AGAAAAAGTAAGACAAATTACAAATATAACTAAAGATGTCTAAATAGGTATGAAACATAT
GAAATAAATATACGGTATTATATCATGTTCAAAATGAAAGAACTTAACAAAATTATTTACA
TGAAAAGCTATTTAACCTCCAGAGCAACATGATTTAGTACTATTGGGCGCACAAAGATA
GTCAGTTCCAGAAAATTATGTTCAGCAAAGGTTATGGAACAGACAAGTTAACTTTATCAA
CGAAAAAAGATGGAACAGACAAGTTAAGATTGCATCAATAAACAATAGTAGCACTTCCA
ACAACCAAGCTACTATTAAAATATCCTTGAGATACAGCCGACTCGATTAAGTGAGTTACC
AGGAATTTCCTATTTTAAAACCCCATTCTTTAAAGCTGATCATTTGTACTTGCTTTCACCA
TAGAAAATATCAATTTAATGCTCCAGAAATTTACCTCTTTTCGTGATAAGTGACTTCAAAA
CTCTAGATTTGATTCCCCAATTCCGCTTTGTTAGCATAGGTATTAGGTATATGATCATTC
TTATGGATGAAGATCTGAATTAGTGCCTATGGCTTTTATTAGCCCACGAAAAGAAAACG
CTTTTTTGTTTTTTAATTTGGTCTACCTTTCTCCTTGTTCTACTAGCCTTGTTTGAGCCCA
ACAACAACCTCGCTATTCTTAATCTGACAAGTGCAATTTTTTTTAACCGGAAGATCAAAA
CGTTAACCTGGACTACAAATCAAATTCAGTATCAATAAACAATGTCTTCACCTAAAAGAT
TACCCAGTTTTGAGCCCTCCCGACCAATCTGGTTACTATTTCTCCACTGGAAGACACCT
CAAGTTCCCTTGCGGAGGCATCGACATCACCTCATAACTACTCAATCAGTCATCAAATG
GTCATCATTTGGTGAAGGAAGAAACATCAAGTATTCCAGCAGTAACAAGGACATGAAAA
TGATATAATACGACCCAATCCTGCAATTGATTATAATGACACTTCAACAAATTCTTAACAC
GAGAGAAGCAAGGTGGAAGAGAGAGAAATTCAAGATAAACAAAGATTTTGTAGAATATT
CTAAAATTTCAGATTTACTGTGATGCGTGTGTCCAAATAAAAGTAAAGGCACAAATTTTT
TATTTAGACAAGAACATATCTAAAGCAAGATTTACCACATCGTGGTCCAAATAGCCCATG
TCCGAGAATGTATGCTTTTGATCCGGGGGATGGGCACATGAAGTCAACAATAATGGGA
CAGAGCTGATCTGGAACTAGCTCAAGTTGTAGTAATGCAACATCCAGAGGTCCTCTGGA
GACATGAACTACCTTTGCATTTGTCCATACCCAGGGATCCATAAAATCCAAGCGAACAC
GAATGGTCCTACTGCCTGTGTTTGCCAGGTTAACTCTAAAGCTACATTGCTCATTGTCAA
CCAAGAAATGCGGAGTTTTTAGTTCCTTTTGAATCAAATGTTTATTCCTACGCTGAATGT
CAAATTTCTCAACCCCTGGATGCTCAGATTGATCAGAAGGGATGAGAACTACATCAGAT
TTGGTGTTATATCCTGAACCGTTTACAGATGTTTTTCCAAATCTCCATGGCTCTAGAAGA
TGAGCATTTGTAAGAAGAAGACCCTGCTTGTTGAGCAAAACTCCAGAAGCCCATGCTCC
ATCATCAACAGCGATAAGACAGATAGATGTCATTGCTTTCTCAATCAAGGATGGGGGAA
CAGGATCTATTTGGAGATGCTCTTGAGTATCATTGGCATGTCCATCCTGAATATTATTGG
AGAATGATTCTTTTTTAACGCTGATTAGGTTTCCATTACCAAAATGGATCTTTCTCCTAGT
TTGTAGCTCTTCTTTAAGCAGGCTACCACAAGCAGATGTAATAGCTTCCCATGGAATCA
CCATCTGCCACCCAATGCTTCAATTTCAAATTAAAGCAATACATGTGAATGTAGATCTCA
GCAAGAACATAGAGCACATTATGAAATGTTTACCTGAATTTCAGCAGCAGTAGCCCTTT
GTCTGAGTGGCCGAGACAGAACACCAATAAGCTCTGCATGTTCCCCTAACACTGGGCT
ACCTTCCATTCCTGAAACATAAGGTACCATACGTTATCAGGTACCCAAAGATGGAAAGG
ATTAATTAACTAAAGTTGCAAGGCAAAAGCTCAAACCAGGGAGACAACGGATGTCAGCA
ATCAACAGTGCTTTATTCTGTGGACTAGGTGGATAGCTGTTTGCAATGGACCCAACTGA
TATGCTGTATAGCAACAAAAGATCAAATTGAGCTTTCACAAGGAGATGTAACAAGAATCA
AAGTGAATATTAATCCTATGTATGATCACAACTTGGCAGACAAATAATCATCGAACAACA
TCTGCATACACCAGATACAGCTTATATAGCTTACTCAAGTAAAGATGAGCTAAAACATCT
TTAGCACTAGCAACAAGAATTACACCAGTGCTTTCATATTCAGATAGTCTATTAACACAG
CACAAATGTTCAGTCATCATACATTTTGATGCTGACAAGCAAGGATGGAAAGATTACAA
GAAAGTTGTAATTCCAACATATGATAGAAACAATACCAAAGCAACTACCTGTTGAAAAAG
TGACTGGGAGACAAGATACCGAAAGGAGAACCCATACCCAGAAGAAGATCACCTCTCC
TACTCCAGGTAGCCACTTTTAATGCAGGCAGGTCCTGAAAGGAAATACCTATGCAGTCA
TTATCATCAAAAACCTACTAAATGGAGCTTTCAAAAATTACCAGCAAAAAAGCAAGCCCA
TTCAGCAAAATGCAATGCCATGCTTCATTTCACAAGTTTGATGCCACTTAGTGCAGTAAA
AGCCTGAAACCTCATATGGATTTGAAGAAACTCTGAGAAGAGCAATTCTTGTAGTTGAT
GTTCCTATCACACTTGGCAGGCTGGACTGCGCCTCCATCATTGGAGTTTGACTTGGGAA
AGAAATTTTCTCAACCTGAAATTATCAGTTGACAGTTCCTGTTAAAAAATAGTCAATCTAC
AAAATCAATCTCCGTGGTCCTTTGAGACTAGCATCTCATAGACCTTTGGCTATGAAAACT
AGAATGCTTTGTGCAAAAATTGTTTGGGCCATCTTGAACATAGTACTAGGGACAATTGA
GATGCCACCGTTGAAGAACACCATGGTAAGAGTTCCTTATGTCATGCCATTTTGCACTT
GACAGCGTGAGACTCAAATGTGCTCCTTTCCAATAACACTGCCGAATGCTTATACCATT
ACAAATATGTAACCAGAGTAGCTTACTGAGTTGGAGTTCTTAATATGAATTATTTTAAGA
AATGCCTCCAAGTTTTACGGGTGGTAGACTACCTTGTATTCAGACACGGGAAGGTGATA
TTCAATCATTAATCTTTGAATTGGAATTAGGTGTTGAGACATTCTATGATGATAAAGAGG
CAACTATTTTGGCATGAGTAGAGACATGAAACAATGTGATGTCACTCTATTGTAAATATA
GGGAGCATGAAATAAGGAACTCATTACATGAATTTCCATTATTCTCCTATAAAAAGAGGT
TACAGAACATATAAAAGGTTCTGCTTGGAAACAACTCAACTGATAATGCAATTAATGCTA
AATATATGGAGGAAACTTGCATGTTCCACAACTCGAAGAAGTGGTGTGCAACATATATT
AGTTCAAATCCATACTGCTCTTATGCCAGAAAAGAGAAGAAAAAACAGAACCCTGAAAG
TGCAAAGTGGTCATGATTAGACGTAAGACAGGAATAAAATGGCATTTCTGCTCAAAAGA
AATAGGTGCTTGATTTATTTAGTTATTTAAGAATGATAAATGATATGCCTTTCAGTTGATA
GAACTTTAAAGTGTTAGCCTGATAAAATTATAGTTATTTGATGAAGTCTTTTGAAATTTGA
ACCACAAAGTGGGATCATAGCAGAAGTTAGCTCATGAAAAATGACCAAGGATCTACAGC
ATCCAATCAAAATATGCATGCAAGAGAATTTGGCTTATCTTGGTCTCGGCGACATTTAAT
TATCTTTGAATGGAAGTATTGTCATGTTAATCCTATTTATGACTATGTTATGCATTAATGA
AACAATCACCTAATTCAATAATAACACAGCTAGATAGTGTCAGGAGTATTTAGAAAGCGA
GTGAGAGGCTCGGACACATCTATCAATAGGTTGTACCAATCATATATAGATGAAGTACA
GAGACTATGGTTACATATTTACTCATTAATACATAAAGGTACAGAGATTGTTATTGGGTA
CACATTTACTGCATGTAAGTGCCAAACAGAGGAAACATCATACCTGTGTGCGCTTAGTG
TTAGTAAATGATTGACGGGAATTTCCATAGGCAGCTAGGGACCAACCAACCTCCCACCC
ATGTTCAATTGAACTAGACGAACCTTCAACTAGGGACTGGACAGCAGCAGATGACACAG
GGATGTCAACCTTCAGCAAGAACACCAAATAGACGGTGTCAAGTGCAACTTTTACGTTA
GATCAGTAACAAAATAGCGACAAAGAAGAACACCAGCCATAAAAGTGATCAGTTAACCC
ATGCTAAAAACTAGGATAAAGTTCAAAACTAGGTATTCCTTCTCCATTTTATACGGCACA
CTTTCCTTTTTAGTATGTTCCAAAAAGAATAGAACCCTTCTCTATTTGGAATATCTTAAAA
CTTTAAACTTCCCACTTTACCTTAATGACATGCTCTTATAGCCATAGAAGTGTTATGAAAT
GTTTAAAACCACAACTTCTAAAGGTAATTTGGTATGTGTCAAAATCTTTTGTGCACGGGC
ACAAAACACCAGATGACATCAAACTAGAATTTATATGCATGCATCAAAATGAGAGCACTT
ATCAATTCAATATTGCAAATAAAAAACATATATAAGATAAAGTAACAGGTTTTATGATAAT
CAGCATTCAGATTACAAAATCCTTTCAGTCTCCTACTAAATCCTACCACTCTTAGGAGTT
CTGCAGGTAGCCAGTTCAAACCCTCTTTGTTGGTCACTTTGATATCATTTTGCAATGTAT
TTCCTCCCTGTGAACCACGAGCAGGACATCCAATGTCAATTTAATATATCAATGAACCTA
AAAAACGTATTTTCTTCCGTAGGCTCAAATAAATACCTCCCACAGTATATCAATTTGAGC
ACCAGGAATCAGCTCCGGCTTATCCTGCAAGATGGTACATTGTAAAAGAATTATGAAAA
AGCATAACGACAAAGAAATAATCATGTTGTTCCATTTTATACTACATCCATTCAAAAACCT
ACCTTTCTATACATAGAATTTTTTTAATCTTGTAATTCTCATTTTGCACTTAATGGCATGCT
CTTATAGGAATTGACATGGCATGCTAAAGACAACTAGATAACTTCTTACACATGTAATTA
AATATGTGACAAAAGTGGTTCTTTCTTTATTAAACTCCTTCTCCAATCAAACACCATCATA
TAAAGTGAAACCAAACAGAGGGAGTAATTATCAACTGAAAGAAGACTAAAGATCCAAAC
CTTTGATATGTCCCCTCTATCCTGTTGTACAACAAAAGGCTCAATAACAGAAGCAACTGT
TAAAACCAAGAAGTGACCTCCAAAAGAGTGCAACTTGCTTTCACCTTGAATCTGCTTAG
ACACTGAAGCATTAACAAAGGAACTGGGCAAAAGCATCCCAGATGCTGATAGTGTCGTC
TTCCCAGAACTACACCACAAAATTACAACCCAAATTTAAAACTTTCAGTTCAAAACACAT
AACATAAACAACTATAATATAGAGACAGAGAGAGCTATGTAAAATCACTTACTTGTACAG
GTGGAAAGCATGTTTTCGCATTTTTAGGCCTTTAGGGTCCTTCAAGAAGAAGAATTTATT
TGTATTTTTTCAAAAATTAATTAGTAGAATAAGCAAAGTGATTGAAAATTACAGTATCTTG
GCTTAAGAAAAGGGACTCACTGGGCCTTGAATTCTGACCATGACGGCATAATTGCGGG
CAACATCAACCACTTCAGGAAGACCCAT
SEQ 71
ATGGATAACCCATCGGAGGATTCCTCGGATTCTCCTCAACAGCAGCCCGAATCTCCTGT
AAACGATGACCAACGTGTTTATTTAGTTCCTTACAGGTAAAATCTCCCTTCCCCGTTTTG
ACCCATTCCTCATGCAACTGTTTGTTTATGTATATCAACATAAAAGTAAAAATAAATAAAA
ATAAAGAATTGAATTCTCGGATTTTGCTTTCCCAATTGATTTTATGATTTGGTTTGATCCA
ATTCAGCTAAACCCGAATCTGAACCCATGAGATAACGAGAAAGTCGAAACAAGTTCTAG
TTTTTTTTTTCTTTTTTCTTTTTGTTTAAATTACTTATATTTTTATTTGTATTACTTGTCATTT
AGATTGGTAATTGTATTAGCTTCCCTACATTGGAATGTTGTAGTTTTTTTAATCAAGTCTT
ATTATCTGGATCAAATCGTGTTGTGAGTTTTTTTATTTTTTTTATTAGTTGCCATTTGGATT
GGTAATTGTATTAGCTTTTGTACATTGAACTAGTGTTGGTTTTTAATCAATGTTGTTGGTT
TTTGTTATCTGTTAACCGGTGGATCAAATCATGTTGTGGGTTGTATATTTTTGTTTTGTGA
GCTTAAGCATAAGAAAGTATCGGCCTTGGATTTTCAGTTGTGTTTTTTTGATGAAGTAAA
TAGTTTCACCAATGTCATCAAGAAGATGCAAGTATTACGAAAGATTAGGCCAGAGAGTA
TCAGCTTCAATTACATTGGTCTAGATTGCTAAGGAGCTGATAAAGTCCAGAAAGTTAACA
GGGTAAGTTACAATAGATAGTTTTGCCAACTAAATAAAAGTAAAAGACACCTAGCTATCA
GTTGTTAACAATGGAGAAGTAGTATAGCAAAGTGCCGGCAAGATCTGAAAGTGGTGGTT
ATAGGGACCTGTTTAATAACTTAGTAAACCTTAGAAGAAGCTGACAAATTGTTCCATCTA
CAATTTGTCAACCTTAATAGAGGTGCACACAAGCTGGTCGGACACCACGGTTATCAAAT
TTTTTGTTTAAAAAATGTTCCATCTCGATAAAATATCAATTGATTATGCATTATGTTGTCA
GTTCAAATATTGTTTCTCGCAATTATTATAAAAAGTGCATATCTGTGGAGAAGTGCTCCG
CGGGCTAGTGCGGTGGTAGGGGAGAGTGGTAAAAATGACACAAATGATGCTTTCCACT
TGCTAGTGGTTGTTAAGAAGAGAGAGAATGTTTGAGCGGGAAGGACGGGGTAAATAGC
ATGGAAATGTTAATTGAAAGAAGTTAAAAGTTACCCTTTGCAGCATCTTCTCTAGGTAAG
AATTTTTTGTCTGTGTTTTCCCGAGTAGAGGGTTAAAGTGTTGCACACACATATATTACA
GGTGCCACAGACACGTATATGTTTAGAGTACTATATAAGAAAGCGTGTTTGTGTTCTAG
GTGGTGGAAAGAAGCACAGGAGTCATCACCATCAGATGGGAAGTCAGTGACTTTGTAC
GCAGCGGCACCAGCTCCATCTTATGGAGGGCCAATGAAAATCATTAACAACATATTTAG
CCCAGACGTCGCATTTAACTTGAGGAGAGAGGAGGAATCTTTATCACAGAGTCAGGAG
AATGGTGAAGTTGGGGTATCTGGTCGGGACTATGCTTTGGTCCCTGGCGACATTTGGC
TGCAGGCACTCAAATGGTCAGTATTTTAGAGCAGTTTCCAATTTGTATTCCTTGAAGTGT
GTTAGATAAAGCCTCTTCTGACGGAGATTTACGCCATAGTTGTTGAGCATTCTGAGGAT
ACCATTTGCATATGTGTTTTTCTCGACTTCAAATAAAACATTGATTTTTCACTTCTGGTTA
CAACAACCACTTGCAATTTGTTGTTTGGTTTCTTCTGCTTTTCAGACCATTCACATTTTCA
TTTCACATGAAAGAGGCCTCAAGCCTTTCGAGGCTTCATTGTTGTTGCTAGTCCGATGG
CAATTCCCAGTTATAAATATATATTGTTAAATGCCTTGTGAATGCATATGGAAGCTCGTTT
TTTAAAGCATTTTGAGATTTCATTCTAAAAAGACCACTGTTTATTCTTTCAGCTTTAAAGT
GCTAAGCTCAATCTATTAATTCGCTTCCTTATTTTCTTTGTCTCTTTCATATATTTTTTTTG
TGTGTGTGGGGGGTGGGGATTGGTGTTAACTTATAACTGATTATTTCACTTTCCTTTTTG
GTGTTTTTGCACATCTAAGAAAGGGAATTTGTCTTTTGATCCTAGTAACATGTTATTTAG
CACGTTAATTTCATACATCTGGCACTATGTAAAAGTTGATCTTTTGATTATAGAGTTCTGA
TTAGTTTGATTGGAATTGCTCCTTTCCATCCAGGCACAGTAACTCTAAAGCTGCGGCTA
AGAATGGAAAAAGCTTTTNCAAAATTGCTCCTTTCCATCCAGGCACAGTAACTCTAAAGC
TGCGGCTAAGAATGGAAAAAGCTTTTCAGCTACAGATGAGGATATTGCAGATGTCTATC
CTTTACAGCTGAGGCTTTCTGTTTTGCGGGAAACCAGTTCCTTGGGAGTCAGGATAAGC
AAAAAGGTTAATAATAACTTTGGAATTTCTAGTTTCATCTACAATTCCCATGAGATTTGTA
CTGTCATAATATCCATAGAGTGCATAACCACATGTGATCTTTTTGGTACAGACCTTGCCA
CAATTTGTGAAGCTGACTTTCTTTTTCATCAGCTGCTTTGCCATCTTTCATCCTTCACTTT
TGTTGTTGCTGTTTATTGTTGAAACTGAAGGATCTAAGATGGACAAGTCCAAAATATCAA
CTTTAAGAACAAGGGTTATGCATGAAGCCCTCTATTCTCATCCTCATATTAATTACTCGG
AATGGGCATAGCTGTTAAGTGCTTCCATTTTTGTTTGATATTTAAATATTAACAGTAAGAG
CTTTTTATGGTTCTGGGTTGTGCAAAAAGAGGACAAGTATGTTAGCTGGATACGTATCTT
TTTGCTGACAAAGTGGGATAAATTTTGCATGGTAATTTTTGGCTTTACATGAATTTGTTG
ACAAGGATATCAATCCTATTGATATTTATGTAATTCACTTGAAGCATTTAATACTTCTATT
TCCCAATTACAGAATTGCTCATCAGCAATTTTTGACCACAGTGAGTTCTAGAGAAGCGA
GTCTTTTGAAAATGATAGTAGAAAGAGCGCTTCCTTTCTCTAGATTGTCTTTCTAGCAAA
GTAAATTAATCTAGAGTTAAATTGTATTGACAGGAATAAACAATATAAGCGGTATTTTCAT
CCAGACACCCCTCCCCTGTTTGTGGAGCAGAAAGAAACTATGTAATTGGGAATCACTTA
GTTTTGAGATAACATAGTAGACTATGTGACAGTTTTATTCTTTTAATTTAAAACTAACAAG
TTGTCTTATATCTAAAGTTTTAGCAGAATCATTTATTCTGCCTCTAATAGCTTGGAAGAAC
TATATATCATTTGAGGTCTTTACTTGCCGAAAGACATCGGAGATGAAGTTAGTTTTTTATT
AGATCAGATATGAAATTAGTTCTAGCTTTTTTTTATATACTCAAGGATGCCCTGACTTCCT
CCATCTTTATCTATTTTTGAGAAATTCTCTTTCTTGACTGCCAAATGCTAAGAGGAAATG
GTACCAAGCGGTTACATGCAAACCCTGGTCTGAAGAAGTAAAATAGCTAGAGTTCTAAT
TTTCATAAAGCTAATAGGAAATAATTTGATCACTTGTGAAAATAAGCCAAAAGAATGATG
CTCATTCGAACAAAGTTCTCTTAGAGTTACTACATATTTCGTTGTGTATACATGATCCTTC
AATGCTACTTCATATTTATTATTTACCCGAAAGTTGATGTTAATTTGAGCTCTTTTTTTCTT
AACAATGTTATTGCTGACTTGTCCGTTTACTGCCTCAGCTTGTGTACATAAAAATGTAGT
TTCCAAAATGTTGTTTGTATTTTGTAACTGTTGCATGTTAAACATTTGCAATATTGCGGTG
GACAACGTTTCTTTTTTTTTTTCTTGCAATAATATCCTCACCGATGTTCTTTTTTTTTTTTT
AACTATCCTCACCGATGCTTTGGAGTCAGTTAAGCTGCTTTATTTCCTTTAGAGTCTAGT
TTTACATTTGTCTTCTCACATTTGATTCAGGACAATACAGTTGAATGCTTTAAAAGAGCCT
GCAGAATTTTTAGTGTCGATACAGAACCCGTAAGTTTCAATACTGTTGTTAATCAATTGC
AATGGTATCTCTTTCAGGAGATTAATGGTATTTTGGTTCTCTGCAGTTACGGATTTGGGA
TTTATCTGGGCAGACGGCATTGTTTTTTTCAGATGAAAACAATAAGATCCTCAAAGACTC
TCAGAAACAGTCAGAGCAAGATGTATGTACTTTCAACTGTGTCATACTTCATGACTAACC
AATAAACAAGTCGACCAATGCTTCTGCGGCATTCACTATTTTTCCTGTCTTTACTAAGGA
AATAATTTAGTTATGCTTTTTTCTAATTGTTTTCTAATTAAGTGTTTTTAGCAGATTTTTCC
ATTTCTTTATCTAGTGTTTGTGTCGTAAAAAGATATATAATGATTGAGGTGATGAATATGC
TTACTTAACACTTCATCTAGGAATGAAGTGAGACAATGATTTTTCTCCATTTTCTATATAA
GTGTTGTTTTTTCTTGAGCATGGACAATGCTAAGCCCACCAAAATTCAGTTTTATGCGAC
TCTCTTTCATTTTAGGGTTCGTTCTGGAGTTATTCATTTATAAGCAGTAGATGTGCTCTTT
CCTTGTACTTTCAATGATGTACACTCTAAGAAACTTTAGCTCTTTTATTACCCTGGGACA
AAAGAAACACATAATAAACGGGACTGTCATGTCTAACGACCAGCTTATACACCCATTCG
TCTTGGAGAACAGGCGGGTAAATTCAGCTTAGTTATGCTGTTTTTCAGCTGGATCTAAG
ATTACAAAAAGAGCAACTGTTTTGTTTTTTGTTTTTTTCCATTTGTGGCAGTTATTACCGG
TGTGGTTCATCATTGATTGTTTTTGTATTTTCTTAGGCTGTTTATCTGTGAAACAAATTGA
AAGAGATGCCAAAGTTTTATCTGTTTTATGTTTCTTTTTTCCTTGTGGGTTACCATTTAAC
TGAGAGCAAGGTAAACCTTTACTGTTGAAGGCATTTTGCTGGTTATGGGTTGCCTTATA
GTCTTATTACTGACTCTTGAATTAACTCTAGAATTTAGTGTTTAATGGTTCGCACCGCTT
GTAAGCAAGAAATGATTTGGACAAACTTCTTATTTTGTCCTCTTATGTTTTTGCTTGCAGA
TGCTCTTGGAGTTGCAGGTCTATGGGTTATCAGATTCTGTTAAAAATAAAGTGAAAAAAG
ATGAGATGTCAATGCAATACCCTAATGGTTCTTCTTTTCTGATGAATGGTACTGGCAGTG
GTATAACCTCTAATCTCACTAGGAGCAGTTCTTCATCATTTTCTGGAGGTCCATGTGAAG
CTGGTACCTTGGGCTTGACTGGATTGCAAAACCTAGGGAACACCTGTTTCATGAACAGT
GCTCTTCAGTGCCTTGCACATACGCCAAAGCTTGTTGATTACTTTCTCGGGGACTACAA
GAGAGAAATAAATCATGATAACCCTTTGGGAATGAATGTAAGCAATCTTGAATATTTCAA
GATCATTACGTGCTGCTTTAGATGTTTTCTTCAGTTCTCTCTGAATAAGTCAATGTTGAC
ATCCCTTAACCTATTCTACATTATATGTGGTTGGAAAAGTAAAAGAAAAAGAGAAATTCA
TTTGATTACTCTCCAGGTGAGGAATTCTTTATTTACCTCCAATTGTTTTGTTAGCCCGGA
CAAAAGAAAACGATATGCTTATCCGTTCCATTCAATTTAGTAGGGGTTGAGAAAATTGAC
TCGGAGGGTATTCAATATCTCCACTTTTTGTTTCGTACCAAACAAGGGGAATAAACTTTA
CCTCTTTTACTTTTCCTCCTCCTTCCACCTCATCTCATCCCAATCAAACATTGTGTTCTAA
TCTGTCTCCTTACATATTTTATTGTCTAAGTTCCTCTCTTTAAATTCTTTCAGGGTGAAAT
TGCATCTGCTTTTGGTGACCTTTTGAAGAAATTATGGGCTCCTGGAGCGACTCCTGTGG
CACCTAGAACATTCAAATTAAAGCTTGCTCATTTTGCTCCTCAATTCAGCGGCTTTAATC
AGCATGATTCTCAGGTCCTTTCAGTCCTTCCTGTTGGATTTAGTTTCCCAGTTTTAGGTC
ACTTATTAACGCTCTCTTTTCTGTCCTCTCATTTTGTGGGCATCTTTTGACATCTAATTCT
CCTATTTATATCTGCAGGAGCTCCTAGCTTTTCTATTGGATGGACTCCACGAAGATTTGA
ACCGTGTCAAGAATAAACCTTATGTTGAAGCTAAGGATGGAGATGATCGTCCAGATGAA
GAAATTGCTGATGAATACTGGAATAATCATCTGGCTCGTAATGATTCCATCATAGTGGAC
GTTTGCCAGGTAAGTAACATCCGATGGTCTCTTGTATCTCACTAGAAGTAGGAAACATTT
GATATCACCGGCACTCAGTGGTCTCTCGTCTCTCACAGGCAAATGTGAATTATTGATCT
CATTTCAACATTGACTTGAAAAAGCAAGAAGAATGAAGTGGCATATTTTTTAAAAATATCT
GAACTCTACTGTATTTGTGCTGCGAGAGTTGTTCTAGGATGAGAGAGTAATTATACCCC
AACTGTTTGGGGAAGTTTAACCAGTGTTCCTAAAGCTTGCTTCAAATTTCTCAGATATTT
TTGTCTAGATTCTCTGCCTTTTTCATCAATAAGATTTCTTACCTTACTCAAAAGAAATTGT
AAGTAATGGAAATTGAATTCAACTCTTACCATAAGTAATGGAAATTGAATTAGTCGTCCC
TTTCAAATCAATCCGATAACCAACTTGGTTCAATAATTCGGAATAGTGGGAGTACTATTT
GTTAACAACTGACATACTATTTTCCTAGAATGCAGTCCTGAACTAAGAGCTGAATTTTGG
ATTCAGGTTTATTTTAAAACAAATTAGTTTTATTTGTAGTGCTCCCTGTTTTCTCTATATG
GTCTCCGCATTTTATCTTTGATGTCTTTTTGAGTTTTACTGAAATTTCCTAAAAGAAGAAG
AATATTCCAGCCTTTAATCTCCATAAGAAAGTTAAATTTTTGTTCTTCCATAGTTTACAAG
TTTAATTATATAAAAACTTGAACCTACCTTATCAAAAAAGAAAAAGAATTATGTGAAAACC
TTGAACATTGTGAATGTTTCTGACATTTGTGCACTCTATAGGTGTGTTTGGAAGTAATTG
GTCTTTTTATGTGTTATGTAGTCTATGGCGGTATGTAATTTCTAAATTCCTCTTTGGCCAA
TCATCTGTGAGATAAAGCTCTGCAATTCTAAGATAATTTCGAATACCACCAATTGATGAT
AGCTTGTCATTTTTTTTAGTTTTTTTATTCTGTAATTTTTTGCATTAATAAAGTTGACTCTT
AGCCATGTCTATCATTAGATGTCGTAAATTTTGAAATTTCACTTTAGAAACCATGTACATA
GTACATGTTTCCTAGCAGGTCGTGATTTGTCTATGACTTTTAGGAGGATTTAAAAGTTTA
CTTCATCTGACTCCCTTTTTCTGCACATTTATACAAGTTTCTTTTCTAGTCTTGCTTCTCA
GAGCTTATCTCGATTGTTGCAGGGTCAATATCGTTCCACATTGGTCTGTCCTGTTTGCA
AAAAGGTCTCCATCATGTTTGATCCTTTCATGTATTTGTCACTGCCTCTTCCATCTACAT
CTATGAGGTCAATGACTGTCACAGTTATAAAAAATGGCAGTGATATTCAGATATCTGCCT
TTACAATCACTGTTTCCAAGGATGGAAGACTTGAAGATCTTATTCGTGCTTTAAGCACTG
CATGCTCTTTGGACGCTGATGAGACCCTTTTGGTGGCTGAGGTAAAGTGCAGAATTTCC
AGTGATGAGAAATGGTTATGGATTTCAAGTTGTTGCTTTATTGTTTCCTAAATAGAACTTA
TTACATACTGTGTATTGGATAGTCAAGTAGAGTCCTTTTTCCTATTTCCAAAATTTTATTT
CCAGCTCTTGCTGGGTTGTTGTTGTTGTATTTCCAGCTCTTACTCCATTTAATGTTACAG
ATATACAACAACCGCATTATACGTTATCTTGAGGAGCCAGCCGATTCATTATCCTTAATA
AGAGATGGTGACCGACTTGTTGCTTATCGGTTGCACAAGGGTACTGAAGAAGCCCCCT
TGGTTGTGTTTACGCATCAACAGATTGATGAGTATGTCTTGACTTCATAATTTGGGCATT
ATCTTTTTTTGCTTTAAAGTTCATCAAACATTACTAGCCATTACTCAGATGTGTCTTGCAT
GCACAGCTATGTTTCATAAGTAATAAGTTGGGGGAAAAAGTACTCCAAGGGTGGTGCTT
CCACATCATCACTCTTAATCATGGCAGGGTTTGGATGTGGGCGTACTTGATGACTACAT
TGCTTAAAAGAATTGACAAAATATTTTCGCAGATGACATATGTAGTAATATCTCAGTCTAT
TAGTTTGCTTTATGGAGATCGGGTGATTAATTCATGATCGACACAACTCCAGTTAGTTAA
TAGAGTAGGCTGTTAGTTGTCATATACTTCTATCTTGTATAAAGTAAAAATGTGAGGTGG
TTTATTTAGTACTGTTGAGCATCCTCAGTCTCAATTCGCTTCACTTGAATACATTACAAAT
CATTGTTATGCATGGTTCGTCGAGCAACATGTAGTTCAGATGATGTGTGTGATCCTTCTA
GATTATTTGGACAATCATGAAACTATTGCTTCTTCCATGCATCTTACTGCTGAAGCTGTA
TATGATATGGAATTTCATGCTGTTTTGTTTGCTGATTATGTTTAGTTTAACTTTTGATCCA
TTGAAAGATTCTCATAGTGGTCCTTGACTCATTAAATGAGATGGCTGATATTTATTTTGG
CAAAATATCATTTCCTTCTTGATTTCCTCTTCCATTCTAGCAATCTTATGAAGACTGCATG
TGCAGGCATTATATATACGGAAAGCTGACCTCAAACATGAAGACATTTGGCATTCCGCT
TGCCGCGCATAGTAGAGTTCTTACAGGATCTGATATCCGTAGTCTTTATCTACAGATACT
TACACCATTCTTAGTCCACAATACAGCCCAAGCAGATAATCTTAACTGTGATAGAAGTGC
TACTGAAGCATGTACAGATTCAGAAGTCATCACAGACATGGAACCTGGCAACTCAATAG
TAAACGGGGTTCCAGAAAGCATTGCTGAAGAAGATACTGCCGAACCTTTAGACATGGAA
TTTCAATTTTACCTATCAGATGATAAGGCAACCTTTAAAGGCTCCGAGATTGTAATGAAT
GAGCCATTACAGTCCACAGATATCTCTGGACGGTTAAATGTACTTGTAAGTTGGTCACC
TAAAATTCTTGAACAGTACAATACAGGCCTTTTCAGCTCACTGCCAGAAGTTTTTAAATC
TGGTTTTTTTGCCAAAAGACCACAAGAATCTGTCTCTCTGTATAAATGTCTTGAGGCATT
TCTGAAGGAAGAGCCTCTAGGGCCAGAAGATATGTGGTAAGTATGCAACTCCCTCACTT
CTGTGATTGTACACCATTCATATGCAAGCTATGTATTCATAACATATGAAATTTCTCGTAA
TGCTTCCCTTTTTGCTTCTTCTTTGGTTTGTGCTAATATTATAAACCCTCAACTTTTGTAA
TTACATAATTGTATTTTTCCAATTATACCATTTTATTTCATTTCTGTCAATATTTTCCACCG
CGTCGTGATTGCTTATTGTGGATAAACCATTCTTATTAGCCTCCCCTCAACCAAATTGGA
CCTTGACTTTGCAACATGGCATGAGTAGTATCCTTCCAACTTCTAGTTCAGTTATGTTAA
AGAAACAATGACAGCATCTGATGATCTTATGTAGCACCTATTGATTTTCTGCACATTGTG
CTTCTGAAATGCTTTATGTGTTGCTCTTTGTTTCTTGTATATTCATAGCAACTACAGACAG
TAATTGAGATAATAATATTCAGCTATTTACCTGGGGCCACTCATGGGAGGAAAGACTTCT
TTCATGAAATGTTTCTAGTATTCTATCTTATCATCTAATCATTTATTTGTGTCATTCGCTTG
GGAAGTATGAATATAATCTGCCTAACTTTCTTTGTCTTATATCCAACATTAGGTACTGCC
CTGCATGCAAGCAGCATCGCCAAGCTACTAAAAAGTTGGATCTTTGGAGACTGCCGGA
GATTCTGGTCATCCACCTGAAGAGGTTCTCGTACAACCGGTTTCTGAAGAACAAGTTGG
AGACGTATGTTGACTTCCCAACTCATGATCTTGATTTATCCTCATATTTGGCCTACAAGG
ATGGCAAATCTTCCTATCGGTATATGCTTTATGCAATTAGCAACCATTATGGAAGCATGG
GAGGGGGTCACTACACTGCGTTTGTTCATGTAAGTGGTGCTGCGACTTGGATTACCTTG
CTTCTTTTTCTTGGTTTTGTTTCTATTCTATGGTAAATAGGATTCTTTTATACCTGATAAAA
ATGGCATCTTAAGATCAGTACTTGGGGAGAAGGGTGGGTGGTGGGCGGTCACTGAAAC
CTACCTCCAAGGGCAAATATAGAAATTTCCTCTATTGGTCTTATTCTTATTGTTCGTAGT
GAGTGTTCCTTTGATGTATTTTTTAGTTCCAATGCATCATCTGCATCTAAATTAATCACAT
ATTGCACACATGTGCATCTATTATATATTTAACTTTGGTCATGCGTCTTCATTTTTTTTATT
TCTTCATCATGAAGAATATGCAAGAAGGTCAAATATTCAGACTTTTACAGTCTTCCTAGT
TTAATCCAGATATTCTAACTTTGTGTTTTTCTTCTTCTAATAATCTAGCAAGGTGCTGATC
GGTGGTATGACTTCGATGACAGCCATGTGTATTCCATCAGCCAGGACAAGCTCAAAACC
TCGGCCGCCTATGTTCTATTTTATAGACGAGTTGAAGAAATC
SEQ 72
TTACACTTGCCTACTACACTCTCCTTTGCCAAAACCTACTCGTCGATTTCTCATATCAAA
TTCTACCCATAAATTTTGCTGGTGAAAATTACCAATAATATTGCTTGCTATTCCAAGTGAT
TCTGACCGTCCGATTCCAACACAATGGATCCCACCTTCTACTTCATCCAACATCCTTTCC
TTATTGATCAAAATATCAACCCCGTTTTCAAATTGCAATGTCATATCACCTATCAACCGTC
CGATTTCGATCGGACGGTTATCGAAGCACATGTCGAGTGCACCACCATAAACGTAACCT
TTTTTCAATCTTGGACCTACTAACCTAACAATTTCTTCTCTGACCTTATTGTACGCTTCTT
CCACTAAGAAAGTGTACTCCGTGCCGGAATCAATGATCGTCTGGCCGGAACCACCAGC
GTTTGGCCGGAAAACCCTCCCGGAGATGTTTAATTTTTTGCCGCCAATTTTTATCCCCA
CCATGCCAACAGTAAAAGCTAGTGGATCCAAATTTGGCATGCGTTGACTTTGAGGAAAA
GTCAAAAGATTTATGTATTGAAATGTATGGGAATTAGGGTTTTGGCCTAGGTAAAATGTT
CCACTAGGTTTAACTGCATGGCTACCTTGTCTAATTGGCACGCAATATGAGAATTTTTGT
ACCTTAGCTTGGGAGGCAAAAGAAAACCGTCCAAGATTCATTCCCAAAATACCCTCAGC
ATCTTCGGACTCGGTCGCACAACCAAGAATCAAAGGAGGGGTACTTTGGGAACGTGAA
AATGTAATTTTTTCACGGACAAGATTACCCTCAGCTAAAGTACCATCAGCATAAAAGTAG
GAATAGTGGCACAAACGATTTTGGTCACAAGTAGTTGGAAGGGTAAAATCGGGAATTCT
TGGCTTACATAAAGGATGAGTACAAGGAAGAACAGAGAAAGTAGAAGACAAAGAAGGA
TCAAACGACGTCGTTGGTGGGGGTCTTTTGGGAATTTTCTTATGACATTGAATCCAAGA
AAGTTGGCTACCAGTGTCCAAAACCATTTGTTGATTTTGTGGTGGTGTTCCTATTGGTAG
TGTAACAATTAAAGCCATTGAATATTTAAAAGTTGATTTATAGTTCAAAGATGGAATTCTA
GACATAGTTTTTGTATTTTGAGTTTGTCTTCTATTATTAGAAGCCATAAAAGAAGAAAGAA
AAAGAGCTTTAGAAGAAGAGTTATGTGATAAAGATGTTGAAATAAGAGGAAATGACATA
GAAAAAGGCTTATGTTTAATGGTTTTTTGTGCTGAGATGTAGAGAAAATTGAAGATTATG
AGAAGAAGAAGAACAAAAACTCTAGAAGAAGAAGCCAT
SEQ 73
TACACTATAATTATATTTTCGTTAAATATGAAGATTTTTTCCATATTCTCTTTGCTTCTTCT
CCTTCTCCTTCCCATCTTGGCTTCATGTCATGAAAAACAGGTACAAGCATATACAATTCT
AGTTTCTCATTGATTCTTTAATCGCAGTTCTACTTCTGTTTATTCTTTGTTTTAATTATGGG
GTTTTGTTTTGCAGGTTTATATAGTGTATTTTGGAGGACATAAAGGGGAGAAAGCATTGC
ATGAGATTGAAGAAAACCATCACTCATATCTCATGTCAGTGAAGGAAAGTGAAGAAGAA
GCCAGATATTCTCTTATTTACAGTTACAAACATAGCATCAATGGCTTTGCTGCACTTCTC
ACCCCACATGAAGCCTCCAAGTTATCTGGTATAATAACCACGAAAAAAGTTCACTCTTTC
AAAGAAAGAGTTTAAGTTACATATAGTAAAATTTAATTGGTTATAGCAGGTTATTGCTCTA
TTTTCTAGGTCAGAGTAACTTGTTTTCATATGTCAAATTAATCTGATAGTGTAAAAAATCC
TGTATAAGAAACACAAGGTTCTTGTATGTAGAAGAACTTACCTTATGTATTATTTGAACA
CAGAATTGGAAGAAGTGGTATCGGTGTATAAAAGTGAGCCAAGGAAATACAGATTGCAA
ACAACAAGGTCATGGGAATTTTCTGGAGTGGAAGAGTCAGTGCAACCAAATTCCTTGAA
CAAGGATAACTTGCTACTGAAAGCCAGATATGGCAAAGATGTCATTATTGGCGTTCTTG
ACAGCGGTACATACATATATATTTGCTTACCATTATTTCCAATATGGCATTATTTTCCCTT
TGTTTTAAATTTTAAATGTATTTCCACAAAGGGCTACATAATCTAGCATGTGATTATCGTT
TCTCCAATAGTGATACAGACAATCTTATTAGTAAGACTAATGCCTTGTATGTATAATAGTA
GAAAGGGATAACACGTGAGGAATCAACCTATATATATATATATATATATATATATAAATGT
ATTTCAAAAAATACTACTTATAGACATATAAGGAAAATTGTGAGAAGCCTTGTACCAAAG
GGAGTCTAAAGTTAAAATAAAAATTCAACATGTTTAAGGATTATGGTTATATAGGATGGA
CGTGTAACTGTGTCTATCCTCCGGCTTATCACTGGCAACTGAACACGAGGGTTGCGCT
CGTTGCGGGACTCATTAATTATGAGATTATCAACTGTAACTAGTGTTAATTGACTAGTCT
GATACTTAAAAAAAAATTGGAGTATGATATTATGTGATGAATGTTGTTGGATGATTTACC
AGGGCTATGGCCAGAATCTAAGAGCTTTAGTGATGAAGGGTTGGGACCGATTCCAAAG
TCATGGAAAGGAATCTGCCAATCTGGAGATGCTTTCAACTCTTCAAACTGTAATAAGTGA
GTGTAATTCCTCTTCCATATGTTTTATATCTTTCCTTTAACTTTTTCTTTCTTTCTTTATCTT
ATCCCTTTTTATTATCTCGATGATCTGATGTCTACCTGTTTTACAATGATTTAATGTGGAT
TTTAGCCATTCTTGGGTTAGAAAATGTTCAGCTGCTCTACAACCCTAGACCACATTCTTT
TTGTTTTGGGAATTCCTGCTAAAATAAGCTGATTTACTACCTTAGACGTTTGGTTTATCAA
ATATACCAACCTATACGTATTTCTTTATTTTTCTTTTTTTAATAAACTTTATTAAATTTTATA
AGGCTGAGATGACTTTGAACGAAAAATATGATTCATTTAGTTTAAATCCAACTTATTTGG
AACTGGCATAATAGTTGTTGTTGCTATTAAATTTCATAAGTAGGCTTAATAAACATGTCAT
CAAGTTTTGTGCGCACCTATCATATGATGCCTTGTTTATCCAATTATGGATTTCAGGATT
TGCTTGGAAATGAAGTGTTTGGCACTTATCTCTTGTCTCTTATACATTGATCTAACTTCG
TAAGATTAATTGTATTTAAATGGCTTGTAATAGAAAAGGCCAAAGGTCAATTTCAAGGCC
GATTTTTGGAAGTTTTCCTTTGTCTTCTTTATCAGTTGACCCTAAACCATTCTCATAATTT
AGCTTAATTAAAATCAATTAAAAGAAAGCAGATACATGTTTAGTTTTTTAATCTTGTACCT
CTCTAAAGAGTGAAAGAGAGTTTTTTTGAGAGGACAGGACCCATTGGGTGTCCATGCCT
GTCCTTTGGTGGCCTTAGGATATCAGTGTAATAATTTCAATATTGTCCATTTCAATCAAA
CCAAGAGAGGTTATGCTGACAAGTTGCTAATTGTTTTTTGGATTCTTGCTTTGCCATCTT
GTGAACTTTGTATCCTTCCAATGCTTTGTTGTGCAGTAATTTGTTTTTTGCATGTGTGTTG
TCATTATGGTTATTGTGAAGTCTATAGTGAAATTTTGTGAGGCCCTTACTTCCAGTTTTG
CACGGATATTCTCAGTAGTAGCCAGTAATATTATCCATTTTGACTATCTCATGACTTCCA
TGCAGCAGCTTTTTGACCTTTAGAAAGTTGATGATGAAATTCTACCATTTTAGAATGATA
AGTCATTTTCTAGCTGTTAAGTCACAAAAAGAGCACTAGAGCAGTAAAACTTTTGAAGTT
TCATTGTGAGGTTGGGAGGAGTGGGCCTGATTATCACATTCTTGTCCTAATTTGTTACT
GCTACTATCCTTTTTTTTTTTTCTTATTAAGAAGAAGAAAGCCTTTTCTTCCCTTCTTTTCA
AAGGGTAGGGGGTGGGTGGAATATATTAGCCTAATTTGTCATATTTTCCTTCTCGTATAT
AACCATGCTACTATATATGTTGTACTCAAAATATAAGATTTTGTATACCTTTTCCTCTATA
TACTAGATAGTGTGATCCCCTCATGCATCATCTTCTTTTCTCTAGAAGAAAATGTTTTATT
CATGGTGACAGGGGAGGGAGAGGGTGGGAATGTTGGGATCATATCTTGATATCTTGTC
TAATTGATCATCTCAGGCAAATTTAGGGTGGTCATGTGAGTTAAACTAAATAATTTTATTT
CATAGGATCAGCCCGCCCTGATCAAGAATTACTTACTAGCTAGCCAGACTAGTGGAGC
CCTAGCCGGAGACATTCTCTAAATCATGCCTTAACGCGCCCATCTTCCAAATAAAAAAG
GGCTAGTTAGTAAGAAAGATGGAAAGACCTTTATCCATAATTCTTTCCCAGTCTACCTCC
TTCCTTAATTGTGACATGTCCCGTTGATCCCACCTACGAGCTATCTGTCTTTGCCTAGCA
AGATAATTTTTGGTCTCCTATTCTTGCCTATTTTTATAGCCTGTCTTTATCAAGCGAGATA
ATTCTAGTTCTTTTATTTTTGCCTATCATGGTAGGAAATTGGTTCGGCTTGATTGAAATTT
TTTAAAATGTTTACATATAAAAAGAGTACACGCATTCTGAACCCACCAACTCTAAATCCT
GAACTTGCTTCTCCTAATTATGTAAGATAACTTTAATATTTATTCTCCTATGCTACTTTGG
GACTTCTATTGCAGGAAAATAATTGGAGCTAGGTACTACATCAAAGGTTACGAGCAATA
TTATGGCCCTCTAAACCGAACTCTAGATTATCTATCTCCACGAGACAAGGATGGACATG
GAACTCATACATCATCAACAGCAGGAGGCAGAAAGGTTCCAAATGTCTCTGCCATTGGT
GGCTTTGCATCTGGCACCGCCTCGGGTGGCGCGCCACTCGCACGGCTAGCAATGTAC
AAAGTCTGCTGGGCTATTCCGAAGGAGGGCAAAGAAGATGGAAACACTTGCTTTGACG
AAGATATGTTAGCAGCAATGGATGATGCTATTGCAGATGGTGTTGATGTTATTAGCATTT
CTATTGGAACAAAAGAACCTCAGCCTTTTGATCAAGATAGCATTGCTATTGGAGCACTTT
ATGCTGTGAAGAAAAACATTGTTGTGTCTTGTAGTGCAGGGAATTCAGGACCTGCACCT
TCTACATTGTCTAACACAGCTCCCTGGATTATCACTGTTGGTGCTAGCAGTGTTGACAG
AGCATTCTTGTCACCTGTTATCCTAGGAAATGGCAAGAAATTTACGGTAACACGATAATC
TATTCATTTTCTGTACACTATTTCATCTAAAATGTTGTAACACTAGGATCATAACGTTTTC
CTTTATCTATTTAATTACATTCATATTGGAATGAAATTGAATCCATTTTTCGTTTGCTTAAT
ATCAGGGACAAACAGTTACACCTTACAAGCTCGAGAAGGAGATGTACCCTCTAGTTTAT
GCAGGACAAGTAATCAACTCTAACGTAACCAAAGATGTAGCAGGGTACTCTCCTTGCCT
CAAAGTTTCAATATTTTTAATTAATAATCATAATTTTCTTTTGGTTGATTATGTTAAACACT
ATCTGAAACTTTTTCAAAAAAAAAATTCAGGCAATGTTTACCAGGTTCCCTTTCGCCGAA
AAAGGCCAAGGGGAAGATAGTAATATGCTTGAGAGGGAACGGGACAAGAGTAGGAAAA
GGTGGAGAGGTGAAAAGGGCAGGAGGAATTGGTTACATACTAGGAAATAATAAAGCAA
ATGGAGCTGAATTAGTAGCTGATCCTCACTTTCTTCCAGCCACTGCAGTGGACTATAAA
AGTGCAATGCAGATTCTCAACTACATCAATTCTACAAAGTCCCCAGTGGCATATATTGTC
CCAGCTAAAACAGTTTTGCATTCTAAACCAGCACCTTACATGGCTTCCTTCACTAGTAGA
GGTCCAAGTGCAGTTGCACCTGATATCCTCAAGGTCAGAATTTACATAACAAACTTAAG
ATATTTACCTGACTTATGATTTATGCTTCCTCATCTAAATTAAATTCTGATTTTCGCTACTT
CCACAGCCTGATATCACCGCACCAGGGCTGAATATATTGGCAGCATGGAGTGGCGGAT
CTTCCCCAACGAAACTAGATATCGATGATCGTGTGGTTGAGTATAACATAATCTCAGGT
ACTTCCATGTCTTGCCCACATGTCGGTGGCGCCGCTGCACTTTTGAAGGCTATACATCC
CACTTGGAGCAGTGCTGCAATAAGATCTGCTCTTATAACCTCAGGTACCTCTCAACTAC
TTTTGAACTTAACTTATATACACTAACTACAGTATTTTAACCTGTTATAACATATATAGTTA
TTTTGCTGCAGGTGACCTGATAGTGTGTAAACATTATTTTACATTGTCGGTGTATAGAAT
TTAAACTCCTTTTTCGTCCAAAATTTTGTATTTTGAACTGATCAATCGTTATATTTTCAGCT
GGATTACGAAATAATGTTGGTGAGCAAATAACGGATGCATCAGGGAAGCCAGCAGATC
CATTCCAATTCGGAGGAGGGCATTTCAGGCCATCAAAGGCAGCAGATCCTGGACTTGT
CTACGATGCTTCCTACCAAGACTATCTTCTCTTCCTTTGCGCTTCTGGTATTAAGGATCT
TGACAAATCCTTCAAGTGTCCCAAGAAATCACATTTACCTAACAACCTAAATTATCCATC
TCTGGCTATTCCCAATCTCAATGGTACTGTTACTGTTAGCAGAAGGTTGACAAATGTTG
GTGCACCAAAGAGTGTTTACTTTGCCAGTGCTAAACCTCCATTGGGATTCTCTGTTGAG
ATTTCTCCTCCCGTCTTGTCTTTTAAGCACGTTGGTTCGAAGAGGACGTTCACTATTACA
GTGAAAGTTCGAAGTGATATGATTGACAGTATTCCGAAAGATCAGTATGTGTTTGGATG
GTATTCCTGGAATGATGGAATCCATAATGTTAGGAGTCCAATTGCAGTCAAATTGGCA
SEQ 74
ATGGCAACACGTAGAAGCTCTAGCTCTGCTCTCACGGCCCTTGCGGCGTCTCGTTCCC
GCCTACTCTCGCGGTTTCGTCCTGCAGTTTCTCGTCTCTCTCAGAATACTTTACTCGGC
ACCGGCAGGTGTCCACCTCCCAATAGTGGATTTTTTGTTGCAGAAACAACTGCTGCACT
TTGGCCGAATTATAACGTGTTGTCCAAAAGTTTCGTGCACTCTTACTCTACTACTGCTGC
TAGCTCCGGACAGGCACGACTTTCTTCTTCCTAATTGCATTCTTCTCTGTTCAACGACTT
TTCTTCTTCCTAATTGCATTATTCTGTCCGTTCAATTGGAAGTGCTAATAGAATTAACTCT
AATTGACGTTTAGATTAAACTTGAATGAATGCTGTTGGTTCTTTTATTTAGCTTTTGATGC
GAAGTGAAGTAATCTCTATTTAGATATTGTCAGTTAGAGAACTATTTTCTCAACGTTAAG
GAACATCATTTCCAGCCTTTTTTTTTTTTGCAGAGTGGAAGCCTTAAATTGTGTATTTTTG
GACGAGAAATAACAAAAATGGTCCCTTATATGTGGGGTAGAATAAAATAGTCCCTTAATA
TACTCCTGAGCAGTTTTGGTTCTTCAAGTTTGCAAAAAAGTGAGCAGTTTTAGTAGTCGT
CAATTATTTTAACAAACTCTGGTTGTTTAATTTGACGAATACGAAGTCGCATTTGGAGGT
GCATTTTTTGCCGTTTATGATATATTTGGTCTATTTCTGGTGTCTGATGGAGTTTCTGGG
ATTTAATGGGTCTTTCCTAGTGGTTCTAGTTAAATCCTTTTCTTTTTCATAGTTTCGCTAA
ATCTAGCAGCCAAATTTTGATAAACAGTTGACAAGATAAAAAATGCTCATGCGTGGCAAA
CATAGATCCTTCTGATAAGCGTCAAGCAGTGGAAAACACTTTTAGGTGCTGAAGTGGAT
TTTTATAAATTGGCAGTTACGTGTTAAGTGAGAAGTGGAACTGATAATCAATTAGTATGG
TTGGTAAAAAAACTGTTGATAAACACTTTTTTTGCTAAAATAACTGTAATGACCTTAAAGT
TATTTACAAATTCTATAATTTTAAAGTATTTATTACATAAAAAGACGAAAAATAGAGGTAAT
TAAAAGTTATGTTAGAAGAATATATTGGAGATTACAAAAGATCATAGGGATAAAATCGTA
AAAGGCTTGGTCAAACAAAAAATGTTTATAAGGTATAACTTTTGACTGATTTTGGCTTAC
AAGTTCTTCTCGTACGAGCACTTTTGATGTTTATCAAACGTGTAGATAAGCCAAAATGTG
CTTACAAGCTAGTAGGACCCTCTTATAGCTTAGACAAATACATGTATTTAAGAGTCTATT
TTATACCTACCTGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNAAAAAAATAAAAGTTACTACTTTTGTCATTTTCTAAT
TGTTGGACAGGCGGTTTCATTCTACTATTTTTTGACAATTTCGTGGGTCATCCTTGGGGT
AGGGTAGGCTGCCTTCGTCACACACCCTTGGGGTGCGACCCTTTCTCGGGCCCTACAT
GAATGCACGATGTTTCGTGCACTGGACTTCCCTTTGTGGAGTACACTTTATCAAAGTTA
CGTATAAACACATGCACTAAACTAGACTTTTGTGATAGTTGTTTCACTTGAGGCTGTCAG
TTTACTTCTTTTTTTGTTTGGCTGCTATTGGTGTTGCCATAAAAACTACTCTAGAGTTTAC
TATTGATGCTTAGAACTTATGGGGTGAATGAAAGGACGGGTGACTGAAAGGACATCTCC
TTCCTCAAACGTGTTATATCTTCTTAAACAAGAAATTTTGATGTACTCAAACAATACTAAG
GGGGGTTGTGCAGATAAAAACAGCTATTTATAAAGGTCCATTGGTCTCCCTGGCCTTCT
CTAGAATGAGGAGTGCCATGATTGTAGATATTTACCTATACCATATCGCCATTTGTCAAT
TCTCTTGGATTTGTCCTGATGATTACTTATCTTTAATTTTGGGTAATTTTAAATTGTTTTG
GCATATTATTTGTTGTTATGTTAATTAGTATGAAGATTTAACAGATTAATAATATGGACTA
CACCGAGATGGCTCTTGAGGGTATTGTTGGTGCTGTAGAGGCGGCACGGACTAGCAAG
CAACAAGTAGTTGAGACTGAGCACTTAATGAAAGCTCTTTTGGAGCAGAAGGATGGGTT
GGCTCGAAGAATATTCACTAAGGCTGGGTTGGACAACTCATCAGTTCTGCAAGAAACAG
ATCAATTTATATCTCAGCAGCCAAAGGTATGAAAAATGGAGACTGATTGTGGATTCTGAT
GAGTTCTTGGACTAGAGAATCAGATATTTTTTCTTGCAGGTAGTAGGTGATACTAGTGG
CCCCATATTGGGGTCACATCTTAGTTCTCTCCTAGAGAATGCGAAGAAGCACAAGAAAG
AAATGGGAGATTCCTTTGTGTCTGTGGAGCATATGTTGTTATCTTTTTTGTCAGACACAA
GATTTGGTCAAAAGTTATTCAGGAATCTCCAGCTTACGGAGAAGGCTTTGAAGGATGCT
GTCAATGCTGTTCGTGGAAGTCAGAGAGTAACTGATCCAAGTATGTATATATTTATATAG
CTTCATGTTTCGTGGCCATGTCTCTTATGATTTCATTCGTTCTGGTTGAATGGTAAATCC
CAGTTGGCGAAAGGATCTTACTTTATCACAGCATGAGATGATCCTTACTTTGGTTGATTG
GGTGGATCGTGCAATATTGTCTCTCTGGTTTGTCATGGACATTAAATTTTTCCTATTGAT
GGGGTGGGGGGGGGAGTCTTGTAAGGTGGTTAGAGGCTTAGAGCTGATAATATTTCAA
AAAGTACTTGGCAAAAAAGGCTAATTCCAAGTAGATGGTATAATACTCCATTCACATGAA
CTTAGATTTAAATCTGAATTTGAAGAATGACTTTTGATATCAAAAGGTAGAATGAGGAAT
TAGCAAAATTGCATTGAGGGAGTATGGATGTCCTTGATATATATGACAATGCATATTACT
GTCTCTTAGTGACTACATCATGAAGTTTCAAAATGATTGACCATGCTCAGTGCAAATAAG
ATATCTAATTGTGTCAAAATAGATTTTAGTATGGCTCCGATATTTATCAAGATCTTCCAGC
GGCTCTTCCTGTCTAGGAATGAGTTTGAAATCTCGATCCTAAGGGTGTGAACATGTTTG
CAATCTGACACGTTTGTTTACTATAGTCGTCAAAAGCATGCTTTGTGAGATAGCAATGCC
TATCTTTGATTGTGCTACCTCAATCTCCTGTAATTTTCTCAAATCCTGAGCCTTACAGCA
ATTGAACAAGTGTTGCTAGAGTTAAGAAACTATTTGGTCATTGCCTTCTATAATAAAATG
TCATCAACATAGTCCACTAGTAAATACATTAACCACATTGTTGTGAAGAGCGTGAAGTGA
GAAAAAGCGACAACCCCCATTTCGCTTAAAGCGAGAAGCGTAGCACTCACTTTTTTGAA
GTGAAGCCGAATTTTCAAAAAAAAATTAAAATAAATACTGCATAGACAACACATGTAATT
GTAAGCAAATGTTCAATACTTCAATGTAAAAACTAAATAGTAGCATCAATTAAAGCACAA
AATGAGCATCCTATTCTTCTACAAGATTGTGAAATTCTTGTATTCCACTATCATTATTATA
TTGCTCGTCATCTTCTTCAACTTCTTCTTCTTCATCAACTAGGGACAAAGTAGCCGCCTC
TTTTCCCTTCCTCTGTGAACTTGAAGTTGAGGTACTCCCCTTCAAACCATAAATCCTCTC
CCCAATTTCACACGCCTCCGCAACATCACCCCAAGTGAAATCAGAAGTTTCCTCAAATA
CTTCTTCATCTGCATGATCTTCCGGGACTCCAATTAGCCATTCATTAGCATCATTGATGT
TGTCCAAACTAATTGGATCAATTACATTGCGAGCATTGTAACGACGCCTCAATGTTCTAT
TGTACTTAATGAAGACTAGATCATTGAGACGCTTCAAGGTTACTTTGTTCCTCTTTTTGG
TGTGGATCTGCAAATAATAAGTAATTAGTAAGATTAGGACAATTGAGATATAATTTAATTA
TCTCAATATACTAAATCTTTCTTCTTAGTTCTTACATGTTCAAACACGCTCCAATTCCTTT
CACACCCGGATGAACTACATATTAGACTTAGAACCTTAATGGCAAACTTTTGTAAATCTG
GGGTGAAATGGCCATATTGCTTTCACCATTCAACTGTAGAAGTAGAACAAATTTTCAAAA
CATAATATTGATAAATTAATAACTTATTAACTATAAAACAACATATAAGCACTCGTTCACC
TGGTGACTTCGTCTTTTTTTGTCTAATCGCCATGTTTTTTCCAAAAGTTGCTCAGCATTC
CTATAAATACTAAATTGCTCTGTTATTTTATCTTGCACGGATTCTTTGGGTATCAACTTCT
CAGTACATTCATAGTATCCATTCCACAAATGTTCATCTCCTAGAATCCTCTCTTCATTGTC
ATAAAACAGTTCCGGGTTCAAAATAAGTCCAGCTACATGCAAAGAGCTATGAAGCTTAC
TATCCCACCTTTTATTTATGAAATACACTTTTGTATTTTCTTTGATCACTAAAAGAGACTT
GAATAGCCTTCTTTGCCCTATCCATTGCTTCGTACATGTAGCCCATTGGTGGCCTTTGC
TCCCCATCCACCAAACAAAGCACTTTAGCTAAAGGACCACCAATCTTCAATGCATGAAC
CACATTGTTCAAGAATGAAGGAGAAAGTATAATATCTGCAAATTCTCTCCCTCGAGCTTC
CCTTCCATAGGCACCGTTAGTGTACTCATCTGAAACAAACAACTTCTTCAAATTGCTTTT
TTGCTCATACATCCTATGCAAAGTCAAGAAAGCGGTAGTAAATTTTGTCTTTGCAGTTTC
ACCAAGCTTCTTTGTTTAGTGAATCTCTTCATCATATTCAATAACAAAGGCCTTTGAACAA
TAGAGGAATGCACTCTAATTGCCTGATTAAAGACTGAATTGATGGGTCTTTCCTTGAAAA
TGTCACCGAAAATCAAATTAATGCAATGTGCTGCACACAGAGTCCAATAAATATGCGCG
TACCCAACAGACATCAAATCACCAGCTTTAACATTTTCACTGGCGTTGTCCATGACAACA
TGAACAACATTTTCTGCTCCAATAGAGTCTATTGTACTCTTGAACAAGGAGTATATTTTG
GTTGAATCAGTCAAAGAGTCGCTTGCATTAACGGACTCAAGAAACATACTTCTTTTAGGA
GAATTCACCAAGATATTGATGATCATTTTTCCATTTCTCGCCGTCCACTTATCCATCATAA
TGGAACAACCAAACTTGTTCCATTCTACTTTGTGATCCTCCCCGATTTTGTTCAACTCTG
CCGCCTCTTTTTTTTAGATATAGACCTCTAACTTCATGATAAGTTGGAGGCTTCATTCCT
GGACCATATTGGCCTACGACATCAATAAAAGCAGAAAAAGTGTCAGTATAATTAACACA
ATTAAAAGGAAGACCTGCATCATACATCCACCACGCAAACATTGTGACTGCACGAACCC
TCAAAATCGCTTTGGCATCAATTTGAGGATTACCACTTTTTCCTCCCTTATCCCCAGATT
TTTGCGGGAAGTAACAATCCATAGGACCTTTGGTCTTGCCAGTAGATCCACAGCTAGAA
GACATTGGTTGCATCCTTCCCTGCTTTTGTGATTTTGGTGGAAGCGACGAAGCATCGTC
ACCTTCTTCTGTTTCATCATCGTCATCATCATGATTATACAGTTCTTGTTCATGAATCATT
TGAGTCTTTAACTCTTTTTTTTTTTGAAGGAATGCTTTCAATTCTGCCTTCACATGCGATG
GAACTTTAGGACAATATGCGACATTTGGATCACCACCGATTAGATGCGCTATTTGACCG
ATAGATTCCCCCATTTGAAATCTTGTCACAAAAAAGACATCTAATTGCCATCTTGTTTTGT
TGGTCTCGCTAACTCTTTCAGAGTAAGTCCAAGCCGGATCTTTCCTATCTTCTTTTGGTG
CCATTAAAAGAACAACTACATAACACATAAGAAAGACAATAAGAACTAAGAATAAAGGAT
ATAAAAATTAGAGGAAGGGCAGTAATTCTCTTTTAAAAATTGGGCAGAGTTAAAAAAAAA
AAAACTAGGCAGTAATTCTCTGTTAAAAATTGGCATCAATTCTGAGAAAGAACCGAAGGT
TAAAAAAAAAAATTTGCCAGCATTTCGCTGCTTTTTGGACGTTTAAACAGTGAAAGAAAA
AGAAAAGAAGAAGAAGAAGGAGGAAAATCAGAACATACCTGTTGCTGTTGAAGACTTGA
ACTTGAAGTTGAAGACTTGAACTTGAAGTTGAAGTTGAACTCTTGAAGAAGAAGAAGAA
CCCCAGTCGATGCTTGTCGATGCTTTCAGAAGTTGATGAAGAAGAAGAAGTCGACTAGT
GTCTCTGCTTTAAAACCCTAGTCGCATTTGTTCGTTTAATGAAAGACCAGACATCTTGTT
TTTAATAAAATAGGGTTGAGTCTGATTTAAAACACAGAAGCGATCGCTTCTCTCGCATCG
CATCGCTTTCCTGCTTCTCGTTTTTTAGTGGGAAGCGGTCGCTTTTCTACACCTAAGTC
GCTTCACCCTGTTGAAGCGTGCACTTTCTTGCTTCGCTTCGCTTCTCGCTTAAAGCGAG
GAAGCGGACGCTTTTTTAAACACTGATTAACCATGACCATTATAATTGTATACGGGTAAA
ACCGAGCCCATGATGCACCTCGATTTCCGACAAGAGAAGCCAGGCTCGAGATGTGATG
GCAAGGGACAAATATCAAGCCGAAAGTCCCATTGAGCCAGAGCCCTGGGACACGATGC
CTGCCCTCGGGAATATCGAGGTCATAATTACAGAATCGGTCCTAACCTCGAACAACTTC
GAGGAACATTATCGGACGATCAAGCGTAGCCAACAGAAAGCCGAAATATCCATGACCG
GCCGAGTATCACGACGGGGATCTCGGCACGTATCGATAAGGAACCTTCAACCAGTTAA
TCAGAAGACCTTTTACCTTTTACAGAGTTGTACCTAAAGTAGGACTCCCCTACTATATAA
AGGGGGTTTGATAATTCATGTAACACATTGAAAACACGCGTTCCAAGGAAATATATTATC
ATTTTCTCTTTTATCTAGCTTTTTTCACTTGTTCATCAGTGTTGACTATAGCAAGCCCGGG
ATCGAGGGTGAACAATTTTACTAAGGTTGAATCTGTCTTATTCGCATGGTTTGAATTCAT
TTTATCTTTACTAGTTCAATCTAATCCAATTTATAGCTTTGTGTCAAATTAATCCGCGTAT
CCTTAAAACCACTTATAAATTCAATTGTTATCCGATTTTGAGGGTAAACAATAATGAAAG
GAAATTGAACGATCAGCTTCACTTTGGACACTTCCCATGATATTGGACCACGAACTGAA
TTAGAGAAATTCCAAGCCACACTCCCTATAAGTGTTTTAGACCATATAAGTGACTGTCAA
CAGCATACTAAGCCTGAATCCTTTGAGCAACAAGCTAGAGTGGTTGCTTCATGTACACC
TCCATGGCTAATTCTCCATGCAAGAAGGCATTTCAGATATCCTGTTTATAAAGTGTACAT
TAAAACATTGACGCTAAAAAGTTAAACAGTTGTGTGGGTGGATACCATATAAGTAGCGA
CGTCCAAAAAACACTTACCAGAATTAGCATGAGTTTCAGATTCAAATTCCTGCGGAGGC
AAAAGACACTAGGTGATTTCTTTTTGTATGTCCAAGTCTTGGTGAACATATGTGAGAAAG
AAAGAGTTTGAGAGAGAACTCAGCTTGATTATTACCTTAAGCAGTGGGCTTTAAATAGCT
AAGTGCACATATTCTAAGTATTAGATCAAGTTTGAGACCTAGTAAACTTCCACAGGAAAA
GGGATAGAGTCAGATACCTAAACCTAGTTATACTCTAAAGTTTTTATGTAATTAGACCTC
CTTAGTCTCTATCCTAATGGTATAGTCTCATCTGTGCTGTCCCAAATTAATATATTGAAG
AGAAATTCAATCCCAAAGTTGTGTGTGAATTTATATGGTATCAGAAGCCATGTCGATATC
TTCCTCCTCCTCCCCTCCATCACCTACACTCGTTAACCCTCTTTCTTCGTCATCTTCCTC
GCATGCACCCCTTGACCATGCTCATCACTTCATTTCAGTTAAATTAACTTCTACGAAATT
TTTCTTTTTTGGAAGACGCAACTATTACATTTTCTTCGAGGACAAAATCTCCACAAAAACT
CCACGGCTATATTGATGGAACTAATCCTTGCCCACCATCACACACTACGGTTGAAGGCA
AAGAAATACCAAATACAACCTATGTATAATGGATCTAACAGGACCAATTGATTCTTAGCC
TGTTGATTTCATCACTTTCCAAAGAAATGTTGCCCATGAAAATTGGTTTAAATACCTCCA
AAGCAGTTTGTGATGCACTTGAGGCAGCCCTATCCGAACCTTCAAATGCACGAATCCTC
AATCTTCATATGCAACTTTAAAACTTGAAGCAAGAAGATCTTTCGGTTACTCAATACTTG
CACAAGGCCAAACTCATCTCCGACGAGTTGGCAGTTGCTGCAAGGCCCCTTCGTCTTG
CCGATCAAAATGTGTACATCTTTAAGGGACTGAGATCTGATTTCAAGGACATTGTTACAA
CTCTCTCAGAACGACATGAACCAATCACATTCTCAGAACTTCACAGCCTCTTGCTTAACC
ATCAATTTAGACATGGTTCCTCTATCTCCTCACTTTCCTTAACCACCCCAAAACCACCTG
CTCTTCAACAATACCCACAGCTAACTTCAATCAACGAACTACAAATCTGATCGTAATAAT
GGTTTCAATTCAAATAGGGGACGAGGCAGATCTTCGTGTGAAAGAGGGGGTAGAGGTG
GTTGTTCATCCTCAAGGAATTTCTCTAACAATGGACAATCTTGGTCTCAATATGATCAGC
GAACCCGGTGTCAAATATGCAATGGTACCAACCATCTTGCATCAACTTGCTTCCAGAGG
TACAATCACTTGATTAACCCTATGGCTTATTTGTCTAACCAAGCTCCTTTACCCTCAACTT
TGCAATGGTTTGCGGACATTGGAGCCACTCGCTACATCACTTTGGATCTCACAAATATT
CATCAAGTTGAAGATTATAGGGGTTCAGATCAGGTCCAAATTGGCTATAGACATGGCCT
TTCTATCCATCGCACTGGTAACTCCTCTCTCTGATCACCCTCTTGGTCTCTCTATCTTAA
GAATATCCTTCATGTTCCTTCAATTACCAAACGTTTACTCTTTGTTCAACTTTTGCTCGTC
ACAATAATGTCTTCGAACTTCATCCCTTTCATTTTGTTGTCAAGGATCTACAATCCAGGA
CACCTCTTTTTACAGGGCAGAGTGATGGCGATTTATACACACTTCCATCCAAGTCTTCTT
CTTCTTCCATCTCCCAGCCAGTTCCAGCCTCTCCAACAGCTTCTCTATCCATCAACACAT
CACCTTCATGCTGACATCTTCATCTTGGTCACCCCCATCAACTAGTACTTACGCAGATTC
TTAGGACCTACTACAATCTGAAATGAATGCTTTGCTACGAAATAATACCTGGTCTTTGGT
TCCTCATAATCTTTCAATGAATGTTTTAGGATGCAAATGGGTGTTTCTCATTTAAAAAAAT
TTCTATTGGGGCAATTAAGAGATGAAAAGCCCATCTTGTGGCTAAAGGTTTTCATCAACT
TGAAGGCCAGGACTACTCTAAGACTTTCAGTCCAGTTGTAAGGCTGCAACCATTTGCAT
TGTTCTATTTTTAGCAGCTTCACATGGGCGGTCTCTCCAACAATTTGATGTGCAAAATGC
AGTTTTACATGGTGAGCTTCAAGACCATGTGTTCATGAGCCAGCCTTCAGGTTTCATCC
ATCCTCTTTTTCCTCATCATGTTTGTCAACGTAAGAAGTCACTATACGAGCTCAAATGGC
TCCCAAGGCATGGTATATGCGTCTCCATAAGTTCTTGCTCAGCGTAGGCTTCATCACCT
CTAGATCGGACACTTCCCTGTTTGTCTGCAACTCAAATGGTGTTGTCGCCTACCTCTTA
GTATACGTTGATGATACATAGTCACTGGCAGTGGTACCTCCTTTTTAGAATCCATTTTCC
TCAAACTTGGAGATGTCTTTTCCATATGTAATCTTGGTCCTCTCAGTTTCTTTCTTGGTCT
TCAGGTTTCACGTGATCACCATGGCATCTCTATGTCCCAAGCTGAACACATTAAGACTA
TTCTTGCAAGAGCACGTATGTAGCACTGCAAACCTTTAATTACTCCCATGGAAGTGAAT
GTCAAACTTCACAATGGAGAAAGTCTTAGCTTTCATGATCCTACCTTGTACTGTCATATT
GTGGGCCTTACAGTATGTTACTCTCACTTGGCCGGACTTAGCTTTTGTGGTGAATAAAG
CTTGTCAATTCATGCACAATCCTACTATGAGTCAGTGGGCAGCAGTCAAGCGCATACTC
TGCTATTTGATGCATACCCAACGTATGTGTTTTCACATTCCTAGGTCTCTTACACTCACT
TTTCAAGCCTTCACACACTCAGATTGGGCAGGTTCACTCGATGATCGTAAGTCCACTAC
GGTTATGCCATTATCTTGGGTGAAGCTATTCTCATGGTCGTTCAAAAAGCGGCGCATTG
TAGTAGATCTTCCACAGGTTCAGAGTATAAAGCTTTAGTAGATGCAGCTGCCGAGCTGA
CTTGGATTCTGTCTCTCTTGTTTGAGCTTGGTGTTTAACTTCCCAAAGCTCCAATTCTAT
GGTTTGACTACCTATCTTTCGGTAATCCTGTGTTTCATGCACGAACCAAGCATGCGGAA
ATTAATTTTCACCTTGTTAGACAAAGTAGCTCGAAAGGATCTCACAGTTCAATTTTTATCC
TCCAAAGATCAGCTTGGTGATGTCTTCACAAAGCCACTAGCTTCCTCTAGATTTGAGTTC
CTTTGGTCGAAGCTCAATGTGGTTTATCCACCTCAGCTTGCAAGGGAGTATTGTATCAA
CTTTGAGTCCTGGTAAACTTAGGATATAGTCGGGTACCTAAACCTAGTTATACTTTGAAG
TTCTTATGTAGTTAGATCTCCTTAGTCCCTATCCTATAGTATATACTCATCTCTATAAATG
TACGACCGCTGTACCAAATTAATACATTGAAGAGAAATTCAATCACAAAGTTGTGTGTGA
ATTTATACTAAAAGGAAAGAGAATTAATCACAATGAAAATACAATCAAGCTATACTCTATT
TACAAGATCCTAGAATATTCTAGAATATTGACAAGATTCTAATAAGAGTTAGCCCGTATC
TGTGCTGGTGGGAGGTAGCAGGTATCCTTTAGAATTAGTGGAGGTGTGCGCAAGCGCC
AGAACACCGTGGTTATTAAAAAAAATCCCTACGAATTATTGGTATGAATAACAACAATCT
CCTCTATCATGGAATGACTCCATATTAGATGTATCAATAATTTTCCAACGTCTTTTAAATG
TGGAAACACTAGAAAAAGTTGACACATTGATTAAGAAAGTAACAAAATGATACGGGAAG
TACGAATATCTTTGCGTGAGGTAATGAATGGAGTTGTTATGTCTGTTGAAGGCATGATTA
GGATTAGTGATGGAGGAGGAGGAGAAAGTCATTGAATGCTGGCGATGTACTTGTATCA
TGAGAAAAATCTCATCTGTATTAGTAGGAGGATGTTTATGAACTGATTAGACTCGGACA
GGACAAACTTGTGTTTTAAGAAGTTAGGGACTAGTAGATGTCAGGAGGGAAATTTTATTT
CATAGAATTTGGTGAAAAGTATAGAGATGAGTGGGAGGTTACATTTGCTGGGACGAACT
TTACTTTTTGGGGGATTATAAACATTTGCAACTTTTCTTGAAAATGGGTGAGTAAAAAAA
GACACAATTCTGATTAGGATTGAGGTCTGTCTTTTCCAGTAGTGTGATCATGAAGGAAG
CAAGTACATCCAAGCACTTCCAATTGTAGATGGTGGAGATTTTGATCTGGAAATTTTAGT
GTGGACAGTAAAGTTGCAACATAGAAAAGATTGATTTGTACATTTGCCATCGATAACAAT
GCTTCTAGCAACATGATTCTAGGGTGAATGGGTTTGTTGGGGCTCTATGAGTCTTCCTC
GAGTGGCGGTGAGGTCTAGAAGTCGATGCAAAGAGAATTGAGTGGTACAGTCAGTGAC
TATGACTCATTCTGTCAGCCTCTGTAATTTTTTGCCGCTATAGCTTTTCAGTAAGTAGTTT
CTTCCCTGGTTTTAATGTCATGAAAACTAGAAATAAAAGAATGAAAAATGATGGAAAAAC
TTGATTATTTCCTCAAGTTTGATTAAGAACTTAAACTAGTTACAATGTTGGTAAAATACAG
AATATGGTGAAGTAATTCTCCTAATGTGACTCATTCATAGCAACTGAGTTAACCCAATAA
ATGCAAATGACTGGACGATCTCGACAAATCCAAGTCTATCAATTTCAACAAGTTTCGCTG
CCTTAATCATGACAATATATTTAAGTGATGGTCATTAAATTGGAAAGAGTTGCTGTTGCT
CTTGTTTTTGCCATCATTCAGCTGTTCACTGTGGTAGATTATGGTTTCCTACCAAGTCCA
ATGAAACTGAGCAGTCTTGACAATGCCTGATGTTCAGTTTCTAAAGTCTGTTCTCTCTCC
AAAAGTAGAGGAACATAATGTTATCTGATTGGCTCGGGAAAAAGTTGTATGTAGGGTGA
AACCTTAATGACATAATGAAACATGTAATGGTCTTTGTGCCTTTGGTTCATTATGTCTGC
TACTAGATACTGAAATTGCTGCTGAAAGTGCTTTTTGAGGTGTCACTCATTTTTTCTTGC
TGCTATTAATACATAGCGTTCTGATTCTTTTCAGACCCAGAGGGAAAGTATGAGGCACTT
GAGAAATATGGAAATGACTTAACTGAACTTGCCAGACGTGGAAAACTTGACCCGGTGAT
AGGAAGAGATGATGAAATACGGCGCTGCATCCAAATATTAAGTCGGAGGACAAAGAATA
ATCCTGTTATTATTGGTGAGCCTGGAGTGGGGAAAACTGCAATTGCCGAAGGGTATGAT
CTCTAGCCTTTTTTGGTCTCACGGGGTGATGTATGAACATGTTTTTCCTTATATTTATTTG
TCTGGATCCTGGTTCAGTTGAAAACAATCCAAAAACAGTAATGGAATAGCAGATCTGTG
GAGGACATTTTTATTATTACTTGTCTACAATGATATTCTTTTGGTTGGATATGCTGTTTTA
ATTAGTTTTGTTGAATAATGCTGCCTGGCAAACCAAACATTGAACTTTAAAAGAGTTATG
TTTCTAGAAAAGATATGCTCGGAATAGCATGATTATCCACTCAGAGAGGTTTGTGATATT
TAGCAGACTGTATGTGGGGTTTACAGGAAAAAATCATTGCTAATACATATTGTTTCTGGA
CAATACGTTATTCCTGTTAACTATTTGAATTGTGAGATGGGTCAGGGTGTGTTATGTGGA
GTTGTTGAAAATTAGTAATTGTAGGATGGAATGAAACAATTATAATCTTTTTTGTTTACCA
TGGTTCCTTGTGTTTATTTTTAATGAGCAGTGGCGTGGTAATGAAATGTAGTTATAGAAC
TTCTTACTAAGGGGTCTGCCTATATGTTGCAGAATCAGGATCGCTGACAAGGTTCAGTG
AAGAGTTCTGATGGATGCATTCAACATTTCTTGAAAATATATATCAAAATAATGCTGTTGT
TAGTGAATCACTGAATTTGTGTTAACTTCCCTTTGATTTGAAACTACCTTACCCGTAAGC
CTGTAAAATTGAGGTGGCTTGCAGAGCCTTTGAATGATCAAATTTCTATTTAATAATTAAT
ATAATGCCAAAATTGTGGCAGCTTGTAGAGCCTCTGTGTGACCTCGTTTTAACTTTTTAC
ATAGGTCCAAGGTAGTGTTAGCTGTAATATTTGCGTGTACTTAAGAGTCCTGTTCAAGT
GGTGGCATTTTTCACGTCATCTACTTTATGCAATATGTTATGTTTCCAGCCTTCGTGAAA
TGGGGATGTGTTTTTGACAGATTAACTGATAAAAGTCAATCAGTTCTGTCTCTTGTAATG
ATCTTTTTCCAGCAAAGGAGGCTTTTTATTTATTACTTGTTACCAGTACGTTATGAGATTT
AGAGCCTTTGTGGTTTGGATATTTAAAAAGTTAATCAATACTTACTTTTTATTAGATCAGT
TGTGCTAGTTGTAACTTACTTATCTTACATATCGGAATAATTAGTTTGGTTTAACTGCCAG
ACGAATGTGTTTCTGCACCGGAAGTGAATAATTCTAAATTGATATTGGATAGTGACTTAT
TTGAGTGTATCTGCAATATAGTTTTTTTTTGCTTAGACGTAGAGAAAGCATATGATGAGT
TGGGAAAAAATGTCGTTTAGTGGGTGCTTGAGAACAAAGAAATTTCTACTCAAAATATAT
AAGTACCATAAAAGATATATATGAAGGAGTGGTCACAAGTGAGGGACCAGTGGGCGGA
GATATTGAGGAGTTCCTTGTAACCGTGAGTCCATAATAGGGAGCTGCCCCATAGTTGCT
TACCCTGGCTATGGAATAAGTTATTCAGTAGCAAATAGGACGGCATTATGGGATATGCT
ATTTGTCGATGATATTTTGTTATTTGACGAGGGGCAGGGCAAGAAAGTAACAATCATAC
CCGAGAGAGTCAATTGTTAATTGACGAAATTAGTTAGGAGCCGATCAAAAGTTGGAATT
ATGGAGAAGCACGCTAGAGAATAAGGATTTTACAATAGGTAGAAGTTAAACAGAAGATA
TGCAATGCAAGTTTAGCTTTTGAAGTGAGGTTAGATAGGATACTAGTGTCGAGACAAAA
AATTCAGATATCTAGACTCGATCTTTCAATAGAATGACATGATAGATGAATATGTAACAC
AATAATATGATGGCCGAATTGAAGGAAGCCCATGGAAATGCTTTGCGATAATGATATAG
CTACTAAAGTAGAACATAAGGTCTATGGAAACTGGTGATATAAACAGGGTTATATAGGA
GTAAAGGCTGGAATTTTAAGACCCACAATATCGGCAAGATTAGCATCGTAAATATGTGG
TGTCAGATGGATGTACTATACGACCCAAAAATGGTTGAGTCCAAGTCACTACATTGACA
AGATGAGCATCGCAAATATGTTGCGTCAGATGGATGTATACACACCAAAAATGGTTGTG
TCCATGTGAAGGTGCATGTAGCACACAGTGATAGTAAATTGAGAGATGGCCACATTTCC
ATCATGTCTTTCCTAGGCCTTCCAAGTGCATTGGTTTATTGGTGAGACTATGATGACTAA
AGCTGTTGAAAAGGTATGAACTAGATCTAAAATTACATGAAGAGAAGTCATCTAGAATCA
CCTACAATCTCACAGAATCTGCGTGGATTCATTATGAACATAGCACAATGAAAGCAAATT
ATCAAAAGCATGCAATAGTAATTAGCTAGAGTTGAAGCCTAGTCGCTATTGCTCTTACAT
TAGGTTGTGTGTTTTCCAGGAGCTTTTAATTTGGTTAGAGATTTGTATATCAAATGTGAG
ATATAGAGACCCTCTAGTTTTAACAAACACCCTTAAGTTATCTCAAAAGAGAGCAAAATA
GATAGATGATTCATATAACGATCCCAACTAGGTTGGATTTGAGGTATTGATTAGAATGAT
TGATATAGTTATCCAAATTTTAAAAATCAACCTTATTAGTAAGGCAAAGATGCTCTTAACA
TGTTAAAAAGAAGTCGAACAAGAAATTGTTCTTCCTTTCCTTTGATATAAGATATTTCTCT
CCCACCATCCTGGAAGGAGAAGGTAAGGATTGTGAGAATGCATGGGAGAAAAATCTAT
CTTTTTAAATATATGATACAACGAGCCCTATATATAATATATATTCTACTCCTACTACATAT
AGGACTAGGACATATTCTACTCCCACTGACTGGAATGATGCACACAACGGCTAGAATAG
CCTCCAGAGAGGGTGGAGCAGCAACCGAAGAAAATGTTGGCGGGTTGGGCTGCCGGT
AGACAGAAGTTATTGAGTCTCTACTAAGAGAATAGAGGTTTTGGTATCATGAGAAAAGA
AGAGAGTACTTATTGACTTGATTATTGACACAATGAGAGTGTTTTTTATAAAGGATTCTTA
TTCTAGTGATATAAGCTTAAGTATTTATATTATGCTAATGATATGAATAGTGATTTTTCTCT
TGTAGTTAAGTAAAGAATACTCCAAAATATCATATAAAATATTTTACATATTCTCCTATTCA
GATACAATGAGACTAGGCAATATTAACCTATAATTACTTTGATTCTTTTGGTATTTTAACA
CGGTTTTTAGTCTTTAAAAGAAATCAGGAAAAAGAGAGGTGAAAGATGGTATCAGCCAA
ATCTATATGACAATTAACCTCGGAATGATTATGCTTTCTGCTCCTCATCCATGCAGGTTA
GCTCAAAGGATAGTCCGCGGTGATGTTCCTGAACCTTTGATGAATCGGAAGGTTATAAC
TTCTCTTCCTTGGTCTTAATTTGATTGACTTTATTTTATTGTGAAAAGGTCAAGTGATTGT
GCATATGCAAGATCTTGGAAGATGGCCTTTGCGTAGTGTGCATTGGATTGCTTTCTCTT
CAATTTGAAAAAGATAGTATCTGGGAAAGGCATGTCTTGAAGAAATAAATCAATGGTTGA
AAACCTGCTGCCTTTCTCTTGAGAACCATGAGCCGTAACTTGTTGCTTACTGCAAATAGT
TCTGTTTCTCGTGTTGCAGTGATTCATTTGTATGACGCCAGTATTATTGGATCATAATTTT
CTTTTCAAGATATGAGTCTGCTAAAGACAGTTGATGCAGTCTTTACTCAAATGAACGACC
GTGGTTAAGTAACATATTGTTTTAATCATTTCTGGGAGCAATTTCAGTCGAGTTTGATAC
TAAGTGAGATATGCATAACATTGCACTCAAATGTCAAACTAAATTTTGATCATCTAAGCT
AGTCCGATGAGTTAGTCTCTTTTCTGAAACCCCCAACCCCTCACTCAGCAAAGGGGTAG
GAGAAAAGGAAACGAACCACCTCATTCTCTAAAAAGAGTGGAACTAAATTATGATGCCT
CCTAAAGTAAAGTAGTTGGACATAAGTTTCTGAAGTTTCAAATTTGAGAGTGGGATTCCG
AGGAAAGCCTTCTGTAGCTTGAACCTTAATTGTCTACTAGGATTTGTTAATTTCGACTTT
TAAGTGCTAATACTGTAATTTGATTTCCTTTTTCCTTTTCATTTTATTTAGTTGATGTCTCT
CGATATGGGTGCCTTGCTTGCTGGTGCAAAGTACCGTGGAGATTTTGAGGAAAGGCTG
AAAGCTGTTTTAAAGGAAGTCTCTTCATCCAATGGGCAGATAATATTGTTTATTGATGAG
ATACACACTGTAGTTGGTGCAGGTCTGGTACTTTTTTTTTAATATCCATTTCTCCATGAA
GGAAGAAGTTTATTTCTACCGACTGGTTAGAAAATTTGCCAAATGTATTCTTTCTCTCTA
AGATCAAATCTCTATTATTTATAGAGTTCGATATTAAAGAAAAGTGCTGACTCAAACTGC
TTGTCTGCTTTCAAATCTCATGGCGTGTGGACAGCAAGTAGTGACTGCTCTTTTTGTTG
GATGATTTTTATCCTTGCTTAATTCAATCTTAAACAATCCAAAGTTCATGATTTAATTCATT
TGATGTCACTGGGAAACACTTGTTCTTCATTCTAGTGAGGTTAAAAAGCACTATTGCTTC
GTGATTGTGTTTTAGGTGATCCATTTTTAAATTTGAGCCTGGAGCTGGGAAATGCTTGAA
CTAAAGTTTACTTTTCTGTTAATTTGAGCCTGGACCTTGGAAATGATTGGAATAGGTTTA
TTTTTCTGTTCCATAAAGCTGATAAAACATTACTGATGATGTCCTTTATGTATGCAACCGA
GTAAAGGAGAAGGTACCTTTACACCCAAGATATAGTTTCTGTTTGTAGCTGGCTTATATT
AGTAACTAATTCAGCGAATGTTACTTGGTCAAATTATGTTGATACGTTATATTCTAAAACT
TTTGTTTGTTTATACCTTCCTAATCATTGTGTTTGCTTATGCTTTGTGTCTAGGAGCTACT
AGTGGGGCCATGGATGCAGGGAATTTGTTGAAACCCATGCTTGGTCGGGGTGAACTTA
GATGTATCGGAGCAACCACTTTGAATGAATATAGGAAGTACATTGAGAAGGACCCTGCT
CTGGAGCGCAGATTTCAACAAGTATATTGTGGCCAACCATCTGTGGAAGATGCAATTTC
CATCCTCCGTGGATTGCGTGAACGATATGAGCTGCATCATGGTGTTAAAATATCAGACA
GCGCTCTTGTATCAGCTGCAGTTCTTGCAGATCGATATATCACTGAGCGATTTTTGCCG
GACAAGGGTAGGCTAATGTATCCTTAGAACTGCAAGTTGTCTGAAATACTTGCTTTTCAT
TCCTATAAAATTCTTGTGAACGTTTTTCATGATATCTTCAAATAATACAGCAGCCTAATGT
TACTTTTACATAATAAGAAAGTTACAGGGTTACAAGTAGCTTATTTTTATGGCTTCTTTAC
ATGTTTTATTGCATTGAGTGGATCAATGGGTCCAGATTTTCAAGCTTCTTCTAAATGTTTT
TAGCTGTGCGTGATCTGACATACGTTACTTGGGGCTTTTCACTTATGCTCAGTTCTTTCT
TTTCAGCCATTGATCTTGTTGATGAAGCTGCTGCAAAACTAAAAATGGAAATTACTTCAA
AGCCAACTGAATTGGATGAGATAGATAGGGCAGTGCTAAAGTTGGAAATGGAGAAACT
CTCCCTGAAAAATGACACGGATAAAGCATCTAAAGAAAGACTTAACAAGCTAGAAAGTG
ATTTGAAGTCCCTTAAGGCAAAGCAGAAAGAGTTAAACGAACAGTGGGAACGCGAGAA
AGATCTGATGACACGTATACGTTCTATAAAGGAGGAGGTAAATTGCATCTTTCATTGATG
AGGTCAAATCAAAGTTGCAGTTTTTCTTTGTTTTCTCATGATTACTGTTCAATTTTTTCCG
TTGCGTAGATTGACAGGGTGAACTTAGAGATGGAAGCTGCTGAACGTGAGTATGACTT
GAATCGTGCTGCTGAACTCAAGTATGGCACCCTAATCTCCCTTCAACGGCAGCTAGGA
GAAGCAGAGAAAAACCTGGCAGACTACCGGAAGTCTGGGAGTTCGTTGCTTCGTGAAG
AAGTAACAGATCTTGATATTACTGAAATTGTTAGCAAGTGGACGGGTATACCACTATCAA
ACCTTCAGCAGTCTGAGAGGGACAAGCTTGTCTTTCTAGAGAATGAACTTCACAAAAGA
GTTGTTGGTCAGGATATGGCAGTAAAATCTGTGGCTGATGCAATCAGGCGATCTCGGG
CAGGCCTGTCCGATCCAAATCGGCCCATTGCAAGCTTCATGTTCATGGGTCCCACTGG
AGTTGGCAAAACTGAACTTGGAAAAGCTCTTGCTGCGTACCTTTTCAATACTGAAAATG
CTCTGGTGCGTATTGACATGAGTGAATACATGGAAAAACATGCTGTTTCACGGTTGGTT
GGTGCACCACCAGGTTATGTTGGATATGAAGAGGGTGGGCAACTCACTGAAGTGGTCC
GTCGGAGGCCTTACTCTGTGGTCCTTTTTGATGAAATTGAGAAAGCGCATCATGATGTT
TTTAACATTCTCTTACAGTTGTTGGATGATGGAAGAATAACTGATTCTCAAGGGAGGACT
GTTAGTTTCACAAACACTGTTGTAATAATGACATCAAACATCGGGTCACATTACATTCTT
GAGACGCTGCAAAACACTCGAGATAGCCAGGAGGCAGTTTATGATGCGATGAAAAAGC
AGGTTATTGAATTGGCAAGACGGACTTTCCGGCCTGAGTTCATGAATCGGATTGATGAA
TACATTGTTTTCCAACCTCTGGACCTTAAGCAAGTTAGCAGAATTGTTGAGCTCCAGGTA
ATACAGATCTGTAATCTGTTGAATTCTGATTCTCCTGACTTCATACGTTTTTCTTCTGTGT
TGTTTTCTGTTTGCTGCGGTGTCATCTGCTTTCTGATTACTTTGACTTTAAGAGTTTTATA
AGCACTACAGCAGATTACTGTTTGTGCGTTATCTCTGTAAATTTCAGTTTTTCTGTGTGA
GAACAAAAAAATGTTTTAGTGTGCATTAGATCTCAAAATTACACATAAGTACATCTCATTT
GCTTGGTGGTCGTCGTCCTAGTTTGTCCTCCTTGCTGCTTTCTGATGAGTGCATGGTTG
AGTATGTCAAGCTTGAGAACTGCAGCGCACTGCGCATCCTGTCTAATGTCTGCTCTTGC
AGTAGTTTTCTAACAGAGTATAATGTAAAATATATCATTTCATCTGGTGGTTAAGCTTTCT
CCAAGATGAAACATAATTTGATATCTGTTCTTTGTGGTTCTTAATTTGGGGAAAGTGTTT
GGCTATTTCTTATTTTAACCTTATCATCGCATCTGCAGCCATAGCATAACCTTGGTGAGG
TTCATGGGAAAGAAAGTTACCGAGGCTACCTGACAATATCGTTAACTGATGAAAATATTT
GTAGAACAAAACTTTGTGTTTTCATTATCATTACTATATTAGTTGCTCTCTTTTATCTTTTT
TTCGTCCATCTTTTCTGTTTGAAGAGATTTTTCTTCTGTTTCATGAGATAATTCGAGGTG
GAACTGCTGAGTGCTATGTATTACATGCGGCTGATTATCTATTTCATTTTCTATAAAACC
TTCTCTCTATCGTGAGAAAGCGAAGGTCTCCCTCTGAAGTTATACTGCTGATATTAGAGT
TTCTTAGAACCTGACGACTAGTTCTCTTTTTCTCTTCGCAGATGAGAAGGGTGAAAGAC
AGACTCAAACAGAAGAAAATTGATCTTCATTACACGCAGGAAGCTATCAGTCTACTGGC
AAATATGGGCTTCGACCCTAACTATGGAGCTCGACCCGTTAAACGAGTGATTCAGCAGA
TGGTTGAGAACGAAGTAGCAATGGGTGTTTTAAGAGGAGATTTTTCGGAGGAAGACATG
ATTATCGTTGATGCTGATGCTTCTCCTCAGGGGAAGGACCTTCTTCCCGAGAAGAGACT
GTTGATACGAAGAATTGAAAATGGTTCCAACATGGATGCCATGGTTGCCAACGAT
SEQ 75
GTGAATGTGAAATGTTTCTTTGTTTCTTTCTTTTTTTCTTTTTCTTGTATGTCACTTTTTTTT
TTGCAAGGCTGGAACTTTGAAACTTTTTGTTTGAAAACACAATCATTCGCAGTAACAAAC
AAGAACCACCGTCCCCATCTTCACTCCCATCACTCTTCTTTTCTTTGTTTTCACACTTCAT
ATTTACTCTTCTTTCTCATCCTTTATATTTACATAGCAAAAACAACGTCAAGATTTGCAAA
AACACAGCAACCCCCCCAAAAAATGTCAAGATTTACAATGCTAGTAGTTCTTGTTCTTCT
TCTTCTATGTCTATGCCATTTATCAGTAGCAACAATAGGAAGTAGTAGTAATAAGAAGAG
TACTTACATAGTACACGTGGCAAAATCCCAAATGCCGGAGAGTTTTGAAAACCATAAAC
ACTGGTATGATTCATCACTAAAATCAGTTTCTGATTCAGCAGAAATGTTGTATGTTTACA
ACAACGTTGTACATGGTTTCTCAGCAAGACTGACTGTTCAAGAAGCAGAATCACTTGAG
AGACAAAGTGGGATTCTGTCTGTTTTGCCGGAGATGAAATATGAACTTCACACGACAAG
AACACCATCTTTTCTGGGTCTTGATCGAAGTGCTGATTTTTTCCCAGAATCAAATGCTAT
GAGTGATGTGATTGTTGGGGTTCTTGATACTGGAGTTTGGCCAGAAAGTAAGAGTTTTG
ATGATACTGGACTTGGACCTGTTCCTGATTCTTGGAAAGGAGAGTGTGAATCTGGTACC
AATTTCAGTTCTTCAAATTGCAATAGGAAATTAATTGGTGCAAGGTAAAACTTTTCTAAAA
GTTTATGCGGTTAGAGACAAGACATTTTTAAGTTAGTTAATTATATTATATCTCAAATTGT
GGTCGCGAGGATTCATATTGCTTACTTCAACTTTTTTGGGACTGGGACGTAGCAGTTGT
TATTATATGTTAAACTCGTCCTCTCGCATGTTGGTCTGATTAATTTTATGATTTCTCTAGT
TGGCAGTGAAATTTGAATCTGGGATTTTTTGCTTGGTTTGATACCATGTTGAGTTGTCTG
ATTAGTTGCATAACTAAGTGGTAACTGGTAAAGCTGCTCCCATATGATCGGAAGGTCAC
GGGTTCGAATCGTGAAACCAGCCTCTTGCTGAAATGCACGGTAAGGCTGTATACAATAA
ACCTTTTGTGGTCCGGTCCTTACCCGGATACTGCTATGGTATAGCGGGAGCTTAGTGCA
CGGGGCTCCCTTTTGGCATAGCTAAGTGTGTTGAAGAGGGTTTAGTTAAATCCCATTCA
TCAGAGTTGTACTGTACAAATAAGCTAAAAATGAATTATTTTTGTGTATGTAAATTGGTGT
ATCCCATTGATAATACAGGTTTGTACTTTTTTGAATTTCCTTGTTAGAATTATTTTAAAAAA
AAATAAAAAATATCATGGCTCTGCCACTGTTGTGCTCAACTTATCTAAAAGCTAAAACTA
TTAGAGATAAGATATACTTTTAATTACTTAATCATATTATGTCTGTTGATAGGTACTTCTC
GAAAGGTTATGAGACCACTTTGGGTCCAGTTGATGTATCCAAAGAGTCGAAATCTGCGA
GGGACGATGACGGACATGGAACACACACTGCTACTACTGCAGCTGGTTCAATTGTTCA
GGGCGCTAGTCTCTTTGGTTATGCTTCTGGAACTGCTCGTGGAATGGCAACACGCGCT
AGAGTTGCTGTGTACAAAGTTTGCTGGATTGGTGGTTGTTTTAGTTCTGATATATTAGCA
GCTATGGACAAAGCAATTGATGATAATGTGAATGTGCTTTCTTTGTCACTTGGTGGTGG
CAATTCAGATTATTATAGAGATAGCGTCGCAATTGGAGCATTTGCTGCTATGGAGAAAG
GGATTCTAGTCTCTTGCTCTGCAGGTAATTATGCTAGTCGGAAAATATGAAGAACTTCTA
GTACTTCTTAATTATTACATTTTATTTTATACTAGACCAGACTAGTTTAAAACTGAGCGAC
ATTAACAATGAAGATTCATTCATATTGCCGATTCTAACTTGCTTGGGATTGAGACGTAAT
TGTTGTTGTTGCTCTGCAGGTAACGCTGGTCCTGGTCCCTATAGTTTGTCCAATGTAGC
GCCGTGGATAACTACTGTGGGTGCAGGAACATTGGACCGTGATTTTCCTGCATATGTAA
GCCTTGGCAATGGTAAGAATTTCTCTGGTGTTTCACTTTACAAAGGGGATTTGTCGCTG
AGTAAAATGCTTCCGTTTGTGTACGCTGGTAATGCTAGTAATACTACAAATGGAAATCTT
TGCATGACGGGTACCTTGATTCCTGAGAAGGTTAAAGGGAAAATTGTTCTATGTGACCG
CGGGATAAATCCCAGGGTCCAAAAAGGTTCTGTGGTAAAAGAAGCTGGTGGGGTCGGT
ATGGTTTTGGCTAACACTGCCGCCAACGGGGATGAGCTGGTGGCTGATGCCCATTTGC
TTCCAGCAACGACAGTTGGTCAGACGACAGGGGAAGCAATCAAGAAATACTTAACCTC
GGATCCTAATCCAACCGCTACAATTCTTTTCGAGGGAACTAAGGTGGGGATCAAACCAT
CACCAGTGGTTGCTGCATTTAGCTCCAGAGGACCAAACTCAATCACGCAGGAAATTCTC
AAACCGGACATCATAGCACCAGGTGTTAACATTCTCGCAGGGTGGACAGGTGGTGTTG
GACCAACAGGGTTGGCCGAGGACACGAGACGTGTCGGGTTCAACATTATCTCGGGCA
CGTCTATGTCTTGCCCGCACGTGAGTGGTTTGGCTGCTTTGCTTAAAGGAGCGCACCC
CGATTGGAGTCCAGCGGCTATTCGCTCGGCTCTTATGACCACGGCTTATACAGTGTACA
AGAACGGCGGTGCACTCCAAGATGTCTCGACGGGAAAGCCATCCACACCATTTGATCA
TGGTGCAGGACATGTAGACCCTGTTGCAGCACTAAACCCCGGACTTGTTTACGACTTGA
GGGCTGATGATTATCTGAATTTCCTCTGTGCCTTGAACTACACATCAATCCAGATTAATA
GCATTGCTAGAAGAAACTACAACTGTGAAACAAGTAAGAAATACAGTGTCACTGATTTG
AATTACCCTTCATTTGCTGTTGTTTTTCTAGAACAAATGACTGCAGGCAGTGGAAGCAGT
TCTAGCTCCGTTAAATATACACGAACGCTTACTAATGTTGGACCAGCAGGAACATACAA
AGTTAGTACTGTTTTTTCATCAAGCAACTCAGTAAAAGTCTCGGTTGAGCCTGAAACATT
GGTTTTTACTCGTGTGAACGAGCAGAAGTCATATACTGTGACTTTCACTGCTCCTTCAAC
TCCATCAACTACGAATGTGTTTGGTAGAATCGAGTGGTCAGATGGCAAGCATGTAGTTG
GTAGTCCAGTGGCCATTAGTTGGATA
SEQ 76
ATGTTGAAGGCTCTTACATCCTCATGTCTGCAGAATCGTTTCCACGCCGTCACAACGGC
ATTTACCCCTCAAGTTCGCCGTGGCACTGACTCGAATACGCCCTTGCTTCGGGTTTTAG
GTTCGCTAAGAAGTTCGAATCGCAGGGTCCCTTATTTGTCTCGACGATTCTTTTGTTCG
GATTCTACTGATGGGTCCGAATCGAATTCCGAGGCTGCTGCATCCGAAGCCAAGCCGG
CCGAGGAAGGTGGAGATGCTGATTCTAAGGCTTCGGCTGCTATGGTTCCCACTGTTTTT
AAGCCTGAAGATTGCCTTACGGTTAGTTCAAAATAATTCTTTGCACCCGCACCGATAGA
TTTAGACGTGTCTTTAAAATAAATTGTATGACTTTTGTTAACTAATGTACATTCTCAGTTC
AATTTATCACTTCATCATTATTAACTAACATAATTTGGTGCATAATTATGTATTTTCCTGCT
CCATCATTATATAAGTACATTTTATGCTAATATTTGATAACTGCTAAATGACTCCTTAAGA
AAAGATGTTAACTTTTTGTTATAACGGTGTGGCCATGTCTGCTTGTGCACCAAAAAAGAA
GATATTAAGTGAGTTCCTTCTGTTGTTAGTGCTGGTTTTATGTCTTCTTTGGGTAGTTTT
GATATGATTTTATTCATTATTATTCAAATTTGTACATATGGATGCAAACATGACGCAGAAT
TGGGAACAATTTGATATAGAAATTATTTTTAACTTTGGTTGCAACACTCTGTTAAGATTTC
GGCAATCGAAAGGGATGGTTGTTTAAGAAGTTAACTTTCTAGGATATAAAAAAGGTGGA
TCACCAAGATTATATTTCTCGTAGATTCGTACACTCAATCATTCTTTACTAAAAGACCTTC
CCGACCTTGACTTGATATAATGCTGTGAAGGTCAACATTTTATTTTTCCAAAGAGAGGCT
TTTGATACCCTCTTTGTCTCTAAAAGAATGGTTTTATTATTAGATGCTTCGATCCTCATAT
TTTGAACCCAGCCATTCCGTTTGAGATGAAAGAATAGTATTGTAGACTTTTTTAAAATCA
GGCCGTAGTATTAAAGAATCTCGATATCAATCTTCTTAGGATCCTCAGACACCTCACATC
GATGACCAACACTTGTAATGTTATTGTAGCTTCAATTAGCATTAGTATTTTAATGAAATGA
GAATTGACAAATTTTTATAATCTAGTCGTCTTTTTAAAGATTTCTTGTCCTGAATCTTCTTA
GGATCATCAGAGAGCCTCACATCAGTGAGCAACTCTTTTAATGCTATTGTAGCAATGTA
GCTTCAATTAGCATTACCCTTTCAATGAAAAAATGTGGTATTTCAGTACCCCTCCTTCAA
CCACAATGTTATTGATTTTTGCTTCCTTGATATCTCCTCCACTATTTATTCTGCTTTTATAT
GGCTTTTTGGTACTATCCCTTCTTGTCTATATTTTCATTAATGTGGTGCTTATGCTTTCCT
GAGCCGAGGGTCTATTGGAAACAACCTCTCTTTCATCACAAGGTAGGGGTAAGGTCTG
CGTACACACTACCCTCCCCAGACTCCACGGGGTGGGATAAGACTGGGTATGTTGTTGT
TGTTGATACTTCCTCCACTAAGGCACAATCCGCCAACTCCTTAATCAAGTCTTCCATTAG
GGGTAATTCATTTGCAATCATCTCTTTTGCTTCACCTAATGATTTCTAAAGGTTTCGATTT
TTCCAACAATTCTTTCATCTCCATCTCCACTCAAGCATATACCTCTCCATTTTCATCATCT
ATGATCAGCTCTAAGTAATTTCAAGTTTTTTCCCAATCACACAGCTTGGATTTTCGGCAA
TGCACTTGTTCTTGTTCAAATGTTAATGTGTGTTATCTTAGTAGTGTTAAATTAGTCTCAT
TGGAAAGATATCATAAAATTTATGTATTTCTCACCTACACATAGATGTGTCTTTTCTATTT
GATTGTTAGATTTTCTCAGATTGTATCACATACCCTTGGTGATCTTAATAAGGGGCAGCC
TATGCTACCCGATTGGTCAACTTTTTCACTATTCTTTCCGCTTTTGGTGATCTTTAACCTC
ATCCCTATTTCCGTAGACCGTTTGACAGTATCATGCCTCTTTCTCAGTCATCTTTTTAGC
TATAAAATTCCTCCTTTTTGACCATTCTTCCAGTGAGTTTCTGCCACCATTCCAACCATG
TCCCATTTACAGTGTAAAATTACTATTTTGGCCATTCTTCCAATGAGTTTTCTACCATCTT
TCAGGCCAGTATTATGATTATTGTTCCAGATACTATTTTGGACGCTATTCCATTGCTACT
CCAGTCACTGTTTAGCCGCCAATTTGAGCCACATTAGCTGCCACCTTACGCCACACCAA
TGATGGTCTTGATATTCGTGGTATCGCGTTCTCTCTGGTCATTGAGAATCGGTTCCCTTT
TTTTTTTGATAAGGTAAATTGTATTAATCAAAAGGGAGAAAAAAAAAACTCCCGCATACA
AGAAGTATACAAAAAGTAGAGAATTTACATCAGAACATGATTCTCTACAAATGACGCCCA
ATCTTCTACGCAAGTAGGGGCTACATGTGTGCACCATTAAGCGATCAAAGATAAAAGAC
TATTCTTAAGATGTGAAAACGGAGACTCAATCCCCTCAAAAACTCTCTTATTTCTCTCTC
CCCAGACTAACCACATGAAAGCTAACGGGACGAACTTCCACGCCTTTTGCTTGCTTCTT
CTACGGAAATTGGCCCAGCTATATAGCATTTCCTTTACAGTGTTTGGCATCACCCATTGA
ACACCAAAAAGATTCAGGATTGCCCTCCATAAACCCCAGGACACAAGGCAACGCATCAT
AAGATGATCAACTTCTTCCCCTGAGCTTTTACACATGTAGCACCAACTAACATGTGTAAT
TCCTCTCTTTCACAGATTTTCAGATGTCAAGATCACTTCCCTCGCTGCTAGCCATGCAAA
GAAGCACACCTTCGTGGGCGCCTTAGGAATCCAGATCGATGAGTATGGGAAAGTAGCT
TCCTGTCTCNTGCCGCATCCCAGAGAAAATCCCTTTGAAGCCGCTCCAGTTTTTCTGTG
ATGCTCACAGGTGCTTGCAACAGGGATAAGTAATAGGTAGGGATACTTGACAAAGTGCT
TTTAATAAGCACTTCCTTACCGCCTTTTGACAAATACCGTTTCTGCCAGCCTGCTAATCG
TTTTTCAACCCTTTCAATGACTGGATTCCAAACNGAATTTGTTGGACCATCAATTCCTTC
TGGTTTGAAACTCTGAAGACGTGGGGGAAAGAGACCCTCAGAGTAGCCTCTCCACACC
ACCTGTGATTCCAAAAGCTGATTCTCCTTCCATCACCCACCTTGTAAGTGATGTTGCCAT
AGAAAGCTTCCCAGTTCTTCATGATGTTCCTCCACATGCCACACCCGAACGGTGTTGTG
ATTGCCTTGGTCCTCCAACCCCCTCCCGTGGAATCATACTTTTCTGCTATGACCTCCCT
CCATAGAGCATGCTCTTCTACCCCGAATCTCCACAGCCACTTCCCCAGCAAAGCTCTGT
TGAATACCCTGAGATCTTTTACTCCAAGTCCACCCCACTTCTTTGGGGAAGTGACTGTC
TGCCAATTCACTAGATGAAACTTTCTAGTTCCATCTGCCGCATCCCAGAGAAAATCCCTT
TGAAGTCGCTCCAGTTTTTCTGTGATGCTCACAGGTGCTTGCAACAGGGATAAGTAATA
GGTAGGGATACTTGACAAAGTGCTTTTAATAAGCACTTCCTTACCGTCTTTTGACAAATA
CCGTTTCTGCCAGCCTGCTAATCGTTTTTCAACCCTTTCAATGACTGGATTCCAAACAGT
AGTATCCTTTTGCAAAGCACCCAATGGTAGACCCAGGTAGGTAGTGGGGAGAGAGCCC
ATCTTGCATCCGAGAACATGAGACAAAGCATCAATGTTAGCAACCTCATCCACCGGGAA
AATCTCACACTTGCTGAGGTTGATTTTGAGTCCTGATACTATCTGAAATCACTGCAGTAG
CTGCTTCAGGCAGGTCAACTGATCCATATCGGCATCACAGAAAACTAGGGTGTCATCC
GCAAAAAGCAAATGAGAGACCCTTCGGGCACTGAGCACCTCGATCGGAGCTGAGAAAC
CTCTCAAGAAGCCTCCACTCGCTGCACGATCCATCATTTTACTCAGAGCATCCATCACT
AAAATGAATAGCATGGGGATAAGGGGTCACCTTTCCTAAGCCCCCTGGAGCTGCCAAA
GAAACCACACGAGCTACCATTAACCAGGACAGAGAATCTGACTGATGAAATGCAAAACT
TGATCCATCCCCTCCATCTTTCCCCAAACCCCATCCGTTTCATAATGAAGTCCAGGAAC
TCCCAATTGACATGATCAAAGCCTTCTCAAGGTCCAACTTGCACAGTAATCCGGATTCT
CTATTTTTCCTTTTGGAGTCTACAAGTTCATTTGCCACCAGAGCAGCATCCAGGATCTGC
CTACCTTCCACAAACGCATTCTGGGAGGACGAAACAGACACGTCAAGAACCTTCTTTAG
TCTGTTAGAGAGCACTTTAGAAATAATCTTGTAAATGCTCCCCACTAGACTGATAGGCCT
ATAGTCTCTGATACAAGATGCACCTTCTTTCTTAGGCACAATGGTAATAAAAGAAGCATT
GATGCTTCTCTCGAAAGCACCATTCACGTGGAAGTATTCGATGGCTTCCATCAACTNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGCATC
GTGCACCTCCTCCTCCTCGAAAGCTCTTTCTAGCCACTCATTGTCTCCCTCCCCTATGT
GGTTGAACTCTATTCCCCCCAAAGTAGGCCTCCAACTAACTTCCTCTTTATATAAGTTCT
CATAAAACCCCACTATCGTCCCGTTTACCTCCTCCTCACCCTCGATCCTCACCCCATCT
ACAACTAAGGACTCTATGAAGTTTCTCCTTCGATTTGCAATCGCAACCCTGTGAAAAAAC
TTAGTATTCGAATCCCCCTCCTTTAACCAGAGCGCCCTTGATTTTTGTCGCCAACTAGTC
TCCTGAGCTATTGCTAGCTCGACTATTTCACTCTTAACCTCCCTCAATCTCACTTTCTCA
GATTCATCTAGCTCTCTCGCCCCTTCCCCGCTCTCTAATTCCCCCAATTCATGCATAAG
CTCCCTCATTTTGACCTTCACCCTCCCGAAGACCTCCTTGTTCCACCTAATAATATCTCC
CTTGAGCAATTTGAGTTTCTTCAACAGACGGAATGAGGGTGACCCTGATACCCCGTAGT
TGGTCCACCATTCTTTCACTTTGTCACAAAATCCTGGCACTTTCAACCACATGTTCTCGA
ACCGGAAGGGAGCACGGACCCCCCTCCCTCTGCATCCATCTAGCAAGATAGGCAAGTG
ATCCGAAGTCAGCCTAGGTAAAGGGATCTGCAACACATNTTGCAACACATTCGGTACCA
GCTCATCCCAGGAGGACGATATCAAAAATCTATCGAGTCTGGACCTTGAGTTTGAATCC
TCCGCCCTGGTCCAAGTAAACCTTTCCCCTGACAGAGGAAGATCAATGAAAAAGTGATC
ATTGATGAAGTCAGAAAACTCCTCCATGGCCGTCGAAATCAAATTGCACCCTAATCTCT
CCTCCGGGAATCTAATTGTGTTAAAGTCTCCCCCTAGAACCCATGGAATGTCCCATTCC
CCCATAATATTAGCAAGTTCCTCCCAGAAAGACACTTTGGACTCTTCCCCTACCGGCCC
ATACACTCCCCCAAATCTCCACTCCACCCCACTCACTCTATCTTTGACGAGAGCTGCCA
AAGAGAAAGCTCCCTTCCTAATCTCCTTTACCTCCAAACTTCTATCATCCCACATAAGTA
GAATACCCCTGGCACTTCCTGCTGCAGGGACCCAGTCATATTTCACCCAACAACCCCC
CAGACACTTCTCACGATCGCATCCGACAAAACTTCCATTTTGGTCTCTTGAAAACAAACT
AGGTTGGCACCCCACTCCTTAACTCCCACTTTGATGATGGCCCTTTTGTTTGGGTCATT
CATCCCCCTGACATTCCATGAAAGGATCTTAACTTCCATAAGCAACAATAGCCCCCTTTT
CCCCACCCCCCTAGACGAGATCCCTTTCTCCATGTCACACACTCGACCCCTTCTCTGAT
ACACCACTACATCCCCTATTTTATTTTGATTAACACCTTTACCCAACTCCCTCCCCGCGC
CTTCGTAACAAGGCTGTTGCTCAATCTCCCTCAACAGTTCTAGCACCCTATCTTCCTTCC
CTTCAAAAGAAACACCTAAGAACTTCCCAAAGTTTAATAGTTTATCCTCCATCCATAAGG
ACATGTCCACTTCTCCAATCCGTGACGAAATTGGACATGCGTCTTCTGTAATTACCTTTT
CCTTGCCCCTACCGGAGGAATGCAAGACCATAGCATTGCTAGTAGTAGGAACTCTGGG
TGTCGGGTGGATCACGTCGTTTCGTTGAGAGGCACCAACCTCAGGAATTAATGCCTCT
CTTGGAGGAGATTGTATTGTAACCTCAGCACATCTAAGCACAGTGGGAGATGAGGGGG
CGATGGAGCTGGGTCCATCAGAGGCGGGGGGCCGTGGAATAACACCGCTAGCTTTTG
CCGGTGCTGCCCCTGGAACAGTGGCTGAAGAGGAGCACAAGTCTCTAGTGTTTGGTAT
AGCACTGTTTGAGGCATCATTGGCAGAAGGAGGAGGAGTGACTTGCTCCCATATGTTTA
GGCGCCTTATCGGACGTCGCTTGATATAATTTGGGCCCTTTTGGATCGATTCTAATAGA
ATTGGGAAAAGTTGCTGGGCCCATGTGTAGAGGCCCAGACCTATGCTTGCTGAAATCTT
GCCCGACCCTATGATAGTTAAGAGAGGACGCATCACTATCCCATTCACGTCTTTTCCTC
CTGTTAAAATTTTGTCTTCTTCTCCCAAGTTCAAACCTCCCGTCACTTCTCGATCTACCG
TCTGATTCCAGCTTAGTCCACTGGTCCGGTGAGCCCTCCCCTCCTCTCAGAAACTGGTT
TCCCCTTCCCTCCCTTCTATTTGACCTTTCCTTTTCTTCCGCCAAAATCGCAGGCATAAA
GACAGGGCGAAATTCAGGGCTCAACCAAACCCTGTAGCTCAAGTAGCCATCTTCCACA
ACAGTGGAGACAGGGATTCGGCCCCCTTTCCTCACCACAATCCTTACTCGCGAGAGAT
CGTGCAAGCTGCAAGTGACATCAATGAAACCCCCACAGATTTCCCCAACCTTCTTGAAT
AGGTCAAGACACCAGAGATGCACCGGGAGACCCACCATGTTGACCACCAATGTTTCGG
CAGGGAAACGAGGGTCGAGACACCCATCGTGCTCCACCCATCTGTCTAACTTTAAGAA
ATTCCCATCAAACCACCTGTTTCCTCTTACGAGAATTTCTGAAGCTTGAGACTCGGTGA
CAAATCGAAAGAGATATTGCGTGTCACCCAGTTTCGATATTTTCAGCCCGTCCTTGACG
TTCCATGCCTCTGCTGTCCAGTTCCTAACAGTTTCTGGAGAACCAACCCATTTGGAGAA
AGACCCAATCAAACAAGAGTTCAAGAAACCTTTACGCCTGAGAGTGTGCTCCTCGGACA
GCGAGAAGTGTGGCTCCTGCCCTACGTCAAGCTTTCGCTGGCATCTTCCCCCAAGTTC
GCGGGCTGGCCAAGATGCAGCTTTGCTAAAAGACATGCTTTCTAATTTTCTAGATTCTA
GGGAGAAGTAGGAGCTTCTCTCTGTAGAGTCCGTCATAGCTAAACCAAGAGCCTCGCA
AGGAGACTGACTGAAGAAGGGAAAGTGAGAGGGAGTGAAATCTGGCCGGAAAAAGAC
ACCTCCGTCGCCGAAAAATAGCAAAAGTGGTCGGAAGTTGGAGGTTGATGGATGGGTG
AGGTCGGAATAGCCTCGCGAAGAGGCTGTCTGAAGGAAAAAAGTTGGGATCTGGCCTG
AAAAGGCGTCACGTGCCGGCGCGTGGGGTGTCAGATCCCGGCGATTCTTTCTGGGGT
AGTGTCGCGACAGGGTTGGCTCTCGGATGGTGGTGGTGGAATCCTAGGATTCATGGTT
CATGGTGGTGGTGAAGGGGTGGGATGGGGTGTCTGTGTGCTACAGTGGTTTTTTTCCG
CTTCATGCACATTGAGTTCTGGTGACTGATCTCGTTTTGAACTGTTCACTTTCTAGGTGA
CGCTTTACTATTACGACTATTGTTCACTATTCCGTGTTTTCCAGAAAAAGCAAACTCAAG
GTGTTCTTCACTTCTTCTTACCTTGAGTTCAAGGACACATAATACAATGGTAATGTTAGT
TAGTAGAATGTTAGTGTTAAGTTAGTCCAACATCAAAGATTATTGTGCTTTTCACACCTG
AAAATTAAGTGTCATGTATTATGTAATAAGTATTCATCAACTTATCTTTTCATCTGCTTCTT
CTATTGTCTCACAATTTTCAGATTAAAATGGAGTATTTTTTTTACAAAGAATCCCATTAGC
AAGAAGATTGTGATTGTGTCTTGGGCACATTGTCAGTTGAACCATCCGTATTGGTTGAG
ATTGTTTCTTTGAAATCTCCTGATATTAGAAACCCGTCGATCCTTGAGTACTTCCTGCCT
TGTTTCTCTCTGGACCATATGTAACTCCTACCTTGCAGAGGCAAATTGATCAACTCTATG
TCAAGACTGAAGCTTCAGAAAGTTCCTAATGATTTTGAATGATTTCTCACCTGCAAGAAG
CTGGTGATTCACATTGTTCAGCAAAAGATTTTTTTTTTGTTTGCTTGTCAAATTAACATTG
TAATTAGAATACTCTCCTTCCCATATGCTCTTCTCAAAGTCAGCATTCAACCAGTTATGC
CTCTTACTTCTGGCCGTACAGAAGGCACCTTGATGGATGTTAACTTTTAACACTTCATCC
TATATTTGGAGGCCTAGAGAGGGTTACTTCATGCTTTAGTTTCAATTTAATCAACAGCTT
CTAGGTTATTTAAGAAGTAAAGCATTGGAAAATACATGCGCGGTTTAAGTAATAGGATTG
CCCTAGTATCTCATTTTTCTTTGAAAAGAAGTTTGATTCATATGTGTATTTCAGTTTTTGC
TAGTTGTTGCAGGATCAGTGTAAAAATGGTATTTATCAGCCTTCCCTAAAAAAAGAGAGA
TACTGAACGAATGCAAAGAAAACACTGTTTCTTGGAGGCTTCTTATTTGCTCATAATTCA
TCTGTTTGTAAGATTTATGGGTGCGACCCATTTCTTGCCTTCTTGGTTGGTTATAGTTTT
ATCTTTGGATAAGATTTACCTGTAAGACTTGCAATTAAGGGCTTTATGTCATTTATCATG
CTAAAAATAACATAAAGTCTGTGTGGGCTGTACTTTTTTAATTCTATATAATTTATAATGA
TGCATGTTGTTTTCTTGGTGCAGGTTCTAGCACTGCCACTTCCACACAGACCGTTATTTC
CAGGGTTTTATATGCATATCTATGTGAAGGTATGGTACCTTGTCTAGAATGGGTTAAACT
ATGTCTTAGTTAGTCTTCGTGATTTTTGAAATAGAATAGATATAAGCTTTAAACATTAAGA
GTAAAATTACTGTCAACTTTCTGAGAGTTGATACTGTTTCAGCTAGTGTTTTGGAGCCTA
TTTCCTCTTAAGTCTATAAAACCAGCATTCAGCTTTAACAATGCTACTTTAATTGATGGAA
AGTTCTTGGCTGAAGTTGTAAGCTTGAAAAGCTAAAACCTTCAATTAGGAGAGCGTCTG
TTGCTCAATTGTAATATTCTTTAGCCTCATTTGGTAGTTTCCAGAAAAATATAAATTCTGC
TGTAATGGCATGCAACAAATTGATCTGAGGCTCCACATAATCCATAAATTTCACTGCAG
GATCCCAAGGTATTAGCAGCCTTGCTGGAAAGTCGAAAAAGGCAAGCACCTTATGCTG
GCGCTTTCCTTATGAAAGATGAGCAAGGGACTGATCCTAATGTTGTGTCTGCCTCAGAT
ACAGAAAAAAACATCTATGAGCTTAAAGGAAAAGACATGTTGAACCGTCTTCATGAAGTT
GGTACACTTGCTCAGGTACATCTTGTTGTTCCTTGTGTTATTCTGTTGCTTTAACTTTATT
GAAGAAGTTTCTGATCAAGGTTGATGCAACTTATGATGAGCCAGATAACAAGTATTAAA
GACGACCAGGTTATTCTTATTGGTCACAGGCGGATACGTATGGCAGAGGTGGTAAGTA
GTTGTCTGCTTTATTTTTTGGTTATAAGCAGCTCCATATGTTTTCTTTTTGTCCTATTTAG
TTCTATTTTTCATCTACATATCCATCATCTCCTGATTTTGAGACTGCCCAATTGACATCTT
GGGTTCATTGCACTTGTATATTTTTCTCTCTTTCCTACCTTTTGTGTCCCAACACCGAGT
CCTTCCCCTTTGGCTTCATATAACTGAGCAGCTGCATTGTCAAGCGATAGAGGTCTATT
GCAATTTGAGGCTAAATGAGCTGTTTTCCCCAGTTGGTTATGACGTGTTATCAATAGAAT
GAAACCCTCTTTTAGTTATTCTTGATTGCTTGTTTGCAAGAGAAAGTAGCACTGATTTTG
TAATAATTTTTGGACAACTGGTTTCCTGCAAAGTCATAGGAATGTCTTTCTTATTGCACTT
CGGGGTTGTAGAAGAGAAAAGCAATTGAGTTTCTTCTATGATGTGCTAGTTTTGCATCT
GTTTGATTGCTGATGCATTTGGTTTGAATGAAATCCTAGACTTTTGATAACCTAGAAATC
TGAGGGTAAGTGGCTCACGGTTTGAAACTCAGTGGATAATGGGCTCACCATTCTACCAT
TGTCCACCTAAATACTAGGCTTTTGTTTGCGCCACGGCACGAACTGTGATGTGCACCCA
ACCTACATATCACGGGCTGTACTCTTACCACTAGACAAAAGCCCCGGAGGCTTAAACAT
ATATTAAAACACATAAAAGTTTGAATAGAATCCTATATATTTGGGCTGTAATTTGGTATAG
AGTCGTATTCTTGGCTTTAGCTGTAAAGCTGGAGTACTCATGCATGGTCAGCCATTAAT
GGGCTTACTTAGCCACTGGAGGGTCTTATGTCTTCTTATTTTTTCGTTTTATTTTGTGATT
GAGGACCTCTTATCCTCTTCTATTGGACTTAAATCTCTCATCCCCTGAGCTCCTCAGTAA
TAATTCTCCTATTATACTATACTTACTGTTGGATGTTTACCTCTCTACCTTCACATGGTAG
GGTAAGGTCTGCGTACACAGTACCTTTCTTGGACCCCACCTGCACGATTACACTGGGTT
TGTTTGTTGTTGGATGTTTGCAGTCAGATTTCCTTATGTAGTCATTTCTTTTGTTTCAATA
AACAATTTCTTCAATATACTAGTGTAACAATTTGATGACTATGCAGTAGATTTGTCACACT
TGAATGTTCACTATCAGATTTCCTTACTTTATTCACCATGTTGTTCTTTTTCCGTCTCAAT
AACCCATTTTGTGAAATGTACAAGAGTAACACCTTGAGGACTGAAACAGTGACATTCATT
TGCAGGTCAGTGAGGAACCCCTTACAGTGAAAGTTGATCATCTCAAGGCATGCCTCAGT
TTCTTGTGCATTCTTTCTACCCAGCAATGATAAACTCATAAAGCTTGATGTTCAGTTTTGA
AATTGTTTCTAAAATTTGCTTTCTTAATTATGTTGCAGGAACAGCCGTACAACAAGGATG
ACGATGTTATAAAGGCGACATCTTTTGAAGTTCTATCAACCCTAAGGGATGTTTTGAAGA
CAAGTTCTCTCTGGAAGGATCACGTTCAAACTTATATCCAGGTGTTAGTCATTTCTTTCT
AAATGTTAAGTCCTAATTGTTTGATTTGGTGATAACTCCAAAAAAAAAAAATTTACTCTCT
CAAAATGCATTGCCTTTTGATATTCTAGCCCACACTATTGTGTGAAGCTCCACAATGCCG
GTGGATTGTGTTAAACTTAGATCATGCCCTAAGTTAGGCATTTCATGCCATTCTTTTTAA
TGGAAAATTATGTTCGTGTTGGTGGTAATCAAATCCATGACTAGATTCTCTGCCTTATAG
TTGATGCCCTGCTGAAAGAGGAGACTAATTAAAGATGTCAAGACTGCTAGGAAATGGAA
GGTCATAAATGTTCTCTGAACCAACTGTATACATATTGTTATACATATTTCTAAAGAGAC
GCTAGGAAGTTTATACAACTGAACAACTGTATAAGATGTATTTGGAGCTAGGAAGTTTAT
ACAATACCTAATTCTCTCTCTGCTGAATTTCTAAAAAGATGCTACTGATTGTTATGAATAG
TGTCAAAATCAAAAGATAATATTGTGGCATGTTTTGCTGAAGAAGTGGGAGTGGAGGGA
TAAAGGGATAGGGACTTACTGAATACTTCTAAGGGAAGCATTTGTGTCCCTCCAAGTTA
TTGGTACACTAGTAATTGGAGTAAGAGACATTACTCCTTTCGACCAATTTCAGAGGGT
TATCAAGTCTTCAAAGCGTGGCGATATTTGTGGGGGTTGGCTGGAAATTGAAAAAAAGA
CAACTCTAAGAAATCATCCTAGATGGGCTTCGATCAAGGAGAGGAGGCAGATGGAGAA
GATTTCGAAAACCCTAGAAGTGGAGAAGAAGGGTTTTCTTTATAAATTACATGTTTGGAG
TGTAGCGTTGGTAACAATTATCTCCAATCGACGAAATGGAGAAGAAGATGGACCTGCTA
GGTACTATGATGAAGTTTGAGATAGGCCTGCTACCGTAGAAAAGATCGGTCATGTGGG
GGTTGGGAACAATATTTCAAATTCTAAAAAAGTGTGGACAGATGTGGGAGTGGCACGTG
GAAAGGTCTTAAGGTAGAATGGGTCAGTAGTAAAGGAAAGGGCTCCAGAATTGGGCCA
CTTATGTCATGATGTTTGGAAGCTTAACATAACTTTTGGGTCATAAAAAATAATAGATATA
GGCTCAGATGAGTCAACCTTTATTTTCAATAAGGGGCTTGGTCTTTTAGCCATGAACCTT
TAGGGTCTCAGTTTTGATACTCTTTTAGTGGAGATTAATGTCATTAATTTGTCTGAAACAT
GATACATTTTAGACTCCTTTCACAACACCATGATCGGCAAACCGATGAAAGAGCAGGAA
AGCCCTAGTCTGCCACAAATTTCACAGAACTCACATGCAGCTGATTACACAAAAATTGT
CATTCAAGATTAAGAAAACTCTTTCTTCGTACAATCGAAAGAGGCTGACCAAAATCAGAT
AGTTTTTTGAATTTGAAAGAACTAAACCATGGCTTATCGAAGAGGCTACTCAAGCTTTCT
TATGGCTGCATAATAATGTTACAAAGATGAGCAAACAATATGGATTGAATTTTTGTGAAT
GTGAGATTGAGGGTCTAGCTCTGTTCACGAAGCTTGACAGGAGAAGGCATAAGAGAAA
TGAAGCTACCATGTCCAGATTCACAATTCCAAAAGTGATAGGTATAAAGGAGCTCCAAA
AACTGTTTTTTAATGTGAATTATGGGGAGCTCGGATCAATGATGGGAAGGGGGATCACA
AACACTAGGTACCCATGAAGCTGAATATTCTCACTTGGAATATTAGGGGGTTGAATGAC
AGGGAAAAGATAAAGGTGATAAAAAGTTAATCCATAAATTGAAGGCAAATATTTATTGCT
TTCAAAGACGAACTTAGAAGGGGGTGTGGAAATACTAGTTAAACAAATGTGGTCAGACC
CATCTCAAGTGTTGTTTGGAGTCCAACGACAGGAAAGGGAAAATATTGGTGATGTGGGA
AAAGAATGTTTGGACAAGAGAAACTATCAACAAGGGATGTATACTATCACTCGCAAAATT
TCCTCATTATCACAGAATTTCTCCTGGCACCTCACAGGGGTGTATGAATCACACTGCAA
GTTGGAGAAACAAGAACGCTGGTGGGAGATAGTAGCATCTAAGGCATTTGTGCAGGGC
CTTGGGTGGTGTATAGAGATTTTAACACTAATAGATTCATAGCAAAAAGAAAGAACAACA
ATAAACTCACTAGGGCTATGATGGACTTCTCTAATTTTATAGATCATTAGAAGCTTGTAG
ATTCTAATCTTAATGGGGCTCCTTTTACTTGGACAAAGGGTAATAATCAGGAAAACTCTT
CAAGATTGGATAGATTTTTTTCCCGGCTAAATGGGCTGAGGAATTAAAGAACAAAAGGC
AAGCAGTACTCCCCAGTGTATTTTCTGATTGTACTCCCGTTTCTTTTCAATGCGGAGATG
GGGAAGTTTAAAATCTTACTTCAAGTTTGAAAGCTGGTGGTTGGGTGTTGAGAGATTCA
ATGAAATGGTGAAAAGCTGTGGAACTCTTTTGAAGTACAGGGTAGACTAGACTCCATTC
TTTCAAGCTAACTGAAGTTGTTGAATACAAAGCAGTGGGATTGGGAATCTTCAGACTAT
GACAGTGTTGTGAAGAGCCTGAAGCGAGGAAAAGCGATAAGCCCCTTTTCGCTTAAAG
CGAGAAGCGAGAAGCGAAGCGCTCGCTTTTTTGAAGTGAAGCGGTTTAAAAAGATATTA
AAATAAATAATGCATAGACAACACATGTAACTGTAAGCAAATGTTCAATACTTCAATGTA
AAAACTAAAGAGTAGCATCAATTAAAGCACAAAATGAGCATCATATTCTTCTTCAAGATT
GTCAAATTCTTGTATTCCACTATCATTATTATATTGCTCGTCATCTTCTTCAACTTCTTCTT
CTACATCAACTAGAAATGAAGTAACTGCTTCTTTTCCCTTCCTCTGTGAGCTTGAAATTG
AGGTACTCCCCCTCAAACCATAAATTTTCTCCCCAATTCCACGCGCCTCCGCAACATCA
CCCCAAGTGAAATCAGAAGTTTCCTCAAATACTTCTTCATTTGCATGATCTTCCGGGACT
CCAATTAGCCATTCATTAGCATCATCGATGTTGTCCAAACTAATTGGATCAATTACATTG
CGAGCATTGTAACGACGCCTCATTGTTCTATTGTACTTAATGAAGACTAGATCATTGAGA
CGCTTCAAGGTTAGTTTGTTCCTCTTTTTGGTATGAATTTGCAAATAATAAGTAATTAGTA
AGATTGCATGCGCATAACTGTCTGTCATCATTCAACATTCTAACTTCTTGAAGTTGCAGC
ATATTGGTGATTTCAATTATGCAAGGTTAGCAGATTTTGGAGCAGCAATATCTGGAGCC
AACAAGCTACAATGCCAGCAAGTGCTTGAAGAGCTAGATGTAAGTCCGTGGTTCAAGAA
GTTAGATATTCCCTCTGTTTCAATTTAGATGACACACTTTCCTTCTTAATCCGTTCCAAAA
AGAATGACACATTTCTACAATTGAAAATAATTCAACTTTAAACTTTTCATTTTACCCATTTA
CCCTTAGTGAGAAGTTTTTATAACCACACAAATGTTATGCCCCCACAAAGCTTTTACCCC
TTAAGCTTTTAAGTCCACAAGTTTCAGAAGTCTTTTTTCCTCTTAAACTTCCTGCCAAGTC
AAACTACCTCATCTAAATTTAAACGGAGGGAGTACATAATATTTTCTGTATGTGTGCCTT
TTTCAATCCAAACCTCAAAGAGAGAGGTTTACCAGTATTGTGAGTTCTAACCACTCTGGA
AGAGAGCGAGGTTTTATTACTAGGAGGTAGATCTAAGAATTTTTCCTTGCCATTTGCTTC
CTATTGTAACAAAAAAAAAATATATGTTTGGTCAGTGCTTCCCTGCTCTGAAAAGGAAAA
TGGTTGAAAGTTAAAAAAAAGGAGGATCAAAGCGTATTAGCGGTAAAGCTTCATTTCTT
GCTGATGGGTAAAATGGCTTCTCATTGGTACTCTTGCTCGAAAAATTAGGAATCTAAAAA
AGTTATCAGTTTGTTCACCACTTCATAAAAAGAAATTGTCAGTTTGTTCAAGTGGAGCTA
CTTATTTTGGGAAAATTAGAAAGGGAAATATTCAGTTTAGAACAGCTAGCAATATTTTGT
ATTTCCTCGTAAGAATCATAAACTTGCATTACAATGGTTATTTCATGATTTTCATCGAACT
GAGCCTTTTGTCTGCTCAGGGATCAGAATATAGTGTGTGCATGAGATAAAATGAATGTT
TCCCGCTTGTTTCATAGCTAACACCATAATACCTGCATCAGGTGCATAAGCGGCTACAG
CTTACCCTGGAGCTAGTGAAGAAAGAAATGGAGATTAGTAAGATTCAGGTAAATGCACA
TCAAGACGCATACCTGAACTTTAAATAGGTGCTATGCATGCATTTAGCATTTTACGTCTT
TTCTGTTGTCTGCAGGAATCAATAGCAAGAGCAATTGAAGAAAAAATAAGTGGAGAGCA
ACGCCGTTATTTGTTGAATGAACAATTAAAGGCCATAAAGAAGGCATGTGGATTATGTG
CAGCTTTTTTTGTGTTATCATCCTTAAACTTAGTACTTACATATGTTTATTCCTAGATAATA
TATATGTTAAGCTTCTGATTCTTATGTTTGATTCACAGTTTAAACAAAGAGTTGATTTGAA
AAAAAAGTGCTTCCTTATGACGTGATTTTGATTGGCTTAATGCAGGAACTAGGTTTGGA
GACTGATGACAAGACAGCTCTTTCTGGTTCGTTGTCTCTTAATTACTACTGAAATGAATA
ATGTTCTTTTTGGATTTATTACGGCCAGACGTGTTTTCCCATCTGGCCAATGAAACATCT
TATGTTGGCCCTGGAAAAATTCCATGTAGCAGTAGCATAGAAAAGGCCATTTGCAATAC
GTTTTGCCCTTTCTGTTGTAACAGTTTTATGGTGGCTGATTGATCTCCTGTTTACCATTG
AAAATCTTCACTAGAAGCATGAATAGCATTCAGGCGATAATTGGCTTTACAGATAGGAAA
TTGAAGGTTGAATTTTCTTTGTACACGGTCCAACCAACTTTATTGACATGCTTTCTAAAG
CTATTTGAGACATACCAAACTTTTGACAGATAAGAATACATTGTACCTATAGGGGGCTCT
TTGTGTTTTCACTAAAGCACTTAATAAGACAATACTGGAATCCTTCAACTTCAATTTCCGA
AAGGTTACTGTCTACCATATTTGGTAATGTGGCACTAAATCAACATGGATTTACTGGAAG
GCCACATTGTTAGCACTGCATTGTTGTTTATTCAAGTTATGGACTGCTCATGATTCAGAT
TTATCTTCCTCCTCCAATGTTTCTTTTTCTCTTGAACTCTGTCCTGTTTCCCTTTTCAATG
CACTTCCTTATTAGAAGCCTACTTGCTTTTTCTGTCATGTACTCTTTCTTCTCCCCTTTGC
TCCTGTTGAGGAGTGTAATGCTTATGAAGGTGCTTCTTCCAGGTTTTGGTGATAATAGA
CATTACTGCAAACCCTAAAACGCGGTTCTCATTGTTTTTTCTCTATAATTTTCACAAAAAT
AGAGTTAAGCTGCGTGTTGTAACATTGTTTTGGTAGACCTGGTTTTGGTGTTGAAATTGA
GAAGAAACTGCCTAATTATTAGGTTTGCTTTAAGGGCTTCGATTTTTCTGGTCAATACTA
ACGAAAACCTGCAGTTTTTTGTTCCCTTTTTTATTACATCTAATGGTGTGACACTATATTT
GTTTATTACTGCAGCAAAGTTCAGGGAAAGATTGGAGCCTAATAAAGAAAAAATACCAG
TACATGTTATGCAAGTTATTGAAGAAGAACTGACAAAACTGCAACTGTTGGAAGCTAGTT
CCAGTGAATTTAACGTAACACGTAATTATCTTGATTGGTTGACTGCCTTGCCATGGGGT
AATTACAGGTTTGTTGTCTATCGATTCTGCCTTACATTGTCTTGGGTTCAACCCAACTGA
TGTTATCCTTATCCTTGGCTAGCTGTACTAGAGGAATCTGTTTGAGAAGCTGGCTAAAC
AGTCCAGCGAGAAATAAAAATGTTATTCTCTGAAATTTGCTGCTTCCAAGTTAACCTTAC
TGCCTAGTGATGTGACTTGCCTAAATATCTATCGAGTAATATCCATTTGTCTTTAACTTTT
CTTTCTCCTCCAGTTCCTTATTTTGGGTTCTTACATGTCATGCTTCTGGCTTTGAGGATG
CTTACTTGACATCCCAATGTATGAGTTTAGACCAGGATCTCATGAGAGCAGCAAAACTA
GGATTGTACTTATGATGAGCTCCTTAAGATGGGGGCTTGATTTGCCGTAGTTCGTGTTG
TTTGCTGCTGATGGTGGTGGTGTTGGCTTTATAGTTTTGTTCTCTGCCATGGGTGTATT
GCATTGGTTCCTGAAGTTTTCTTTTTATGATCCAAATGCAGTGATGAAAACTTTGATGTA
CTACGGGCAGAACAAATTCTTGATGAAGACCACTATGGGTTAACCGATGTTAAGGAAAG
GATCTTGGAATTTATAGCTGTGGGAAAACTCAGAGGAACCTCGCAAGGTTGGTAAATGC
CTTTTTTTAAAAATAATAACCCTCATTTTTATTAAAAAAAATCCTATTTTATAAGGTTCAGC
CATAATCATATTAAAAGAACGGAAAATGATCCAGCCATCTCCTTACTGTCCATTGTCATA
ACATTATAATGGACCAATGGAAAATATATCCATAGAACATGAGATTTATGGTTCCCAAAT
ACTTTATTGACATCAAATTGAAACGAGTAAACGGAAAGAAGTGAACATTTTAGGGAATTT
GAGAAATATTTATTGGTCAAACTTAGGTAATACTTTTTGTGTCAGTCTAGAGTTCCTCCA
ATGTTTTCTTGTGATTATCTGTGGAGTAAAGAAATATATCTTGAGCTTAATTTCTTCCCTT
GAAAAGCAACTAATGTGAATTAAACTGCTGCACCTTGGGCCATAGTTTGTTGGTGTTCTT
CTTACATTCTGATTTTGTGCTGTCCATGATTGGGCACTCGCTGTGTGGTATTCGATTGAT
AACTTACTTTCACCACCAGTTGTACTTGTATATCTTTGGGACATTGAACTTGAGATGTAG
TTGTTTGTTGAGGATAATCTTTGGAAACTATGAAGTGTTGAGAAAAAAACAGGTTGAATG
AAAGTTAACAATATAATCCAAAGACAAAGGTTAACTCTAGAAAAATGTGAATTGCATCAT
AGCTAGGACAATATTAGGTTCAAATGATAATATTAATCCCAATTATACACTGGCACTGTC
TTTCATATGTGCGGCTTCACCTGATTTATCTCAGTTTAATTTTGAATCTGAGTCAGGAAC
TAGAAACAGACTCATTGCTTATTTTTGTTTGAACAGGGAAAATCATATGCCTCTCTGGCC
CTCCTGGGGTGGGCAAAACCAGTATAGGTCGTTCAATTGCACGTGCATTGAACCGCAA
ATTTTACCGATTTTCTGTTGGAGGGCTGTCTGATGTTGCTGAAATAAAGGTAATGGGAAT
ATCTGGCCAGCTAAAACAGAGTTGTTTTGTGGCGCACAGAATCTTGAACTTTCATGACT
AACTTTGGGATACACTTCAAGGGACATCGACGAACTTATATCGGTGCCATGCCGGGGA
AGATGGTGCAATGTTTAAAAAGTGTGGGAACCGCTAATCCTCTTGTTTTGATAGACGAA
ATTGACAAGGTATTTTATGGTTTGTGAGTTCATGCTTCAATTGTATGGCTTTGACTATGA
GAGGAAGTCTAACTTCTTTTTCACCATTTAATCTCGTTTTTCTGTATATGACCACTGGAA
GAATCTTGAGCCTGAACATATTATGTTTTTGCTTGGATTTCCTCTGCAATCTAAATGTTTG
AAGAAAGTGTTTATCGATCAGTTTAATAATAGCCTTGTATTTTTTCTTCTATGGCAGTTGG
GAAGAGGACATGCTGGTGATCCAGCAAGTGCTATGTTGGAGCTTCTTGATCCAGAACA
GAATGCAAATTTCTTGGATCATTATCTTGATGTTCCTATTGACTTATCAAAGGTAGTTGTT
TTCTGGAGCACTTATCAAATTATTGTGGCTGTTGATTGGCCCTTATGAAATGCCCTCTAA
CATCATTTGATGAATGGGGACTAATGTGATATTAAAAATCTTGCAAATATCTACTATCATT
TTGCTTTTTCACCTTTTGATTTCCCCCCCTTTTTCTTTACTGATTGATGTTCTCTTTCGTC
TCTTACCTTAGTTAAGTTTGGAAAAGGTTTCGTGACTGAGCCTGTTTCTTTTATTGTCCA
TAATGGAAGGTTATCTGGAAGTATTTTATTTCACGTCTGTGTTACCTTTGTCTCTGTCATT
ATGACTGTAATATGATTAGTGTAAGTAGAATGTTGTTCTCTTGATACTTGAGAAAAAACT
AGGTTTCATTGGCATTGGTTTTGGTGATCATTCAATAGGAAAAAGGTTGATGCATTGAGT
TTTCCTACTGTCTGTTCATATTATCTTTGATGGCTTTACTGATGAGGGATATGGTTTTATA
CCTCTTGGAGTTACCGATAATTGGAGCGATAGACTTTGAGTGTTGAGTGTTAGTTTACC
TATTGATTAAAGAATTTGGCCCATAATTCAAATATGACAGTTATCGATACTATTTCATTTT
ATTATATCAAACTTCAGCAAATCAACATGGCTACAGAGAGAAGGTTGATGGGTATATTC
GCATACAATTTCATTTATTTTGTCCTTTAATGTGATAATTTACTGTGTGTTCCTATGTCTTT
TCATAATGGTAAATATTTGTGTTATATTTCCAAAGGGCTTTGTCAAAAGAAGATAGCATC
TTTCGATTATTTTGGTAGTATTTTGGGTCTGACTTGGTTATGGGGAGGGAGGCTATATCA
AAGCCCTGTTATGGATGTTGAGTTATATATGAACCTGAAGAAATTATAAAATCCAACCTC
TAGGTTTTAATGTACTTCTAAAATTTGAAAAAATCGAGCTTTGGCACCTTCAGTGTTTTCT
ATCCTCTTTGCTTAATACAGTTACCTATTCCACTGTAGTTTATACGAGAATGACATCTGC
ACATTTATCAATTTTTTGAGTGCATAATTTGAGGTCGATCATTTTCCTTTTTGCGGCTGCA
GGTTTTGTTTGTCTGCACAGCCAATGTTGTAGAAATGATACCTAATCCTCTTTTGGATAG
AATGGAAGTAATTTCAATTGCTGGTTACATTACGGATGAGAAAATGCACATAGCCAGGG
ATTATTTGGAGAAAGCTACTCGTGAAACATGTGGGATCAAGCCTGAGCAGGTATGTCTT
ATAGAAACATCTCTAGTTGCATCTCTTTCATTATCTCCTGTGCATATTCATTATCGGGTG
AATAGTTTTGTATTTTTTCCGCTCCCAATTTTTGACATTCAGAAGCAGACATTGGTCTGT
CGGGAAAATGTTCGAAACATGATTGGGCAAGAATGTAACTTCTTAAAGTAAAATGTGAA
AAGCTTTATTCAGTTAGAAGTTTATGGAATAAAACATATGATTGGAACATTCAGGGGATC
AAGCTCTTTACACCCTTCCATATAGAACATGGCTAAAGGAAAAAATCGCAGTAAATCTCA
TTATAGGTTTTTTGCCATATGCTTTAATATTTCCTAGAAAATATACTGTTTGATGTATTGAT
ATCATTTTCATCTTAAGCCTCTACATTTAGAATATACTTGCCTCCAAAGCTAATGGATTAT
AAAGTTTAATACAAACTCTTCCATCACTATCCTTTTCAAGAAGTATTTAAGTGGTTCACAT
AAACAACTGAACGAGTCCACATTGCTGTATATATCCTCTCAAGTGCTGCTTTCTCTTTGA
GGAATGTTTGCTTGACCGAGCCAAACTGTGACTGCCTATGCTAGCCAATGTGCTCGTG
CATCATTAACACAACCACTGTGCCACACTTTGCTGGGAGTATAAACGCAATTGAAGGTG
GTTCTGTCGTGGTAGTCGCCAATCTCTATTAGTAAATGGAGGTATAATGAGTTTCTTAGT
GGTAGGGATACATGTGCCTGGTGGGTGGGGTTGCTGGATGGTATTGTCTAACAGATTT
GAGCGGGGCGTGAACTAAAGCAGGATGGATGTTTAAAATATGCCAAGTGTTTTGGTGTT
TTTGCTGTTGACTTCAATTTGACCATCGTGTAAAAGAGAAACTAGACTAGTTTGTTACAC
AGTAGTGAATTTTTTATTACTGTACAAGGAATCATCATACGATGCCAAGATTGTGGTGAT
GACTCCAAAAAACACTTGACAATGTCTATTGAGTGAGCTTAAGTTATAATATGTCTGCAG
TAAACATGTTCTCCCACAGCCTTAAAGATGAGATCCGTGGAATTCGGAGAACCTTATATT
TAGCTCTCTAATGTGATCTTCAGTTGAGTTACATGCCATATGGATCTCTGGTGAAGAAAA
GGGAGCTGAAGACTTACTGATTAAGGGCATCAAAATGACATTTGACCTCATCTAACTCT
GTGGTGTCATGTGGTGTCATTTACTGTGCATATATACTCTAATAATTAACGTTTTCCACTT
TTCCGGGAAAAAACACCCTAAATATGGGTGTCAGAAGTTAATAAGTTGATATGTGGGAG
GGCTGATGGCTACTTTGGTTTGTTTTGGCCCTCCATAGCGGATAGGAGCAGTTCTAATT
TTTGAAGTATCTTGTAGGCTGTTGGTACTTTTTCATGACATAGTAGTGGCGTTATGCTTA
TCTAATGGCAAATGTATGACAGGACGAGGGTAGAGCTTTGGGTACCATTTCTCGGGAAA
TGTTAACTCCCTAATGCTCCTGTAAAATTACTCGAGTTGATATGTAATCAAGTTGAAGAG
ATGCAACGTTTGAGCCCATGTATATGTCATCTCTCTGTCTCTCAGGCTTTTAACATGCTG
AGGGCCGTTTCTCAGATGCAGAGATGTGATATCAAGCCGGAAAATTAAGTGTGCATGAT
TTATGTTAGCAATTGATTGCAATTCAAAGCCTTGAATGGTTGATGTTTGGTTTAAATATTT
CAACACGTATAACCATCCTGGCAGTTAGTATATGTGGTTATAGGAAGACTTTCAATTGAT
GCCTTGATGTTGTTTTTCTAAATATTTCACAAGTTACACTGAAGCGTTACTTGCAATAAG
CTAATTTTATTGGCTTCATGAATTCTTTATTTCAGGTTGAAGTGACCAATTCAGCTCTTCT
TGCTTTAATAGAAAATTACTGCAGAGAAGCTGGTGTACGCAATCTGCAGAAGCAGATTG
AAAAGATTTATCGCAAGGTTCTGTGACCTGTCCTCTTTTGTATAATACATCTATGATGAA
ATTGTCTCGAGTCTCGTACCTAGTCAACTTTTGATGAAAACTGATGGTTATTTCATGAAA
TCCAATACGAACACTTTGAATCGTTTTCATGCTCTGATACTTTGCACATCCTGTTCTTGC
TTTTCAGATAGCTCTAAAGCTTGTCAGGGAAGATGGAGAGATTGAGCCTCAGAATGCAG
AGGTAGGTGAGGTAGAAGCAGAATCTATCCATCTATCAGACGAAATCAAGTCTAAGGAA
GAAATTCAAGCTGGAGCTGAGTCCGCAAACGGTAGCAATGATGACAAGGCCTCTGAAA
ATAATGCTGAAGCTGAAGCACAGGGAGCACCAGTGAATCAAACACAGAAATCTGCTAAT
GAAGATGCTTGTTTACAGGTAAATGAAAAACATTAAAAAGCAAAATTATAATGTTTAGTA
CTTCAGGTGATTCTTGCCAGTTGTAACTATGTATGCTACAATGTATTTTAATGCTTCTAA
GTTTTATCTATGCTCAAAATAAAAATACAAGATGCAGTGTAATAGGTAGTTTTGTAGGCA
CAGAAGTGTCTTTTTACAACTTGTCTTACTGTCACATGTTGGATTAAATATGGTTAACAAA
TGAATGTAAGAAATCATTTATCTGGCAATATAACACCAAACAAGCAGGGGAAGCACTTT
CTTTTGAAATGTTAACTAAGAACAAGCAAGGGAAGCGCTTGGTATTCATTTTCTACGCTA
AGTCCCCCCTGTATCTTAAAAAGGTATCTATGCATAATTTGCATTTATATCAGACTGTAA
GACAAGAGTGGGTTGCTCTAGTGGTGAGCACCCTCCACTTCCAACCAAGAGGTTGTGA
GTTCGAGTCACCCCAAGAGCAAGGTGGTGAGTTCTTGGAGGGAGGGAGCCGAGGGTC
TATCGGATACAACCTCTCTACCTCAGGGTAGGGGTAAGGTCTGCGTACACACTACCCTC
CCCAGACCCCACTAGTGGGATTATATTGGGTTGTTGTTGTTATCATACACTGTAATCAGT
TAAAATCAAATTCTTTGGGATGGAGGGGTTTCTTTAGTCTGGCTAATTAGACTTGATAGT
CAAATTTGTTTTAAGAAGTTGTGCCAAATGTTAACTCACTTGTTATAATCTAGCGCGCGC
ACACACATTTTGATATGTTTGATGTTAAGCTGCTAATGAATAACGTCCTTATATTTTCCTT
GACATTGCATGATTTTATGTTTACATATTTATCTGCGTTTAATTTAATGCTATTGTGTCCT
TGCAACTGAATTTTGGTTCTGAGTGGTTGTGATCTTAACAGGATACTCAAGAAACTGAG
AAAGCAACAGAAAGTGAAGCGAGTAAAACAGTAAATAAAGTGGTTGTTGACTCGCCAAA
CCTAGCTGATTATGTTGGCAAACCTGTTTTCCATGCGGAGCGCATATACGATCAGACAC
CAGTTGGAGTTGTGATGGGTCTTGCTTGGACTTCAATGGGTGGCTCAACACTCTATATA
GAAACATCTCTGGTGGAGCAAGGAGAAGGGAAAGGGGCTCTCAATGTAACAGGACAAC
TAGGCGACGTTATGAAAGAAAGTGCCCAAATTGCCCATACGGTTGCCAGGACCATTTTG
CAGGAAAAGGAGCCTGATAACCAATTCTTTGCAAATAGTAAGCTTCATCTTCATGTTCCT
GCAGGTGCTACCCCTAAGGATGGCCCTAGTGCTGGTTGTACTATGATAACGTCCTTGTT
GTCTCTTGCCATGAAAAAGCCTGTTAAAAAGGACCTGGCAATGACAGGGGAAGTCACG
CTAACTGGCAAAATTCTTCCTATCGGCGGGGTATGTTAACAATTCTTACACCTCTCCTTA
TAATTTCATGCAGCTTTTGTGTCTGATCATCTATCATGTTTTCTTTTTATTTTTCGTTGATT
TTGTCTTTAATGTTCTTTATGCTTTAATTATTTCCTGTGTCTGTATGTGTTACATGCATGC
GCATGTAAGCATAATAAGAGTGGTCTTTTCTTTTTGACCCAACCAGTGTTGGGTTCCTTT
CTTGATTTTACAAAAGCTTTACCTTTTGGTTCAATAATAGGTCAAGGAGAAAGCCATAGC
AGCGCGAAGAAGTGATGTGAAAACTATAATATTCCCTTCAGCCAATCGCAGAGATTTTG
ACGAGCTTGCTCCTAATGTCAAGGAAGGCCTTGATGTACACTTTGTGGATGACTACAAG
CAAATATTTGATTTGGCATTT
SEQ 77
ATGCAGTTTTTCCGAAGAAACCCATCACTTCACAGAATCTCCTCCAGATTCCTTAATCAA
GTTCGTTTTCTTTTCTTTTCCTTTTCCGAAGTATAACTAGCTTTTCAATTTTTGTTTGGCTT
TTCGATTATCTTACTAGTGTAACATATATTTCATATGTCTTGAGGTTCATTGCAAAAACTC
GTTATATTTCTAAATGGGGTTGAGCACGTGGTCCAATACAATTCAGTAGGACTTACAGC
TGTAGCCTGTAGTTAGAGGACATTGGATTAATTAATTATATGGCTGCAATTCAGATATTC
AAAACGTTTCTTTTCCCTGTTTGCAATTTTTTCCTCCAAGTAGTGAAACAGTGGAATTTTC
TCCCCATTCTTAGGTCAAGCTATCATCTTTTTGCTTAAGAGTTGGTTTGGATGTTTACATT
TATTTTCTAACACAATTTGTTTTGGTTTGCTGCTGATATCTCATGAATTGGATACGATAGG
TAGTCAAAACCAGTGCATATTCAACCAAGAAAGTTTACAATGCTGGGCAGCCGACTGCT
GCTACTCACCCTCAGGTACTTCTACACTGCATATGTAGTATTGACATTTGGTAGCATAGA
ATTAAGACCGTTGACATTAACAAATGATAAATGTGCAGTAAATTTAAAATCTTGTTTTCTT
GGTGGTTGTTTTCCTTGTATGGGACATACTTCGCTGTCCTTTGGAGCTCCTTTGTGAATT
TCTGTTAAATGTTGTTACATCACTGCATGCAGTTAATGAAGGAAGGGGAGATTACTCCT
GGCATTACCAGTGAAGAATATATGCAGAGAAGGAAGAAATTATTGGAGTTTCTTCCGGA
GAATAGCTTAGCAATTGTTGCAGCCGCTCCCATAAAAATGATGACTGATGTTGTACCAT
ACAATTTTAGGCAGGATGCTGACTATTTGTACATCACCGGATGCCAACAACCTGGTGGT
GTTGCAGTTCTAGGGCATGACTGTGGTTTATGCATGTTCATGCCAGAACAAAGCCCCCA
GGTATTTCAGGAACCATTCACTTGCTTCCTTCTTGTTGACAAGAAGCTGTTAATAAGAGA
AAAGCTTCGTCCTATAATTTAGTGACATTTTTCTTTAGATTCAGTTACTACCATGATTTTT
TGGTAGTTAGTATACATTGTAGCAAGTTAAAGATTGTTTCCATACTAAAAGTGAAAAAGT
ATTTTTAGGACGCTCTTTGGCAAGGAGAAACTGCTGGAGTTGATGCAGCTCTACAGATA
TTCAAGGCTGACCTTGCTTACCCTATTAACAGATTGCCTCAGGTAAATCTTTTTTAAAAT
CATATCTCCAACTGCAAATAAGTTTGAGATTCTTTTTAGAAGCGAATACCTCTCACACTG
ATAAGTAAAAGGGCATATGATAACATCCCTTCTTTTATTCCTTTCAATAGGACAATGAAG
TACTTTATCTAAAAAGGGAGTGGAGACCTATTTGTCTCCTTTCCACTTGATTATAGAAAT
TTATGTCAGAGAATTGAATCTGTTAGAGTTGGCTTGTAGATACCTTTTGACTGTTGATGC
AATTCTTAATATGCGTAAAAGAATTGTTTTTCTCCTTTTTCTCTTTTCTTGCCGGGGAAAA
GAATTGTTTTCCTCCTTTAATATGCGTAAAAGGTATAGAGGGAACAAAGTGGATGGAAG
TTAGAGTTTTCACCTAAGTTGCTCCGACACGGCAATTTAGGTGCCGCACCCATATCGAC
ACGACACTAGTATGGGTGTGGGTATGGGATCCGTACCGGATCTGGTCAAACAATTTTG
GGTACTTTGACCACGACGGATGGAAAAATTCGAGACGAGATACAATTTGATTCCCAAAA
TCAGAATCTAAGGTAAATTTAAATAAAATAATATACCTTATCTAGAAAATCAATCCTTTAC
TTATCTATAACTTGAAAATAAAAAGGAAATCCACACTTTACAAGCTATACGTAAGTAATCC
ACAAAATTTCTCATAATTTAAAAATATTTTTATTTTTTTTGAATTATTTTTAGTCGGATCCC
CGCACCCATATCTGTACTAGGATCTGTATCCCCGAATCTTAGAATTTACATCTCGAAGG
ATCCAACCTCTAGATTCGCACCCATGTCGGACACCCGCACCCGTGTCCGAGCAACTTA
GGTTTTCACATATATAGGAGTCGGGCCTGGCTTATTACTATAAATTCATGTTTGATAGGA
CCTATTACTGGATGTAGCCTTTCCTCATAATTTTGAAAATCAAGCAGGCATCACGCTAGG
ATCGGTTGAGAATATAATATTATGGGTAAGAAATTGAAGAATGAAGGTTAACAGAAAGTG
GACACTGTGTTCCAAATGGAAACTAGGTAAATGATTTAGGCAGACGGAATATTTTTTGGT
TGGCTGTATTTGGCTCTCAGGTTAACCGTTTGACCACTTAATGGTAATTTACTATTTAAA
TCGGCAACACAGAGAACAGGAAGTGAAGATGTATATATGACTGTGTTATTTGTAGAGAA
CCAGTTTATGGTGTAGGTTTTCTAGTTATTGTAGAGCACTTGCGTACAAGAGTTTAAATT
CCGCAACATCGATAAATTCTTACCTTATAAAAAAGAGCAGGAAATGAAGACGAGATGCC
GATATCATGATTAGATCTATGTCAGCAAAGAAAAAATGTCAATTATTTCCTTCTAAGCTGT
CTTTCCTGTACATGGCTTCAATGTAGTGACTTTGTTTCACTTTCTCCATGTTCCAATTTCT
CTTCTCTAATTTTTGCTGGACTTGTCAGATTCTCTCCAGGATGATAGAAAGTTCTTCCAC
TGTGTTCCATAATGTGAAGACAAGGACTTCATCCTACCTGGAGCTTGAGGCCTATAAAA
AAGCAGTTAGCAATTACAAAGTGAAAGATTTCTCTGTGTACACTCATGAAGCCCGATTTG
TGAAGTCTCCAGCAGAGCTGAAATTGATGAGAGATTCTGCATCTATAGCTTGTCAGGTA
ATGGTAGTTCTTTCATTTTTGTCAGGTTCATGGGTTAGAGTGGTAGTTCTTACTCATAGA
GGTTCTTGTTTTTATTGGACATGGAAAAGCAGTCTCTGGGTTATAGAGATGGAAAGATA
GAGTGTACACTTGACACTACTTTTGTATGTTTATTTGTTTTCCATTGAAGTTGATACTCTT
CACACAGTTAACATGTGACTAAGGTATTGATATGCCAGCGAGGTGTTTTCAGAATTTTAA
AAAGCTTATTTGCAGGTCAGCTGTTATACAATCTTAAGTAACATGTTTGTCATTTTGCTAT
ACGACAAAATTTTTTAGAAAGGTAAAATAGGTATTTGCATTTCCCTTTTTTTCCTCTTCTT
CTTTCATGTCTAGGGTGGTGTCTTCAGGTTGAAGATACTACACTTCTGAGGATCTAAAAA
ATATTTCACAAAAGGAAAAAGGGTACAGTCAGATAAAAGGATCACCAGTCTAAAAGAAG
ACGGTTCTTAATATTCCAAAAGTTGGAGTCCCAGCTTTCTTACTTGGGTCAACATATTCT
TGGTCTAATTGTGAAGGAACAGTTCTTGCATGTACAATCCTTTCTTTGATAATGTGCTTC
TGTGTTAAGTAGTTCAGAAGCTCTAGGCATGCTTAACCAAAAGATGTGTATATACTACTC
ATTCATTCTATTTCACAATCATGATTTGCATGTTTTCTTATGAGAGACTGGTCTAGAAAAT
GCTTCTTCCTATTCCTGGATTTGTATGCAGTTGCCTAGCAATAAAGTTGCCAGTTATATG
GGAGTTGAGATATTTTCCTTTCACTAATTCAGTCCTTTTTTATACTGTATAAAGGATATTT
TTTATTTCTTGATCTTTTAATGTCTGTCTTGTCTTTCGGAAACAGCCTCTCTACCCCTCG
GGGTAGGGGTAAGGTCGGCGTACACACTACCTTCCCCAGACCCCACTAGTGGGATTTC
ACAGGGTCGTTGTTATTATTGTTGTTCTTTTAATGTATATATTTTTGGTAGGCACTTGTCC
AGACCATGTTGTACTCGAAGTTGTTTCCTGATGAAGGAATGCTGTCAGCCAAATTTGAA
TATGAATGCAGAGTTAGAGGTGCCCAAAGAATGGCGTAAGCTTTTTCTTGTAATAATTTT
TGGAAGTTTGTATATAGAGAGGAGCACGTTGCAATTTCTAAGTATTTTAGTCTAACATGA
GTTGCAGGAGAGTAAATCAAAATGCCACTAAGACCTCATGTGTAAACATGCAATTGATTT
TCTTTTTCTTCTATTCTCTGCGATTCTGATAATTTGTTGTTTTTCCTGACTGTTAGTTTTG
GTCATACTTCTGGTTGAGATAGTTTCAAGGATTATACATTTTTCTTTTCCTGTTCCACAG
GTTTAATCCTGTTGTTGGTGGCGGACCTAATGGCAGTGTCGTGCATTATTTTCGTAATG
ACCAGAAAGTATGTTTACTGTCTTTAAGCACAGTTGAATTTGAATATCAAGCATATTGAG
TAGTAGTATCTAATTTGTTGTTTTAACAGATTGAAGATGGGGACTTTGTTGTTTTAACAGA
TTGAAGATGGTAACCTTGTCCTCATGGATGTTGGATGCGAGCTCCACGGTTATGTCAGT
GATCTTACTCGTGTTTGGCCGCCCTTTGGAAAATTTTCTTCTGTTCAAGTAAGTATAGAA
TCCATGATTTTCTTCTCCGTTTTCCCCTTAAAACTCAAGTCAAACCCCACTCCTCTGGGT
AAAAACCCTGTTAGCTGATCAAAGTCATAGACAACCTTCCATTTCAGAAAGAATGCACTG
ACCTAAATTGAAACCCCAGACTACCAAAAGCAATCTAGAAAGACAAACGGTAAAATGAA
AATCATCAGGTAATGTAGCCTAGCAGCTAGCTTCACCCTCCAGTGGTATGAGTTATGAA
TCTTAATTCCAGATCCTCAATGGCCTTGCTGATTTGATGTGGATGTGAAGAATGAGAAT
GATATAATCATATAAAGTTCCTCCTTATTAGAAAAACAAAATTTCAATTTTACTTCTAAGC
TACAGATTATGCTTGAGAAATAAATTCTTCCCTTGGTATGGTTTTAAATTGCTACTTTTCT
GTGATATAGTCTCTATACATTATTGTCTGGAACTTAGTGATACGCTCCAAGATACATTTC
AGGAGGAACTTTATAATCTTATTTTGGAGACAAACAAGGAATGCGTGGAGCTGTGCAGA
CCTGGCACAACCATCCGAGAAATACACCACTACTCGGTACTATTTTAGTTAATCCATCTC
GTAATTTCTTTTGGTTTATATTCAAGGGGTAGCTGAGAGTAGGAATTTAATTTTTTTTCTC
TTGCCTTTCATAGACTTAGACCCAGTTATATTGCCAAGTTACATTGGTAGTCTCGGTGAT
AGAAATTTGGGTACAGTTGTGAAGGCCCTACTCTTGCTTTATGTTTTGCCTAATTCTCAA
GTTACACTGACTTCCACCTCCTATTGTGAATAGCAATGTTGCTTCAAAGTTTTCGTTCCT
ATGCATGAGCCGAGGAAAATGAGGCTTGATGGCATTTTCGAAGATAAGGAAAAAAGTTT
AATTCCTTTTAAGCCTTTGAGTATAGAGGTGTTGGGAAGAGATAGATGCCTTTTGATTGC
CCTCCCTTGATTTGAAAATAGTATTTGTTCCCCCATTTCTTCATATATGATGAATAATGCT
TTGTAAAATAAGCCAGTAAGGTAATGATTAGAGGTGTCTAATTAGTGTAGTGTGTGGTAA
TTGATTCTCAGTGAAGGGTAGTATTACCTGGTTGATGCAAATGCTTTATGAATTAAGGGT
CATTCTCTGCATTTGTTGTTAGGGCAGCTGTTAAAGGTCTTATTACTGTTCAAGATATGG
GCGTATATGGTATATATTCTGTAGTGAGTAATTGCCCTATATCAGCATGCTCTTTTCTTTA
GATACTTGAGGACTGCCAAGGTCTCATTCTTTTTTTTTTATTTGATGTGTATAGGTAGAA
ACGCTGCGAAGAGGATTCAAAGAAATTGGGATACTAAAAAATGATCGGCGTGGAAGATA
TGAAATGTTAAATCCTACAAATATAGGTCTTTCCTTTTAACCCTTACTCTTCCGCTGCAG
ATATAAGTACAAATGCATGTGCAATAGCAGCAGAAACCTGCTCCCTCTCAATTGTCTTCA
CAGTTGCTAATGCTATTCCTATTATGCTTTTTGCTGAAAAGAGAAATGATTTCTTGTACC
GCTGCAACCATCTCTGAATGAATTTGGTTTGTTCAATTATATTTCCAGGTCACTATCTAG
GAATGGACGTTCATGATTGTTCTACAATTGGAAATGATCGACCTCTGAAACCTGGTGTA
GTAAGTTTCCTTCCTTACTGATGATTGCTTTGATATTTGAAAATAATCGGGAAACTGCTA
GGTTTGCAAAAGAAATTGGTTGTCATTATTTTGAAATCCTCCTAACGAATGAGGACCAGT
GACTTGCTCATTTGGAAACAAATGAGCTTTGCCATAATGCATCACTCTCTTTTAGCAATT
TACAATGAGTTTTCCTCCATAGGATAGTTCAGTCATTCTCTCTTTGCTTCTTGACTGGAT
TACAATATGAACCAACTAAGAATGCTTATGTTTTTAGAGCCATGTGGTCAAATTTCCCTT
TTCCTTCACTTTTTTCATTTTTATAGAGATGCAAAGGGTTAAAAGAGAGGATTGAATGAT
GATATATATGAATATTTTCTAAACTGCTTTTGTCATATGCACCTCTTTGGTCCTATTGCAG
CTTCTAATTTCATCACTTCTACTAGTAGTTACATGGAGAAAGTTAAATTCAGAAATGAAGT
TGGTTAGGTTACATGGAGAAAATTAAATTCAGAAATGAAGTTGGTTAGGTTTATTTCTAT
TGAAGGATGTGTACATCAGGGTAGTGCGTGTATTTGCATAAAAATTATGTTTGAAATATC
TGACGGTCCAATTAGCAGAAAGATCAAAGTATCTTTTGCTTCTCTCTAGATTCTATATAG
ACCTTTGTTTTGATTGATTAATTTGAAATATTTGAAATGATTAATTCCTCGACCGCTTAGT
TCATTAACCCCGGCTTCATACACGATGCAATTTTTGTTATTGATAAAGGTTTCGTTCCTG
GTATGTCTACTCTTTATCAAAAAGTAAAGCTCTTGTATATCTTTTCATTCTGAGTCACAAG
GAATGTAATATTCCTGCAGAATGTGTAATGTTATATGCAATTGTAGTAAAATTTCTCAGTA
GCTCGCGCATCTTGTTTTCATGTTGGTACTGCAAACTGTTAACTTATACTTTATGTATAAT
CCATTTCATGACAAGTACCTTGCTCTTGTGAAAGGTACTTAACGTCCAACATGCTTGCTT
CCATGTATGAAAAATTATTAGTTATCCCATTTTGCTCCTTTTTCCTTCATTCTTCTAATCAT
AAAAAATTGGAATATGCTCCCGACCTGTCTGATATGACAATAAAAACATACACAATATAT
ATCAAGTCAGCTGTTATATGCAAAATTACTGAAGGTAATTCCAGTAATACAGCTTATTGG
TGTCAGCGGTAGATTTATGTTAAATTATGCTTTAACTGAGGTCTATTTTGCCAGTGATAT
CTGTATCCATGCATTTGTTTTTAGTCCTTACAAAAAACAATTTCAGAACTACTATGTTTTT
GAATAAGAAGCACCATGAACATGCTACTTAGAGGTCTTAGTTTGTATAATTTATGTTGAT
CATATCTGTGGAGAAATATGCTAATTTTGCGTGGGCCGTGGCATTTGTATTTGAAAGAA
GTACATGACTTTTGATTGTTCTGAGATTATGTGCAGAGTCTAGTTCTTTGACTTCGGGAC
TATGGAAAACTTTGTGTTTACTCTACATTCATACATCTAGACAAGGTCATGGCCAATTGC
GAAACATGCTTTACGTTTTTTTAAAAGTGACGGAGACGCATATAGCTCAGAATGATGAG
ATGAAATTGGACAAAGCTAATAGTCTAGGATAAAATTGCCTGGTCATTGTTTAGACATTT
GTAAACTCCTGTTCCCTCTGTTTGTTTCTACATTAACCTTGATGAGGCGGTCATTGATAA
GAAGAATCTCTGACCCCATAAATAGATGAGTCCTTTTCATCTTAGCTTCCAATTACTTGT
GATTTCTGCAAGAACTTGTGATCAATCTTCATGGACTGTTATATGTAGGTCATCACAATT
GAACCAGGAGTATACATCCCTTCATGCTTTGATTGTCCAGAAAGGTAATACTTGTTACCT
CATCAAATTAATGTTCCTTTTGGCATGCATTCAGAAGTTACTGTATCTTAGATCATCCTC
CAGATTCTTGGTTTATTGAACTGGTTCTATCTGCCAAAATGTATTATGGTGGATGGACGA
AGAGTTACTCTTTCATGCAGAAAAGATTGAGAATCATATAAAGATGTCTGTTCTAGGTGG
GTCAGGTAGATCTTGTAGGTCTAGTCAACATGTTAATCTAAGGACATTGTAGCAGAAGT
AGTCGTATCGTGCAAAGGAGTCCCAGTTTGGTTGGGCTTTCAGTGAATGACTGATGGCT
ACTTGTATGTTCAACCGAGAATGGGATCAGGAACCTTTACTTCGAAATTTCTTTTGCAAG
GTTCGAGAAAACTTCCCAGAAACAAATAGCTCGAAAGAAATATGAAAGAAATATTCCAC
GTATCAATGGCATATTCTGGCTGTCCCATTCTAGGGAAAATTGTTGCTTTTGTCATAAAT
CTTAGGGAGAAAGAATTATTGCTCACTTGAACTAAGAAGCTTCATCTGCTGATTCATCTA
TTGTTAAGGAGCTAGATTATTCTCTTACCCCAATAAGGAACAACGTGTCTGTTTCCTTAG
AATTCTTGATTTTTACTCACTTATATATGTCTATATTCACTTATGCTCATTCAGGTTCCAA
GGCATTGGATTTAGGATTGAAGATGAAGTCCTTATTACAGAATCAGGTTATGAGGTATA
GTTACAGAAATCGTTCAATTGTTTGAACAACCGAGTTATACAAGTACCAGTTCATATGAT
CTCTGATACTTTGATCACTTCCGACACTTGTTAGCATTCAAGACCTGATTTTCTGCCCTA
CTGGAAACAGGTACTTACTGCATCCATACCGAAGGAAATTAAACACCTCGAGTCCTTGT
TGAACAACTTTGGCAGTGGGAGAGGAACAGAAATTAGAGCTGCTCTCAGT
SEQ 78
CTACTCACCTCTCACAAAAACCATATAATTCTCCTTCCCTTTCTTCTCTACAAAATCTTCA
TTTCTCTCCAAAAACAAACTCTCATGGCTTCTTCTACTAGAGTTTTTGTTCTTCTCCTTCT
CATAATCTTCAACTTTCTCTACATCTCAGCACAAAAAACCATTAAACATAAGCCTTTTTCA
ATGTCATTTCCTCTTACTTCAACATCTTTATCACATAACTCTTCTTCTAAAGCTCTTTTTCT
TTCTTCCCTTTTGGCTTCTAATCAAAGAAAACAAGCTCCAAACACAAAAACTGTGTCTAG
AATTCCATCTTTGAACTATAAATCAACTTTCAAATATTCAATGGCTTTAATTGTTACACTT
CCAATAGGGACACCACCACAAAATCAACAAATGGTTTTGGACACAGGCAGCCAACTTTC
TTGGATTCAATGTCACAAGAAAATTCCAAAAAGACCCCCACCAACGACGTCGTTTGATC
CTTCTTTGTCCTCCACTTTTTCTGTTCTTCCTTGTACTCATCCTTTATGTAAGCCAAGAAT
TCCCGATTTTACCCTTCCAACTACTTGTGACCAAAATCGCTTGTGCCACTATTCTTACTT
TTATGCTGATGGTACTTTAGCTGAGGGTAATCTTGTCCGTGAAAAAATTACATTTTCACG
TTCCCAAAGTACCCCTCCTTTGATTCTTGGTTGTGCTACGGAGTCCGAAGATGCCGAGG
GTATTTTGGGAATGAATCTTGGACGGTTTTCTTTTGCCTCCCAAGCTAAGGTACAAAAAT
TCTCATATTGCGTGCCAATTAGACAAGGTAGCCATGCAGTTAAACCTAGTGGAACATTTT
ACCTAGGCCAAAACCCTAATTCCCATACATTTCAATATATAAATCTTTTGACTTTTCCTCA
AAGTCAACGCATGCCAAATTTGGATCCACTAGCTTTCACTGTTGGCATGGTAGGGATAA
AAATTGGCGGCAAAAAATTAAACATCTCCGGTAGGGTTTTCCGGCCAAATGCTGGTGGT
TCTGGCCAGACGATCATTGATTCCGGCACGGAATACACTTTCTTAGTGGAAGAAGCGTA
CAATAAGGTCAGAGAAGAAATTGTTAGGTTAGTTGGTCCAAGATTGAAAAAAGGTTACG
TTTATGGTGGTGCACTTGACATGTGCTTCGATAACCGTCCGATGGAAATCGGACGGTTG
ATAGGTGATATGACATTGCAATTTGAGAACGGGGTTGAGATTTTGATCAATAAGGAAAG
GATGTTGGATGAAGTAGAAGGTGGGATCCATTGTGTTGGAATCGGACGGTCAGAATCA
CTCGGAATAGCAAGCAATATTATTGGTAATTTCCATCAGCAAAATTTATGGGTAGAATTT
GATATGAGAAATCGAAGAGTAGGTTTTGGCAAAGGAGAGTGTAGTAGGCAAATG
SEQ 79
ATGGCTGCACTCAATTTCTTCATAATCTTCACATCACTAGTCTTACCAATTGCATCTGAT
CCTCTGTTGTCAACTTATGTTGTCCATGTTGACACCAAAGCCAAGCCATCTCATTACTTA
ACTCAAGATGAATGGTATAATTCAGTGGTTGAGTCAGTTCTTGCAAACAAAATGGACTCA
GATTCTACTTCTCCAAGATTGTTCTACTCATATGATGTAGTGTTACAAGGTTTTGCAGCA
AGATTGACTGATCAAGAATCTGAAAAACTAAATAAATTTCCAGAAGTCATTCACATTTTCA
AAGATCAGTCTAGAATCAAGCTTGACACAACACGTTCGCCGAATTTTCTTGGCCTAAAC
ACAGGTTATGGTCTGTGGCCACAATCTAACTTTGGAGATGATGTTATAATTGGCCTTGTT
GATACAGGGATTTGGCCTGAGAGTGAGAGTTTCAAGGACAATGGTATTGGTCCTATTCC
AACAAGGTGGAAAGGTAAATGTGTTGATGGAATTGAATTCAACGCGACGAGTAGTTGTA
ACAGAAAACTTATTGGTGCTAGGAATTTCGTTAAGGGTGTTGAGAATGACTATCATCATC
AATCGGCACGAGATCAAAATGGACATGGAACACATACTGCTTCAACTGCAGCAGGTACA
GAGGTAAATGGTGCCAATGTATTTGGTTTTGCTAAAGGGAAAGCACGAGGGATTGCGA
GTAAAGCTAGGATTGCAATGTACAAAGCTTGTGGGAGTAGTTCTTGTGCAGAATCTGAT
ATTTTAGCAGCTATTGAAAGTGCTATAAAAGATGGCGTAGACATACTTTCGCTCTCTTTA
GGATACGATGATGCTCCGTTTTATGAAAATCCAGTGGCAATTGCAACATTTGCTGCTGT
TAAAAGGAACATATTTGTTGCTTCTTCAGCTGGAAATCTTGGACCTTATCCATTTTCAGT
TCACAATACAGCACCTTGGGTTACAACAGTTGGAGCTGGATCACTTGATCGCGATTTCC
CCGTTGAAATCAACTTATCAAACAACAAGACTTTTGTTGGTTCTTCTCTTTATCCAGGGA
GAATCAGTGGTAAAAGTTACTCTCTTGTTTATATTGAAAATTGTTCTATAATGACAATCGA
TCGTTCTAAAGTTGAACGAAAGATTGTAGTTTGCAACACTAGTAAAATCGAAGCTCTTAG
AAATGGGATTTTAATTCAGAAAGCAGGTGGTTTTGGACTGATTCAATTAAATCTTCCAAC
TGAAGGAGAAGGGATTAGAGCAATGGCTTACACATTGCCTTCTGCAACATTGGGTTATA
AAGAAGGTATAGAGCTTCTTTCTTATATCAAATCCAATGCTAATCCAAGAGCAGGGTTCG
TACGTCGAAAGGATACAGTAATTGGGAAAAAAGTTAGAGCTCCAATTGTTGCTAGCTTTT
CTTCAAGAGGGCCTAATGTTGTTGTTCCTGAAGTCCTCAAACCTGACCTCATTGCTCCG
GGTTTGAACATTCTTGCTGCATGGCCAGGTAACCAGAGACGGATCCAGGATTTATACCT
TATGCATTCAACCTTTATTCTTTACCATTGACCCCACGACACTTTTAAACTTATGAGGTA
GGAATTTTATACTTTTTGAAATTGTTGTGATTTTTCATATTGCGTGGAAGCAACCACTAAT
GCTTATGGTAGGATAGGCTGTCTACATCACACTCCTTAAGTGCGGCCCTTCGCCCGAC
CCTGCGTGAGCAAGGGATACTTTATGCACTAGTCTACCGCTTTTCTTTATTTAGTGATTC
TTCACATTGTGTGTGTCTATGCAGGTGACATTTCCCCAACACGTCTCAAGATGGATCCA
AGGAGAGTGAAGTTCAATATAAACTCGGGAACATCAATGGCGTGCCCTCACATAGCCG
GAGTAGCTGCATTAGTCCGCGCTGTTCATCCAGATTGGTCCCCGGCTGCTATAAAATCC
GCACTCATGACTACATCCACAGCATTCGACAATGCACAACTCCCTATCATAAAACACGA
AGACATGGAGCTAGCAACTCCGATCAGCATTGGAGCCGGGCACGTGAACCCTGAATCG
GCTATTGATCCGGGCCTAATATACGACACTGATACATCAGACTACATCAACCTACTATG
CAGCTTGAATTACACAGAGAAACAAATGAAACTTTTCACGAACGAGTCAAATCCTTGCTC
GGGTTTCACTGGATCTCCACTTGATCTTAACTATCCATCACTTTCTGTTATGTTCAGGCC
TGATTCCTATGTTCATGTAGTTAAGAAGACACTGACACATGTCGCGGTATCTAAGCCCG
AGGTGTACAAAGTAAAGATAGTGAATCTGAATTCTGAAAAGGTGAGTTTAAGTATAGAG
CCAAGGAAGCTGATTTTCAATGAATCTTTACAGAAACAAAGCTATGTGGTCAAATTTGAG
AGCCATTATGCATTCAACAGCAGCAGGAAAATAGCTGAGCAAATGGCGTTTGGTTCGAT
ATTGTGGGAGAGTGAAAAGCACAATGTTAGGAGCCCCTTCGCTGTTATGTGGGTTCAG
CAAAATTTCAATAACAGTAGATTATACAAA
SEQ 80
TCAAAATGCCAAATCAAATATTTGCTTGTAGTCATCCACAAAGTGTACATCAAGGCCTTC
CTTGACATTAGGAGCAAGCTCGTCAAAATCTCTGCGATTGGCTGAAGGGAATATTATAG
TTTTCACATCACTTCTTCTCGCTGCTATGGCTTTCTCCTTGACCTATTATTGAACCAAAA
GGTAAAGCTTTTGTAAAATCAAGGCACTTCAGAAAAGGAACCCAACACTGGTGGGGTCA
AAAAGAAAAGACCACTCTTATTATGCTTACATGCACATGCATGTAAACACATACACACAG
AGGAAATAATTAAAGCATAAAGAACATTAAAGGCAAAATCAACAAAAAATAAAAAGAAAA
CATGATAGACGATCAGACATAAAAGCTGCATGAGATTATAAGGAGGTGTAAGAATTGTT
AACATACCCCACCAATAGGAAGAATTTTTCCAGTTAGTGTGACTTCCCCTGTCATTGCCA
GGTCCTTTTTAACAGGCTTTTTCATGGCAAGAGACAACAAGGACGTTATCATAGTACAA
CCAGCACTAGGGCCATCCTTGGGGGTAGCACCTGCAGGAACATGAAGATGAAGCTTAC
TATTTGCAAAGAATTGGTTATCAGGCTCCTTTTCCAGCAAAATGGTCCTGGCAACCGTAT
GGGCAATTTGGGCACTTTCTTTCATAACGTCGCCTAGTTGTCCTGTTACATTGAGAGCC
CCTTTCCCTTCTCCTTGCTCCACCAGAGATGTTTCTATATAGAGTGTTGAGCCACCCATT
GAAGTCCAAGCAAGACCCATCACAACTCCAACTGGTGTCTGATCGTATATGCGCTCCG
CATGGAAAACAGGTTTGCCAACATAATCAGCTAGGTTTGGCGAGTCAACAACCACTTTA
TTTACTGTTTTACTCGCTTCACTTTCTGTTGCTTTCTCAGTTTCCTGAGTATCCTGTTAAG
ACCACAACCACTCAGAACCAAAATTCAGTTGCAAAGACACAATAGTCATTATATTAAATG
CAGATAAATATGTAAACATAAAATCATGCAACGCCAAGGAAATACAAGGACATTATTCAT
TAGCAATTTAAATTGGTAGAATTCTATATATTTTTCTTTCAGTACAGACCAATATGAGGGT
AGAAATAACATCAAACATATCAAAATGTATGTGCGCGCGCTAGATTATAACAAGTGAGTT
AACATTTGGCACAACTTCTTAAAACGAATTTGACTATCAAGTCTAATTAGCCAGACTAAA
GAAACCCCTCCATCCCAAAGAATTTGATTAACTGATTACAGTGTATGATATAAATGCAAA
TCATGCATAGATACTTTTTAAGATACAAGGGGGACATAGCATAGAAAATGAATACCAGG
CGCTTCCCTTGCTTGTTCTTAGTTAACATTTCAAAAGAAAGTGCTTCCCCTGCTTGTTTG
GTGTGATATTGCCAGATAAGTGATTTCTTACGTTTTGTTAACCATATTTAATCCAACATGA
GACAGTAAGACAAGTTGTAAAAAGACACTTCTGTGCCTACAAAACTACCTATTACACTGC
ATCTTGTATTTTTATTTTACGCGTAGATAAAACTTAGAAGCACTAAAATACATTGTAGCAT
ACATAATTACAAATAGCAAGAATTACCTGAAGTACTAAACACTATATTTTTGCTTTTTAAT
GTTTTTCATTTACCTGTAAACAAGTATCTTCATTAGCAGACTTCTGCGTTTGATTCTCTGC
TCCCTGTGCTTCAGCTTCAGCATTATTTTCAGAGGCCTCATCATCATTGCTACCGTTTGC
TGACTCAGCTCCAGCTTGAATTTCTTCCTTAGACTTGATTTCGTCTGATAGATGGATAGA
TTCTGCTTTTACCTCATCTACCTCCGCATTCTGAGGCTCAATCTCTCCATCTTCTCTGAC
AAGCTTTAGAGCTATCTGAAAAGCAAGAACAGGATGTGCAAAGTATCAGAGCATGAAAA
CGATTCAAAGTGTTCGTGTTGGATTTCATGAAATAACCATCAGTTTTCATCAAAAGTTGA
CTAGGCACGAGACTCGGACAATTTCATCATTAGATGTGTTATACAAAAGAGGACAGGTC
ACAGAACCTTGCGATAAATTTTTTCAATCTGCTTCTGCAGATTGCGTACACCGGCTTCTC
TGCAGTAATTTTCTATTAAAGCAAGAAGAGCTGAATCGGTCACTTCAACCTGAAATAAAG
AATTCATGAAGCCAAATAAAATTAGCTTATTGCAAGAAACACTTCAAGTGTAACTTGTGA
AATATTTTGAAAAACAACATCAAGGCATCAATTGAAAATCTTTCTACAACCACATATACTA
ACTGCTAGGATAGTTACACGTGTTAAAATATTTAAACCAAACATCAACCATTCAAGGCTT
TGAATTGCAATCAATTGCTAATATAAATCATGCACACTTAATTTTCCAGCTTGATATCACA
TCTTTTTTTTGATAAGGTGAAGATTTTATTAAAAACAGTATCAAGCTGATACTGTAAAAAT
ACAAGGACACTGCTGGCTTAAAAACATTAAAATCCTAAGCGGTCTAGCATGTCCAGCTT
GATATCACATCTCTGCATCTGAGAAACAGCCTCAGCATGTTAAAAGCCTGAGAGACAAA
GAGATGACATATACATGGACTCAAATGTTGCATCTTTTCAACTAGATTACATATCAACTC
GAGTAATTTTACAGGAGCATTAGGGAGCTAGCATTTCCCGAGAAATGGTACCCAAAGCT
CTACCCTAGTCCTGTCATATGTTTGCCATTAGATAAGCATAAAGCCACTACTATGTCATC
AAAAGTACCAACAGCCTCAAGATACTGAAAAAATTAGAACTGCTCCTATCCGCTCTGGA
GGGCCAAAACAAACCAAAGTAGCCATCAGCCCTCCCACATATCAACTTATTAACTTCTG
ACGCCCATATTTAGGGTGGATTTTTTTTTTTTTTTTTTTTTTTTTGGGGGGGGGGGGGGG
GTAGGCACAGTAAATGACACCGCAGATATAGATGAGGTCAAATGTGTCATTTTGATGCC
CTTAATCAGTAAGTATGCAGCTCCCTTTTCTTCACCAGACATCCATATGGCATGTAACTC
AACTGAAGATCACATTAGAGAGCAAAATATAAGATCCACCGAATTCCATTGATCTCATCT
TTAAGGCTGTGGGAGAACATATTTACTGCAGTCATCTTATAACTTAAGCTCCCTCAATAG
ACATTATCAAGTGTTTTTTGGAGTCATCACCACCAATCTTGGCATTGTATGATGATTCCT
TGTACAGTAATAAAACATTCACTACTGTGTACTAGTCTAGTTTTTCTTTTACACGATGGTC
AAATTGAAGTCAACAACAAAAAAAACAAAACACTTGGCATATTTTAAACATCCATCCTGC
TTTAGTCCATGCCACACTCAAATCTGTTAGACAATACCATCCAGCAACCCCAGCCCACC
AGGCACATGTATCCCTACCAGTAAGAAACTCATTATACCTCCATCTACCAATAGAGATTG
GCGACTACCACGACAGATCCACCTTCAATTGCGTTTATATTCCCAGTAAAGTGTGGCAC
AGTGGTTGTGTTAATGATGCACGAGCACATTGGCTAGCAAAGGCAGTCACAGTTTGGCT
CGGTCAAGCAAACATTCCTCAAGAGAAAGCAGCACTTGAGAGGATGTATACAGCAATGT
GGACTCATTCAGTTGTTTATGTGAACCACTTAAATACTTCTTGAAAAGTATAGTGATGGA
AGAGTTTGTATTAAACTTTATAATCCATTAGCTTTGGAGGTAAGTATATTCCAAATGTAGA
GGCTTAAGATGAAAATGATATCAATACATCAAACAGTATATTTCCTAGGAAATATTAAAG
CATATGGCCAAAAACCTATAATGGGATTTACTATATCCGTGCGATTTTTTTCCTTTAGCC
ATGTTCCATATGGAAGGGTGTAAAGAGCTTGATCCCCTGAACTTTCCAATCATATGTTTT
ATTCCATAAACTTCTAACTGAATAAAGCTTTTCACATTTTACTTTAAGAAGTTACATTCTT
GCCCAATCATGTTTCAAACATTTTCCCAACAGACCAATGTCTGCTTCTGAAATGTCAAAA
ATTGGGAGCGGAAAAAATACAAAACTATTCACCCGATAATGAATATGCACAGGAGATGA
TGAAAGAGATGCAACTAGAGACGTTTCTATAAGACATACCTGCTCGGGCTTGATCCCAC
ATGTTTCACGAGTAGCTTTCTCCAAATAATCCCTGGCTATGTGCACTTTCTCATCCGTAA
TGTAACCAGCAATTGAAATTACTTCCATTCTATCCAAAAGAGGATTAGGTATCATTTCTA
CAACATTGGCTGTGCAGACAAACAAAACCTACAGCCGCAAAAAGGAAAATGATCTACCT
CAAATTATGCACTCAAAAAATGGATAATGTGCAGATGTCATTCTCATATAAAGTACAGTG
GAATAGGTAATTGTATTAAGCAAAGAGGATAGAAAACACTGAAGGTGTCAAAGCTCCTT
TTTTTCAAATTTTGGAAGTACATTAAAACCTAGAGGTTGGATTTTATAATTTCTTCAGGTT
CATATATAACTCAACATCCATAACAAGGTTTTGATATAGCCTCCCTCCCCATAACCAAGC
CAGACCCAAAATGCTACTAAAATAATCGAAAGATGCCATCTTCTTTCGACAGAGCCCTTT
GGAAATATAACACAAATATTTACCATTATGAAAAAGACATAGTAACACACAGTAAATTATC
ACATTAAAGGACAAAACAAATGAAATTGTATGCGAATATACCCATCATCCTTCTCTCTCT
AGCCATGTTGTTGATTTGCTGATGTTTGATATAATAAAATGAAATAGTACCGGTAACTGC
AAAGTCGTCATATTTGAATTATGGGCCAAATTCTTTAATCAATAGGTAAACTAACGCTCA
ACACTCAAAGTCTATCGCATAATTGTAAAGTCGTCATATTTGATTTATGGGCCAAATTCT
TTAATCAATAGGCAAACTAACGCTCAACACTCAAAGTCTATCGCTCCAATTATCGGCAAC
TTCAAGAGGAATAAAACCATATCCCTCATCAGTAAAGCCATCAAAGATAATATGAACAGA
CAGTAGGAAATCTCAATGCATCAACCTTTTTCCTATTGAATGATCACCAAAACCAATGCC
AATGAAACCTAGTTTTTCTCAAGTGTTAGATATAGAAATGTAGTTGTCCCACATTGGAAT
AGGTGTAGTATGCCTTTGTATAGAGTAGCTATAAATAAGCCCATCTTGTATTGCATTAGA
CACACAATATCAATATATCATATTTTCTCCCGTGTCTTCTCACATGGTATCAAAGCAATC
GTGAGAGATTTATCGTTGTGCATAAATTCCAGCGACTCCGGGAAGGAAAATCAGTTGAC
CGGAAGCCTTTTCCGGCAGGTCTGCCGCAAGTAAAAAAAAAGCCACTTCGTCAGTGTT
GTGCAAAAAAACCAACACCACCACGAAGTAGATCGGGCTCTGGCAACCAACCCATAAA
AAAATCTCCGTCAGAATACCCTCCACGCGCCGTCACTTGCTACCGGAAGAAAATTTTCC
GGCGAAGTTCCGACGTCGCGTGGGCCACCTTCCAGCCATTTTTTGGCGACGACTCTTC
AGGACAAATTATTCCCCTTGCAATTCCGAGCCTACCCATCCAGGTTACACCAAATTCCA
GACAACTTATATATTTTTTCCAGCATGCATAGTGATTTCAAAAGTGGACTTCCGGCAATT
TTTTGAAAACGTTTCTTCAGAACAGTTGGGTCATCTGGTAATTCCGATCCTACCCCTACT
GTTTTTATTTCATTCCGACCACTTTGAATTTTCCCGGCAGCTACAGTACTATTCCGACTG
CTACAGTAATATTCCGATAGCTACAGTATTTCCTTATTCTGTTTCACTGTTCCTTACTCTG
TTTTAGTGGATTAAATTTGATTATTTCTATAATTTGGTAATAATTTGCAACGATGTCTATG
GGAATTGATGCTTTTGGGTCTAAAAACATGAGTTCTGGAAGCTCTAGTGTTATGATTACT
TCAAAACCTTTAATGTGAGGTTCAAACTACTTAGCTTGGGCTTCATCTGTCGAGTTGTGG
TGTAAAGGTGAAGGTGTTCAAGATCATCTAATTAAACAGTCTAGCGAAGGAGATGAAAA
GGCGATAGCGCTTTGGGCAAAGATTGATGCTCAATTATGTAGCATCTTGTGCCGTTCTA
TTGATTCCAAGTTGATGCCTTTGTTTCGTCCATTCCAGACATGTTATTTGGTTTGGGCAA
AGGCTCGTACCTTATACACTAATGACATATCTCGCTTCTATAATGTGATATCACGGATGA
CAAACTTAAAGAAGCAAGAATTAGATATGTCTACTTAATTGGGTCAAGTACAAGCAATCA
TGGAGGAATTTGAGACATTAATGCCAGTTTCTGCTAGTGTGGCAAAACAACAAGAGCAG
CGACAAAAGATGTTCTAGTTCTTACACTCGCTAGACTTCCTAATGATCTTGATTCAGTGC
GAGACCAGATTTTGGCTAGTCCGACTGTTCCCACAGTTGATGAATTATTCTCTCGATTAC
TCCGCCTTGCCGCACCACCAAGTCACCCAGTGATCTCATCACAAATACTTGATTCCTCT
CTCACATCGCAGACGGTGGATGTTCGGGCGTCTCAAACTATGAAGAACAGAGGAGGAC
GAGGTCGTTTTGGGAGATCTAGACCCAAGTGTTCTTATTGTCACAAACTTGGATACACT
CGTGAAATGTGCTATTCCTTACATGGTCGTCCACCCAAAAATCTTACGTTGCTCAGACT
GAGACTACATGTAACCAAGGTTTTTCTGTATCTAAAGAAGAATATAATGAGCTCCTTCAG
TATCGAGCAAGTAAGCAGACATCTCCACAAGTAGCCTCAATTGCCCAGACTGATACTCC
AGTTGTTGGTAATTCTTTTGCTTGTGTTTCCCAGTCTAGTACTCTTGGACCATGGGTCAT
GGACTCAGGCGCTTCTGATCACATCTCTGGTAATAAATCACTTTTGTCGAATATTGTATA
TTCACAGTCTCTTCCCACTGTTACTTTAGCCAAGGGATGTCAAACTAAGGCACAAGGAG
TTGGACAAGCTAACCCATTGTCTTCTATCACCCTAGATTCCGTTCTTTATGTCCTTGGTT
GTCCTTTTAGTCGTGCATCTGTTAGTCGTTTGACTTGTGCCCTCCATTGTGGTATATATT
TATTAATGATTCTTTTATTATGCAGGACCGCAGTACGGGACAGACAATTGGTACAGGAC
GTGAATCAGAAGGCCTTTACTACCTTAATTCACTCAGTCCTTCCACAACATGTCTAGTTA
CTGATCCTCCGGACCTAATCCACTGTCGTTTAGGACACCCAAGTTTATCCAAACTTCAG
AAGATGGTGCCTCTTTTAGGACACCCAAGTTTATCCAAACTTCACAGTCTACATTAGATT
GTAAGTCGTGTCAGCTTGGGAAACATACCTGAGCTTCCTTTCCGCGTAGTGTTGAGAGT
CATGTAGAGTCTGTTTTCTCCTTGGTTCATTCTGATATATGGGGTCCTAGTAGAGTCAGT
TCAACCTTGGGATTTCGTTATTTTGTTAGTTTCATTGATGATTACTCAAGATGTACTTGGC
TTTTCTTAATGAAAGATCGTTCTGAGTTATTCTCTATATTCTAGAATTTTTGTGCTGAAAT
AAAAAATAAATTTAGTGTCTCTATTTGCATTTTTCGTAGTGATAATGCCTTAGAATATGTA
TCTTCTCAGTTTCAGCAATTTATGACTTCTCATGGAATTATTCATCAGACATCTTGCCTTA
TACCCCTCAGCAAAATGGGGTTGCAGAGAGAAAGAATAGGCACCTTATTGAGACTGCT
CGTACACTTCTAATTGAATCTCGTGTTCCGTTGTGTTTTTGCGGCGATGTAGTTCTCACA
GCTTGTTATTTGATTAATAGGATGCCTTCATCTCCCATCAAGGATCAGATTCCGCTTTCA
GTATTGTTTCCCCAGTCAGCCTTATACCCTCTTCCACCTCGTGTTTTTGGGAGCACATAT
TTTGTTCATAACTTAGCCCCTAGGAAAGATAAGTTAGCTCCTCGTACTCTCAAGTGTATC
TTCCTTGGCTATTCTCGTGTTCAGAAGGGATATCGTTGTTATTCACTTGATCTCCGTAGG
TATCTTATGTCAGCTGACGTCACATTTTTTGAGTCTAAACCTTTCTTTGCTTCTGCTGAC
CACCATGATATATCTGAGGTCTTACCTATACCGACCTTTGAGGAGTTTCCTATAGCTCCT
CCTCCACCTTCGAACACAGAGGTTTCACCCATACTAACCATTGAGGAGTCTAGTGTTGT
TCCTCCTAGTTCCCCAGTCACAGGAACATCACTCTTGACTTATCATCGTCGTCTGCGCC
CTACATCAGGCCCAACTGGTTCTCGTCCTGCACCTAACCCTGCTCCTACTGCGGACCC
TGCTCCTAGGACACTGATTGCACTTCGAAAAGGTATACGGACCACACTTAACCCTAATC
CTCATTATGTTGGTTTGAGTTATCATCGTCTGTCATCTCCCCATTATGTTTTTATATCTTC
TTTGTCCTCGGTTTCCATCTCTAAGTCTACAGGTGAAGCGTTGTCTCATCCAGGATGGC
GACAGGCTAGGAGTGATGAGATGTCTGTTTTACATACAAGTGGTACTTGGGAGCTTGTT
CCTCTTCCTTCGGGTAAATCTACTGTTGGCTGTCGTTGGGTTTATGCGGTCAAAGTTGG
TCCCGATGGCCAGATTGATCGACTTAAGGCCCATCTTGTTGCCAAAGGATATACTTAGA
TATTTGGGCTCGATTACAGTGATACCTTCTCTCTTGTGGCTAAAGTGGCATCAGTCCGC
CTTTTTCTATCCATGGCTGCGGTTCGTCATTGGCCCCTCTATCAGCTGGACATTAAGAA
TGTCTTTTTTCACGGTGATCTTGAGGATAAGGTTTATATGGAGCAACCACCTGGTTTTGT
TGCTCAGGGGGAGTCTCGTGGCCTTGTATGTCGCTTGCGTCGGTCACTTTATGGTCTTA
AGCAATCTCCTCGAGCCTGGTTTGGTAAGTTCAGCACGGTTATCCAGGAGTTTGGCATG
ACTCGTAGTGAAGCTGATCACTCTGTATTTTATCGGCACCCTGCTTCAAGTCTATGTATT
TATCAGGTAGTCTATGTTGATGATATTGTTATTACTCGCAATGATCAGGATGGTATTACT
AATCTGAAGAAGCATCTCTTCCAGCATTTTCAAACTAAGGATCTAGGCAGATTGAAGTAC
TTTCTAGGTATTGAGGTTGCTCAATCTAGCTCAGGTATTGTTATTTCTCAAAGGAAATAT
GCTTTAGACATTCTTGAGAAGATAGGGATGATAGGTTGCAGACCTGTTGATACTCCAAT
GGATCCGAATTCTAAACTTCTGCCAGGACAGGGGGAGCCGCTTAGCGATCCTGCAAGC
TATAGGCGGTTGGTTGGTAAATAAAATTATTTCACAGTGACTAGACCCGACATTTCTTAT
CCTGTGAATGTTGTAAGTCAGTTTATAAATTCTCCCTATGATAGTCATTGGGATGCAGTC
GTCCGCATTATCCGGTATATAGAATCGGCTCCAGGCAAAGGATTACTGTTTGAGGATCG
AGGTCATGAGCAGATCGTTGGGTACTCAAATGCTGATTGGGCAGGATCACCTTCTGATA
GACGTTCTACGTCTGGATGTTGTGTTTTAGTAGGAGGAAATTTGGTGTCCTGGAAAAGC
AAGAAACAGAATGTAGTTGCTCGGTCTAGTGCAGAAGCAGAATATCGAGCAATGGCTAT
GGTAACATGTGAACTAGTCTGGACCAAACAATTGCTCAAGGAGTTGAAATTTGGTGAAA
TCGGTTAGATGGAACTTGTGGAACTTGTGTGCGATAATCAAGATGCCCTTCATATTGCA
TCAAATCTGGTGTTTCATGAGAGAACTAAACACATTGAGATTGATTGTCACTTCGTAAGA
GAGAAGATACTTTCAGGAGATATTACTACGAAGTTTGTGAGGTCGAATGATCAACTTGC
AGATATTTTCACCAAGTCCTTCACCGATCCTTGCATTGGTTATATATGTAACAAGCTCGG
TACATATGATTTGTATGCTCCGGCTTGAGGGGGAGTGTTAGATATAGATATGTAGTTGC
CCCACATTGGAATAGGTGTAGTATGCCCTTTGTATAGAGTAGCTATAAATAAGCTCATCT
TGTATTGCATTAGACACACAATATCAATATATCATATTTTCTCCCGTGCCTTCTCACATCA
AGTATCAAGAGAACAACATTCTACTTACGCTAATCTTATTACAGTCATAAAGACAGAGAC
AAAGGTAACACAGACGTGAAATAAAATACTTCCAGATAACCTTTCATTATGGACAATAAA
AGAAACATGTTCAGTCACGAAACTTTTTCCAAACTTAATTAAGGTAAGAGACGAAAGAAA
ACATCAATCAGTAAAGAAAAAGAAGGGGAAATTAAAAGGTGAAAAAGCAAAATGATATTA
GATATTTGCAAGATTTTTAATATCACATTAGTCCCCACTCATCAAATGATGGTAGAGGGC
ATTTCATAAGGGCCAATCAACAGCCACAACAATTCGATAAGTGCTCCAGGAAACAACTA
ACCTTTGATAAATCAATAGGAACATCAAGATAATGATCTAAGAAATTTGCATTCTGTTCT
GGATCAAGAAGCTCCAACATAGCACTTGCTGGATCACCAGCATGTCCTCTTCCCAACTG
CCATAGAAGAAAAAATACAAAGCTATTATTAAATTTGATCGATAATTATACACATTCTTTA
AACATTTAGATTGCAGAAGAAATCCAAGCAAAAACATAATATGTTAATGCCCAAGATTCT
TCCGGTGGTCATACACAGAAAAGCAAGATTTAAATAGTGGAAAAGAAGTTACACTTCAC
TCATAGTCAAAGCATATAATTGAAGTATGAACTCACAAACCATAAAATACCTTGTCAATTT
CATCGATCAAAACAAGAGGATTAGCGGTTCCCACACTTTTTAAACATTGCACCATCTTCC
CCGGCATGGCACCAATATAAGTTCGTCGATGTCCCTTGAAGTGTATCCCAAAGTTAGTC
ATGAAAGTTCAAGATTATGTGCACAACAAAACAACTCTGTTTTAGCTGGTTAGATATTTC
TGTTACCTTTATTTCAGCAACATCAGACAGCCCTCCAACAGAAAATCGGTAAAATTTGCG
GTTCAATGCACGTGCAATTGAACGACCTATACTGGTTTTGCCCACCCCAGGAGGGCCA
GAGAGGCATATGATTTTCCCTGTTCAAACAAAAATAAGCAATGAGTTTGTTTAGTTCCTG
ACTCAGATTCACAAATTAAACTGAGATAAATCAGATAAAGCCGCACATATGAAAGACAGT
GCCAGTGTATAATTGCGATTAATATCATCACTTGAATCTAACATTGTCCTAGCTATGGTG
CAATTCACATTTTTCTAGAGTTGCCCTTTGTCTGTTTTTTCCTCAACACTTCATAGTTTTC
AAAGATTTTCCTCAACAAACAACTACATCTCAAGTTCAAATGTCCCAAAGATATACAAGT
ACAACTGGCAGTGAAAGTAAGTTATCAATAAAATACCATAGAGCGAGTGCCCAATCATG
GACAACACAATATCAGAATTTAAGAAGAACACCAACAAAGTATGGCCCAAGGTGCAGCT
ATTTAACTCACATTAATTGCTTTTCAAGGGAAGAAATTAAGCTCAAGATATATTTCTTTAC
TCCACAGATAATCACAAGAAAACATTGGAGGAACTCTCGACTGTCACAAAAAGTATTAC
CTAAGTTTGACCAATAAATATTTCTCAAATGCCCTAAAATGTTCACTTCTTCCAATTTACT
CGTTCCAATTTGATGTCAATATAGTATGTGGGAACCATAAATCCCATGTTCTATGGATAT
ATTTTCCATTGGTATATTATAATGTTATGAAAATGGACGGTAAGGAGATGGCTGGATCAT
TTTCCGTTCTTTTAATATGATTATAGCTGAACCTTATAAAACTGAGGATTTTATTAAAAAT
GAGGGTCATTATTTTTAAAAAATAAGGCATTTACCAACCTTGTGAGGTTCCTCTGAGTTT
TCCCACAGCTATAAATTCCAAGATCCTTTCCTTAACATCGGTTAACCCATAGTGGTCTTC
ATCAAGAATTTGTTCTGCCCGTAGTACATCAAAGTTTTCATCACTGCATTTTGGATCATA
AAAGAAGAAAAGTTCAGGAACCAATGCAATACACCCAGGCAGAGATCAAAACTATAAAG
CCAACACCACCACCATCAGCAGCAAACAACACGAACTATGACAAATCAAGCCCCCATCT
TAACCTACATAAGGAGCTCATCATAAGTACAATCCTAGTTTTGCTGCTCTCATGAGATCC
TGGTCTAAACTGATACATTGGGATGTCAAGGAAGCATCCTCAAAGCCAGAAGCATGACA
TGTAAGAACCCAAAATAAGGAACAGGAGGAGAAAGAAAAGTTAAAGACAAATGGATATT
ACTCGATAGAAATTTAGGCAAGTCACATCACTAGGCAGTAAGGTTAACTTGGAAGCCTA
GCAAATTTCAGACAATAACAATTTTATTTCTCATAACTGTTTAGCCAGCTTCTAACAAACA
GATTCCTCTAGTACAGCTAACCAAGGATAAGGTTAACATCAGTTGGATTGAACCCAAGA
CAATGTAAGGTAGAATAGATAGACAACAAACCTGTAACTACCCCATGGCAAGGCAGTCA
ACCAATCAAGATAATTACGTGTCACGTTAAATTCACTGGAACTAGCTTCCAACAGTTGCA
GTTTTGTCAGTTCTTCTTCAATAACTTGCATAACATGTACTGGTATTTTTTCTTTATTAGG
CTCCAATCTTTCCCTGAACTTTGCTGCAGTAATAAACAAATATAGTGTCACACCATTAGA
TGTAATTAAAAAGGGAACAAAAAACTGCAGGTTTCCCTTAGTATTGACCAGAAAAATCCA
AGCCCTTAAAGCGAACCTAATCATTAGGCAGTTTCTTCTCTCAATTTCAACACCAAAACC
AGGTCTACCAAAACAATGCTACAATGCGGAGCTCAACTTTATCTTTGTGAAAATTATAGA
GAAAAAATAATGAGAACCGCATTTTAGGGTTTGCAGTAACGTCTATTATCACCAAAACCT
GGAAGAAGCACCTTCTTAAGCATCACACTCCTCAACAGGAGCAAAGGAGAGAAGAAAG
AGCACATAACAGAAAAAGCAAGTAGGCTTCTAATTAGGAAGTGCACTGAAGGGAAACAG
GACAGAATTCATGAGAAAAAGAAGACATTGGAGGAGGAAGATAATCTGAATCATGAGCA
GTCCATAACATGAATAAACAACAATGCAGTGCTAAAACGGAAAATGTGGCCTTCCAGTA
AATCCATGTTGAATTTGTGCCACATTACCAAATATGGTAGACAGTAACCTTTCGGAATTG
AAGTTGAAGGATTCCAGTATTGTCTTATTAAGTGCTTTCGTGAAAACACAAAACAGCCCC
CTATAGGTACAATGTATTCTAATCTGTCAAAAGTTTGGAATGTCTCAAATAGTTTTAGAAA
GCATGTCAATAAAGTTGGTTGGACTGTGTACAAAGAAAATTCAACCTTCAATTTCCTATA
TGTAAAGCCAATTATCGCTTGAATGCTATTCATGCTTCTAGTGAAGATTTTCAATGGTAA
ACAGGAGATCAATCAGCCACCATAAAACTTTTACAACAGAAAGGGCAAAACATATTGCA
AATGGCCTTTTCTATGCTACTGCTAAATGGAACTTTACCAGGGACAACATAAGATGTTTC
ACTGGCCAGATGGGAAAACACGTCTGGCCATAATAAATCCAAAAAGAACATTATCCATT
TCAATAATAATTAAGAGACAACTAACCAGAAAGAGCTGTCTTGTCATCAGTCTCCAAACC
TAGTTCCTGCATTAAGCCAATCAAAACCACGTCATAAGTACTTTTTTTTCAAATCAACTCT
TTGTTTAAACTGTGAATCAAACATAAGAACCAGAAGCTTAACATATATATTATCTAGGAAT
AAACATATGTAAGTACTAAGTTTAAGGATGATAACACAAAAAAAGCTGCACATAATCCAC
ATGCCTTCTTTATGGCCTTTAATTGTTCATTCAACAAATAACGGCGTTGCTCTCCGCTTA
TTTTTTCTTCAATTGCTCTTGCTATTGATTCCTGCAGAAAACAAAGACGACGTAAAATGC
TAAATGCATGCATAACAACTATTCAAAGTTCTGGTATGCGTCTTGACGTGCATTTACCTG
AATCTTACTAATCTCCAT
SEQ 81 to 160 are putative protein sequences related to SEQ 1 to 80
SEQ 81
MALRFSLIFLFSLFLTTSLLLSVNGNINGGEDDDILIRQVVGDDDDHLLNADHHFTIFKRRFGKTYA
SDEEHHYRFSVFKANLRRAMRHQKLDPSAVHGVTQFSDLTPAEFRRNFLGVNRRLRLPSDANK
APILPTEDLPSGFDWRDHGAVTSVKNQGSCGSCWSFSTTGALEGATYLSTGKLVSLSEQQLVD
CDHECDPEEKDSCDAGCNGGLMNSAFEYTLKAGGLMREEDYPYTGTDRGTCKFDNTKVAAKV
ANFSVVSLDEEQIAANLVKNGPLAVAINAVFMQTYVGGVSCPYICSKKLDHGVLLVGYGTGFSPI
RMKEKPYWIIKNSWGEKWGENGYYKICRGRNVCGVDSMVSTVSAVSTSSH
SEQ 82
MGAKVFLVALFLSALLFPLASSSNDGLMRIGLKKMKFDQNNRLAARIESKEGDVLRASIRKYNFR
GKLGDSEDTDIVALKNYMDAQYFGEIGVGTPPQKFTVIFDTGSSNLWVPSSKCYFSVPCFFHSK
FKSSESSTYKKNGKSAAIQYGSGAISGFFSQDNVKVGDLVVTDQEFIEATREPSVTFLVAKFDGIL
GLGFQEISVGNAVPVWYNMVQQGLIKDPVFSFWLNRNTEEEQGGEIVFGGVDPNHYKGEITYV
PVTHKGYWQFDMGDVLIEGKATGYCESGCSAIADSGTSLLAGPTTIITMINQAIGASGVASQQCK
SVVEQYGQTIMDLLLAEAHPKKICSQVGVCTFDGNRGVSMGIESVVDEKAGRSTGLQDGMCSA
CEMAVIWMENQLRQNQTQDRILNYVNELCERLPSPLGESAVDCGKLSSMPTVSFTIGGKVFDLV
PKEYILKVGEGAKAQCISGFTGLDIPPPRGPLWILGDVFMGRYHTVFDYGKLRVGFAEAA
SEQ 83
MGSFLCFSVIVVLLVLQPCLAKKVYIVHMKNHQIPSSFATHHDWYNAQLQSLSSSSTSDESSLLY
SYDTAYSGFAASLDPHEAELLRQSDDVVGVYEDTVYTLHTTRTPEFLGLNNELGLWAGHSPQEL
NNAAQDVVIGVLDTGVWPESKSYNDFGMPDVPSRWKGECESGSDFDPKVHCNKKLIGARFFS
KGYQMSASGSFTNQPRQPESPRDQDGHGTHTSSTAAGAPVANASLLGYASGVARGMAPRAR
VATYKVCWPTGCFGSDILAGMERAILDGVDVLSLSLGGGSGPYYRDTIAIGAFSAMEKGIVVSCS
AGNSGPAKGSLANTAPWIMTVGAGTIDRDFPAFATLGNGKKITGVSLYSGKGMGKKVVPLVYST
DSSASLCLPGSLDPKMVRGKIVLCDRGTNARVEKGLVVKEAGGVGMILANTAESGEELVADSHL
LPAVAVGRKLGDFIRQYVKSEKNPAAVLSFGGTVVNVKPSPVVAAFSSRGPNTVTPQILKPDVIG
PGVNILAAWSEAIGPTGLEKDTRRTKFNIMSGTSMSCPHISGLAALLKAAHPEWSPSAIKSALMT
TAYVRDTTNSPLRDAEGGQLSTPWAHGSGHVDPHKALSPGLIYDITPEDYIKFLCSLDYELNHIQ
AIVKRPNVTCTKKFADPGQINYPSFSVLFGKSRVVRYTRAVINVGAAGSVYEVTVDAPPSVTVTV
KPSKLVFKRVGERLRYTVTFVSKKGVNMMRKSAFGSISWNNAQNQVRSPVSYSWSQLLD
SEQ 84
MGTKFILFILLFIFLFSSGFVACGGFYSFRNLNSSVSGIEFPNHPSFNAVSSSADSDCNYGVSQKS
KTHSIAQEVDGVDVKNGENEEVSIFGNQKKEAVKFQLRHRSAGKKIEAKDSVFESRARDLSRIQT
LHTRIVEKKNQNYNSRLAKSNEKHVDKHKPVIAPAAVSLESYELSGKLMATLESGVSLGSGEYF
MDVFVGTPPKHFSLILDTGSDLNWIQCVPCFDCFEQNGPHYNPQDSTSFRNISCHDPRCKFVTS
PDPPQLCKSENQTCPYYYWYGDSSNTTGDFALETFTVNLTTTSGSEFRKVENVMFGCGHWNR
GLFHGAAGLLGLGRGPLSFASQLQSLYGHSFSYCLVDRNSNSSVSSKLIFGEDKELLKHPQLNF
TSLVGGKEVETFYYVQIKSVIVGGEVLNIPEETWNLSLEGLGGAIIDSGTTLSYFADPAYEIIKEAF
VNKVKGYPIVQDFPILNPCYNVSGVKNLEFPSFGIVFGDGAVWNFPVENYFIKLEPEDIVCLAVLG
TPRSALSIIGNYQQQNFHILYDTKRSRLGYAPTRCADA
SEQ 85
MALTLKSLATPLLFGALFILILQVVAEQPISEAKVESAILQESIIKEVNENAKAGWKAAFNPRFSNFT
VSQFKRLLGVKPAREGDLEGIPILTHPKLLELPKEFDARKAWPQCSTIGRILDQGHCGSCWAFGA
VESLSDRFCIHHNLNISLSVNDLLACCGFLCGSGCDGGYPITAWRYFIRRGVVTEECDPYFDNE
GCSHPGCEPGYPTPKCQRKCVKEILLWGKSKHYGVNAYRIHHDPNSIMTEIYKNGPVEVSFTVY
EDFAHYKSGVYKHVTGQSMGGHAVKLIGWGTSEQGEDYWLIANSWNRGWGDDGYFKIRRGT
NECGIEHNVVAGLPSAKNLNVELDDVSNAFLDASM
SEQ 86
TLVLHTSFYLLLSVASPGDCLLLSIFPFSFSSPRYFPYKQNTVKIISSNFLFSPFFQMGSFLCFSVIV
LFLVFQPCFSKKVYIVHMKNHQIPSSFATHHDWYNAQLQSLSSSSTSDESSLLYSYDTAYSGFA
ASLDPHEAELLRQSDDVVGVYEDTVYTLHTTRTPEFLGLNNELGLWAGHSPQELNNAAQDVVIG
VLDTGVWPESKSFNDFGMPNVPSRWKGECESGPDFDPKVHCNKKLIGARFFSKGYQMSASGS
FTNQPRQPESPRDQDGHGTHTSSTAAGAPVANASLLGYASGVARGMAPRARVATYKVCWPTG
CFGSDILAGMERAILDGVDVLSLSLGGGSGPYYHDTIAIGAFSAMEKGIVVSCSAGNSGPAKASL
ANTAPWIMTVGAGTIDRDFPAFATLGNGKKITGVSLYSGKGMGKKVVPLVYSTDSSASLCLPGS
LDPKIVRGKIVLCDRGTNARVEKGLVVKEAGGVGMILANTAESGEELVADSHLLPAVAVGRKLG
DFIRQYVKSEKNPAAVLSFGGTVVNVKPSPVVAAFSSRGPNTVTPQILKPDVIGPGVNILAAWSE
AIGPTGLEKDTRRTKFNIMSGTSMSCPHISGLAALLKAAHPEWSPSAIKSALMTTAYVHDTTNSP
LRDAEGGQLSTPFAHGSGHVDPHKALSPGLIYDITPEDYIKFLCSLDYELNHIQAIVKRPNVTCAK
KFADPGQINYPSFSVLFGKSRVVRYTRAVTNVAAAGSVYEVVVDAPPSVLVTVKPSKLVFKRVG
ERLRYTVTFVSNKGVNMMRKSAFGSISWNNAQNQVRSPVSYSWSQLLD
SEQ 87
MASSCLHAILLCFLLFITSTTAQNQTSFRPKGLILPITKDASTLQYLTQIHQRTPLVPVSLTLDLGGQ
FLWLDCDQGYVSSSYKPARCRSAQCSLAGAGSGCGQCFSPPKPGCNNNTCSLLPDNTITRTAT
SGELASDTVQVQSSNGKNPGRNVTDKDFLFVCGATFLLEGLASGVKGMAGLGRTIISLPSQFSA
EFSFPRKFAVCLSSSTNSKGVVLFGDGPYSFLPNREFSNNDFSYTPLFINPVSTASAFSSGEPSS
EYFIGVKSIKINQKVVPINTTLLSIDNQGVGGTKISTVNPYTILETSIYNAVTNFFVKELVNITRVASV
APFGACFDSRNIVSTRVGPAVPSIDLVLQNENVFWRIFGANSMVQVSENVLCLGFVDGGVNPRT
SIVIGGYTIENNLLQFDLAGSRLGFTSSILSRLTTCANFNFTSIT
SEQ 88
MNPEKFTHKTNEALAGAHELALSAGHAQFTPLHMAVALISDHNGIFRQAIVNAGGNEEVANSVE
RVLNQAMKKLPSQTPAPDEIPPSTSLIKVLRRAQSSQKSCGDSHLAVDQLILGLLEDSQIGDLLKE
AGVSASRVKSEVEKLRGKEGRKVESASGDTTFQALKTYGRDLVEQAGKLDPVIGRDEEIRRVVR
ILSRRTKNNPVLIGEPGVGKTAVVEGLAQRIVRGDVPSNLADVRLIALDMGALVAGAKYRGEFEE
RLKAVLKEVEEAEGKVILFIDEIHLVLGAGRTEGSMDAANLFKPMLARGQLRCIGATTLEEYRKYV
EKDAAFERRFQQVYVAEPSVTDTISILRGLKERYEGHHGVKIQDRALVVAAQLSSRYITGRHLPD
KAIDLVDEACANVRVQLDSQPEEIDNLERKRIQLEVELHALEKEKDKASKARLVEVRKELDDLRD
KLQPLMMRYKKEKERIDELRRLKQKRDELIYALQEAERRYDLARAADLRYGAIQEVETAIANLES
TSAESTMLTETVGPDQIAEVVSRWTGIPVSRLGQNEKEKLIGLGDRLHQRVVGQDHAVRAVAEA
VLRSRAGLGRPQQPTGSFLFLGPTGVGKTELAKALAEQLFDDDKLMIRIDMSEYMEQHSVARLI
GAPPGYVGHDEGGQLTEAVRRRPYSVVLFDEVEKAHPTVFNTLLQVLDDGRLTDGQGRTVDFT
NTVIIMTSNLGAEYLLSGLMGKCTMETAREMVMQEVRKQFKPELLNRLDEIVVFDPLSHEQLRQ
VCRYQMKDVALRLAERGIALGVTEAALDVILSESYDPVYGARPIRRWLERKVVTELSKMLVKEEI
DENSTVYIDAGVGRKDLTYRVEKNGGLVNAATGQKSDILIQLPNGPRSDAVQAVKKMRIEEIEED
EMED
SEQ 89
MQSFKSASILRRLLQNSRLVSHSRSFCSVSTNALVDESQSTVLVEGKASSRTAILNRPHALNALN
FSVVDRLLKLYKNWEDDPDIGFVVLKGSGKAFSAGGDIVTIYNLLKQDAGNLQDCKDFCWTINNL
VYVVGTLLKPHVALLNGITMGGGAGISIPGTFRVATEKTVFATPETLIGYHPDAGASFYLSHLPGY
LGEYLALTGDKINGAEMISCGLATHYLHSAKLPLIEEQLGKLMTDDPSVIERSLENCGEIVHPDPT
SVLHRIETLNKCFSHDTVEEIIDALESEAAKKQDAWCVSTLRKLQETAPLSLKVSLRSIREGRHQT
LDQCLIREYRMSVQAFSGQITNDFCEGVRARLVDRDFAPKWDPPSLDKVTDDMVDQYFSRLTA
FEPELELPTQQREAFT
SEQ 90
MALTLKSLATPLLLGAFFILVLQVVAEKPISEAKVESAILKESIIKEVNENAKAGWKAAFNPQFSNF
TVSQFKRLLGVKPAREGDLEGIPLLTHPKLSELPKEFDARKAWPQCSTIGRILDQGHCGSCWAF
GAVESLSDRFCIHHNLNISLSVNDLLACCGFLCGSGCDGGYPISAWRYFIRRGVVTEECDPYFD
NEGCSHPGCEPGYPTPKCQRKCVKENLLWGKSKHYGVNAYRIHRDPYSIMTEIYKNGPVEVSF
TVYEDFAHYKSGVYKHVTGQSMGGHAVKLIGWGTSEQGEDYWLIANSWNRGWGDDGYFKIRR
GTNECGIEHNVVAGLPSAKNLNVELDDVSDAFLDASM
SEQ 91
MGVLKKTLLLLFLCVFLGDISLCFSSKLYVVYMGSKDSDEHPDEILRQNHQMLTAIHKGSIEQAKT
SHVYSYRHGFKGFAAKLTEAQASEISKMPGVVSVFPNTKRSLHTTHSWDFMGLSDDETMEIPG
FSTKNQINVIIGFIDTGIWPESPSFSDTNMPPVPAGWKGQCQSGEAFNASICNRKIIGARYYMSG
YEAEEENGKTMFYKSARDSSGHGSHTASTAAGRYVANMNYKGLANGGARGGAPMARIAVYKT
CWSSGCYDVDLLAAFDDAIRDGVHVISLSLGPDAPQGDYFNDAISVGSYHAVSRGILVVASVGN
EGSTGSATNLAPWMITVAASSTDRDFTSDILLGNGVRLKGESLSLSQMNTSTRIIPASEAYAGYF
TPYQSSYCLDSSLNRTKAKGKVLVCLHAGSSSESKMEKSIIVKEAGGVGMILIDDADKGVAIPFVI
PAATVGKKIGNKILAYINNTRLPMARILSARTVLGAQPAPRVAAFSSRGPNSVTPEILKPDIAAPGL
NILAAWSPAASTKLNFNVLSGTSMACPHITGVVALLKAVHPSWSPSAIKSAIMTTAKLSDKHHKPII
VDPEGKRATPFDFGSGFVNPTNVLDPGLIYDAQPADYRAFLCSIGYDEKSLHLITRDNSTCDQTF
ASPNGLNYPSITIPNLRSTYSVTRTVTNVGKARSIYKAVVYAPTGVNVTVVPRRLAFTRYYQKMN
FTVNFKVAAPTQGYVFGSLTWRNKRTSVTSPLVVRVAHSNMGMMV
SEQ 92
MGAKAFLVAMFLSALLFPFASSSNDGLMRIGLKKMKFDQNNRLAARIESKEGDVLRGSIRKYNF
RGKLGDFEDTDIVALKNYMDAQYFGEIGVGTPPQKFTVIFDTGSSNLWVPSSKCYFSVPCFFHS
KYKSSESSTYKKNGKSAAIQYGSGAISGFFSQDNVKVGDLVVTDQEFIEATREPSVTFLVAKFDG
ILGLGFQEISVGNAVPVWYNMVKQGLIKDPVFSFWLNRNTEEEQGGEIVFGGVDPNHYKGEITY
VPVTQKGYWQFDMGDVLIDGKATGYCESGCSAIADSGTSLLAGPTAIITMINQAIGASGVASQQC
KSVVEQYGQTIMDLLLAEAHPKKICSQVGVCTFDGNRGVSMGIDSVVDEKAGRSTGLQDGMCS
ACEMAVIWMANQLRQNQTQDRILNYVNELCERLPSPLGESAVDCGKLSSMPKVSFTIGGKVFDL
SPNEYILKVGEGAKAQCISGFTGLDIPPPRGPLWILGDIFMGRYHTVFDYGKLRVGFAEAA
SEQ 93
MTFFRSFLFFLLTLFVISSALDMSIISYDEQHGQMGTTHHRTDDEVRELYESWLVKHGKNYNAIG
EKERRFEIFNDNLRFIDEHNAENRSYKLGLNRFSDLTNEEYRAMFVGGRLDRKTRLMKSPKSNR
YAFQAGEKLPESVDWREKGAVAPVKDQGQCGSCWAFSTVGAVEGINKIVTGELISLSEQELVD
CDRSYNQGCNGGLMDYAFDFIKNNGGIDTEDDYPYHAQDGTCDPYRKNARVVSIEGYEDVPEN
DEKSLMKAVANQPVSVAIEGGGRAFQHYSSGVFTGYCGTQLDHGVVVVGYGTENGEDYWIVR
NSWGANWGESGYIKLQRNFANSTTGKCGIAMQASYPLKSGANPPNPGPSPPTPVTPSTVCDEY
YSCPQGTTCCCIYQYGEYCFGWGCCPYESATCCDDNYSCCPHDYPVCDVDAGTCLMSKDNPL
KVKALKRGPARVNWSGMKSNRKVSYV
SEQ 94
MANSYTSFNFFLAPIIFLAILGLQLQSSDGFGTFGFDIHHRYSDPVKGILDLHGLPEKGSVEYYSA
WTQRDRFIKGRRLADTTNPTPLSFSGGNETFRLSSLGFLHYANVTVGTPGLSFLVALDTGSDLF
WLPCDCSNCVRALETRSGRRINLNIYSPNTSSTGQIVPCNGTLCGQRRRCLSSQNACAYGVAYL
SNNTSSSGVLVEDILHLETDNAQQKSVEAPIALGCGIRQTGAFLSGAAPNGLFGLGLESISVPSM
LASKGLAANSFSMCFGPDGIGRIVFGDKGSPDQGETPLNLDQLHPTYNISLTGITVGNKITDVDFT
AIFDSGTSFTYLNDPAYKVITENFDSQAKQLRIQPDGEIPFEYCYGLSANQTTFEVPDLNLTMKG
GNQFFLFDPIIMLSLQDGSRAFCLAVVKSGDVNIIGQNFMTGYRVVFDREKMVLGWKPSDCYDS
RESNDKSTTLPVNKRNSTEAPSPSSVVPEATKGNGSGNEPATSFPSVPSSRPAINHAPAHFNSY
ICQLMMALFSLFSYYLIIVSS
SEQ 95
MVTKFSIFILVVLLRLFSFGSVASREIHNSGLNLNSSASGIEFPQHPSFNSVTASGNSDCSYGTSK
KSTTTHVITQEENRSDEKEDEDLMVSKNQPREAVKFHLRHRSAGQNIEAKDSIFESTTRDLGRIQ
TLHTRIVEKKNQNSISRQTKNSEKPTQSSSFEFSGKLMATLESGVSHGSGEYFMDVFVGTPPKH
FSLILDTGSDLNWIQSVPCYDCFEQNGPHYDPKDSISFKNISCHDPRCHLVSSPDPPQPCKSEN
QTCPYYYWYGDSSNTTGDFALETFTVNLTTPSGDSEIKKVENVMFGCGHWNRGLFHGAAGLLG
LGRGPLSFSSQLQSLYGHSFSYCLVNRNSNSSVSSKLIFGEDKELLKHANLNFTSLVGGKENHL
ETFYYVQIKSVIAGGEVLNIPEETWNLSTEGVGGTIIDSGTTLSYFAEPAYEIIKQAFVNKVKHYPV
LEDFPILKPCYNVSGVEKLELPSFGIVFGDGAIWNFPVENYFIKLEPEDIVCLAMLGTPHSAMSIIG
NYQQQNFHILYDTKRSRLGFAPTRCADA
SEQ 96
MPSSFSLLFLTLLLASISLSFSSTLNSNDDDFFLSSTPKFPLTMAEKLIRQLNLFPKHDINKAAATG
DSAAVTEQRLFEKKLNLSYVGNSGSTVQDLGHHAGYYRLPHTKDARMFYFFFESRSRKNDPVVI
WLTGGPGCSSELAVFYENGPFKIADNMSLVWNDFGWDKVSNLIYVDQPTGTGFSYSSNDDDIR
HDERGVSNDLYDFLQAFFKAHPQYAKNDFYITGESYAGHYIPAFASRVHQGNKNKEGIYVNLKG
FAIGNGLTDPEIQYKAYTDYALDMKLIKKSDYNAIEKSYPKCQLAIKLCGKDGGTACMAAYLVCTS
IFNKIMDIAGDKNYYDVRKRCEGDLCYDFSKMETFLNDQQVKKALGVGDIEFVSCSSEVYQAMQ
LDWMRNLELGIPSLLEDGIKLLVYAGEYDLICNWLGNSRWVHAMKWTGQKAFGKATQVSFAVD
GVEKGVQKNYGPLTFLKVHDAGHMVPMDQPKAAMEMLQRWMQDKLSKEGHLAPM
SEQ 97
MTLTLKSLAAPLFLGAFCILILQVVAEKPISEAKVESAILQESIIKEVNENAKAGWKAAFNPRFSNFT
VSQFKRLLGVKPAREGDLEGIPILTHPKLLELPKEFDARKAWPQCSTIGRILDQGHCGSCWAFGA
VESLSDRFCIHHNLNISLSVNDLLACCGFLRGSGCDGGYPISAWRYFIRRGVVTEECDPYFDNE
GFHTRVVNQDIPPQSVV
SEQ 98
MFRLVMVTKFSIFILVVLLRLFSFGFVASREIHNFGINLNFSASGIEFPQHPSFNSVTASGNSDCSY
GTSKKSTTTHVITQEENNSDEKEDEDLMVSENQPREAVKFHLRHRSAGQNIEAKDSIFESTTRDL
GRIQTLHTRIVEKKNQNFISRQTKNSEKTTQSSSFEFSGKLMATLESGVSHGSGEYFMDVFVGT
PPKHFSLILDTGSDLNWIQSVPCYDCFEQNGPHYDPKDSISFKNISCDDPRCHLVSSPDPPQPC
KSENQTCPYYYWYGDSSNTTGDFALETFTVNLTTPNGDSEIKKVENVMFGCGHWNRGLFHGA
AGLLGLGRGPLSFSSQLQSLYGHSFSYCLVNRNSNSSVSSKLIFGEDKELLKHLNLNFTSLVGGK
ENHLETFYYVQIKSVIVGGEVLNIPEETWNLSTEGVGGTIIDSGTTLSYFAEPAYEIIKQAFVNKVK
RYPILDDFPILKPCYNVSGVEKLELPSFGIVFGDGAIWTFPVENYFIKLEPEDIVCLAILGTPHSAM
SIIGNYQQQNFHILYDTKRSRLGFAPRRCADA
SEQ 99
MSGFRLPLLFHLLLPLTLFLQYVQSLPQNSSTVEFLPGFDGPLPFYLETGYIGVGKSEEVQLFYYF
VKSESNPKKDPLLLWLTGGPGCSSFTGVAYEVGPLAFGQKAYNGSLPILVSTPYSWTKFASILFL
EQPVNTGFSYATTSAASKCTDLQACDQVYEFLLKWFNNHPEFISNPFYVSGDSYSGITVPVIVQL
ISDGIEAGKKPLINLKGYSLGNPLTFPEESNYQIPFCHGMGLISNELYESLKETCKGDCRNIDPTN
KLCLENFKMFKKLVSSINDQQILEPFCGTDSESPNPRQLSGERRSLEEDFIFLKHDDFICRESRVA
TRKLSNHWANDPSVQEALHVRKGTIRRAWARCRQSIMGTTYRVTFMNSIPYHVNLSSKGYRSLI
YSGDHDMVVPFQSTQAWIKYLNYSIIDDWRPWTIDGQVAGYTRSFSNHMTYATVKGGGHTAPE
YKREESFHMFKRWIAQQPL
SEQ 100
METNGLIKEILPRDAVNNMTRLILSNALYFKGEWNEKFDVSETKDHDFHLLNGGSIQAPFMTSKK
KQYIAAFDCFKILRLPYKQGTDTRRFCMYFILPDAHDGLPALLEKISLEPGFLNNHVPYGKVRARK
FLIPKFKITFGFEASNILKGLGLTLPFCGGSLTEMVDSPMPQNLSVSQVFHKSFIEVNEEGTEAAA
VTATVIMTMSLIIEKEMDFVADHPFLFLIRDESTGAVLFIGSVMNPLAG
SEQ 101
MNESYGNSRASSSSTTSSLNSSSHGTEDDHTIARILAEEEENALKYGGNKLGRRLSHLDSIPHTP
RVIGEIPDPNDATLDHGRLSSRLATYGLAEMQIEGDGNCQFRALSDQLYHNPEYHKHVRKEVVK
QLKRFRKLYEGYVPMRYKSYLRKMKRLGEWGDHVTLQAAADRFGVKICLVTSFRDNGYIDILPK
DIQPSRELWLSFWSEVHYNSLYEIGEVPARVRRKKHWLFF
SEQ 102
MSWLCPSLVLVLLIFQGPICTCSSISDLFESWCQQNGKTYSSEQERVYRLEVFEENYAYIIEHNS
KGNSTYTLNLNAFSDLTHHEFKNSFLGLSSSANDFIRLKTGSSSAGVFNDVGVVDIPSSLDWREK
GAVTKVKNQGSCGACWSFSATGAIEGINKIVTGSLVSLSEQELIDCDKSYNDGCGGGLMDYAFE
FVKKNGGIDTEEDYPFNEREGTCNKNKLQRRVVTIDGYTDVPQYDEDKLLKAVANQPVSVGICG
SERAFQSYSKGIFTGPCSTVLDHAVLIVGYGSENGVDYWIIKNSWGTSWGINGYMHMQRNSGN
QEGICGINKLASYPTKSSPNPPSPPSPGPSKCSMFTSCGQGETCCCGWRLLGVCVSWKCCGL
DSAVCCKDGRHCCPHDYPICDTSRNLCLKRMSNATIVQQPQKEAFSGKFGGLIYPF
SEQ 103
MCEPESEAARGVLSFLDVDQLFSSNYYGDGRKHDVEICHEQYARENHYHTSYCNVDNDEAIAH
VLQEDLSELSIAEDAESSHADEQYLQASTGVQHWHTPPREYYAGHDTSLEADDVGPSSSCSSP
GDRSYDGEEYTYTLEIQDEFELDGEVGKRINQLSAVPHVPRINGDIPSVDEATSDHQRLLNRLQL
FDLVEHKVQGDGNCQFRALSDQFYRTPEHHKFVRQQVVSQFQHHPEMYEGYVPMEYGEYLTR
MSKSGEWGDHVTLQAAADSYGVKILVITSFKDTCYIEILPKNQKSNRVIYLSFWAEVHYNSIYPQ
GDFLPFDFKKKKKKWSFWNKH
SEQ 104
MPSLLQIFLPLFPFFFLVSFSVSHGPFLPKAIILPVNKDLSTFQYVTQVYMGAHLVPTNLVVDLGG
SFLWTNCGLTSVSSSQKLVPCNSLKCSMAKPNGCTNKICGVQSENPFTKVAATGELAEDMFAV
EFIDELKTGSIASIHEFLFSCASTTLLQGLARGAKGMLGLGNSRIALPSQLSDTFGFQRKFALCLS
SSNGAIISGESPYLSLLGHDVSRSMLYTPLISSKDGVSEEYYINVKSIKINGKKLSLNTSLFAMDEG
VGGTKISTIPPFTTMKSSIYKSFIEAYEKFAISMELNKVEAIAPFELCFSTKGIDVTKVGPNVPTTDL
VLQSEMVKWRIYGRNSMVKVSDEVMCLGFLNGGVNQKASIVIGGYQLEDNLLEFNLGTSMLGF
TSSLSMAETSCSDFMFHSVSKDSAFDS
SEQ 105
MGAKEVLILVLVCMFIVFPSCHGDDECLNPFLVDQNCYVKDYITKLANATETVKWMMKIRRQIHE
NPELAYEEFKTSGLIREELDRMGVKYRWPVAKTGVVATIGSGKPPFVALRADMDALPIQELAKW
EHKSKVDGKMHACAHDAHTAMLLGAAKILQQLRHNLQGTVVLIFQPAEERGHGAKDMIEEGVLE
NVEAIFGMHLVHKYESGVVASRPGEFLAGCGSFKATIRGKGGHAAVPHDSVDPILAASTSVISLQ
SIVSRETDPLESQVVSVAMIEGGHAFNIIPELATISGTYRAFSKKSFYGLRKRIEEVIRAQAAVHRC
TVEIDFDGRENPTLPPTINDERIYEHARKVSKMIVGEESFKIAPSFMGSEDFAVFLEKVPGSFFLL
GTKNEKIGAIYPPHNPHFIIDEDVLPIGAAIHATFAYSYLLNSTNKFTSHSS
SEQ 106
MKLNPYSWTKVASIIFLDLPVGTGFSYARTPTALQSSDLQASDQAYEFLYKWFLDHPEFLKNPLY
VGGDSYSGMVVPIITQIIATKNEMGIKPFVDLQGYLLGNPSTFKGEKNYEIPFAYGMGLISDELYE
SLTRNCKGEYQNTDPSNTQCLQDVHTFQELLKRINNPHILEPKCQFASPKPHLLFGQRRSLNVK
FHQLNNPQQLPALKCRNDWYKLSSHWADDGQVREALHIRKGTIGKWVRCASLQYQKTIMSSIP
YHANLSAKGYRSLIYSGDHDKVVTFLSTQAWIKSLNYSIVDDWRPWIVDNQVAGYTRSYSNRMT
FATVKGAGHTAPEYKPRECLAMLKRLMSYKPL
SEQ 107
MCEPESEATRGVLSFLDVDQLFSSNYYGDGRKHDVEICHEQYARENQYHTSYCNVDSDEAIAH
LLQEELSELSIAEDAESSHADEQYFQASTGVQHWHTPPREYYAGHDTGLEADDVGPSSSCSSP
GDRSYDGEEYTYTLEIQDEFELDGEVGKRINQLSAVPHVPRINGDIPSVDEATSDHQRLLDRLQL
FDLVEHKVQGDGNCQFRALSDQFYRTPEHHKFVRQQVVSQLKHHPEMYEGYVPMEYGEYLKR
MSKSGEWGDHVTLQAAADSYGVKILVITSFKDTCYIEILPKNQKSNRVIYLSFWAEVHYNSIYPQ
GDFLPFDLKKKKKKWSFWNKH
SEQ 108 MPSLLQIFLPLFPFFFFVSFSVSHGPFLPKAIILPVNKDLSTFQYVTQVYMGAHLVPTNLVVDLGG
SFLWTNCGLTSVSSSQKLVPCNSLKCSMAKPNGCTNKICGVQSENPFTKVAATGELAEDMFAV
EFIDELKTGSIASIHEFLFSCASTTLLQGLARGAKGMLGLGNSRIALPSQLSDTFGFQRKFALCLS
SSNGAIISGESPYLSLLGHDVSRSMLYTPLISSKNGVSEEYYINVKSIKINGNKLSLNISLFTMDEE
GVGGTKISTISPFTSMKSSIYRTFMEAYEKIAISVNLTKVESIAPFELCFSTEGIDVTKVGPNVPTM
DLVLQSEMVKWRIYGRNSMVKVSDEVMCWGFLDGGVNQKASIVIGGYQLENNLLEFNLGTSML
GFTSSLSTAETSCSDFMIHSVSKDSAFDS
SEQ 109
MKMSPALSLSVIQFPLCKSQDLSKDTNNPKIFSKETPCQKSYSDTRINRRKLLSGSGLSLVAGTL
AKPARAETEAPIEATSSRMSYSRFLEYLNEGAVKKVDFFESSAVAEIFNPALNKVQRVKVQLPGL
PPELVRKLREKDVDFAAHLPEMNVIGPLLDLLGNLAFPLILLGSLLLRTSSSNTPGGPNLPFGLGR
SKAKFQMEPNTGVTFDDVAGVDDAKQDFQEIVEFLKTPEKFAAVGAKIPKGVLLVGPPGTGKTL
LAKAIAGEAEVPFLSLSGSEFVEMFVGVGASRVRDLFNKAKENSPCLVFIDEIDAVGRQRGTGIG
GGNDEREQTLNQLLTEMDGFTGNTGVIVIAATNRPEILDQALLRPGRFDRQVSVGLPDIRGREEI
LKVHSNNKKLDKDVSLSVIAMRTPGFSGADLANLMNEAAILAGRRGKDKITSKEIDDSIDRIVAGM
EGTKMTDGKNKILVAYHEVGHGVCATLTPGHDAVQKVTLIPRGQARGLTWFIPGEDPTLISKQQ
LFARIVGSLGGRAAEEIIFGEAEITTGAAGDLQQITQIARQMVTMFGMSEIGPWALTDPATQSGD
VVLRMLARNQMSEKLAEDIDASVRHIIERAYEIAKNHIRNNREAIDKLVDVLLEKETLTGDEFRAIL
SEFTNIPSANINSKPIRELIEA
SEQ 110
MEGLHQLKTGELIPLSEQELVDCDVEGEDEGCSGGLLDTAFDFILKNKGLTTEVNYPYKGEDGV
CNKKKSALSAAKITGYEDVPANSEKALLQAVANQPVSVAIDGSSFDFQFYSSGVFSGSCSTWLN
HAVTAVGYGATTDGTKYWIIKNSWGSKWGDSGYMRIKRDVHEKEGLCGLAMDASYPTA
SEQ 111
MGCRMKFLNVVLVVAAVMAAAAAVAFGAEKLPAGVLSLERIFPLNGKMELEEVRARDRARHAR
MLQSFAGGIVNFPVVGSSDPYLVGLYFTKVRLGTPPREYNVQIDTGSDILWVTCSSCDDCPRTS
GLGVELNFYDATISSTASPISCADQVCASIVQTASAECSTETNQCGYSFQYGDGSGTTGHYVAD
LLYFDTVLGTSLIANSSAPIIFGCSTSQSGDLTKTDRAIDGIFGFGQQGLSVISQLSSHRITPKVFS
HCLKGEGNGGGILVLGEILDPRIVYSPLVPSQAHYNVYLQSIAVNGQLVPVDPSVFATSGNRGTI
VDSGTTLAYIATEAYDPFVNAITAAVSPSVRPIISRGKPCFLVSSSIAEIFPPVSLNFDGGASMALR
PSDYLVHMGFVEGAAMWCIGFEKQDQGVTILGDLVLKDKIFVYDLARQRIGWADYDCSSSVNVS
ITSGKDEFINAGQLSVNRASGSLLFNPRHTRTIFHLLSLVLMIGSPFLT
SEQ 112
MTRASIILLLLLIATSIAAAQGGALTFDDDNPIRQVVVSDGLQELENGILQLIGQTRRALSFVRFVR
RYGKRYDSVEEIKQRFEIYLDNLKMIRSHNKQRLSYKLGVNEFTDLTWDEFRRERLGAPQNCSA
TTKSDLQLTNVNLPETKDWREAGIVSPVKKQGKCGSCWTFSTTGALEAAYAQAFGKNISLSEQQ
LLDCAGAFNNFGCHGGLPSQAFEYIKYSGGLDTEEEYPYAGKAGVCKFSSENVAVKVVDSVNIT
KGAEDELKYAIAFIRPVSVAYQVVKGFKQYKGGIYSSTVCGNTPQDVNHAVLAVGYGVDNGTPY
WLIKNSWGAEWGDNGYFKMEMGKNMCGIATCASYPIVA
SEQ 113
MNPEKFTHKTNEALAEAHELAISAGHAQFTPLHMALALISDHNGIFRQAIVNAAGSEETANSVER
VFKQAMKKIPSQTPAPDQIPPSTSLIKVLRRAQSLQKSRRDTHLAVDQLILGLLEDSQIGDLLKEA
GIGAARVKSEVEKLRGKDGKKVESASGDTNFQALKTYGRDLVEQAGKLDPVIGRDEEIRRVIRIL
SRRTKNNPVLIGEPGVGKTAVVEGLAQRIVRGDVPSNLSDVRLIALDMGALIAGAKYRGEFEERL
KAVLKEVEEAEGKVILFIDEIHLVLGAGRTEGSMDAANLFKPMLARGQLRCIGATTLEEYRKYVEK
DAAFERRFQQVYVAEPSVPDTISILRGLKEKYEGHHGVKIQDRALVVAAQLSARYITGRHLPDKAI
DLVDEACANVRVQLDSQPEEIDNLERKRIQLEVELHALEKEKDKASKARLVEVRKELDDLRDKLQ
PLTMRYKKEKERIDELRRLKQKRDELTYALQEAERRYDLARAADLRYGAIQEVEAAIANLESSTD
ESTMLTETVGPDQIAEVVSRWTGIPVSRLGQNEKDKLIGLANRLHQRVVGQDDAVRAVAEAVLR
SRAGLGRPQQPTGSFLFLGPTGVGKTELAKALAEQLFDDDKLMVRIDMSEYMEQHSVARLIGAP
PGYVGHEEGGQLTEAVRRRPYSVVLFDEVEKAHPTVFNTLLQVLDDGRLTDGQGRTVDFTNTVI
IMTSNLGAEYLLSGLMGKCTMEKARDMVMQEVRKQFKPELLNRLDEIVVFDPLSHEQLRQVCR
HQLKDVASRLAERGIALGVTEAALDVILAQSYDPVYGARPIRRWLEKKVVTELSKMLVKEEIDEN
STVYVDAASSGKDLSYRVEKNGGLVNAATGKKSDILIQLPNGVRSDAAQAVKKMKIEEIVDE
SEQ 114
MPEAPKKSFFTLSLVPFLPVYTLIRFNPPIESEPLISSSSDECQHDQKQQSDSRNYIVRFYHYKEP
EDHWNYLQNNLKFKGWQWIERKNPAARFPTDFGLVEIDESMKELLLEKFRKMNLVKDVSLDLS
YQRIVLEEKSEKNGAFANGKKRPGKIFTAMSFSEGQNYAVANTSIMRISWSRHLLMQKSRVTSL
FGAHELWSKGHTGAKVKMAIFDTGIRADHPHFRNIKERTNWTNEDTLNDNVGHGTFVAGVIAG
QDEECLGFAPDAEIYAFHVFTDAQVSYTSWFLDAFNYAIATNMDVLNLSIGGPDYLDLPFVEKVW
ELTANNIIMVSAIGNDGPLYGTLNNPADQSDVIGVGAIDQSNHLASFSSRGMSTWEIPHGYGRVK
PDIVAYGREIMGSKISTRCKRLSGTSVASPVVTGIVCLLVSIIPESK
SEQ 115
MAQMKLSLSLFLSLVLLLAFSPSSFAKVSISSKLASKQAEKLIHELNLFPKESDNIVDRDPFPTAAS
RIVEKRFNFANLTNSSVISFEDLGHHAGYYKIKHSHAARLFYFFFESRGSKDDPVVIWLSGGPGC
SSELALFYENGPFSISNNLSLVRNEYGWDKVSNLIYVDQPTGTGFSYSSDRHDIRHSEAGVSDD
LYDFLQAFFEEHPELVKNDFYITGESYAGHYIPAFAARVHKGNKAKEGIHINLKGFAIGNGLTDPKI
QYAAYTDYALDMGLISKSDHDRINKILPVCEVAINLCGTDGKISCLAAYFVCNSIFSAVRARAGADI
NHYDIRKKCVGALCYDFSNMEKLLNMHSVKQALGVEDIEFVSCSTTVYQAMLVDWMRNLEAGIP
TLLEDGIKLLVYAGEYDLICNWLGNSRWVQAMEWSGQKEFVASPDVPFEVDSSEAGLLKSHGP
LSFLKVHDAGHMVPMDQPKVALEMLKRWIGGTLSQQTTETEDLVASI
SEQ 116
MAIHTSTLSISILVMLMFSAVTSSAEDMSIISYNEKHHTNGESTVWRTDDEIVSLYESWLVEHKKV
YNALGEKDKRFQIFKDNLRYIDEQNSAPEKSYKLGLTQFADLTNEEYKSIYLGTKPDGRSRLSYT
QSDRYAPKVGDSLPDSVDWRKKGVLVDVKNQGQCGSCWAFSAVASIEAVNKIMTGNLISLSEQ
ELVDCDTADNQGCQGGLMDDAFKFVIQNGGIDTEEDYPYKAKDGKCDQARKNAKVVTIDGYED
VPANDEKALKKAVAGQPVSVAIEAGGKDFQHYKSGIFTGKCGAAVDHGVVAVGYGSENGMDY
WIVRNSWGASWGENGYLRMQRNIGNPKGLCGIATIASYPVKTGQNPPKPAPSPPSPVKPPTQC
DDYNECPAGTTCCCVYKYYNYCFAWGCCPMEGATCCKDHNSCCPHDYPVCNVKAGTCSISKN
NPLGVKAMQHILAKPIGTFGNEGKKTPSS
SEQ 117
MACNRLHTELGNWQVNPPSGFNLEPSDYLQRWLIEVNGAPGTLYANETYQLQAEFPEHYPIKA
PQVIFLPPAPLHPDIYRDGHICLDILYDSWSPTMTVSSICISILSMLSSSTVKFPSSEMMDVPLILSK
HVFFSKFKADEDESNNANMVFSPVSIQIIFALIAAGSSGSTLDQLLAFLKFNSVEELNSVYSRVITD
VLADGSPMGGPRLSVTNWAWVDQSLSFKHSFKQVMDNVYKAASASVDFRNKGDEVTGEVNK
WAEEKTNGLIKQILPPVAVNSGTSLILANALYFKGAWTEKLNASDTKDHEFHLLNGGSVQAPLMT
SKKRQYVKAFDGFQVLRLRYKQGEDKRFLNMYVYLPNARDGLPTLLEKISSEPGFLDRHVPYEK
VKVHEFLIPKFKISLGIEALEVLKGLELTLPFKGGLTEMVGENYPLAVANVFHKAFIEVNEEGAEAP
AAKAFHKAFIEVNEEAPVAPAVTVATMMFGCSMMKVEEEIDFVADHPFMFLVKDETAGVVLFVG
TLLNPLAVSPS
SEQ 118
LKVGSFFSSLIYSCNKASPNFYSYSFSLLSCFIELVNMGAKAFLVTILLSSLLFPLALSTSNDGLVRI
GLKKIKFDQNNRLAARVESKEGEAVRASIRKYNNFHGNLGASEDTDIVALKNYMDAQYFGEIGIG
SPPQKFTVIFDTGSSNLWVPSSKCYFSVPCFFHSKYKSSQSSTYKKNGKSAAIRYGTGAISGFFS
QDSVKVGDLIVQNQEFIEATREPSVTFLVAKFDGILGLGFQEISVGNAVPVWYNMVKQGLVKEPV
FSFWLNRNTKEDEGGEIVFGGVDPNHYKGKHTYVPVTRKGYWQFDMGDVLIDGQATGYCDNG
CSAIADSGTSLLAGPTTVITMINHAIGASGVVSQQCKAVVEQYGQTIMDMLLAEAHPKKICSQVG
LCTFDGTRGISMGIESVVDENAGKSSGLHDAMCSACEMAVVWMQNQLRQNQTQERILNYVNEL
CERLPSPMGQSAVDCGKLSGMPSVSFTIGGRTFDLSPEEYILKVGEGPAAQCISGFIALDVPPPR
GPLWILGDVFMGRYHTVFDFGKLRVGFAEAA
SEQ 119
MSKQNLEAPLLDPSPATFNRRKKWSFALCFLFALTAISFIGLRHHGHVGIWLIGDVERYNGKLQQ
NADVVESEQAVVAADDGRCSEIGISMLKIGGHAVDAAVATALCLGVVNPMASGLGGGGFMVVR
SSSTSEVQAIDMRETAPLAASQNMYDNNGKSKLEGALSMGVPGELAGLHAAWSKHGRLPWKT
LFQPAIKLARDGFVVAPYLAHHIASKAKLILKDPGLRQVIAPEGKLLRAGDICHNVKLSHSLELIAE
QGPEAFYNGEVGEKLVEDVKKAGGILTMDDLRNYKVETPEAVTVNAMGYTIVGMPPPSSGTLGI
SLILKILESYNAAEGSLGLHRLIEAMKHMFAFRMDLGDPDFVNISKTVSDMLSPSFAKAIRQKIFDN
TTFPPEYYMPRWSQLRDHGTSHFCIVDSDRNAVSVTTTVNYPFGAGVLSPSTGIVLNDEMGDF
STPSEISPDELPPAPANFIQPKKRPLSSMAPIIVLKDNQLAGVIGGSGGMKIIPAVVQVFINHFILGM
DPLAAVQSPRVYHELIPNVVLYENWTCIDGDHIELSDEKKHFLEERGHQLEAHNGGAICQLIVQN
LPNSHLKLGRRSGKEYKNGVFHGMLVAVSDPRKDGRPAAI
SEQ 120
MLKKISSFNILLNMASHITLCIWLLFFFISIISLAKPETYIIHMDLSAMPKAFASHHNWYLTTLASLSD
SSTNHKEFLSSKLVYAYTNAINGFSASLSPSEFEAIKNSPGYVSSIKDMSVKIDTTHTSQFLGLNS
ESGVWPTSDYGKDIIIGLVDTGIWPESKSYSDYGISEVPSRWKGECESGIEFNSSLCNKKIIGARY
FNKGLLANNPNLNISMNSARDTDGHGTHTSSTAAGSYVEGASYFGYATGTAIGIAPKAHVAMYK
ALWEEGVYLSDVLAAIDQAITDGVDVLSLSLGIDAIPLHEDPVAIAAFAALEKGIFVSTSAGNEGPY
YETLHNGTPWVLTVAAGTVDREFIGALTLGNGVSVTGLSLYPGNSSSSESSIVYVECQDDKELQ
KSAHNIVVCLDKNDSVSEHVYNVRNSKVAGAVFITNITDLEFYLQSEFPAVFLNLQEGDKVLEYIK
SNSAPKGKLEFRVTHIGAKPAPKVATYSSRGPSPSCPSILKPDLMAPGALILASWPQQSPVTDVT
SGKLFSNFNIISGTSMSCPHASGVAALLKAAHPEWSPAAIRSAMMTTSNAMDNTQSPIRDIGSKN
AAATPLAMGAGHIDPNKALDPGLIYDATPQDYVNLLCALNFTSKQIKTITRSSSYTCSNPSLDLNY
PSFIGFFNGNSSESDPRRIQEFQRTVTNIGDGMSVYTAKLTTMGKFKVNLVPEKLVFKEKYEKLS
YKLRIEGPLVMDDIVVYGSLSWVETEGKYVVRSPIVATSIKVDPLTGHN
SEQ 121
MEFYQKLATCSHLSLLCFILLHSIQVQGSYFDQEYGKQVLSSAIQDKDWLVSIRRIIHEYPELRFQ
EYNTSALIRTELDKLGIYYEYPFAKTGLVALIGSSSPPVVALRADMDALPLQELVEWEHKSKVTGK
MHGCGHDAHTAMLLGAAKLLNERKDKLNGTVRLVFQPAEEGGAGAYHMINEGALGDAEAIFGM
HVDFKRPTGSIGTSPGPILAAVSFFEAKIEGKGGHAAEPHATVDPILAASFAVVALQQLISREVDP
LHSQVLSVTYVRGGSASNVIPPYVEFGGTLRSLTTEGLLQLQKRVKEVIEGQAAVHRCKAYIDMK
EEDFPAYPACINDERLHQHVGRVGKLLLGSENIKETEKVMAGEDFAFYQELIPGVMFQIGIRNEK
LGSTHAPHSPHFFLDEDVLPIGAALHTAIAEMYLNDYQHPIAV
SEQ 122
RHYIYGKLTSNMKTFGIPLAAHSRVLTGSYIRSLYLQILTPFLVHTTAQADNLNCDRSATLNCDRS
ATEVCTDSEVSTDMEPGNSIVNGVPESIAEEDTAEPLDMDFEFYLSDDKATFKGSEIVMNEPLQS
TDISGRLNVLVSWSPKMLEQYNTGLFSSLPEVFKSGFFAKRPQESVSLYKCLEAFLKEEPLGPE
DMWYCPACKQHRQATKKLDLWRLPEILVIHLKRFSYNRFLKNKLETYVDFPTHDLDLSSYLAYK
DGKSSYRYMLYAISNHYGSMGGGHYTAFVHQGADRWYDFDDSHVYPISQDKLKTSAAYVLFYR
RVEEI
SEQ 123
MSRNSLKIHLSIGKIQPGSENKNGSPVYTDSGTCEHLSELRSRVGSNPFFNFRGCVKVRPLGRA
SIRREPPNELVRCGACGQAPPRLYACVTCAAVFCRVHAPSHPVGNASDPSLHSIAVDIDRAELF
CCGCRDQVYDRDFDAAVVLAQTEATVIGSIQDPPPQPENTRKRRRVEYKPWTPDVKEQVLIVG
NSSPLPSQLGNDSTTPEVQWGLRGLNNLGNTCFMNSVLQALLHTPPLRNYFLSDKHNRYFCQR
KNNSVITRSSSDNGNKNSTMLCLACDLDAMFSAVFSGDWTPISPAKFLYSWWKHASNLASYEQ
QDAHEFFISVLDGIHERMQNDKGKALSPGSGDCCIAHRVFSGILRSDVMCTACGFTSTTYDPCID
ISLDLELSQGSSAKMTSKKSHNTHKKEAESGKFSQNGRISTLMGCLDHFTRPEKLGSDQKFFCQ
HCQVRQESLKQMSIRKLPLVSCFHIKRFEHSVIKKMSRKVDHYLQFPFSLDMSPYLSSSILRSRF
GNRIFSFDGDEQDASCESSSEFELFAVITHTGKLDAGHYVTYLRLSNQWYKCDDAWITQVSESI
VRAAQGYMMFYVQKMLYYKASENQVS
SEQ 124
MATHSSTLTISISLLLLLFFFFFSTLSSASDMSILTYDENQHFRTDDEVMSLYESWLLEHGKSYNA
LDEKDKRFQIFKDNLRYIDEQNSVPNKSYKLGLTKFADLTNEEYRSMYLGTKTSDRRRLLKNKSD
RYLPKVGDSLPDSVDWREKGVLVGVKDQGSCGSCWAFSAIASVEAVNSIVTGDVISLSEQELVD
CDTSYNDGCNGGLMDYAFDFIIKNGGIDTEEDYPYTGRDGRCDQSRKNAKVVTIDGYEDVPAN
NEKALQKAVANQPVSIAIEAGGHDFQHYVSGIFTGKCGTAVDHGVVAVGYGSENGMDYWIIRNS
WGASWGEKGYLRVQRNVASSKGLCGLAIEPSYPVKTGVNPPKPGPSPPSPIKPPTQCDDYAQC
PEGTTCCCVFEYYNSCFSWGCCPLEGATCCEDHYSCCPHDYPVCNIRAGTCSISKDNPLGVKA
MKHIHAEPIEAFINGGRKSSS
SEQ 125
MKKLFLVLFSLALVLRLGESFDFHEKELETEEKLWELYERWRSHHTVSRSLDEKDKRFNVFKAN
VHYVHNFNKKDKPYKLKLNKFADMTNHEFRHHYAGSKIKHHRSFLGASRANGTFMYANVEDVP
PSVDWRKKGAVTPVKDQGKCGSCWAFSTVVAVEGINQIKTNELVSLSEQELVDCDTSQNQGC
NGGLMDMAFEFIKKKGGINTEENYPYMAEGGECDIQKRNSPVVSIDGYEDVPPNDEDSLLKAVA
NQPVSVAIQASGSDFQFYSEGVFTGDCGTELDHGVAIVGYGTTLDGTKYWIVRNSWGPEWGEK
GYIRMQREIDAEEGLCGIAMQPSYPIKTSSSNPTGSPATAPKDEL
SEQ 126
MARPQFTVILAIISLLIHYGVVSGFRLSDVTNGSSVFLPSPADGSRHTTMLLPLFPPKDTSRRAEIS
RRHLQKSPASARMSLHDDLLLNGYYTTHIWIGTPPQKFALIVDTGSTVTYVPCSECKKCGNHQD
PKFQPEMSSTYQSVKCNKACPCDHKRQQCIYERRYAEMSASYGLLGEDIISFGNLSELAPQRAV
FGCEIAETGDLYSQRADGIMGLGRGDLSIVDQLVEKHVISDSFSLCYGGMDFGGGAMVLGGVK
PPADMAFTKSDFGHSPYYNIDLKEIHVAGKPLNLNPRVFGGKHGTILDSGTTYAYLPEAAFAAFK
NAVVKELHSLKQIEGPDPSFKDICFSGAGSNISELSKNFPRVDMVFSDGKKLTLSPENYLFQHFK
VRGAYCLGIFPNGKNPASLLGGIVVRNTLVTYDRENKRIGFWKTNCSELWDRLNLSPPSPPSPS
VSSLDNTNSSAHLSPSSAPSGPPGYNTPVEIKVGLITFYLSLSVNCSELKPRIPELAHFIAQELDV
NVSQVGF
SEQ 127
MGAKSFLVAFFLSLLLFPLAFCTSNDGLVRIGLKKIKFDQNNRLAARVESKEGEALRASFRKYNN
LRGNLGASEDTDIVALKNYMDAQYFGEIGIGSPPQKFTVIFDTGSSNLWVPSSKCYFSVPCLFHS
KYKSSQSSTYKKNGKSAAIRYGTGAISGFFSQDSVKVGDLVVKNQEFIEATREPSVTFLVAKFDG
ILGLGFQEISVGNAVPVWYNMVKQGLVKEPVFSFWLNRNTEEDEGGEIVFGGVDPNHYKGKHT
YVPVTRKGYWQFDMGDVLIDGQATGYCDNGCSAIADSGTSLLAGPTTVVTMINHAIGASGVVSQ
QCKAVVEQYGQTIMDMLLAEAHPKKICSQVGLCTFDGTRGVSMGIESVVDENAGKSSGLHDAM
CSACEMAVVWMQNQLRQNQTQERILNYVNELCERLPSPMGQSAVDCGKLSGMPSVSFTIGGR
TFDLSPEEYILKVGEGPAAQCISGFIALDVPPPRGPLWILGDVFMGRYHTVFDSGKLRVGFAEAA
SEQ 128
MVVAFVGIAKSIGQQCLRRSKPYSYSYFSSYVRSSNSKYGLQNWQFQSHRTLILQSASESVKLE
RLSDSDSGILEVKLDRPEARNAIGKDMLRGLQQAFEAVSNERSANVLMICSSVPKVFCAGADLK
ERKTMILSEVQDFVSTLRSTFSFLEGLHIPTIAAIEGIALGGGLEMAMSCDIRICGEDAVLGLPETG
LAVIPGAGGTQRLPRLVGKSIAKDIIFTGRKISGKDAVSIGLVNYCVPAGEARLKTLELARDINQKG
PVALRMAKCAIDKGVELNMESALALEWDCYEQLLDTKDRLEGLAAFAERRKPRYKGE
SEQ 129
MCSSNSLYINPKPCKHLADYKVKNGMSGYSLIQECFKTTPYGRTTLEISKSELPRCSICSGHEGR
FYMCLICSSVLCCLSPESNHALLHSQCKAGHEISVDMERAELYCSVCCDQVYDPDFDKVVMCK
HIMGFPRTEIGVVESELRLSKRRRLSFGMDLDSKNMKTLFLRRDQKSKSCFPLVLRGLNNLGNT
CFMNSVLQVLLHAPPLRNYFLSDRHNRDICRKMSSDRLCLPCDIDLIFSAVFSGDRTPYSPARFL
YSWWQHSENLATYEQQDAHEFFISVMDRIHDKEGKASLATKDNGDCQCIAHRTFYGLLRSDVT
CTSCGFTSTTHDPCMDISLDLNSCNSSPKDFANKSSKPNESLVGCLDLFTRPEKLGSDQKLYCE
NCQEKQDALKQMSIKKLPLVLSFHIKRFEHSPTRKMSRKIDRHLQFPFSLDMKPYLSSSIVRKRY
GNRIFSFDGDESDISTEFEIFAVVTHSGMLESGHYVTYLRLRNQWYKCDDAWITEVDEEVVRAS
QCYLMYYVQKMLYHKSCEDVSCQPMSLRADTFVPIAGCC
SEQ 130
MKELHSLREIEGPDPNYKDICFSGAGSDISELSKSFPPIDMVFSNGKKLSLTPENYLFRHSKVRG
AYCLGIFQNGKDPTTLLGGIVVRNTLVTYDRENERIGFWKTNCSELWDRLNLSPSPPPPPLPSGL
DNTNSSANLTPALAPSLPLEHAPGKIKIGLVSFDMSLSVDYSALKPRVPELAHFIAQELEVNVSQV
HLMNFSTEGNDSLIRWAIFPAGSANYMPNATATEIINRLAENRFHLPDTFGSYKLVKWDIEPPPK
RIRWQQNYLVVVFALLVVLIIGLSASLGWLIWRRRQEIPYNPVGSAETHEKELQPLN
SEQ 131
MVTVSVKWQKEVYPAVEIDTSQPPYVFKAQLYDLTGVPPERQKIMVKGGLLKDDADWSKVGVK
EGQRLMMMGTADEIVKAPEKGPVFAEDLPEEEQVVNVGHSAGLFNLGNTCYMNSTVQCLHSV
PELKSALTEYNQLGRSNDLDHSSHLLTVATRDLFNDLDKNVKPVAPMQFWTVLRKKYPQFGQQ
SNGAFMQQDAEECWTQLLYTLSQSLKSPNSSGSPDIVKALFGIEFDNRIHCAESGEESTETETV
YSLKCHISQEVNHLHEGLKRGLKSELEKASPSLGRSAVYVKDSRINGLPRYLTIQFVRFFWKRES
NQKAKILRKVDYPLSLDVYDFCSEDLRKKLEGPRQVLRDAEGKKAGLKTSEKTSSSTDGDVKMT
EAEESSSGSGEASKTTQEGVLPEKEHHLTGIYDLVAVLTHKGRSADSGHYVAWVKQENGKWV
QFDDDNPIPQREEDIPKLSGGGDWHMAYICMYKARVVPM
SEQ 132
MEKKKEVIRLERESVIPVLKPRLIMALADLIEHSSDRAEFLKLCKRVEYTIHAWYLLQFEDLMQLYS
LFDPVNGAKKLEQQKLSPEEIDILEQNFLTYLFQIMHKSNFKIASDEEIDVAHSGQYLLNLPITVDE
SKLDKKLLEKYFAEHPHEDLPEFADKYVIFRRGIGIDRTTDYFFMEKVDMIIGRTWAWILRKTRID
RLFSRRSSSRRKKDPKKDDEINSEAEDHDLYVERIRIENMELSARSNQFSLHQVK
SEQ 133
MELTCSSPLSVNSTISFNPQLRRYGSVYPHKRCQTVFSLFPYCPSSSSHITITTATTAACSTSSST
SSLFGISLSHRPCSSIPRKIKRSLYIVSGVFERFTERSIKAVMFSQKEAKALGKDMVYTQHLLLGLI
AEDRSPGGFLGSRITIDKAREAVRSIWHDDVEDDKEKLASQDSGSATSATDVAFSSSTKRVFEA
AVEYSRTMGHNFIAPEHMAFGLFTVDDGNATRVLKRLGVNVNRLAAEAVSRLQGELAKDGREPI
SFKRSREKSFPGKITIDRSAEKAKAEKNALEQFCVDLTARVSEGLIDPVIGREIEVQRIIEILCRRTK
NNPILLGQAGVGKTAIAEGLAINIAEGNIPAFLMKKRVMSLDIGLLISGAKERGELEGRVTTLIKEVK
KSGNIILFIDEVHILVGAGTVGRGNKGSGLDIANLLKPALGRGELQCIASTTMDEFRLHIEKDKAFA
RRFQPVLINEPSQADAVQILLGLREKYESHHKCIYSLEAINAAVQLSARYIPDRYLPDKAIDLIDEA
GSKSRMQAHKRRKEQQISVLSQSPSDYWQEIRAVQAMHEVILASKLTENDDASRLNDGSELHL
QPASPSTSDEDEPPVVGPEEIAAVASLWTGIPLKQLTVDERMLLVGLDEQLKKRVVGQDEAVAAI
CRAVKRSRTGLKDPNRPISAMLFCGPTGVGKSELAKALAASYFGSESAMLRLDMSEYMERHTV
SKLIGSPPGYVGYGEGGTLTEAIRRKPFTVVLLDEIEKAHPDIFNILLQLFEDGHLTDSQGRRVSF
KNALIVMTSNVGSTAIVKGRQNTIGFLLADDESAASYAGMKAIVMEELKTYFRPELMNRLDEVVV
FRPLEKPQMLQILDLMLQEVRARLVSLEISLEVSEAVMELICQQGFDRNYGARPLRRAVTQMVE
DLLSESFLSGDLKPGDVAIINLDESGNPVVANKSTQSIHLSDANGNPVVTNR
SEQ 134
MKNIERLANVALLGLSLAPLVVNVDPNVNVIVTACLTVFVGCYRSVKPTPPSETMSNEHAMRFPL
VGSAMLLSLFLLFKFLSKDLVNAVLTCYFFVLGIAALSATLLPAIRRFLPKKWNDDLIIWHFPYFRS
LEIEFTRSQIVAAIPGTIFCVWYAKQKHWLANNVLGLAFCIQGIEMLSLGSFKTGAILLAGLFVYDIF
WVFFTPVMVSVAKSFDAPIKLLFPTADAKRPFSMLGLGDIVIPGIFVALALRFDVSRGKGPQYFKS
AFLGYTFGLALTIFVMNWFQAAQPALLYIVPAVIGFLAVHCIWNGDVKPLLEFDEGKTKGAEEAD
AKESKKVE
SEQ 135
MAFSSSYFSFIFLILLFIISFVVGEIKPIYLPGTYQSSLEKQHVKSKIPFKVHYFPQILDHFTFLPKSS
KVFKQKYLINDNYWKQGGPIFVYTGNEGNIDWFAANTGFMLDIAPKFHALLVFIEHRFYGDSMPF
GKKSYKSPKTLGYLNSQQALADYAVLIRSLKQNLSSESSPVVVFGGSYGGMLASWFRLKYPHIAI
GAVASSAPILQFDKITPWSSFYDAVSQDFKEVSLNCYRVIKGSWTELDALSKHEEGLTEVSKLFR
TCKGLHSVYSARDWLWEAFVYTAMVNYPTKANFMMPLPAYPVQEMCKIIDGLPKGASKISRAFA
AASLYYNYTKREKCFNLEGGDDAHGLRGWDWQACTEMVMPMTCSNESMFPPSSYSYKEFKE
DCKKKYGVEPRPHWITTEFGGYRIEQVLKRFGSNMIFSNGMQDPWSRGGVLKNISASIVALVTQ
KGAHHVDFRSETKNDPGWLIMQRKQEVAIIQKWLEEYYRDLKQN
SEQ 136
MSRFSLLLALVVAGGLFASALAGPATFADENPIRQVVSDGLHELENAILQVVGKTRHALSFARFA
HRYGKRYESVEEIKQRFEVFLDNLKMIRSHNKKGLSYKLGVNEFTDLTWDEFRRDRLGAAQNC
SATTKGNLKVTNVVLPETKDWREAGIVSPVKNQGKCGSCWTFSTTGALEAAYSQAFGKGISLSE
QQLVDCAGAFNNFGCNGGLPSQAFEYIKSNGGLDTEEAYPYTGKNGLCKFSSENVGVKVIDSV
NITLGAEDELKYAVALVRPVSIAFEVIKGFKQYKSGVYTSTECGNTPMDVNHAVLAVGYGVENGV
PYWLIKNSWGADWGDNGYFKMEMGKNMCGIATCASYPVVA
SEQ 137
MEKEHKYSLFLTKLKLFFLVTLSTFHGLSHGFQMDQARTLMSWRRSKMHAQTTTYATNEDETE
NLVFSDEKHVGNMEDDLIKDGLPAQPSNVMFKQYAGYVNVDVKNGRSLFYYFAEASSGNASSK
PLVLWLNGGPGCSSLGFGAMLELGPFGVNPDGKTLYSRRFAWNKVANVMFLESPAGVGFSYS
NTTSDYSKSGDKRTAEDAYRFLVNWFKRFPHYKGRDFYIMGESYAGFYVPELADIIVKRNMLPT
TNFYIQFKGIMIGNGIMNDETDEKGTLDYLWSHALISDETHRGLLQHCKTETETCQHFQNIAEAEL
GNVDPYNIYGPQCSINSKSRSSSPKLKNGYDPCEQQYVQNYLNLPHVQKALHANLTNLPYLWN
PCSNLDWKDTPATMFPIYKRLIASGLRILLYSGDVDAVVSVTSTRYSLSAMNLKVIKPWRPWLDD
TQEVAGYMVVYDGLAFATVRGAGHQVPQFQPRRAFALLNMFFANHS
SEQ 138
MANSYTSINFFLAPIIFLAILGLQLQSSDGFGTFGFDIHHRYSDPVKGILDLHGLPEKGSVEYYSA
WTQRDRFIKGRRLAEADTANSTPLSFSGGNETFRLSSLGFLHYANVTVGTPGLSFLVALDTGSD
LFWLPCDCSNCVRALETRSGRRINLNIYSPNTSSTGQIVPCNSTLCGQRRRCLSSQNACAYGVA
YLSNNTSSSGVLVEDILHLETDNAQQKSVEAPIALGCGIRQTGAFLSGAAPNGLFGLGLENISVPS
MLASKGLAANSFSMCFGPDGIGRIVFGDKGSPAQGETPLNLDQLHPTYNISLTGITVGNKITDVD
FTAIFDSGTSFTYLNDPAYKVITENFDSQAKQPRIQPDGEIPFEYCYGLSANQTTFEVPDVNLTMK
GGNQLFLFDPIIMLSLQDRSGAYCLAVVKSGDVNIIGQNFMTGYRVVFDREKMVLGWKPSDCYD
SRGSNDKSTTLPVNKRNSTEAPSPSSVVPEATKGNGSGNEPATSFPSVQSSKPAANQAPAHFI
CQLMMALFSLFSYYLIIISS
SEQ 139
MAIHTSTLSISILVMLMFSVVSSSAAEDMSIISYNEKHHTNGESTVWRTDDEVMSLYESWLVEHK
KVYNALGEKDKRFQIFKDNLRYIDEHNSVPDKSYKLGLTQFADLTNEEYKSIYLGTKPDGRSRLL
NTQSDRYAPKVGDSLPDSVDWRKKGVLVDVKNQGQCGSCWAFSAVASIEAVNKIVTGNLISLS
EQELVDCDTSDNQGCQGGLMDDAFKFVIQNGGIDTEEDYPYKAKDGKCDQARKNARVVTIDGY
EDVPDNDEKALKKAVAGQPVSVAIEAGGKDFQHYKSGIFTGKCGAAVDHGVVAVGYGSENGM
DYWIVRNSWGASWGEKGYLRMQRNIGNPKGLCGIATIASYPVKTGQNPPKPAPSPPPVKPPTQ
CDDYNECPAGTTCCCVYEYYKYCFAWGCCPMEGATCCKDHNSCCPHDYPVCNVKAGTCSISK
NNPLGVKAMQHILAKPIGTFGNEGKKSPSS
SEQ 140
MEIKILLASLVIWYITCINVYADDMVRIELKRQSLDLSSISDARIYAKDLRGRNRNLAAPNDQIVYLK
NYHDVQYFAEIGIGSPPQRFIVVFDTGSSNLWVPSSRCFFSIACYLRSRYKSRLSNTYTKIGKSSK
IPFGTGSVHGFFSQDNVKVGGAVLKQQVFTEVTREGYLTLLRARFDGVLGLGFDQSTTSRNVTP
VWYNMLLQHMVTKSIFSFWLNRDPTSKIAGEIIFGGMDWTHFRGQHTYVPVAQNGYWEIEIGDL
FIGSNSTGLCKDGCPAIVDTGTSFIAGPTTILTQINHAIGAEGIISLECKKVVSSYGDSIWERLIAGL
QPENVCNRIGLCTNNGSLCSSCEMIVFWIQVEIRKERSKEKAFQYANQLCEKLPNPGGKSFINC
DVFALPHITFTIGDKSFPLSPDQYVIRVDDSQGVHCISGFTTLNAHPRRPLWVLGDAFLRAYHTV
FDFGSSQIGFAESA
SEQ 141
MASIFALSLFFIIISFCITSITIPVQSDGHETFIIHVSKSDKPRVFATHHHWYSSIIRSVSQHPSKILYT
YSRAAVGFSARLTAAQADQLRRIPGVISVLPDEVRHLHTTHTPTFLGLADSFGLWPNSDYADDVI
IGVLDTGIWPERPSFSDEGLSPVPSSWKGKCATGPDFPETSCNKKIIGAQMFYKGYEASHGPMD
ESKESKSPRDTEGHGTHTASTAAGSVVANASFYQYAKGEARGMAIKARIAAYKICWKNGCFNS
DILAAMDQAVNDGVHVISLSVGANGYAPHYLLDSIAIGAFGASEHGVLVSCSAGNSGPGAYTAV
NIAPWILTVGASTIDREFPADVILGDNRIFGGVSLYSGDPLTDAKLPVVYSGDCGSKYCYPGKLD
HKKVAGKIVLCDRGGNARVEKGSAVKQAGGVGMILLNLADSGEELVADSHLLPATMVGQKAGD
KIRHYVKSDPSPTATIVFRGTVIGKSPAAPRVAAFSSRGPNHLTPEILKPDVIAPGVNILAGWTGS
VGPTDLDIDTRRVEFNIISGTSMSCPHASGLAALLKRAHPKWTPAAVKSALMTTAYNLDNSGKVF
TDLATGQESTPFVHGSGHVDPNRALDPGLVYDIETSDYVNFLCSIGYDGDDVAVFVRDSSRVNC
SEQNLATPGDLNYPSFSVVFTGESNGVVKYKRVMKNVGKNTDAVYEVKVNAPSSVEVSVSPAK
LVFSEEKKSLSYEISFKSKSSGDLEMVKGIESAFGSIEWSDGIHNVRSPIAVRWRHYSAASI
SEQ 142
MPSSLFLTLLLASISLSFSSTLNSNDDEFFLSSTPKFPLTMAEKLIRQLNLFPKHDINKAAATGDSE
QRLFERKLNLSYVGNSGSTVQDLGHHAGYYRLPHTKDARMFYFFFESRSRKNDPVVIWLTGGP
GCSSELAVFYENGPFKIADNMSLVWNDFGWDKVSNLIYVDQPTGTGFSYSSNDDDIRHDERGV
SNDLYDFLQAFFKAHPQYAKNDFYITGESYAGHYIPAFASRVHQGNKNKEGIYVNLKGFAIGNGL
TDPEIQYKAYTDYALDMKLIKKSDYNAIEKSYPKCQLAIKLCGKDGGTACMAAYLVCTSIFNKIMDI
AGDKNYYDVRKRCEGDLCYDFSKMETFLNDQQVKKALGVGDIEFVSCSSEVYQAMQLDWMRN
LEEGIPSLLEDGIKLLVYAGEYDLICNWLGNSRWVHAMKWSGQKAFGKATQVSFAVDGVEKGV
QKNYGPLTFLKVHDAGHMVPMDQPKAALEMLHRWMQDKLSKQGHLAPM
SEQ 143
MLVISDCYINSCKAFNFVINLPVMGHSHSHSSHSHSHFHSSKSSDDQNMDMGESITTQTDVSFM
LAKHVFSKEVKGDSNLVFSPLSIQIVLGLIAAGSKGPTKDQLLCFLKSKSIDELNSLYSHFVSVVFV
DGSPNGGPRLSVVNGVWIDQTLPFKPSYKKVVDKVYKAASNSVDFQCKAAEVANQVNQWAKM
KTNNLIKEILPHGTVNNMTRLIFANALYFKGVWNDKFNASETKDHKFHLLSGGSIKAPFMTSKNK
QYAVAFDGFKVLGLHYKQGKDMRRFCMYLILPDARDELPALLDKISSEPGFIDHHIPFEKAKMRK
FLIPKFKTTFGFEASKVLKGLGLTLPFSSGGLTEMVDSPLAGRLFVSQIFHKSFIEVNEEGTEAAA
VTASVIMTKSLIIEKEMEFVADHPFLFLIRDESTGAVFFIGSVLNPLAG
SEQ 144
MLRIGPSLRTARKLLNRNLHFQSPIIAGDVAPVHHRRQELHRFVRRCNYSSTVGNTSASASFFST
LNNSNSSTTSTTPHVERAEENDSLQSNASEVEPVAAVEQRLSSGMVDAYLAIELALDSVVKIFTV
SSSPNYFLPWQNKSQRETTGSGFVIRGKRILTNAHVVADHTFVLVRKHGSPTKYRATVQAVGHE
CDLAILVVESEEFWEGMNSLELGDVPFLQEAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGA
SQLLAIQIDAAINPGNSGGPAIMGDKVAGVAFQNLSGAENIGYIIPVPVIKHFIAGVEERGEYIGFC
SLGLSCQPTENAQIREYFQMQSKLTGVLVSRINPLSDASRVLKKDDIILSFDGVPIANDGTVPFRN
RERITFDHLVSMKKPNETAELKVLRNGKVHDFKITLHPLQPLVPVHQFDKLPSYFIFAGLVFIPLTQ
PFLHEYGEDWYNASPRRLCERALRELPKKPGEQFIILSQVLMDDINAGYERLAELQVKKVNGVE
VLNLKHLRQLVEDGNQKNVRFDLDDEKVIVLNYESARIATSRILKRHRIPHAMSSDLTDDENAVEL
QSACSS
SEQ 145
ICREPPNELVRCGACGHAPPRLYACVTCAEVFCRVHAPSHPAGNAADPSLHCIAVDIDRAELFC
CGCRDQVYNSDFDAAVALAQTEATVIGSIQDPPPHPESTRKRRRVEYKPWTPDVKEQVLIVGNS
SPLPSQLGNDSTTPEVQWGLRGLNNLGNTCFMNSVLQALLHTPPLRNYFLSDKHNRYFCQRKN
SSVITRSSSDNGNKNSTMLCLACDLDAMFSAVFSGDRTPISPAKFLYSWWKHASNLASYEQQD
AHEFFISVLDGIHERMQNDKGKALSPGSGDCCIAHRVFSGILRSDVMCTACGFTSTTYDPCIDISL
DLELSQGSSSKMTSKKSHNTHKKEAESGKFSQNGRISSLMGCLDHFTRPEKLGSDQKFFCQHC
QVRQESLKQMSIRKLPLVSCFHIKRFEHSVIKKMSRKVDHYLQFPFSLDMSPYLSSSILRSRFGN
RIFSFDGDEQDASCESSSEFELFAVITHTGKLDAGHYVTYLRLSNQWYKCDDAWITQVSENIVRA
AQGYMMFYVQKMLYYKASEKQVS
SEQ 146
MEGSPVLGEHAELIGVLSRPLRQRATAAEIQMVIPWEAITSACGSLLKEELQTRRKIHFDNGNLIS
VKNESPSNNIRNGPSNDTREHLLIDPVPPSLIEKAMTSICLITVDDGAWASGVLLNKQGLLLTNAH
LLEPWRFGKTSVNGSGYNTKSDVVLIPSDQSEHPGVEKFDIQRRNKHLIQKELKTPHFLVDNEQ
GSFRVNLAKTGSRIIRVRLDFMDPWVWTNAKVVHVSRGPLDVALLQLELVPDQLCPITADFMCP
SPGSKAYILGHGLFGPRCDFLPSACVGAIAKVVEAKRSLLNQSSLGEHFPAMLETTAAVHPGGS
GGAVVNSEGHMIALVTSNARHGGGTVIPHLNFSIPCAALEPIFKFVEDMQNLSLEYLDKPDEQLS
SVWALTPPLSSKQSPSMLHLPMLPRGDSDGDTKGSKFAKFIADSEAMLKSATQLGKVERLSNKL
VHSKL
SEQ 147
MLKALTSSCLQNRFHAVTTAFTPQVRRGTDSNTPLLRVLGSLRSSNRRGPYLSRRFFCSDSTD
GSESNSEAAASEAKPAEKGGDADSKASAAIVPTVFKPEDCLTVLALPLPHRPLFPGFYMHIYVKD
PKVLAALLESRKRQAPYAGAFLMKDEQGTDPNVVSASDTEKNIYELKGKDMLNRLHEVGTLAQI
TSIKDDQVILIGHRRIRMTEVVSEEPLTVKVDHLKEQPYNKDDDVIKATSFEVLSTLRDVLKTSSL
WKDHVQTYIQ
SEQ 148
MERKHLWAALLLLAIACFVFPASSDSLLRISLKKRQLDISSLNVANVARLEDRYGKHVMKDIEKKK
KKKKSDTNSDIVSLKNYLDAQYYGDISIGSPPQNFTVIFDTGSSNLWVPSSRCYFSIACWIHSKYK
ARKSSTYTKKGESCSIHYGSGSISGFLSQDNVQVGDLVVTDQVFIEATRESSVTFIVAKFDGILGL
GFKEIAVGNTTPVWYNMVKQDLVKEPVFSFWLNRDINAKEGGELVFGGVDPKHFKDKHTYVPL
TQKGYWQFKMGDFSIGNQSTGFCEGGCAAIVDSGTSLLAGPTAVVTQVNHAIGAEGVLSMECK
ETISQYGEMIWDLLVSGVTPDQICLQVGLCYLNGAQHLSSNIRSVVEKENEGSSIGEAPLCTACE
MAVIWMQNQLKQKTTKESVLEYVNQLCEKLPSPMGQSVIDCNSISSMPNVTFNIGDKDFVLTPD
QYILKTGEGIATICLSGFVALDVPPPRGPLWILGNVFMGVYHTVFDYGNLQLGFAEAA
SEQ 149
SRSYYNILLLQYLFLFVMALILGWKILFILLFVIIGMCTSQVTSRNIQALSMLEKHELWMSSHGRTY
KNEAEKEKRLNIFKENVKFIESFNNNGTKKPYKLGINAFADLTAEEFLSYYTTGLKLSNSYSQIQS
SFKYENLSDVPSVMDWRKSGAVTRIKHQGQCGCCWAFSAVAALEGANKLSTNNLISLSEQQLL
DCTTENNGCNGGLMTTAYDFIIQNGGIATESNYPYEEYQDSCKSQEMNSAVKINRYETLPSTES
ALLKAVAKQPVSIGIAVNEDFHLYQNGVYNGNCEGQELNHAVTVIGYGTENDGTKYWLIKNSWG
TSWGENGYMKIARDTGIEGGLCGITTLASYPVL
SEQ 150
MGLPEVVDVARNYAVMVRIQGPDPKGLKMRKHAFHLYNSGKTTLSASGMLLPSSFVNASVSKQ
IQGESKLHSFGGHFLVLTVASVIEPFVVQQDRGDISKDKPELIPGAQIDILWEGGNTLQNDIKVTN
KEGLNWLPAELLRVVDIPVSSAAVQSLVEGSSSSIEHGWEVGWSLAAYGNSRQSFTNTKRTQV
EKISFPSQTPMMEAQSSLPSVIGTSTTRIALLRVSSNPYEDLPALKVATWSRRGDLLLGMGSPFG
ILSPSHFFNSISVGSIANSYPPSPQNKALLIADIRCLPGMEGSPVLGEHAELIGVLSRPLRQRATAA
EIQMVIPWEAITSACGSLLKEELQTRRKIHFGNGNLISVKKESFSNNIQDGHANDTQEHLQIDPVP
PSLIEKAMTSICLIAVDDGAWASGVLLNKQGLLLTNAHLLEPWRFGKTSVNGSGYNTKSDVVLIP
SDQSEHPGVEKFDIQRRNKHLIQKELKTPHFLVDNEQCSFRVNLANTGSRTIRVRLDFMDPWV
WTNAKVVHVSRGPLDVALLQLELVPDQLCPIIVDFMCPSPGSKAYILGHGLFGPRCDFLPSACV
GAIAKVVEAKRPLLNQSSLGGHFPAMLETTAAVHPGGSGGAVVNSEGHMIALVTSNARHGGGT
VIPHLNFSIPCAALEPIFKFAEDMQNLSLEYLDKPDEQLSSVWALTPPLSSKQSPSMLHLPMLPR
GDSDGDTKGSKFAKFIADSEAMLKSATQLGKVERLSNKLVHSKL
SEQ 151
MDNPSEDSSDSPQQQPESPVNDDQRVYLVPYRWWKEAQESSPSDGKSVTLYAAAPAPSYGG
PMKIINNIFSPDVAFNLRREEESLSQSQENGEVGVSGRDYALVPGDIWLQALKWHSNSKAAAKN
GKSFSATDEDIADVYPLQLRLSVLRETSSLGVRISKKDNTVECFKRACRIFSVDTEPLRIWDLSGQ
TALFFSDENNKILKDSQKQSEQDMLLELQVYGLSDSVKNKVKKDEMSMQYPNGSSFLMNGTGS
GITSNLTRSSSSSFSGGPCEAGTLGLTGLQNLGNTCFMNSALQCLAHTPKLVDYFLGDYKREIN
HDNPLGMNGEIASAFGDLLKKLWAPGATPVAPRTFKLKLAHFAPQFSGFNQHDSQELLAFLLDG
LHEDLNRVKNKPYVEAKDGDDRPDEEIADEYWNNHLARNDSIIVDVCQGQYRSTLVCPVCKKVS
IMFDPFMYLSLPLPSTSMRSMTVTVIKNGSDIQISAFTITVSKDGRLEDLIRALSTACSLDADETLL
VAEIYNNRIIRYLEEPADSLSLIRDGDRLVAYRLHKGTEEAPLVVFTHQQIDEHYIYGKLTSNMKTF
GIPLAAHSRVLTGSDIRSLYLQILTPFLVHNTAQADNLNCDRSATEACTDSEVITDMEPGNSIVNG
VPESIAEEDTAEPLDMEFQFYLSDDKATFKGSEIVMNEPLQSTDISGRLNVLVSWSPKILEQYNT
GLFSSLPEVFKSGFFAKRPQESVSLYKCLEAFLKEEPLGPEDMWYCPACKQHRQATKKLDLWR
LPEILVIHLKRFSYNRFLKNKLETYVDFPTHDLDLSSYLAYKDGKSSYRYMLYAISNHYGSMGGG
HYTAFVHQGADRWYDFDDSHVYSISQDKLKTSAAYVLFYRRVEEI
SEQ 152
MASSSRVFVLLLLIIFNFLYISAQKTIKHKPFSMSFPLISTSLSHNSSSKALFLSSFMASNNRRQTQ
NTKTMSRIPSLNYKSTFKYSMALIVTLPIGTPPQNQQMVLDTGSQLSWIQCHKKIPKRPPPTTSF
DPSLSSTFSVLPCTHPLCKPRIPDFTLPTTCDQNRLCHYSYFYADGTLAEGNLVREKITFSRSQS
TPPLILGCATESEDAEGILGMNLGRFSFASQAKVQKFSYCVPIRQGSHAVKPSGTFYLGQNPNS
HTFQYINLLTFPQSQRMPNLDPLAFTVGMVGIKIGGKKLNISGRVFRPNAGGSGQTIIDSGTEYTF
LVEEAYNKVREEIVRLVGPRLKKGYVYGGALDMCFDNRPIEIGRLIGDMTLQFENGVDILINKERM
LDEVEGGIHCVGIGRSESLGIASNIIGNFHQQNLWVEFDMRNRRVGFGKGECSRQV
SEQ 153
YTIIIFSLNMKIFSIFSLLLLLLLPILASCHEKQVYIVYFGGHKGEKALHEIEENHHSYLMSVKESEEE
ARYSLIYSYKHSINGFAALLTPHEASKLSELEEVVSVYKSEPRKYRLQTTRSWEFSGVEESVQPN
SLNKDNLLLKARYGKDVIIGVLDSGLWPESKSFSDEGLGPIPKSWKGICQSGDAFNSSNCNKKII
GARYYIKGYEQYYGPLNRTLDYLSPRDKDGHGTHTSSTAGGRKVPNVSAIGGFASGTASGGAP
LARLAMYKVCWAIPKEGKEDGNTCFDEDMLAAMDDAIADGVDVISISIGTKEPQPFDQDSIAIGAL
YAVKKNIVVSCSAGNSGPAPSTLSNTAPWIITVGASSVDRAFLSPVILGNGKKFTGQTVTPYKLE
KEMYPLVYAGQVINSNVTKDVAGQCLPGSLSPKKAKGKIVICLRGNGTRVGKGGEVKRAGGIGY
ILGNNKANGAELVADPHFLPATAVDYKSAMQILNYINSTKSPVAYIVPAKTVLHSKPAPYMASFTS
RGPSAVAPDILKPDITAPGLNILAAWSGGSSPTKLDIDDRVVEYNIISGTSMSCPHVGGAAALLKAI
HPTWSSAAIRSALITSAGLRNNVGEQITDASGKPADPFQFGGGHFRPSKAADPGLVYDASYQDY
LLFLCASGIKDLDKSFKCPKKSHLPNNLNYPSLAIPNLNGTVTVSRRLTNVGAPKSVYFASAKPPL
GFSVEISPPVLSFKHVGSKRTFTITVKVRSDMIDSIPKDQYVFGWYSWNDGIHNVRSPIAVKLA
SEQ 154
MATRRSSSSALTALAASRSRLLSRFRPAVSRLSQNTLLGTGRCPPPNSGFFVAETTAALWPNYN
VLSKSFVHSYSTTAASSGQINNMDYTEMALEGIVGAVEAARTSKQQVVETEHLMKALLEQKDGL
ARRIFTKAGLDNSSVLQETDQFISQQPKVVGDTSGPILGSHLSSLLENAKKHKKEMGDSFVSVEH
MLLSFLSDTRFGQKLFRNLQLTEKALKDAVNAVRGSQRVTDPNPEGKYEALEKYGNDLTELARR
GKLDPVIGRDDEIRRCIQILSRRTKNNPVIIGEPGVGKTAIAEGLAQRIVRGDVPEPLMNRKLMSL
DMGALLAGAKYRGDFEERLKAVLKEVSSSNGQIILFIDEIHTVVGAGATSGAMDAGNLLKPMLGR
GELRCIGATTLNEYRKYIEKDPALERRFQQVYCGQPSVEDAISILRGLRERYELHHGVKISDSALV
SAAVLADRYITERFLPDKAIDLVDEAAAKLKMEITSKPTELDEIDRAVLKLEMEKLSLKNDTDKASK
ERLNKLESDLKSLKAKQKELNEQWEREKDLMTRIRSIKEEIDRVNLEMEAAEREYDLNRAAELKY
GTLISLQRQLGEAEKNLADYRKSGSSLLREEVTDLDITEIVSKWTGIPLSNLQQSERDKLVFLENE
LHKRVVGQDMAVKSVADAIRRSRAGLSDPNRPIASFMFMGPTGVGKTELGKALAAYLFNTENAL
VRIDMSEYMEKHAVSRLVGAPPGYVGYEEGGQLTEVVRRRPYSVVLFDEIEKAHHDVFNILLQL
LDDGRITDSQGRTVSFTNTVVIMTSNIGSHYILETLQNTRDSQEAVYDAMKKQVIELARRTFRPEF
MNRIDEYIVFQPLDLKQVSRIVELQMRRVKDRLKQKKIDLHYTQEAISLLANMGFDPNYGARPVK
RVIQQMVENEVAMGVLRGDFSEEDMIIVDADASPQGKDLLPEKRLLIRRIENGSNMDAMVAND
SEQ 155
VNVKCFFVSFFFSFSCMSLFFLQGWNFETFCLKTQSFAVTNKNHRPHLHSHHSSFLCFHTSYLL
FFLILYIYIAKTTSRFAKTQQPPQKMSRFTMLVVLVLLLLCLCHLSVATIGSSSNKKSTYIVHVAKS
QMPESFENHKHWYDSSLKSVSDSAEMLYVYNNVVHGFSARLTVQEAESLERQSGILSVLPEMK
YELHTTRTPSFLGLDRSADFFPESNAMSDVIVGVLDTGVWPESKSFDDTGLGPVPDSWKGECE
SGTNFSSSNCNRKLIGARYFSKGYETTLGPVDVSKESKSARDDDGHGTHTATTAAGSIVQGASL
FGYASGTARGMATRARVAVYKVCWIGGCFSSDILAAMDKAIDDNVNVLSLSLGGGNSDYYRDS
VAIGAFAAMEKGILVSCSAGNAGPGPYSLSNVAPWITTVGAGTLDRDFPAYVSLGNGKNFSGVS
LYKGDLSLSKMLPFVYAGNASNTTNGNLCMTGTLIPEKVKGKIVLCDRGINPRVQKGSVVKEAG
GVGMVLANTAANGDELVADAHLLPATTVGQTTGEAIKKYLTSDPNPTATILFEGTKVGIKPSPVV
AAFSSRGPNSITQEILKPDIIAPGVNILAGWTGGVGPTGLAEDTRRVGFNIISGTSMSCPHVSGLA
ALLKGAHPDWSPAAIRSALMTTAYTVYKNGGALQDVSTGKPSTPFDHGAGHVDPVAALNPGLV
YDLRADDYLNFLCALNYTSIQINSIARRNYNCETSKKYSVTDLNYPSFAVVFLEQMTAGSGSSSS
SVKYTRTLTNVGPAGTYKVSTVFSSSNSVKVSVEPETLVFTRVNEQKSYTVTFTAPSTPSTTNVF
GRIEWSDGKHVVGSPVAISWI
SEQ 156
MLKALTSSCLQNRFHAVTTAFTPQVRRGTDSNTPLLRVLGSLRSSNRRVPYLSRRFFCSDSTDG
SESNSEAAASEAKPAEEGGDADSKASAAMVPTVFKPEDCLTVLALPLPHRPLFPGFYMHIYVKD
PKVLAALLESRKRQAPYAGAFLMKDEQGTDPNVVSASDTEKNIYELKGKDMLNRLHEVGTLAQI
TSIKDDQVILIGHRRIRMAEVVSEEPLTVKVDHLKEQPYNKDDDVIKATSFEVLSTLRDVLKTSSL
WKDHVQTYIQHIGDFNYARLADFGAAISGANKLQCQQVLEELDVHKRLQLTLELVKKEMEISKIQ
ESIARAIEEKISGEQRRYLLNEQLKAIKKELGLETDDKTALSAKFRERLEPNKEKIPVHVMQVIEEE
LTKLQLLEASSSEFNVTRNYLDWLTALPWGNYSDENFDVLRAEQILDEDHYGLTDVKERILEFIA
VGKLRGTSQGKIICLSGPPGVGKTSIGRSIARALNRKFYRFSVGGLSDVAEIKGHRRTYIGAMPG
KMVQCLKSVGTANPLVLIDEIDKLGRGHAGDPASAMLELLDPEQNANFLDHYLDVPIDLSKVLFV
CTANVVEMIPNPLLDRMEVISIAGYITDEKMHIARDYLEKATRETCGIKPEQVEVTNSALLALIENY
CREAGVRNLQKQIEKIYRKIALKLVREDGEIEPQNAEVGEVEAESIHLSDEIKSKEEIQAGAESAN
GSNDDKASENNAEAEAQGAPVNQTQKSANEDACLQDTQETEKATESEASKTVNKVVVDSPNL
ADYVGKPVFHAERIYDQTPVGVVMGLAWTSMGGSTLYIETSLVEQGEGKGALNVTGQLGDVMK
ESAQIAHTVARTILQEKEPDNQFFANSKLHLHVPAGATPKDGPSAGCTMITSLLSLAMKKPVKKD
LAMTGEVTLTGKILPIGGVKEKAIAARRSDVKTIIFPSANRRDFDELAPNVKEGLDVHFVDDYKQIF
DLAF
SEQ 157
MQFFRRNPSLHRISSRFLNQVVKTSAYSTKKVYNAGQPTAATHPQLMKEGEITPGITSEEYMQR
RKKLLEFLPENSLAIVAAAPIKMMTDVVPYNFRQDADYLYITGCQQPGGVAVLGHDCGLCMFMP
EQSPQDALWQGETAGVDAALQIFKADLAYPINRLPQILSRMIESSSTVFHNVKTRTSSYLELEAY
KKAVSNYKVKDFSVYTHEARFVKSPAELKLMRDSASIACQALVQTMLYSKLFPDEGMLSAKFEY
ECRVRGAQRMAFNPVVGGGPNGSVVHYFRNDQKIEDGNLVLMDVGCELHGYVSDLTRVWPPF
GKFSSVQEELYNLILETNKECVELCRPGTTIREIHHYSVETLRRGFKEIGILKNDRRGRYEMLNPT
NIGHYLGMDVHDCSTIGNDRPLKPGVVITIEPGVYIPSCFDCPERFQGIGFRIEDEVLITESGYEVL
TASIPKEIKHLESLLNNFGSGRGTEIRAALS
SEQ 158
LLTSHKNHIILLPFLLYKIFISLQKQTLMASSTRVFVLLLLIIFNFLYISAQKTIKHKPFSMSFPLTSTSL
SHNSSSKALFLSSLLASNQRKQAPNTKTVSRIPSLNYKSTFKYSMALIVTLPIGTPPQNQQMVLD
TGSQLSWIQCHKKIPKRPPPTTSFDPSLSSTFSVLPCTHPLCKPRIPDFTLPTTCDQNRLCHYSY
FYADGTLAEGNLVREKITFSRSQSTPPLILGCATESEDAEGILGMNLGRFSFASQAKVQKFSYCV
PIRQGSHAVKPSGTFYLGQNPNSHTFQYINLLTFPQSQRMPNLDPLAFTVGMVGIKIGGKKLNIS
GRVFRPNAGGSGQTIIDSGTEYTFLVEEAYNKVREEIVRLVGPRLKKGYVYGGALDMCFDNRPM
EIGRLIGDMTLQFENGVEILINKERMLDEVEGGIHCVGIGRSESLGIASNIIGNFHQQNLWVEFDM
RNRRVGFGKGECSRQM
SEQ 159
MAALNFFIIFTSLVLPIASDPLLSTYVVHVDTKAKPSHYLTQDEWYNSVVESVLANKMDSDSTSPR
LFYSYDVVLQGFAARLTDQESEKLNKFPEVIHIFKDQSRIKLDTTRSPNFLGLNTGYGLWPQSNF
GDDVIIGLVDTGIWPESESFKDNGIGPIPTRWKGKCVDGIEFNATSSCNRKLIGARNFVKGVEND
YHHQSARDQNGHGTHTASTAAGTEVNGANVFGFAKGKARGIASKARIAMYKACGSSSCAESDI
LAAIESAIKDGVDILSLSLGYDDAPFYENPVAIATFAAVKRNIFVASSAGNLGPYPFSVHNTAPWV
TTVGAGSLDRDFPVEINLSNNKTFVGSSLYPGRISGKSYSLVYIENCSIMTIDRSKVERKIVVCNT
SKIEALRNGILIQKAGGFGLIQLNLPTEGEGIRAMAYTLPSATLGYKEGIELLSYIKSNANPRAGFV
RRKDTVIGKKVRAPIVASFSSRGPNVVVPEVLKPDLIAPGLNILAAWPGDISPTRLKMDPRRVKFN
INSGTSMACPHIAGVAALVRAVHPDWSPAAIKSALMTTSTAFDNAQLPIIKHEDMELATPISIGAG
HVNPESAIDPGLIYDTDTSDYINLLCSLNYTEKQMKLFTNESNPCSGFTGSPLDLNYPSLSVMFR
PDSYVHVVKKTLTHVAVSKPEVYKVKIVNLNSEKVSLSIEPRKLIFNESLQKQSYVVKFESHYAFN
SSRKIAEQMAFGSILWESEKHNVRSPFAVMWVQQNFNNSRLYK
SEQ 160
MEISKIQESIARAIEEKISGEQRRYLLNEQLKAIKKELGLETDDKTALSAKFRERLEPNKEKIPVHV
MQVIEEELTKLQLLEASSSEFNVTRNYLDWLTALPWGSYSDENFDVLRAEQILDEDHYGLTDVK
ERILEFIAVGKLRGTSQGKIICLSGPPGVGKTSIGRSIARALNRKFYRFSVGGLSDVAEIKGHRRTY
IGAMPGKMVQCLKSVGTANPLVLIDEIDKLGRGHAGDPASAMLELLDPEQNANFLDHYLDVPIDL
SKVLFVCTANVVEMIPNPLLDRMEVISIAGYITDEKVHIARDYLEKATRETCGIKPEQVEVTDSALL
ALIENYCREAGVRNLQKQIEKIYRKIALKLVREDGEIEPQNAEVDEVKAESIHLSDEIKSKEEIQAG
AESANGSNDDEASENNAEAEAQGAENQTQKSANEDTCLQDTQETEKATESEASKTVNKVVVD
SPNLADYVGKPVFHAERIYDQTPVGVVMGLAWTSMGGSTLYIETSLVEQGEGKGALNVTGQLG
DVMKESAQIAHTVARTILLEKEPDNQFFANSKLHLHVPAGATPKDGPSAGCTMITSLLSLAMKKP
VKKDLAMTGEVTLTGKILPIGGVKEKAIAARRSDVKTIIFPSANRRDFDELAPNVKEGLDVHFVDD
YKQIFDLAF
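
The listing above follows a simple layout: a "SEQ <number>" header line followed by wrapped lines of one-letter residue codes. The following is a minimal sketch, under that assumption only, of how such a listing could be read back into one string per sequence. The function name parse_listing and the sample text are hypothetical and are not part of the application; the sample uses truncated fragments of SEQ 158 and SEQ 159 purely for illustration.

```python
import re


def parse_listing(text):
    """Group residue lines under the most recent 'SEQ <n>' header.

    Assumes the plain-text layout seen in the listing: a header line
    'SEQ <number>' followed by wrapped residue lines.
    """
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r"SEQ\s+(\d+)", line)
        if header:
            current = int(header.group(1))
            records[current] = []
        elif current is not None:
            # Drop any internal whitespace left over from line wrapping.
            records[current].append("".join(line.split()))
    return {num: "".join(parts) for num, parts in records.items()}


# Truncated, illustrative fragments only (not the full sequences).
sample = """SEQ 158
LLTSHKNHIILL
PFLLYKIF
SEQ 159
MAALNFFIIF TSLVLPIAS
"""
seqs = parse_listing(sample)
```

Joining the wrapped lines before any comparison matters because a stray space or line break inside a sequence would otherwise be treated as a difference.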

Claims

1. A mutant, non-naturally occurring or transgenic tobacco plant cell comprising:
(i) a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to any one of SEQ ID NO: 1 to SEQ ID NO: 80;
(ii) a polypeptide encoded by the polynucleotide set forth in (i);
(iii) a polypeptide comprising, consisting or consisting essentially of a sequence encoding a protease and having at least 95% sequence identity to SEQ ID NO: 81 to SEQ ID NO: 160; or
(iv) a construct, vector or expression vector comprising the isolated polynucleotide set forth in (i),
and wherein the expression or activity of said protease is modulated as compared to a control tobacco plant cell in which the expression or activity of said protease has not been altered.
2. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 1, wherein the expression or activity of said protease is upregulated compared to the control tobacco plant cell.
3. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 1, wherein the expression or activity of said protease is downregulated compared to the control tobacco plant cell.
4. A mutant, non-naturally occurring or transgenic tobacco plant cell according to any preceding claim, wherein the expression or activity is modulated of a protease selected from: at least one of SEQ ID NO: 1 to 16; or
at least one of SEQ ID NO: 30 to 41; or
at least one of SEQ ID NO: 17 to 22; or
at least one of SEQ ID NO: 42 to 44; or
at least one of SEQ ID NO: 45 to 61; or
at least one of SEQ ID NO: 62 to 80; or
at least one of SEQ ID NO: 23 to 29.
5. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 30 to 41 is modulated in an Oriental type tobacco.
6. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from SEQ ID NO: 17 to 22 is modulated in a Virginia type tobacco.
7. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 42 to 44 is modulated in a Burley type tobacco.
8. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 45 to 61 is modulated in a Virginia or Oriental type tobacco.
9. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 62 to 80 is modulated in a Burley or Oriental type tobacco.
10. A mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 4, wherein the expression or activity of a protease selected from at least one of SEQ ID NO: 23 to 29 is modulated in a Burley or Virginia type tobacco.
11. The mutant, non-naturally occurring or transgenic tobacco plant cell according to any preceding claim, wherein said mutation(s) is a heterozygous or homozygous mutation.
12. The mutant, non-naturally occurring or transgenic tobacco plant cell according to any preceding claim, wherein the expression of one or more proteases is increased by about 10% to about 1000%.
13. The mutant, non-naturally occurring or transgenic tobacco plant cell according to claim 12, wherein the expression of one or more proteases is increased by at least 10%, at least 20%, at least 25%, at least 50%, at least 100%, at least 200%, at least 500%, at least 750% or up to 1000%.
14. A mutant, non-naturally occurring or transgenic plant or component or part thereof comprising the plant cell according to any preceding claim.
15. Plant material including biomass, seed, stem, flowers or leaves from the plant of claim 14.
16. A tobacco product comprising the plant cell of any of claims 1 to 13, at least a part of the plant of claim 14 or the plant material according to claim 15.
17. A method for preparing a tobacco plant with modulated levels of protease, said method comprising the steps of:
(a) providing a plant comprising (i) a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to at least one of SEQ ID NO: 1 to SEQ ID NO: 80;
(b) inserting one or more mutations into said polynucleotide of said tobacco plant to create a mutant tobacco plant; and
(c) curing the tobacco plant material.
18. The method according to claim 17, wherein the tobacco plant in step (b) is a mutant tobacco plant, preferably, wherein said mutant tobacco plant comprises one or more mutations in one or more further sequence encoding a functional protease and having at least 95% sequence identity to at least one of SEQ ID NO: 1 to SEQ ID NO: 80.
19. A method according to claim 17 or claim 18, wherein the genome of a cell of a tobacco plant is modified by a genome editing technology or by genome engineering techniques selected from CRISPR/Cas technology, zinc finger nuclease-mediated mutagenesis, chemical or radiation mutagenesis, homologous recombination, oligonucleotide-directed mutagenesis and meganuclease-mediated mutagenesis.
20. A method for producing cured plant material, preferably cured leaves, or flowers with an altered flavour profile as compared to control plant material comprising the steps of:
(a) providing a plant according to claim 14 or the plant material according to claim 15;
(b) optionally harvesting the plant material therefrom; and
(c) curing the plant material for a period of time such that the levels of at least one protease are modulated compared to control cured plant material.
21. The use of
(i) a polynucleotide comprising, consisting or consisting essentially of a sequence encoding a functional protease and having at least 95% sequence identity to any one of SEQ ID NO: 1 to SEQ ID NO: 80;
(ii) a polypeptide encoded by the polynucleotide set forth in (i);
(iii) a polypeptide comprising, consisting or consisting essentially of a sequence encoding a protease and having at least 95% sequence identity to SEQ ID NO: 81 to SEQ ID NO: 160; or
(iv) a construct, vector or expression vector comprising the isolated polynucleotide set forth in (i),
for the modulation of the expression or activity of one or more proteases in tobacco during a tobacco curing procedure.
22. Use according to claim 21, wherein the curing procedure is selected from the group consisting of air curing, fire curing, smoke curing and flue curing.
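
Several of the claims above rely on a threshold of "at least 95% sequence identity". The claims as reproduced here do not state how identity is computed, so the following is only a minimal sketch under stated assumptions: a global (Needleman-Wunsch) alignment with a hypothetical scoring scheme of +1 for a match and -1 for a mismatch or gap, with identity taken as identical columns divided by alignment length. The function names, the scoring values and the example fragments are illustrative assumptions. Established alignment tools with substitution matrices would normally be used in practice and may give different values.

```python
def percent_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity over a global alignment of a and b.

    Assumption: identity = identical columns / alignment length * 100.
    The scoring values are illustrative, not taken from the application.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align the two residues
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back one optimal path, counting columns and identical pairs.
    i, j = n, m
    identical = 0
    columns = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                if a[i - 1] == b[j - 1]:
                    identical += 1
                i -= 1
                j -= 1
                columns += 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * identical / columns if columns else 0.0


def meets_threshold(a, b, threshold=95.0):
    """True if the two sequences reach the identity threshold."""
    return percent_identity(a, b) >= threshold


# Illustrative 20-residue fragment (start of SEQ 159) and edited variants.
reference = "MAALNFFIIFTSLVLPIASD"
one_change = "MAALNFFIIFTSLVLPIASE"   # one substitution
two_changes = "AAALNFFIIFTSLVLPIASE"  # two substitutions
```

With these fragments, a single substitution in 20 residues sits exactly at the 95% boundary, while two substitutions fall below it. This shows how sensitive the threshold is for short sequences compared with full-length proteins.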
PCT/EP2015/066341 2014-07-18 2015-07-16 Tobacco protease genes WO2016009006A1 (en)

Priority Applications (11)

Application Number Priority Date Filing Date Title
KR1020177001642A KR20170032317A (en) 2014-07-18 2015-07-16 Tobacco protease genes
CN201580038165.0A CN106661556A (en) 2014-07-18 2015-07-16 Tobacco protease genes
BR112017000932A BR112017000932A2 (en) 2014-07-18 2015-07-16 tobacco protease genes
MX2017000834A MX2017000834A (en) 2014-07-18 2015-07-16 Tobacco protease genes.
CA2954828A CA2954828A1 (en) 2014-07-18 2015-07-16 Tobacco protease genes
RU2017105148A RU2756102C2 (en) 2014-07-18 2015-07-16 Tobacco protease genes
AP2017009676A AP2017009676A0 (en) 2014-07-18 2015-07-16 Tobacco protease genes
JP2017502853A JP2017529063A (en) 2014-07-18 2015-07-16 Tobacco protease gene
EP15738907.3A EP3169149B1 (en) 2014-07-18 2015-07-16 Tobacco protease genes
US15/325,997 US20170265516A1 (en) 2014-07-18 2015-07-16 Tobacco protease genes
PH12016502546A PH12016502546A1 (en) 2014-07-18 2016-12-20 Tobacco protease genes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP14177715 2014-07-18
EP14177715.1 2014-07-18

Publications (1)

Publication Number Publication Date
WO2016009006A1 (en) 2016-01-21

Family

ID=51210347

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/066341 WO2016009006A1 (en) 2014-07-18 2015-07-16 Tobacco protease genes

Country Status (12)

Country Link
US (1) US20170265516A1 (en)
EP (1) EP3169149B1 (en)
JP (2) JP2017529063A (en)
KR (1) KR20170032317A (en)
CN (1) CN106661556A (en)
AP (1) AP2017009676A0 (en)
BR (1) BR112017000932A2 (en)
CA (1) CA2954828A1 (en)
MX (1) MX2017000834A (en)
PH (1) PH12016502546A1 (en)
RU (1) RU2756102C2 (en)
WO (1) WO2016009006A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106811453A (en) * 2017-01-05 2017-06-09 上海交通大学 African agapanthus cathepsin B and its encoding gene and probe and application
WO2019185699A1 (en) * 2018-03-28 2019-10-03 Philip Morris Products S.A. Modulating reducing sugar content in a plant
EP3632925A1 (en) * 2018-10-02 2020-04-08 Universität für Bodenkultur Wien Plant serine proteases
WO2021063863A1 (en) 2019-10-01 2021-04-08 Philip Morris Products S.A. Modulating sugar and amino acid content in a plant (sultr3)
WO2021063860A1 (en) 2019-10-01 2021-04-08 Philip Morris Products S.A. Modulating reducing sugar content in a plant (inv)
CN115747249A (en) * 2022-11-28 2023-03-07 湖南大学 Application of tobacco NtabCrRLK12 gene in relieving tobacco continuous cropping obstacle
WO2024160860A1 (en) 2023-02-02 2024-08-08 Philip Morris Products S.A. Modulation of genes coding for lysine ketoglutarate reductase
WO2024160864A1 (en) 2023-02-02 2024-08-08 Philip Morris Products S.A. Modulation of sugar transporters

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201805949D0 (en) * 2018-04-10 2018-05-23 British American Tobacco Investments Ltd Smoking article
CN111763687B (en) * 2019-03-12 2021-12-07 中国农业大学 Method for rapidly cultivating corn haploid induction line based on gene editing technology
CN114032324B (en) * 2021-11-23 2023-09-01 云南省烟草农业科学研究院 SSR marker linked with tobacco black shank No. 1 physiological race resistance gene qBS1
CN116590308B (en) * 2023-05-09 2024-03-29 西南大学 Potato drought tolerance related heat shock protein gene HSP101 and application thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040106198A1 (en) * 2002-07-16 2004-06-03 Large Scale Biology Corporation Inhibition of peptide cleavage in plants

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997010353A1 (en) * 1995-09-14 1997-03-20 Virginia Tech Intellectual Property, Inc. Production of lysosomal enzymes in plant-based expression systems
AU2007299219A1 (en) * 2006-04-05 2008-03-27 Metanomics Gmbh Process for the production of a fine chemical
RU2324737C1 (en) * 2006-10-18 2008-05-20 Институт цитологии и генетики Сибирского отделения Российской академии наук (СО РАН) Process of obtaining transgene tobacco plants with increased content of prolyl
US8319011B2 (en) * 2006-12-15 2012-11-27 U.S. Smokeless Tobacco Company Llc Tobacco plants having reduced nicotine demethylase activity
EP2537937A3 (en) * 2008-04-29 2013-04-10 Monsanto Technology LLC Genes and uses for plant enhancement
CN103403170B (en) * 2011-01-17 2015-11-25 菲利普莫里斯生产公司 Protein expression in plant
CN103012571B (en) * 2012-12-05 2013-12-04 北京师范大学 Gene capable of reducing nicotine content of tobacco and application thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
BOVET, L. ET AL.: "Gene expression changes during tobacco curing", 2013, XP002746297, Retrieved from the Internet <URL:https://www.pmiscience.com/system/files/asset/files/2013_bovet_l_swissplant13_poster.pdf> [retrieved on 20151015] *
CATHERINE NAVARRE ET AL: "Identification, gene cloning and expression of serine proteases in the extracellular medium of Nicotiana tabacum cells", PLANT CELL REPORTS, SPRINGER, BERLIN, DE, vol. 31, no. 10, 17 July 2012 (2012-07-17), pages 1959 - 1968, XP035112133, ISSN: 1432-203X, DOI: 10.1007/S00299-012-1308-Y *
FLORIAN MARTIN ET AL: "Design of a tobacco exon array with application to investigate the differential cadmium accumulation property in two tobacco varieties", BMC GENOMICS, BIOMED CENTRAL LTD, LONDON, UK, vol. 13, no. 1, 28 November 2012 (2012-11-28), pages 674, XP021140791, ISSN: 1471-2164, DOI: 10.1186/1471-2164-13-674 *
G. BEYENE: "Two new cysteine proteinases with specific expression patterns in mature and senescent tobacco (Nicotiana tabacum L.) leaves", JOURNAL OF EXPERIMENTAL BOTANY, vol. 57, no. 6, 1 March 2006 (2006-03-01), GB, pages 1431 - 1443, XP055220617, ISSN: 0022-0957, DOI: 10.1093/jxb/erj123 *
KAWASHIMA N ET AL: "STUDIES ON PROTEIN METABOLISM IN HIGHER PLANTS V SOME PROPERTIES OF A TOBACCO-D LEAF ENZ PROTEASE INCREASED DURING CURING INST COLUMN CHROMATOGRAPHY", AGRICULTURAL AND BIOLOGICAL CHEMISTRY, vol. 32, no. 9, 1968, pages 1141 - 1145, XP002746298, ISSN: 0002-1369 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106811453A (en) * 2017-01-05 2017-06-09 上海交通大学 African agapanthus cathepsin B and its encoding gene and probe and application
CN106811453B (en) * 2017-01-05 2020-07-14 上海交通大学 African agapanthus cathepsin B, coding gene and probe thereof, and application of African agapanthus cathepsin B
WO2019185699A1 (en) * 2018-03-28 2019-10-03 Philip Morris Products S.A. Modulating reducing sugar content in a plant
US11591609B2 (en) 2018-03-28 2023-02-28 Philip Morris Products S.A. Modulating reducing sugar content in a plant
EP3632925A1 (en) * 2018-10-02 2020-04-08 Universität für Bodenkultur Wien Plant serine proteases
WO2020070197A1 (en) 2018-10-02 2020-04-09 Universität Für Bodenkultur Wien Plant serine proteases
WO2021063863A1 (en) 2019-10-01 2021-04-08 Philip Morris Products S.A. Modulating sugar and amino acid content in a plant (sultr3)
WO2021063860A1 (en) 2019-10-01 2021-04-08 Philip Morris Products S.A. Modulating reducing sugar content in a plant (inv)
CN115747249A (en) * 2022-11-28 2023-03-07 湖南大学 Application of tobacco NtabCrRLK12 gene in relieving tobacco continuous cropping obstacle
WO2024160860A1 (en) 2023-02-02 2024-08-08 Philip Morris Products S.A. Modulation of genes coding for lysine ketoglutarate reductase
WO2024160864A1 (en) 2023-02-02 2024-08-08 Philip Morris Products S.A. Modulation of sugar transporters

Also Published As

Publication number Publication date
EP3169149A1 (en) 2017-05-24
RU2756102C2 (en) 2021-09-28
EP3169149C0 (en) 2024-01-17
MX2017000834A (en) 2017-05-01
JP2021045130A (en) 2021-03-25
BR112017000932A2 (en) 2017-11-14
JP2017529063A (en) 2017-10-05
CA2954828A1 (en) 2016-01-21
RU2017105148A (en) 2018-08-20
KR20170032317A (en) 2017-03-22
PH12016502546A1 (en) 2017-04-10
US20170265516A1 (en) 2017-09-21
AP2017009676A0 (en) 2017-01-31
RU2017105148A3 (en) 2018-12-06
CN106661556A (en) 2017-05-10
EP3169149B1 (en) 2024-01-17

Similar Documents

Publication Publication Date Title
CA2998286C (en) Plants with reduced asparagine content
RU2733837C2 (en) Reduced nicotine conversion into nornicotine in plants
EP2751273B1 (en) Threonine synthase from nicotiana tabacum and methods and uses thereof
US10563215B2 (en) Tobacco specific nitrosamine reduction in plants
US9683240B2 (en) Modulating beta-damascenone in plants
EP3169149B1 (en) Tobacco protease genes
CA3039428A1 (en) Plants with shortened time to flowering
US20170145431A1 (en) Modulation of nitrate content in plants
EP3775224A1 (en) Modulating amino acid content in a plant
EP2586792A1 (en) Modulating beta-damascenone in plants

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 15738907; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase. Ref document number: 12016502546; Country of ref document: PH
ENP Entry into the national phase. Ref document number: 2954828; Country of ref document: CA
WWE WIPO information: entry into national phase. Ref document number: 15325997; Country of ref document: US
ENP Entry into the national phase. Ref document number: 2017502853; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
WWE WIPO information: entry into national phase. Ref document number: MX/A/2017/000834; Country of ref document: MX
ENP Entry into the national phase. Ref document number: 20177001642; Country of ref document: KR; Kind code of ref document: A
REG Reference to national code. Ref country code: BR; Ref legal event code: B01A; Ref document number: 112017000932; Country of ref document: BR
REEP Request for entry into the European phase. Ref document number: 2015738907; Country of ref document: EP
WWE WIPO information: entry into national phase. Ref document number: 2015738907; Country of ref document: EP
ENP Entry into the national phase. Ref document number: 2017105148; Country of ref document: RU; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 112017000932; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20170117