
IL121243A - Recombinant insecticidal protein toxin from Photorhabdus, polynucleotide encoding said protein and method of controlling pests using said protein

Recombinant insecticidal protein toxin from Photorhabdus, polynucleotide encoding said protein and method of controlling pests using said protein

Info

Publication number
IL121243A
Authority
IL
Israel
Prior art keywords
protein
leu
seq
ala
toxin
Prior art date
Application number
IL121243A
Other versions
IL121243A0 (en
Original Assignee
Wisconsin Alumni Res Found
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wisconsin Alumni Res Found
Publication of IL121243A0
Publication of IL121243A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10 Cells modified by introduction of foreign genetic material
    • C12N5/12 Fused cells, e.g. hybridomas
    • C12N5/14 Plant cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8261 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
    • C12N15/8271 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance
    • C12N15/8279 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance
    • C12N15/8286 Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for stress resistance, e.g. heavy metal resistance for biotic stress resistance, pathogen resistance, disease resistance for insect resistance
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01G HORTICULTURE; CULTIVATION OF VEGETABLES, FLOWERS, RICE, FRUIT, VINES, HOPS OR SEAWEED; FORESTRY; WATERING
    • A01G13/00 Protecting plants
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01N PRESERVATION OF BODIES OF HUMANS OR ANIMALS OR PLANTS OR PARTS THEREOF; BIOCIDES, e.g. AS DISINFECTANTS, AS PESTICIDES OR AS HERBICIDES; PEST REPELLANTS OR ATTRACTANTS; PLANT GROWTH REGULATORS
    • A01N63/00 Biocides, pest repellants or attractants, or plant growth regulators containing microorganisms, viruses, microbial fungi, animals or substances produced by, or obtained from, microorganisms, viruses, microbial fungi or animals, e.g. enzymes or fermentates
    • A01N63/50 Isolated enzymes; Isolated proteins
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/24 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/52 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
    • Y02A40/146 Genetically Modified [GMO] plants, e.g. transgenic plants

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • General Health & Medical Sciences (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Pest Control & Pesticides (AREA)
  • Medicinal Chemistry (AREA)
  • Environmental Sciences (AREA)
  • Cell Biology (AREA)
  • Physics & Mathematics (AREA)
  • Dentistry (AREA)
  • Insects & Arthropods (AREA)
  • Agronomy & Crop Science (AREA)
  • Virology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Toxicology (AREA)
  • Botany (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Agricultural Chemicals And Associated Chemicals (AREA)
  • Peptides Or Proteins (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Description

Recombinant insecticidal protein toxin from Photorhabdus, polynucleotide encoding said protein and method of controlling pests using said protein
Wisconsin Alumni Research Foundation C.107289
Field of the Invention
The present invention relates to toxins isolated from bacteria and the use of said toxins as insecticides.
Background of the Invention
Many insects are widely regarded as pests to homeowners, to picnickers, to gardeners, and to farmers and others whose investments in agricultural products are often destroyed or diminished as a result of insect damage to field crops.
Particularly in areas where the growing season is short, significant insect damage can mean the loss of all profits to growers and a dramatic decrease in crop yield. Scarce supply of particular agricultural products invariably results in higher costs to food processors and, then, to the ultimate consumers of food plants and products derived from those plants.
Preventing insect damage to crops and flowers and eliminating the nuisance of insect pests have typically relied on strong organic pesticides and insecticides with broad toxicities. These synthetic products have come under attack by the general population as being too harsh on the environment and on those exposed to such agents. Similarly in non-agricultural settings, homeowners would be satisfied to have insects avoid their homes or outdoor meals without needing to kill the insects.
PCT/US96/18003
The extensive use of chemical insecticides has raised environmental and health concerns for farmers, companies that produce the insecticides, government agencies, public interest groups, and the public in general. The development of less intrusive pest management strategies has been spurred along both by societal concern for the environment and by the development of biological tools which exploit mechanisms of insect management. Biological control agents present a promising alternative to chemical insecticides.
Organisms at every evolutionary development level have devised means to enhance their own success and survival. The use of biological molecules as tools of defense and aggression is known throughout the animal and plant kingdoms. In addition, the relatively new tools of the genetic engineer allow modifications to biological insecticides to accomplish particular solutions to particular problems.
One such agent, Bacillus thuringiensis (Bt), is an effective insecticidal agent, and is widely used commercially as such. In fact, the insecticidal agent of the Bt bacterium is a protein of such limited toxicity that it can be used on human food crops on the day of harvest. To non-targeted organisms, the Bt toxin is a digestible non-toxic protein.
Another known class of biological insect control agents are certain genera of nematodes known to be vectors of transmission for insect-killing bacterial symbionts. Nematodes containing insecticidal bacteria invade insect larvae. The bacteria then kill the larvae. The nematodes reproduce in the larval cadaver. The nematode progeny then eat the cadaver from within. The bacteria-containing nematode progeny thus produced can then invade additional larvae.
In the past, insecticidal nematodes in the Steinernema and Heterorhabditis genera were used as insect control agents.
Apparently, each genus of nematode hosts a particular species of bacterium. In nematodes of the Heterorhabditis genus, the symbiotic bacterium is Photorhabdus luminescens.
Although these nematodes are effective insect control agents, it is presently difficult, expensive, and inefficient to produce, maintain, and distribute nematodes for insect control.
It has been known in the art that one may isolate an insecticidal toxin from Photorhabdus luminescens that has activity only when injected into Lepidopteran and Coleopteran insect larvae. This has made it impossible to effectively exploit the insecticidal properties of the nematode or its bacterial symbiont. What would be useful would be a more practical, less labor-intensive wide-area delivery method of an insecticidal toxin which would retain its biological properties after delivery.
Clarke, D. J. and Dowds, B.C.A. (1995), Virulence Mechanisms of Photorhabdus sp. Strain K122 toward Wax Moth Larvae. Journal of Invertebrate Pathology, 66: 149-155, describes the virulence of Photorhabdus sp. strain K122 to larvae of the wax moth, Galleria mellonella. The virulence correlated with the growth rate of the cultures and all larvae died after the cultures entered the stationary phase. Although the toxicity of the lysates was located in the membrane fraction, this was found to be a non-specific effect. It was proposed that the non-specific toxicity of bacterial lysates to G. mellonella derives from the non-localized activation of its humoral immune response by components of the membrane fraction, followed by damage to its tissue by a product of its own immune response.
U.S. Patent No. 5,039,523 discloses a novel Bacillus thuringiensis (B.t.) toxin gene encoding a protein toxic to Lepidopteran insects. The gene was cloned from a novel Lepidopteran-active B. thuringiensis microbe. The DNA encoding the B.t. toxin can be used to transform various prokaryotic and eukaryotic microbes to express the B.t. toxin. These recombinant microbes can be used to control Lepidopteran insects in various environments.
U.S. Patent No. 5,254,799 discloses novel transformation vectors containing chimeric genes which allow the introduction of exogenous DNA fragments coding for polypeptide toxins produced by Bacillus thuringiensis or having substantial sequence homology to a gene coding for a polypeptide toxin as described in the patent, and expression of the chimeric gene in plant cells and their progeny after integration into the plant cell's genome. Transformed plant cells and their progeny exhibit stably inherited polypeptide toxin expression useful for protecting the plant cells and their progeny from certain insect pests and in controlling the insect pests.
The doctoral thesis of David J. Bowen, entitled "Characterization of a high molecular weight insecticidal protein complex produced by the entomopathogenic bacterium Photorhabdus luminescens", describes experiments leading to the purification and characterization of the cytoplasmic protein inclusions produced by Photorhabdus luminescens. The results of experiments designed to elucidate a possible function for these proteins are also described. Furthermore, experiments leading to the discovery and purification of a high molecular weight insecticidal protein complex produced by P. luminescens are disclosed.
It would be highly desirable to discover toxins with oral activity produced by the genus Photorhabdus. The isolation and use of these toxins are desirable for reasons of efficacy.
Until applicants' discoveries, these toxins had not been isolated or characterized.
Summary of the Invention
The native toxins are protein complexes that are produced and secreted by growing bacterial cells of the genus Photorhabdus; of particular interest are the proteins produced by the species Photorhabdus luminescens. The protein complexes, with a molecular size of approximately 1,000 kDa, can be separated by SDS-PAGE gel analysis into numerous component proteins. The toxins contain no hemolysin, lipase, type C phospholipase, or nuclease activities. The toxins exhibit significant toxicity upon exposure to a number of insects.
The present invention provides a polynucleotide that is operably associated with a heterologous promoter, wherein said polynucleotide encodes a protein that has toxin activity against an insect pest, wherein a nucleotide molecule that codes for said protein maintains hybridization with the complement of nucleic acid sequence ID NO: 46 after hybridization and wash, wherein said hybridization is conducted at 60°C in solution containing 10% w/v PEG (polyethylene glycol, M.W. approximately 8000), 7% w/v SDS, 10 mM sodium phosphate buffer, 5 mM EDTA, and 100 mg/ml denatured salmon sperm DNA, and said wash is conducted in 6X SSC and 0.1% SDS at 60°C.
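As a worked illustration of the hybridization solution recited above, the following Python sketch converts the stated concentrations into amounts for a given volume. It covers only the PEG, SDS, sodium phosphate and EDTA components. The function name, the 50 ml batch size, and the choice to report the buffer and EDTA in micromoles rather than grams (the specification does not name the salt forms) are assumptions made for illustration only.

```python
def hybridization_mix(volume_ml):
    """Amounts needed for volume_ml of the hybridization solution.

    Concentrations are those recited in the text; "% w/v" means
    grams of solute per 100 ml of solution.
    """
    return {
        "PEG_g": 10.0 * volume_ml / 100.0,          # 10% w/v PEG
        "SDS_g": 7.0 * volume_ml / 100.0,           # 7% w/v SDS
        "sodium_phosphate_umol": 10.0 * volume_ml,  # 10 mM = 10 umol per ml
        "EDTA_umol": 5.0 * volume_ml,               # 5 mM = 5 umol per ml
    }


# Hypothetical 50 ml batch, chosen only to show the arithmetic.
for name, amount in hybridization_mix(50).items():
    print(f"{name}: {amount:g}")
```

For the assumed 50 ml batch this gives 5 g PEG, 3.5 g SDS, 500 micromoles of sodium phosphate and 250 micromoles of EDTA.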
The present invention also provides a method of controlling an insect pest wherein the method comprises feeding a protein according to the invention to the pest.
Objects, advantages, and features of the present invention will become apparent from the following specification.
Passages of the description which are not within the scope of the claims do not constitute part of the claimed invention.
Brief Description of the Drawings
Fig. 1 is an illustration of a match of cloned DNA isolates used as a part of sequence genes for the toxin of the present invention.
Fig. 2 is a map of three plasmids used in the sequencing process.
Fig. 3 [...] several partial DNA fragments.
Fig. 4 is an illustration of a homology analysis between the protein sequences of TcbAii and TcaBii proteins.
Fig. 5 is a phenogram of Photorhabdus strains. Relationship of Photorhabdus strains was defined by rep-PCR.
The upper axis of Fig. 5 measures the percentage similarity of strains based on scoring of rep-PCR products (i.e., 0.0 [no similarity] to 1.0 [100% similarity]). At the right axis, the numbers and letters indicate the various strains tested: 14=W-14, Hm=Hm, H9=H9, 7=WX-7, 1=WX-1, 2=WX-2, 88=HP88, NC-1=NC-1, 4=WX-4, 9=WX-9, 8=WX-8, 10=WX-10, WIR=WIR, 3=WX-3, 11=WX-11, 5=WX-5, 6=WX-6, 12=WX-12, x14=WX-14, 15=WX-15, Hb=Hb, B2=B2, 48 through 52=ATCC 43948 through ATCC 43952. Vertical lines separating horizontal lines indicate the degree of relatedness (as read from the extrapolated intersection of the vertical line with the upper axis) between strains or groups of strains at the base of the horizontal lines (e.g., strain W-14 is approximately 60% similar to strains H9 and Hm).
Fig. 6 is an illustration of the genomic maps of the W-14 strain.
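The similarity scale described for Fig. 5 can be illustrated with a small Python sketch. The specification does not state which coefficient was used to score the rep-PCR products, so the Jaccard coefficient below is an assumption, and the band sizes are invented solely to show the arithmetic; they are not data from the strains listed.

```python
def band_similarity(bands_a, bands_b):
    """Jaccard similarity between two sets of scored band positions.

    Returns 0.0 for no shared bands and 1.0 for identical patterns,
    matching the 0.0 to 1.0 scale on the upper axis of the phenogram.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty patterns are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical band sizes (base pairs), for illustration only.
strain_x = [300, 450, 800, 1200, 2000]
strain_y = [300, 450, 800, 1500]

print(band_similarity(strain_x, strain_y))
```

In this invented example the two patterns share three of six distinct bands, which corresponds to 0.5 on the similarity axis.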
Detailed Description of the Invention
The present inventions are directed to the discovery of a unique class of insecticidal protein toxins from the genus Photorhabdus that have oral toxicity against insects. A unique feature of Photorhabdus is its bioluminescence. Photorhabdus may be isolated from a variety of sources. One such source is nematodes, more particularly nematodes of the genus Heterorhabditis. Another such source is human clinical samples from wounds; see Farmer et al. 1989 J. Clin. Microbiol. 27 pp. 1594-1600. These saprophytic strains are deposited in the American Type Culture Collection (Rockville, MD) as ATCC #s 43948, 43949, 43950, 43951, and 43952, and are incorporated herein by reference. Strain W-14, ATCC 55397, has been deposited at the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA on March 5, 1993. Strains ATCC Nos. 43948, 43949, 43950, 43951, and 43952 were obtained from the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA and have been redeposited at the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA on November 5, 1996, and the second deposits are identified as ATCC Nos. 55878, 55879, 55880, 55881 and 55882, respectively. It is possible that other sources could harbor Photorhabdus bacteria that produce insecticidal toxins. Such sources in the environment could be either terrestrial or aquatic based.
The genus Photorhabdus is taxonomically defined as a member of the family Enterobacteriaceae, although it has certain traits atypical of this family. For example, strains of this genus are nitrate reduction negative, yellow and red pigment producing and bioluminescent. This latter trait is otherwise unknown within the Enterobacteriaceae. Photorhabdus has only recently been described as a genus separate from Xenorhabdus (Boemare et al., 1993 Int. J. Syst. Bacteriol. 43, 249-255).
This differentiation is based on DNA-DNA hybridization studies, phenotypic differences (e.g., presence (Photorhabdus) or absence (Xenorhabdus) of catalase and bioluminescence) and the family of the nematode host (Xenorhabdus: Steinernematidae; Photorhabdus: Heterorhabditidae). Comparative cellular fatty-acid analyses (Janse et al. 1990, Lett. Appl. Microbiol. 10, 131-135; Suzuki et al. 1990, J. Gen. Appl. Microbiol., 36, 393-401) support the separation of Photorhabdus from Xenorhabdus.
In order to establish that the strain collection disclosed herein was comprised of Photorhabdus strains, the strains were characterized based on recognized traits which define Photorhabdus and differentiate it from other Enterobacteriaceae and Xenorhabdus species (Farmer, 1984 Bergey's Manual of Systematic Bacteriology Vol. 1 pp. 510-511; Akhurst and Boemare 1988, J. Gen. Microbiol. 134 pp. 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255, which are incorporated herein by reference). The traits studied were the following: gram stain negative rods, organism size, colony pigmentation, inclusion bodies, presence of catalase, ability to reduce nitrate, bioluminescence, dye uptake, gelatin hydrolysis, growth on selective media, growth temperature, survival under anaerobic conditions and motility. Fatty acid analysis was used to confirm that the strains herein all belong to the single genus Photorhabdus.
Currently, the bacterial genus Photorhabdus is comprised of a single defined species, Photorhabdus luminescens (ATCC Type strain #29999, Poinar et al., 1977, Nematologica 23, 97-102). A variety of related strains have been described in the literature (e.g. Akhurst et al. 1988 J. Gen. Microbiol., 134, 1835-1845; Boemare et al. 1993 Int. J. Syst. Bacteriol. 43 pp. 249-255; Putz et al. 1990, Appl. Environ. Microbiol., 56, 181-186). Numerous Photorhabdus strains have been characterized herein. Such strains are listed in Table 18 in the Examples. Because there is currently only one species (luminescens) defined within the genus Photorhabdus, the luminescens species traits were used to characterize the strains herein. As can be seen in Fig. 5, these strains are quite diverse. It is not unforeseen that in the future there may be other Photorhabdus species that will have some of the attributes of the luminescens species as well as some different characteristics that are presently not defined as a trait of Photorhabdus luminescens. However, the scope of the invention herein extends to any Photorhabdus species or strains which produce proteins that have functional activity as insect control agents, regardless of other traits and characteristics.
Furthermore, as is demonstrated herein, the bacteria of the genus Photorhabdus produce proteins that have functional activity as defined herein. Of particular interest are proteins produced by the species Photorhabdus luminescens. The inventions herein should in no way be limited to the strains which are disclosed herein. These strains illustrate for the first time that proteins produced by diverse isolates of Photorhabdus are toxic upon exposure to insects. Thus, included within the inventions described herein are the strains specified herein and any mutants thereof, as well as any strains or species of the genus Photorhabdus that have the functional activity described herein.
There are several terms that are used herein that have a particular meaning and are as follows:
By "functional activity" it is meant herein that the protein toxins function as insect control agents in that the proteins are orally active, or have a toxic effect, or are able to disrupt or deter feeding, which may or may not cause death of the insect. When an insect comes into contact with an effective amount of toxin delivered via transgenic plant expression, formulated protein composition(s), sprayable protein composition(s), a bait matrix or other delivery system, the results are typically death of the insect, or the insects do not feed upon the source which makes the toxins available to the insects.
The protein toxins discussed herein are typically referred to as "insecticides". By insecticides it is meant herein that the protein toxins have a "functional activity" as further defined herein and are used as insect control agents.
By the use of the term "oligonucleotides" it is meant a macromolecule consisting of a short chain of nucleotides of either RNA or DNA. Such length could be at least one nucleotide, but typically is in the range of about 10 to about 12 nucleotides. The determination of the length of the oligonucleotide is well within the skill of an artisan and should not be a limitation herein. Therefore, oligonucleotides may be less than 10 or greater than 12 nucleotides in length.
By the use of the term "toxic" or "toxicity" as used herein it is meant that the toxins produced by Photorhabdus have "functional activity" as defined herein.
By the use of the term "genetic material" herein, it is meant to include all genes, nucleic acid, DNA and RNA.
Fermentation broths from selected strains reported in Table 18 were used to determine the following: breadth of insecticidal toxin production by the Photorhabdus genus, the insecticidal spectrum of these toxins, and to provide source material to purify the toxin complexes. The strains characterized herein have been shown to have oral toxicity against a variety of insect orders. Such insect orders include but are not limited to Coleoptera, Homoptera, Lepidoptera, Diptera, Acarina, Hymenoptera and Dictyoptera.
By the use of the term "Photorhabdus toxin" it is meant any protein produced by a Photorhabdus microorganism strain which has functional activity against insects, where the Photorhabdus toxin could be formulated as a sprayable composition, expressed by a transgenic plant, formulated as a bait matrix, delivered via a Baculovirus, or delivered by any other applicable host or delivery system.
As with other bacterial toxins, the rate of mutation of the bacteria in a population causes many related toxins slightly different in sequence to exist. Toxins of interest here are those which produce protein complexes toxic to a variety of insects upon exposure, as described herein. Preferably, the toxins are active against Lepidoptera, Coleoptera, Homoptera, Diptera, Hymenoptera, Dictyoptera and Acarina. The inventions herein are intended to capture the protein toxins homologous to protein toxins produced by the strains herein and any derivative strains thereof, as well as any protein toxins produced by Photorhabdus. These homologous proteins may differ in sequence, but do not differ in function from those toxins described herein. Homologous toxins are meant to include protein complexes of between 300 kDa to 2,000 kDa and are comprised of at least two (2) subunits, where a subunit is a peptide which may or may not be the same as the other subunit. Various protein subunits have been identified and are taught in the Examples herein.
Typically, the protein subunits are between about 18 kDa to about 230 kDa; between about 160 kDa to about 230 kDa; 100 kDa to 160 kDa; about 80 kDa to about 100 kDa; and about 50 kDa to about 80 kDa.
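To make the recited subunit size ranges concrete, the following Python sketch assigns a subunit mass to one of them. The ranges are taken from the text as read here (overall about 18 kDa to about 230 kDa, with the four narrower ranges); the function and constant names are hypothetical, and because the text gives only "about" values, the handling of masses that fall exactly on a boundary is an arbitrary choice for this sketch.

```python
# Narrower ranges recited in the text, in kDa, as (low, high) pairs.
SUBUNIT_RANGES = [(160, 230), (100, 160), (80, 100), (50, 80)]
OVERALL_RANGE = (18, 230)


def subunit_range(mass_kda):
    """Return the recited (low, high) range containing mass_kda, or None.

    A mass on a shared boundary is assigned to the higher range,
    which is an arbitrary convention adopted for this illustration.
    """
    low, high = OVERALL_RANGE
    if not low <= mass_kda <= high:
        return None
    for lo, hi in SUBUNIT_RANGES:
        if lo <= mass_kda <= hi:
            return (lo, hi)
    # Within the overall range but below the narrower ranges.
    return OVERALL_RANGE


print(subunit_range(201))
print(subunit_range(30))
```

A mass outside the overall range returns None, which a caller could use to flag a band that does not match any of the typical subunit sizes.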
As discussed above, some Photorhabdus strains can be isolated from nematodes. Some nematodes, elongated cylindrical parasitic worms of the phylum Nematoda, have evolved an ability to exploit insect larvae as a favored growth environment. The insect larvae provide a source of food for growing nematodes and an environment in which to reproduce. One dramatic effect that follows invasion of larvae by certain nematodes is larval death. Larval death results from the presence of, in certain nematodes, bacteria that produce an insecticidal toxin which arrests larval growth and inhibits feeding activity.
Interestingly, it appears that each genus of insect parasitic nematode hosts a particular species of bacterium, uniquely adapted for symbiotic growth with that nematode. In the interim since this research was initiated, the bacterial genus Xenorhabdus was reclassified into the genera Xenorhabdus and Photorhabdus. Bacteria of the genus Photorhabdus are characterized as being symbionts of Heterorhabditis nematodes while Xenorhabdus species are symbionts of the Steinernema species. This change in nomenclature is reflected in this specification, but in no way should a change in nomenclature alter the scope of the inventions described herein.
The peptides and genes that are disclosed herein are named according to the guidelines recently published in the Journal of Bacteriology "Instructions to Authors" p. i-xii (Jan. 1996), which is incorporated herein by reference. The following peptides and genes were isolated from Photorhabdus strain W-14.
Peptide / Gene Nomenclature
Toxin complex (Tc)

Peptide Name   Gene Name   Patent Sequence ID#
tca genomic region
TcaA           tcaA        12
TcaAiii        tcaA        4
TcaBi          tcaB        3 (19, 20)
TcaBii         tcaB        5
TcaC           tcaC        2
tcb genomic region
TcbA           tcbA        16
TcbAi          tcbA        (pro-peptide)
TcbAii         tcbA        1 (21, 22, 23, 24)
TcbAiii        tcbA        40
tcc genomic region
TccA           tccA        8
TccB           tccB        7
tcd genomic region
TcdAi          tcdA        (pro-peptide)
TcdAii         tcdA        13 (38, 39, 17, 18)
TcdAiii        tcdA        41 (42, 43)
TcdB           tcdB        14

(bracketed sequence indicates internal amino acid sequence obtained by tryptic digests)
The sequences listed above are grouped by genomic region.
The tcbA gene was expressed in E. coli as two protein fragments, TcbA and TcbAiii, as illustrated in the Examples. It may be beneficial to have proteolytic clipping of some sequences to obtain the higher activity of the toxins for commercial transgenic applications.
The toxins described herein are quite unique in that the toxins have functional activity, which is key to developing an insect management strategy. In developing an insect management strategy, it is possible to delay or circumvent the protein degradation process by injecting a protein directly into an organism, avoiding its digestive tract. In such cases, the protein administered to the organism will retain its function until it is denatured, non-specifically degraded, or eliminated by the immune system in higher organisms. Injection of an insecticidal toxin into insects has potential application only in the laboratory, and then only on large insects which are easily injected. The observation that the insecticidal protein toxins herein described exhibit their toxic activity after oral ingestion or contact with the toxins permits the development of an insect management plan based solely on the ability to incorporate the protein toxins into the insect diet. Such a plan could result in the production of insect baits.
The Photorhabdus toxins may be administered to insects in a purified form. The toxins may also be delivered in amounts from about 1 to about 100 mg / liter of broth. This may vary upon formulation condition, conditions of the inoculum source, techniques for isolation of the toxin, and the like. The toxins may be administered as an exudate secretion or cellular protein originally expressed in a heterologous prokaryotic or eukaryotic host. Bacteria are typically the hosts in which proteins are expressed. Eukaryotic hosts could include but are not limited to plants, insects and yeast. Alternatively, the toxins may be produced in bacteria or transgenic plants in the field or in the insect by a baculovirus vector. Typically the toxins will be introduced to the insect by incorporating one or more of the toxins into the insects' feed.
Complete lethality to feeding insects is useful but is not required to achieve useful toxicity. If the insects avoid the toxin or cease feeding, that avoidance will be useful in some applications, even if the effects are sublethal. For example, if insect resistant transgenic crop plants are desired, a reluctance of insects to feed on the plants is as useful as lethal toxicity to the insects since the ultimate objective is protection of the plants rather than killing the insect.
There are many other ways in which toxins can be incorporated into an insect's diet. As an example, it is possible to adulterate the larval food source with the toxic protein by spraying the food with a protein solution, as disclosed herein. Alternatively, the purified protein could be genetically engineered into an otherwise harmless bacterium, which could then be grown in culture, and either applied to the food source or allowed to reside in the soil in an area in which insect eradication was desirable. Also, the protein could be genetically engineered directly into an insect food source. For instance, the major food source of many insect larvae is plant material.
By incorporating genetic material that encodes the insecticidal properties of the Photorhabdus toxins into the genome of a plant eaten by a particular insect pest, the adult or larvae would die after consuming the food plant. Numerous members of the monocotyledonous and dicotyledonous genera have been transformed. Transgenic agronomic crops as well as fruits and vegetables are of commercial interest. Such crops include but are not limited to maize, rice, soybeans, canola, sunflower, alfalfa, sorghum, wheat, cotton, peanuts, tomatoes, potatoes, and the like. Several techniques exist for introducing foreign genetic material into plant cells, and for obtaining plants that stably maintain and express the introduced gene. Such techniques include acceleration of genetic material coated onto microparticles directly into cells (U.S. Patents 4,945,050 to Cornell and 5,141,131 to DowElanco). Plants may be transformed using Agrobacterium technology, see U.S. Patent 5,177,010 to University of Toledo, 5,104,310 to Texas A&M, European Patent Application 0131624B1, European Patent Applications 120516, 159418B1 and 176,112 to Schilperoort, U.S. Patents 5,149,645, 5,469,976, 5,464,763 and 4,940,838 and 4,693,976 to Schilperoort, European Patent Applications 116718, 290799, 320500 all to Max Planck, European Patent Applications 604662 and 627752 to Japan Tobacco, European Patent Applications 0267159 and 0292435 and U.S. Patent 5,231,019 all to Ciba Geigy, U.S. Patents 5,463,174 and 4,762,785 both to Calgene, and U.S. Patents 5,004,863 and 5,159,135 both to Agracetus. Other transformation technology includes whiskers technology, see U.S. Patents 5,302,523 and 5,464,765 both to Zeneca. Electroporation technology has also been used to transform plants, see WO 87/06614 to Boyce Thompson Institute, 5,472,869 and 5,384,253 both to Dekalb, WO9209696 and WO9321335 both to PGS. All of these transformation patents and publications are incorporated by reference.
In addition to numerous technologies for transforming plants, the type of tissue which is contacted with the foreign genes may vary as well. Such tissue would include but would not be limited to embryogenic tissue, callus tissue type I and II, hypocotyl, meristem, and the like. Almost all plant tissues may be transformed during dedifferentiation using appropriate techniques within the skill of an artisan.
Another variable is the choice of a selectable marker. The preference for a particular marker is at the discretion of the artisan, but any of the following selectable markers may be used along with any other gene not listed herein which could function as a selectable marker. Such selectable markers include but are not limited to the aminoglycoside phosphotransferase gene of transposon Tn5 (Aph II), which encodes resistance to the antibiotics kanamycin, neomycin and G418, as well as those genes which code for resistance or tolerance to glyphosate; hygromycin; methotrexate; phosphinothricin (bialaphos); imidazolinones, sulfonylureas and triazolopyrimidine herbicides, such as chlorsulfuron; bromoxynil, dalapon and the like.
In addition to a selectable marker, it may be desirable to use a reporter gene. In some instances a reporter gene may be used without a selectable marker. Reporter genes are genes which are typically not present or expressed in the recipient organism or tissue. The reporter gene typically encodes a protein which provides for some phenotypic change or enzymatic property. Examples of such genes are provided in Weising et al., Ann. Rev. Genetics, 22, 421 (1988), which is incorporated herein by reference. A preferred reporter gene is the glucuronidase (GUS) gene.
Regardless of transformation technique, the gene is preferably incorporated into a gene transfer vector adapted to express the Photorhabdus toxins in the plant cell by including in the vector a plant promoter. In addition to plant promoters, promoters from a variety of sources can be used efficiently in plant cells to express foreign genes. For example, promoters of bacterial origin, such as the octopine synthase promoter, the nopaline synthase promoter, and the mannopine synthase promoter, and promoters of viral origin, such as the cauliflower mosaic virus (35S and 19S) promoters and the like, may be used. Plant promoters include, but are not limited to, ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu), beta-conglycinin promoter, phaseolin promoter, ADH promoter, heat-shock promoters and tissue specific promoters. Promoters may also contain certain enhancer sequence elements that may improve the transcription efficiency. Typical enhancers include but are not limited to Adh-intron 1 and Adh-intron 6. Constitutive promoters may be used. Constitutive promoters direct continuous gene expression in all cell types and at all times (e.g., actin, ubiquitin, CaMV 35S). Tissue specific promoters are responsible for gene expression in specific cell or tissue types, such as the leaves or seeds (e.g., zein, oleosin, napin, ACP), and these promoters may also be used. Promoters may also be active during a certain stage of the plant's development as well as active in particular plant tissues and organs. Examples of such promoters include but are not limited to pollen-specific, embryo specific, corn silk specific, cotton fiber specific, root specific, seed endosperm specific promoters and the like.
Under certain circumstances it may be desirable to use an inducible promoter. An inducible promoter is responsible for expression of genes in response to a specific signal, such as: physical stimulus (heat shock genes); light (RUBP carboxylase); hormone (Em); metabolites; and stress. Other desirable transcription and translation elements that function in plants may be used. Numerous plant-specific gene transfer vectors are known to the art.
In addition, it is known that to obtain high expression of bacterial genes in plants it is preferred to reengineer the bacterial genes so that they are more efficiently expressed in the cytoplasm of plants. Maize is one such plant where it is preferred to reengineer the bacterial gene(s) prior to transformation to increase the expression level of the toxin in the plant. One reason for the reengineering is the very low G+C content of the native bacterial gene(s) (and consequent skewing towards high A+T content). This results in the generation of sequences mimicking or duplicating plant gene control sequences that are known to be highly A+T rich. The presence of some A+T-rich sequences within the DNA of the gene(s) introduced into plants (e.g., TATA box regions normally found in gene promoters) may result in aberrant transcription of the gene(s). On the other hand, the presence of other regulatory sequences residing in the transcribed mRNA (e.g., polyadenylation signal sequences (AAUAAA), or sequences complementary to small nuclear RNAs involved in pre-mRNA splicing) may lead to RNA instability.
Therefore, one goal in the design of reengineered bacterial gene(s), more preferably referred to as plant optimized gene(s), is to generate a DNA sequence having a higher G+C content, and preferably one close to that of plant genes coding for metabolic enzymes. Another goal in the design of the plant optimized gene(s) is to generate a DNA sequence that not only has a higher G+C content, but in which the sequence changes are made so as not to hinder translation.
An example of a plant that has a high G+C content is maize. The table below illustrates how high the G+C content is in maize. As in maize, it is thought that G+C content in other plants is also high.
Table 1. Compilation of G+C contents of protein coding regions of maize genes
a Number of genes in class given in parentheses.
b Standard deviations given in parentheses.
c Combined groups mean ignored in calculation of overall mean.
For the data in Table 1, coding regions of the genes were extracted from GenBank (Release 71) entries, and base compositions were calculated using the MacVector™ program (IBI, New Haven, CT). Intron sequences were ignored in the calculations. Group I and II storage protein gene sequences were distinguished by their marked difference in base composition.
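As a rough illustration of the base composition calculation described above, the following Python sketch computes the G+C percentage of a coding sequence. It is not the MacVector program used for Table 1; the function name gc_content and the short test sequences are assumptions made only for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the A, C, G, T bases of seq.

    Hypothetical helper for illustration; characters other than
    A, C, G and T (for example N or whitespace) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy sequences, not taken from any maize or Photorhabdus gene.
    print(gc_content("ATGC"))
    print(gc_content("ATATATGC"))
```

In practice the input would be the coding region only, with intron sequences removed beforehand, as was done for the data in Table 1.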
Due to the plasticity afforded by the redundancy of the genetic code (i.e., some amino acids are specified by more than one codon), evolution of the genomes of different organisms or classes of organisms has resulted in differential usage of redundant codons. This "codon bias" is reflected in the mean base composition of protein coding regions. For example, organisms with relatively low G+C contents utilize codons having A or T in the third position of redundant codons, whereas those having higher G+C contents utilize codons having G or C in the third position. It is thought that the presence of "minor" codons within a gene's mRNA may reduce the absolute translation rate of that mRNA, especially when the relative abundance of the charged tRNA corresponding to the minor codon is low. An extension of this is that the diminution of translation rate by individual minor codons would be at least additive for multiple minor codons. Therefore, mRNAs having high relative contents of minor codons would have correspondingly low translation rates. This rate would be reflected by the synthesis of low levels of the encoded protein.
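The link between codon bias and base composition can be made concrete by looking only at the third position of each codon. The sketch below is a minimal illustration, assuming an in-frame coding sequence whose length is a multiple of three; the function name gc3_content and the example sequence are hypothetical.

```python
def gc3_content(cds: str) -> float:
    """Return the percentage of codons whose third base is G or C.

    Assumes cds is an in-frame coding sequence made of whole codons.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    gc = sum(1 for b in third_bases if b in "GC")
    return 100.0 * gc / len(third_bases)


if __name__ == "__main__":
    # Two alanine codons, GCA and GCC: one of the two ends in G or C.
    print(gc3_content("GCAGCC"))
```

A gene from a low G+C organism would be expected to score low on this measure, and a plant optimized version of the same gene to score higher.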
In order to reengineer the bacterial gene(s), the codon bias of the plant is determined. The codon bias is the statistical codon distribution that the plant uses for coding its proteins. After determining the bias, the percent frequency of the codons in the gene(s) of interest is determined. The primary codons preferred by the plant should be determined as well as the second and third choice of preferred codons. The amino acid sequence of the protein of interest is reverse translated so that the resulting nucleic acid sequence codes for the same protein as the native bacterial gene, but the resulting nucleic acid sequence corresponds to the first preferred codons of the desired plant. The new sequence is analyzed for restriction enzyme sites that might have been created by the modification. The identified sites are further modified by replacing the codons with second or third choice preferred codons. Other sites in the sequence which could affect the transcription or translation of the gene of interest are the exon:intron 5' or 3' junctions, poly A addition signals, or RNA polymerase termination signals. The sequence is further analyzed and modified to reduce the frequency of TA or GC doublets. In addition to the doublets, G or C sequence blocks that have more than about four residues that are the same can affect transcription of the sequence. Therefore, these blocks are also modified by replacing the codons of first or second choice, etc. with the next preferred codon of choice. It is preferred that the plant optimized gene(s) contain about 63% of first choice codons, between about 22% and about 37% second choice codons, and between 15% and 0% third choice codons, wherein the total percentage is 100%. Most preferably, the plant optimized gene(s) contain about 63% of first choice codons, at least about 22% second choice codons, about 7.5% third choice codons, and about 7.5% fourth choice codons, wherein the total percentage is 100%.
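The first two steps of this procedure (reverse translation with first choice codons, then removal of an unwanted restriction site by falling back to the next preferred codon) can be sketched in Python as follows. This is only an illustrative sketch: the ranked codon table is hypothetical and is not a measured maize codon bias table, the function names are invented for this example, and the later steps (doublet frequency, G or C blocks, splice and poly A signals) are not implemented.

```python
# Hypothetical ranked codon choices, most preferred first.
# These rankings are placeholders, NOT a real plant codon bias table.
RANKED_CODONS = {
    "M": ["ATG"],
    "E": ["GAG", "GAA"],
    "L": ["CTC", "CTG", "CTT"],
    "K": ["AAG", "AAA"],
}


def reverse_translate(protein, ranks=None):
    """Build a DNA sequence using, for each residue, the codon of the given rank."""
    if ranks is None:
        ranks = [0] * len(protein)  # rank 0 = first choice codon everywhere
    return "".join(RANKED_CODONS[aa][r] for aa, r in zip(protein, ranks))


def remove_site(protein, site):
    """Greedily demote codons overlapping `site` until the site is gone."""
    ranks = [0] * len(protein)
    dna = reverse_translate(protein, ranks)
    pos = dna.find(site)
    while pos != -1:
        first = pos // 3                    # first codon touched by the site
        last = (pos + len(site) - 1) // 3   # last codon touched by the site
        for i in range(first, last + 1):
            if ranks[i] + 1 < len(RANKED_CODONS[protein[i]]):
                ranks[i] += 1               # fall back to the next choice codon
                break
        else:
            raise ValueError("site cannot be removed with the available codons")
        dna = reverse_translate(protein, ranks)
        pos = dna.find(site)
    return dna


if __name__ == "__main__":
    peptide = "MELK"  # toy peptide, not part of any toxin sequence
    print(reverse_translate(peptide))
    print(remove_site(peptide, "GAGCTC"))  # GAGCTC is the SacI recognition site
```

With this toy table, the first choice codons for Glu-Leu happen to create a SacI site, and demoting the glutamate codon to its second choice removes it without changing the encoded peptide. A fuller implementation would also track the resulting proportions of first, second and third choice codons against the preferred ranges given above.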
The method described above enables one skilled in the art to modify gene(s) that are foreign to a particular plant so that the genes are optimally expressed in plants. The method is further illustrated in pending provisional application U.S. 60/005,405 filed on October 13, 1995, which is incorporated herein by reference.
Thus, in order to design plant optimized gene(s), the amino acid sequence of the toxins is reverse translated into a DNA sequence, utilizing a nonredundant genetic code established from a codon bias table compiled for the gene DNA sequence for the particular plant being transformed. The resulting DNA sequence, which is completely homogeneous in codon usage, is further modified to establish a DNA sequence that, besides having a higher degree of codon diversity, also contains strategically placed restriction enzyme recognition sites, desirable base composition, and a lack of sequences that might interfere with transcription of the gene, or translation of the product mRNA.
It is theorized that bacterial genes may be more easily expressed in plants if the bacterial genes are expressed in the plastids. Thus, it may be possible to express bacterial genes in plants, without optimizing the genes for plant expression, and obtain high expression of the protein. See U.S. Patent Nos. 4,762,785; 5,451,513 and 5,545,817, which are incorporated herein by reference.
One of the issues regarding commercial exploitation of transgenic plants is resistance management. This is of particular concern with Bacillus thuringiensis toxins. Numerous companies are commercially exploiting Bacillus thuringiensis, and there has been much concern about insects becoming resistant to Bt toxins. One strategy for insect resistance management would be to combine the toxins produced by Photorhabdus with toxins such as Bt, vegetative insect proteins (Ciba Geigy) or other toxins. The combinations could be formulated for a sprayable application or could be molecular combinations. Plants could be transformed with Photorhabdus genes that produce insect toxins together with other insect toxin genes such as Bt.
European Patent Application 0400246A1 describes transformation of a plant with 2 Bt genes, which could be any 2 genes. Another way to produce a transgenic plant that contains more than one insect resistance gene would be to produce two plants, with each plant containing an insect resistance gene. These plants would be backcrossed using traditional plant breeding techniques to produce a plant containing more than one insect resistance gene.
In addition to producing a transformed plant containing plant optimized gene(s), there are other delivery systems where it may be desirable to reengineer the bacterial gene(s). Along the same lines, a genetically engineered, easily isolated protein toxin fusing together both a molecule attractive to insects as a food source and the insecticidal activity of the toxin may be engineered and expressed in bacteria or in eukaryotic cells using standard, well-known techniques. After purification in the laboratory such a toxic agent with "built-in" bait could be packaged inside standard insect trap housings.
Another delivery scheme is the incorporation of the genetic material of toxins into a baculovirus vector. Baculoviruses infect particular insect hosts, including those desirably targeted with the Photorhabdus toxins. Infectious baculovirus harboring an expression construct for the Photorhabdus toxins could be introduced into areas of insect infestation to thereby intoxicate or poison infected insects.
Transfer of the insecticidal properties requires nucleic acid sequences encoding the amino acid sequences for the Photorhabdus toxins integrated into a protein expression vector appropriate to the host in which the vector will reside. One way to obtain a nucleic acid sequence encoding a protein with insecticidal properties is to isolate the native genetic material which produces the toxins from Photorhabdus, using information deduced from the toxin's amino acid sequence, large portions of which are set forth below. As described below, methods of purifying the proteins responsible for toxin activity are also disclosed.
Using N-terminal amino acid sequence data, such as set forth below, one can construct oligonucleotides complementary to all, or a section of, the DNA bases that encode the first amino acids of the toxin. These oligonucleotides can be radiolabeled and used as molecular probes to isolate the genetic material from a genomic genetic library built from genetic material isolated from strains of Photorhabdus. The genetic library can be cloned in plasmid, cosmid, phage or phagemid vectors. The library could be transformed into Escherichia coli and screened for toxin production by the transformed cells using antibodies raised against the toxin or direct assays for insect toxicity.
This approach requires the production of a battery of oligonucleotides, since the degenerate genetic code allows an amino acid to be encoded in the DNA by any of several three-nucleotide combinations. For example, the amino acid arginine can be encoded by nucleic acid triplets CGA, CGC, CGG, CGT, AGA, and AGG. Since one cannot predict which triplet is used at those positions in the toxin gene, one must prepare oligonucleotides with each potential triplet represented. More than one DNA molecule corresponding to a protein subunit may be necessary to construct a sufficient number of oligonucleotide probes to recover all of the protein subunits necessary to achieve oral toxicity .
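The size of the required battery of oligonucleotides follows directly from the degeneracy of the code: the number of distinct coding sequences for a peptide is the product of the number of codons for each residue. The sketch below illustrates this for a short, hypothetical peptide. Only a few amino acids of the standard genetic code are included, the peptide is not taken from the toxin sequence, and the function names are assumptions for illustration. The sequences listed are coding-strand sequences; a probe complementary to them would be their reverse complement.

```python
from itertools import product
from math import prod

# Subset of the standard genetic code (DNA codons), enough for the example.
CODONS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "D": ["GAT", "GAC"],
    "R": ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"],
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that can encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def coding_oligos(peptide: str) -> list:
    """Enumerate every coding-strand oligonucleotide for the peptide."""
    return ["".join(codons) for codons in product(*(CODONS[aa] for aa in peptide))]


if __name__ == "__main__":
    peptide = "MRK"  # hypothetical tripeptide used only as an example
    print(degeneracy(peptide))
    for oligo in coding_oligos(peptide):
        print(oligo)
```

Because the count grows multiplicatively, probe design usually favors stretches of sequence rich in low-degeneracy residues such as methionine and tryptophan.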
From the amino acid sequence of the purified protein, genetic materials responsible for the production of toxins can readily be isolated and cloned, in whole or in part, into an expression vector using any of several techniques well-known to one skilled in the art of molecular biology. A typical expression vector is a DNA plasmid, though other transfer means including, but not limited to, cosmids, phagemids and phage are also envisioned. In addition to features required or desired for plasmid replication, such as an origin of replication and antibiotic resistance or other form of a selectable marker such as the bar gene of Streptomyces hygroscopicus or viridochromogenes, protein expression vectors normally additionally require an expression cassette which incorporates the cis-acting sequences necessary for transcription and translation of the gene of interest. The cis-acting sequences required for expression in prokaryotes differ from those required in eukaryotes and plants.
A eukaryotic expression cassette requires a transcriptional promoter upstream (5') to the gene of interest, a transcriptional termination region such as a poly-A addition site, and a ribosome binding site upstream of the gene of interest's first codon. In bacterial cells, a useful transcriptional promoter that could be included in the vector is the T7 RNA Polymerase-binding promoter. Promoters, as previously described herein, are known to efficiently promote transcription of mRNA. Also upstream from the gene of interest the vector may include a nucleotide sequence encoding a signal sequence known to direct a covalently linked protein to a particular compartment of the host cells such as the cell surface.
Insect viruses, or baculoviruses, are known to infect and adversely affect certain insects. The effect of the viruses on insects is slow, and viruses do not stop the feeding of insects. Thus viruses are not viewed as being useful as insect pest control agents. Combining the Photorhabdus toxin genes into a baculovirus vector could provide an efficient way of transmitting the toxins while increasing the lethality of the virus. In addition, since different baculoviruses are specific to different insects, it may be possible to use a particular toxin to selectively target particularly damaging insect pests. A particularly useful vector for the toxin genes is the nuclear polyhedrosis virus. Transfer vectors using this virus have been described and are now the vectors of choice for transferring foreign genes into insects. The virus-toxin gene recombinant may be constructed in an orally transmissible form. Baculoviruses normally infect insect victims through the mid-gut intestinal mucosa. The toxin gene inserted behind a strong viral coat protein promoter would be expressed and should rapidly kill the infected insect.
In addition to an insect virus or baculovirus or transgenic plant delivery system for the protein toxins of the present invention, the proteins may be encapsulated using Bacillus thuringiensis encapsulation technology such as but not limited to U.S. Patent Nos. 4,695,455; 4,695,462; 4,861,595, which are all incorporated herein by reference. Another delivery system for the protein toxins of the present invention is formulation of the protein into a bait matrix, which could then be used in above and below ground insect bait stations. Examples of such technology include but are not limited to PCT Patent Application WO 93/23998, which is incorporated herein by reference.
As is described above, it might become necessary to modify the sequence encoding the protein when expressing it in a non-native host, since the codon preferences of other hosts may differ from that of Photorhabdus. In such a case, translation may be quite inefficient in a new host unless compensating modifications to the coding sequence are made. Additionally, modifications to the amino acid sequence might be desirable to avoid inhibitory cross-reactivity with proteins of the new host, or to refine the insecticidal properties of the protein in the new host. A genetically modified toxin gene might encode a toxin exhibiting, for example, enhanced or reduced toxicity, altered insect resistance development, altered stability, or modified target species specificity.
In addition to the Photorhabdus genes encoding the toxins, the scope of the present invention is intended to include related nucleic acid sequences which encode amino acid biopolymers homologous to the toxin proteins and which retain the toxic effect of the Photorhabdus proteins in insect species after oral ingestion.
For instance, the toxins used in the present invention seem to first inhibit larval feeding before death ensues. By manipulating the nucleic acid sequence of Photorhabdus toxins or its controlling sequences, genetic engineers placing the toxin gene into plants could modulate its potency or its mode of action to, for example, keep the eating-inhibitory activity while eliminating the absolute toxicity to the larvae. This change could permit the transformed plant to survive until harvest without having the unnecessarily dramatic effect on the ecosystem of wiping out all target insects. All such modifications of the gene encoding the toxin, or of the protein encoded by the gene, are envisioned to fall within the scope of the present invention.
Other envisioned modifications of the nucleic acid include the addition of targeting sequences to direct the toxin to particular parts of the insect larvae for improving its efficiency.
Strains ATCC 55397, 43948, 43949, 43950, 43951, 43952 have been deposited in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA. Strain W-14, ATCC 55397, was deposited at the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA on March 5, 1993. Strains ATCC Nos. 43948, 43949, 43950, 43951, 43952 were obtained from the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA and were redeposited at the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852 USA on November 5, 1996, and the second deposits are identified as ATCC Nos. 55878, 55879, 55880, 55881 and 55882, respectively. Amino acid and nucleotide sequence data for the W-14 native toxin (ATCC 55397) are presented below. Isolation of the genomic DNA for the toxins from the bacterial hosts is also exemplified herein.
Standard and molecular biology techniques were followed and taught in the specification herein. Additional information may be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989), Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, which is incorporated herein by reference.
The following abbreviations are used throughout the Examples: Tris = tris(hydroxymethyl)aminomethane; SDS = sodium dodecyl sulfate; EDTA = ethylenediaminetetraacetic acid; IPTG = isopropylthio-β-galactoside; X-gal = 5-bromo-4-chloro-3-indolyl-β-D-galactoside; CTAB = cetyltrimethylammonium bromide; kbp = kilobase pairs; dATP, dCTP, dGTP, dTTP, dITP = 2'-deoxynucleoside 5'-triphosphates of adenine, cytosine, guanine, thymine, and inosine, respectively; ATP = adenosine 5'-triphosphate.
Example 1
Purification of toxin from P. luminescens and demonstration of toxicity after oral delivery of purified toxin
The insecticidal protein toxin of the present invention was purified from P. luminescens strain W-14, ATCC Accession Number 55397. Stock cultures of P. luminescens were maintained on petri dishes containing 2% Proteose Peptone No. 3 (i.e., PP3, Difco Laboratories, Detroit MI) in 1.5% agar, incubated at 25°C and transferred weekly. Colonies of the primary form of the bacteria were inoculated into 200 ml of PP3 broth supplemented with 0.5% polyoxyethylene sorbitan mono-stearate (Tween 60, Sigma Chemical Company, St. Louis MO) in a one liter flask. The broth cultures were grown for 72 hours at 30°C on a rotary shaker. The toxin proteins can be recovered from cultures grown in the presence or absence of Tween; however, the absence of Tween can affect the form of the bacteria grown and the profile of proteins produced by the bacteria. In the absence of Tween, a variant shift occurs insofar as the molecular weight of at least one identified toxin subunit shifts from about 200 kDa to about 135 kDa.
The 72 hour cultures were centrifuged at 10,000 x g for 30 minutes to remove cells and debris. The supernatant fraction that contained the insecticidal activity was decanted and brought to 50 mM K2HPO4 by adding an appropriate volume of 1.0 M K2HPO4. The pH was adjusted to 8.6 by adding potassium hydroxide. This supernatant fraction was then mixed with DEAE-Sephacel (Pharmacia LKB Biotechnology) which had been equilibrated with 50 mM K2HPO4. The toxic activity was adsorbed to the DEAE resin. This mixture was then poured into a 2.6 x 40 cm column and washed with 50 mM K2HPO4 at room temperature at a flow rate of 30 ml/hr until the effluent reached a steady baseline UV absorbance at 280 nm. The column was then washed with 150 mM KCl until the effluent again reached a steady 280 nm baseline. Finally the column was washed with 300 mM KCl and fractions were collected.
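The "appropriate volume" of 1.0 M stock follows from a simple mass balance: if a volume v of stock at concentration Cs is added to a sample of volume V that contains none of the solute, the final concentration is Cs·v/(V + v), so v = V·Cf/(Cs − Cf) for a target concentration Cf. The Python sketch below works this out. It assumes the supernatant contributes no phosphate of its own, which may not hold for a real culture broth, and the function name and the example volume are hypothetical.

```python
def stock_volume_to_add(sample_volume_ml: float,
                        stock_mM: float,
                        target_mM: float) -> float:
    """Volume of stock (ml) to add so the mixture reaches target_mM.

    Assumes the sample initially contains none of the solute and that
    volumes are additive; both are simplifying assumptions.
    """
    if not 0 < target_mM < stock_mM:
        raise ValueError("target must be positive and below the stock concentration")
    return sample_volume_ml * target_mM / (stock_mM - target_mM)


if __name__ == "__main__":
    # Example: 950 ml of supernatant, 1.0 M (1000 mM) stock, 50 mM target.
    v = stock_volume_to_add(950.0, 1000.0, 50.0)
    print(f"add {v:.1f} ml of stock to 950.0 ml of supernatant")
```

Note that the simpler estimate V·Cf/Cs slightly underdoses, because it ignores the increase in total volume caused by adding the stock.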
Fractions containing the toxin were pooled and filter sterilized using a 0.2 micron pore membrane filter. The toxin was then concentrated and equilibrated to 100 mM KPO4, pH 6.9, using an ultrafiltration membrane with a molecular weight cutoff of 100 kDa at 4°C (Centriprep 100, Amicon Division-W.R. Grace and Company). A 3 ml sample of the toxin concentrate was applied to the top of a 2.6 x 95 cm Sephacryl S-400 HR gel filtration column (Pharmacia LKB Biotechnology). The eluent buffer was 100 mM KPO4, pH 6.9, which was run at a flow rate of 17 ml/hr, at 4°C. The effluent was monitored at 280 nm.
Fractions were collected and tested for toxic activity.
Toxicity of chromatographic fractions was examined in a biological assay using Manduca sexta larvae. Fractions were either applied directly onto the insect diet (Gypsy moth wheat germ diet, ICN Biochemicals Division - ICN Biomedicals, Inc.) or administered by intrahemocoelic injection of a 5 µl sample through the first proleg of 4th or 5th instar larvae using a 30 gauge needle. The weight of each larva within a treatment group was recorded at 24 hour intervals. Toxicity was presumed if the insect ceased feeding and died within several days of consuming treated insect diet or if death occurred within 24 hours after injection of a fraction.
The toxic fractions were pooled and concentrated using the Centriprep-100 and were then analyzed by HPLC using a 7.5 mm x 60 cm TSK-GEL G-4000 SW gel permeation column with 100 mM potassium phosphate, pH 6.9 eluent buffer running at 0.4 ml/min. This analysis revealed the toxin protein to be contained within a single sharp peak that eluted from the column with a retention time of approximately 33.6 minutes. This retention time corresponded to an estimated molecular weight of 1,000 kDa. Peak fractions were collected for further purification while fractions not containing this protein were discarded. The peak eluted from the HPLC absorbed UV light at 218 and 280 nm but did not absorb at 405 nm. Absorbance at 405 nm was shown to be an attribute of xenorhabdin antibiotic compounds.
Electrophoresis of the pooled peak fractions in a non-denaturing agarose gel (Metaphor Agarose, FMC BioProducts) showed that two protein complexes are present in the peak. The peak material, buffered in 50 mM Tris-HCl, pH 7.0, was separated on a 1.5% agarose stacking gel buffered with 100 mM Tris-HCl at pH 7.0 and 1.9% agarose resolving gel buffered with 200 mM Tris-borate at pH 8.3 under standard buffer conditions (anode buffer 1M Tris-HCl, pH 8.3; cathode buffer 0.025 M Tris, 0.192 M glycine). The gels were run at 13 mA constant current at 15°C until the phenol red tracking dye reached the end of the gel. Two protein bands were visualized in the agarose gels using Coomassie brilliant blue staining.
The slower migrating band was referred to as "protein band 1" and the faster migrating band was referred to as "protein band 2." The two protein bands were present in approximately equal amounts. The Coomassie stained agarose gels were used as a guide to precisely excise the two protein bands from unstained portions of the gels. The excised pieces containing the protein bands were macerated and a small amount of sterile water was added. As a control, a portion of the gel that contained no protein was also excised and treated in the same manner as the gel pieces containing the protein. Protein was recovered from the gel pieces by electroelution into 100 mM Tris-borate, pH 8.3, at constant voltage for two hours. Alternatively, protein was passively eluted from the gel pieces by adding an equal volume of 50 mM Tris-HCl, pH 7.0, to the gel pieces, then incubating at 30°C for 16 hours. This allowed the protein to diffuse from the gel into the buffer, which was then collected.
Results of insect toxicity tests using HPLC-purified toxin (33.6 min. peak) and agarose gel purified toxin demonstrated toxicity of the extracts. Injection of 1.5 µg of the HPLC-purified protein killed larvae within 24 hours. Both protein bands 1 and 2, recovered from agarose gels by passive elution or electroelution, were lethal upon injection. The protein concentration estimated for these samples was less than 50 ng/larva. A comparison of the weight gain and the mortality between the groups of larvae injected with protein bands 1 and 2 indicated that protein band 1 was more toxic by injection delivery.
When HPLC-purified toxin was applied to larval diet at a concentration of 7.5 µg/larva, it caused a halt in larval weight gain (24 larvae tested). The larvae began to feed, but after consuming only a very small portion of the toxin treated diet they began to show pathological symptoms induced by the toxin and ceased feeding. The insect frass became discolored and most larvae showed signs of diarrhea. Significant insect mortality resulted when several 5 µg toxin doses were applied to the diet over a -10 day period.
Agarose-separated protein band 1 significantly inhibited larval weight gain at a dose of 200 ng/larva. Larvae fed similar concentrations of protein band 2 were not inhibited and gained weight at the same rate as the control larvae. Twelve larvae were fed eluted protein and 45 larvae were fed protein-containing agarose pieces. These two sets of data indicate that protein band 1 was orally toxic to Manduca sexca. In this experiment it appeared that protein band 2 was not toxic to Manduca sexca.
Further analysis of protein bands 1 and 2 by SDS-PAGE under denaturing conditions showed that each band was composed of several smaller protein subunits. Proteins were visualized Ly Coomassie brilliant blue staining followed by silver staininy to achieve maximum sensitivity. -24- SUBSTTTUTE SHEET (RULE 26) The procein subunits in the two bands were very similar. Protein band i contains 8 protein subunits of 25.1, 56.2, 60.3, 65.6, 166, 171, 184 and 203 kDa. Protein band 2 had an identical profile except that the 25.1, 60.8, and 65.6 kDa proteins were not present. The 56.2, 60.8, 65.6, and 184 kDa proteins were present in the complex of protein band 1 at approximately equal concentrations and represent 80% or more of the total protein content of that complex.
The native HPLC-purif ied toxin was further characterized as follows. The toxin was heat labile in that after being heated to 60°C for 15 minutes it lost its ability to kill or to inhibit weight gain when injected or fed to M. sexta larvae. Assays were designed to detect lipase, type C phospholipase, nuclease or red blood cell hemolysis activities and were performed with purified toxin. None of these activities were present. Antibiotic zone inhibition assays were also done and the purified toxin failed to inhibit growth of Gram-negative or -positive bacteria, yeast or filamentous fungi, indicating that the toxic is not a xenorhabdin antibiotic .
The native HPLC-purified toxin was tested for ability to kill insects other than Manduca sexta. Table 2 lists insects killed by the HPLC-purified P. luminescens toxin in this study.
Table 2
Insects Killed by P. luminescens Toxin

Common Name         Order         Genus and species       Route of Delivery
Tobacco hornworm    Lepidoptera   Manduca sexta           Oral and injected
Mealworm            Coleoptera    Tenebrio molitor        Oral
Pharaoh ant         Hymenoptera   Monomorium pharaonis    Oral
German cockroach    Dictyoptera   Blattella germanica     Oral and injected
Mosquito            Diptera       Aedes aegypti           Oral

Example 2
Insecticide Utility

The Photorhabdus luminescens utility and toxicity were further characterized. Photorhabdus luminescens (strain W-14) culture broth was produced as follows. The production medium was 2% Bacto Proteose Peptone Number 3 (PP3, Difco Laboratories, Detroit, Michigan) in Milli-Q deionized water. Seed culture flasks consisted of 175 ml medium placed in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput and autoclaved for 20 minutes, T=250°F. Production flasks consisted of 500 ml of medium in a 2.8 liter tribaffled flask with a Delong neck, covered by a Shin-etsu silicon foam closure. These were autoclaved for 45 minutes, T=250°F. The seed culture was incubated at 28°C at 150 rpm in a gyratory shaking incubator with a 2 inch throw. After 16 hours of growth, 1% of the seed culture was placed in the production flask, which was allowed to grow for 24 hours before harvest. Production of the toxin appears to occur during log phase growth. The microbial broth was transferred to a 1 L centrifuge bottle and the cellular biomass was pelleted (30 minutes at 2500 RPM at 4°C, [R.C.F. = ~1600], HG-4L Rotor, RC3 Sorval centrifuge, Dupont, Wilmington, Delaware). The primary broth was chilled at 4°C for 8-16 hours and recentrifuged for at least 2 hours (conditions above) to further clarify the broth by removal of a putative mucopolysaccharide which precipitated upon standing. (An alternative processing method combined both steps and involved the use of a 16 hour clarification centrifugation, same conditions as above.) This broth was then stored at 4°C prior to bioassay or filtration.
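The rotor speed and relative centrifugal force quoted for the clarification step (2500 RPM, R.C.F. of about 1600) can be related through the standard conversion RCF = 1.118 x 10^-5 x r x N^2, with r the rotor radius in cm and N the speed in RPM. The following is only an illustrative sketch: the 22.9 cm radius is an assumption back-calculated from the two quoted figures and is not stated in the text.

```python
def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) for a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def radius_for(rcf_value: float, rpm: float) -> float:
    """Rotor radius (cm) implied by a quoted RCF and speed."""
    return rcf_value / (1.118e-5 * rpm ** 2)


# Assumption: radius inferred from the quoted 2500 RPM and ~1600 x g,
# not a specification taken from the source.
ASSUMED_RADIUS_CM = 22.9

if __name__ == "__main__":
    print(f"RCF at 2500 RPM: {rcf(2500, ASSUMED_RADIUS_CM):.0f} x g")
    print(f"Implied radius: {radius_for(1600, 2500):.1f} cm")
```

The same two helpers can be used to translate the quoted conditions to a different rotor, provided its actual radius is known.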
Photorhabdus culture broth and protein toxin(s) purified from this broth showed activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects. More specifically, activity was seen against corn rootworm (larvae and adult), Colorado potato beetle, and turf grubs, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and weevils. Activity has also been observed against aster leafhopper, which is a member of the order Homoptera.
Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies, and spittle bugs, as well as numerous host-specific aphid species. The broth and purified fractions are also active against beet armyworm, cabbage looper, black cutworm, tobacco budworm, European corn borer, corn earworm, and codling moth, which are members of the order Lepidoptera. Other typical members of this order are clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm, and fall armyworm. Activity is also seen against fruitfly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly, and various mosquito species. Activity is seen against carpenter ant and Argentine ant, which are members of the order that also includes fire ants, odorous house ants, and little black ants.
The broth and purified fractions are useful for reducing populations of insects and were used in a method of inhibiting an insect population. The method may comprise applying to a locus of the insect an effective insect-inactivating amount of the active described. Results are reported in Table 3.
Activity against corn rootworm larvae was tested as follows.
Photorhabdus culture broth (filter sterilized, cell-free) or purified HPLC fractions were applied directly to the surface (~1.5 cm²) of 0.25 ml of artificial diet in 30 µl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from sterilized eggs, with second instar SCR grown on artificial diet, or with second instar Diabrotica virgifera virgifera (Western corn rootworm, WCR) reared on corn seedlings grown in Metromix®. Second instar larvae were weighed prior to addition to the diet. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (4 days for neonate and adult SCR, 2-5 days for WCR larvae, 7-14 days for second instar SCR). Mortality and weight determinations were scored as indicated. Generally, 16 insects per treatment were used in all studies. Control mortalities were as follows: neonate larvae, <5%; adult beetles, 5%.

Activity against Colorado potato beetle was tested as follows. Photorhabdus culture broth or control medium was applied to the surface (~2.0 cm²) of 1.5 ml of standard artificial diet held in the wells of a 24-well tissue culture plate. Each well received 50 µl of treatment and was allowed to air dry.
Individual second instar Colorado potato beetle (Leptinotarsa decemlineata, CPB) larvae were then placed onto the diet and mortality was scored after 4 days. Ten larvae per treatment were used in all studies. Control mortality was 3.3%.
Activity against Japanese beetle grubs and beetles was tested as follows. Turf grubs (Popillia japonica, 2-3rd instar) were collected from infested lawns and maintained in the laboratory in soil/peat mixture with carrot slices added as additional diet. Turf beetles were pheromone-trapped locally and maintained in the laboratory in plastic containers with maple leaves as food. Following application of undiluted Photorhabdus culture broth or control medium to corn rootworm artificial diet (30 µl/1.54 cm², beetles) or carrot slices (larvae), both stages were placed singly in a diet well and observed for any mortality and feeding. In both cases there was a clear reduction in the amount of feeding (and feces production) observed.
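The surface-overlay assays above differ in both aliquot volume and diet surface area, so it can help to express each as a volume per unit area. The short sketch below does only that arithmetic; the function name is illustrative, and the areas are the approximate values quoted in the text.

```python
def surface_dose(volume_ul: float, area_cm2: float) -> float:
    """Treatment volume applied per unit of diet surface (µl per cm²)."""
    return volume_ul / area_cm2


# Values quoted in the assay descriptions (areas are approximate in the source).
ASSAYS = {
    "corn rootworm diet overlay": (30.0, 1.5),
    "Colorado potato beetle diet overlay": (50.0, 2.0),
}

if __name__ == "__main__":
    for name, (volume_ul, area_cm2) in ASSAYS.items():
        print(f"{name}: {surface_dose(volume_ul, area_cm2):.1f} µl/cm²")
```

Because the broth is diluted differently from assay to assay, these figures compare liquid loading only, not toxin dose.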
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 µl of aqueous solution (Photorhabdus culture broth, control medium or H2O) and approximately 20 1-day old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 2 hours after infestation and did not change over the three day observation period. No control mortality was seen.
Activity against fruitflies was tested as follows. Purchased Drosophila melanogaster medium was prepared using 50% dry medium and a 50% liquid of either water, control medium or Photorhabdus culture broth. This was accomplished by placing 8.0 ml of dry medium in each of 3 rearing vials per treatment and adding 8.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 3, 7 and 10 days of exposure. Incorporation of Photorhabdus culture broth into the diet media for fruitfly maggots caused a slight (1~%) but significant reduction in day-10 adult emergence as compared to water and control medium (3% reduction).
Activity against aster leafhopper was tested as follows. The ingestion assay for aster leafhopper (Macrosteles severini) is designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M® square is placed across the top of the dish and secured with an O-ring. A 1 oz. plastic cup is then infested with approximately 7 leafhoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using undiluted Photorhabdus culture broth, the broth and control medium were dialyzed against water to reduce control mortality. Mortality is reported at day 2, where 26.5% control mortality was seen. In the tests using purified fractions (200 mg protein/ml), a final concentration of 5% sucrose was used in all treatments to improve survivability of the aster leafhoppers. The assay was held in an incubator at 28°C, 70% RH with a 16/8 photoperiod. The assay was graded for mortality at 72 hours. Control mortality was 5.5%.
Activity against Argentine ants was tested as follows. A 1.5 ml aliquot of 100% Photorhabdus culture broth, control medium or water was pipetted into 2.0 ml clear glass vials. The vials were plugged with a piece of cotton dental wick that was moistened with the appropriate treatment. Each vial was placed into a separate 60 x 16 mm Petri dish with 8 to 12 adult Argentine ants (Linepithema humile). There were three replicates per treatment. Bioassay plates were held on a laboratory bench, at room temperature under fluorescent ceiling lights. Mortality readings were made after 5 days of exposure. Control mortality was 24%.
Activity against carpenter ant was tested as follows. Black carpenter ant workers (Camponotus pennsylvanicus) were collected from trees on DowElanco property in Indianapolis, IN. Tests with Photorhabdus culture broth were performed as follows. Each plastic bioassay container (7 1/8" x 3") held fifteen workers, a paper harborage and 10 ml of broth or control media in a plastic shot glass. A cotton wick delivered the treatment to the ants through a hole in the shot glass lid. All treatments contained 5% sucrose. Bioassays were held in the dark at room temperature and graded at 19 days. Control mortality was 9%. Assays delivering purified fractions utilized artificial ant diet mixed with the treatment (purified fraction or control solution) at a rate of 0.2 ml treatment/2.0 g diet in a plastic test tube. The final protein concentration of the purified fraction was less than 10 g g diet. Ten ants per treatment, a water source, harborage and the treated diet were placed in sealed plastic containers and maintained in the dark at 27°C in a humidified incubator. Mortality was scored at day 10. No control mortality was seen.
Activity against various lepidopteran larvae was tested as follows. Photorhabdus culture broth or purified fractions were applied directly to the surface (~1.5 cm²) of 0.25 ml of standard artificial diet in 30 µl aliquots following dilution in control medium or 10 mM sodium phosphate buffer, pH 7.0, respectively. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate larva. European corn borer (Ostrinia nubilalis) and corn earworm (Helicoverpa zea) eggs were supplied from commercial sources and hatched in-house, whereas beet armyworm (Spodoptera exigua), cabbage looper (Trichoplusia ni), tobacco budworm (Heliothis virescens), codling moth (Laspeyresia pomonella) and black cutworm (Agrotis ipsilon) larvae were supplied internally. Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period. Mortality and weight determinations were scored at days 5-7 for Photorhabdus culture broth and days 4-7 for the purified fraction. Generally, 16 insects per treatment were used in all studies. Control mortality ranged from 4-12.5% for control medium and was less than 10% for phosphate buffer.

Table 3
Effect of Photorhabdus luminescens (strain W-14) Culture Broth and Purified Toxin Fraction on Mortality and Growth Inhibition of Different Insect Orders/Species
Mort. = mortality, G.I. = growth inhibition, na = not applicable, nt = not tested, a.f. = anti-feedant

Example 3
Insecticide Utility Upon Soil Application

Photorhabdus luminescens (strain W-14) culture broth was shown to be active against corn rootworm when applied directly to soil or a soil-mix (Metromix®). Activity against neonate SCR and WCR in Metromix® was tested as follows (Table 4).
The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 50 gm of dry Metromix®. Twenty neonate SCR or WCR were then placed directly on the roots of the seedling and covered with Metromix®. Upon infestation, the seedlings were then drenched with 50 ml total volume of a diluted broth solution. After drenching, the cups were sealed and left at room temperature in the light for 7 days. Afterwards, the seedlings were washed to remove all Metromix® and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either -, +, ++, or +++, with - representing no damage and +++ representing severe damage.
Activity against neonate SCR in soil was tested as follows (Table 5). The test was run using corn seedlings (United Agriseeds brand CL614) that were germinated in the light on moist filter paper for 6 days. After the roots were approximately 3-6 cm long, a single kernel/seedling was planted in a 591 ml clear plastic cup with 150 gm of soil from a field in Lebanon, IN planted the previous year with corn. This soil had not been previously treated with insecticides. Twenty neonate SCR were then placed directly on the roots of the seedling and covered with soil. After infestation, the seedlings were drenched with 50 ml total volume of a diluted broth solution. After drenching, the unsealed cups were incubated in a high relative humidity chamber (80%) at 78°F. Afterwards, the seedlings were washed to remove all soil and the roots were excised and weighed. Activity was rated as the percentage of corn root remaining relative to the control plants and as leaf damage induced by feeding. Leaf damage was scored visually and rated as either -, +, ++, or +++, with - representing no damage and +++ representing severe damage.
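Each drench is a fixed 50 ml volume at a stated v/v broth concentration, so the volumes of crude broth and diluent follow directly. A minimal sketch of that arithmetic is given below; the function name is illustrative, and the example concentrations are those that appear in Tables 4 and 5.

```python
def drench_volumes(total_ml: float, broth_percent_vv: float) -> tuple[float, float]:
    """Return (broth_ml, diluent_ml) for a drench of given total volume and % v/v."""
    broth_ml = total_ml * broth_percent_vv / 100.0
    return broth_ml, total_ml - broth_ml


if __name__ == "__main__":
    # Concentrations taken from the drench tables; 50 ml total per cup.
    for percent in (1.56, 2.0, 6.25, 50.0):
        broth_ml, diluent_ml = drench_volumes(50.0, percent)
        print(f"{percent:>5}% v/v: {broth_ml:.2f} ml broth + {diluent_ml:.2f} ml diluent")
```

At 50% v/v this gives 25 ml of broth and 25 ml of diluent per 50 ml drench.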
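The root-weight rating described above is a simple ratio of the treated root weight to that of the matching uninfested control. As a hedged illustration, the sketch below recomputes two of the percentages using mean root weights reported for Southern corn rootworm in Table 4; the function name and the pairing of each infested treatment with its own uninfested counterpart are assumptions made for the example.

```python
def percent_of_control(treated_g: float, control_g: float) -> float:
    """Root weight remaining, as a percentage of the matching control."""
    return 100.0 * treated_g / control_g


if __name__ == "__main__":
    # Mean root weights (g) reported for Southern corn rootworm in Table 4.
    water_uninfested, water_infested = 0.4916, 0.1410
    broth_uninfested, broth_infested = 0.4641, 0.4830

    print(f"Water, infested: {percent_of_control(water_infested, water_uninfested):.1f}%")
    print(f"Broth, infested: {percent_of_control(broth_infested, broth_uninfested):.0f}%")
```

The standard deviations in the tables are not propagated here; the calculation uses the means only.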
Table 4
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on Rootworm Larvae After Post-Infestation Drenching (Metromix®)

Treatment   Larvae   Leaf Damage   Root Weight (g)   % of Control

Southern Corn Rootworm
Water   0.4916 ± 0.023   100
Medium (2.0% v/v)   -   -   0.4416 ± 0.029   100
Broth (6.25% v/v)   -   0.4641 ± 0.081   100
Water   +   0.1410 ± 0.006   28.7
Media (2.0% v/v)   +   0.1345 ± 0.028   30.4
Broth (1.56% v/v)   +   0.4830 ± 0.031   104

Western Corn Rootworm
Water   0.4446 ± 0.019   100
Broth (2.0% v/v)   -   0.4069 ± 0.026   100
Water   +   -   0.2202 ± 0.015   49
Broth (2.0% v/v)   +   -   0.3879 ± 0.013   95

Table 5
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on Southern Corn Rootworm Larvae After Post-Infestation Drenching (Soil)

Treatment   Larvae   Leaf Damage   Root Weight (g)   % of Control
Water   0.2148 ± 0.014   100
Broth (50% v/v)   0.2260 ± 0.016   103
Water   +++   0.0916 ± 0.009   43
Broth (50% v/v)   0.2428 ± 0.032   113

Activity of Photorhabdus luminescens (strain W-14) culture broth against second instar turf grubs in Metromix® was observed in tests conducted as follows (Table 6). Approximately 50 gm of dry Metromix® was added to a 591 ml clear plastic cup. The Metromix® was then drenched with 50 ml total volume of a 50% (v/v) diluted Photorhabdus broth solution. The dilution of crude broth was made with water, with 50% broth being prepared by adding 25 ml of crude broth to 25 ml of water for 50 ml total volume. A 1% (w/v) solution of proteose peptone #3 (PP3), which is a 50% dilution of the normal media concentration, was used as a broth control. After drenching, five second instar turf grubs were placed on the top of the moistened Metromix®. Healthy turf grub larvae burrowed rapidly into the Metromix®. Those larvae that did not burrow within 1 h were removed and replaced with fresh larvae. The cups were sealed and placed in a 28°C incubator, in the dark. After seven days, larvae were removed from the Metromix® and scored for mortality. Activity was rated as the percentage of mortality relative to control.
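The mortality percentages reported for the turf grub assay in Table 6 can be reproduced from the count pairs given there. The sketch below assumes each pair is dead larvae over total larvae scored, which is the reading that matches the reported percentages; the function name is illustrative.

```python
def mortality_percent(dead: int, total: int) -> int:
    """Mortality as a whole-number percentage of the larvae scored."""
    return round(100.0 * dead / total)


# Count pairs as reported in Table 6 (assumed here to be dead / total scored).
TABLE_6_COUNTS = {
    "Water": (7, 15),
    "Control medium (1.0% w/v)": (12, 19),
    "Broth (50% v/v)": (17, 20),
}

if __name__ == "__main__":
    for treatment, (dead, total) in TABLE_6_COUNTS.items():
        print(f"{treatment}: {mortality_percent(dead, total)}%")
```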
Table 6
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on Turf Grub After Pre-Infestation Drenching (Metromix®)

Treatment                    Mortality*   Mortality %
Water                        7/15         47
Control medium (1.0% w/v)    12/19        63
Broth (50% v/v)              17/20        85
*expressed as a ratio of dead/living larvae

Example 4
Insecticide Utility Upon Leaf Application

Activity of Photorhabdus broth against European corn borer was seen when the broth was applied directly to the surface of maize leaves (Table 7). In these assays Photorhabdus broth was diluted 100-fold with culture medium and applied manually to the surface of excised maize leaves at a rate of ~6.0 µl/cm² of leaf surface. The leaves were air dried and cut into equal sized strips approximately 2 x 2 inches. The leaves were rolled, secured with paper clips and placed in 1 oz plastic shot glasses with 0.25 inch of 2% agar on the bottom surface to provide moisture. Twelve neonate European corn borers were then placed onto the rolled leaf and the cup was sealed. After incubation for 5 days at 27°C in the dark, the samples were scored for feeding damage and recovered larvae.

Table 7
Effect of Photorhabdus luminescens (strain W-14) Culture Broth on European Corn Borer Larvae Following Pre-Infestation Application to Excised Maize Leaves

Treatment           Leaf Damage   Larvae Recovered   Weight (mg)
Water               Extensive     55/120             0.42 mg
Control Medium      Extensive     40/120             0.50 mg
Broth (1.0% v/v)    Trace         3/120              0.15 mg

Activity of the culture broth against neonate tobacco budworm (Heliothis virescens) was demonstrated using a leaf dip methodology. Fresh cotton leaves were excised from the plant and leaf disks were cut with an 18.5 mm cork-borer. The disks were individually immersed in control medium (PP3) or Photorhabdus luminescens (strain W-14) culture broth which had been concentrated approximately 10-fold using an Amicon (Beverly, MA) Proflux M12 tangential filtration system with a 10 kDa filter.
Excess liquid was removed and a straightened paper clip was placed through the center of the disk. The paper clip was then wedged into a plastic, 1.0 oz shot glass containing approximately 2.0 ml of 1% Agar. This served to suspend the leaf disk above the agar. Following drying of the leaf disk, a single neonate tobacco budworm larva was placed on the disk and the cup was capped. The cups were then sealed in a plastic bag and placed in a darkened, 27°C incubator for 5 days. At this time the remaining larvae and leaf material were weighed to establish a measure of leaf damage (Table 8) .
Table 8
Effect of Photorhabdus luminescens (Strain W-14) Culture Broth on Tobacco Budworm Neonates in a Cotton-Leaf Dip Assay

                        Final Weights (mg)
Treatment               Leaf Disk      Larvae
Control leaves          55.7 ± 1.3     na*
Control Medium          34.0 ± 2.9     4.3 ± 0.91
Photorhabdus broth      54.3 ± 1.4     0.0**
* - not applicable, ** - no live larvae found

Characterization of Toxin Peptide Components

In a subsequent analysis, the toxin protein subunits of the bands isolated as in Example 1 were resolved on a 7% SDS polyacrylamide electrophoresis gel with a ratio of 30:0.3 (acrylamide:BIS-acrylamide). This gel matrix facilitates better resolution of the larger proteins. The gel system used to estimate the Band 1 and Band 2 subunit molecular weights in Example 1 was an 18% gel with a ratio of 38:0.18 (acrylamide:BIS-acrylamide), which allowed for a broader range of size separation, but less resolution of higher molecular weight components.
In this analysis, 10, rather than 8, protein bands were resolved. Table 9 reports the calculated molecular weights of the 10 resolved bands, and directly compares the molecular weights estimated under these conditions to those of the prior example. It is not surprising that additional bands were detected under the different separation conditions used in this example. Variations between the prior and new estimates of molecular weight are also to be expected given the differences in analytical conditions. In the analysis of this example, it is thought that the higher molecular weight estimates are more accurate than in Example 1, as a result of improved resolution. However, these are estimates based on SDS PAGE analysis, which is typically not analytically precise, and the estimated peptide sizes may have been further altered by post- and co-translational modifications.
Amino acid sequences were determined for the N-terminal portions of five of the 10 resolved peptides. Table 9 correlates the molecular weight of the proteins and the identified sequences. In SEQ ID NO:2, certain analyses suggest that the proline at residue 5 may be an asparagine (asn). In SEQ ID NO:3, certain analyses suggest that the amino acid residues at positions 13 and 14 are both arginine (arg). In SEQ ID NO:4, certain analyses suggest that the amino acid residue at position 6 may be either alanine (ala) or serine (ser). In SEQ ID NO:5, certain analyses suggest that the amino acid residue at position 3 may be aspartic acid (asp).

Table 9

EXAMPLE 1 ESTIMATE    NEW ESTIMATE*    SEQ. LISTING
208                   200.2 kDa        SEQ ID NO: 1
184                   175.0 kDa        SEQ ID NO: 2
65.6                  63.1 kDa         SEQ ID NO: 3
60.3                  65.1 kDa         SEQ ID NO: 4
56.2                  58.3 kDa         SEQ ID NO: 5
25.1                  23.2 kDa         SEQ ID NO: 15
*New estimates are based on SDS PAGE and are not based on gene sequences. SDS PAGE is not analytically precise.
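For readers unfamiliar with how such SDS PAGE estimates are usually derived, the common approach is to fit log10(molecular weight) of marker proteins against their relative mobility (Rf) and interpolate the unknown bands. The sketch below illustrates that general technique only; the marker values are hypothetical placeholders, not data from this work, and nothing here is presented as the procedure actually used.

```python
import math


def fit_log_mw(markers: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares fit of log10(MW in kDa) = slope * Rf + intercept."""
    n = len(markers)
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(rf: float, slope: float, intercept: float) -> float:
    """Interpolated molecular weight (kDa) for a band at relative mobility rf."""
    return 10 ** (slope * rf + intercept)


# Hypothetical marker ladder as (Rf, kDa) pairs, for illustration only.
HYPOTHETICAL_MARKERS = [(0.10, 200.0), (0.30, 116.0), (0.50, 66.0), (0.75, 31.0)]

if __name__ == "__main__":
    slope, intercept = fit_log_mw(HYPOTHETICAL_MARKERS)
    print(f"Band at Rf 0.20: about {estimate_mw(0.20, slope, intercept):.0f} kDa")
```

Because the fit depends on gel percentage and cross-linker ratio, the same band can yield different estimates on different gels, which is consistent with the shifts between the two sets of estimates discussed above.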
Example 5, Part B
Characterization of Toxin Peptide Components

New N-terminal sequence, SEQ ID NO:15, Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr, was obtained by further N-terminal sequencing of peptides isolated from native HPLC-purified toxin as described in Example 5, Part A, above. This peptide comes from the tcaA gene. The peptide, labeled TcaAii, starts at position 254 and goes to position 491, where the TcaAiii peptide starts, SEQ ID NO:4. The estimated size of the peptide based on the gene sequence is 25,240 Da.
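N-terminal sequences such as SEQ ID NO:15 are written in three-letter code, whereas most database search tools expect one-letter code. As a small convenience sketch, the conversion can be done as follows; the function name is illustrative, and the input assumes the residues printed as "Gin" in some copies of the text are glutamine (Gln).

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a space-separated three-letter peptide sequence to one-letter code."""
    residues = three_letter_sequence.split()
    # Raises KeyError on an unrecognized residue rather than guessing.
    return "".join(THREE_TO_ONE[residue.capitalize()] for residue in residues)


if __name__ == "__main__":
    seq_id_15 = "Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr"
    print(to_one_letter(seq_id_15))
```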
Example 6
Characterization of Toxin Peptide Components

In yet another analysis, the toxin protein complex was re-isolated from the Photorhabdus luminescens growth medium (after culture without Tween) by performing a 10% - 80% ammonium sulfate precipitation followed by an ion exchange chromatography step (Mono Q) and two molecular sizing chromatography steps. These conditions were like those used in Example 1. During the first molecular sizing step, a second biologically active peak was found at about 100 ± 10 kDa. Based upon protein measurements, this fraction was 20 - 50 fold less active than the larger, or primary, active peak of about 860 ± 100 kDa (native). During this isolation experiment, a smaller active peak of about 325 ± 50 kDa that retained a considerable portion of the starting biological activity was also resolved. It is thought that the 325 kDa peak is related to or derived from the 860 kDa peak. A 5 kDa protein was resolved in this analysis. The N-terminal sequence of this protein is presented in SEQ ID NO: . A second, prominent 185 kDa protein was consistently present in amounts comparable to that of protein 3 from Table 9, and may be the same protein or protein fragment. The N-terminal sequence of this 185 kDa protein is shown at SEQ ID NO:7.
Additional N-terminal amino acid sequence data were also obtained from isolated proteins. None of the determined N-terminal sequences appears identical to a protein identified in Table 9. Other proteins were present in the isolated preparation.
One such protein has an estimated molecular weight of 108 kDa and an N-terminal sequence as shown in SEQ ID NO:8. A second such protein has an estimated molecular weight of 80 kDa and an N-terminal sequence as shown in SEQ ID NO:9. When the protein material in the approximately 325 kDa active peak was analyzed by size, bands of approximately 51, 31, 28, and 22 kDa were observed. As in all cases in which a molecular weight was determined by analysis of electrophoretic mobility, these molecular weights were subject to error effects introduced by buffer ionic strength differences, electrophoresis power differences, and the like. One of ordinary skill would understand that definitive molecular weight values cannot be determined using these standard methods and that each was subject to variation. It was hypothesized that proteins of these sizes are degradation products of the larger protein species (of approximately 200 kDa size) that were observed in the larger primary toxin complex.
Finally, several preparations included a protein having the N-terminal sequence shown in SEQ ID NO:10. This sequence was strongly homologous to known chaperonin proteins, accessory proteins known to function in the assembly of large protein complexes. Although the applicants could not ascribe such an assembly function to the protein identified in SEQ ID NO:10, it was consistent with the existence of the described toxin protein complex that such a chaperonin protein could be involved in its assembly. Moreover, although such proteins have not directly been suggested to have toxic activity, this protein may be important to determining the overall structural nature of the protein toxin, and thus may contribute to the toxic activity or durability of the complex in vivo after oral delivery.
Subsequent analysis of the stability of the protein toxin complex to proteinase was undertaken. It was determined that after 24 hour incubation of the complex in the presence of a 10-fold molar excess of proteinase, activity was virtually eliminated (mortality on oral application dropped to about 5%). These data confirm the proteinaceous nature of the toxin.
The toxic activity was also retained by a dialysis membrane, again confirming the large size of the native toxin complex.
Example 7
Isolation, Characterization and Partial Amino Acid Sequencing of Photorhabdus Toxins

Isolation and N-Terminal Amino Acid Sequencing: In a set of experiments conducted in parallel to Examples 5 and 6, ammonium sulfate precipitation of Photorhabdus proteins was performed by adjusting Photorhabdus broth, typically 2-3 liters, to a final concentration of either 10% or 20% by the slow addition of ammonium sulfate crystals. After stirring for 1 hour at 4°C, the material was centrifuged at 12,000 x g for 30 minutes. The supernatant was adjusted to 80% ammonium sulfate, stirred at 4°C for 1 hour, and centrifuged at 12,000 x g for 60 minutes. The pellet was resuspended in one-tenth the volume of 10 mM Na2·PO4, pH 7.0 and dialyzed against the same phosphate buffer overnight at 4°C. The dialyzed material was centrifuged at 12,000 x g for 1 hour prior to ion exchange chromatography.
A HR 16/50 Q Sepharose (Pharmacia) anion exchange column was equilibrated with 10 mM Na2·PO4, pH 7.0. Centrifuged, dialyzed ammonium sulfate pellet was applied to the Q Sepharose column at a rate of 1.5 ml/min and washed extensively at 3.0 ml/min with equilibration buffer until the optical density (O.D. 280) reached less than 0.100. Next, either a 60 minute NaCl gradient ranging from 0 to 0.5 M at 3 ml/min, or a series of step elutions using 0.1 M, 0.4 M and finally 1.0 M NaCl for 60 minutes each, was applied to the column. Fractions were pooled and concentrated using a Centriprep 100. Alternatively, proteins could be eluted by a single 0.4 M NaCl wash without prior elution with 0.1 M NaCl.
Two milliliter aliquots of concentrated Q Sepharose samples were loaded at 0.5 ml/min onto a HR 15/50 Superose 12 (Pharmacia) gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0. The column was washed with the same buffer for 240 min at 0.5 ml/min and 2 min samples were collected. The void volume material was collected and concentrated using a Centriprep 100. Two milliliter aliquots of concentrated Superose 12 samples were loaded at 0.5 ml/min onto a HR 16/50 Sepharose 4B-CL (Pharmacia) gel filtration column equilibrated with 10 mM Na2·PO4, pH 7.0. The column was washed with the same buffer for 240 min at 0.5 ml/min and 2 min samples were collected.
The excluded protein peak was subjected to a second fractionation by application to a gel filtration column that used a Sepharose CL-4B resin, which separates proteins ranging from ~30 kDa to 1000 kDa. This fraction was resolved into two peaks: a minor peak at the void volume (>1000 kDa) and a major peak which eluted at an apparent molecular weight of about 860 kDa. Over a one week period, subsequent samples subjected to gel filtration showed the gradual appearance of a third peak (approximately 325 kDa) that seemed to arise from the major peak, perhaps by limited proteolysis. Bioassays performed on the three peaks showed that the void peak had no activity, while the 860 kDa toxin complex fraction was highly active, and the 325 kDa peak was less active, although quite potent. SDS PAGE analysis of Sepharose CL-4B toxin complex peaks from different fermentation productions revealed two distinct peptide patterns, denoted "P" and "S". The two patterns had marked differences in the molecular weights and concentrations of peptide components in their fractions. The "S" pattern, produced most frequently, had 4 high molecular weight peptides (> 150 kDa) while the "P" pattern had 3 high molecular weight peptides. In addition, the "S" peptide fraction was found to have 2-3 fold more activity against European Corn Borer. This shift may be related to variations in protein expression due to age of inoculum and/or other factors based on growth parameters of aged cultures.
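The gel filtration runs above are specified by flow rate, run time and collection interval, from which the eluted volume, fraction size and fraction count follow. The sketch below does that bookkeeping; the function name is illustrative, and it assumes fractions are collected continuously for the whole run.

```python
def fraction_plan(flow_ml_per_min: float, run_min: float, fraction_min: float) -> dict:
    """Summarize total eluted volume, fraction volume and number of fractions."""
    return {
        "total_volume_ml": flow_ml_per_min * run_min,
        "fraction_volume_ml": flow_ml_per_min * fraction_min,
        "n_fractions": int(run_min // fraction_min),
    }


if __name__ == "__main__":
    # Conditions quoted for both gel filtration columns.
    plan = fraction_plan(flow_ml_per_min=0.5, run_min=240, fraction_min=2)
    print(plan)
```

Under the quoted conditions this corresponds to 120 ml of buffer collected as 120 fractions of 1 ml each.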
Milligram quantities of peak toxin complex fractions determined to be "P" or "S" peptide patterns were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides in the "S" pattern had unique N-terminal amino acid sequences compared to the sequences identified in the previous example. A 201 kDa (TcdAii) peptide set forth as SEQ ID NO:13 below shared 33% amino acid identity and 50% similarity with SEQ ID NO:1 (TcbAii) (Table 10; in Table 10, vertical lines denote amino acid identities and colons indicate conservative amino acid substitutions). A second peptide of 197 kDa, SEQ ID NO:14 (TcdB), had 42% identity and 58% homology with SEQ ID NO:2 (TcaC). Yet a third peptide of 205 kDa was denoted TcdAii. In addition, a limited N-terminal amino acid sequence, SEQ ID NO:16 (TcbA), of a peptide of at least 235 kDa was identical to the amino acid sequence, SEQ ID NO:12, deduced from a cloned gene (tcbA), SEQ ID NO:11, containing a deduced amino acid sequence corresponding to SEQ ID NO:1 (TcbAii). This indicates that the larger 235+ kDa peptide was proteolytically processed to the 201 kDa peptide (TcbAii) (SEQ ID NO:1) during fermentation, possibly resulting in activation of the molecule. In yet another sequence, the sequence originally reported as SEQ ID NO:5 (TcaBii) in Example 5 above was found to contain an aspartic acid residue (Asp) at the third position rather than glycine (Gly), and two additional amino acids, Gly and Asp, at the eighth and ninth positions, respectively. In yet two other sequences, SEQ ID NO:2 (TcaC) and SEQ ID NO:3 (TcaBi), additional amino acid sequence was obtained.
Densitometric quantitation was performed using a sample that was identical to the "S" preparation sent for N-terminal analysis. This analysis showed that the 201 kDa and 197 kDa peptides represent 7.0% and 7.2%, respectively, of the total Coomassie brilliant blue stained protein in the "S" pattern and are present in amounts similar to the other abundant peptides. It is speculated that these peptides may represent protein homologs, analogous to the situation found with other bacterial toxins, such as various CryI Bt toxins. These proteins vary from 40-90% homology at their N-terminal amino acid sequence, which encompasses the toxic fragment.
Internal Amino Acid Sequencing: To facilitate cloning of toxin peptide genes, internal amino acid sequences of selected peptides were obtained as follows. Milligram quantities of peak 2A fractions determined to be "P" or "S" peptide patterns were subjected to preparative SDS PAGE, and transblotted with TRIS-glycine (Seprabuff™) to PVDF membranes (ProBlott™, Applied Biosystems) for 3-4 hours. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. Three peptides, referred to as TcbAii (containing SEQ ID NO:1), TcdAii, and TcaBi (containing SEQ ID NO:3), were subjected to trypsin digestion by Harvard MicroChem followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal peptides were sequenced for the peptide TcaBi (205 kDa peptide), referred to as TcaBi-PT111 (SEQ ID NO:17) and TcaBi-PT7-9 (SEQ ID NO:18). Two internal peptides were sequenced for the peptide TcaBi (68 kDa peptide), referred to as TcaBi-PTISS (SEQ ID NO:19) and TcaBi-PT108 (SEQ ID NO:20).
Four internal peptides were sequenced for the peptide TcbAii (201 kDa peptide), referred to as TcbAii-PT103 (SEQ ID NO:21), TcbAii-PT56 (SEQ ID NO:22), TcbAii-PT81(a) (SEQ ID NO:23), and TcbAii-PT81(b) (SEQ ID NO:24).
Table 10
N-Terminal Amino Acid Sequence
201 kDa (33% identity & 50% similarity to SEQ ID NO:1)
L I G Y N N Q F S G ' A        SEQ ID NO:13
: | | | : |
F I Q G Y S D L F G N - A      SEQ ID NO:1
197 kDa (42% identity & 58% similarity to SEQ ID NO:2)
M Q N S Q T F S V G E L        SEQ ID NO:14
| | : | | : : |
M Q D S P E V S I T T L        SEQ ID NO:2
Example 8
Construction of a cosmid library of Photorhabdus luminescens W-14 genomic DNA and its screening to isolate genes encoding peptides comprising the toxic protein preparation
As a prerequisite for the production of Photorhabdus insect toxic proteins in heterologous hosts, and for other uses, it is necessary to isolate and characterize the genes that encode those peptides. This objective was pursued in parallel. One approach, described later, was based on the use of monoclonal and polyclonal antibodies raised against the purified toxin, which were then used to isolate clones from an expression library. The other approach, described in this example, is based on the use of the N-terminal and internal amino acid sequence data to design degenerate oligonucleotides for use in PCR amplification. Either method can be used to identify DNA clones that contain the peptide-encoding genes so as to permit the isolation of the respective genes, and the determination of their DNA base sequence.
GENOMIC DNA ISOLATION: Photorhabdus luminescens strain W-14 (ATCC accession number 55397) was grown on 2% proteose peptone #3 agar (Difco Laboratories, Detroit, MI) and insecticidal toxin competence was maintained by repeated bioassay after passage, using the method described in Example 1 above. A 50 ml shake culture was produced in a 175 ml baffled flask in 2% proteose peptone #3 medium, grown at 28°C and 150 rpm for approximately 24 hours. 15 ml of this culture was pelleted and frozen in its medium at -20°C until it was thawed for DNA isolation. The thawed culture was centrifuged (700 x g, 30 min) and the floating orange mucopolysaccharide material was removed. The remaining cell material was centrifuged (25,000 x g, 15 min) to pellet the bacterial cells, and the medium was removed and discarded.
Genomic DNA was isolated by an adaptation of the CTAB method described in section 2.4.1 of Current Protocols in Molecular Biology (Ausubel et al. eds, John Wiley & Sons, 1994) [modified to include a salt shock and with all volumes increased 10-fold]. The pelleted bacterial cells were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml, then 12 ml of 5 M NaCl was added; this mixture was centrifuged 20 min at 15,000 x g. The pellet was resuspended in 5.7 ml TE, and 300 μl of 10% SDS and 60 μl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY; in sterile distilled water) were added to the suspension. This mixture was incubated at 37°C for 1 hr; then approximately 10 mg lysozyme (Worthington Biochemical Corp., Freehold, NJ) was added. After an additional 45 min, 1 ml of 5 M NaCl and 800 μl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were added. This preparation was incubated 10 min at 65°C, then gently agitated and further incubated and agitated for approximately 20 min to assist clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently, and centrifuged. After two extractions with an equal volume of PCI (phenol/chloroform/isoamyl alcohol; 50:49:1, v/v/v; equilibrated with 1 M Tris-HCl, pH 8.0; Intermountain Scientific Corporation, Kaysville, UT), the DNA was precipitated with 0.6 volume of isopropanol. The DNA precipitate was gently removed with a glass rod, washed twice with 70% ethanol, dried, and dissolved in 2 ml STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). This preparation contained 2.5 mg/ml DNA, as determined by optical density at 260 nm (i.e., OD260).
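The OD260 determination mentioned above can be sketched as a short calculation. The sketch assumes the conventional conversion of about 50 μg/ml of double-stranded DNA per absorbance unit at 260 nm (1 cm path length). The absorbance reading and dilution factor used in the example are hypothetical values chosen only to reproduce the reported 2.5 mg/ml; the source does not state them.

```python
# Conventional approximation (an assumption, not stated in the source):
# A260 of 1.0 in a 1 cm cuvette corresponds to ~50 ug/ml double-stranded DNA.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_conc_mg_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration (mg/ml) of the undiluted sample."""
    return a260 * dilution_factor * DSDNA_UG_PER_ML_PER_A260 / 1000.0


def total_yield_mg(conc_mg_per_ml: float, volume_ml: float) -> float:
    """Total DNA mass (mg) in a preparation of the given volume."""
    return conc_mg_per_ml * volume_ml


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.5 measured on a 1:100 dilution.
    conc = dsdna_conc_mg_per_ml(0.5, dilution_factor=100)
    print(f"concentration: {conc} mg/ml")
    # The preparation was dissolved in 2 ml STE.
    print(f"total yield:   {total_yield_mg(conc, 2.0)} mg")
```

Under these assumed inputs the estimate matches the 2.5 mg/ml reported for the 2 ml preparation.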
The molecular size range of the isolated genomic DNA was evaluated for suitability for library construction. CHEF gel analysis was performed in 1.5% agarose (SeaKem® LE, FMC BioProducts, Rockland, ME) gels with 0.5X TBE buffer (44.5 mM Tris-HCl pH 8.0, 44.5 mM H3BO3, 1 mM EDTA) on a BioRad CHEF-DR II apparatus with a Pulsewave 760 Switcher (Bio-Rad Laboratories, Inc., Richmond, CA). The running parameters were: initial A time, 3 sec; final A time, 12 sec; 200 volts; running temperature, 4-18°C; run time, 16.5 hr. Ethidium bromide staining and examination of the gel under ultraviolet light indicated the DNA ranged from 30-250 kbp in size.
CONSTRUCTION OF LIBRARY: A partial Sau3A 1 digest was made of this Photorhabdus genomic DNA preparation. The method was based on section 3.1.3 of Ausubel (supra.). Adaptations included running smaller scale reactions under various conditions until nearly optimal results were achieved. Several scaled-up large reactions with varied conditions were run, the results analyzed on CHEF gels, and only the best large scale preparation was carried forward. In the optimal case, 200 μg of Photorhabdus genomic DNA was incubated with 1.5 units of Sau3A 1 (New England Biolabs, "NEB", Beverly, MA) for 15 min at 37°C in 2 ml total volume of 1X NEB 4 buffer (supplied as 10X by the manufacturer). The reaction was stopped by adding 2 ml of PCI and centrifuging at 8000 x g for 10 min. To the supernatant were added 200 μl of 5 M NaCl plus 6 ml of ice-cold ethanol. This preparation was chilled for 30 min at -20°C, then centrifuged at 12,000 x g for 15 min. The supernatant was removed and the precipitate was dried in a vacuum oven at 40°C, then resuspended in 400 μl STE. Spectrophotometric assay indicated about 40% recovery of the input DNA. The digested DNA was size fractionated on a sucrose gradient according to section 5.3.2 of CPMB (op. cit.). A 10% to 40% (w/v) linear sucrose gradient was prepared with a gradient maker in Ultra-Clear™ tubes (Beckman Instruments, Inc., Palo Alto, CA) and the DNA sample was layered on top. After centrifugation (26,000 rpm, 17 hr, Beckman SW41 rotor, 20°C), fractions (about 750 μl) were drawn from the top of the gradient and analyzed by CHEF gel electrophoresis (as described earlier). Fractions containing Sau3A 1 fragments in the size range 20-40 kbp were selected and DNA was precipitated by a modification (amounts of all solutions increased approximately 6.3-fold) of the method in section 5.3.3 of Ausubel (supra.).
After overnight precipitation, the DNA was collected by centrifugation (17,000 x g, 15 min), dried, redissolved in TE, pooled into a final volume of 80 μl, and reprecipitated with the addition of 8 μl 3 M sodium acetate and 220 μl ethanol. The pellet collected by centrifugation as above was resuspended in 12 μl TE.
Concentration of the DNA was determined by Hoechst 33258 dye (Polysciences, Inc., Warrington, PA) fluorometry in a Hoefer TKO100 fluorimeter (Hoefer Scientific Instruments, San Francisco, CA). Approximately 2.5 μg of the size-fractionated DNA was recovered.
Thirty μg of cosmid pWE15 DNA (Stratagene, La Jolla, CA) was digested to completion with 100 units of restriction enzyme BamH 1 (NEB) in the manufacturer's buffer (final volume of 200 μl, 37°C, 1 hr). The reaction was extracted with 100 μl of PCI and DNA was precipitated from the aqueous phase by addition of 20 μl 3 M sodium acetate and 550 μl -20°C absolute ethanol. After 10 min at -70°C, the DNA was collected by centrifugation (17,000 x g, 15 min), dried under vacuum, and dissolved in 180 μl of 10 mM Tris-HCl, pH 8.0. To this were added 20 μl of 10X CIP buffer (100 mM Tris-HCl, pH 8.3; 10 mM ZnCl2; 10 mM MgCl2), and 1 μl (0.25 units) of 1:4 diluted calf intestinal alkaline phosphatase (Boehringer Mannheim Corporation, Indianapolis, IN). After 30 min at 37°C, the following additions were made: 2 μl 0.5 M EDTA, pH 8.0; 10 μl 10% SDS; 0.5 μl of 20 mg/ml proteinase K (as above), followed by incubation at 55°C for 30 min. Following sequential extractions with 100 μl of PCI and 100 μl phenol (Intermountain Scientific Corporation, equilibrated with 1 M Tris-HCl, pH 8.0), the dephosphorylated DNA was precipitated by addition of 72 μl of 7.5 M ammonium acetate and 550 μl -20°C ethanol, incubation on ice for 30 min, and centrifugation as above. The pelleted DNA was washed once with 500 μl -20°C 70% ethanol, dried under vacuum, and dissolved in 20 μl of TE buffer.
Ligation of the size-fractionated Sau3A 1 fragments to the BamH 1-digested and phosphatased pWE15 vector was accomplished using T4 ligase (NEB) by a modification (i.e., use of premixed 10X ligation buffer supplied by the manufacturer) of the protocol in section 3.33 of Ausubel. Ligation was carried out overnight in a total volume of 20 μl at 15°C, followed by storage at -20°C.
Four μl of the cosmid DNA ligation reaction, containing about 1 μg of DNA, was packaged into bacteriophage lambda using a commercial packaging extract (Gigapack® III Gold Packaging Extract, Stratagene), following the manufacturer's directions. The packaged preparation was stored at 4°C until use. The packaged cosmid preparation was used to infect Escherichia coli XL1 Blue MR cells (Stratagene) according to the Gigapack® III Gold protocols ("Titering the Cosmid Library"), as follows. XL1 Blue MR cells were grown in LB medium (g/L: Bacto-tryptone, 10; Bacto-yeast extract, 5; Bacto-agar, 15; NaCl, 5; [Difco Laboratories, Detroit, MI]) containing 0.2% (w/v) maltose plus 10 mM MgSO4, at 37°C. After 5 hr growth, cells were pelleted at 700 x g (15 min) and resuspended in 6 ml of 10 mM MgSO4. The culture density was adjusted with 10 mM MgSO4 to OD600 = 0.5. The packaged cosmid library was diluted 1:10 or 1:20 with sterile SM medium (0.1 M NaCl, 10 mM MgSO4, 50 mM Tris-HCl pH 7.5, 0.01% w/v gelatin), and 25 μl of the diluted preparation was mixed with 25 μl of the diluted XL1 Blue MR cells. The mixture was incubated at 25°C for 30 min (without shaking), then 200 μl of LB broth was added, and incubation was continued for approximately 1 hr with occasional gentle shaking. Aliquots (20-40 μl) of this culture were spread on LB agar plates containing 100 mg/l ampicillin (i.e., LB-Amp100) and incubated overnight at 37°C. To store the library without amplification, single colonies were picked and inoculated into individual wells of sterile 96-well microwell plates, each well containing 75 μl of Terrific Broth (TB media: 12 g/l Bacto-tryptone, 24 g/l Bacto-yeast extract, 0.4% v/v glycerol, 17 mM KH2PO4, 72 mM K2HPO4) plus 100 mg/l ampicillin (i.e., TB-Amp100), and incubated (without shaking) overnight at 37°C.
After replicating the 96-well plate into a copy plate, 75 μl/well of filter-sterilized TB:glycerol (1:1, v/v; with, or without, 100 mg/l ampicillin) was added to the plate, it was shaken briefly at 100 rpm, 37°C, and then closed with Parafilm® (American National Can, Greenwich, CT) and placed in a -70°C freezer for storage. Copy plates were grown and processed identically to the master plates. A total of 40 such master plates (and their copies) were prepared.
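As a rough check on how thoroughly a library of this size samples the genome, the standard Clarke-Carbon relation P = 1 - (1 - f)^N can be sketched as below, where f is the ratio of insert size to genome size and N is the number of independent clones. The genome size is not given in this document; the 5,000,000 bp figure is a placeholder assumption, and the 30 kbp insert size is simply the midpoint of the 20-40 kbp fraction selected above.

```python
import math


def coverage_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Probability that a given locus is present in at least one clone
    (Clarke-Carbon relation, assuming random, independent inserts)."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p: float, insert_bp: float, genome_bp: float) -> int:
    """Number of clones required to reach coverage probability p."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


if __name__ == "__main__":
    GENOME_BP = 5_000_000   # placeholder assumption, not from the source
    INSERT_BP = 30_000      # midpoint of the 20-40 kbp size fraction
    full_library = 40 * 96  # 40 master plates of 96 wells (upper bound)
    screened = 800          # colonies represented on the probed membranes

    for n in (screened, full_library):
        p = coverage_probability(n, INSERT_BP, GENOME_BP)
        print(f"{n} clones -> P = {p:.4f}")
    print("clones for P = 0.99:", clones_needed(0.99, INSERT_BP, GENOME_BP))
```

Because some wells did not grow, 40 x 96 is an upper bound on the number of clones actually stored.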
SCREENING OF THE LIBRARY WITH RADIOLABELED DNA PROBES: To prepare colony filters for probing with radioactively labeled probes, ten 96-well plates of the library were thawed at 25°C (bench top at room temperature). A replica plating tool with 96 prongs was used to inoculate a fresh 96-well copy plate containing 75 μl/well of TB-Amp100. The copy plate was grown overnight (stationary) at 37°C, then shaken about 30 min at 100 rpm at 37°C. A total of 800 colonies was represented in these copy plates, due to nongrowth of some isolates. The replica tool was used to inoculate duplicate impressions of the 96-well arrays onto Magna NT (MSI, Westboro, MA) nylon membranes (0.45 micron, 220 x 250 mm) which had been placed on solid LB-Amp100 (100 ml/dish) in Bio-assay plastic dishes (Nunc, 243 x 243 x 18 mm; Curtin Mathison Scientific, Inc., Wood Dale, IL). The colonies were grown on the membranes at 37°C for about 3 hr.
A positive control colony (a bacterial clone containing a GZ4 sequence insert, see below) was grown on a separate Magna NT membrane (Nunc, 0.45 micron, 82 mm circle) on LB medium supplemented with 35 mg/l chloramphenicol (i.e., LB-Cam35), and processed alongside the library colony membranes. Bacterial colonies on the membranes were lysed, and the DNA was denatured and neutralized according to a protocol taken from the Genius System User's Guide version 2.0 (Boehringer Mannheim, Indianapolis, IN). Membranes were placed colony side up on filter paper soaked with 0.5 N NaOH plus 1.5 M NaCl for 15 min to denature, and neutralized on filter paper soaked with 1 M Tris-HCl pH 8.0, 1.5 M NaCl for 15 min. After UV-crosslinking using a Stratagene UV Stratalinker set on auto crosslink, the membranes were stored dry at 25°C until use. Membranes were trimmed into strips containing the duplicate impressions of a single 96-well plate, then washed extensively by the method of section 6.4.1 in CPMB (op. cit.): 3 hr at 25°C in 3X SSC, 0.1% (w/v) SDS, followed by 1 hr at 65°C in the same solution, then rinsed in 2X SSC in preparation for the hybridization step (20X SSC = 3 M NaCl, 0.3 M sodium citrate, pH 7.0).
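The SSC strengths used in the washes follow directly from the 20X definition given above (3 M NaCl, 0.3 M sodium citrate). A minimal sketch of the dilution arithmetic follows; the function names and the 100 ml example volume are illustrative and not part of the protocol.

```python
# Composition of the 20X SSC stock as defined in the text.
SSC_20X_MOLAR = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(fold: float) -> dict:
    """Molar concentration of each component at a given SSC strength (e.g. 3 for 3X)."""
    return {name: conc * fold / 20.0 for name, conc in SSC_20X_MOLAR.items()}


def stock_volume_ml(fold: float, final_volume_ml: float) -> float:
    """Volume of 20X stock needed to prepare final_volume_ml at the given strength."""
    return final_volume_ml * fold / 20.0


if __name__ == "__main__":
    for fold in (3, 2, 0.6, 0.25):
        print(f"{fold}X SSC: {ssc_molarity(fold)}; "
              f"{stock_volume_ml(fold, 100.0)} ml of 20X stock per 100 ml")
```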
Amplification of a specific genomic fragment of a tcaC gene. Based on the N-terminal amino acid sequence determined for the purified TcaC peptide fraction [disclosed herein as SEQ ID NO:2], a pool of degenerate oligonucleotides (pool S4Psh) was synthesized by standard β-cyanoethyl chemistry on an Applied BioSystem ABI394 DNA/RNA Synthesizer (Perkin Elmer, Foster City, CA). The oligonucleotides were deprotected 8 hours at 55°C, dissolved in water, quantitated by spectrophotometric measurement, and diluted for use. This pool corresponds to the determined N-terminal amino acid sequence of the TcaC peptide.
The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:
Amino Acid   Met Gln Asp Ser Pro Glu Val
S4Psh        5' ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT 3'
Another set of degenerate oligonucleotides was synthesized (pool P2.3.5R), representing the complement of the coding strand for the determined amino acid sequence of SEQ ID NO:17:
Amino Acid   Ala Phe Asn Ile Asp Asp Val
Codons       5' GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT 3'
P2.3.5R      3' CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA 5'
These oligonucleotides were used as primers in Polymerase Chain Reactions (PCR®, Roche Molecular Systems, Branchburg, NJ) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 μl) contained 125 pmol of each primer pool P2Psh and P2.3.5R, 253 ng of genomic template DNA, 10 nmol each of dATP, dCTP, dGTP, and dTTP, 1X GeneAmp® PCR buffer, and 2.5 units of AmpliTaq® DNA polymerase (both from Roche Molecular Systems; 10X GeneAmp® buffer is 100 mM Tris-HCl pH 8.3, 500 mM KCl, 0.01% w/v gelatin).
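The relationship between the degenerate coding-strand codons and the P2.3.5R reverse primer pool can be checked mechanically. The following is an illustrative sketch only: the parser handles the parenthesized notation used in this example plus N for any base, and it does not model inosine, which does not occur in the coding-strand string checked here.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def parse_degenerate(seq: str) -> list:
    """Parse e.g. 'GCN TT(T/C)' into a list of base sets, one set per position."""
    s = seq.replace(" ", "")
    positions = []
    i = 0
    while i < len(s):
        if s[i] == "(":
            j = s.index(")", i)
            positions.append(frozenset(s[i + 1:j].split("/")))
            i = j + 1
        elif s[i] == "N":
            positions.append(frozenset("ACGT"))
            i += 1
        else:
            positions.append(frozenset(s[i]))
            i += 1
    return positions


def reverse_complement(positions: list) -> list:
    """Reverse-complement a degenerate sequence (result reads 5' to 3')."""
    return [frozenset(COMPLEMENT[b] for b in p) for p in reversed(positions)]


if __name__ == "__main__":
    coding_5to3 = "GCN TT(T/C) AA(T/C) AT(A/T/C) GA(T/C) GA(T/C) GT"
    # Primer pool as printed in the text, written 3' to 5'.
    primer_3to5 = "CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA"

    expected = reverse_complement(parse_degenerate(coding_5to3))
    printed = list(reversed(parse_degenerate(primer_3to5)))  # flip to 5'->3'
    print("primer matches complement of coding strand:", expected == printed)
```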
Amplifications were performed in a Perkin Elmer Cetus DNA Thermal Cycler (Perkin Elmer, Foster City, CA) using 35 cycles of 94°C (1.0 min), 55°C (2.0 min), 72°C (3.0 min), followed by an extension period of 7.0 min at 72°C. Amplification products were analyzed by electrophoresis through 2% w/v NuSieve® 3:1 agarose (FMC BioProducts) in TEA buffer (40 mM Tris-acetate, 2 mM EDTA, pH 8.0). A specific product of estimated size 250 bp was observed amongst numerous other amplification products by ethidium bromide (0.5 μg/ml) staining of the gel and examination under ultraviolet light.
The region of the gel containing an approximately 250 bp product was excised, and a small plug (0.5 mm dia.) was removed and used to supply template for PCR amplification (40 cycles). The reaction (50 μl) contained the same components as above, minus genomic template DNA. Following amplification, the ends of the fragments were made blunt and were phosphorylated by incubation at 25°C for 20 min with 1 unit of T4 DNA polymerase (NEB), 1 nmol ATP, and 2.15 units of T4 kinase (Pharmacia Biotech Inc., Piscataway, NJ).
DNA fragments were separated from residual primers by electrophoresis through 1% w/v GTG® agarose (FMC) in TEA. A gel slice containing fragments of apparent size 250 bp was excised, and the DNA was extracted using a Qiaex kit (Qiagen Inc., Chatsworth, CA).
The extracted DNA fragments were ligated to plasmid vector pBC KS(+) (Stratagene) that had been digested to completion with restriction enzyme Sma 1 and extracted in a manner similar to that described for pWE15 DNA above. A typical ligation reaction (16.3 μl) contained 100 ng of digested pBC KS(+) DNA, 70 ng of 250 bp fragment DNA, 1 nmol [Co(NH3)6]Cl3, and 3.9 Weiss units of T4 DNA ligase (Collaborative Biomedical Products, Bedford, MA), in 1X ligation buffer (50 mM Tris-HCl, pH 7.4; 10 mM MgCl2; 10 mM dithiothreitol; 1 mM spermidine, 1 mM ATP, 100 mg/ml bovine serum albumin). Following overnight incubation at 14°C, the ligated products were transformed into frozen, competent Escherichia coli DH5α cells (Gibco BRL) according to the supplier's recommendations, and plated on LB-Cam35 plates containing IPTG (119 μg/ml) and X-gal (50 μg/ml). Independent white colonies were picked, and plasmid DNA was prepared by a modified alkaline-lysis/PEG precipitation method (PRISM™ Ready Reaction DyeDeoxy™ Terminator Cycle Sequencing Kit Protocols; ABI/Perkin Elmer). The nucleotide sequence of both strands of the insert DNA was determined, using T7 primers [pBC KS(+) bases 601-623: TAAAACGACGGCCAGTGAGCGCG] and LacZ primers [pBC KS(+) bases 792-816: ATGACCATGATTACGCCAAGCGCGC] and protocols supplied with the PRISM™ sequencing kit (ABI/Perkin Elmer). Nonincorporated dye-terminator dideoxyribonucleotides were removed by passage through Centri-Sep 100 columns (Princeton Separations, Inc., Adelphia, NJ) according to the manufacturer's instructions. The DNA sequence was obtained by analysis of the samples on an ABI Model 373A DNA Sequencer (ABI/Perkin Elmer). The DNA sequences of two isolates, GZ4 and HB14, were found to be as illustrated in Figure 1.
This sequence illustrates the following features: i) bases 1-20 represent one of the 64 possible sequences of the S4Psh degenerate oligonucleotides, ii) the sequence of amino acids 1-3 and 6-12 corresponds exactly to that determined for the N-terminus of TcaC (disclosed as SEQ ID NO:2), iii) the fourth amino acid encoded is a cysteine residue rather than serine; this difference is encoded within the degeneracy for the serine codons (see above), iv) the fifth amino acid encoded is proline, corresponding to the TcaC N-terminal sequence given as SEQ ID NO:2, v) bases 257-276 encode one of the 192 possible sequences designed into the degenerate pool, vi) the TGA termination codon introduced at bases 268-270 is the result of complementarity to the degeneracy built into the oligonucleotide pool at the corresponding position, and does not indicate a shortened reading frame for the corresponding gene.
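The pool sizes quoted above (64 and 192) follow from multiplying the number of alternatives at each degenerate position. A minimal sketch is shown below; it assumes the parenthesized notation used for the oligonucleotides in this example and counts inosine (I) as a single base, since inosine is one synthesized nucleotide rather than a mixture.

```python
import re
from math import prod


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences in a degenerate oligonucleotide pool.

    Each parenthesized group such as (A/G) contributes its number of
    alternatives; fixed bases and inosine (I) contribute a factor of 1.
    """
    groups = re.findall(r"\(([^)]*)\)", oligo.replace(" ", ""))
    return prod(len(g.split("/")) for g in groups)


if __name__ == "__main__":
    s4psh = "ATG CA(A/G) GA(T/C) (T/A)(C/G)(T/A) CCI GA(A/G) GT"
    p235r = "CG(A/C/G/T) AA(A/G) TT(A/G) TA(T/A/G) CT(A/G) CT(A/G) CA"
    print("S4Psh  :", degeneracy(s4psh))
    print("P2.3.5R:", degeneracy(p235r))
```

The same product explains feature iii): the (T/A)(C/G)(T/A) mixture designed for serine also includes TGT, a cysteine codon.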
Labeling of a TcaC peptide gene-specific probe. DNA fragments corresponding to the above 276 bases were amplified (35 cycles) by PCR® in a 100 μl reaction volume, using 100 pmol each of P2Psh and P2.3.5R primers, 10 ng of plasmids GZ4 or HB14 as templates, 20 nmol each of dATP, dCTP, dGTP, and dTTP, 5 units of AmpliTaq® DNA polymerase, and 1X concentration of GeneAmp® buffer, under the same temperature regimes as described above. The amplification products were extracted from a 1% GTG® agarose gel by Qiaex kit and quantitated by fluorometry.
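The reaction compositions in this example are stated as absolute amounts (pmol or nmol) per reaction volume. Converting them to final concentrations is simple arithmetic, sketched here for the two PCR setups described above; the helper name is illustrative.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Final concentration in micromolar; 1 pmol per microliter equals 1 umol per liter."""
    return amount_pmol / volume_ul


if __name__ == "__main__":
    # Genomic amplification: 50 ul, 125 pmol per primer pool, 10 nmol per dNTP.
    print("primer pool:", micromolar(125, 50), "uM")
    print("each dNTP  :", micromolar(10_000, 50), "uM")
    # Probe amplification: 100 ul, 100 pmol per primer pool, 20 nmol per dNTP.
    print("primer pool:", micromolar(100, 100), "uM")
    print("each dNTP  :", micromolar(20_000, 100), "uM")
```

Both setups work out to the same dNTP concentration, while the primer concentration is lower in the plasmid-templated probe reaction.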
The extracted amplification products from plasmid HB14 template (approximately 400 ng) were split into five aliquots and labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) according to the manufacturer's instructions. Nonincorporated radioisotope was removed by passage through NucTrap® Probe Purification Columns (Stratagene), according to the supplier's instructions. The specific activity of the labeled DNA product was determined by scintillation counting to be 3.11 x 10^8 dpm/μg. This labeled DNA was used to probe membranes prepared from 800 members of the genomic library.
Screening with a TcaC-peptide gene specific probe. The radiolabeled HB14 probe was boiled approximately 10 min, then added to "minimal hyb" solution. [Note: The "minimal hyb" method is taken from a CERES protocol; "Restriction Fragment Length Polymorphism Laboratory Manual version 4.0", sections 4-40 and 4-47; CERES/NPI, Salt Lake City, UT. NPI is now defunct, with its successors operating as Linkage Genetics]. "Minimal hyb" solution contains 10% w/v PEG (polyethylene glycol, M.W. approx. 8000), 7% w/v SDS, 0.6X SSC, 10 mM sodium phosphate buffer (from a 1 M stock containing 95 g/l NaH2PO4·1H2O and 84.5 g/l Na2HPO4·7H2O), 5 mM EDTA, and 100 mg/ml denatured salmon sperm DNA. Membranes were blotted dry briefly, then, without prehybridization, 5 strips of membrane were placed in each of 2 plastic boxes containing 75 ml of "minimal hyb" and 2.6 ng/ml of radiolabeled HB14 probe. These were incubated overnight with slow shaking (50 rpm) at 60°C. The filters were washed three times for approximately 10 min each at 25°C in "minimal hyb wash solution" (0.25X SSC, 0.2% SDS), followed by two 30-min washes with slow shaking at 60°C in the same solution. The filters were placed on paper covered with Saran Wrap® (Dow Brands, Indianapolis, IN) in a light-tight autoradiographic cassette and exposed to X-Omat X-ray film (Kodak, Rochester, NY) with two DuPont Cronex Lightning-Plus C1 enhancers (Sigma Chemical Co., St. Louis, MO), for 4 hr at -70°C. Upon development (standard photographic procedures), significant signals were evident in both replicates amongst a high background of weaker, more irregular signals. The filters were again washed for about 4 hr at 69°C in "minimal hyb wash solution" and then placed again in the cassettes, and film was exposed overnight at -70°C. Twelve possible positives were identified due to strong signals on both of the duplicate 96-well colony impressions.
No signal was seen with negative control membranes (colonies of XL1 Blue MR cells containing pWE15), and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
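The probe quantities given for the library screen can be cross-checked with a short calculation: two boxes, each holding 75 ml of hybridization solution at 2.6 ng/ml, should account for roughly the 400 ng of labeled product, and the total radioactivity follows from the stated specific activity of 3.11 x 10^8 dpm/μg. The sketch below is arithmetic only; the function names are illustrative.

```python
def probe_mass_ng(conc_ng_per_ml: float, volume_ml: float, n_boxes: int) -> float:
    """Total probe mass (ng) distributed over the hybridization boxes."""
    return conc_ng_per_ml * volume_ml * n_boxes


def total_activity_dpm(mass_ng: float, specific_activity_dpm_per_ug: float) -> float:
    """Total radioactivity (dpm) for a given probe mass and specific activity."""
    return mass_ng / 1000.0 * specific_activity_dpm_per_ug


if __name__ == "__main__":
    mass = probe_mass_ng(2.6, 75.0, 2)
    activity = total_activity_dpm(mass, 3.11e8)
    print(f"probe used: about {mass:.0f} ng")
    print(f"total activity: about {activity:.2e} dpm")
```

The result, about 390 ng, is consistent with the approximately 400 ng of amplification product that was labeled.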
The twelve putative hybridization-positive colonies were retrieved from the frozen 96-well library plates and grown overnight at 37°C on solid LB-Amp100 medium. They were then patched (3/plate, plus three negative controls: XL1 Blue MR cells containing the pWE15 vector) onto solid LB-Amp100. Two sets of membranes (Magna NT nylon, 0.45 micron) were prepared for hybridization. The first set was prepared by placing a filter directly onto the colonies on a patch plate, then removing it with adherent bacterial cells, and processing as below. Filters of the second set were placed on plates containing LB-Amp100 medium, then inoculated by transferring cells from the patch plates onto the filters. After overnight growth at 37°C, the filters were removed from the plates and processed.
Bacterial cells on the filters were lysed and DNA denatured by placing each filter colony-side-up on a pool (1.0 ml) of 0.5 N NaOH in a plastic plate for 3 min. The filters were blotted dry on a paper towel, then the process was repeated with fresh 0.5 M NaOH. After blotting dry, the filters were neutralized by placing each on a 1.0 ml pool of 1 M Tris-HCl, pH . for 3 min, blotted dry, and reneutralized with fresh buffer. This was followed by two similar soakings (5 min each) on pools of 0.5 M Tris-HCl pH 7.5 plus 1.5 M NaCl. After blotting dry, the DNA was UV crosslinked to the filter (as above), and the filters were washed (25°C, 100 rpm) in about 100 ml of 3X SSC plus 0.1% (w/v) SDS (4 times, 30 min each with fresh solution for each wash). They were then placed in a minimal volume of prehybridization solution [5X SSC plus 1% w/v each of Ficoll 400 (Pharmacia), polyvinylpyrrolidone (av. M.W. 360,000; Sigma), and bovine serum albumin Fraction V (Sigma)] for 2 hr at 65°C, 50 rpm. The prehybridization solution was removed and replaced with the HB14 32P-labeled probe that had been saved from the previous hybridization of the library membranes and which had been denatured at 95°C for 5 min. Hybridization was performed at 60°C for 16 hr with shaking at 50 rpm.
Following removal of the labeled probe solution, the membranes were washed 3 times at 25°C (50 rpm, 15 min) in 3X SSC (about 150 ml each wash). They were then washed for 3 hr at 63°C (50 rpm) in 0.25X SSC plus 0.2% SDS (minimal hyb wash solution), and exposed to X-ray film as described above for 1.5 hr at 25°C (no enhancer screens). This exposure revealed very strong hybridization signals to cosmid isolates 22G12, 25A10, 26A5, and 26B10, and a very weak signal with cosmid isolate 8B10. No signal was seen with the negative control (pWE15) colonies, and a very strong signal was seen with positive control membranes (DH5α cells containing the GZ4 isolate of the PCR product) that had been processed concurrently with the experimental samples.
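The cosmid isolate names appear to combine a library plate number with a 96-well position (for example, 22G12 read as plate 22, row G, column 12). That reading is an inference from the naming pattern and is not stated explicitly in the text. Under that assumption, a small parser for locating an isolate in the frozen plates might look like this; the type and function names are hypothetical.

```python
import re
from typing import NamedTuple


class WellAddress(NamedTuple):
    plate: int
    row: str      # 'A' to 'H'
    column: int   # 1 to 12


# Assumed naming pattern: <plate number><row letter><column number>.
_CLONE_ID = re.compile(r"^(\d+)([A-H])(\d{1,2})$")


def parse_clone_id(clone_id: str) -> WellAddress:
    """Split an isolate name such as '22G12' into plate, row, and column."""
    match = _CLONE_ID.match(clone_id)
    if match is None:
        raise ValueError(f"unrecognized clone id: {clone_id!r}")
    plate, row, column = int(match.group(1)), match.group(2), int(match.group(3))
    if not 1 <= column <= 12:
        raise ValueError(f"column out of range in {clone_id!r}")
    return WellAddress(plate, row, column)


def well_index(address: WellAddress) -> int:
    """Zero-based, row-major index of the well within its 96-well plate."""
    return (ord(address.row) - ord("A")) * 12 + (address.column - 1)


if __name__ == "__main__":
    for name in ("22G12", "25A10", "26A5", "26B10", "8B10"):
        address = parse_clone_id(name)
        print(name, address, "index", well_index(address))
```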
Amplification of a specific genomic fragment of a tcaB gene. Based on the N-terminal amino acid sequence determined for the purified TcaBi peptide fraction (disclosed here as SEQ ID NO:3), a pool of degenerate oligonucleotides (pool P8F) was synthesized as described for peptide TcaC. The determined amino acid sequence and the corresponding degenerate DNA sequence are given below, where A, C, G, and T are the standard DNA bases, and I represents inosine:
Amino Acid   Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg
P8F          5' (C/T)TI TTT ACI CA(A/G) ACI (C/T)TI AAA GAA GCI (A/C)G 3'
Another set of degenerate oligonucleotides was synthesized (pool P8.108.3R), representing the complement of the coding strand for the determined amino acid sequence of the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID NO:20):
Amino Acid   Met Tyr Tyr Ile Gln Ala Gln Gln
Codons       ATG TA(T/C) TA(T/C) AT(T/C/A) CA(A/G) GC(A/C/G/T) CA(A/G) CA(A/G)
P8.108.3R    3' TAC AT(A/G) AT(A/G) TA(A/G/T) GT(T/C) CGI GT(T/C) GT 5'
These oligonucleotides were used as primers for PCR® using HotStart 50 Tubes™ (Molecular Bio-Products, Inc., San Diego, CA) to amplify a specific DNA fragment from genomic DNA prepared from Photorhabdus strain W-14 (see above). A typical reaction (50 μl) contained (bottom layer) 25 pmol of each primer pool P8F and P8.108.3R, with 2 nmol each of dATP, dCTP, dGTP, and dTTP, in 1X GeneAmp® PCR buffer, and (top layer) 230 ng of genomic template DNA, 3 nmol each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of AmpliTaq® DNA polymerase, in 1X GeneAmp® PCR buffer.
Amplifications were performed by 35 cycles as described for the TcaC peptide. Amplification products were analyzed by electrophoresis through 0.7% w/v SeaKem® LE agarose (FMC) in TEA buffer. A specific product of estimated size 1600 bp was observed.
Four such reactions were pooled, and the amplified DNA was extracted from a 1.0% SeaKem® LE gel by Qiaex kit as described for the TcaC peptide. The extracted DNA was used directly as the template for sequence determination (PRISM™ Sequencing Kit) using the P8F and P8.108.3R primer pools. Each reaction contained about 100 ng template DNA and 25 pmol of one primer pool, and was processed according to standard protocols as described for the TcaC peptide. An analysis of the sequence derived from extension of the P8F primers revealed the short DNA sequence (and encoded amino acid sequence):
GAT GCA TTG NTT GCT
Asp Ala Leu (Val) Ala
which corresponds to a portion of the N-terminal peptide sequence disclosed as SEQ ID NO:3 (TcaBi).
Labeling of a TcaBi-peptide gene-specific probe.
Approximately 50 ng of gel-purified TcaBi DNA fragment was labeled with 32P-dCTP as described above, and nonincorporated radioisotopes were removed by passage through a NICK Column® (Pharmacia). The specific activity of the labeled DNA was determined to be 6 x 10° dpm/μg. This labeled DNA was used to probe colony membranes prepared from members of the genomic library that had hybridized to the TcaC-peptide specific probe.
The membranes containing the 12 colonies identified in the TcaC-probe library screen (see above) were stripped of radioactive TcaC-specific label by boiling twice for approximately 30 min each time in 1 liter of 0.1X SSC plus 0.1% SDS. Removal of radiolabel was checked with a 6 hr film exposure. The stripped membranes were then incubated with the TcaBi peptide-specific probe prepared above. The labeled DNA was denatured by boiling for 10 min, and then added to the filters that had been incubated for 1 hr in 100 ml of "minimal hyb" solution at 60°C. After overnight hybridization at this temperature, the probe solution was removed, and the filters were washed as follows (all in 0.3X SSC plus 0.1% SDS): once for 5 min at 25°C, once for 1 hr at 60°C in fresh solution, and once for 1 hr at 63°C in fresh solution. After 1.5 hr exposure to X-ray film by standard procedures, 4 strongly hybridizing colonies were observed. These were, as with the TcaC-specific probe, isolates 22G12, 25A10, 26A5, and 26B10.
The same TcaBi probe solution was diluted with an equal volume (about 100 ml) of "minimal hyb" solution, and then used to screen the membranes containing the 800 members of the genomic library. After hybridization, washing, and exposure to X-ray film as described above, only the four cosmid clones 22G12, 25A10, 26A5, and 26B10 were found to hybridize strongly to this probe.
ISOLATION OF SUBCLONES CONTAINING GENES ENCODING TcaC AND TcaBi PEPTIDES, AND DETERMINATION OF DNA BASE SEQUENCE THEREOF:
Three hybridization-positive cosmids in strain XL1 Blue MR were grown with shaking overnight (200 rpm) at 30°C in 100 ml TB-Amp100. After harvesting the cells by centrifugation, cosmid DNA was prepared using a commercially available kit (BIGprep™, 5 Prime 3 Prime, Inc., Boulder, CO), following the manufacturer's protocols. Only one cosmid, 26A5, was successfully isolated by this procedure. When digested with restriction enzyme EcoR 1 (NEB) and analyzed by gel electrophoresis, fragments of approximate sizes 14, 10, 8 (vector), 5, 3.3, 2.9, and 1.5 kbp were detected. A second attempt to isolate cosmid DNA from the same three strains (8 ml cultures; TB-Amp100, 30°C) utilized a boiling miniprep method (Evans G. and G. Wahl, 1987, "Cosmid vectors for genomic walking and rapid restriction mapping," in Guide to Molecular Cloning Techniques. Meth. Enzymology, vol. 152, S. Berger and A. Kimmel, eds., pgs. 604-610). Only one cosmid, 25A10, was successfully isolated by this method. When digested with restriction enzyme EcoR 1 (NEB) and analyzed by gel electrophoresis, this cosmid showed a fragmentation pattern identical to that previously seen with cosmid 26A5.
A 0.15 μg sample of 26A5 cosmid DNA was used to transform 50 μl of E. coli DH5α cells (Gibco BRL), by the supplier's protocols. A single colony isolate of that strain was inoculated into 4 ml of TB-Amp100, and grown for 8 hr at 37°C.
Chloramphenicol was added to a final concentration of 225 μg/ml, incubation was continued for another 24 hr, then cells were harvested by centrifugation and frozen at -20°C. Isolation of the 26A5 cosmid DNA was by a standard alkaline lysis miniprep (Maniatis et al., op. cit., p. 382), modified by increasing all volumes by 50% and with stirring or gentle mixing, rather than vortexing, at every step. After washing the DNA pellet in 70% ethanol, it was dissolved in TE containing 25 μg/ml ribonuclease A (Boehringer Mannheim).
Identification of EcoR 1 fragments hybridizing to GZ4-derived and TcaBi-probes. Approximately 0.4 μg of cosmid 25A10 (from XL1 Blue MR cells) and about 0.5 μg of cosmid 26A5 (from chloramphenicol-amplified DH5α cells) were each digested with about 15 units of EcoR 1 (NEB) for 85 min, frozen overnight, then heated at 65°C for five min, and electrophoresed in a 0.7% agarose gel (SeaKem® LE, 1X TEA, 80 volts, 90 min). The DNA was stained with ethidium bromide as described above, and photographed under ultraviolet light. The EcoR 1 digest of cosmid 25A10 was a complete digestion, but the sample of cosmid 26A5 was only partially digested under these conditions. The agarose gel containing the DNA fragments was subjected to depurination, denaturation and neutralization, followed by Southern blotting onto a Magna NT nylon membrane, using a high salt (20X SSC) protocol, all as described in section 2.9 of Ausubel et al. (CPMB, op. cit.). The transferred DNA was then UV-crosslinked to the nylon membrane as before.
A TcaC-peptide specific DNA fragment corresponding to the insert of plasmid isolate GZ4 was amplified by PCR® in a 100 μl reaction volume as described previously above. The amplification products from three such reactions were pooled and were extracted from a 1% GTG® agarose gel by Qiaex kit, as described above, and quantitated by fluorometry. The gel-purified DNA (100 ng) was labeled with 32P-dCTP using the High Prime Labeling Mix (Boehringer Mannheim) as described above, to a specific activity of 6.34 x 10' dpm/μg.
The 32P-labeled GZ4 probe was boiled 10 min, then added to "minimal hyb" buffer (at 1 ng/ml), and the Southern blot membrane containing the digested cosmid DNA fragments was added, and incubated for 4 hr at 60°C with gentle shaking at 50 rpm. The membrane was then washed 3 times at 25°C for about 5 min each (minimal hyb wash solution), followed by two washes for - min each at 60°C. The blot was exposed to film (with enhancer screens) for about 30 min at -70°C. The GZ4 probe hybridized strongly to the 5.0 kbp (apparent size) EcoR 1 fragment of both these two cosmids, 26A5 and 25A10.
The membrane was stripped of radioactivity by boiling for about 30 min in 0.1X SSC plus 0.1% SDS, and absence of radiolabel was checked by exposure to film. It was then hybridized at 60°C for 3.5 hours with the (denatured) TcaBi probe in "minimal hyb" buffer previously used for screening the colony membranes (above), washed as described previously, and exposed to film for 40 min at -70°C with two enhancer screens. With both cosmids, the TcaBi probe hybridized lightly with the about 5.0 kbp EcoR 1 fragment, and strongly with a fragment of approximately 2.9 kbp.
The sample of cosmid 26A5 DNA previously described (from DH5α cells) was used as the source of DNA from which to subclone the bands of interest. This DNA (2.5 μg) was digested with about 3 units of EcoR 1 (NEB) in a total volume of 30 μl for 1.5 hr, to give a partial digest, as confirmed by gel electrophoresis. Ten μg of pBC KS(+) DNA (Stratagene) were digested for 1.5 hr with 20 units of EcoR 1 in a total volume of 20 μl, leading to total digestion as confirmed by electrophoresis. Both EcoR 1-cut DNA preparations were diluted to 50 μl with water, to each an equal volume of PCI was added, the suspension was gently mixed, spun in a microcentrifuge and the aqueous supernatant was collected. DNA was precipitated by 150 μl ethanol, and the mixture was placed at -20°C overnight.
Following centrifugation and drying, the EcoR 1-digested pBC KS(+) was dissolved in 100 μl TE; the partially digested 26A5 was dissolved in 20 μl TE. DNA recovery was checked by fluorometry.
In separate reactions, approximately 60 ng of EcoR 1-digested pBC KS(+) DNA was ligated with approximately 190 ng or 270 ng of partially digested cosmid 26A5 DNA. Ligations were carried out in a volume of 20 μl at 15°C for 5 hr, using T4 ligase and buffer from New England BioLabs. The ligation mixture, diluted to 100 μl with sterile TE, was used to transform frozen, competent DH5α cells (Gibco BRL) according to the supplier's instructions. Varying amounts (25-200 μl) of the transformed cells were plated on freshly prepared solid LB-Cam35 medium with 1 mM IPTG and 50 mg/l X-gal. Plates were incubated at 37°C for about 20 hr, then chilled in the dark for approximately 3 hr to intensify color for insert selection. White colonies were picked onto patch plates of the same composition and incubated overnight at 37°C.
Two colony lifts of each of the selected patch plates were prepared as follows. After picking white colonies to fresh plates, round Magna NT nylon membranes were pressed onto the patch plates, the membrane was lifted off, and subjected to denaturation, neutralization and UV crosslinking as described above for the library colony membranes. The crosslinked colony lifts were vigorously washed, including gently wiping off the excess cell debris with a tissue. One set was hybridized with the GZ4 (TcaC) probe solution described earlier, and the other set was hybridized with the TcaBi probe solution described earlier, according to the "minimal hyb" protocol, followed by washing and film exposure as described for the library colony membranes.
Colonies showing hybridization signals either only with the GZ4 probe, with both GZ4 and TcaBi probes, or only with the TcaBi probe, were selected for further work and cells were streaked for single colony isolation onto LB-Cam35 media with IPTG and X-gal as before. Approximately 35 single colonies, from 16 different isolates, were picked into liquid LB-Cam35 media and grown overnight at 37°C; the cells were collected by centrifugation and plasmid DNA was isolated by a standard alkaline lysis miniprep according to Maniatis et al. (op. cit., p. 363). DNA pellets were dissolved in TE + 25 μg/ml ribonuclease A and DNA concentration was determined by fluorometry. The EcoR 1 digestion pattern was analyzed by gel electrophoresis. The following isolates were picked as useful. Isolate A17.2 contains religated pBC KS(+) only and was used for a (negative) control. Isolates D38.3 and C44.1 each contain only the 2.9 kbp, TcaBi-hybridizing EcoR 1 fragment inserted into pBC KS(+). These plasmids, named pDAB2000 and pDAB2001, respectively, are illustrated in Fig. 2.
Isolate A35.3 contains only the approximately 5 kbp, GZ4-hybridizing EcoR 1 fragment, inserted into pBC KS(+). This plasmid was named pDAB2002 (also Fig. 2). These isolates provided templates for DNA sequencing.
Plasmids pDAB2000 and pDAB2001 were prepared using the BIGprep™ kit as before. Cultures (30 ml) were grown overnight in TB-Cam35 to an OD600 of 2, then plasmid was isolated according to the manufacturer's directions. DNA pellets were redissolved in 100 μl TE each, and sample integrity was checked by EcoR 1 digestion and gel electrophoretic analysis.
Sequencing reactions were run in duplicate, with one replicate using as template pDAB2000 DNA, and the other replicate using as template pDAB2001 DNA. The reactions were carried out using the dideoxy dye terminator cycle sequencing method, as described above for the sequencing of the GZ4/HB14 DNAs. Initial sequencing runs utilized as primers the LacZ and T7 primers described above, plus primers based on the determined sequence of the TcaBi PCR amplification product (TH1 = ATTGCAGACTGCCAATCGCTTCGG, TH12 = GAGAGTATCCAGACCGCGGATGATCTG).
After alignment and editing of each sequencing output, each was truncated to between 250 and 350 bases, depending on the integrity of the chromatographic data as interpreted by the Perkin Elmer Applied Biosystems Division SeqEd 675 software.
Subsequent sequencing "steps" were made by selecting appropriate sequence for new primers. With a few exceptions, primers (synthesized as described above) were 24 bases in length with a 50% G+C composition. Sequencing by this method was carried out on both strands of the approximately 2.9 kbp EcoR 1 fragment.
To further serve as template for DNA sequencing, plasmid DNA from isolate pDAB2002 was prepared by BIGprep™ kit. Sequencing reactions were performed and analyzed as described above.
Initially, a T3 primer (pBS KS(+) bases 774-796: CGCGCAATTAACCCTCACTAAAG) and a T7 primer (pBS KS(+) bases 621-643: GCGCGTAATACGACTCACTATAG) were used to prime the sequencing reactions from the flanking vector sequences, reading into the insert DNA. Another set of primers (GZ4F: GTATCGATTACAACGCTGTCACTTCCC; TH13: GGGAAGTGACAGCGTTGTAATCGATAC; TH14: ATGTTGGGTGCGTCGGCTAATGGACATAAC; and LW1-204: GGGAAGTGACAGCGTTGTAATCGATAC) was made to prime from internal sequences, which were determined previously by degenerate oligonucleotide-mediated sequencing of subcloned TcaC-peptide PCR products. From the data generated during the initial rounds of sequencing, new sets of primers were designed and used to walk the entire length of the ~5 kbp fragment. A total of 55 oligo primers was used, enabling the identification of 4832 total bp of contiguous sequence.
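Two of the internal primers listed above, GZ4F and TH13, read in opposite directions over the same stretch of DNA, which can be checked by computing a reverse complement. The following is a minimal sketch; the function name is illustrative, and only the two primer sequences are taken from the text.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as quoted in the text (5' to 3').
GZ4F = "GTATCGATTACAACGCTGTCACTTCCC"
TH13 = "GGGAAGTGACAGCGTTGTAATCGATAC"

# TH13 is the exact reverse complement of GZ4F, so the two primers
# extend from the same site on opposite strands.
print(reverse_complement(GZ4F) == TH13)
```

The same helper can be applied to any of the other unambiguous primers when designing the next step of a primer walk.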
When the DNA sequence of the EcoR 1 fragment insert of pDAB2002 was combined with part of the determined sequence of the pDAB2000/pDAB2001 isolates, a total contiguous sequence of 6005 bp was generated (disclosed herein as SEQ ID NO:25). When long open reading frames were translated into the corresponding amino acids, the sequence clearly showed the TcaBi N-terminal peptide (disclosed as SEQ ID NO:3), encoded by bases 19-75, immediately following a methionine residue (start of translation). Upstream lies a potential ribosome binding site (bases 1-9), and downstream, at bases 166-228, is encoded the TcaBi-PT158 internal peptide (disclosed herein as SEQ ID NO:19). Further downstream, in the same reading frame, at bases 1738-1773, exists a sequence encoding the TcaBi-PT108 internal peptide (disclosed herein as SEQ ID NO:20). Also in the same reading frame, at bases 1897-1923, is encoded the TcaBii N-terminal peptide (disclosed herein as SEQ ID NO:5), and the reading frame continues uninterrupted to a translation termination codon at nucleotides 3586-3588.
The lack of an in-frame stop codon between the end of the sequence encoding TcaBi-PT108 and the start of the TcaBii encoding region, and the lack of a discernible ribosome binding site immediately upstream of the TcaBii coding region, indicate that peptides TcaBii and TcaBi are encoded by a single open reading frame of 3557 bp (beginning at base pair 16 in SEQ ID NO:25), and are most likely derived from a single primary gene product of 1189 amino acids (131,586 Daltons; disclosed herein as SEQ ID NO:26) by post-translational cleavage. If the amino acid immediately preceding the TcaBii N-terminal peptide represents the C-terminal amino acid of peptide TcaBi, then the predicted mass of TcaBi (627 amino acids) is 70,814 Daltons (disclosed herein as SEQ ID NO:28), somewhat higher than the size observed by SDS-PAGE (68 kDa). This peptide would be encoded by a contiguous stretch of 1881 base pairs (disclosed herein as SEQ ID NO:27). It is thought that the native C-terminus of TcaBi lies somewhat closer to the C-terminus of TcaBi-PT108. The molecular mass of PT108 [3.438 kDa; determined during N-terminal amino acid sequence analysis of this peptide] predicts a size of 30 amino acids. Using the size of this peptide to designate the C-terminus of the TcaBi coding region [Glu at position 604 of SEQ ID NO:28], the derived size of TcaBi is determined to be 604 amino acids or 68,463 Daltons, more in agreement with experimental observations.
Translation of the TcaBii peptide coding region of 1536 base pairs (disclosed herein as SEQ ID NO:29) yields a protein of 562 amino acids (disclosed herein as SEQ ID NO: 30) with predicted mass of 60,789 Daltons, which corresponds well with the observed 61 kDa.
A potential ribosome binding site (bases 3633-3638) is found 48 bp downstream of the stop codon for the tcaB open reading frame. At bases 3645-3677 is found a sequence encoding the N-terminus of peptide TcaC (disclosed as SEQ ID NO:2). The open reading frame initiated by this N-terminal peptide continues uninterrupted to base 6005 (2361 base pairs, disclosed herein as the first 2361 base pairs of SEQ ID NO:31). A gene (tcaC) encoding the entire TcaC peptide (apparent size ~165 kDa; ~1500 amino acids) would comprise about 4500 bp.
Another isolate containing cloned EcoR 1 fragments of cosmid 26A5, E20.6, was also identified by its homology to the previously mentioned GZ4 and TcaBi probes. Agarose gel analysis of EcoR 1 digests of the DNA of the plasmid harbored by this strain (pDAB2004, Fig. 2) revealed insert fragments of estimated sizes 2.9, 5, and 3.3 kbp. DNA sequence analysis initiated from primers designed from the sequence of plasmid pDAB2002 revealed that the 3.3 kbp EcoR 1 fragment of pDAB2004 lies adjacent to the 5 kbp EcoR 1 fragment represented in pDAB2002. The 2361 base pair open reading frame discovered in pDAB2002 continues uninterrupted for another 2094 bases in pDAB2004 (disclosed herein as base pairs 2362 to 4458 of SEQ ID NO:31). DNA sequence analysis using the parent cosmid 26A5 DNA as template confirmed the continuity of the open reading frame. Altogether, the open reading frame (tcaC, SEQ ID NO:31) comprises 4455 base pairs, and encodes a protein (TcaC) of 1485 amino acids [disclosed herein as SEQ ID NO:32]. The calculated molecular size of 166,214 Daltons is consistent with the estimated size of the TcaC peptide (165 kDa), and the derived amino acid sequence matches exactly that disclosed for the TcaC N-terminal sequence [SEQ ID NO:2].
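The reading-frame lengths quoted for TcaBi and TcaC can be cross-checked with codon arithmetic. The sketch below uses only figures stated in the text; the helper name is illustrative, and reading the three extra bases in the range 2362 to 4458 as a stop codon is an assumption, not something the text states.

```python
def codon_count(coding_bp: int) -> int:
    """Number of codons in a coding stretch; the length must be a multiple of 3."""
    if coding_bp % 3 != 0:
        raise ValueError(f"{coding_bp} bp is not a whole number of codons")
    return coding_bp // 3


# TcaBi: 1881 bp is stated to encode 627 amino acids.
tcabi_residues = codon_count(1881)

# tcaC: 2361 bp found in pDAB2002 plus 2094 further bases in pDAB2004.
tcac_orf_bp = 2361 + 2094
tcac_residues = codon_count(tcac_orf_bp)

# Base pairs 2362 to 4458 inclusive span 3 bases more than 2094
# (assumption: the extra triplet is the termination codon).
pdab2004_span = 4458 - 2362 + 1

# Average residue mass implied by the calculated size of 166,214 Daltons.
mean_residue_mass = 166214 / tcac_residues

print(tcabi_residues, tcac_orf_bp, tcac_residues, pdab2004_span)
print(round(mean_residue_mass, 1))
```

The totals agree with the stated 627 residues for TcaBi and with the 4455 bp, 1485-residue open reading frame for TcaC.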
The lack of an amino acid sequence corresponding to SEQ ID NO:17 (used to design the degenerate oligonucleotide primer pool) in the discovered sequence indicates that the PCR® products found in isolates GZ4 and HB14, which were used as probes in the initial library screen, were fortuitously generated by reverse-strand priming by one of the primers in the degenerate pool. Further, the derived protein sequence does not include the internal fragment disclosed herein as SEQ ID NO:18. These sequences reveal that plasmid pDAB2004 contains the complete coding region for the TcaC peptide.
Example 9
Screening of the Photorhabdus genomic library for genes encoding the TcbAii peptide
This example describes a method used to identify DNA clones that contain the TcbAii peptide-encoding genes, the isolation of the gene, and the determination of its partial DNA base sequence.
Primers and PCR reactions
The TcbAii polypeptide of the insect active preparation is ~206 kDa. The amino acid sequence of the N-terminus of this peptide is disclosed as SEQ ID NO:1. Four pools of degenerate oligonucleotide primers ("Forward primers": TH-4, TH-5, TH-6, and TH-7) were synthesized to encode a portion of this amino acid sequence, as described in Example 8, and are shown below.
Table 11
Amino Acid     Phe      Ile  Gln      Gly  Tyr      Ser      Asp      Leu      Phe
TH-4       5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  CTI      TT-3'
TH-5       5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  CTI      TT-3'
TH-6       5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  TCI      GA(T/C)  TT(A/G)  TT-3'
TH-7       5'-TT(T/C)  ATI  CA(A/G)  GGI  TA(T/C)  AG(T/C)  GA(T/C)  TT(A/G)  TT-3'

In addition, a primary ("a") and a secondary ("b") sequence of an internal peptide preparation (TcbAii-PT81) have been determined and are disclosed herein as SEQ ID NO:23 and SEQ ID NO:24, respectively. Four pools of degenerate oligonucleotides ("Reverse Primers": TH-8, TH-9, TH-10 and TH-11) were similarly designed and synthesized to encode the reverse complement of sequences that encode a portion of the peptide of SEQ ID NO:23, as shown below.

Table 12
Amino Acid     Thr  Tyr      Leu      Thr  Ser      Phe      Glu      Gln      V
TH-8       3'-TGI  AT(A/G)  GAI      TGI  AGI      AA(A/G)  CT(T/C)  GT(T/C)  C-5'
TH-9       3'-TGI  AT(A/G)  TT(A/G)  TGI  AGI      AA(A/G)  CT(T/C)  GT(T/C)  C-5'
TH-10      3'-TGI  AT(A/G)  GAI      TGI  TC(G/A)  AA(A/G)  CT(T/C)  GT(T/C)  C-5'
TH-11      3'-TGI  AT(A/G)  TT(A/G)  TGI  TC(G/A)  AA(A/G)  CT(T/C)  GT(T/C)  C-5'

Sets of these primers were used in PCR® reactions to amplify TcbAii-encoding gene fragments from the genomic Photorhabdus luminescens W-14 DNA prepared in Example 6. All PCR® reactions were run with the "Hot Start" technique using AmpliWax™ gems and other Perkin Elmer reagents and protocols. Typically, a mixture (total volume 11 μl) of MgCl2, dNTPs, 10X GeneAmp® PCR Buffer II, and the primers was added to tubes containing a single wax bead. [10X GeneAmp® PCR Buffer II is composed of 100 mM Tris-HCl, pH 8.3; and 500 mM KCl.] The tubes were heated to 80°C for 2 minutes and allowed to cool. To the top of the wax seals, a solution containing 10X GeneAmp® PCR Buffer II, DNA template, and AmpliTaq® DNA polymerase was added.
Following melting of the wax seal and mixing of components by thermal cycling, final reaction conditions (volume of 50 μl) were: 10 mM Tris-HCl, pH 8.3; 50 mM KCl; 2.5 mM MgCl2; 200 μM each in dATP, dCTP, dGTP, dTTP; 1.25 μM in a single Forward primer pool; 1.25 μM in a single Reverse primer pool; 1.25 units of AmpliTaq® DNA polymerase; and 170 ng of template DNA.
The reactions were placed in a thermocycler (as in Example 8) and run with the following program:

Table 13
Temperature    Time          Cycle Repetition
94°C           2 minutes     1X
94°C           15 seconds
55-65°C        30 seconds    30X
72°C           1 minute
72°C           7 minutes     1X
15°C           Constant

A series of amplifications was run at three different annealing temperatures (55°, 60°, 65°C) using the degenerate primer pools. Reactions with annealing at 65°C had no amplification products visible following agarose gel electrophoresis. Reactions having a 60°C annealing regime and containing primers TH-5+TH-10 produced an amplification product that had a mobility corresponding to 2.9 kbp. A lesser amount of the 2.9 kbp product was produced under these conditions with primers TH-7+TH-10. When reactions were annealed at 55°C, these primer pairs produced more of the 2.9 kbp product, and this product was also produced by primer pairs TH-5+TH-8 and TH-5+TH-11. Additional very faint 2.9 kbp bands were seen in lanes containing amplification products from primer pairs TH-7 plus TH-8, TH-9, TH-10, or TH-11.
To obtain sufficient PCR amplification product for cloning and DNA sequence determination, 10 separate PCR reactions were set up using the primers TH-5+TH-10, and were run using the above conditions with a 55°C annealing temperature. All reactions were pooled and the 2.9 kbp product was purified by Qiaex extraction from an agarose gel as described above.
Additional sequences determined for TcbAii internal peptides are disclosed herein as SEQ ID NO:21 and SEQ ID NO:22. As before, degenerate oligonucleotides (Reverse primers TH-17 and TH-18) were made corresponding to the reverse complement of sequences that encode a portion of the amino acid sequence of these peptides.
Table 14
From SEQ ID NO:21
Amino Acid     Met  Glu      Thr  Gln      Asn      Ile  Gln      Glu      Pro
TH-17      3'-TAC  CT(T/C)  TGI  GT(T/C)  TT(A/G)  TAI  GT(T/C)  GT(T/C)  GG-5'

Table 15
From SEQ ID NO:22
Amino Acid     Asn      Pro  Ile  Asn      Ile  Asn      Thr  Gly  Ile  Asp
TH-18      3'-TT(A/G)  GGI  TAI  TT(A/G)  TAI  TT(A/G)  TGI  CCI  TAI  CT(A/G)-5'

Degenerate oligonucleotides TH-18 and TH-17 were used in an amplification experiment with Photorhabdus luminescens W-14 DNA as template and primers TH-4, TH-5, TH-6, or TH-7 as the 5' (Forward) primers. These reactions amplified products of approximately 4 kbp and 4.5 kbp, respectively. These DNAs were transferred from agarose gels to nylon membranes and hybridized with a 32P-labeled probe (as described above) prepared from the 2.9 kbp product amplified by the TH-5+TH-10 primer pair. Both the 4 kbp and the 4.5 kbp amplification products hybridized strongly to the 2.9 kbp probe. These results were used to construct a map ordering the TcbAii internal peptide sequences as shown in Fig. 3. Approximate distances between the primers are shown in nucleotides in Fig. 3.
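The number of distinct oligonucleotides in a degenerate pool such as TH-17 or TH-18 follows from multiplying the choices at each mixed position. The following is a minimal sketch, assuming the primers are written with mixed positions in parentheses, for example (T/C), and that inosine (I) is a single fixed base; the function names are illustrative.

```python
import itertools
import re
from math import prod


def pool_degeneracy(primer: str) -> int:
    """Count the distinct sequences in a pool written like 'CT(T/C) TGI'."""
    mixed_positions = re.findall(r"\(([^)]*)\)", primer)
    return prod(len(options.split("/")) for options in mixed_positions)


def expand_pool(primer: str) -> list[str]:
    """Enumerate every sequence in the pool (inosine is kept as 'I')."""
    compact = primer.replace(" ", "")
    # Each token is either a parenthesized mixed position or a single base.
    tokens = re.findall(r"\(([^)]*)\)|([A-Z])", compact)
    choices = [mixed.split("/") if mixed else [base] for mixed, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


# Pools as given in Tables 14 and 15, written 3' to 5'.
TH17 = "TAC CT(T/C) TGI GT(T/C) TT(A/G) TAI GT(T/C) GT(T/C) GG"
TH18 = "TT(A/G) GGI TAI TT(A/G) TAI TT(A/G) TGI CCI TAI CT(A/G)"

print(pool_degeneracy(TH17), pool_degeneracy(TH18))
print(len(expand_pool(TH17)), len(expand_pool(TH17)[0]))
```

Using inosine at fourfold-degenerate positions keeps these pools small: five twofold positions in TH-17 and four in TH-18.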
DNA sequence of the 2.9 kbp TcbAii-encoding fragment
Approximately 200 ng of the purified 2.9 kbp fragment (prepared above) was precipitated with ethanol and dissolved in 17 μl of water. One-half of this was used as sequencing template with 25 pmol of the TH-5 pool as primers; the other half was used as template for TH-10 priming. Sequencing reactions were as given in Example 8. No reliable sequence was produced using the TH-10 primer pool; however, reactions with the TH-5 primer pool produced the sequence disclosed below:

  1 AATCGTGTTG ATCCCTATGC CGNGCCGGGT TCGGTGGAAT CGATGTCCTC ACCGGGGGTT
 61 TATTNGAGGG ANTNGTCCCG TGAGGCCAAA AA TGGAATG AAAGAAGTTC AATTTNTTAC
121 CTAGATAAAC GTCGCCCGGM TTTAGAAAGN TTANTGNTCA GCCAGAAAAT TTTGGTTGAG
181 GAAATTCCAC CGNTGGTTCT CTCTATTGAT TNGGGCCTGG CCGGGTTCGA ANNAAAACMA
241 GGAAATNCAC AAGTTGAGGT GATCGNTTTG TNGCNANCTT NTCGTTTAGG TGGGGAGAAA
301 CCTTNTCANC ACGNTTNTGA AACTGTCCGG GAAATCGTCC ATGANCGTGA NCCAGGNTTN
361 CGCCATTGG

Based on this sequence, a sequencing primer (TH-21, 5'-CCGGGCGACGT TATCTAGG-3') was designed to reverse complement bases 120-139, and initiate polymerization towards the 5' end (i.e., TH-5 end) of the gel-purified 2.9 kbp TcbAii-encoding PCR fragment. The determined sequence is shown below, and is compared to the biochemically determined N-terminal peptide sequence of TcbAii, SEQ ID NO:1.

TcbAii 2.9 kbp PCR fragment sequence confirmation
[Underlined amino acids = encoded by degenerate oligonucleotides]

SEQ ID NO:1    F   I   Q   G   Y   S   D   L   F   G   N   R   A
                       |   |   |   |   |   |   |   |   |   |   |
2.9 kbp seq    GC  ATG CAG GGG TAT AGT GAC CTG TTT GGT AAT CGT GCT
                   M   Q   G   Y   S   D   L   F   G   N   R   A

From the homology of the derived amino acid sequence to the biochemically determined one, it is clear that the 2.9 kbp PCR fragment represents the TcbA coding region. This 2.9 kbp fragment was then used as a hybridization probe to screen the Photorhabdus W-14 genomic library prepared in Example 8 for cosmids containing the TcbAii-encoding gene.
Screening the Photorhabdus cosmid library
The 2.9 kb gel-purified PCR fragment was labeled with 32P using the Boehringer Mannheim High Prime labeling kit as described in Example 8. Filters containing remnants of approximately 800 colonies from the cosmid library were screened as described previously (Example 8), and positive clones were streaked for isolated colonies and rescreened. Three clones (8A11, 25G8, and 26D1) gave positive results through several screening and characterization steps. No hybridization of the TcbAii-specific probe was ever observed with any of the four cosmids identified in Example 8, which contain the tcaB and tcaC genes. DNA from cosmids 8A11, 25G8, and 26D1 was digested with restriction enzymes Bgl 2, EcoR 1 or Hind 3 (either alone or in combination with one another), and the fragments were separated on an agarose gel and transferred to a nylon membrane as described in Example 8. The membrane was hybridized with 32P-labeled probe prepared from the 4.5 kbp fragment (generated by amplification of Photorhabdus genomic DNA with primers TH-5+TH-17). The patterns generated from cosmid DNAs 8A11 and 26D1 were identical to those generated with similarly cut genomic DNA on the same membrane. It is concluded that cosmids 8A11 and 26D1 are accurate representations of the genomic TcbAii-encoding locus. However, cosmid 25G8 has a single Bgl 2 fragment which is slightly larger than the genomic DNA. This may result from positioning of the insert within the vector.
DNA sequence of the tcbA-encoding gene
The membrane hybridization analysis of cosmid 26D1 revealed that the 4.5 kbp probe hybridized to a single large EcoR 1 fragment (greater than 9 kbp). This fragment was gel purified and ligated into the EcoR 1 site of pBC KS(+) as described in Example 8, to generate plasmid pBC-S1/R1.
The partial DNA sequence of the insert DNA of this plasmid was determined by "primer walking" from the flanking vector sequence, using procedures described in Example 8. Further sequence was generated by extension from new oligonucleotides designed from the previously determined sequence. When compared to the determined DNA sequence for the tcbA gene identified by other methods (disclosed herein as SEQ ID NO:11, as described in Example 12 below), complete homology was found to nucleotides 1-272, 319-826, 2578-3036, and 3068-3540 (total bases = 1712). It was concluded that both approaches can be used to identify DNA fragments encoding the TcbAii peptide.
Analysis of the derived amino acid sequence of the tcbA gene.
The sequence of the DNA fragment identified as SEQ ID NO:11 encodes a protein whose derived amino acid sequence is disclosed herein as SEQ ID NO:12. Several features verify the identity of the gene as that encoding the TcbAii protein. The TcbAii N-terminal peptide (SEQ ID NO:1; Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala) is encoded as amino acids 88-100. The TcbAii internal peptide TcbAii-PT81(a) (SEQ ID NO:23) is encoded as amino acids 1065-1077, and TcbAii-PT81(b) (SEQ ID NO:24) is encoded as amino acids 1571-1592. Further, the internal peptide TcbAii-PT56 (SEQ ID NO:22) is encoded as amino acids 1474-1488, and the internal peptide TcbAii-PT103 (SEQ ID NO:24) is encoded as amino acids 1614-1639. It is obvious that this gene is an authentic clone encoding the TcbAii peptide as isolated from insecticidal protein preparations of Photorhabdus luminescens strain W-14.
The protein isolated as peptide TcbAii is derived from cleavage of a longer peptide. Evidence for this is provided by the fact that the nucleotides encoding the TcbAii N-terminal peptide SEQ ID NO:1 are preceded by 261 bases (encoding 87 N-terminal-proximal amino acids) of a longer open reading frame (SEQ ID NO:11). This reading frame begins with nucleotides that encode the amino acid sequence Met Gln Asn Ser Leu, which corresponds to the N-terminal sequence of the large peptide TcbA, and is disclosed herein as SEQ ID NO:15. It is thought that TcbA is the precursor protein for TcbAii.
Relationship of tcbA, tcaB and tcaC genes.
The tcaB and tcaC genes are closely linked and may be transcribed as a single mRNA (Example 8). The tcbA gene is borne on cosmids that apparently do not overlap the ones harboring the tcaB and tcaC cluster, since the respective genomic library screens identified different cosmids. However, comparison of the amino acid sequences encoded by the tcaB and tcaC genes with the tcbA gene reveals a substantial degree of homology. The amino acid conservation (Protein Alignment Mode of MacVector™ Sequence Analysis Software, scoring matrix pam250, hash value = 2; Kodak Scientific Imaging Systems, Rochester, NY) is shown in Fig. 4.
On the score line of each panel in Fig. 4, up carats (^) indicate homology or conservative amino acid changes, and down carats (v) indicate nonhomology.
This analysis shows that the amino acid sequence of the TcbA peptide from residues 1739 to 1894 is highly homologous to amino acids 441 to 603 of the TcaBi peptide (162 of the total 627 amino acids of P8, SEQ ID NO:28). In addition, the sequence of TcbA amino acids 1932 to 2459 is highly homologous to amino acids 12 to 531 of peptide TcaBii (520 of the total 562 amino acids; SEQ ID NO:30). Considering that the TcbA peptide (SEQ ID NO:12) comprises 2505 amino acids, a total of 684 amino acids (27%) at the C-proximal end of it is homologous to the TcaBi or TcaBii peptides, and the homologies are arranged colinear to the arrangement of the putative TcaB preprotein (SEQ ID NO:26). A sizeable gap in the TcbA homology coincides with the junction between the TcaBi and TcaBii portions of the TcaB preprotein.
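The residue counts in this comparison follow from inclusive range arithmetic, sketched below with the coordinates quoted in the text. The helper name is illustrative. Note that counting residues 441 to 603 inclusively gives 163, one more than the 162 quoted, which may reflect a different counting convention in the original analysis.

```python
def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive range first..last."""
    return last - first + 1


# Homologous regions on TcbA (2505 residues in total, SEQ ID NO:12).
TCBA_LENGTH = 2505
tcba_regions = [(1739, 1894), (1932, 2459)]
tcba_homologous = sum(span_length(a, b) for a, b in tcba_regions)
tcba_percent = 100 * tcba_homologous / TCBA_LENGTH

# Matching regions on the TcaB side.
tcabi_region = span_length(441, 603)   # TcaBi, 627 residues in total
tcabii_region = span_length(12, 531)   # TcaBii, 562 residues in total

# Residues between the two TcbA regions, i.e. the gap in homology.
tcba_gap = 1932 - 1894 - 1

print(tcba_homologous, round(tcba_percent), tcabi_region, tcabii_region, tcba_gap)
```

The two TcbA regions sum to the 684 residues stated, or about 27% of the 2505-residue protein.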
Clearly the TcbA and TcaB gene products are evolutionarily related, and it is proposed that they share some common function(s) in Photorhabdus.
Example 10
Characterization of zinc-metalloproteases in Photorhabdus Broth: Protease Inhibition, Classification, and Purification
Protease Inhibition and Classification Assays: Protease assays were performed using FITC-casein dissolved in water as substrate (0.08% final assay concentration). Proteolysis reactions were performed at 25°C for 1 h in the appropriate buffer with 25 μl of Photorhabdus broth (150 μl total reaction volume). Samples were also assayed in the presence and absence of dithiothreitol. After incubation, an equal volume of 12% trichloroacetic acid was added to precipitate undigested protein. Following precipitation for 0.5 h and subsequent centrifugation, 100 μl of the supernatant was placed into a 96-well microtiter plate and the pH of the solution was adjusted by addition of an equal volume of 4N NaOH. Proteolysis was then quantitated using a Fluoroskan II fluorometric plate reader at excitation and emission wavelengths of 485 and 538 nm, respectively. Protease activity was tested over a range from pH 5.0-10.0 in 0.5 unit increments. The following buffers were used at 50 mM final concentration: sodium acetate (pH 5.0-6.5); Tris-HCl (pH 7.0-8.0); and bis-Tris propane (pH 8.5-10.0). To identify the class of protease(s) observed, crude broth was treated with a variety of protease inhibitors (0.5 μg/μl final concentration) and then examined for protease activity at pH 8.0 using the substrate described above. The protease inhibitors used included E-64 (L-trans-epoxysuccinylleucylamido[4-guanidino]-butane), 3,4-dichloroisocoumarin, leupeptin, pepstatin, amastatin, ethylenediaminetetraacetic acid (EDTA) and 1,10-phenanthroline. Protease assays performed over a pH range revealed that protease(s) were indeed present which exhibited maximal activity at ~pH 8.0 (Table 16).
Addition of DTT did not have any effect on protease activity. Crude broth was then treated with a variety of protease inhibitors (Table 17). Treatment of crude broth with the inhibitors described above revealed that 1,10 phenanthroline caused complete inhibition of all protease activity when added at a final concentration of 50 µg, with the IC50 = 5 µg in 100 µl of a 2 mg/ml crude broth solution. These data indicate that the most abundant protease(s) found in the Photorhabdus broth are from the zinc-metalloprotease class of enzymes.
Table 16
Effect of pH on the protease activity found in a Day 1 production of Photorhabdus luminescens (strain W-14).

pH      Flu. Units (a)     Percent Activity (b)
5.0      3013 ± 78          17
5.5      7994 ± 448         45
6.0     12965 ± 483         74
6.5     14390 ± 1291        82
7.0     14386 ± 1287        82
7.5     14135 ± 198         80
8.0     17582 ± 831        100
8.5     16183 ± 953         92
9.0     16795 ± 760         96
9.5     16279 ± 1022        93
10.0    15225 ± 210         87

a. Flu. Units = Fluorescence Units (maximum = ~28,000; background = ~2200).
b. Percent activity relative to the maximum at pH 8.0.

Table 17
Effect of different protease inhibitors on the protease activity at pH 8 found in a Day 1 production of Photorhabdus luminescens (strain W-14).

Inhibitor                      Corrected Flu. Units (a)    Percent Inhibition (b)
Control                        13053                        0
E-64                           14259                        0
1,10 Phenanthroline (c)           15                       99
3,4 Dichloroisocoumarin (d)     7956                       39
Leupeptin                      13074                        0
Pepstatin (c)                  13441                        0
Amastatin                      12474                        4
DMSO Control                   12005                        8
Methanol Control               12125                        7

a. Corrected Flu. Units = Fluorescence Units - background (2200 flu. units).
b. Percent inhibition relative to protease activity at pH 8.0.
c. Inhibitors were dissolved in methanol.
d. Inhibitors were dissolved in DMSO.
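The percentage columns of Tables 16 and 17 can be recomputed from the fluorescence readings. The sketch below does so under two assumptions that are inferred from the tabulated values rather than stated in the text: percent activity is rounded to the nearest integer, and percent inhibition is rounded down, with readings above the control reported as 0. All names in the code are illustrative.

```python
import math

# Mean fluorescence units at each pH, transcribed from Table 16.
TABLE_16 = {5.0: 3013, 5.5: 7994, 6.0: 12965, 6.5: 14390, 7.0: 14386,
            7.5: 14135, 8.0: 17582, 8.5: 16183, 9.0: 16795, 9.5: 16279,
            10.0: 15225}

# Background-corrected fluorescence units, transcribed from Table 17.
CONTROL_UNITS = 13053
TABLE_17 = {"E-64": 14259, "1,10 phenanthroline": 15,
            "3,4 dichloroisocoumarin": 7956, "leupeptin": 13074,
            "pepstatin": 13441, "amastatin": 12474,
            "DMSO control": 12005, "methanol control": 12125}


def percent_activity(units, max_units):
    """Activity as a percentage of the maximum reading (assumed: rounded)."""
    return round(100 * units / max_units)


def percent_inhibition(units, control_units):
    """Inhibition relative to the control (assumed: floored, never negative)."""
    return max(0, math.floor(100 * (1 - units / control_units)))


max_units = max(TABLE_16.values())
activity = {ph: percent_activity(u, max_units) for ph, u in TABLE_16.items()}
inhibition = {name: percent_inhibition(u, CONTROL_UNITS)
              for name, u in TABLE_17.items()}

for ph, pct in activity.items():
    print(f"pH {ph:>4}: {pct}% of maximum")
for name, pct in inhibition.items():
    print(f"{name}: {pct}% inhibition")
```

Note that Table 16 percentages are reproduced from the raw readings, without subtracting the ~2200-unit background.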
The isolation of a zinc-metalloprotease was performed by applying dialyzed 10-80% ammonium sulfate pellet to a Q Sepharose column equilibrated at 50 mM 32P04, pH 7.0 as described in Example 5 for Photorhabdus toxin. After extensive washing, a 0 to 0.5 M NaCl gradient was used to elute toxin protein. The majority of biological activity and protein was eluted from 0.15-0.45 M NaCl. However, it was observed that the majority of proteolytic activity was present in the 0.25-0.35 M NaCl fraction with some activity in the 0.15-0.25 M NaCl fraction. SDS PAGE analysis of the 0.25-0.35 M NaCl fraction showed a major peptide band of approximately 60 kDa. The 0.15-0.25 M NaCl fraction contained a similar 60 kDa band but at lower relative protein concentration. Subsequent gel filtration of this fraction using a Superose 12 HR 16/50 column resulted in a major peak migrating at 57.5 kDa that contained a predominant (> 90% of total stained protein) 58.5 kDa band by SDS PAGE analysis. Additional analysis of this fraction using various protease inhibitors as described above determined that the protease was a zinc-metalloprotease. Nearly all of the protease activity present in Photorhabdus broth at day 1 of fermentation corresponded to the ~58 kDa zinc-metalloprotease.
In a second isolation of zinc-metalloprotease(s), W-14 Photorhabdus broth grown for three days was taken and protease activity was visualized using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) laced with gelatin as described in Schmidt, T.M., Bleakley, B. and Nealson, K.H. 1988. SDS running gels (5.5 x 3 cm) were made with 12.5% polyacrylamide (40% stock solution of acrylamide/bis-acrylamide; Sigma Chemical Co., St. Louis, MO) into which 0.1% gelatin final concentration (Biorad EIA grade reagent; Richmond, CA) was incorporated upon dissolving in water. SDS-stacking gels (1.0 x 3 cm) were made with 5% polyacrylamide, also laced with 0.1% gelatin. Typically, 2.5 μg of protein to be tested was diluted in 0.03 ml of SDS-PAGE loading buffer without dithiothreitol (DTT) and loaded onto the gel. Proteins were electrophoresed in SDS running buffer (Laemmli, U.K. 1970. Nature 227, 680) at 0°C and at 8 mA. After electrophoresis was complete, the gel was washed for 2 h in 2.5% (v/v) Triton X-100. Gels were then incubated for 1 h at 37°C in 0.1 M glycine (pH 8.0). After incubation, gels were fixed and stained overnight with 0.1% amido black in methanol-acetic acid-water (30:10:60, vol./vol./vol.; Sigma Chemical Co.). Protease activity was visualized as light areas against a dark, amido black stained background due to proteolysis and subsequent diffusion of incorporated gelatin. At least three distinct bands produced by proteolytic activity at 58, 41, and 38 kDa were observed.
Activity assays of the different proteases in W-14 day three culture broth were performed using FITC-casein dissolved in water as substrate (0.02% final assay concentration). Proteolysis experiments were performed at 37 °C for 0-0.5 h in 0.1M Tris-HCl (pH 8.0) with different protein fractions in a total volume of 0.15 ml. Reactions were terminated by addition of an equal volume of 12% trichloroacetic acid (TCA) dissolved in water.
After incubation at room temperature for 0.25 h, samples were centrifuged at 10,000 x g for 0.25 h and 0.10 ml aliquots were removed and placed into 96-well microtiter plates. The solution was then neutralized by the addition of an equal volume of 2 N sodium hydroxide, followed by quantitation using a Fluoroskan II fluorometric plate reader with excitation and emission wavelengths of 485 and 538 nm, respectively. Activity measurements were performed using FITC-casein with different protease concentrations at 37°C for 0-10 min. A unit of activity was arbitrarily defined as the amount of enzyme needed to produce 1000 fluorescent units/min, and specific activity was defined as units/mg of protease.
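The unit definition above translates into a two-step calculation. The following is a minimal sketch; the function names and the example reading (5000 fluorescence units in 10 min from 0.002 mg of protease) are hypothetical and are not data from the specification.

```python
FLUORESCENCE_UNITS_PER_ACTIVITY_UNIT = 1000.0  # per minute, as defined in the text


def activity_units(fluorescence_increase, minutes):
    """Units of activity: 1 unit yields 1000 fluorescence units per minute."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return fluorescence_increase / minutes / FLUORESCENCE_UNITS_PER_ACTIVITY_UNIT


def specific_activity(units, protease_mg):
    """Specific activity in units per mg of protease."""
    if protease_mg <= 0:
        raise ValueError("protease_mg must be positive")
    return units / protease_mg


# Hypothetical reading: 5000 fluorescence units produced in 10 min by 0.002 mg.
units = activity_units(fluorescence_increase=5000, minutes=10)
sa = specific_activity(units, protease_mg=0.002)
print(f"{units:.2f} units, {sa:.0f} units/mg")
```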
Inhibition studies were performed using two zinc-metalloprotease inhibitors, 1,10 phenanthroline and N-(α-rhamnopyranosyloxyhydroxyphosphinyl)-Leu-Trp (phosphoramidon), with stock solutions of the inhibitors dissolved in 100% ethanol and water, respectively. Stock concentrations were typically 10 mg/ml and 5 mg/ml for 1,10 phenanthroline and phosphoramidon, respectively, with final concentrations of inhibitor at 0.5-1.0 mg/ml per reaction. Treatment of three day W-14 crude broth with 1,10 phenanthroline, an inhibitor of all zinc metalloproteases, resulted in complete elimination of all protease activity, while treatment with phosphoramidon, an inhibitor of thermolysin-like proteases (Weaver, L.H., Kester, W.R., and Matthews, B.W. 1977. J. Mol. Biol. 114, 119-132), resulted in ~56% reduction of protease activity. The residual proteolytic activity could not be further reduced with additional phosphoramidon.
The proteases of three day W-14 Photorhabdus broth were purified as follows: 4.0 liters of broth were concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device. The flow-through material having native proteins less than 100 kDa in size (3.8 L) was concentrated to 0.375 L using an Amicon spiral ultrafiltration cartridge Type S1Y10 attached to an Amicon M-12 filtration device. The retentate material contained proteins ranging in size from 10-100 kDa. This material was loaded onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystems (Framingham, MA) Poros® 50 HQ strong anion exchange packing that had been equilibrated in 10 mM sodium phosphate buffer (pH 7.0). Proteins were loaded on the column at a flow rate of 5 ml/min, followed by washing unbound protein with buffer until A280 = 0.00. Afterwards, proteins were eluted using a NaCl gradient of 0-1.0 M NaCl in 40 min at a flow rate of 7.5 ml/min. Fractions were assayed for protease activity, supra., and active fractions were pooled. Proteolytically active fractions were diluted with 50% (v/v) 10 mM sodium phosphate buffer (pH 7.0) and loaded onto a Pharmacia HR 10/10 Mono Q column equilibrated in 10 mM sodium phosphate. After washing the column with buffer until A280 = 0.00, proteins were eluted using a NaCl gradient of 0-0.5 M NaCl for 1 h at a flow rate of 2.0 ml/min. Fractions were assayed for protease activity. Those fractions having the greatest amount of phosphoramidon-sensitive protease activity, the phosphoramidon-sensitive activity being due to the 41/38 kDa protease, infra., were pooled. These fractions were found to elute at a range of 0.15-0.25 M NaCl. Fractions containing a predominance of phosphoramidon-insensitive protease activity, the 58 kDa protease, were also pooled. These fractions were found to elute at a range of 0.25-0.35 M NaCl.
The phosphoramidon-sensitive protease fractions were then concentrated to a final volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device with a Biomax-5K NMWL membrane. This material was applied at a flow rate of 0.5 ml/min to a Pharmacia HR 10/30 column that had been packed with Pharmacia Sephadex G-50 equilibrated in 10 mM sodium phosphate buffer (pH 7.0)/0.1 M NaCl. Fractions having the maximal phosphoramidon-sensitive protease activity were then pooled and centrifuged over a Millipore Ultrafree-15 centrifugal filter device with a Biomax-50K NMWL membrane, followed by further separation on a Pharmacia Superdex-75 column. Fractions containing the protease were pooled.
Analysis of the purified 58 and 41/38 kDa proteases revealed that, while both types of protease were completely inhibited with 1,10 phenanthroline, only the 41/38 kDa protease was inhibited with phosphoramidon. Further analysis of crude broth indicated that 23% of the total protease activity of day 1 W-14 broth is due to the 41/38 kDa protease, increasing to 44% in day three W-14 broth.
Standard SDS-PAGE analysis for examining protein purity and obtaining amino terminal sequence was performed using 4-20% gradient MiniPlus SepraGels purchased from Integrated Separation Systems (Natick, MA). Proteins to be amino-terminal sequenced were blotted onto PVDF membrane (ProBlott Membranes; Applied Biosystems, Foster City, CA) following purification, infra., visualized with 0.1% amido black, excised, and sent to Cambridge Prochem (Cambridge, MA) for sequencing.
The deduced amino terminal sequences of the 58 kDa (SEQ ID NO:45) and 41/38 kDa (SEQ ID NO:44) proteases from three day old W-14 broth were DV-GSEKANEKLK (SEQ ID NO:45) and DSGDDDKVTTITDIHR (SEQ ID NO:44), respectively.
Sequencing of the 41/38 kDa protease revealed several amino termini, each one having an additional amino acid removed by proteolysis. Examination of the primary, secondary, tertiary and quaternary sequences for the 38 and 41 kDa polypeptides allowed for deduction of the sequence shown above and revealed that these two proteases are homologous.
Example 11, Part A
Screening of Photorhabdus Genomic Library via use of Antibodies for Genes encoding TcbA Peptide
In parallel to the sequencing described above, suitable probing and sequencing was done based on the TcbAii peptide (SEQ ID NO:1). This sequencing was performed by preparing bacterial culture broths and purifying the toxin as described in Examples 1 and 2 above.
Genomic DNA was isolated from the Photorhabdus luminescens strain W-14 grown in Grace's insect tissue culture medium. The bacteria were grown in 5 ml of culture medium in a 250 ml Erlenmeyer flask at 28°C and 250 rpm for approximately 24 hours. Bacterial cells from 100 ml of culture medium were pelleted at 5000 x g for 10 minutes. The supernatant was discarded, and the cell pellets then were used for the genomic DNA isolation.
The genomic DNA was isolated using a modification of the CTAB method described in Section 2.4.3 of Ausubel (supra.). The section entitled "Large Scale CsCl prep of bacterial genomic DNA" was followed through step 6. At this point, an additional chloroform/isoamyl alcohol (24:1) extraction was performed, followed by a phenol/chloroform/isoamyl alcohol (25:24:1) extraction step and a final chloroform/isoamyl alcohol (24:1) extraction. The DNA was precipitated by the addition of a 0.6 volume of isopropanol. The precipitated DNA was hooked and wound around the end of a bent glass rod, dipped briefly into 70% ethanol as a final wash, and dissolved in 3 ml of TE buffer.
The DNA concentration, estimated by optical density at 280/260 nm, was approximately 2 mg/ml.
Using this genomic DNA, a library was prepared.
Approximately 50 µg of genomic DNA was partly digested with Sau3AI. Then NaCl density gradient centrifugation was used to size fractionate the partially digested DNA fragments. Fractions containing DNA fragments with an average size of 12 kb, or larger, as determined by agarose gel electrophoresis, were ligated into the plasmid BlueScript (Stratagene, La Jolla, California) and transformed into an E. coli DH5α or DH10B strain.
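The specification gives only the resulting concentration (about 2 mg/ml) and the mass digested (about 50 µg). As a hedged illustration of how such numbers are usually obtained, the sketch below applies the common approximation that an A260 of 1.0 corresponds to roughly 50 µg/ml of double-stranded DNA. The conversion factor, the example absorbance and the 1:100 dilution are assumptions introduced here, not values from the source.

```python
# Common spectrophotometric approximation for double-stranded DNA (assumption).
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate dsDNA concentration of the undiluted stock from an A260 reading."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def volume_for_mass_ul(mass_ug, concentration_ug_per_ml):
    """Volume of stock (in microliters) that contains the requested mass."""
    return mass_ug / concentration_ug_per_ml * 1000.0


# Hypothetical reading: A260 of 0.40 measured on a 1:100 dilution of the stock.
stock = dsdna_concentration_ug_per_ml(a260=0.40, dilution_factor=100)
digest_volume = volume_for_mass_ul(mass_ug=50, concentration_ug_per_ml=stock)
print(f"stock: {stock / 1000:.1f} mg/ml; 50 ug is contained in {digest_volume:.0f} ul")
```

With a stock near 2 mg/ml, the 50 µg used for the partial digest corresponds to about 25 µl.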
Separately, purified aliquots of the protein were sent to the biotechnology hybridoma center at the University of Wisconsin, Madison for production of monoclonal antibodies to the proteins. The material that was sent was the HPLC purified fraction containing native bands 1 and 2 which had been denatured at 65°C, and 20 μg of which was injected into each of four mice. Stable monoclonal antibody-producing hybridoma cell lines were recovered after spleen cells from an immunized mouse were fused with a stable myeloma cell line. Monoclonal antibodies were recovered from the hybridomas.
Separately, polyclonal antibodies were created by taking native agarose gel purified band 1 (see Example 1) protein, which was then used to immunize a New Zealand white rabbit. The protein was prepared by excising the band from the native agarose gels, briefly heating the gel pieces to 65°C to melt the agarose, and immediately emulsifying with adjuvant. Freund's complete adjuvant was used for the primary immunizations and Freund's incomplete was used for 3 additional injections at monthly intervals. For each injection, approximately 0.2 ml of emulsified band 1, containing 50 to 100 micrograms of protein, was delivered by multiple subcutaneous injections into the back of the rabbit. Serum was obtained 10 days after the final injection and additional bleeds were performed at weekly intervals for 3 weeks. The serum complement was inactivated by heating to 56°C for 15 minutes and then stored at -20°C.
The monoclonal and polyclonal antibodies were then used to screen the genomic library for the expression of antigens which could be detected by the epitope. Positive clones were detected on nitrocellulose filter colony lifts. An immunoblot analysis of the positive clones was undertaken. An analysis of the clones as defined by both immunoblot and Southern analysis resulted in the tentative identification of five classes of clones.
In the first class of clone was a gene encoding the peptide designated here as TcbAii. Full DNA sequence of this gene (tcbA) was obtained. It is set forth as SEQ ID NO:11. Confirmation that the sequence encodes the internal sequence of SEQ ID NO:1 is demonstrated by the presence of SEQ ID NO:1 at amino acid number 98 from the deduced amino acid sequence created by the open reading frame of SEQ ID NO:11. This can be confirmed by referring to SEQ ID NO:12, which is the deduced amino acid sequence created by SEQ ID NO:11.
The second class of toxin peptides contains the segments referred to above as TcaBi, TcaBii and TcaC. Following the screening of the library with the polyclonal antisera, this second class of toxin genes was identified by several clones which produced different size proteins, all of which cross-reacted with the polyclonal antibody on an immunoblot and were also found to share DNA homology on a Southern blot. Sequence comparison revealed that they belonged to the gene complex designated TcaB and TcaC above.
Three other classes of antibody toxin clones were also isolated in the polyclonal screen. These classes produced proteins that cross-react with a polyclonal antibody and also shared DNA homology with the classes as determined by Southern blotting. The classes have been designated Class III, Class IV and Class V. It was also possible to identify monoclonals that cross-reacted with Class I, II, III, and IV. This suggests that all have regions of high protein homology. Thus, it appears that the P. luminescens extracellular protein genes represent a family of genes which are evolutionarily related.
To further pursue the concept that there might be evolutionarily related variations in the toxin peptides contained within this organism, two approaches have been undertaken to examine other strains of P. luminescens for the presence of related proteins. This was done both by PCR amplification of genomic DNA and by immunoblot analysis using the polyclonal and monoclonal antibodies. The results indicate that related proteins are produced by P. luminescens strains WX-2, WX-3, WX-4, WX-5, WX-6, WX-7, X-?, WX-11, WX-12, WX-15 and W-14.
Example 11, Part B
Sequence and analysis of Class III toxin clones - tcc
Further DNA sequencing was performed on plasmids isolated from Class III E. coli clones described in Example 11, Part A. The nucleotide sequence was shown to be three closely linked open reading frames at this genomic locus. This locus was designated tcc with the three open reading frames designated tccA (SEQ ID NO:56), tccB (SEQ ID NO:58) and tccC (SEQ ID NO:60) (Fig. 6B).
The deduced amino acid sequence from the tccA open reading frame indicates the gene encodes a protein of 105,459 Da. This protein was designated TccA. The first 12 amino acids of this protein match the N-terminal sequence obtained from a 108 kDa protein, SEQ ID NO:7, previously identified as part of the toxin complex.
The deduced amino acid sequence from the tccB open reading frame indicates this gene encodes a protein of 175,716 Da. This protein was designated TccB. The first 11 amino acids of this protein match the N-terminal sequence obtained from a protein with estimated molecular weight of 185 kDa, SEQ ID NO:3.
The deduced amino acid sequence of tccC indicated that this open reading frame encodes a protein of 111,694 Da and the protein product was designated TccC.
Example 12
Characterization of Photorhabdus Strains
In order to establish that the collection described herein was comprised of Photorhabdus strains, the strains herein were assessed in terms of recognized microbiological traits that are characteristic of Photorhabdus and which differentiate it from other Enterobacteriaceae and Xenorhabdus spp. (Farmer, J.J. 1984. Bergey's Manual of Systematic Bacteriology, vol 1. pp. 510-511. (ed. Krieg, N.R. and Holt, J.G.). Williams & Wilkins, Baltimore; Akhurst and Boemare, 1988; Boemare et al., 1993). These characteristic traits are as follows: Gram's stain negative rods, organism size of 0.5-2 µm in width and 2-10 µm in length, red/yellow colony pigmentation, presence of crystalline inclusion bodies, presence of catalase, inability to reduce nitrate, presence of bioluminescence, ability to take up dye from growth media, positive for protease production, growth-temperature range below 37°C, survival under anaerobic conditions and positively motile (Table 18). Reference Escherichia coli, Xenorhabdus and Photorhabdus strains were included in all tests for comparison. The overall results are consistent with all strains being part of the family Enterobacteriaceae and the genus Photorhabdus.
A luminometer was used to establish the bioluminescence of each strain and provide a quantitative and relative measurement of light production. For measurement of relative light emitting units, the broths from each strain (cells and media) were measured at three time intervals after inoculation in liquid culture (6, 12, and 24 hr) and compared to background luminosity (uninoculated media and water). Prior to measuring light emission from the various broths, cell density was established by measuring light absorbance (560 nm) in a Gilford Systems (Oberlin, OH) spectrophotometer using a sipper cell. Appropriate dilutions were then made (to normalize optical density to 1.0 unit) before measuring luminosity. Aliquots of the diluted broths were then placed into cuvettes (300 µl each) and read in a Bio-Orbit 1251 Luminometer (Bio-Orbit Oy, Turku, Finland). The integration period for each sample was 45 seconds. The samples were continuously mixed (spun in baffled cuvettes) while being read to provide oxygen availability. A positive test was determined as being ≥ 5-fold background luminescence (~5-10 units). In addition, colony luminosity was detected with photographic film overlays and visually, after adaptation in a darkroom.
The Gram's staining characteristics of each strain were established with a commercial Gram's stain kit (BBL, Cockeysville, MD) used in conjunction with Gram's stain control slides (Fisher Scientific, Pittsburgh, PA). Microscopic evaluation was then performed using a Zeiss microscope (Carl Zeiss, Germany) 100X oil immersion objective lens (with 10X ocular and 2X body magnification). Microscopic examination of individual strains for organism size, cellular description and inclusion bodies (the latter after logarithmic growth) was performed using wet mount slides (10X ocular, 2X body and 40X objective magnification) with oil immersion and phase contrast microscopy with a micrometer (Akhurst, R.J. and Boemare, N.E. 1990. Entomopathogenic Nematodes in Biological Control (ed. Gaugler, R. and Kaya, H.). pp. 75-90. CRC Press, Boca Raton, USA; Baghdiguian S., Boyer-Giglio M.H., Thaler, J.O., Bonnot G., Boemare N. 1993. Biol. Cell 79, 177-185).
Colony pigmentation was observed after inoculation on Bacto nutrient agar (Difco Laboratories, Detroit, MI) prepared as per label instructions. Incubation occurred at 28°C and descriptions were produced after 5-7 days. To test for the presence of the enzyme catalase, a colony of the test organism was removed on a small plug from a nutrient agar plate and placed into the bottom of a glass test tube. One ml of a household hydrogen peroxide solution was gently added down the side of the tube. A positive reaction was recorded when bubbles of gas (presumptive oxygen) appeared immediately or within 5 seconds. Controls of uninoculated nutrient agar and hydrogen peroxide solution were also examined.
To test for nitrate reduction, each culture was inoculated into 10 ml of Bacto Nitrate Broth (Difco Laboratories, Detroit, MI). After 24 hours incubation at 28°C, nitrite production was tested by the addition of two drops of sulfanilic acid reagent and two drops of alpha-naphthylamine reagent (see Difco Manual, 10th edition, Difco Laboratories, Detroit, MI, 1984). The generation of a distinct pink or red color indicates the formation of nitrite from nitrate.
The ability of each strain to take up dye from growth media was tested with Bacto MacConkey agar containing the dye neutral red, Bacto Tergitol-7 agar containing the dye bromothymol blue, and Bacto EMB Agar containing the dye eosin-Y (agars from Difco Laboratories, Detroit, MI, all prepared according to label instructions). After inoculation on these media, dye uptake was recorded after incubation at 28°C for 5 days. Growth on these latter media is characteristic for members of the family Enterobacteriaceae.
Motility of each strain was tested using a solution of Bacto Motility Test Medium (Difco Laboratories, Detroit, MI) prepared as per label instructions. A butt-stab inoculation was performed with each strain and motility was judged macroscopically by a diffuse zone of growth spreading from the line of inoculum. In many cases, motility was also observed microscopically from liquid culture under wet mount slides. Biochemical nutrient evaluation for each strain was performed using BBL Enterotube II (Becton Dickinson, Germany). Product instructions were followed with the exception that incubation was carried out at 23°C for 5 days. Results were consistent with previously cited reports for Photorhabdus.
The production of protease was tested by observing hydrolysis of gelatin using Bacto gelatin (Difco Laboratories, Detroit, MI) plates made as per label instructions. Cultures were inoculated and the plates were incubated at 28°C for 5 days. To assess growth at different temperatures, agar plates [2% proteose peptone #3 with two percent Bacto-Agar (Difco, Detroit, MI) in deionized water] were streaked from a common source of inoculum. Plates were sealed with Nesco® film and incubated at 20, 28 and 37°C for up to three weeks. Plates showing no growth at 37°C showed no cell viability after transfer to a 29°C incubator for one week.
Oxygen requirements for Photorhabdus strains were tested in the following manner. A butt-stab inoculation into fluid thioglycolate broth medium (Difco, Detroit, MI) was made. The tubes were incubated at room temperature for one week and cultures were then examined for type and extent of growth. The indicator resazurin demonstrates the level of medium oxidation or the aerobiosis zone (Difco Manual, 10th edition, Difco Laboratories, Detroit, MI). Growth zone results obtained for the Photorhabdus strains tested were consistent with those of a facultative anaerobic microorganism.
Table 18
Taxonomic Traits of Photorhabdus Strains (Traits Assessed)
Colony pigmentation key (partial): ... tan, LY = light yellow, YT = yellow tan, and LO = light orange.
The following strains were deposited on the indicated date in the Agricultural Research Service Patent Culture Collection (NRRL), National Center for Agricultural Utilization Research, ARS-USDA, 1815 North University St., Peoria IL 61604 USA.
Cellular fatty acid analysis is a recognized tool for bacterial characterization at the genus and species level (Tornabene, T.G. 1985. Lipid Analysis and the Relationship to Chemotaxonomy in Methods in Microbiology, Vol 18, 209-234; Goodfellow, M. and O'Donnell, A.G. 1993. Roots of Bacterial Systematics in Handbook of New Bacterial Systematics (ed. Goodfellow, M. & O'Donnell, A.G.) pp. 3-54. London: Academic Press Ltd.). These references are incorporated herein by reference, and were used to confirm that our collection was related at the genus level. Cultures were shipped to an external contract laboratory for fatty acid methyl ester analysis (FAME) using a Microbial ID (MIDI, Newark, DE, USA) Microbial Identification System (MIS). The MIS system consists of a Hewlett Packard HP5890A gas chromatograph with a 25 m x 0.2 mm 5% methylphenyl silicone fused silica capillary column. Hydrogen is used as the carrier gas and a flame-ionization detector functions in conjunction with an automatic sampler, integrator and computer. The computer compares the sample fatty acid methyl esters to a microbial fatty acid library and against a calibration mix of known fatty acids. As selected by the contract laboratory, strains were grown for 24 hours at 28°C on trypticase soy agar prior to analysis. Extraction of samples was performed by the contract lab as per standard FAME methodology. There was no direct identification of the strains to any luminescent bacterial group other than Photorhabdus. When the cluster analysis was performed, which compares the fatty acid profiles of a group of isolates, the strain fatty acid profiles were related at the genus level.
The evolutionary diversity of the Photorhabdus strains in our collection was measured by analysis of PCR (Polymerase Chain Reaction) mediated genomic fingerprinting using genomic DNA from each strain. This technique is based on families of repetitive DNA sequences present throughout the genome of diverse bacterial species (reviewed by Versalovic, J., Schneider, M., de Bruijn, F.J. and Lupski, J.R. 1994. Methods Mol. Cell. Biol., 5, 25-40). Three of these, repetitive extragenic palindromic sequence (REP), enterobacterial repetitive intergenic consensus (ERIC) and the BOX element, are thought to play an important role in the organization of the bacterial genome. Genomic organization is believed to be shaped by selection, and the differential dispersion of these elements within the genome of closely related bacterial strains can be used to discriminate these strains (e.g. Louws, F.J., Fulbright, D.W., Stephens, C.T. and de Bruijn, F.J. 1994. Appl. Environ. Micro. 60, 2286-2295). Rep-PCR utilizes oligonucleotide primers complementary to these repetitive sequences to amplify the variably sized DNA fragments lying between them. The resulting products are separated by electrophoresis to establish the DNA "fingerprint" for each strain.
To isolate genomic DNA from our strains, cell pellets were resuspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) to a final volume of 10 ml and 12 ml of 5 M NaCl was then added. This mixture was centrifuged 20 min. at 15,000 x g. The resulting pellet was resuspended in 5.7 ml of TE, and 300 µl of 10% SDS and 60 µl of 20 mg/ml proteinase K (Gibco BRL Products, Grand Island, NY) were added. This mixture was incubated at 37°C for 1 hr, approximately 10 mg of lysozyme was then added and the mixture was incubated for an additional 45 min. One milliliter of 5 M NaCl and 800 µl of CTAB/NaCl solution (10% w/v CTAB, 0.7 M NaCl) were then added and the mixture was incubated 10 min. at 65°C, gently agitated, then incubated and agitated for an additional 20 min. to aid in clearing of the cellular material. An equal volume of chloroform/isoamyl alcohol solution (24:1, v/v) was added, mixed gently, then centrifuged. Two extractions were then performed with an equal volume of phenol/chloroform/isoamyl alcohol (50:49:1). Genomic DNA was precipitated with 0.6 volume of isopropanol.
Precipitated DNA was removed with a glass rod, washed twice with 70% ethanol, dried and dissolved in 2 ml of STE (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 1 mM EDTA). The DNA was then quantitated by optical density at 260 nm. To perform rep-PCR analysis of Photorhabdus genomic DNA the following primers were used: REP1R-I, 5'-IIIICGICGICATCIGGC-3', and REP2-I, 5'-ICGICTTATCIGGCCTAC-3'. PCR was performed using the following 25 µl reaction: 7.75 µl H2O, 2.5 µl 10X LA buffer (PanVera Corp., Madison, WI), 10 µl dNTP mix (2.5 mM each), 1 µl of each primer at 50 pmol/µl, 1 µl DMSO, 1.5 µl genomic DNA (concentrations ranged from 0.075-0.480 µg/µl) and 0.25 µl TaKaRa EX Taq (PanVera Corp., Madison, WI). The PCR amplification was performed in a Perkin Elmer DNA Thermal Cycler (Norwalk, CT) using the following conditions: 95°C/7 min., then 35 cycles of 94°C/1 min., 44°C/1 min., 65°C/8 min., followed by 15 min. at 65°C. After cycling, the 25 µl reaction was added to 5 µl of 6X gel loading buffer (0.25% bromophenol blue, 40% w/v sucrose in H2O). A 15 x 20 cm 1% agarose gel was then run in TBE buffer (0.09 M Tris-borate, 0.002 M EDTA) using 8 µl of each reaction. The gel was run for approximately 16 hours at 45 V. Gels were then stained in 20 µg/ml ethidium bromide for 1 hour and destained in TBE buffer for approximately 3 hours. Polaroid® photographs of the gels were then taken under UV illumination.
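Fingerprints obtained in this way are compared by scoring each band as present or absent and clustering the strains on band sharing. The sketch below is only an illustration of that idea: it computes Jaccard similarity and performs average-linkage (UPGMA) clustering in plain Python on invented band sizes. The strain names and band positions are hypothetical, the source performed this analysis in NTSYS-pc, and the cophenetic correlation step is not reproduced here.

```python
from itertools import combinations

# Hypothetical band sizes (bp) scored as present for each strain.
BANDS = {
    "strain_A": {300, 450, 800, 1200},
    "strain_B": {300, 450, 800, 1500},
    "strain_C": {300, 900, 2000},
    "outgroup": {500, 1100},
}


def jaccard(a, b):
    """Jaccard coefficient: shared bands divided by bands present in either."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def upgma(profiles):
    """Average-linkage clustering; returns merges as (cluster, cluster, distance)."""
    clusters = [(name,) for name in sorted(profiles)]

    def distance(c1, c2):
        # UPGMA distance: mean of all leaf-to-leaf distances between clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(1 - jaccard(profiles[x], profiles[y]) for x, y in pairs)
        return total / len(pairs)

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        merges.append((c1, c2, round(distance(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


merges = upgma(BANDS)
for left, right, dist in merges:
    print(f"join {left} + {right} at distance {dist}")
```

Strains sharing most bands join first at a small distance, while a profile with no shared bands joins last at distance 1.0, which is the behavior a phenogram summarizes.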
The presence or absence of bands at specific sizes for each strain was scored from the photographs and entered as a similarity matrix in the numerical taxonomy software program, NTSYS-pc (Exeter Software, Setauket, NY). Controls of E. coli strain HB101 and Xanthomonas oryzae pv. oryzae assayed at the same time produced PCR "fingerprints" corresponding to published reports (Versalovic, J., Koeuth, T. and Lupski, J.R. 1991. Nucleic Acids Res. 19, 6823-6831; Vera Cruz, C.M., Halda-Alija, L., Louws, F., Skinner, D.Z., George, M.L., Nelson, R.J., de Bruijn, F.J., Rice, C. and Leach, J.E. 1995. Int. Rice Res. Notes, 20, 23-24; Vera Cruz, C.M., Ardales, E.Y., Skinner, D.Z., Talag, J., Nelson, R.J., Louws, F.J., Leung, H., Mew, T.W. and Leach, J.E. 1996. Phytopathology (in press), respectively). The data from Photorhabdus strains were then analyzed with a series of programs within NTSYS-pc: SIMQUAL (Similarity for Qualitative data) to generate a matrix of similarity coefficients (using the Jaccard coefficient) and SAHN (Sequential, Agglomerative, Hierarchical and Nested) clustering [using the UPGMA (Unweighted Pair-Group Method with Arithmetic Averages) method], which groups related strains and can be expressed as a phenogram (Figure 5). The COPH (cophenetic values) and MXCOMP (matrix comparison) programs were used to generate a cophenetic value matrix and compare the correlation between this and the original matrix upon which the clustering was based. A resulting normalized Mantel statistic (r) was generated, which is a measure of the goodness of fit for a cluster analysis (r = 0.8-0.9 represents a very good fit). In our case r = 0.919. Therefore, our collection is comprised of a diverse group of easily distinguishable strains representative of the Photorhabdus genus.
Example 13 - Insecticidal Utility of Toxin(s) Produced by Various Photorhabdus Strains
Initial "seed" cultures of the various Photorhabdus strains were produced by inoculating 175 ml of 2% Proteose Peptone #3 (PP3) (Difco Laboratories, Detroit, MI) liquid media with a primary variant subclone in a 500 ml tribaffled flask with a Delong neck, covered with a Kaput. Inoculum for each seed culture was derived from oil-overlay agar slant cultures or plate cultures. After inoculation, these flasks were incubated for 16 hrs at 28°C on a rotary shaker at 150 rpm. These seed cultures were then used as uniform inoculum sources for a given fermentation of each strain.
Additionally, overlaying the post-log seed culture with sterile mineral oil, adding a sterile magnetic stir bar for future resuspension and storing the culture in the dark at room temperature provided long-term preservation of inoculum in a toxin-competent state. The production broths were inoculated by adding 1% of the actively growing seed culture to fresh 2% PP3 media (e.g. 1.75 ml per 175 ml fresh media).
Production of broths occurred in either 500 ml tribaffled flasks (see above), or 2800 ml baffled, convex bottom flasks (500 ml volume) covered by a silicon foam closure. Production flasks were incubated for 24-48 hrs under the above mentioned conditions. Following incubation, the broths were dispensed into sterile 1 L polyethylene bottles, spun at 2600 x g for 1 hr at 10°C and decanted from the cell and debris pellet. The liquid broth was then vacuum filtered through Whatman GF/D (2.7 µm retention) and GF/B (1.0 µm retention) glass filters to remove debris. Further broth clarification was achieved with a tangential flow microfiltration device (Pall Filtron, Northborough, MA) using a 0.5 µm open-channel filter. When necessary, additional clarification could be obtained by chilling the broth (to 4°C) and centrifuging for several hours at 2600 x g. Following these procedures, the broth was filter sterilized using a 0.2 µm nitrocellulose membrane filter. Sterile broths were then used directly for biological assay or biochemical analysis, or were concentrated (up to 15-fold) using a 10,000 MW cutoff, M12 ultrafiltration device (Amicon, Beverly, MA) or centrifugal concentrators (Millipore, Bedford, MA and Pall Filtron, Northborough, MA) with a 10,000 MW pore size. In the case of centrifugal concentrators, the broth was spun at 2000 x g for approximately 2 hr. The 10,000 MW permeate was added to the corresponding retentate to achieve the desired concentration of components greater than 10,000 MW. Heat inactivation of processed broth samples was achieved by heating the samples at 100°C in a sand-filled heat block for 10 minutes.
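The concentration step described above reduces to volume bookkeeping. The following is a minimal sketch, assuming that every component larger than the membrane cutoff stays in the retentate; the function names and the example volumes are illustrative assumptions and do not come from the protocol. It computes the fold concentration reached by a retentate and the volume of permeate to add back when the retentate has overshot the target.

```python
def fold_concentration(start_volume_ml, retentate_volume_ml):
    """Fold concentration of components retained by the membrane.

    Assumes full retention above the cutoff, so concentration scales
    inversely with volume.
    """
    if retentate_volume_ml <= 0:
        raise ValueError("retentate volume must be positive")
    return start_volume_ml / retentate_volume_ml


def permeate_to_add_back(start_volume_ml, retentate_volume_ml, target_fold):
    """Volume of permeate (ml) to return to the retentate to reach target_fold."""
    target_volume_ml = start_volume_ml / target_fold
    if retentate_volume_ml > target_volume_ml:
        raise ValueError("retentate has not yet reached the target concentration")
    return target_volume_ml - retentate_volume_ml


# Hypothetical run: 150 ml of sterile broth spun down to 8 ml of retentate,
# then adjusted back to a 15-fold concentrate.
print(fold_concentration(150.0, 8.0))          # 18.75
print(permeate_to_add_back(150.0, 8.0, 15.0))  # 2.0
```

Adding permeate rather than fresh buffer keeps the small (< 10,000 MW) components at their original concentration while only the retained components are concentrated.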
The broth(s) and toxin complex(es) from different Photorhabdus strains are useful for reducing populations of insects and were used in a method of inhibiting an insect population which comprises applying to a locus of the insect an effective insect inactivating amount of the active described. A demonstration of the breadth of insecticidal activity observed from broths of a selected group of Photorhabdus strains fermented as described above is shown in Table 19. It is possible that additional insecticidal activities could be detected with these strains through increased concentration of the broth or by employing different fermentation methods. Consistent with the activity being associated with a protein, the insecticidal activity of all strains tested was heat labile (see above).
Culture broth(s) from diverse Photorhabdus strains show differential insecticidal activity (mortality and/or growth inhibition, reduced adult emergence) against a number of insects. More specifically, the activity is seen against corn rootworm larvae and boll weevil larvae, which are members of the insect order Coleoptera. Other members of the Coleoptera include wireworms, pollen beetles, flea beetles, seed beetles and Colorado potato beetle. Activity is also observed against aster leafhopper and corn planthopper, which are members of the order Homoptera. Other members of the Homoptera include planthoppers, pear psylla, apple sucker, scale insects, whiteflies and spittle bugs, as well as numerous host-specific aphid species. The broths and purified toxin complex(es) are also active against tobacco budworm, tobacco hornworm and European corn borer, which are members of the order Lepidoptera. Other typical members of this order are beet armyworm, cabbage looper, black cutworm, corn earworm, codling moth, clothes moth, Indian mealmoth, leaf rollers, cabbage worm, cotton bollworm, bagworm, Eastern tent caterpillar, sod webworm and fall armyworm. Activity is also seen against fruit fly and mosquito larvae, which are members of the order Diptera. Other members of the order Diptera are pea midge, carrot fly, cabbage root fly, turnip root fly, onion fly, crane fly, house fly and various mosquito species. Activity with broth(s) and toxin complex(es) is also seen against two-spotted spider mite, which is a member of the order Acarina, which includes strawberry spider mites, broad mites, citrus red mite, European red mite, pear rust mite and tomato russet mite.
Activity against corn rootworm larvae was tested as follows. Photorhabdus culture broth(s) (0-15 fold concentrated, filter sterilized), 2% Proteose Peptone #3, purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (about 1.5 cm²) of artificial diet (Rose, R. I. and McCabe, J. M. (1973). J. Econ. Entomol. 66, 398-400) in 40 µl aliquots. Toxin complex was diluted in 10 mM sodium phosphate buffer, pH 7.0. The diet plates were allowed to air-dry in a sterile flow-hood and the wells were infested with single, neonate Diabrotica undecimpunctata howardi (Southern corn rootworm, SCR) hatched from surface sterilized eggs. The plates were sealed, placed in a humidified growth chamber and maintained at 27°C for the appropriate period (3-5 days). Mortality and larval weight determinations were then scored. Generally, 16 insects per treatment were used in all studies. Control mortality was generally less than 5%.
Activity against boll weevil (Anthonomus grandis) was tested as follows. Concentrated (1-10 fold) Photorhabdus broths, control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied in 60 µl aliquots to the surface of 0.35 g of artificial diet (Stoneville Yellow lepidopteran diet) and allowed to dry. A single, 12-24 hr boll weevil larva was placed on the diet, and the wells were sealed and held at 25°C, 50% RH for 5 days. Mortality and larval weights were then assessed. Control mortality ranged between 0-13%.
Activity against mosquito larvae was tested as follows. The assay was conducted in a 96-well microtiter plate. Each well contained 200 µl of aqueous solution (10-fold concentrated Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), 10 mM sodium phosphate buffer, toxin complex(es) [0.23 mg/ml] or H2O) and approximately 20 1-day-old larvae (Aedes aegypti). There were 6 wells per treatment. The results were read at 3-4 days after infestation. Control mortality was between 0-20%.
Activity against fruit flies was tested as follows. Purchased Drosophila melanogaster medium was prepared using 50% dry medium and 50% liquid, the liquid being either water, control medium (2% Proteose Peptone #3), 10-fold concentrated Photorhabdus culture broth(s), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0. This was accomplished by placing 4.0 ml of dry medium in each of 3 rearing vials per treatment and adding 4.0 ml of the appropriate liquid. Ten late instar Drosophila melanogaster maggots were then added to each 25 ml vial. The vials were held on a laboratory bench, at room temperature, under fluorescent ceiling lights. Pupal or adult counts were made after 15 days of exposure. Adult emergence was compared with that in the water and control medium treatments (0-16% reduction).
Activity against aster leafhopper adults (Macrosteles severini) and corn planthopper nymphs (Peregrinus maidis) was tested with an ingestion assay designed to allow ingestion of the active without other external contact. The reservoir for the active/"food" solution is made by making 2 holes in the center of the bottom portion of a 35 x 10 mm Petri dish. A 2 inch Parafilm M® square is placed across the top of the dish and secured with an "O" ring. A 1 oz. plastic cup is then infested with approximately 7 hoppers and the reservoir is placed on top of the cup, Parafilm down. The test solution is then added to the reservoir through the holes. In tests using 10-fold concentrated Photorhabdus culture broth(s), the broth and control medium (2% Proteose Peptone #3) were dialyzed against 10 mM sodium phosphate buffer, pH 7.0 and sucrose (to 5%) was added to the resulting solution to reduce control mortality. Purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 was also tested. Mortality is reported at day 3. The assay was held in an incubator at 28°C, 70% RH with a 16/8 photoperiod. The assays were graded for mortality at 72 hours. Control mortality was less than 6%.
Activity against lepidopteran larvae was tested as follows. Concentrated (10-fold) Photorhabdus culture broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0 were applied directly to the surface (~1.5 cm²) of standard artificial lepidopteran diet (Stoneville Yellow diet) in 40 µl aliquots. The diet plates were allowed to air-dry in a sterile flow-hood and each well was infested with a single, neonate larva. European corn borer (Ostrinia nubilalis) and tobacco hornworm (Manduca sexta) eggs were obtained from commercial sources and hatched in-house, whereas tobacco budworm (Heliothis virescens) larvae were supplied internally.
Following infestation with larvae, the diet plates were sealed, placed in a humidified growth chamber and maintained in the dark at 27°C for the appropriate period.
Mortality and weight determinations were scored at day 5.
Generally, 16 insects per treatment were used in all studies. Control mortality generally ranged from 4-12.5% for control medium and was less than 10% for phosphate buffer.
Activity against two-spotted spider mite (Tetranychus urticae) was determined as follows. Young squash plants were trimmed to a single cotyledon and sprayed to run-off with 10-fold concentrated broth(s), control medium (2% Proteose Peptone #3), purified toxin complex(es) [0.23 mg/ml] or 10 mM sodium phosphate buffer, pH 7.0. After drying, the plants were infested with a mixed population of spider mites and held at lab temperature and humidity for 72 hr. Live mites were then counted to determine levels of control.
Table 19
Observed Insecticidal Spectrum of Broths From Different Photorhabdus Strains
Photorhabdus Strain    Sensitive* Insect Species
WX-1                   3**, 4, 5, 6, [two entries illegible]
WX-2                   2, 4
WX-3                   1, 4
WX-4                   1, 4
WX-5                   4
WX-6                   4
WX-7                   3, 4, 5, 6, 7, 8
WX-8                   1, 2, 4
WX-9                   1, [illegible], 4
WX-10                  4
WX-11                  1, 2, 4
WX-12                  2, 4, 5, 6, 7, 8
WX-14                  1, 2, 4
WX-15                  1, 2, 4
W30                    3, 4, 5, 8
NC-1                   1, 2, 3, 4, 5, 6, 7, 8, 9
WIR                    2, 3, 5, 6, 7, 8
HP88                   1, 3, 4, 5, 7, 8
Hb                     3, 4, 5, 7, 8
Hm                     1, 2, 3, 4, 5, 7, 8
H9                     1, 2, 3, 4, 5, 6, 7, 8
W-14                   1, 2, 3, 4, 5, 6, 7, 8, 10
ATCC 43948             4
ATCC 43949             4
ATCC 43950             4
ATCC 43951             4
ATCC 43952             4
* = ≥ 25% mortality and/or growth inhibition vs. control
** = 1, Tobacco budworm; 2, European corn borer; 3, Tobacco hornworm; 4, Southern corn rootworm; 5, Boll weevil; 6, Mosquito; 7, Fruit fly; 8, Aster leafhopper; 9, Corn planthopper; 10, Two-spotted spider mite.
Example 14 Non W-14 Photorhabdus Strains: Purification, Characterization and Activity Spectrum
Purification
The protocol, as follows, is similar to that developed for the purification of W-14 and was established based on purifying those fractions having the most activity against Southern corn rootworm (SCR), as determined in bioassays (see Example 13). Typically, 4-20 L of broth that had been filtered, as described in Example 13, were received and concentrated using an Amicon spiral ultrafiltration cartridge Type S1Y100 attached to an Amicon M-12 filtration device.
The retentate contained native proteins with molecular sizes greater than 100 kDa, whereas the flow-through material contained native proteins less than 100 kDa in size. The majority of the activity against SCR was contained in the 100 kDa retentate. The retentate was then continually diafiltered with 10 mM sodium phosphate (pH 7.0) until the filtrate reached an A280 < 0.100. Unless otherwise stated, all procedures from this point were performed in buffer, defined as 10 mM sodium phosphate (pH 7.0). The retentate was then concentrated to a final volume of approximately 0.20 L and filtered using a 0.45 µm Nalgene™ Filterware sterile filtration unit. The filtered material was loaded at 7.5 ml/min onto a Pharmacia HR16/10 column which had been packed with PerSeptive Biosystem Poros® 50 HQ strong anion exchange matrix equilibrated in buffer using a PerSeptive Biosystem Sprint® HPLC system.
After loading, the column was washed with buffer until an A280 < 0.100 was achieved. Proteins were then eluted from the column at 2.5 ml/min using buffer with 0.4 M NaCl for 20 min for a total volume of 50 ml. The column was then washed using buffer with 1.0 M NaCl at the same flow rate for an additional 20 min (final volume = 50 ml). Proteins eluted with 0.4 M and 1.0 M NaCl were placed in separate dialysis bags (Spectra/Por® Membrane MWCO: 2,000) and allowed to dialyze overnight at 4°C in 12 L buffer. The majority of the activity against SCR was contained in the 0.4 M fraction. The 0.4 M fraction was further purified by application of 20 ml to a Pharmacia XK 26/100 column that had been prepacked with Sepharose CL4B (Pharmacia) using a flow rate of 0.75 ml/min. Fractions were pooled based on A280 peak profile and concentrated to a final volume of 0.75 ml using a Millipore Ultrafree®-15 centrifugal filter device with a Biomax-50K NMWL membrane. Protein concentrations were determined using a Biorad Protein Assay Kit with bovine gamma globulin as a standard.
Characterization
The native molecular weight of the SCR toxin complex was determined using a Pharmacia HR 16/50 column that had been prepacked with Sepharose CL4B in buffer. The column was then calibrated using proteins of known molecular size, thereby allowing for calculation of the toxin's approximate native molecular size. As shown in Table 20, the molecular size of the toxin complex ranged from 777 kDa with strain Hb to 1,900 kDa with strain WX-14. The yield of toxin complex also varied, from strain WX-12 producing 0.8 mg/L to strain Hb, which produced 7.0 mg/L.
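The column calibration described above can be sketched as a straight-line fit of log10(molecular size) against elution volume, which is a common simplification for gel filtration. The source does not state which standards or which fitting method were used, so the function names, the standards and the elution volumes below are all hypothetical and serve only to illustrate the calculation.

```python
import math


def fit_calibration(standards):
    """Least-squares line log10(size_kDa) = slope * elution_ml + intercept.

    `standards` is a list of (elution_volume_ml, size_kDa) pairs.
    """
    xs = [volume for volume, _ in standards]
    ys = [math.log10(size) for _, size in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size_kda(elution_volume_ml, slope, intercept):
    """Interpolate the native size of an unknown from its elution volume."""
    return 10 ** (slope * elution_volume_ml + intercept)


# Hypothetical calibration standards: (elution volume in ml, size in kDa).
standards = [(45.0, 2000.0), (60.0, 669.0), (72.0, 232.0), (78.0, 158.0)]
slope, intercept = fit_calibration(standards)

# Hypothetical elution volume for a toxin complex peak.
print(round(estimate_size_kda(55.0, slope, intercept)))
```

Estimates are only meaningful for peaks that elute within the range spanned by the standards.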
Proteins found in the toxin complex were examined for individual polypeptide size using SDS-PAGE analysis. Typically, 20 µg protein of the toxin complex from each strain was loaded onto a 2-15% polyacrylamide gel (Integrated Separation Systems) and electrophoresed at 20 mA in Biorad SDS-PAGE buffer. After completion of electrophoresis, the gels were stained overnight in Biorad Coomassie blue R-250 (0.2% in methanol:acetic acid:water; 40:10:40 v/v/v). Subsequently, gels were destained in methanol:acetic acid:water; 40:10:40 (v/v/v). The gels were then rinsed with water for 15 min and scanned using a Molecular Dynamics Personal Laser Densitometer®. Lanes were quantitated and molecular sizes were calculated by comparison with Biorad high molecular weight standards, which ranged from 200-45 kDa.
Sizes of the individual polypeptides comprising the SCR toxin complex from each strain are listed in Table 21. The sizes of the individual polypeptides ranged from 230 kDa with strain WX-1 to 16 kDa with strain WX-7. Every strain, with the exception of strain Hb, had polypeptides comprising the toxin complex that were in the 160-230 kDa range, the 100-160 kDa range, and the 50-80 kDa range. These data indicate that the toxin complex may vary in peptide composition and components from strain to strain; however, in all cases the toxin appears to consist of a large, oligomeric protein complex.
Table 20
Characterization of a Toxin Complex From Non W-14 Photorhabdus Strains
Activity Spectrum
As shown in Table 22, the toxin complexes purified from strains Hm and H9 were tested for activity against a variety of insects, with the toxin complex from strain W-14 for comparison. The assays were performed as described in Example 13. The toxin complex from all three strains exhibited activity against tobacco budworm, European corn borer, Southern corn rootworm, and aster leafhopper. Furthermore, the toxin complex from strains Hm and W-14 also exhibited activity against two-spotted spider mite. In addition, the toxin complex from W-14 exhibited activity against mosquito larvae. These data indicate that the toxin complex, while having similarities in activities between certain orders of insects, can also exhibit differential activities against other orders of insects.
Table 21
The Approximate Sizes (in kDa) of Peptides in a Purified Toxin Complex From Non W-14 Photorhabdus
H9 Hb Hm HP88 NC-1 WIR WX-1 WX-2 WX-7 WX-12 WX-14 W-14
180 150 170 170 180 170 230 200 200 180 210 190 170 140 140 160 170 160 190 170 180 160 180 180 160 139 100 140 140 120 170 150 110 140 160 170 140 130 81 130 110 110 160 120 87 139 120 160 120 120 72 129 44 89 110 110 75 130 110 150 98 100 68 110 16 79 98 82 43 110 100 130 87 98 49 100 74 76 64 33 92 95 120 84 88 46 86 62 58 37 28 87 80 11 -J 79 81 30 81 51 53 30 26 80 69 93 72 75 22 77 40 41 23 73 49 90 68 69 20 73 39 35 22 59 41 77 60 60 19 60 37 31 21 56 33 6? 57 57 58 33 28 19 51 65 52 54 45 30 24 18 37 63 46 49 39 28 22 16 33 60 40 44 35 27 32 51 37 39 25 26 4<-> 37 23 40 35 3? 2 °
Table 22
Observed Insecticidal Spectrum of a Purified Toxin Complex From Photorhabdus Strains
Photorhabdus Strain    Sensitive* Insect Species
Hm Toxin Complex       1**, 2, 3, 5, 6, 7
H9 Toxin Complex       1, 2, 3, 6, 7, 8
W-14 Toxin Complex     1, 2, 3, 4, 5, 6, 7
* = ≥ 25% mortality or growth inhibition
** = 1, Tobacco budworm; 2, European corn borer; 3, Southern corn rootworm; 4, Mosquito; 5, Two-spotted spider mite; 6, Aster leafhopper; 7, Fruit fly; 8, Boll weevil
Example 15 Sub-Fractionation of Photorhabdus Protein Toxin Complex
The Photorhabdus protein toxin complex was isolated as described in Example 14. Next, about 10 mg toxin was applied to a MonoQ 5/5 column equilibrated with 20 mM Tris-HCl, pH 7.0 at a flow rate of 1 ml/min. The column was washed with 20 mM Tris-HCl, pH 7.0 until the optical density at 280 nm returned to baseline absorbance. The proteins bound to the column were eluted with a linear gradient of 0 to 1.0 M NaCl in 20 mM Tris-HCl, pH 7.0 at 1 ml/min for 30 min. One ml fractions were collected and subjected to Southern corn rootworm (SCR) bioassay (see Example 13). Peaks of activity were determined by a series of dilutions of each fraction in SCR bioassays. Two activity peaks against SCR were observed and were named A (eluted at about 0.2-0.3 M NaCl) and B (eluted at 0.3-0.4 M NaCl). Activity peaks A and B were pooled separately and both peaks were further purified using a 3-step procedure described below.
Solid (NH4)2SO4 was added to the above protein fraction to a final concentration of 1.7 M. Proteins were then applied to a phenyl-Superose 5/5 column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 0% ethylene glycol, 50 mM potassium phosphate, pH 7.0 to 25% ethylene glycol, 25 mM potassium phosphate, pH 7.0 (no (NH4)2SO4) at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The fractions with the highest activity were pooled and applied to a MonoQ 5/5 column which was equilibrated with 20 mM Tris-HCl, pH 7.0 at 1 ml/min. The proteins bound to the column were eluted at 1 ml/min by a linear gradient of 0 to 1 M NaCl in 20 mM Tris-HCl, pH 7.0.
For the final step of purification, the most active fractions above (determined by SCR bioassay) were pooled and subjected to a second phenyl-Superose 5/5 column. Solid (NH4)2SO4 was added to a final concentration of 1.7 M. The solution was then loaded onto the column equilibrated with 1.7 M (NH4)2SO4 in 50 mM potassium phosphate buffer, pH 7 at 1 ml/min. Proteins bound to the column were eluted with a linear gradient of 1.7 M (NH4)2SO4, 50 mM potassium phosphate, pH 7.0 to 10 mM potassium phosphate, pH 7.0 at 0.5 ml/min. Fractions were dialyzed overnight against 10 mM sodium phosphate buffer, pH 7.0. Activities in each fraction against SCR were determined by bioassay.
The final protein purified from peak A by the above 3-step procedure was named toxin A, and the final protein purified from peak B was named toxin B.
Characterization and Amino Acid Sequencing of Toxin A and Toxin B
In SDS-PAGE, both toxin A and toxin B contained two major (> 90% of total Coomassie stained protein) peptides: 192 kDa (named A1 and B1, respectively) and 58 kDa (named A2 and B2, respectively). Both toxin A and toxin B revealed only one major band in native PAGE, indicating that A1 and A2 were subunits of one protein complex, and that B1 and B2 were subunits of one protein complex. Further, the native molecular weight of both toxin A and toxin B was determined to be 860 kDa by gel filtration chromatography. The relative molar concentrations of A1 and A2 were judged to be at a 1 to 1 equivalence, as determined by densitometric analysis of SDS-PAGE gels. Similarly, B1 and B2 peptides were present at the same molar concentration.
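The 1 to 1 molar equivalence above follows from dividing each band's stained mass signal by its subunit size. The sketch below assumes that Coomassie signal is proportional to protein mass, which is an approximation; the function names and the band intensities are hypothetical, and only the 192 kDa and 58 kDa subunit sizes come from the text.

```python
def molar_ratio(signal_a, size_a_kda, signal_b, size_b_kda):
    """Molar ratio a:b from densitometric mass signals and subunit sizes."""
    return (signal_a / size_a_kda) / (signal_b / size_b_kda)


def equimolar_mass_fraction(size_a_kda, size_b_kda):
    """Fraction of total stained mass expected in band a for a 1:1 complex."""
    return size_a_kda / (size_a_kda + size_b_kda)


# Subunit sizes reported for A1 and A2 (kDa).
A1_KDA, A2_KDA = 192, 58

# In a 1:1 complex the larger subunit carries most of the stained mass.
print(round(equimolar_mass_fraction(A1_KDA, A2_KDA), 3))  # 0.768

# Hypothetical integrated band intensities (arbitrary units).
print(round(molar_ratio(76.8, A1_KDA, 23.2, A2_KDA), 2))  # 1.0
```

A band pair whose intensities split roughly 77:23 by mass is therefore what equimolar 192 kDa and 58 kDa subunits would be expected to give under this assumption.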
Toxin A and toxin B were electrophoresed in 10% SDS-PAGE and transblotted to PVDF membranes. Blots were sent for amino acid analysis and N-terminal amino acid sequencing at Harvard MicroChem and Cambridge ProChem, respectively. The N-terminal amino acid sequence of B1 was determined to be identical to SEQ ID NO:1, the TcbAii region of the tcbA gene (SEQ ID NO:12, position 37 to 99). A unique N-terminal sequence was obtained for peptide B2 (SEQ ID NO:40). The N-terminal amino acid sequence of peptide B2 was identical to the TcbAiii region of the derived amino acid sequence for the tcbA gene (SEQ ID NO:12, position 1935 to 1945). Therefore, the B toxin contained predominantly two peptides, TcbAii and TcbAiii, that were observed to be derived from the same gene product, TcbA.
The N-terminal sequence of A2 (SEQ ID NO:41) was unique in comparison to the TcbAiii peptide and other peptides. The A2 peptide was denoted TcdAiii (see Example 17). SEQ ID NO:6 was determined to be a mixture of amino acid sequences SEQ ID NO:40 and 41.
Peptides A1 and A2 were further subjected to internal amino acid sequencing. For internal amino acid sequencing, 10 µg of toxin A was electrophoresed in 10% SDS-PAGE and transblotted to PVDF membrane. After the blot was stained with amido black, peptides A1 and A2, denoted TcdAii and TcdAiii, respectively, were excised from the blot and sent to Harvard MicroChem and Cambridge ProChem. Peptides were subjected to trypsin digestion followed by HPLC chromatography to separate individual peptides. N-terminal amino acid analysis was performed on selected tryptic peptide fragments. Two internal amino acid sequences of peptide A1 (TcdAii-PK71, SEQ ID NO:38 and TcdAii-PK44, SEQ ID NO:39) were found to have significant homologies with deduced amino acid sequences of the TcbAii region of the tcbA gene (SEQ ID NO:12).
Similarly, the N-terminal sequence (SEQ ID NO:41) and two internal sequences of peptide A2 (TcdAiii-PK57, SEQ ID NO:42 and TcdAiii-PK20, SEQ ID NO:43) also showed significant homology with deduced amino acid sequences of the TcbAiii region of the tcbA gene (SEQ ID NO:12).
In summary of the above results, the toxin complex has at least two active protein toxin complexes against SCR: toxin A and toxin B. Toxin A and toxin B are similar in their native and subunit molecular weights; however, their peptide compositions are different. Toxin A contains peptides TcdAii and TcdAiii as the major peptides, and toxin B contains TcbAii and TcbAiii as the major peptides.
Example 16 Cleavage and Activation of TcbA Peptide
In the toxin B complex, peptides TcbAii and TcbAiii originate from the single gene product TcbA (Example 15). The processing of TcbA peptide to TcbAii and TcbAiii is presumably by the action of Photorhabdus protease(s), most likely the metalloproteases described in Example 10. In some cases, it was noted that when Photorhabdus W-14 broth was processed, TcbA peptide was present in toxin B complex as a major component, in addition to peptides TcbAii and TcbAiii. Identical procedures, described for the purification of toxin B complex (Example 15), were used to enrich peptide TcbA from the toxin complex fraction of W-14 broth. The final purified material was analyzed in a 4-20% gradient SDS-PAGE and major peptides were quantified by densitometry. It was determined that TcbA, TcbAii and TcbAiii comprised 58%, 36%, and 6%, respectively, of total protein. The identities of these peptides were confirmed by their respective molecular sizes in SDS-PAGE and by Western blot analysis using monospecific antibodies. The native molecular weight of this fraction was determined to be 860 kDa.
The cleavage of TcbA was evaluated by treating the above purified material with purified 38 kDa and 58 kDa W-14 Photorhabdus metalloproteases (Example 10), and with trypsin as a control enzyme (Sigma, MO). The standard reaction consisted of 17.5 µg of the above purified fraction, 1.5 units of protease, and 0.1 M Tris buffer, pH 8.0 in a total volume of 100 µl. For the control reaction, protease was omitted. The reaction mixtures were incubated at 37°C for 90 min. At the end of the reaction, 20 µl was taken and boiled immediately with SDS-PAGE sample buffer for electrophoresis analysis in a 4-20% gradient SDS-PAGE. It was determined from SDS-PAGE that in both the 38 kDa and 58 kDa protease treatments, the amount of peptides TcbAii and TcbAiii increased about 3-fold while the amount of TcbA peptide decreased proportionally (Table 23). The relative reduction and augmentation of selected peptides was confirmed by Western blot analyses. Furthermore, gel filtration of the cleaved material revealed that the native molecular size of the complex remained the same. Upon trypsin treatment, peptides TcbA and TcbAii were nonspecifically digested into small peptides. This indicated that the 38 kDa and 58 kDa Photorhabdus proteases can specifically process peptide TcbA into peptides TcbAii and TcbAiii. The remaining 80 µl of the protease-treated and untreated control reaction mixtures were serially diluted with 10 mM sodium phosphate buffer, pH 7.0 and analyzed by SCR bioassay. By comparing activity in several dilutions, it was determined that the 38 kDa protease treatment increased SCR insecticidal activity approximately 3 to 4 fold. The growth inhibition of remaining insects in the protease treatment was also more severe than in the control (Table 23).
Table 23
Conversion and activation of peptide TcbA into peptides TcbAii and TcbAiii by protease treatment.
                               Control    38 kDa protease treatment
TcbA (% of total protein)      58         18
TcbAii (% of total protein)    36         64
TcbAiii (% of total protein)   6          18
LD50 (µg protein)              2.1        0.52
SCR Weight (mg/insect)*        0.2        0.1
*: an indication of growth inhibition, measured as the average weight of live insects after 5 days on diet in the assay.
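The changes summarized in Table 23 can be restated as simple ratios. In the short sketch below, the peptide percentages and LD50 values are those of Table 23, with the control column matching the 58%, 36% and 6% reported for TcbA, TcbAii and TcbAiii in the text; the dictionary layout and the function name are illustrative assumptions.

```python
# Values from Table 23 (percent of total protein; LD50 in micrograms of protein).
control = {"TcbA": 58, "TcbAii": 36, "TcbAiii": 6, "LD50_ug": 2.1}
treated = {"TcbA": 18, "TcbAii": 64, "TcbAiii": 18, "LD50_ug": 0.52}


def fold_change(before, after):
    """Ratio of the treated value to the control value."""
    return after / before


for peptide in ("TcbA", "TcbAii", "TcbAiii"):
    print(peptide, round(fold_change(control[peptide], treated[peptide]), 2))
# TcbA 0.31
# TcbAii 1.78
# TcbAiii 3.0

# A lower LD50 means higher potency, so potency gain is control / treated.
print(round(control["LD50_ug"] / treated["LD50_ug"], 2))  # 4.04
```

The LD50 ratio of about 4 is in line with the approximately 3 to 4 fold increase in SCR activity stated for the 38 kDa protease treatment.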
Example 17 Screening of the library for a gene encoding the TcdAii Peptide
The cloning and characterization of a gene encoding the TcdAii peptide, described as SEQ ID NO:17 (internal peptide TcdAii-PT111 N-terminal sequence) and SEQ ID NO:18 (internal peptide TcdAii-PT79 N-terminal sequence), was completed. Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences of SEQ ID NO:17 (Table 24) and SEQ ID NO:18 (Table 25), and the reverse complements of those sequences, were synthesized as described in Example 8. The DNA sequence of the oligonucleotides is given below:
Table 24 Degenerate Oligonucleotide for SEQ ID NO:
Table 25 Degenerate Oligonucleotide for SEQ ID NO:
According to IUPAC-IUB codes for nucleotides, Y = C or T, N = A, C, G or T, K = G or T, R = A or G, and M = A or C.
Polymerase Chain Reactions (PCR) were performed essentially as described in Example 3, using as forward primers F2.3.5.CB or P2.3.5, and as reverse primers P2.79.R.1 or P2.79R.CB, in all forward/reverse combinations, using Photorhabdus W-14 genomic DNA as template. In another set of reactions, primers P2.79.2 or P2.79.3 were used as forward primers, and P2.3.5R, P2.3.5RI, and P2.3R.CB were used as reverse primers in all forward/reverse combinations. Only in the reactions containing P2.3.6.CB as the forward primer combined with P2.79.R.1 or P2.79R.CB as the reverse primers was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) of 2500 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PT111 lies amino-proximal to the peptide fragment TcdAii-PT79.
The 2500 bp PCR products were ligated to the plasmid vector pCR™II (Invitrogen, San Diego, CA) according to the supplier's instructions, and the DNA sequences across the ends of the insert fragments of two isolates (HS24 and HS27) were determined using the supplier's recommended primers and the sequencing methods described previously. The sequence of both isolates was the same. New primers were synthesized based on the determined sequence, and used to prime additional sequencing reactions to obtain a total of 2557 bases of the insert [SEQ ID NO:36].
Translation of the partial peptide encoded by SEQ ID NO:36 yields the 845 amino acid sequence disclosed as SEQ ID NO:37.
Protein homology analysis of this portion of the TcdAii peptide fragment reveals substantial amino acid homology (68% similarity; 53% identity) to residues 542 to 1390 of protein TcbA [SEQ ID NO: 12]. It is therefore apparent that the gene represented in part by SEQ ID NO: 36 produces a protein of similar, but not identical, amino acid sequence as the TcbA protein, and which likely has similar, but not identical biological activity as the TcbA protein.
In yet another instance, a gene encoding the peptides TcdAii-PK44 and the TcdAiii 58 kDa N-terminal peptide, described as SEQ ID NO:39 (internal peptide TcdAii-PK44 sequence) and SEQ ID NO:41 (TcdAiii 58 kDa N-terminal peptide sequence), was isolated. Two pools of degenerate oligonucleotides, designed to encode the amino acid sequences described as SEQ ID NO:39 (Table 27) and SEQ ID NO:41 (Table 26), and the reverse complements of those sequences, were synthesized as described in Example 3, and their DNA sequences are given below.
Table 26 Degenerate Oligonucleotide for SEQ ID NO:
Table 27 Degenerate Oligonucleotide for SEQ ID NO:
Amino Acid   (8)  (9)  (10) (11) (12) (13) (14) (15)
Codon #      1    2    3    4    5    6    7    8
Amino Acid   Gly  Pro  Val  Glu  Ile  Asn  Thr  Ala
A1.44.1   5' GGY CCR GTK GAA ATT AAT ACC GCI
A1.44.1R  5' ATI GCG GTA TTA ATT TCM ACY GGR
A1.44.2   5' GGI CCI GTI GAR ATY AAY ACI GCI
A1.44.2R  5' ATI GCI GTR TTR ATY TCI ACI GGI
Polymerase Chain Reactions (PCR) were performed essentially as described in Example 9, using as forward primers A1.44.1 or A1.44.2, and reverse primers A2.3R or A2.4R, in all forward/reverse combinations, using Photorhabdus W-14 genomic DNA as template. In another set of reactions, primers A2.1 or A2.2 were used as forward primers, and A1.44.1R and A1.44.2R were used as reverse primers in all forward/reverse combinations.
Only in the reactions containing A1.44.1 or A1.44.2 as the forward primers combined with A2.3R as the reverse primer was a non-artifactual amplified product seen, of estimated size (mobility on agarose gels) of 1400 base pairs. The order of the primers used to obtain this amplification product indicates that the peptide fragment TcdAii-PK44 lies amino-proximal to the 58 kDa peptide fragment of TcdAiii.
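The degenerate primer pools and the "all forward/reverse combinations" design used above can be illustrated in a few lines. The sketch below expands primer A1.44.1 from Table 27 using the IUPAC-IUB ambiguity codes and enumerates the first set of primer pairings. Treating I (inosine) as a single fixed position is an assumption made here for counting purposes, and the function names are illustrative.

```python
from itertools import product

# IUPAC-IUB ambiguity codes used by the primers in this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences in a degenerate primer pool."""
    count = 1
    for base in primer:
        if base != "I":  # inosine is one synthesized base, not a mixture
            count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence in the pool (inosine kept as 'I')."""
    choices = ["I" if base == "I" else IUPAC[base] for base in primer]
    return ["".join(bases) for bases in product(*choices)]


# Primer A1.44.1 from Table 27, written without the codon spacing.
a1_44_1 = "GGYCCRGTKGAAATTAATACCGCI"
print(degeneracy(a1_44_1))   # 8
print(len(expand(a1_44_1)))  # 8

# First set of reactions: every forward primer paired with every reverse primer.
pairs = list(product(["A1.44.1", "A1.44.2"], ["A2.3R", "A2.4R"]))
print(len(pairs))  # 4
```

Substituting inosine at fully ambiguous positions, as in A1.44.2, keeps the pool small at the cost of weaker base pairing at those positions.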
The 1400 bp PCR products were ligated to the plasmid vector pCR™II according to the supplier's instructions. The DNA sequences across the ends of the insert fragments of four isolates were determined using primers similar in sequence to the supplier's recommended primers and using sequencing methods described previously. The nucleic acid sequence of all isolates differed as expected in the regions corresponding to the degenerate primer sequences, but the amino acid sequences deduced from these data were the same as the actual amino acid sequences for the peptides determined previously (SEQ ID NOS:41 and 39).
Screening of the W-14 genomic cosmid library as described in Example 8 with a radiolabeled probe comprised of the DNA prepared above (SEQ ID NO:36) identified five hybridizing cosmid isolates, namely 17D9, 20B10, 21D2, 27B10, and 26D1. These cosmids were distinct from those previously identified with probes corresponding to the genes described as SEQ ID NO:11 or SEQ ID NO:25. Restriction enzyme analysis and DNA blot hybridizations identified three EcoR I fragments, of approximate sizes 3.7, 3.7, and 1.1 kbp, that span the region comprising the DNA of SEQ ID NO:36. Screening of the W-14 genomic cosmid library using as probe the radiolabeled 1.4 kbp DNA fragment prepared in this example identified the same five cosmids (17D9, 20B10, 21D2, 27B10, and 26D1). DNA blot hybridization to EcoR I-digested cosmid DNAs also showed hybridization to the same subset of EcoR I fragments as seen with the 2.5 kbp TcdAii gene probe, indicating that both fragments are encoded on the genomic DNA.
DNA sequence determination of the cloned EcoR I fragments revealed an uninterrupted reading frame of 7551 base pairs (SEQ ID NO:46), encoding a 282.9 kDa protein of 2516 amino acids (SEQ ID NO:47). Analysis of the amino acid sequence of this protein revealed all expected internal fragments of peptides TcdAii (SEQ ID NOS:17, 13, 37, 38 and 39) and the TcdAiii peptide N-terminus (SEQ ID NO:41) and all TcdAiii internal peptides (SEQ ID NOS:42 and 43). The peptides isolated and identified as TcdAii and TcdAiii are each products of the open reading frame, denoted tcdA, disclosed as SEQ ID NO:46. Further, SEQ ID NO:47 shows, starting at position 89, the sequence disclosed as SEQ ID NO:13, which is the N-terminal sequence of a peptide of size approximately 201 kDa, indicating that the initial protein produced from SEQ ID NO:46 is processed in a manner similar to that previously disclosed for SEQ ID NO:12. In addition, the protein is further cleaved to generate a product of size 209.2 kDa, encoded by SEQ ID NO:48 and disclosed as SEQ ID NO:49 (TcdAii peptide), and a product of size 63.6 kDa, encoded by SEQ ID NO:50 and disclosed as SEQ ID NO:51 (TcdAiii peptide). Thus, it is thought that the insecticidal activity identified as toxin A (Example 15) derives from the products of SEQ ID NO:46, as exemplified by the full-length protein of 282.9 kDa disclosed as SEQ ID NO:47, which is processed to produce the peptides disclosed as SEQ ID NOS:49 and 51. It is thought that the insecticidal activity identified as toxin B (Example 15) derives from the products of SEQ ID NO:11, as exemplified by the 280.6 kDa protein disclosed as SEQ ID NO:12. This protein is proteolytically processed to yield the 207.6 kDa peptide disclosed as SEQ ID NO:53, which is encoded by SEQ ID NO:52, and the 62.9 kDa peptide having the N-terminal sequence disclosed as SEQ ID NO:40, and further disclosed as SEQ ID NO:55, which is encoded by SEQ ID NO:54.
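The sizes quoted in this paragraph can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the 7551 bp reading frame includes one stop codon, and its helper names are hypothetical. It confirms that 2516 residues plus a stop codon account for 7551 base pairs, and it computes how much of each full-length precursor mass is not represented in the two listed cleavage products.

```python
def orf_length_bp(n_residues, include_stop=True):
    """Base pairs needed to encode n_residues, optionally plus one stop codon."""
    return 3 * (n_residues + (1 if include_stop else 0))


def unaccounted_mass_kda(full_length_kda, fragment_masses_kda):
    """Precursor mass (kDa) not represented in the listed cleavage products."""
    return round(full_length_kda - sum(fragment_masses_kda), 1)


# tcdA: 2516 residues in a 7551 bp reading frame (SEQ ID NO:46 and 47).
print(orf_length_bp(2516))

# Toxin A: 282.9 kDa precursor versus the 209.2 kDa and 63.6 kDa products.
print(unaccounted_mass_kda(282.9, [209.2, 63.6]))

# Toxin B: 280.6 kDa precursor versus the 207.6 kDa and 62.9 kDa products.
print(unaccounted_mass_kda(280.6, [207.6, 62.9]))
```

The first call prints 7551, and both mass differences come to 10.1 kDa. A difference of that order is what one might expect if roughly 88 N-terminal residues are removed, taking about 110 Da per residue as a rule of thumb; that per-residue figure is an assumption of this note, not a value from the text.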
Amino acid sequence comparisons between the proteins disclosed as SEQ ID NO:12 and SEQ ID NO:47 reveal that they have 69% similarity and 54% identity. This high degree of evolutionary relationship is not uniform throughout the entire amino acid sequence of these peptides, but is higher towards the carboxy-terminal end of the proteins, since the peptides disclosed as SEQ ID NO:51 (derived from SEQ ID NO:47) and SEQ ID NO:55 (derived from SEQ ID NO:12) have 76% similarity and 54% identity.
Example 18
Control of European Corn Borer-Induced Leaf Damage on Maize Plants by Spray Application of Photorhabdus (Strain W-14) Broth
The ability of Photorhabdus toxin(s) to reduce plant damage caused by insect larvae was demonstrated by measuring leaf damage caused by European corn borer (Ostrinia nubilalis) infested onto maize plants treated with Photorhabdus broth. Fermentation broth from Photorhabdus strain W-14 was produced and concentrated approximately 10-fold using ultrafiltration (10,000 MW pore-size) as described in Example 13. The resulting concentrated broth was then filter sterilized using 0.2 micron nitrocellulose membrane filters. A similarly prepared sample of uninoculated 2% proteose peptone #3 was used for control purposes. Maize plants (a DowElanco proprietary inbred line) were grown from seed to vegetative stage 7 or 8 in pots containing a soilless mixture in a greenhouse (27°C day; 22°C night, about 50% RH, 14 hr day-length, watered/fertilized as needed). The test plants were arranged in a randomized complete block design (3 reps/treatment, 6 plants/treatment) in a greenhouse with temperature about 22°C day; 18°C night, no artificial light and with partial shading, about 50% RH and watered/fertilized as needed. Treatments (uninoculated media and concentrated Photorhabdus broth) were applied with a syringe sprayer, 2.0 ml applied from directly (about 6 inches) over the whorl and 2.0 additional ml applied in a circular motion from approximately one foot above the whorl. In addition, one group of plants received no treatment. After the treatments had dried (approximately 30 minutes), twelve neonate European corn borer larvae (eggs obtained from commercial sources and hatched in-house) were applied directly to the whorl. After one week, the plants were scored for damage to the leaves using a modified Guthrie Scale (Koziel, M. G., Beland, G. L., Bowman, C., Carozzi, N. B., Crenshaw, R., Crossland, L., Dawson, J., Desai, M., Hill, M., Kadwell, S., Launis, K., Lewis, K., Maddox, D., McPherson, K., Meghji, M. Z., Merlin, E., Rhodes, R., Warren, G. W., Wright, M. and Evola, S. V. (1993). Bio/Technology, 11, 194-195) and the scores were compared statistically [T-test (LSD) p<0.05 and Tukey's Studentized Range (HSD) Test p<0.1]. The results are shown in Table 28. For reference, a score of 1 represents no damage, a score of 2 represents fine "window pane" damage on the unfurled leaf with no pinhole penetration and a score of 5 represents leaf penetration with elongated lesions and/or mid rib feeding evident on more than three leaves (lesions < 1 inch). These data indicate that broth or other protein-containing fractions may confer protection against specific insect pests when delivered in a sprayable formulation or when the gene or derivative thereof, encoding the protein or part thereof, is delivered via a transgenic plant or microbe.
Table 28. Effect of Photorhabdus Culture Broth on European Corn Borer-Induced Leaf Damage on Maize
Treatment               Average Guthrie Score
No Treatment            5.02a
Uninoculated medium     5.15a
Photorhabdus Broth      2.24b
Means with different letters are statistically different (p<0.05 or p<0.1).
Example 19
Genetic Engineering of Genes for Expression in E. coli
Summary of constructions
A series of plasmids was constructed to express the tcbA gene of Photorhabdus W-14 in Escherichia coli. A list of the plasmids is shown in Table 29. A brief description of each construction follows as well as a summary of the E. coli expression data obtained.
Table 29. Expression plasmids for the tcbA gene.
Abbreviations: Kan=kanamycin, Cam=chloramphenicol, Amp=ampicillin
Construction of pDAB634
In Example 9, a large EcoR I fragment which hybridizes to the TcbAii probe is described. This fragment was subcloned into pBC (Stratagene, La Jolla CA). Sequence analysis indicates that this fragment is 8816 base pairs. The fragment encodes the tcbA gene with the initiating ATG at position 571 and the terminating TAA at position 8086. The fragment therefore carries 570 base pairs of Photorhabdus DNA upstream of the ATG and 730 base pairs downstream of the TAA.
Construction of Plasmid pAcGP67B/tcbA
The tcbA gene was PCR amplified using the following primers: 5' primer (S1Ac51) 5' TTT AAA CCA TGG GAA ACT CAT TAT CAA GCA CTA TC 3' and 3' primer (S1Ac31) 5' TTT AAA GCG GCC GCT TAA CGG ATG GTA TAA CGA ATA TG 3'. PCR was performed using a TaKaRa LA PCR kit from PanVera (Madison, Wisconsin) in the following reaction: 57.5 µl water, 10 µl 10X LA buffer, 16 µl dNTPs (2.5 mM each stock solution), 20 µl each primer at 10 pmoles/µl, 300 ng of the plasmid pDAB634 containing the W-14 tcbA gene and one µl of TaKaRa LA Taq polymerase. The cycling conditions were 98°C/20 sec, 68°C/5 min, 72°C/10 min for 30 cycles. A PCR product of the expected size (about 7526 bp) was isolated in a 0.8% agarose gel in TBE (100 mM Tris, 90 mM boric acid, 1 mM EDTA) buffer and purified using a Qiaex II kit from Qiagen (Chatsworth, California). The purified tcbA gene was digested with Nco I and Not I, ligated into the baculovirus transfer vector pAcGP67B (PharMingen, San Diego, California) and transformed into DH5α E. coli. The tcbA gene was then cut from pAcGP67B and transferred to pET27b to create plasmid pDAB635. A missense mutation in the tcbA gene was repaired in pDAB635. The repaired tcbA gene contains two changes from the sequence shown in SEQ ID NO:11: an A→G at 212 changing asparagine 71 to serine 71 and a G→A at 229 changing alanine 77 to threonine 77. These changes are both upstream of the proposed TcbAii N-terminus.
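The two point changes reported for the repaired gene (A to G at position 212, G to A at position 229) and the cloning sites carried by the primers can be verified against the sequence data. The sketch below is a hedged illustration, not the authors' procedure. It assumes the standard genetic code, takes nucleotides 193 to 240 of SEQ ID NO:11 (codons 65 to 80) from the sequence listing, and uses the well-known recognition sites CCATGG (Nco I) and GCGGCCGC (Not I). The function names are hypothetical.

```python
# Standard genetic code, laid out in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Nucleotides 193-240 of SEQ ID NO:11 (codons 65-80), copied from the listing.
REGION_START = 193
REGION = "CGTATTTTTGCCTATGCTAATCCGCTGCTGAAAAACGCTGTTCGGTTG"

# PCR primers S1Ac51 and S1Ac31 with spaces removed.
PRIMER_5 = "TTTAAACCATGGGAAACTCATTATCAAGCACTATC"
PRIMER_3 = "TTTAAAGCGGCCGCTTAACGGATGGTATAACGAATATG"


def codon_number(nt_position):
    """1-based codon index for a 1-based nucleotide position in the coding sequence."""
    return (nt_position - 1) // 3 + 1


def substitution_effect(nt_position, old_base, new_base):
    """Return (codon number, original residue, new residue) for a point change."""
    offset = nt_position - REGION_START
    if REGION[offset] != old_base:
        raise ValueError(f"expected {old_base} at {nt_position}, found {REGION[offset]}")
    mutated = REGION[:offset] + new_base + REGION[offset + 1:]
    codon_start = (offset // 3) * 3  # REGION begins on a codon boundary
    before = CODON_TABLE[REGION[codon_start:codon_start + 3]]
    after = CODON_TABLE[mutated[codon_start:codon_start + 3]]
    return codon_number(nt_position), before, after


print(substitution_effect(212, "A", "G"))  # asparagine 71 to serine 71
print(substitution_effect(229, "G", "A"))  # alanine 77 to threonine 77
print("CCATGG" in PRIMER_5, "GCGGCCGC" in PRIMER_3)  # Nco I and Not I sites
```

The first two calls return (71, 'N', 'S') and (77, 'A', 'T'), matching the residue changes stated in the text, and the last line prints True True, showing that each primer carries the site used for the subsequent Nco I/Not I digestion.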
Construction of pET15-tcbA
The tcbA coding region of pDAB635 was transferred to vector pET15b. This was accomplished using shotgun ligations; the DNAs were cut with restriction enzymes Nco I and Xho I. The resulting recombinant is called pET15-tcbA.
Expression of TcbA in E. coli from plasmid pET15-tcbA
Expression of tcbA in E. coli was obtained by modification of the methods previously described by Studier et al. (Studier, F.W., Rosenberg, A., Dunn, J., and Dubendorff, J., (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol., 185: 60-89.). Competent E. coli cells strain BL21(DE3) were transformed with plasmid pET15-tcbA and plated on LB agar containing 100 µg/ml ampicillin and 40 mM glucose. The transformed cells were plated to a density of several hundred isolated colonies/plate. Following overnight incubation at 37°C the cells were scraped from the plates and suspended in LB broth containing 100 µg/ml ampicillin. Typical culture volumes were from 200-500 ml. At time zero, culture densities (OD600) were from 0.05-0.15 depending on the experiment. Cultures were shaken at one of three temperatures (22°C, 30°C or 37°C) until a density of 0.15-0.5 was obtained, at which time they were induced with 1 mM isopropylthio-β-galactoside (IPTG). Cultures were incubated at the designated temperature for 4-5 hours and then were transferred to 4°C until processing (12-72 hours).
Purification and characterization of TcbA expressed in E. coli from Plasmid pET15-tcbA
E. coli cultures expressing TcbA peptides were processed as follows. Cells were harvested by centrifugation at 17,000 x g and the media was decanted and saved in a separate container. The media was concentrated about 8x using the M12 (Amicon, Beverly MA) filtration system and a 100 kD molecular mass cut-off filter. The concentrated media was loaded onto an anion exchange column and the bound proteins were eluted with 1.0 M NaCl. The 1.0 M NaCl elution peak was found to cause mortality against Southern corn rootworm (SCR) larvae (Table 30). The 1.0 M NaCl fraction was dialyzed against 10 mM sodium phosphate buffer pH 7.0, concentrated, and subjected to gel filtration on Sepharose CL-4B (Pharmacia, Piscataway, New Jersey). The region of the CL-4B elution profile corresponding to the same calculated molecular weight (about 900 kDa) as the native W-14 toxin complex was collected, concentrated and bioassayed against larvae. The collected 900 kDa fraction was found to have insecticidal activity (see Table 30 below), with symptomology similar to that caused by native W-14 toxin complex. This fraction was subjected to Proteinase K and heat treatment; the activity in both cases was either eliminated or reduced, providing evidence that the activity is proteinaceous in nature. In addition, the active fraction tested immunologically positive for the TcbA and TcbAiii peptides in immunoblot analysis when tested with an anti-TcbAiii monoclonal antibody (Table 30).
Table 30. Results of Immunoblot and SCR Bioassays.
PK = Proteinase K treatment 2 hours; Heat treatment = 100°C for 10 minutes; ND = None Detected; NT = Not Tested. Scoring system for mortality and growth inhibition as compared to control samples; 24%="+", 25-49%="++", 50-100%="+++".
The cell pellet was resuspended in 10 mM sodium phosphate buffer, pH=7.0, and lysed by passage through a Bio-Neb™ cell nebulizer (Glas-Col Inc., Terre Haute, IN). The pellets were treated with DNase to remove DNA and centrifuged at 17,000 x g to separate the cell pellet from the cell supernatant. The supernatant fraction was decanted and filtered through a 0.2 micron filter to remove large particles and subjected to anion exchange chromatography. Bound proteins were eluted with 1.0 M NaCl, dialyzed and concentrated using Biomax™ (Millipore Corp, Bedford, MA) concentrators with a molecular mass cut-off of 50,000 Daltons. The concentrated fraction was subjected to gel filtration chromatography using Sepharose CL-4B beaded matrix. Bioassay data for material prepared in this way is shown in Table 30 and is denoted as "TcbA Cell Sup".
In yet another method to handle large amounts of material, the cell pellets were re-suspended in 10 mM sodium phosphate buffer, pH = 7.0 and thoroughly homogenized by using a Kontes Glass Company (Vineland, NJ) 40 ml tissue grinder. The cellular debris was pelleted by centrifugation at 25,000 x g and the cell supernatant was decanted, passed through a 0.2 micron filter and subjected to anion exchange chromatography using a Pharmacia 10/10 column packed with Poros HQ 50 beads. The bound proteins were eluted by performing a NaCl gradient of 0.0 to 1.0 M.
Fractions containing the TcbA protein were combined and concentrated using a 50 kDa concentrator and subjected to gel filtration chromatography using Pharmacia CL-4B beaded matrix. The fractions containing TcbA oligomer, molecular mass of approximately 900 kDa, were collected and subjected to anion exchange chromatography using a Pharmacia Mono Q 10/10 column equilibrated with 20 mM Tris buffer pH = 7.3. A gradient of 0.0 to 1.0 M NaCl was used to elute recombinant TcbA protein.
Recombinant TcbA eluted from the column at a salt concentration of approximately 0.3-0.4 M NaCl, the same molarity at which native TcbA oligomer is eluted from the Mono Q 10/10 column. The recombinant TcbA fraction was found to cause SCR mortality in bioassay experiments similar to those in Table 30.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Ensign, Jerald C; Bowen, David J; Petell, James; Fatig, Raymond; Schoonover, Sue; ffrench-Constant, Richard; Orr, Gregory L; Merlo, Donald J; Roberts, Jean L; Rocheleau, Thomas A; Blackburn, Michael B; Hey, Timothy D; Strickland, James A
(ii) TITLE OF INVENTION: Insecticidal Protein Toxins From Photorhabdus
(iii) NUMBER OF SEQUENCES: 61
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Quarles & Brady
(B) STREET: 1 South Pinckney Street
(C) CITY: Madison
(D) STATE: WI
(E) COUNTRY: US
(F) ZIP: 53703
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/063,615
(B) FILING DATE: 18-MAY-1993
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/395,497
(B) FILING DATE: 28-FEB-1995
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/007,255
(B) FILING DATE: 06-NOV-1995
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/609,423
(B) FILING DATE: 23-FEB-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/705,484
(B) FILING DATE: 23-AUG-1995
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Seay, Nicholas J
(B) REGISTRATION NUMBER: 2^336
(C)
REFERENCE/DOCKET NUMBER: 960296.93304
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 603-251-5000
(B) TELEFAX: 603-251-9166
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 11 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn
1               5                   10
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 12 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Gln Asp Ser Pro Glu Val Ser Ile Thr Thr Trp
1               5                   10
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 19 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp Ala
1               5                   10                  15
Leu Val Ala
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 14 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Ala Ser Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn
1               5                   10
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Ala Gly Asp Thr Ala Asn Ile Gly Asp
1               5
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 15 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Leu Gly Gly Ala Ala Thr Leu Leu Asp Leu Leu Leu Pro Glr I:
1               5                   10
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 11 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Leu Ser Thr Met Glu Lys Gln Leu Asn Glu
5 10
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 9 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Met Asn Leu Ala Ser Pro Leu Ile Ser
1               5
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 16 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: N-terminal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Ile Asn Leu Asp Ile Asn Glu Gln Asn Lys Ile Met Val Val Ser
1               5                   10                  15
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 20 amino acids (B) TYPE: amino acid (C)
STRANDEENESS : (Di TOPOLOGY: linear (ii) MOLECULE TYPE: procein (v) FRAGMENT T PE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: Ala Ala Lys Asp Val Lys Phe Gly Ser Asp Ala Arg Val Lys M=c Leu 1 5 10 15 Arg Gly Val Asn 20 INFORMATION FOR SEQ ID HO: 11 SEQUENCE CHARACTERISTICS: (A) LENGTH: 7515 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS : double (D) TOPOLOGY: linear MOLECULE TYPE: DNA (genomic) FEATURE: (A) NAME /KEY: CDS (B) LOCATION: 1..7515 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: ATG CAA AAC TCA TTA TCA AGC ACT ATC GAT ACT ATT TGT AG AAA CTG 43 Mec Gin Asn Ser Leu Ser Ser Thr He Asp Thr He Cys Gin Lys Leu 1 5 10 15 CAA TTA ACT TGT CCG GCG GAA ATT GCT TTG TAT CCC TTT GAT ACT TTC 96 Gin Leu Thr Cys Pro Ala Glu He Ala Leu Tyr Pro Phe Asp Thr Phe 20 25 30 CGG GAA AAA ACT CGG GGA ATG GTT AAT TGG GGG GAA GCA AAA CGG ATT 144 Arg Glu Lys Thr Arg Gly Mec Val Asn Trp Gly Glu Ala Lys Arg He 35 40 45 TAT GAA ATT GCA CAA GCG GAA CAG GAT AGA AAC CTA CTT CAT GAA AAA 132 Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 50 55 SO CGT ATT TTT GCC TAT GCT AAT CCG CTG CTG AAA AAC GCT GTT CGG TTG 240 Arg He Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Aia Val Arg Leu -119- O 3 30 GGT ACC CGG C.AA ATG TTG GGT TTT ATA CAA GGT TAT AGT GAT CTG TTT 233 Gly Thr Arg Gin Mec Leu Gly Phe He Gin Gly T/r Ser Asp Leu Phe 35 90 95 GGT .AAT CGT GCT GAT AAC TAT GCC GCG CCG GGC TCG GTT GCA TCG ATG 335 G y Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Aia Ser Mec 100 105 110 TTC TCA CCG GCG GCT TAT TTG ACG GAA TTG TAC CGT GAA GCC AAA AAC 334 Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 115 120 125 TTG CAT GAC AGC AGC TCA ATT TAT TAC CTA GAT AAA CGT CGC CCG GAT 432 Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 130 135 140 TTA GCA AGC TTA ATG CTC AGC CAG AAA AAT ATG GAT GAG GAA ATT TCA 430 Leu Ala Ser Leu Mec Leu Ser Gin Lys Asn Mec Asp Glu Glu He Ser 145 150 155 1 = 0 ACG CTG GCT CTC TCT AAT GAA TTG TGC CTT GCC GGG ATC GAA AC AAA 523 Thr 
Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 155 170 175 ACA GGA AAA TCA CAA GAT GAA GTG ATG GAT ATG TTG TCA ACT TAT CGT 576 Thr Gly Lys Ser Gin Asp Glu Val Met Asp Mec Leu Ser Thr Tyr Arg 130 135 190 TTA AGT GGA GAG ACA CCT TAT CAT CAC GCT TAT GAA ACT GTT CGT GAA 624 Leu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Val Arg Glu 195 200 205 ATC GTT CAT GAA CGT GAT CCA GGA TTT CGT CAT TTG TCA CAG GCA CCC 672 lie Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 210 215 220 ATT GTT GCT GCT AAG CTC GAT CCT GTG ACT TTG TTG GGT ATT AGC TCC 720 lie Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly He Ser Ser 225 230 235 240 CAT ATT TCG CCA GAA CTG TAT AAC TTG CTG ATT GAG GAG ATC CCG GAA 763 His He Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He Pro Glu 245 250 255 AAA GAT GAA GCC GCG CTT GAT ACG CTT TAT AAA ACA AAC TTT GGC GAT 316 Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 260 265 270 ATT ACT ACT GCT CAG TTA ATG TCC CCA AGT TAT CTG GCC CGG TAT TAT 364 He Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr T/r 275 280 235 GGC GTC TCA CCG GAA GAT ATT GCC TAC GTG ACG ACT TCA TTA TCA CAT 912 Gly Val Ser Pro Glu Asp He Ala Tyr Val Thr Thr Ser Leu Ser His 290 295 300 GTT GGA TAT AGC AGT GAT ATT CTG GTT ATT CCG TTG GTC GAT GGT GTG 950 Val Gly Tyr Ser Ser Asp He Leu Val He Pro Leu Val Asp Gly Val 305 310 315 320 GGT AAG ATG G A GTA GTT CGT GTT ACC CGA ACA CCA TCG GAT AAT TAT 1003 Gly Lys Mec Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn T/r 325 330 335 -120- SUBSTTTUTE SHEET (RULE 26) A" ACT AG ACG AAT TAT ATT GAG CTG TAT CCA CAG GGT GGC GAC AAT 1055 Thr Ser Gin Thr Asn Tyr He Glu Leu Tyr Pro Gin Gly Gly Asp Asn 34C 345 350 TTG ATC .AAA TAC AAT CTA AGC AAT AGT TTT GGT TTG GAT GAT TTT 1104 Leu He Lys Tyr Asn Leu Ser Asn Ser ?he Gly Leu Asp Asp Phe 355 360 365 TAT CTG CAA TAT AAA. 
GAT GGT TCC GCT GAT TGG ACT GAG ATT GCC CAT 1152 Tyr Leu Gin Tyr Lys Asp Gly Ser Aia Asp Trp Thr Glu He Ala His 370 375 380 AAT CCC TAT CCT GAT ATG GTC ATA AAT CAA .AAG TAT GAA T A CAG GCG 1200 Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser Gin Ala 335 390 395 400 ACA ATC .AAA CGT ACT GAC TCT GAC AAT ATA CTC AGT ATA GGG TTA C A 1243 Thr lie Lys Arg Ser Asp Ser Asp Asn He Leu Ser He Gly Leu Gin 405 410 415 TGG CAT AGC GGT AGT TAT AAT TTT GCC GCC GCC AAT TTT AAA ATT 1296 Trp His Ser Gly Ser Tyr Asn Phe Ala Aia Ala Asn Phe Lys He 420 425 430 G C CAA TAC TCC CCG AAA GCT TTC CTG CTT AAA ATG AAT AAG GCT ATT 1344 Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Mec Asn Lys Ala He 435 440 445 CGG TTG CTC AAA GCT ACC GGC CTC TCT TTT GCT ACG TTG GAG CGT ATT 1392 Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg He 450 455 460 GTT GAT AGT GTT AAT AGC ACC AAA TCC ATC ACG GTT GAG GTA TTA AAC 1440 Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 465 470 475 430 AAG GTT TAT CGG GTA AAA TTC TAT ATT GAT CGT TAT GGC ATC AGT GAA 1433 Lys Val Tyr Arg Val Lys Phe Tyr He Asp Arg T/r Gly He Ser Glu 435 490 495 GAG ACA GCC GCT ATT TTG GCT AAT ATT AAT ATC TCT CAG CAA GCT GTT 1535 Glu Thr Ala Ala He Leu Ala Asn He Asn He Ser Gin Gin Ala Val 500 505 510 GGC AAT CAG CTT AGC CAG TTT GAG CAA CTA TTT AAT CAC CCG CCG CTC 1584 Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro Pro Leu 515 520 525 AAT GGT ATT CGC TAT GAA ATC AGT GAG GAC AAC TCC AAA CAT CTT CCT 1532 Asn Gly He Arg Tyr Glu He Ser Glu Asp Asn Ser Lys His Leu Pro 530 535 540 AAT CCT GAT CTG AAC CTT AAA CCA GAC AGT ACC GGT GAT GAT CAA CGC 1530 Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gin Arg 545 550 555 560 AAG GCG GTT TTA AAA CGC GCG TTT CAG GTT AAC GCC AGT GAG TTG TAT 1723 Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu Leu Tyr 565 570 575 CAG ATG TTA TTG ATC ACT GAT CGT AAA GAA GAC GGT GTT ATC AAA AAT 1775 Gin MeC Leu Leu He Thr Asp Arg Lys Glu Asp Gly Val He Lys Asn 530 535 590 
AAC TTA GAG AAT TTG TCT GAT CTG TAT TTG GTT AGT TTG CTG GCC CAG 1324 Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gin -121- 595 600 605 ATT AAC CTG ACT ATT GCT GAA TTG AAC ATT TTG TTG GTG ATT TGT 1372 He Asn Leu Thr He Ala Glu Leu Asn H Leu Leu Val He Cys 615 620 GGC GAC ACC .AAC ATT TAT CAG ATT ACC GAC GAT PAT TTA GCC 1920 Gly Asp Thr Asn He Tyr Gin He Thr Asp Asp Asn Leu Ala 630 535 640 GTG AA ACA TTG TTG TGG ATC ACT CAA TGG TTG AAG ACC CAA 1363 Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys Thr Gin 645 550 655 AAA TGG ACA GTT ACC GAC CTG TTT CTG ATG ACC ACG GCC ACT TAC AGC 2015 Lys Trp Thr Val Thr Asp Leu Phe Leu Met Thr Thr Ala Thr Tyr Ser 660 665 670 ACC ACT TTA ACC CCA GAA ATT AGC AAT CTG ACG GCT ACG TTG TCT TCA 2064 Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 675 530 635 ACT TTG CAT GGC AAA GAG AGT CTG ATT GGG GAA GAT CTG AAA AGA GCA 2112 Thr Leu His Gly Lys Glu Ssr Leu He Gly Glu Asp Leu Lys Arg Ala 590 695 700 ATG GCG CCT TGC TTC ACT TCG GCT TTG CAT TTG ACT TCT CAA GAA GTT 2150 Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val 705 710 715 720 GCG TAT GAC CTG CTG TTG TGG ATA GAC CAG ATT CAA CCG GCA CAA ATA 2203 Ala Tyr Asp Leu Leu Leu Trp He Asp Gin He Gin Pro Ala Gin He "25 730 735 ACT GTT GAT GGG TTT TGG GAA G A GTG CAA ACA ACA CCA ACC AGC TTG 2256 Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 740 745 750 AAG GTG ATT ACC TTT GCT CAG GTG CTG GCA CAA TTG AGC CTG ATC TAT 2304 Lys Val He Thr Phe Ala Gin Val Leu Ala Gin Leu Ser Leu He Tyr 755 760 765 CGT CGT ATT GGG TTA AGT GAA ACG GAA CTG TCA CTG ATC GTG ACT CAA 2352 Arg Arg lie Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val Thr Gin 770 775 730 TCT TCT CTG CTA GTG GCA GGC AAA AGC ATA CTG GAT CAC GGT CTG TTA 2400 Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly Leu Leu 735 790 795 300 ACC CTG ATG GCC TTG GAA GGT TTT CAT ACC TGG GTT AAT GGC TTG GGG 2443 Thr Leu Mec Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 305 810 815 CAA CAT GCC TCC 
TTG ATA TTG GCG GCG TTG AAA GAC GGA GCC TTG ACA 2496 Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 820 825 830 GTT ACC GAT GTA GCA CAA GCT ATG AAT AAG GAG GAA TCT CTC CTA CAA 2544 V l Thr Asp Val Ala Gin Ala Met Asn Lys Glu Glu Ser Leu Leu Gin 335 ' 340 845 ATG GCA GCT AAT CAG GTG GAG AAG GAT CTA ACA AAA CTG ACC AGT TGG 2592 Mec Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 350 355 360 -122- ACA CAG ATT GAC GCT ATT CTG CAA TGG TTA CAG ATG TCT TCG GCC TTG 2540 Thr Gin lie Asp Ala He Leu Gin Trp Leu Gin Met Ser Ser Ala Leu 855 370 375 3a 0 GCG GTT TCT CCA CTG GAT CTG GCA GGG ATG ATG GCC CTG AAA TAT GGG 2633 Ala Val Ser Pro Leu Asp Leu Ala Gly Met Mec Ala Leu Lys Tyr Gly 885 390 395 ATA GAT CAT AAC TAT GCT GCC TGG CAA GCT GCG GCG GCT GCG CTG ATG 2730 lie Asp His Asn T/r Ala Ala Trp Gin Ala Ala Ala Ala Ala Leu Mec 900 905 9i0 GCT GAT CAT GCT AAT CAG GCA CAG AAA AAA CTG GAT GAG ACG TTC AGT 2734 Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 915 920 925 AAG GCA TTA TGT AAC TAT TAT ATT AAT GCT GTT GTC GAT AGT GCT GCT 2332 Lys Ala Leu Cys Asn Tyr Tyr He Asn Ala Val Val ASD Ser Ala Ala 930 935 940 GGA GTA CGT GAT CGT AAC GGT TTA TAT ACC TAT TTG CTG ATT GAT AAT 2330 Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu lie Asp Asn 945 950 955 950 CAG GTT TCT GCC GAT GTG ATC ACT TCA CGT ATT GCA GAA GCT ATC GCC 2923 Gin Val Ser Ala Asp Val He Thr Ser Arg lie Ala Glu Ala He Ala 965 970 975 GGT ATT CAA CTG TAC GTT AAC CGG GCT TTA AAC CGA GAT GAA GGT CAG 2975 Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 980 985 990 CTT GCA TCG GAC GTT AGT ACC CGT CAG TTC TTC ACT GAC TGG GAA CGT 3024 Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 995 1000 1005 TAC AAT AAA CGT TAC AGT ACT TGG GCT GGT GTC TCT GAA CTG GTC TAT 3072 Tyr Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr 1010 1015 1020 TAT CCA GAA AAC TAT GTT GAT CCC ACT CAG CGC ATT GGG CAA ACC AAA 3120 Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg 
He Gly Gin Thr Lys 1025 1030 1035 1040 ATG ATG GAT GCG CTG TTG CAA TCC ATC AAC CAG AGC CAG CTA AAT GCG 3153 Mec Mec Asp Ala Leu Leu Gin Ser He Asn Gin Ser Gin Leu Asn Ala 1045 1050 1055 GAT ACG GTG GAA GAT GCT TTC AAA ACT TAT TTG ACC AGC TTT GAG CAG 3215 Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gin 1060 1065 1070 GTA GCA AAT CTG AAA GTA ATT AGT GCT TAC CAC GAT AAT GTG AAT GTG 3264 Val Ala Asn Leu Lys Val He Ser Ala Tyr His Asp Asn Val Asn Val 1075 1030 1085 GAT CAA GGA TTA ACT TAT TTT ATC GGT ATC GAC CAA GCA GCT CCG GGT 3312 Asp Gin Gly Leu Thr Tyr Phe He Gly He Asp Gin Ala Ala Pro Gly 1090 1095 1100 ACG TAT TAC TGG CGT AGT GTT GAT CAC AGC AAA TGT GAA AAT GGC AAG 3350 Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys 1105 1110 1115 1120 TTT GCC GCT AAT GCT TGG GGT GAG TGG AAT AAA ATT ACC TGT GCT GTC 3403 Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys He Thr Cys Ala Val -123- 1125 1130 i i 25 AAT CCT TGG AAA AAT AT ATC CGT CCG GTT GTT TAT ATG TCC CGC TTA 50 Asn Pro Trp Lys Asn I la He Arg Pro Val Val Tyr Met .Ser Aro L=u 1140 1145 1150 TAT CTG CTA TGG CTG GAG CAG C.AA TCA AAG AAA AGT GAT GAT GGT AAA 3504 Tyr Leu Leu Trp Leu G u Gin Gin Ser Lys Lys Ser Asp Asp Gly Lys 1155 1160 1155 ACC ACG ATT TAT C.AA TAT .AAC TTA AAA CTG GCT CAT ATT CGT TAC GAC 3552 Thr Thr He Tyr Gin Tyr Asn Leu Lys Leu Ala His He Arg Tr Asp 1170 1175 1190 GGT AGT TGG AAT ACA CCA TTT ACT TTT GAT GTG ACA GAA AAG GTA AAA 360C Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys 1135 1190 1195 120 AAT TAC ACG TCG AGT ACT GAT GCT GCT GAA TCT TTA GGG TTG TAT TGT 3643 Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu T/r cys 1205 1210 1215 ACT GGT TAT CAA GGG GAA GAC ACT CTA TTA GTT ATG TTC TAT TCG ATG 3656 Thr Gly T/r Gin Gly Glu Asp Thr Leu Leu Val Men Phe Tyr Ser Mec 1220 1225 1230 CAG AGT AGT TAT AGC TCC TAT ACC GAT AAT AAT GCG CCG GTC ACT GGG 3744 Gin Ser Ser T/r Ser Ser T/r Thr Asp Asn Asn Ala Pro Val Thr Gly 1235 1240 1245 CTA TAT ATT TTC GCT GAT 
ATG TCA TCA GAC AAT ATG ACG AAT GCA CAA 3792 Leu T/r He Phe Ala Asp Mec Ser Ser Asp Asn Met Thr Asn Ala Gin 1250 1255 1260 GCA ACT AAC TAT TGG AAT AAC AGT TAT CCG CAA TTT GAT ACT GTG ATG 3340 Ala Thr Asn T/r Trp Asn Asn Ser Tyr Pro Gin Phe Asp Thr Val Met 1265 1270 1275 12B0 GCA GAT CCG GAT AGC GAC AAT AAA AAA GTC ATA ACC AGA AGA GTT AAT 3838 Ala Asp Pro Asp Ser Asp Asn Lys Lys Val He Thr Arg Arg Val Asn 1235 1290 1295 AAC CGT TAT GCG GAG GAT TAT GAA ATT CCT TCC TCT GTG ACA AGT AAC 3935 Asn Arg T/r Ala Glu Asp T/r Glu He Pro Ser Ser Val Thr Ser Asn 1300 1305 1310 AGT AAT TAT TCT TGG GGT GAT CAC AGT TTA ACC ATG CTT TAT GGT GGT 3984 Ser Asn Tyr Ser Trp Gly Asp His Ser Leu Thr Met Leu Tyr Gly Gly 1315 1320 1325 ACT GTT CCT AAT ATT ACT TTT GAA TCG GCG GCA GAA GAT TTA AGG CTA 4032 Ser Val Pro Asn He Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 1330 1335 1340 TCT ACC AAT ATG GCA TTG AGT ATT ATT CAT AAT GGA TAT GCG GGA ACC 4080 Ser Thr Asn Met Ala Leu Ser He He His Asn Gly Tyr Ala Gly Thr 1345 1350 1355 1360 CGC CGT ATA CAA TGT AAT CTT ATG AAA CAA TAC GCT TCA TTA GGT GAT 4123 Arg Arg He Gin Cys Asn Leu Met Lys Gin Tyr Ala Ser Leu Gly Asp 1365 1370 1375 AAA TTT ATA ATT TAT GAT TCA TCA TTT GAT GAT GCA AAC CGT TTT AAT 4175 Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 1380 . 1385 1390 -124- GAG AAC GAT AGT TTT GTG ATT TAT CAA GGA GAA CTT AGO GAA ACA AGT 5040 Glu Asn Asp Ser Phe Vai He Tyr Gin Giy Glu Leu Ser Giu Thr Ser 1555 1 70 1675 1530 CAA ACT GTT GTG AAA GTT TTC TTA TCC TAT TTT ATA GAG GCG ACT GGA 5033 '.;ln Thr Vai. 
Vai Lys Vai Phe Leu Ser T/r Phe lie Glu Ala Thr Giy 1535 1690 1595 AAT AAG AAC CAC TTA TGG GTA CGT GCT AAA TAC CAA AAG GAA ACG ACT 5136 Asn Lys Asn His Leu Trp Vai Arg Ala Lys Tyr Gin Lys Glu Thr Thr 1700 1705 1710 GAT AAG ATC TTG TTC GAC CGT ACT GAT GAG AAA GAT CCG CAC GGT TGG 5134 Asp Lys lie Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Giy Trp 1715 1720 1725 TTT CTC AGC GAC GAT CAC AAG ACC TTT AGT GGT CTC TCT TCC GCA CAG 5232 Phe Leu Ser Asp Asp His Lys Thr Phe Ser Giy Leu Ser Ser Ala Gin 1730 1735 1740 GCA TTA AAG AAC GAC AGT GAA CCG ATG GAT TTC TCT GGC GCC AAT GCT 5230 Ala Leu Lys Asn Asp Ser Glu Pro Met Asp Phe Ser Giy Aia Asn Ala 1745 1750 1755 1760 CTC TAT TTC TGG GAA CTG TTC TAT TAC ACG CCG ATG ATG ATG GCT CAT 5323 Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Mec Met Mec Ala His 1765 1770 1775 CGT TTG TTG CAG GAA CAG AAT TTT GAT GCG GCG AAC CAT TGG TTC CGT 5376 Arg Leu Leu Gin Gl'u Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 1780 1785 1790 TAT GTC TGG AGT CCA TCC GGT TAT ATC GTT GAT GGT AAA ATT GCT ATC 5424 Tyr Vai Trp Ser Pro Ser Giy Tyr lie Vai Asp Giy Lys lie Ala He 1795 1800 1805 " TAC CAC TGG AAC GTG CGA CCG CTG GAA GAA GAC ACC AGT TGG AAT GCA 5 72 Tyr His Trp Asn Vai Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 1810 1815 1820 CAA CAA CTG GAC TCC ACC GAT CCA GAT GCT GTA GCC CAA GAT GAT CCG 5520 Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Vai Ala Gin Asp Asp Pro 1325 1830 1335 1340 ATG CAC TAC AAG GTG GCT ACC TTT ATG GCG ACG TTG GAT CTG CTA ATG 55 ·3 Mec His Tyr Lys Vai Ala Thr Phe Me Ala Thr Leu Asp Leu Leu Mec 1845 1850 1855 GCC CGT GGT GAT GCT GCT TAC CGC CAG TTA GAG CGT GAT ACG TTG GCT 56 L6 Ala Arg Giy Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 1860 1865 1870 GA¾ GCT AAA ATG TGG TAT ACA CAG GCG CTT AAT CTG TTG GGT GAT GAG 5664 Glu Ala Lys Mec Trp Tyr Thr Gin Ala Leu Asn Leu Leu Giy Asp Glu 1875 1380 1885 CCA CAA GTG ATG CTG AGT ACG ACT TGG GCT AAT CCA ACA TTG GGT .AAT 5712 Pro Gin Vai Mec Leu Ser Thr Thr Trp Ala Asn Pro. 
Thr Leu Giy Asn 1890 1895 1900 GCT GCT TCA AAA ACC ACA CAG CAG GTT CGT CAG CAA GTG CTT ACC CAG 5760 Ala Ala Ser Lys Thr Thr Gin Gin Vai Arg Gin Gin Vai Leu Thr Gin 1305 1910 1915 1920 -126- SUBSTTTUTE SHEET (RULE 26) CTG GTG CCA TTG TTT AAA TTC GGA AAA GAC GAG AAC TCA GAT GAT ACT 421 ! Leu Val Pro Leu Phe Lys Phe Gly Lys Asp Giu Asn Ser AS D Asp Ser 1395 1400 1405 ATT TGT ATA TAT AAT GAA AAC CCT TCC TCT GAA GAT AAG AAG TGG TAT 4 "_ lie Cys He Tyr Asn Giu Asn Pro Ser Ser Giu Asp Lys Lys Trp Tyr 1410 1415 1420 TTT TCT TCG AAA GAT GAC AAT AAA ACA GCG GAT TAT AAT GGT GGA ACT 432'. ?he Ser Ser Lys Asp Asp Asn Lys Thr Aia Asp Tyr Asn Gly Gly Thr 1425 1430 1435 1440 CAA TGT ATA GAT GCT GGA ACC AGT AAC AAA GAT TTT TAT TAT AAT CTC 435. Gin Cys lie Asp Ala Gly Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 1445 1450 1455 CAG GAG ATT GAA GTA ATT AGT GTT ACT GGT GGG TAT TGG TCG AGT TAT 4 1 Gin Glu He Giu Val He Ser Val Thr Gly Gly T/r Trp Ser Ser Tyr 1460 1465 1470 AAA ATA TCC AAC CCG ATT AAT ATC AAT ACG GGC ATT GAT AGT GCT AAA 4461 Lys lie Ser Asn Pro He Asn He Asn Thr Gly He Asp Ser Aia Lys 1475 1480 1485 GTA AAA GTC ACC GTA AAA GCG GGT GGT GAC GAT CAA ATC TTT ACT GCT 451 J Val Lys Vai Thr Val Lys Ala Gly Gly Asp Asp Gin He Phe Thr Ala 1490 1495 1500 GAT AAT AGT ACC TAT GTT CCT CAG CAA CCG GCA CCC AGT TTT GAG GAG 456'; Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 1505 1510 1515 1520 ATG ATT TAT CAG TTC AAT AAC CTG ACA ATA GAT TGT AAG AAT TTA AAT 46Gi Mec He Tyr Gin Phe Asn Asn Leu Thr lie Asp Cys Lys Asn Leu Asn 1525 1530 1535 TTC ATC GAC AAT CAG GCA CAT ATT GAG ATT GAT TTC ACC GCT ACG GCA 46 "6 Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala Thr Aia 1540 1545 1550 CAA GAT GGC CGA TTC TTG GGT GCA GAA ACT TTT ATT ATC CCG GTA ACT 47 C Gin Asp Gly Arg Phe Leu Gly Ala Glu Thr Phe He He Pro Val Thr 1555 1560 1565 AAA AAA GTT CTC GGT ACT GAG AAC GTG ATT GCG TTA TAT AGC GAA AAT 4752 Lys Lys Val Leu Gly Thr Glu Asn Val He Ala Leu Tyr Ser Glu Asn 1570 1575 1530 AAC 
GGT GTT CAA TAT ATG CAA ATT GGC GCA TAT CGT ACC CGT TTG AAT 4300 Asn Gly Val Gin Tyr Mec Gin He Gly Ala Tyr Arg Thr Arg Leu Asn 1535 1590 1595 1600 ACG TTA TTC GCT CAA CAG TTG GTT AGC CGT GCT AAT CGT GGC ATT GAT 4343 Thr Leu Phe Ala Gin Gin Leu Val Ser Arg Ala Asn Arg Gly He Asp 1605 1510 1615 GCA GTG CTC AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAA TTA GGA 4396 Ala Val Leu Ser Mec Glu Thr Gin Asn He Gin Glu Pro Gin Leu Gly 1620 1625 1630 CG GGC ACA TAT GTG CAG CTT GTG TTG GAT AAA TAT GAT GAG TCT ATT 4344 la Gly Thr T/r Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser He 1635 1640 1645 CAT GGC ACT AAT AAA AGC TTT GCT ATT GAA TAT GTT GAT ATA TTT AAA 499 His Gly Thr Asn Lys Ser Phe Ala He Glu T/r Val Asp He Phe Lys " GT :TC AAT AGC AGG GTA AAA ACC CCG TG CTA CGA ACA ZZZ AT = Ϊ ¾ Leu Arg Leu Asn Ser A ? Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 1325 1S30 1335 TCC CTG ACC GCT TTA TTC CTG CCG AG CAA AAT AGC AAG CTC AAA GGC 5356 Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lvs Gly 1340 1945 L350 TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT TTA CGT CAT AAT CTG 5904 Tyr Trp Arg Thr Leu Ala Gin Arg Mec Phe Asn Leu Arg His Asn Leu 1955 1950 L965 TCG ATT GAC GGC CAG CCG CTC TCC TTG CCG CTG TAT GCT AAA CCG GCT 5352 Ser He Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 1970 1975 1980 GAT CCA AAA GCT TTA CTG AGT GCG GCG GTT TCA GCT TCT CAA GGG GGA 5000 Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser Gin Gly Gly 1935 1390 1995 2000 GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC CGC TTC CCT CAA ATG 5043 Ala Asp Leu Pre Lys Ala Pro Leu Thr lie His Arg Phe Pro Gin Mec 2005 2010 2015 CTA GAA GGG GCA CGG GGC TTG GTT AAC CAG CTT ATA CAG TTC GGT AGT 5096 Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu He Gin Phe Gly Ser 2020 2025 2030 TCA CTA TTG GGG TAC AGT GAG CGT CAG AT GCG GAA GCT ATG AGT CAA 6144 Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Mec Ser Gin 2035 2040 2045 CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG ACC AGT ATT CGT ATG 6132 Leu Leu Gin Thr Gin Ala Ser Glu 
Leu He Leu Thr Ser He Arg Mec 2050 2055 2060 CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA AAA ACC GCC TTG CAA 6240 Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 2055 2070 2075 2080 GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC AGC TAT AGC CAA CTG 5288 Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 2085 2090 2095 TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA GCG CTG GCG TTA CGC 5335 Tyr Glu Glu Asn He Asn Ala Gl Glu Gin Arg Ala Leu Ala Leu Arg 2100 2105 2110 TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG ATT TCC CGT ATG GCA 6384 Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin He Ser Arg Mec Ala 2115 2120 2125 GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC GGC CTG GCT GAT GGC 6432 Gly Ala Gly Val Asp Mec Ala Pro Asn He Phe Gly Leu Ala Asp Gly 2130 2135 2140 GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC GCT GAC GGT ATT GAG 6480 Gly Mec His Tyr Gly Ala He Ala Tyr Ala He Ala Asp Gly He Glu 2145 2150 2155 2160 TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG AAA GTT GCT CAG TCG 6523 Leu Ser Ala Ser Ala Lys Mec Val Asp Ala Glu Lys Val Ala Gin Ser 2165 2170 2175 G.AA ATA TAT CGC CCT CGC CGT CAA GAA TGG AAA ATT CAG CGT GAC AAC 6575 Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys He Gin Arg Asp Asn -127- 2130 2133 2130 GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA CTG GAA TCA CTG TCT 6: Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 2195 2200 2205 ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG TAC CTG AAA ACC C G 66" 2 lie Arg Arg Glu Ala Ala Glu Met Gin Lys Glu Tyr Leu Lys Thr Gin 2210 2215 2220 CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA AGA AGC AAA TTC ACT 6" 20 Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240 .AAT AA GCG TTA TAT AGT TGG TTA CGA GGG CGT TTG TCA GGT ATT TAT 6763 Asn Gin Aia Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly lie Tyr 2245 2250 2255 TTC CAG TTC TAT GAC TTG GCC GTA TCA CGT TGC CTG ATG GCA GAG CAA 63 i6 Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Mec Ala Glu Gin 2260 2265 2270 TCC TAT CAA TGG 
GAA GCT AAT GAT AAT TCC ATT AGC TTT GTC AAA CCG 6364 Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser lie Ser Phe Val Lys Pro 2275 2280 2285 GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG TGT GGA GAA GCT TTG 6912 Gly Ala Trp Gin Giy Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300 ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT CTG AAA TGG GAA TCT 6960 lie Gin Asn Leu Ala Gin Met Glu Glu Aia Tyr Leu Lys Trp Glu Ser 2305 2310 2315 2320 CGC GCT TTG GAA GTA GAA CGC ACG GTT TCA TTG GCA GTG GTT TAT GAT 700Ϊ Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Val Val Tyr Asp 2325 2330 2335 TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG GAA CAA ΑΤΛ CCT GCA 0 Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin lie Pro Ala 2340 2345 2350 TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA AC AAA GAA AAT GGG TTA i04 Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 2355 2360 2365 TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC AAA TTG TCC GAC TTG 7152 Ser Leu Ala Asn Ala lie Leu Ser Ala Ser Val Lys Leu Ser Asp Leu 2370 2375 2380 AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT GGT AGC AAC AAG GTT 7200 Lys Leu Gly Thr Asp Tyr Pro Asp Ser He Val Gly Ser Asn Lys Val 2335 2390 2395 2400 CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT GCA TTG GTT GGG CCT 7243 Arg Arg lie Lys Gin He Ser Val Ser Leu Pro Ala Leu Val Gly Pro 2405 2410 2415 TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT GGC AGT ACT CAA TTG "295 Tyr Gin Asp Val Gin Ala Mec Leu Ser Tyr Giy Giy Ser Thr Gin Leu 2420 2425 2430 CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT GGT ACC AAT GAT AGT "344 Pro Lys Gly Cys Ser Ala Leu Aia Val Ser His Gly Thr Asn Asp Ser 2435 2440 2445 -128- SUBSTTTUTE SHEET (RULE 26) GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA TAC G CCA TTT AA "332 Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Giu 2450 2455 2460 GGT ATT GCT CTT GAT GAT CAG GGT ACA CTG AAT CTT CAA TTT CCG AAT 7440 Gly He Ala Leu Asp Asp Gin Gly Thr Leu Asn Leu Gin Phe Pro Asn 2465 2470 2475 2430 GCT ACC GAC AAG CAG AAA GCA ATA TTG CAA ACT ATG AGC 
GAT ATT ATT 7438 Ala Thr Asp Lys Gin Lys Ala He Leu 'Gin Thr Mec Ser Asp lie He 2485 2490 2495 TTG CAT ATT CGT TAT ACC ATC CGT TAA Leu His lie Arg Tyr Thr He Arg * 2500 2505 (2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH : 2505 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (Xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: Mec Gin Asn Ser Leu Ser Ser Thr He Asp Thr He Cys Gin Lys Leu 1 5 10 15 Gin Leu Thr Cys Pro Ala Glu lie Ala Leu Tyr Pro Phe Asp Thr Phe 20 25 30 Arg Glu Lys Thr Arg Gly MeC Val Asn Trp Gly Glu Ala Lys Arg He 35 40 45 Tyr Glu He Ala Gin Ala Glu Gin Asp Arg Asn Leu Leu His Glu Lys 50 55 60 Arg He Phe Ala Tyr Ala Asn Pro Leu Leu Lys Asn Ala Val Arg Leu 65 70 75 80 Gly. Thr Arg Gin MeC Leu Gly Phe He Gin Gly Tyr Ser Asp Leu Phe 85 90 95 Gly Asn Arg Ala Asp Asn Tyr Ala Ala Pro Gly Ser Val Ala ser Mec 100 105 110 Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asn 115 120 125 Leu His Asp Ser Ser Ser He Tyr Tyr Leu Asp Lys Arg Arg Pro Asp 130 135 140 Leu Ala Ser Leu Mec Leu Ser Gin Lys Asn MeC Asp Glu Glu He Ser 145 150 155 160 Thr Leu Ala Leu Ser Asn Glu Leu Cys Leu Ala Gly He Glu Thr Lys 165 ' 170 175 Thr Gly Lys Ser Gin Asp Glu Val Mec Asp Mec Leu Ser Thr Tyr Arg 180 185 190 129- L-eu Ser Gly Glu Thr Pro Tyr His His Ala Tyr Glu Thr Vai Arg Glu 195 200 205 Val His Glu Arg Asp Pro Gly Phe Arg His Leu Ser Gin Ala Pro 210 215 220 He Val Ala Ala Lys Leu Asp Pro Val Thr Leu Leu Gly lie Ser Ser 225 230 235 240 His He Ser Pro Glu Leu Tyr Asn Leu Leu He Glu Glu He Pro Glu 245 250 255 Lys Asp Glu Ala Ala Leu Asp Thr Leu Tyr Lys Thr Asn Phe Gly Asp 260 265 270 He Thr Thr Ala Gin Leu Met Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr 275 230 285 Gly Val Ser Pro Glu Asp He Ala Tyr Val Thr Thr Ser Leu Ser His 290 295 300 Val Gly Tyr Ser Ser Asp lie Leu Val He Pro Leu Val Asp Gly Val 305 310 315 320 Gly Lys Met Glu Val Val Arg Val Thr Arg Thr Pro Ser Asp Asn Tyr 325 330 335 Thr Ser Gin Thr Asn Tyr He Glu Leu 
Tyr Pro Gin Gly Gly Asp Asn 340 345 350 Tyr Leu He Lys Tyr Asn Leu Ser Asn Ser Phe Gly Leu Asp Asp Phe 355 360 365 Tyr Leu Gin Tyr Lys Asp Gly Ser Ala Asp Trp Thr Glu He Ala His 370 375 380 Asn Pro Tyr Pro Asp Met Val He Asn Gin Lys Tyr Glu Ser Gin Ala 385 390 395 400 Thr He Lys Arg Ser Asp Ser Asp Asn He Leu Ser He Gly Leu Gin 405 410 415 Arg Trp His Ser Gly Ser Tyr Asn Phe Ala Ala Ala Asn Phe Lys He 420 425 430 Asp Gin Tyr Ser Pro Lys Ala Phe Leu Leu Lys Met Asn Lys Ala He 435 440 445 Arg Leu Leu Lys Ala Thr Gly Leu Ser Phe Ala Thr Leu Glu Arg He 450 455 460 Val Asp Ser Val Asn Ser Thr Lys Ser He Thr Val Glu Val Leu Asn 465 470 475 480 Lys Val Tyr Ara Val Lys Phe Tyr He Asp Arg Tyr Gly He Ser Glu 435 490 495 Glu Thr Ala Ala He Leu Ala Asn He Asn He Ser Gin Gin Ala Val 500 505 510 Gly Asn Gin Leu Ser Gin Phe Glu Gin Leu Phe Asn His Pro Pro Leu 515 520 525 Asn Gly He Arg Tyr Glu He Ser Glu Asp Asn Ser Lys His Leu Pro 530 535 540 -130- SUBSTJTUTE SHEET (RULE 26) Asn Pro Asp Leu Asn Leu Lys Pro Asp Ser Thr Gly Asp Asp Gin Arg 545 550 555 560 Lys Ala Val Leu Lys Arg Ala Phe Gin Val Asn Ala Ser Glu Leu Tyr 565 570 575 Gin Mec Leu Leu lie Thr Asp Arg Lys Giu Asp Gly Val lie Lys Asn 530 585 590 Asn Leu Glu Asn Leu Ser Asp Leu Tyr Leu Val Ser Leu Leu Ala Gin 595 600 605 lie His Asn Leu Thr He Ala Glu Leu Asn He Leu Leu Val lie Cys 610 515 620 Gly Tyr Gly Asp Thr Asn lie Tyr Gin He Thr Asp Asp Asn Leu Ala 625 630 635 640 Lys He Val Glu Thr Leu Leu Trp He Thr Gin Trp Leu Lys Thr Gin 645 650 655 Lys Trp Thr Val Thr Asp Leu Phe Leu Mec Thr Thr Ala Thr Tyr Ser 650 665 670 Thr Thr Leu Thr Pro Glu He Ser Asn Leu Thr Ala Thr Leu Ser Ser 675 680 635 Thr Leu His Gly Lys Glu Ser Leu He Gly Glu Asp Leu Lys Arg Ala 690 695 700 Met Ala Pro Cys Phe Thr Ser Ala Leu His Leu Thr Ser Gin Glu Val 705 710 715 720 Ala Tyr Asp Leu Leu Leu Trp He Asp Cln He Gin Pro Ala Gin He 725 730 735 Thr Val Asp Gly Phe Trp Glu Glu Val Gin Thr Thr Pro Thr Ser Leu 740 745 750 Lys Val He Thr Phe Ala Gin Val 
Leu Ala Gin Leu Ser Leu He Tyr 755 760 765 Arg Arg He Gly Leu Ser Glu Thr Glu Leu Ser Leu He Val Thr Gin 770 775 780 Ser Ser Leu Leu Val Ala Gly Lys Ser He Leu Asp His Gly Leu Leu 785 790 795 800 Thr Leu Mec Ala Leu Glu Gly Phe His Thr Trp Val Asn Gly Leu Gly 805 810 815 Gin His Ala Ser Leu He Leu Ala Ala Leu Lys Asp Gly Ala Leu Thr 820 825 830 Val Thr Asp Val Ala Gin Aia Mec Asn Lys Glu Glu Ser Leu Leu Gin 835 840 845 Mec Ala Ala Asn Gin Val Glu Lys Asp Leu Thr Lys Leu Thr Ser Trp 350 355 860 Thr Gin He Asp Ala lie Leu Gin Trp Leu Gin Mec Ser Ser Ala Leu 365 870 375 330 Ala Val Ser Pro Leu Asp Leu Ala Gly Mec Mec Ala Leu Lys Tyr Gly 885 890 895 -131- He Asp His Asn Tyr Aid Ala Trp Gin Ala Ala Ala Ala Ala Leu Mec 900 905 910 Ala Asp His Ala Asn Gin Ala Gin Lys Lys Leu Asp Glu Thr Phe Ser 915 920 925 Lys Ala Leu Cys Asn Tyr Tyr He Asn Ala Val Val Asp Ser Ala Ala 930 935 940 Gly Val Arg Asp Arg Asn Gly Leu Tyr Thr Tyr Leu Leu He Asp Asn 945 950 955 900 Gin Val Ser Ala Asp Val He Thr Ser Arg He Ala Glu Ala He Ala 965 970 975 Gly He Gin Leu Tyr Val Asn Arg Ala Leu Asn Arg Asp Glu Gly Gin 930 985 990 Leu Ala Ser Asp Val Ser Thr Arg Gin Phe Phe Thr Asp Trp Glu Arg 995 1000 i005 Tyr Asn Lys Arg Tyr Sar Thr Trp Ala Gly Val Ser Glu Leu Val Tyr iOlO 1015 1020 Tyr Pro Glu Asn Tyr Val Asp Pro Thr Gin Arg lie Gly Gin Thr Lys 1025 1030 ... . 
1035 1040
Met Met Asp Ala Leu Leu Gln Ser Ile Asn Gln Ser Gln Leu Asn Ala
                1045                1050                1055
Asp Thr Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Ser Phe Glu Gln
            1060                1065                1070
Val Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Val Asn Val
        1075                1080                1085
Asp Gln Gly Leu Thr Tyr Phe Ile Gly Ile Asp Gln Ala Ala Pro Gly
    1090                1095                1100
Thr Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Cys Glu Asn Gly Lys
1105                1110                1115                1120
Phe Ala Ala Asn Ala Trp Gly Glu Trp Asn Lys Ile Thr Cys Ala Val
                1125                1130                1135
Asn Pro Trp Lys Asn Ile Ile Arg Pro Val Val Tyr Met Ser Arg Leu
            1140                1145                1150
Tyr Leu Leu Trp Leu Glu Gln Gln Ser Lys Lys Ser Asp Asp Gly Lys
        1155                1160                1165
Thr Thr Ile Tyr Gln Tyr Asn Leu Lys Leu Ala His Ile Arg Tyr Asp
    1170                1175                1180
Gly Ser Trp Asn Thr Pro Phe Thr Phe Asp Val Thr Glu Lys Val Lys
1185                1190                1195                1200
Asn Tyr Thr Ser Ser Thr Asp Ala Ala Glu Ser Leu Gly Leu Tyr Cys
                1205                1210                1215
Thr Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Ser Met
            1220                1225                1230
Gln Ser Ser Tyr Ser Ser Tyr Thr Asp Asn Asn Ala Pro Val Thr Gly
        1235                1240                1245
Leu Tyr Ile Phe Ala Asp Met Ser Ser Asp Asn Met Thr Asn Ala Gln
1230 1255 1250 Ala Thr Asn Ty Trp Asn Asn Ser T r Pro Gin Phe Asp Thr Val Met 1255 1270 1275 1230 Ala Asp Pro Asp Ser Asp Asn Lys Lys Val lie Thr Arg Arg Val 1235 1290 129 Asn Arg Tyr Ala Glu Asp Tyr Glu lie Pro Ser ser Val Thr Ser 1300 1305 1310 Ser Asn Tyr Ser Trp Giy Asp His Ser Leu Thr Mec Leu Tyr Giy Giy 1315 1320 1325 Ser Val Pro Asn He Thr Phe Glu Ser Ala Ala Glu Asp Leu Arg Leu 1330 1335 1340 Ser Thr Asn Mec Ala Leu Ser lie He His Asn Giy Tyr Ala Giy Thr 1345 1350 1355 1360 Arg Arg He Gin Cys Asn Leu Mec Lys Gin Tyr Ala Ser Leu Giy Asp 1365 1370 1375 Lys Phe He He Tyr Asp Ser Ser Phe Asp Asp Ala Asn Arg Phe Asn 1380 1385 1390 Leu Val Pro Leu Phe Lys Phe Giy Lys Asp Glu Asn Ser Asp Asp Ser 1395 1400 1405 He Cys He Tyr Asn Glu Asn Pro Ser Ser Glu Asp Lys Lys Trp Tyr 1410 1415 1420 Phe Ser Ser Lys Asp Asp Asn Lys Thr Ala Asp Tyr Asn Giy Giy Thr 1425 1430 1435 1440 Gin Cys He Asp Ala Giy Thr Ser Asn Lys Asp Phe Tyr Tyr Asn Leu 1445 1450 1455 Gin Glu He Glu Val He Ser Val Thr Giy Giy Tyr Trp Ser Ser Tyr 1460 1465 1470 Lys He Ser Asn Pro He Asn He Asn Thr Giy He Asp Ser Ala Lys 1475 1480 1485 Val Lys Val Thr Val Lys Ala Giy Giy Asp Asp Gin He Phe Thr Ala 1490 1495 1500 Asp Asn Ser Thr Tyr Val Pro Gin Gin Pro Ala Pro Ser Phe Glu Glu 1505 1510 1515 1520 Mec He Tyr Gin Phe Asn Asn Leu Thr He Asp Cys Lys Asn Leu Asn 1525 1530 1535 Phe He Asp Asn Gin Ala His He Glu He Asp Phe Thr Ala Thr Ala 1540 1545 1550 Gin Asp Giy Arg Phe Leu Giy Ala Glu Thr Phe He He Pro Val Thr 1555 1560 1565 Lys Lys Val Leu Giy Thr Glu Asn Val He Ala Leu Tyr Ser Glu Asn 1570 1575 1580 Asn Giy Val Gin Tyr Mec Gin He Giy Ala Tyr Arg Thr Arg Leu Asn 1535 1590 1595 1600 133 - Thr Leu Phe Aia Gin Gin Leu Val Ser Arg Ala Asn Arg Gly lie Asp 1605 1610 1615 Ala Val Leu 3er Mec Glu Thr Gin Asn lie Gin Glu Pro Gin Leu Gly 1529 1625 1630 Ala Gly Thr T r Val Gin Leu Val Leu Asp Lys Tyr Asp Glu Ser lie 1635 1640 1645 His Gly Thr Asn Lys Ser Phe Ala lie Glu Tyr Val Asp lie Phe Lys 1650 1655 1660 Glu Asn Asp 
Ser Phe Val lie Tyr Gin Gly Glu Leu Ser Glu Thr Ser 1665 1670 1675 isao Gin Thr Val Val Lys Val Phe Leu Ser Tyr Phe He Glu Ala Thr Gly 1685 1690 1595 Asn Lys Asn His Leu Trp Val Arg Ala Lys Tyr Gin Lys Glu Thr Thr 1700 1705 1710 Asp Lys He Leu Phe Asp Arg Thr Asp Glu Lys Asp Pro His Gly Trp 1715 1720 1725 Phe Leu Ser Asp Asp His Lys Thr Phe Ser Gly Leu Ser Ser Ala Gin 1730 1735 1740 Ala Leu Lys Asn Asp Ser Glu Pro Mec Asp Phe Ser Gly Ala Asn Ala 1745 1750 1755 1760 Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro Mec Mec Mec Ala His 17S5 1770 1775 Arg Leu Leu Gin Glu Gin Asn Phe Asp Ala Ala Asn His Trp Phe Arg 1780 1785 1790 Tyr Val Trp Ser Pro Ser Gly Tyr lie Val Asp Gly Lys lie Ala lie 1795 1800 1805 Tyr His Trp Asn Val Arg Pro Leu Glu Glu Asp Thr Ser Trp Asn Ala 1810 1815 1320 Gin Gin Leu Asp Ser Thr Asp Pro Asp Ala Val Ala Gin Asp Asp Pro 1825 1830 1835 1340 Mec His Tyr Lys Val Ala Thr Phe Mec Ala Thr Leu Asp Leu Leu Mec 1845 1850 1855 Ala Arg Gly Asp Ala Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Ala 1860 1365 1370 Glu Ala Lys Mec Trp Tyr Thr Gin Ala Leu Asn Leu Leu Gly Asp Glu 1875 1880 1885 Pro Gin Val Mec Leu Ser Thr Thr Trp Ala Asn Pro Thr Leu Gly Asn 1890 1895 1900 Ala Ala Ser Lys Thr Thr Gin Gin Val Arg Gin Gin Val Leu Thr Gin 1905 1910 1915 1920 Leu Arg Leu Asn Ser Arg Val Lys Thr Pro Leu Leu Gly Thr Ala Asn 1925 1930 1935 Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn Ser Lys Leu Lys Gly 1940 1945 1950 134- yr Trp Arg Thr Leu Aia G Ln Arg Mec Phe Asn Leu A a His Asn Leu L955 1350 196? 
er He Asp Gly Gin Pro Leu Ser Leu Pro Leu Tyr Ala Lys Pro Ala 1970 1975 1980 .sp Pro Lys Ala Leu Leu Ser Ala Ala Val Ser Ala Ser G Ln Gly Gly 335 1990 1995 2000 la Asp Leu Pro Lys Ala Pro Leu Thr He His Arg Phe Pro Gin Mec 2005 2010 2015 Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu He Gin Phe Gly Ser 2020 2025 2030 Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala Glu Ala Me Ser Gin 2035 2040 2045 Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu Thr Ser He Arg Mec 2050 2055 2060 Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu Lys Thr Ala Leu Gin 2065 2070 2075 2080 Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp Ser Tyr Ser Gin Leu 2085 2090 2095 Tyr Glu Glu Asn He Asn Ala Gly Glu Gin Arg Ala Leu Ala Leu Arg 2100 2105 2110 Ser Glu Ser Ala He Glu Ser Gin Gly Ala Gin He Ser Arg Mec Ala 2115 2120 2125 Gly Ala Gly Val Asp Mec Ala Pro Asn He Phe Gly Leu Ala Asp Gly 2130 2135 2140 Gly Mec His Tyr Gly Ala He Ala Tyr Ala He Ala Asp Gly He Glu 2145 2150 2155 2160 Leu Ser Ala Ser Ala Lys Mec Val Asp Ala Glu Lys Val Ala Gin Ser 2165 2170 2175 Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys He Gin Arg Asp Asn 2180 2185 2190 Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin Leu Glu Ser Leu Ser 2195 2200 2205 He Arg Arg Glu Ala Ala Glu Me Gin Lys Glu Tyr Leu Lys Thr Gin 2210 2215 2220 Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu Arg Ser Lys Phe Ser 2225 2230 2235 2240 Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser Gly He Tyr 2245 2250 2255 Phe Gin Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Mec Ala Glu Gin 2260 2265 2270 Ser Tyr Gin Trp Glu Ala Asn Asp Asn Ser He Ser Phe Val Lys Pro 2275 2230 2285 Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu Cys Gly Glu Ala Leu 2290 2295 2300 135- SUBST1TUTE SHEET (RULE 26) lie Gin Asn Leu Aid Gin Mec Glu Glu Ala Tyr Leu Lys Trp Glu 3=r 2305 2310 2315 232 Arg Ala Leu Glu Val Glu Arg Thr 7a1 3er Leu Ala Val Val Tyr Asp 2325 2330 2335 3er Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala Glu Gin lie Pro Ala 2340 2345 2350 Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr Lys Glu Asn Gly Leu 2355 
2360 2365
Ser Leu Ala Asn Ala Ile Leu Ser Ala Ser Val Lys Leu Ser Asp Leu
    2370                2375                2380
Lys Leu Gly Thr Asp Tyr Pro Asp Ser Ile Val Gly Ser Asn Lys Val
2385                2390                2395                2400
Arg Arg Ile Lys Gln Ile Ser Val Ser Leu Pro Ala Leu Val Gly Pro
                2405                2410                2415
Tyr Gln Asp Val Gln Ala Met Leu Ser Tyr Gly Gly Ser Thr Gln Leu
            2420                2425                2430
Pro Lys Gly Cys Ser Ala Leu Ala Val Ser His Gly Thr Asn Asp Ser
        2435                2440                2445
Gly Gln Phe Gln Leu Asp Phe Asn Asp Gly Lys Tyr Leu Pro Phe Glu
    2450                2455                2460
Gly Ile Ala Leu Asp Asp Gln Gly Thr Leu Asn Leu Gln Phe Pro Asn
2465                2470                2475                2480
Ala Thr Asp Lys Gln Lys Ala Ile Leu Gln Thr Met Ser Asp Ile Ile
                2485                2490                2495
Leu His Ile Arg Tyr Thr Ile Arg *
            2500                2505

(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Xaa Ala
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
Met Gln Asn Ser Gln Thr Phe Ser Val Gly Glu Leu
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
Ala Gln Asp Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Gln Asn Ser Leu
1               5

(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE
DESCRIPTION: SEQ ID NO: 17:
Ala Phe Asn Ile Asp Asp Val Ser Leu Phe
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile Gly Ser
1               5                   10                  15
Leu Gln Leu Phe Ile
            20

(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Met Tyr Tyr Ile Gln Ala Gln Gln Leu Leu Gly
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
Gly Ile Asp Ala Val Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro
1               5                   10                  15
Gln Leu Gly Ala Gly Thr Tyr Val Gln Leu
            20                  25

(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Ile Ser Asn Pro Ile Asn Ile Asn Thr Gly Ile Asp Ser Ala Lys
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
Thr Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
Val Leu Gly Thr Glu Asn Val Ile Ala Leu Tyr Ser Glu Asn Asn Gly
1               5                   10                  15
Val Gln Tyr Met Gln Ile
            20

(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5005 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: RBS
(B) LOCATION: 1..9
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 16..3585
(D) OTHER INFORMATION: /product= "P9"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
AAGAAGGAAT TGATT ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA  51
                Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu
                1               5                   10
GCG CGC CGT GAT GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC  99
Ala Arg Arg Asp Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro
        15                  20                  25
GCA GAT TTA AAA GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT  147
Ala Asp Leu Lys Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr
    30                  35                  40
CTG TTG CTG GAT ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG  195
Leu Leu Leu Asp Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu
45                  50                  55
60 TCC GAA GCG ATT GGC AGT CTG CAA TTG TTT ATT CAT CGT GCG ATA GAG 243 Ser Glu Ala He Gly Ser Leu Gin Leu Phe lie His Arg Ala He Glu 65 70 75 GGC TAT GAC GGC ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT 291 Gly Tyr Asp Gly Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp 80 35 90 GAA CAG TTT TTA TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT 3 9 Glu Gin Phe Leu Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr 95 100 105 TGG GCT GGC AAG GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT 337 Trp Ala Gly Lys Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp 110 115 120 CCA ACA TTG CGA TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA 435 Pro Thr Leu Arg Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin 125 130 135 140 -140- SUBSTmJTE SHEET(RULE 26) 18003 GCT ATT TCT CAA GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA 432 Gly lie Ser Gin Gly Lys Leu Lys Ser GLu Lsu Val Glu Ser Lys Leu 145 150 155 COT GAT TAT CTA ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT 531 Arg Asp Tyr Leu lie Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He 160 165 170 ACT GCC TGC CAA GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT 579 Thr Ala Cys Gin Gly Lys Asp Asn Lys Thr He Phe Phe He Gly Arg 175 180 185 ACA CAG AAT GCA CCC TAT GCA TTT TAT TCG CGA AAA TTA ACT TTA GTC 627 Thr Gin Asn Ala Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val 190 195 200 ACT GAT GGC GGT AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA 6" Thr Asp Gly Gly Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala 205 210 215 220 ATT AAT GCC GGG ATT AGT GAG GCA TAT TCA GGG CAT GTC GAG CCT TTC 723 lie Asn Ala Gly He Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe 225 230 235 TGG GAA AAT AAC AAG CTG CAC ATC CGT TGG TTT ACT ATC TCG AAA GAA 771 Trp Glu Asn Asn Lys Leu His He Arg Trp Phe Thr He Ser Lys Glu 240 245 250 GAT AAA ATA GAT TTT GTT TAT AAA AAC ATC TGG GTG ATG AGT AGC GAT 819 Asp Lys He Asp Phe Val Tyr Lys Asn He Trp Val Met Ser Ser Asp 255 260 265 TAT AGC TGG GCA TCA AAG AAA AAA ATC TTG GAA CTT TCT TTT ACT GAC 367 Tyr Ser Trp Ala Ser Lys Lys Lys 
He Leu Glu Leu Ser Phe Thr Asp 270 275 280 TAC AAT AGA GTT GGA GCA ACA GGA TCA TCA AGC CCG ACT GAA GTA GCT 915 Tyr Asn Arg Val Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala 285 290 295 300 TCA CAA TAT GGT TCT GAT GCT CAG ATG AAT ATT TCT GAT GAT GGG ACT 963 Ser Gin Tyr Gly Ser Asp Ala Gin Mec Asn He Ser Asp Asp Gly Thr 305 310 315 GTA CTT ATT TTT CAG AAT GCC GGC GGA GCT ACT CCC AGT ACT GGA GTG 1011 Val Leu He Phe Gin Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val 320 325 330 ACG TTA TGT TAT GAC TCT GGC AAC GTG ATT AAG AAC CTA TCT AGT ACA 1059 Thr Leu Cys Tyr Asp Ser Gly Asn Val He Lys Asn Leu Ser Ser Thr 335 340 345 GGA AGT GCA AAT TTA TCG TCA AAG GAT TAT GCC ACA ACT AAA TTA CGC 1107 Gly Ser Ala Asn Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg 350 355 360 ATG TGT CAT GGA CAA AGT TAC AAT GAT AAT AAC TAC TGC AAT TTT ACA 1155 Mec Cys His Gly Gin Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr 365 370 375 330 CTC TCT ATT AAT ACA ATA GAA TTC ACC TCC TAC GGC ACA TTC TCA TCA 1203 Leu Ser He Asn Thr He Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser 385 390 395 GAT GGA AAA CAA TTT ACA CCA CCT TCT GGT TCT GCC ATT GAT TTA CAC 1251 Asp Gly Lys Gin Phe Thr Pro Pro Ser Gly Ser Ala He Asp Leu His -141- 400 405 410 CTC CCT AAT TAT GTA GAT CTC AAC GCG CTA TTA GAT ATT AGC CTC GAT 129? 
Leu Pro Asn Tyr Val Asp Leu Asn Ala Leu Leu Asp lie 3er Leu Asp 415 420 425 TCA CTA CTT AAT TAT GAC CTT CAG GGG CAG TTT GGC GGA TCT AAT CCG 1347 Ser Leu Leu Asn Tyr Asp Val Gin Gly Gin Phe Gly Gly Ser Asn Pro 430 435 440 GTT GAT AAT TTC AGT GGT CCC TAT GGT ATT TAT CTA TGG GAA ATC TTC 1335 Val Asp Asn Phe Ser Gly Pro Tyr Gly lie Tyr Leu Trp Glu lie Phe 445 450 455 460 TTC CAT ATT CCG TTC CTT GTT ACG GTC CGT ATG CAA ACC GAA CAA CGT 1443 Phe His lie Pro Phe Leu Val Thr Val Arg Met Gin Thr Glu Gin Arg 465 470 475 TAC GAA GAC GCG GAC ACT TGG TAC AAA TAT ATT TTC CGC AGC GCC GGT 1431 Tyr Glu Asp Ala Asp Thr Trp Tyr Lys Tyr He Phe Arg Ser Ala Gly 480 435 490 TAT CGC GAT GCT AAT GGC CAG CTC ATT ATG GAT GGC AGT AAA CCA CGT 1539 Tyr Arg Asp Ala Asn Gly Gin Leu He MeC Asp Gly Ser Lys Pro Arg 495 500 505 TAT TGG AAT GTG ATG CCA TTG CAA CTG GAT ACC GCA TGG GAT ACC ACA 1537 Tyr Trp Asn Val Mec Pro Leu Gin Leu Asp Thr Ala Trp Asp Thr Thr 510 515 520 CAG CCC GCC ACC ACT GAT CCA GAT GTG ATC GCT ATG GCG GAC CCG ATG 1535 Gin Pro Ala Thr Thr Asp Pro Asp Val He Ala Mec Ala Asp Pro Mec 525 530 535 540 CAT TAC AAG CTG GCG ATA TTC CTG CAT ACC CTT GAT CTA TTG ATT GCC 1633 His Tyr Lys Leu Ala lie Phe Leu His Thr Leu Asp Leu Leu He Ala 545 550 555 CGA GGC GAC AGC GCT TAC CGT CAA CTT GAA CGC GAT ACT CTA GTC GAA 1731 Arg Gly Asp Ser Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Val Glu 560 565 570 GCC AAA ATG TAC TAC ATT CAG GCA CAA CAG CTA CTG GGA CCG CGC CCT 1779 Ala Lys Me Tyr Tyr He Gin Ala Gin Gin Leu Leu Gly Pro Arg Pro 575 530 585 GAT ATC CAT ACC ACC AAT ACT TGG CCA AAT CCC ACC TTG AGT AAA GAA 1327 Asp He His Thr Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu 590 595 600 GCT GGC GCT ATT GCC ACA CCG ACA TTC CTC AGT TCA CCG GAG GTG ATG 137 Ala Gly Ala He Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Mec 605 610 615 620 ACG TTC GCT GCC TGG CTA AGC GCA GGC GAT ACC GCA AAT ATT GGC GAC 1923 Thr Phe Ala Ala Trp Leu Ser Ala Gly Asp Thr Ala Asn He Gly Asp 625 630 635 GGT GAT TTC TTG CCA CCG TAC AAC GAT 
GTA CTA CTC GGT TAC TGG GAT 1971 Gly Asp Phe Leu Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp 640 645 650 AAA CTT GAG TTA CGC CTA TAC AAC CTG CGC CAC AAT CTG AGT CTG GAT 2013 Lys Leu Glu Leu Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp 655 660 665 -142- SUBSTmjTE SHEET(RULE 26) 3GT CAA CCG CTA AAT C G CTG TAT GCC ACG CCG CTA GAC CCG AAA Giy Gin Pro Leu Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys 675 680 ACC CTG CAA CGC CAG CAA GCC GGA GGG GAC OCT ACA GGC AGT ACT CCG 2115 T -r Leu Gin Arg Gin Gin A a Gly Giy Asp G y Thr Gly Ser 3er Pro 635 90 695 700 GCT GGT GGT CAA GGC AGT GTT CAG GGC TGG CGC TAT CCG TTA TTG GTA 2153 Ala Gly Gly Gin Gly Ser Val Gin Gly Trp Arg Tyr Pro Leu Leu Val 705 710 715 GAA CGC GCC CGC TCT GCC GTG AGT TTG TTG ACT CAG TTC GGC AAC AGC 2211 Glu Arg Ala Arg Ser Ala Val Ser Leu Leu Thr Gin Phe Gly Asn Ser 720 .725 730 TTA CAA ACA ACG TTA GAA CAT CAG GAT AAT GAA AAA ATG ACG ATA CTG 2259 Leu Gin Thr Thr Leu Glu His Gin Asp Asn Glu Lys Met Thr lie Leu 735 740 745 TTG CAG ACT CAA CAG GAA GCC ATC CTG AAA CAT CAG CAC GAT ATA CAA 2307 Leu Gin Thr Gin Gin Glu Ala lie Leu L s His Gin His Asp lie Gin 750 755 760 CAA AAT AAT CTA AAA GGA TTA CAA CAC AGC CTG ACC GCA TTA CAG GCT 2355 Gin Asn Asn Leu Lys Gly Leu Gin His Ser Leu Thr Ala Leu Gin Ala 765 770 775 "30 AGC CGT GAT GGC GAC ACA TTG CGG CAA AAA CAT TAC AGC GAC CTG ATT 2403 Ser Arg Asp Gly Asp Thr Leu Arg Gin Lys His Tyr Ser Asp Leu lie 785 790 795 AAC GGT GGT CTA TCT GCG GCA GAA ATC GCC GGT CTG ACA CTA CGC AGC Asn Gly Gly Leu Ser Ala Ala Glu He Ala Gly Leu Thr Leu Arg Ser 800 805 810 ACC GCC ATG ATT ACC AAT GGC GTT GCA ACG GGA TTG CTG ATT GCC GGC 2499 Thr Ala Met He Thr Asn Gly Val Ala Thr Gly Leu Leu He Ala Gly 815 820 825 GGA ATC GCC AAC GCG GTA CCT AAC GTC TTC GGG CTG GCT AAC GGT GGA 2547 Gly lie Ala Asn Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly 830 835 840 TCG GAA TGG GGA GCG CCA TTA ATT GGC TCC GGG CAA GCA ACC CAA GTT 2595 Ser Glu Trp Gly Ala Pro Leu He Gly Ser Gly Gin Ala Thr Gin Val 
845 850 855 860 GGC GCC GGC ATC CAG GAT CAG AGC GCG GGC ATT TCA GAA GTG ACA GCA 2643 Gly Ala Gly He Gin Asp Gin Ser Ala Gly He Ser Glu Val Thr Ala 865 870 875 GGC TAT CAG CGT CGT CAG GAA GAA TGG GCA TTG CAA CGG GAT ATT GCT 2691 Gly Tyr Gin Arg Arg Gin Glu Glu Trp Ala Leu Gin Arg Asp He Ala 880 885 890 GAT AAC GAA ATA ACC CAA CTG GAT GCC CAG ATA CAA AGC CTG CAA GAG 2739 Asp Asn Glu He Thr Gin Leu Asp Ala Gin He Gin Ser Leu Gin Glu 895 900 905 CAA ATC ACG ATG GCA CAA AAA CAG ATC ACG CTC TCT GAA ACC GAA CAA 2737 Gin He Thr Mec .-.la Gin Lys Gin He Thr Leu Ser Glu Thr Glu Gin 910 915 920 CCG AAT GCC CAA GCC ATT TAT CAC CTG CAA ACC ACT CGT TTT ACC GGG 2335 Ala Asn Ala Gin Ala He Tyr Asp Leu Gin Thr Thr Arg Phe Thr Gly ■143- SUBSTITUTE SHEET (RULE 26 325 ■J 30 935 54C CAG GCA CTG TAT AAC TGG ATG GCC GGT CGT CTC TCC GCG CTC TAT TAC 233: Gin Ala Leu Tyr Asn Trp Mec Ala Gl Arg Leu Ser Ala Leu Tyr Tyr 945 950 955 CAA ATG TAT GAT TCC ACT CTG CCA ATC TGT CTC CAG CCA AAA GCC GCA 2931 Gin Mec Tyr Asp Ser Thr Leu Pro lie Cys Leu Gin Pro Lys Ala Ala 960 965 970 TTA GTA CAG GAA TTA GGC GAG AAA GAG AGC GAC AGT CTT TTC CAG GTT 2375 Leu Val Gin Glu Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gin Val 975 980 9a5 CCG GTG TGG AAT GAT CTG TGG CAA GGG CTG TTA GCA GGA GAA GGT TTA 3027 Pro Val Trp Asn Asp Leu Trp Gin Gly Leu Leu Ala Gly Glu Gly Leu 990 995 1000 AGT TCA GAG CTA CAG AAA CTG GAT GCC ATC TGG CTT GCA CGT GGT GGT 3075 Ser Ser Glu Leu Gin Lys Leu Asp Ala lie Trp Leu Ala Arg Gly Gly 1005 1010 101 1020 ATT GGG CTA GAA GCC ATC CGC ACC GTG TCG CTG GAT ACC CTG TTT GGC 3123 lie Gly Leu Glu Ala lie Arg Thr Val Ser Leu Asp Thr Leu Phe Gly 102' 1030 1035 ACA GGG ACG TTA AGT GAA AAT ATC AAT AAA GTG CTT AAC GGG GAA ACG 3171 Thr Gly Thr Leu Ser Glu Asn lie Asn Lys Val Leu Asn Gly Glu Thr 1040 1045 1050 GTA TCT CCA TCC GGT GGC GTC ACT CTG GCG CTG ACA GGG GAT ATC TTC 3219 Val Ser Pro Ser Gly Gly Val Thr Leu Ala Leu Thr Gly Asp He Phe 1055 1060 1065 CAA GCA ACA CTG GAT TTG AGT CAG CTA GGT TTG GAT AAC 
TCT TAC AAC 3267 Gin Ala Thr Leu Asp Leu Ser Gin Leu Gly Leu Asp Asn Ser Tyr Asn 1070 1075 1080 TTG GGT AAC GAG AAG AAA CGT CGT ATT AAA CGT ATC GCC GTC ACC CTG 3315 Leu Gly Asn Glu Lys Lys Arg Arg He Lys Arg He Ala Val Thr Leu 1085 1090 1095 1100 CCA ACA CTT CTG GGG CCA TAT CAA GAT CTT GAA GCC ACA CTG GTA ATG 3363 Pro Thr Leu Leu Gly Pro Tyr Gin Asp Leu Glu Ala Thr Leu Val Mec 1105 1110 1115 GGT GCG GAA ATC GCC GCC TTA TCA CAC GGT GTG AAT GAC GGA GGC CGG 3411 Gly Ala Glu He Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg 1120 1125 1130 TTT GTT ACC GAC TTT AAC GAC AGC CGT TTT CTG CCT TTT GAA GGT CGA 3459 Phe Val Glu Gly Arg GAT GCA ACA ACC GGC ACA CTG GAG CTC AAT ATT TTC CAT GCG GGT AAA 3507 Asp Ala Thr Thr Gly Thr Leu Glu Leu Asn lie Phe His Ala Gly Lys 1150 1155 1160 GAG GGA ACG CAA CAC GAG TTG GTC GCG AAT CTG AGT GAC ATC ATT GTG 3555 Glu Gly Thr Gin His Glu Leu Val Ala Asn Leu Ser Asp He He Val 1165 1170 1175 1180 CAT CTG AAT TAC ATC ATT CGA GAC GCG TAA ATTTCTTTTC TTTGTCGATT 3605 His Leu Asn Tyr He He Arg Asp Ala 1185 1190 -144- SUBSTJTUTE SHEET (RULE 26) AC.-.GGTCC T ATCAGGGGCC TGTTATTAAG GAGTACT A TCCAGGAT C AC GAAGTA 3 6 6 5 TCGAT ACAA CGCTGTCACT TCC AAAGGT GGCGGTGCTA TCAATGGCAT GGGAGAAGCA 3 ~ 2 : CTGAATGCTG CCGGCCCTGA TGGAATGGCC TCCCTATCTC TGCCAT ACC CCTTTCGACC 3 73 5 GGCAGAGGGA CGGCTCCTGG ATTATCGCTG AT TACAGCA ACAGTGCAGG TAATGGCCCT 3 34 5 TTCGGCATCG GCTGGCAATG CGGTGT ATG TCCATTAGCC GACGCACCCA ACATGGCAT 3 9 05 CCACAATACG GTAATGACGA ACGTTCCTA TCCCCACAAG GCGAGGTCAT GAATATCCCC 3 9 6 5 CTGAATGACC AAGGGCAACC TGATATCCGT CAAGACGT A AAACGCTGCA AGGCGTTACC 4 02 5 TTGCCAATTT CCTATACCGT GACCCGCTAT CAAGCCCGCC AGATCCTGGA TTTCAGTAAA 403 5 ATCGAATACT GGCAACCTGC CTCCGGTCAA GAAGGACGCG CT TCTGGCT GATATCGACA 14 5 CCGGACGGGC ATCTACACAT CTTAGGGAAA ACCGCGCAGG CT GTCTGGC AAATCCGCAA 4205 AATGACCAAC AAATCGCCCA GTGGTTGCTG GAAGAAACTG TGACGCCAGC CGGTGAACAT 42 5 5 GTCAGCTATC AATATCGAGC CGAAGATGAA GCCCATTGTG ACGACAATGA AAAAACCGCT 4 3 25 CATCCCAATG TTACCGCACA GCGCTATCTG GTACAGGTGA 
ACTACAGGCA ACATCAAACC 43 3 5 ACAAGCCAGC CTGTTCGTAC TGGATAACGC ACCTCCCGCA CCGGAAGAGT GGCTGT CA 444 5 TCTGGTCTTT GACCACGGTG AGCGCGTACC TCACT CATA CCGTGCCAAC ATGGGATGCA 4 5 05 GGTACAGCGC AATGGTCTGT ACGCCCGGAT ATCT CTCTC GCTATGAATA TGGTT GAA 4565 GTGCGTACTC GCCGCTTATG TCAACAAGTG CTGATGTTTC ACCGCACCGC GCTCATGGCC 4 625 GGAGAAGCCA GTACCAATGA CGCCCCGGAA CTGGTTGCAC GCTTAATACT GGAATATGAC 4 58 5 AAAAACGCCA GCGTCACCAC GTTGAT ACC ATCCGTCAAT TAAGCCATGA ATCGGACGGG 474 5 AGGCCAGTCA CCCAGCCACC ACTAGAACTA GCCTGCCAAC GGT TGATCT GGAGAAAATC 480 5 CCGACATGGC AACGCTTTGA CGCACTAGAT AAT TAACT CGCAGCAACG TTATCAACTG 43 65 GTTGATCTGC GGGGAGAAGG GTTGCCAGGT ATGCTGTATC AAGATCGAGG CGCT GGTGG 4925 TATAAAGCTC CGCAACGTCA GGAAGACGGA GACAGCAATG CCGTCACTTA CGACAAAATC 493 5 GCCCCACTGC CTACCCTACC CAATTTGCAG GATAATGCCT CATTGA GGA TATCAACGGA 504 5 GACGGCCAAC TGGATTGGGT TGTTACCGCC TCCGGTATTC GCGGATACCA TAGTCAGCAA 5 105 CCCGATGGAA AGTGGACGCA CTTTACGCCA ATCAATGCCT TGCCCGTGGA ATATTTTCAT 5 1 6 5 CCAAGCATCC AGTTCGCTGA CCTTACCGGG GCAGGCTTAT CTGATTTAGT GTTGATCGGG 5225 CCGAAAAGCG TGCGTCTATA TGCCAACCAG CGAAACGGCT GGCGTAAAGG AGAAGATGTC 5 235 CCCCAATC A CAGGTATCAC CCTGCCTGTC ACAGGCACCG ATGCCCGCAA ACTGGTGGCT 53 45 TTCAGTGATA TGCTCGGTTC CGGTCAACAA CATCTGGTGG AAATCAAGGG TAATCGCGTC 54 05 ACCTGTTGGC CGAATCTAGG GCATGGCCGT TTCGGTCAAC CACTAACTCT GTCAGGATTT 5 4 6 5 AGCCAGCCCG AAAATAGCTT CAATCCCGAA CGGCTGTTTC TGGCGGATAT CGACGGCTCC 5 52 5 GGCACCACCG ACCTTATCTA TGCGCAATCC GGCTCTTTGC TCATTTATCT CAACCAAAGT 5 53 5 - 14 5 - GGTAATCAGT TTGATGCCCC OTTGACATTA GCGTTGCCAG AAGGCGTACA ATTTCACAA ^ 5 ACTTGCCAAC TTCAAGTCGC C ATATTCAG GGATTAGGGA TAGCCAGCTT GATTCTGACT "05 GTGCCACATA TCGCGCCACA TCACTGGCGT TGTGACCTGT CACTGACCAA ACCCTCGTTG 5"55 TTGAATGTAA TGAACAATAA CCGGGGCGCA CATCACACGC TACATTATCG TAGTTC GCG 5325 CAATTCTGGT TGGATGAAAA ATTACAGCTC ACCAAAGCAG GCAAATCTCC GGCTTGTTAT 5335 CTGCCGTTTC CAATGCATTT GCTATGGTAT ACCGAAATTC AGGATGAAAT CAGCGGCAAC 5345 CGGCTCACCA GTGAAGTCAA CTACAGCCAC GGCGTCTGGG ATGGTAAAGA GCGGGAATTC 5005 (2) INFORMATION FOR SEQ 
ID NO: 26: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1190 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: Met Ser Glu Ser Leu Phe Thr Gln Thr Leu Lys Glu Ala Arg Arg Asp 1 5 10 15 Ala Leu Val Ala His Tyr Ile Ala Thr Gln Val Pro Ala Asp Leu Lys 20 25 30 Glu Ser Ile Gln Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 35 40 45 Thr Lys Ile Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala Ile 50 55 60 Gly Ser Leu Gln Leu Phe Ile His Arg Ala Ile Glu Gly Tyr Asp Gly 65 70 75 80 Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gln Phe Leu 85 90 95 Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 100 105 110 Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr Ile Asp Pro Thr Leu Arg 115 120 125 Leu Asn Lys Thr Glu Ile Phe Thr Ala Phe Glu Gln Gly Ile Ser Gln 130 135 140 Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 145 150 155 160 Ile Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr Ile Thr Ala Cys Gln 165 170 175 Gly Lys Asp Asn Lys Thr Ile Phe Phe Ile Gly Arg Thr Gln Asn Ala 180 185 190 Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 195 200 205 Lys Leu Lys Pro Asp Gln Trp Ser Glu Trp Arg Ala Ile Asn Ala Gly 210 215 220 Ile Ser Glu Ala Tyr Ser Gly His Val Glu Pro Phe Trp Glu Asn Asn 225 230 235 240 Lys Leu His Ile Arg Trp Phe Thr Ile Ser Lys Glu Asp Lys Ile Asp 245 250 255 Phe Val Tyr Lys Asn Ile Trp Val Met Ser Ser Asp Tyr Ser Trp Ala 260 265 270 Ser Lys Lys Lys Ile Leu Glu Leu Ser Phe Thr Asp T r Asn Arg Val 275 280 285 Gly Ala Thr Gly Ser Ser Ser Pro Thr Glu Val Ala Ser Gln Tyr Gly 290 295 300 Ser Asp Ala Gln Met Asn Ile Ser Asp Asp Gly Thr Val Leu Ile Phe 305 310 315 320 Gln Asn Ala Gly Gly Ala Thr Pro Ser Thr Gly Val Thr Leu Cys Tyr 325 330 335 Asp Ser Gly Asn Val Ile Lys Asn Leu Ser Ser Thr Gly Ser Ala Asn 340 345 350 Leu Ser Ser Lys Asp Tyr Ala Thr Thr Lys Leu Arg Met Cys His Gly 355 360 365 Gln Ser Tyr Asn Asp Asn Asn Tyr Cys Asn Phe Thr Leu Ser Ile Asn
370 375 380 Thr Ile Glu Phe Thr Ser Tyr Gly Thr Phe Ser Ser Asp Gly Lys Gln 385 390 395 400 Phe Thr Pro Pro Ser Gly Ser Ala Ile Asp Leu His Leu Pro Asn Tyr 405 410 415 Val Asp Leu Asn Ala Leu Leu Asp Ile Ser Leu Asp Ser Leu Leu Asn 420 425 430 Tyr Asp Val Gln Gly Gln Phe Gly Gly Ser Asn Pro Val Asp Asn Phe 435 440 445 Ser Gly Pro Tyr Gly Ile Tyr Leu Trp Glu Ile Phe Phe His Ile Pro 450 455 460 Phe Leu Val Thr Val Arg Met Gln Thr Glu Gln Arg Tyr Glu Asp Ala 465 470 475 480 Asp Thr Trp Tyr Lys Tyr Ile Phe Arg Ser Ala Gly Tyr Arg Asp Ala 485 490 495 Asn Gly Gln Leu Ile Met Asp Gly Ser Lys Pro Arg Tyr Trp Asn Val 500 505 510 Met Pro Leu Gln Leu Asp Thr Ala Trp Asp Thr Thr Gln Pro Ala Thr 515 520 525 Thr Asp Pro Asp Val Ile Ala Met Ala Asp Pro Met His Tyr Lys Leu 530 535 540 Ala Ile Phe Leu His Thr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ser 545 550 555 560 Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Val Glu Ala Lys Met Tyr 565 570 575 Tyr Ile Gln Ala Gln Gln Leu Leu Gly Pro Arg Pro Asp Ile His Thr 580 585 590 Thr Asn Thr Trp Pro Asn Pro Thr Leu Ser Lys Glu Ala Gly Ala Ile 595 600 605 Ala Thr Pro Thr Phe Leu Ser Ser Pro Glu Val Met Thr Phe Ala Ala 610 615 620 Trp Leu Ser Ala Gly Asp Thr Ala Asn Ile Gly Asp Gly Asp Phe Leu 625 630 635 640 Pro Pro Tyr Asn Asp Val Leu Leu Gly Tyr Trp Asp Lys Leu Glu Leu 645 650 655 Arg Leu Tyr Asn Leu Arg His Asn Leu Ser Leu Asp Gly Gln Pro Leu 660 665 670 Asn Leu Pro Leu Tyr Ala Thr Pro Val Asp Pro Lys Thr Leu Gln Arg 675 680 685 Gln Gln Ala Gly Gly Asp Gly Thr Gly Ser Ser Pro Ala Gly Gly Gln 690 695 700 Gly Ser Val Gln Gly Trp Arg Tyr Pro Leu Leu Val Glu Arg Ala Arg 705 710 715 720 Ser Ala Val Ser Leu Leu Thr Gln Phe Gly Asn Ser Leu Gln Thr Thr 725 730 735 Leu Glu His Gln Asp Asn Glu Lys Met Thr Ile Leu Leu Gln Thr Gln 740 745 750 Gln Glu Ala Ile Leu Lys His Gln His Asp Ile Gln Gln Asn Asn Leu 755 760 765 Lys Gly Leu Gln His Ser Leu Thr Ala Leu Gln Ala Ser Arg Asp Gly 770 775 780 Asp Thr Leu Arg Gln Lys His Tyr Ser
Asp Leu Ile Asn Gly Gly Leu 785 790 795 800 Ser Ala Ala Glu Ile Ala Gly Leu Thr Leu Arg Ser Thr Ala Met Ile 805 810 815 Thr Asn Gly Val Ala Thr Gly Leu Leu Ile Ala Gly Gly Ile Ala Asn 820 825 830 Ala Val Pro Asn Val Phe Gly Leu Ala Asn Gly Gly Ser Glu Trp Gly 835 840 845 Ala Pro Leu Ile Gly Ser Gly Gln Ala Thr Gln Val Gly Ala Gly Ile 850 855 860 Gln Asp Gln Ser Ala Gly Ile Ser Glu Val Thr Ala Gly Tyr Gln Arg 865 870 875 880 Arg Gln Glu Glu Trp Ala Leu Gln Arg Asp Ile Ala Asp Asn Glu Ile 885 890 895 Thr Gln Leu Asp Ala Gln Ile Gln Ser Leu Gln Glu Gln Ile Thr Met 900 905 910 Ala Gln Lys Gln Ile Thr Leu Ser Glu Thr Glu Gln Ala Asn Ala Gln 915 920 925 Ala Ile Tyr Asp Leu Gln Thr Thr Arg Phe Thr Gly Gln Ala Leu Tyr 930 935 940 Asn Trp Met Ala Gly Arg Leu Ser Ala Leu Tyr Tyr Gln Met Tyr Asp 945 950 955 960 Ser Thr Leu Pro Ile Cys Leu Gln Pro Lys Ala Ala Leu Val Gln Glu 965 970 975 Leu Gly Glu Lys Glu Ser Asp Ser Leu Phe Gln Val Pro Val Trp Asn 980 985 990 Asp Leu Trp Gln Gly Leu Leu Ala Gly Glu Gly Leu Ser Ser Glu Leu 995 1000 1005 Gln Lys Leu Asp Ala Ile Trp Leu Ala Arg Gly Gly Ile Gly Leu Glu 1010 1015 1020 Ala Ile Arg Thr Val Ser Leu Asp Thr Leu Phe Gly Thr Gly Thr Leu 1025 1030 1035 1040 Ser Glu Asn Ile Asn Lys Val Leu Asn Gly Glu Thr Val Ser Pro Ser 1045 1050 1055 Gly Gly Val Thr Leu Ala Leu Thr Gly Asp Ile Phe Gln Ala Thr Leu 1060 1065 1070 Asp Leu Ser Gln Leu Gly Leu Asp Asn Ser Tyr Asn Leu Gly Asn Glu 1075 1080 1085 Lys Lys Arg Arg Ile Lys Arg Ile Ala Val Thr Leu Pro Thr Leu Leu 1090 1095 1100 Gly Pro Tyr Gln Asp Leu Glu Ala Thr Leu Val Met Gly Ala Glu Ile 1105 1110 1115 1120 Ala Ala Leu Ser His Gly Val Asn Asp Gly Gly Arg Phe Val Thr Asp 1125 1130 1135 Phe Asn Asp Ser Arg Phe Leu Pro Phe Glu Gly Arg Asp Ala Thr Thr 1140 1145 1150 Gly Thr Leu Glu Leu Asn Ile Phe His Ala Gly Lys Glu Gly Thr Gln 1155 1160 1165 His Glu Leu Val Ala Asn Leu Ser Asp Ile Ile Val His Leu Asn Tyr 1170 1175 1180 Ile Ile Arg Asp Ala * 1185 1190 (2) INFORMATION FOR SEQ ID NO: 27: SEQUENCE
CHARACTERISTICS: (A) LENGTH: 1881 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS : double (D) TOPOLOGY: linear MOLECULE TYPE: DNA (genomic) -149- SUBSTTTUTE SHEET (RULE 26) I i:<) FEATURE: • (B) LOCATION: 1..1331 (D) OTHER INFORMATION: . produce = " P3 ' (xi) SEQUENCE DESCRIPTION : SEQ ID MO: 27 ATG TCT GAA TCT TTA TTT ACA CAA ACG TTG AAA GAA GCG CGC CGT GAT 43 ec Ser Glu Ser Leu Phe Thr Gin Thr Leu Lys Glu Ala Arg Arg Asp 1 10 15 GCA TTG GTT GCT CAT TAT ATT GCT ACT CAG GTG CCC GCA GAT TTA AAA 96 Ala Leu Val Ala His Tyr lie Ala Thr Gin Val Pro Ala Asp Leu Lys 20 25 30 GAG AGT ATC CAG ACC GCG GAT GAT CTG TAC GAA TAT CTG TTG CTG GAT 144 Glu ser He Gin Thr Ala Asp Asp Leu Tyr Glu Tyr Leu Leu Leu Asp 35 40 45 ACC AAA ATT AGC GAT CTG GTT ACT ACT TCA CCG CTG TCC GAA GCG ATT 132 Thr Lys He Ser Asp Leu Val Thr Thr Ser Pro Leu Ser Glu Ala He 50 55 60 GGC AGT C G CAA TTG TTT ATT CAT CGT GCG ATA GAG GGC TAT GAC GGC 240 Gly Ser Leu Gin Leu Phe He His Arg Ala He Glu Gly Tyr Asp Gly 70 75 80 ACG CTG GCA GAC TCA GCA AAA CCC TAT TTT GCC GAT GAA CAG TTT TTA 233 Thr Leu Ala Asp Ser Ala Lys Pro Tyr Phe Ala Asp Glu Gin Phe Leu 35 90 95 TAT AAC TGG GAT AGT TTT AAC CAC CGT TAT AGC ACT TGG GCT GGC AAG 335 Tyr Asn Trp Asp Ser Phe Asn His Arg Tyr Ser Thr Trp Ala Gly Lys 100 105 110 GAA CGG TTG AAA TTC TAT GCC GGG GAT TAT ATT GAT CCA ACA TTG CGA 334 Glu Arg Leu Lys Phe Tyr Ala Gly Asp Tyr He Asp Pro Thr Leu Arg 115 120 125 TTG AAT AAG ACC GAG ATA TTT ACC GCA TTT GAA CAA GGT ATT TCT CAA 432 Leu Asn Lys Thr Glu He Phe Thr Ala Phe Glu Gin Gly He Ser Gin 130 135 140 GGG AAA TTA AAA AGT GAA TTA GTC GAA TCT AAA TTA CGT GAT TAT CTA 430 Gly Lys Leu Lys Ser Glu Leu Val Glu Ser Lys Leu Arg Asp Tyr Leu 145 150 155 160 ATT AGT TAT GAC ACT TTA GCC ACC CTT GAT TAT ATT ACT GCC TGC CAA 523 He Ser Tyr Asp Thr Leu Ala Thr Leu Asp Tyr He Thr Ala Cys Gin 165 170 175 GGC AAA GAT AAT AAA ACC ATC TTC TTT ATT GGC CGT ACA CAG AAT GCA 576 Gly Lys Asp Asn Lys Thr He Phe Phe He Gly Arg Thr Gin Asn Ala 180 185 190 
CCC TAT GCA TTT TAT TGG CGA AAA TTA ACT TTA GTC ACT GAT GGC GGT 624 Pro Tyr Ala Phe Tyr Trp Arg Lys Leu Thr Leu Val Thr Asp Gly Gly 195 200 205 .AAG TTG AAA CCA GAT CAA TGG TCA GAG TGG CGA GCA ATT AAT GCC GGG 6" 2 Lys Leu Lys Pro Asp Gin Trp Ser Glu Trp Arg Ala He Asn Ala Gly 210 215 220 -150- SUBSTTTUTE SHEET (RULE 26) ΛΤΤ AJT nG GCA T.-.T TCA GGG CAT J GAG CCT TTC TGG GAA AAT AAC lie J*r Giu Ala Tyr S CTA CAT TAT CCT ACT TCC CCG CAA TTC T Leu His Tyr Arg Ser Ser Ala Gin Phe Trp Leu Asp Giu Lys Leu Gin 725 "30 " 5 CTC ACC AAA GCA GGC AAA TCT CCG GCT TGT TAT CTG CCG TTT CCA ATG 225c Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Mec 7-10 45 750 CAT TTG CTA TGG TAT AC GAA ATT CAG GAT G A ATC AGC GGC AAC CCG 304 His Leu Leu Trp Tyr Thr Giu lie Gin Asp Giu lie Ser Giy Asn Arg 755 760 765 CTC ACC ACT GAA GTC .AAC TAC AGC CAC GGC GTC TGG GAT GGT AAA GAG 2352 Leu Thr Ser Giu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Giu 770 775 780 CGG GAA TTC AGA GGA TTT GGC TGC ATC AAA CAG ACA GAT ACC ACA ACC 2-100 Arg Glu Phe Arg Gly Phe Gly Cys lie Lys Gin Thr Asp Thr Thr Thr 785 790 795 300 TTT TCT CAC GGC ACC GCC CCC GAA CAG GCG GCA CCG TCG CTG AGT ATT 2443 Phe Ser His Gly Thr Ala Pro Glu Gin Ala Ala Pro Ser Leu Ser He 305 310 315 AGC TGG TTT GCC ACC GGC ATG GAT GAA GTA GAC AGC CAA TTA GCT ACG 2496 Ser Trp Phe Ala Thr Gly Mec Asp Glu Val Asp Ser Gin Leu Ala Thr 320 325 830 GAA TAT TGG CAG GCA GAC ACG CAA GCT TAT AGC GGA TTT GAA ACC CGT 25 4 Glu Tyr Trp Gin Ala Asp Thr Gin Ala Tyr Ser Gly Phe Glu Thr Arg 83 5 840 845 TAT ACC GTC TGG GAT CAC ACC AAC CAG ACA GAC CAA GCA TTT ACC CCC 2592 Tyr Thr Val Trp Asp His Thr Asn Gin Thr Asp Gin Ala Phe Thr Pro 350 855 360 AAT GAG ACA CAA CGT AAC TGG CTG ACG CGA GCG CTT AAA GGC CAA CTG 2 640 Asn Glu Thr Gin Arg Asn Trp Lau Thr Arg Ala Leu Lys Gly Gin Leu 3 65 870 875 380 CTA CGC ACT GAG CTC TAC GGT CTG GAC GGA ACA GAT AAG CAA ACA GTG 26 a a Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gin Thr Val 835 390 395 CCT TAT ACC GTC 
AGT GAA TCG CGC TAT CAG GTA CGC TCT ATT CCC GTA 27 3 6 Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gin Val Arg Ser lie Pro Val 900 905 910 AAT AAA GAA ACT GAA TTA TCT GCC TGG GTG ACT GCT ATT GAA AAT CGC 2734 Asn Lys Glu Thr Glu Leu Ser Ala Trp Val Thr Ala He Glu Asn Arg 915 920 925 AGC TAC CAC TAT GAA CGT ATC ATC ACT GAC CCA CAG TTC AGC CAG AGT 2332 Ser Tyr His Tyr Glu Arg He He Thr Asp Pro Gin Phe Ser Gin Ser 930 935 940 ATC AAG TTG CAA CAC GAT ATC TTT GGT CAA TCA CTG CAA AGT GTC GAT 2380 He Lys Leu Gin His Asp He Phe Gly Gin Ser Leu Gin Ser Val Asp 945 950 955 560 ATT GCC TGG CCG CGC CGC GAA AAA CCA GCA GTG AAT CCC TAC CCG CCT 2923 He Ala Trp Pro Arg Arg Glu Lys Pro Aia Val Asn Pro Tyr Pro Pro 965 970 975 - 1 6 2 - "hr L=u Pro CTA TTA CCT C?J CTG AGA CAA AAA A.AT AGC TGG CAT C CTG ACT GAT .'024 Lau Leu Arg Leu Vai Arg Gin Lys Asn 3er Trp His His Leu Thr Asp 995 1000 1005 A.AC TGG CGA TTA GGT TTA CG AAT GCA CAA CGC OCT GAT CTT 3 07 : Gly L Asn Trp rg Leu Gly Leu Pro Asn Aia Gin Arg Arg Asp Vai 10 1 0 10 15 1020 TAT ACT TAT GAC CGG AGC .AAA ATT CCA ACC GAA GGG ATT TCC CTT C A 3120 Tyr Thr Tyr Asp Arg Ser Lys He Pro Thr Glu Gly lie Ser Leu Glu 1025 103 0 103 5 104 0 A C TTG CTG AAA GAT GAT GGC CTG CTA GCA GAT GAA AAA GCG GCC GTT 31 53 lie Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Vai 104 5 1050 1055 TAT CTG GGA CAA CAA CAG ACG TTT TAC ACC GCC GGT CAA GCG GAA GTC 3 2 16 Tyr Leu Gly Gin Gin Gin Thr Phe Tyr Thr Ala Gly Gin Ala Glu Vai 1060 1065 1070 ACT CTA GAA .AAA CCC ACG TTA CAA GCA CTG GTC GCG TTC CAA GAA ACC 3 264 Thr Leu Glu Lys Pro Thr Leu Gin Ala Leu Vai Ala Phe Gin Glu Thr 1075 1030 1085 GCC ATG ATC GAC GAT ACC TCA TTA CAG GCG TAT GAA GGC GTG ATT GAA 3 3 12 Ala Mec Mec Asp Asp Thr Ser Leu Gin Ala Tyr Glu Gly Vai lie Glu 1090 1095 1100 GAG CAA GAG TTG AAT ACC GCG CTG ACA CAG GCC GGT TAT CAG CAA GTC 3 3 60 Glu Gin Glu Leu Asn Thr Ala Leu Thr Gin Ala Gly Tyr Gin Gin Vai 1 105 1110 1115 1 120 GCG CGG TTG TTT AAT ACC AGA TCA GAA AGC CCG GTA TGG GCG GCA CGG 3 
403 Aia Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Vai Trp Ala Aia Arg 1125 1130 U 3 5 CAA GGT TAT ACC GAT TAC GGT GAC GCC GCA CAG TTC TGG CCG CCT CAG 3 456 Gin Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gin Phe Trp Arg Pro Gin 1 140 1 145 1150 GCT CAG CGT AAC TCG TTG CTG ACA GGG AAA ACC ACA CTG ACC TGG GAT 3 504 Ala Gin Arg Asn Ser Lau Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1 155 1 160 1165 ACC CAT CAT TGT GTA ATA ATA CAG ACT CAA GAT GCC GCT GGA TTA ACG 3 552 Thr His His Cys Vai He He Gin Thr Gin Asp Ala Ala Gly Leu Thr 1 170 1 175 na o ACG CAA GCC CAT TAC GAT TAT CGT TTC CTT ACA CCG GTA CAA CTG ACA 3 600 Thr Gin Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Vai Gin Leu Thr 1 135 1 190 1195 1200 GAT ATT AAT GAT AAT CAA CAT ATT GTG ACT CTG GAC GCG CTA GGT CGC 3 543 Asp He Asn Asp Asn Gin His He Vai Thr Leu Asp Ala Leu Gly Arg 1205 1210 12 15 GTA ACC ACC AGC CGG TTC TGG GGC ACA GAG GCA GGA CAA GCC GCA GGC 3696 Vai Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gin Ala Ala Gly 1220 1225 1230 TAT TCC .AAC CAG CCC TTC ACA CCA CCG GAC TCC GTA GAT AAA GCG TG 3 ~ 44 Tyr Ser Asn Gin Pro Phe Thr Pro Pro Asp Ser Vai Asp Lys Aia Leu - 1 5 3 - GCA TTA ACC GGC GCA CTC "T GTT CCC CAA TCT 7TA GTC TAT GCC GTT Γ " ': Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val T r Ala Val 1250· 1255 1250 GAT AGC TGG ATG CCG TCG TTA TCT TTG TCT CAG CTT TCT CAG TCA C A 3340 Asp Ser Trp Mec Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin 12≤5 12"0 1275 1230 GAA GAG GCA CAA GCG CTA TGG GCG CAA CTG CGT GCC GCT CAT ATG ATT 3333 Giu Glu, Ala Glu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Met lie 1235 1290 1235 ACC G A GAT GGG AAA GTG TGT GCG TTA AGC GGG AAA CGA GGA ACA AGC 3?3 Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310 CAT CAG AAC CTG ACG ATT CAA CTT ATT TCG CTA TTG GCA ACT ATT CCC 3934 H s Gin Asn Leu Thr He Gin Leu He Ser Leu Leu Ala Ser lie Pro 1315 1320 1325 CGT TTA CCG CCA CAT GTA CTG GGG ATC ACC ACT GAT CGC TAT GAT AGC 4032 Arg Leu Pro Pro His Val Leu Gly lie Thr Thr Asp Arg 
Tyr Asp Ser 1330 1335 1340 GAT CCG CAA CAG CAG CAC CAA CAG ACG GTG AGC TTT AGT GAC GGT TTT 4030 Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe 1345 1350 1355 1360 GGC CGG TTA CTC CAG AGT TCA GCT CGT CAT GAG TCA GGT GAT GCC TGG 4123 Gly Arg Leu Leu Gin Ser Ser Ala Arg His Glu Ser Gly Asp Ala Trp 1365 1370. 1375 CAA CGT AAA GAG GAT GGC GGG CTG GTC GTG GAT GCA AAT GGC GTT CTG 4 i? 5 Gin Arg Lys Glu Asp GLy Gly Leu Val Val Asp Ala Asn Gly Val Leu 1330 1385 1390 GTC AGT GCC-CCT ACA GAC ACC CGA TCG GCC GTT TCC GGT CGC ACA GAA 4224 Val Ser Ala Pro Thr Asp Thr Arg Trp Ala Val Ser Gly Arg Thr Glu 1395 1400 1405 TAT GAC GAC AAA GGC CAA CCT GTG CGT ACT TAT CAA CCC TAT TTT CTA 4272 Tyr Asp Asp Lys Gly Gin Pro Val Arg Thr Tyr Gin Pro Tyr Phe Leu 1410 1415 1420 AAT GAC TGG CGT TAC GTT AGT GAT GAC AGC GCA CGA GAT GAC CTG TTT 4320 Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Asp Leu Phe 1425 1430 1435 1440 GCC GAT ACC CAC CTT TAT GAT CCA TTG GGA CGG GAA TAC AAA GTC ATC 4363 Ala Asp Thr His Leu Tyr Asp Pro Leu Gly Arg Glu Tyr Lys Val lie 1445 1450 1455 ACT GCT AAG AAA TAT TTG CGA GAA AAG CTG TAC ACC CCG TGG TTT ATT 4416 Thr Ala Lys Lys Tyr Leu Arg Glu Lys Leu T/r Thr Pro Trp Phe He 1460 1465 1470 GTC AGT GAG GAT GAA AAC GAT ACA GCA TCA AGA ACC CCA TAG 4453 Val Ser Glu Asp Glu Asn Asp Thr Ala Ser Arg Thr Pro * 1475 1430 1435 (2) INFORMATION FOR ScQ ID NO: 32 -164- SUBSTTTUTE SHEET (RULE 26) l) SEQUENCE CHAF.ACTEP.ISTICS : I A j LENGTH: 143 amino acids i 3 ) TYPE: amino acid (Dl TOPOLOGY: linear (ii) MOLECULE TYPE: roc a in (XI) SEQUENCE DESCRIPTION: SEQ 10 NO : 32 : Met Gin Asp Se Pro Glu Val Ser Ila Thr Thr Leu Ser Leu Pro Lv= 1 5 10 15 Gly Gly Gly Ala He Asn Gly Mec Gly Glu Ala Leu Asn Ala Ala Gly 20 25 30 Pro Asp Gly Mec Ala Ser Leu Ser Leu Pro Leu Pro Leu Ser Thr Gly 35 40 45 Arg Gly Thr Ala Pro Gly Lau Ser Leu lie Tyr Ser Asn Ser Ala Gly 50 55 60 Asn Gly Pro Phe Gly lie Gly Trp Gin Cys Gly Val Mec Ser lie Ser 65 70 "5 30 Arg Arg Thr Gin His Gly He Pro 
Gln Tyr Gly Asn Asp Asp Thr Phe 85 90 95 Leu Ser Pro Gln Gly Glu Val Met Asn Ile Ala Leu Asn Asp Gln Gly 100 105 110 Gln Pro Asp Ile Arg Gln Asp Val Lys Thr Leu Gln Gly Val Thr Leu 115 120 125 Pro Ile Ser Tyr Thr Val Thr Arg Tyr Gln Ala Arg Gln Ile Leu Asp 130 135 140 Phe Ser Lys Ile Glu Tyr Trp Gln Pro Ala Ser Gly Gln Glu Gly Arg 145 150 155 160 Ala Phe Trp Leu Ile Ser Thr Pro Asp Gly His Leu His Ile Leu Gly 165 170 175 Lys Thr Ala Gln Ala Cys Leu Ala Asn Pro Gln Asn Asp Gln Gln Ile 180 185 190 Ala Gln Trp Leu Leu Glu Glu Thr Val Thr Pro Ala Gly Glu His Val 195 200 205 Ser Tyr Gln Tyr Arg Ala Glu Asp Glu Ala His Cys Asp Asp Asn Glu 210 215 220 Lys Thr Ala His Pro Asn Val Thr Ala Gln Arg Tyr Leu Val Gln Val 225 230 235 240 Asn Tyr Gly Asn Ile Lys Pro Gln Ala Ser Leu Phe Val Leu Asp Asn 245 250 255 Ala Pro Pro Ala Pro Glu Glu Trp Leu Phe His Leu Val Phe Asp His 260 265 270 Gly Glu Arg Asp Thr Ser Leu His Thr Val Pro Thr Trp Asp Ala Gly 275 280 285 Thr Ala Gln Trp Ser Val Arg Pro Asp Ile Phe Ser Arg Tyr Glu Tyr 290 295 300 Gly Phe Glu Val Arg Thr Arg Arg Leu Cys Gln Gln Val Leu Met Phe 305 310 315 320 His Arg Thr Ala Leu Met Ala Gly Glu Ala Ser Thr Asn Asp Ala Pro 325 330 335 Glu Leu Val Gly Arg Leu Ile Leu Glu Tyr Asp Lys Asn Ala Ser Val 340 345 350 Thr Thr Leu Ile Thr Ile Arg Gln Leu Ser His Glu Ser Asp Gly Arg 355 360 365 Pro Val Thr Gln Pro Pro Leu Glu Leu Ala Trp Gln Arg Phe Asp Leu 370 375 380 Glu Lys Ile Pro Thr Trp Gln Arg Phe Asp Ala Leu Asp Asn Phe Asn 385 390 395 400 Ser Gln Gln Arg Tyr Gln Leu Val Asp Leu Arg Gly Glu Gly Leu Pro 405 410 415 Gly Met Leu Tyr Gln Asp Arg Gly Ala Trp Trp Tyr Lys Ala Pro Gln 420 425 430 Arg Gln Glu Asp Gly Asp Ser Asn Ala Val Thr Tyr Asp Lys Ile Ala 435 440 445 Pro Leu Pro Thr Leu Pro Asn Leu Gln Asp Asn Ala Ser Leu Met Asp 450 455 460 Ile Asn Gly Asp Gly Gln Leu Asp Trp Val Val Thr Ala Ser Gly Ile 465 470 475 480 Arg Gly Tyr His Ser Gln Gln Pro Asp Gly Lys Trp Thr His Phe Thr 485 490 495 Pro Ile Asn Ala Leu
Pro Val Glu Tyr Phe His Pro Ser Ile Gln Phe 500 505 510 Ala Asp Leu Thr Gly Ala Gly Leu Ser Asp Leu Val Leu Ile Gly Pro 515 520 525 Lys Ser Val Arg Leu Tyr Ala Asn Gln Arg Asn Gly Trp Arg Lys Gly 530 535 540 Glu Asp Val Pro Gln Ser Thr Gly Ile Thr Leu Pro Val Thr Gly Thr 545 550 555 560 Asp Ala Arg Lys Leu Val Ala Phe Ser Asp Met Leu Gly Ser Gly Gln 565 570 575 Gln His Leu Val Glu Ile Lys Gly Asn Arg Val Thr Cys Trp Pro Asn 580 585 590 Leu Gly His Gly Arg Phe Gly Gln Pro Leu Thr Leu Ser Gly Phe Ser 595 600 605 Gln Pro Glu Asn Ser Phe Asn Pro Glu Arg Leu Phe Leu Ala Asp Ile 610 615 620 Asp Gly Ser Gly Thr Thr Asp Leu Ile Tyr Ala Gln Ser Gly Ser Leu 625 630 635 640 Leu Ile Tyr Leu Asn Gln Ser Gly Asn Gln Phe Asp Ala Pro Leu Thr 645 650 655 Leu Ala Leu Pro Glu Gly Val Gln Phe Asp Asn Thr Cys Gln Leu Gln 660 665 670 Val Ala Asp Ile Gln Gly Leu Gly Ile Ala Ser Leu Ile Leu Thr Val 675 680 685 Pro His Ile Ala Pro His His Trp Arg Cys Asp Leu Ser Leu Thr Lys 690 695 700 Pro Trp Leu Leu Asn Val Met Asn Asn Asn Arg Gly Ala His His Thr 705 710 715 720 Leu His Tyr Arg Ser Ser Ala Gln Phe Trp Leu Asp Glu Lys Leu Gln 725 730 735 Leu Thr Lys Ala Gly Lys Ser Pro Ala Cys Tyr Leu Pro Phe Pro Met 740 745 750 His Leu Leu Trp Tyr Thr Glu Ile Gln Asp Glu Ile Ser Gly Asn Arg 755 760 765 Leu Thr Ser Glu Val Asn Tyr Ser His Gly Val Trp Asp Gly Lys Glu 770 775 780 Arg Glu Phe Arg Gly Phe Gly Cys Ile Lys Gln Thr Asp Thr Thr Thr 785 790 795 800 Phe Ser His Gly Thr Ala Pro Glu Gln Ala Ala Pro Ser Leu Ser Ile 805 810 815 Ser Trp Phe Ala Thr Gly Met Asp Glu Val Asp Ser Gln Leu Ala Thr 820 825 830 Glu Tyr Trp Gln Ala Asp Thr Gln Ala Tyr Ser Gly Phe Glu Thr Arg 835 840 845 Tyr Thr Val Trp Asp His Thr Asn Gln Thr Asp Gln Ala Phe Thr Pro 850 855 860 Asn Glu Thr Gln Arg Asn Trp Leu Thr Arg Ala Leu Lys Gly Gln Leu 865 870 875 880 Leu Arg Thr Glu Leu Tyr Gly Leu Asp Gly Thr Asp Lys Gln Thr Val 885 890 895 Pro Tyr Thr Val Ser Glu Ser Arg Tyr Gln Val Arg Ser Ile Pro Val 900 905 910 Asn Lys Glu Thr Glu Leu Ser Ala
Trp Val Thr Ala Ile Glu Asn Arg 915 920 925 Ser Tyr His Tyr Glu Arg Ile Ile Thr Asp Pro Gln Phe Ser Gln Ser 930 935 940 Ile Lys Leu Gln His Asp Ile Phe Gly Gln Ser Leu Gln Ser Val Asp 945 950 955 960 Ile Ala Trp Pro Arg Arg Glu Lys Pro Ala Val Asn Pro Tyr Pro Pro 965 970 975 Thr Leu Pro Glu Thr Leu Phe Asp Ser Ser Tyr Asp Asp Gln Gln Gln 980 985 990 Leu Leu Arg Leu Val Arg Gln Lys Asn Ser Trp His His Leu Thr Asp 995 1000 1005 Gly Glu Asn Trp Arg Leu Gly Leu Pro Asn Ala Gln Arg Arg Asp Val 1010 1015 1020 Tyr Thr Tyr Asp Arg Ser Lys Ile Pro Thr Glu Gly Ile Ser Leu Glu 1025 1030 1035 1040 Ile Leu Leu Lys Asp Asp Gly Leu Leu Ala Asp Glu Lys Ala Ala Val 1045 1050 1055 Tyr Leu Gly Gln Gln Gln Thr Phe Tyr Thr Ala Gly Gln Ala Glu Val 1060 1065 1070 Thr Leu Glu Lys Pro Thr Leu Gln Ala Leu Val Ala Phe Gln Glu Thr 1075 1080 1085 Ala Met Met Asp Asp Thr Ser Leu Gln Ala Tyr Glu Gly Val Ile Glu 1090 1095 1100 Glu Gln Glu Leu Asn Thr Ala Leu Thr Gln Ala Gly Tyr Gln Gln Val 1105 1110 1115 1120 Ala Arg Leu Phe Asn Thr Arg Ser Glu Ser Pro Val Trp Ala Ala Arg 1125 1130 1135 Gln Gly Tyr Thr Asp Tyr Gly Asp Ala Ala Gln Phe Trp Arg Pro Gln 1140 1145 1150 Ala Gln Arg Asn Ser Leu Leu Thr Gly Lys Thr Thr Leu Thr Trp Asp 1155 1160 1165 Thr His His Cys Val Ile Ile Gln Thr Gln Asp Ala Ala Gly Leu Thr 1170 1175 1180 Thr Gln Ala His Tyr Asp Tyr Arg Phe Leu Thr Pro Val Gln Leu Thr 1185 1190 1195 1200 Asp Ile Asn Asp Asn Gln His Ile Val Thr Leu Asp Ala Leu Gly Arg 1205 1210 1215 Val Thr Thr Ser Arg Phe Trp Gly Thr Glu Ala Gly Gln Ala Ala Gly 1220 1225 1230 Tyr Ser Asn Gln Pro Phe Thr Pro Pro Asp Ser Val Asp Lys
Ala Lau 1235 1240 1245 Ala Leu Thr Gly Ala Leu Pro Val Ala Gin Cys Leu Val T/r Ala Val 1250 1255 1260 50 Asp Ser Trp Mec Pro Ser Leu Ser Leu Ser Gin Leu Ser Gin Ser Gin 1265 1270 1275 1230 Clu Glu Ala Glu Ala Leu Trp Ala Gin Leu Arg Ala Ala His Mec He 1235 1290 1295 Thr Glu Asp Gly Lys Val Cys Ala Leu Ser Gly Lys Arg Gly Thr Ser 1300 1305 1310 His Gin Asn Leu Thr He Gin Leu He Ser Leu Leu Ala Ser lie Pro 1315 1320 1325 Arg Lau Pro Pro His Val Leu Gly lie Thr Thr Asp Arg Tyr Asp Ser 1330 1335 1340 n5 Asp Pro Gin Gin Gin His Gin Gin Thr Val Ser Phe Ser Asp Gly Phe -169- SUBSTmjTE SHEET (RULE 26) 1350 .355 Sly Arg Lau Leu Gin ier Ser Ala Arg His Giu Ser Giy Asp .-.Li 1365 1370 Gin Arg Lys Giu Asp Giy Giy Leu Val Vai Asp Ala Asn Giy Vai Leu 1330 1335 90 val er Ala Pro Thr Asp Thr Arg Trp Aia Vai Ser Glv Arg Thr iu 1355 1400 1405 Tyr Asp Asp Lys Giy Gin Pro Val Arg Thr Tyr Gin Pro T r Phe Lau 1410 1415 1420 Asn Asp Trp Arg Tyr Val Ser Asp Asp Ser Ala Arg Asp Aso Leu Phe 1425 1430 1435 1 0 Ala Asp Thr His Leu Tyr Asp Pro Leu Giy Arg Giu Tyr Lys Val 1445 1450 1455 Thr Ala Lys Lys Tyr Leu Arg Giu Lys Leu Tyr Thr Pro Trp Phe 1460 1465 1470 Val Ser Giu Asp Giu Asn Asp Thr Ala Ser Arg Thr Pro 1475 1430 1435 (2) INFORMATION FOR SEQ ID NO : 33 : (i) SEQUENCE CHARACTERISTICS: (A) LENGTH : 3288 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS : double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO : 33 ATG GTG ACT GTT ATG CAA AAT AAA ATA TCA TTT TT TCA GGT ACA TCC 43 Mec Val Thr Val Mec Gin Asn Lys lie Ser Phe Leu Sar Giy Thr Ser 1 5 10 15 GAA CAG CCC CTG CTT GAC GCC GGT TAT CAA AAC GTA TTT GAT ATC GCA 96 Giu Gin Pro Lau Lau Asp Ala Giy Tyr Gin Asn Val Pha Asp He Ala 20 25 30 TCA ATC AGC CGG GCT ACT TTC GTT CAA TCC GTT CCC ACC CTG CCC GTT 144 Ser Ila Sar Arg Ala Thr Phe Val Gin Ser Val Pro Thr Leu Pro Val 35 40 45 AAA GAG GCT CAT ACC GTC TAT CGT CAG GCG CGG CAA CGT GCG G A AAT 152 L s Giu Ala His Thr Val Tyr Arg Gin Ala Arg Gin Arg 
Ala Glu Asn 50 55 50 CTG AAA TCC CTC TAC CGA GCC TGG CAA TTG CGT CAG GAG CCC GTT ATT 240 Lau Lys Ser Leu Tyr Arg Ala Trp Gin Leu Arg Gin Glu Pro Val lie 65 70 75 30 AAA GGG CTG GCT AAA CTT AAC CTA CAA TCC AAC GTT TCT GTG CTT CAA 233 Lys Gi Leu Ala Lys Leu Asn Leu Gin Ser Asn Val Ser Val Leu Gin 35 90 95 GAT GCT TTG GTA GAG AAT ATT GGC GGT GAT GAT TTC AGC GAT TTA 336 Asp Ala Leu Val Giu Asn lie Giy Giy Asp Giy Asp Phe Ser Asp Leu -169- 100 105 LIG ATC AAC CGT GCC AGT CAA TAT GCT GAC GCT GCC TCT ATT CAA TCC CTA 3 Mec Asn Arg Aia Ser Gin Tyr Ala Asp Ala Ala Ser Ila Gin ia Leu 115 120 125 TTT TCA CCG GGC CGT TAT GCT TCC GCA CTC TAG AGA GTT GCT AAA GAT 432 ? e Ser Pro Gly Arg T r Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp 130 135 140 CTC, CAT AAA TCA GAT TCC AGT TTG CAT ATT GAT AAT CGC CGC GCT C T 430 Leu His Lys Ser Asp Ser Ser Leu His lie Asp Asn Arg Arg Ala Asp 145 150 155 160 CTG AAG GAT CTG ATA TTA AGC G.AA ACG ACG ATG AAT AAA GAG CTC ACT 523 Leu Lys Asp Leu He Leu Ser Glu Thr Thr Mec Asn Lys Glu Val Thr 165 170 175 TCC CTT GAT ATC TTG TTG GAT GTG CTA CAA AAA GGC GGT AAA GAT ATT 576 Ser Leu Asp He Leu Leu Asp Val Leu Gin Lys Gly Gly Lys Asp He ' 130 185 190 ACT GAG CTG TCC GGC GCA TTC TTC CCA ATG ACG TTA CCT TAT GAC GAT 624 Thr Glu Leu ser Gly Ala Phe Phe Pro Mec Thr Leu Pro Tyr Asp Asp 195 200 205 CAT CTG TCG CAA ATC GAT TCC GCT TTA TCG GCA CAA GCC AGA ACG CTG 672 His Leu Ser Gin Ila Asp. 
Ser Ala Leu Ser Ala Gin Ala Arg Thr Leu 210 215 220 AAC GGT GTG TGG AAT ACT TTG ACA GAT ACC ACG GCA CAA GCG GTT TCA 720 Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gin Ala Val Ser 225 230 235 240 GAA CAA ACC AGT AAT ACG AAT ACA CGC AAA CTG TTC GCT GCC CAA GAT 768 Glu Gin Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gin Asp 245 250 255 GGT AAT CAA GAT ACA TTT TTT TCC GGA AAC ACT TTT TAT TTC AAA GCG 316 Gly Asn Gin Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala 260 265 270 GTG GGA TTC AGC GGG CAA CCT ATG GTT TAC CTG TCA CAG TAC ACC AGC 364 Val Gly Phe Ser Gly Gin Pro Mec Val Tyr Leu Ser Gin Tyr Thr Ser 275 280 28S GGG A-AC GGC ATT GTC GGC GCA CAA TTG ATT GCA GGT AAT CCA GAC CAA 12 Gly Asn Gly He Val Gly Ala Gin Leu lie Ala Gly Asn Pro Asp Gin 290 295 300 GCC GCC GCC GCA ATA GTC GCA CCG TTG AAA CTC ACT TGG TCA ATG GCA 960 Ala Ala Ala Ala He Val Ala Pro Leu Lys Leu Thr Trp Ser Mec Ala 305 310 315 320 AAA CAG TGT TAC TAC CTC GTC GCT CCC GAT GGT ACA ACG ATG GGA GAC 1008 Lys Gin Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Mec Gly Asp 325 330 335 GGT AAT GTT CTG ACC GGC TGT TTC TTA AGA GGC AAC AGC CCA ACT AAC 1056 Gly Asn Val Leu Thr Gly Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn 340 345 350 CCG GAT AAA GAC GGT ATT TTT GCT CAG GTA GCC AAC AAA TCA GGC AGT 1104 Pro Asp Lys Asp Gly He Phe Ala Gin Val Ala Asn Lys Ser Gly Ser 355 360 365 -170- ACT CAG CCT TTG CCA AGC TTC CAT CTG CCG GTC ACA CTG GAA CAC AGC 1152 Thr Gin Pro Leu Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser 370 375 380 GAG AAT AAA GAT CAG TAC TAT CTG AAA ACA GAG CAG GGT TAT ATC ACG 1200 Glu Asn Lys Asp Gin Tyr Tyr Leu Lys Thr Glu Gin Gly Tyr He Thr 335 390 395 400 GTA GAT AGT TCC GGA CAG TCA AAT TGG AAA AAC GCG CTG GTT ATC AAT 1248 Val Asp Ser Ser Gly Gin Ser Asn Trp Lys Asn Ala Leu Val He Asn 405 410 415 GGG ACA AAA GAC AAG GGG CTG TTA TTA ACC TTT TGC AGC GAT AGC TCA 1296 Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser 420 425 430 GGC ACT CCG ACA AAC CCT GAT GAT GTG ATT CCT CCC GCT ATC 
AAT GAT 1344 Gly Thr Pro Thr Asn Pro Asp Asp Val He Pro Pro Ala He Asn Asp 435 440 445 ATT CCA TCG CCG CCA GCC CGC GAA ACA CTG TCA CTG ACG CCG GTC AGT 1392 lie Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser 450 455 460 TAT CAA TTG ATG ACC AAT CCG GCA CCG ACA GAA GAT GAT ATT ACC AAC 1440 Tyr Gin Leu Met Thr Asn Pro Ala Pro Thr Glu Asp Asp He Thr Asn 465 470 475 480 CAT TAT GGT TTT AAC GGC GCT AGC TTA CGG GCT TCT CCA TTG TCA ACC 1488 His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr 485 490 495 AGC GAG TTG ACC AGC AAA CTG AAT TCT ATC GAT ACT TTC TGT GAG AAG 1536 Ser Glu Leu Thr Ser Lys Leu Asn Ser He Asp Thr Phe Cys Glu Lys 500 505 510 ACC CGG TTA AGC TTC AAT CAG TTA ATG GAT TTG ACC GCT CAG CAA TCT 1584 Thr Arg Leu Ser Phe Asn Gin Leu Met Asp Leu Thr Ala Gin Gin Ser 515 520 525 TAC AGT CAA AGC AGC ATT GAT GCG AAA GCA GCC AGC CGC TAT GTT CGT 1632 Tyr Ser Gin Ser Ser lie Asp Ala Lys Ala Ala Ser Arg Tyr Val Arg 530 535 540 TTT GGG GAA ACC ACC CCA ACC CGC GTC AAT GTC TAC GGT GCC GCT TAT 1680 Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala Tyr 545 550 555 560 CTG AAC AGC ACA CTG GCA GAC GCG GCT GAT GGT CAA TAT CTG TGG ATT 1728 Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gin Tyr Leu Trp He 565 570 575 CAG ACT GAT GGC AAG AGC CTA AAT TTC ACT GAC GAT ACG GTA GTC GCC 1776 Gin Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala 580 585 590 TTA GCC GGT CGC GCT GAA AAG CTG GTA CGT TTA TCA TCC CAG ACC GGG 1324 Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gin Thr Gly 595 600 605 CTA TCA TTT GAA GAA TTG GAC TGG CTG ATT GCC AAT GCC AGT CGT AGT 1872 Leu Ser Phe Glu Glu Leu Asp Trp Leu lie Ala- Asn Ala Ser Arg Ser 610 615 620 GTG CCG GAC CAC CAC GAC AAA ATT GTG CTG GAT AAG CCG GTC CTT GAA 1920 Val Pro Asp His His Asp Lys lie Val Leu Asp Lys Pro Val Leu Glu -171- ■5 3 J CTG CCA GAG TAT GTC AGC CTA AAA CAG CCC TAT GGG CTT AT GCC i.H3 Ala Lau Ala Glu Tyr Val i=r Leu Lys Gin Arg Tyr Gly Leu Ai? 
Aia 5 045 30 555 AAT ACC TTT GCG ACC TTC ATT AGT GCA GTA AA CCT TAT ACG CCA GAT 2010 Asn Thr Phe Ala Thr Phe lie Ser Ala Val Asn Pro Tyr Thr Pro Asp 650 665 670 U CAG ACA CCC AGT TTC TAT GAA ACC GCT TTC CGC TCT GCC GAC GGT AAT 2064 Gin Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser Aia Asp Gly Asn 675 630 635 5 CAT GTC ATT GCG CTA GGT ACA GAG GTG AAA TAT GCA GAA AAT GAG CAG 2112 His Val H e Ala Leu Gly Thr Glu Val Lys Tyr Aia Glu Asn Glu Gin 690 695 700 GAT GAG TTA GCC GCC ATA TGC TGC AAA GCA TTG GGT GTC ACC AGT GAT 2160 U Asp Glu Leu Ala Ala lie Cys Cys Lys Ala Leu Gly Val Thr Ser Asp 705 710 715 720 GAA CTG CTC CGT ATT GGT CGC TAT TGC TTC GGT AAT GCA GGC AGT TTT 2208 Glu Leu Leu Arg lie Gly Arg Tyr Cys Phe Gly Asn Ala Gly Ser Phe 5 725 730 735 ACC TTG GAT GAA TAT ACC GCC AGT CAG TTG TAT CGC TTC GGC GCC ATT 2256 Thr Leu Asp Glu Tyr Thr Ala Ser Gin Leu Tyr Arg Phe Gly Ala lie 740 745 750 CCC CGT TTG TTT GGG CTG ACA TTT GCC CAA GCC GAA ATT TTA TGG CGT 2304 Pro Arg Leu Phe Gly Leu Thr Phe Ala Gin Ala Glu He Leu Trp Arg 755 7S0 765 CTG ATG GAA GGC GGA AAA GAT ATC TTA TTG CAA CAG TTA GGT CAG GCA 2352 Leu Mec Glu Gly Gly Lys Asp He Leu Leu Gin Gin Leu Gly Gin Ala 770 775 730 AAA TCC CTG CAA CCA CTG GCT ATT TTA CGC CGT ACC GAG CAG GTG CTG 2400 Lys Ser Leu Gin Pro Leu Ala He Leu Arg Arg Thr Glu Gin Val Leu 735 790 795 300 GAT TGG ATG TCG TCC GTA AAT CTA AGT CTG ACT TAT CTG CAA GGG ATG 2 43 Asp Trp Mec Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gin Gly Mec 305 310 315 GTA AGT ACG CAA TGG AGC GGT ACC GCC ACC GCT GAG ATG TTC AAT TTC 2496 Val Ser Thr Gin Trp Ser Gly Thr Ala Thr Ala Glu Mec Phe Asn Phe 820 825 830 TTG GAA AAC GTT TGT GAC AGC GTG AAT AGT CAA GCT GCC ACT AAA GAA Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gin Ala Ala Thr Lys Glu 835 840 845 ACA ATG GAT TCG GCG TTA CAG CAG AAA GTG CTG CGG GCG CTA AGC GCC 2592 Thr Mec Asp Ser Ala Leu Gin Gin Lys Val Leu Arg Ala Leu Ser Ala 350 855 860 GGT TTC GGC ATT AAG AGC AAT GTG ATG GGT ATC GTC ACC TTC TGG CTG 2640 Gly Phe Gly He Lys 
Ser Asn Val Mec Gly He Val Thr Phe Trp Leu 365 370 375 380 GAG AAA ATC ACA ATC GGT AGT GAT AAT CCT TTT ACA TTG GCA AAC TAC 2538 Glu Lys He Thr He Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr 385 390 395 •172- SUBSTUUTE SHEET (RULE 26) T T.-.T CAT ATT AC- -TO TTT AGC T GAC AAT GCC ACG TTA GAG ~:' Trp His Asp lie Gin Thr Leu Phe Ser His Asp Asn Aid Thr Leu Glu 900 305 910 5 TCC TTA CAA ACC GAC ACT TCT CTG GTA ATT GCT ACT C G CAA CTT AGC 2734 Ser Leu Gin Thr Asp Thr Ser Leu Val lie Ala Thr Gin Gin Leu Ser 915 920 925 CAG CTA GTG TTA ATT GTG AAA TCG CTG AGC CTG ACC GAG CAG GAT CTG 233: 1U in Leu Val Leu He Val Lys Trp Leu Ser Leu Thr Glu Gin Asp Leu 930 935 940 CAA TTA CTG ACA ACC TAT CCC GAA CGT TTA ATC AAC GCC ATC ACG AAT 2330 Gin Leu Leu Thr Thr Tyr Pro Glu Arg Leu lie Asn Giy He Thr Asn 15 9 5 950 955 960 CTT CCT GTA CCC AAT CCG GAG CTA TTA CTC ACG CTA TCA CGT TTT AAG 2923 Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys 965 970 975 20 CAG TGG GAA ACT CAA GTC ACC CTT TCC CGT GAT GAA GCG ATG CGC TGT 29"6 Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp Glu Ala Mec Arg Cys 980 985 990 25 TTC GAT CAA TTA AAT GCC AAT GAT ATG ACG ACT GAA AAT GCA GGT TCA 3024 Phe Asp Gin Leu Asn Ala Asn Asp MeC Thr Thr Glu Asn Ala Giy Ser 995 1000 1005 CTG ATC GCC ACA TTC TAT GAG ATG GAT AAA GGT ACG GGA GCG CAA GTT 3072 30 Leu He Ala Thr Leu Tyr Glu Mec Asp Lys Giy Thr Giy Ala Gin Val 1010 ■ 1015 1020 .AAT ACC TTC CTA TTA GGT GAA AAT AAC TGG CCG AAA AGT TTT ACC TCT 3i20 Asn Thr Leu Leu Leu Giy Glu Asn Asn Trp Pro Lys Ser Phe Thr Ser 35 1025 · 1030 1035 1040 CTC TGG CAA CTT CTG ACC TGG TTA CGC GTC GGG CAA AGA CTG AAT GTC 3163 Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Giy Gin Arg Leu Asn Val 1045 1050 1055 0 GGT AGT ACC ACT CTG GGC AAT CTG TTG TCC ATG ATG CAA GCA GAC CCT 3215 Giy Ser Thr Thr Leu Giy Asn Leu Leu Ser Mec Mec Gin Ala Asp Pro 1060 1065 1070 \5 GCT GCC GAG AGT AGC GCT TTA TTG GCA TCA GTA GCC CAA AAC TTA AGT 3264 Ala Ala Glu Ser Ser Ala Leu Leu Ala Ser Val Ala Gin Asn Leu 
Ser
        1075                1080                1085

GCC GCA ATC AGC AAT CGT CAG TAA      3288
Ala Ala Ile Ser Asn Arg Gln ***
    1090                1095

(2) INFORMATION FOR SEQ ID NO: 34:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 1095 amino acids
        (B) TYPE: amino acid
        (C) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

    Features
    From   To    Description
    254    267   SEQ ID NO: 15
    254    492   TcaAii peptide

Met Val Thr Val Met Gln Asn Lys Ile Ser Phe Leu Ser Gly Thr Ser
1               5                   10                  15
Glu Gln Pro Leu Leu Asp Ala Gly Tyr Gln Asn Val Phe Asp Ile Ala
            20                  25                  30
Ser Ile Ser Arg Ala Thr Phe Val Gln Ser Val Pro Thr Leu Pro Val
        35                  40                  45
Lys Glu Ala His Thr Val Tyr Arg Gln Ala Arg Gln Arg Ala Glu Asn
    50                  55                  60
Leu Lys Ser Leu Tyr Arg Ala Trp Gln Leu Arg Gln Glu Pro Val Ile
65                  70                  75                  80
Lys Gly Leu Ala Lys Leu Asn Leu Gln Ser Asn Val Ser Val Leu Gln
                85                  90                  95
Asp Ala Leu Val Glu Asn Ile Gly Gly Asp Gly Asp Phe Ser Asp Leu
            100                 105                 110
Met Asn Arg Ala Ser Gln Tyr Ala Asp Ala Ala Ser Ile Gln Ser Leu
        115                 120                 125
Phe Ser Pro Gly Arg Tyr Ala Ser Ala Leu Tyr Arg Val Ala Lys Asp
    130                 135                 140
Leu His Lys Ser Asp Ser Ser Leu His Ile Asp Asn Arg Arg Ala Asp
145                 150                 155                 160
Leu Lys Asp Leu Ile Leu Ser Glu Thr Thr Met Asn Lys Glu Val Thr
                165                 170                 175
Ser Leu Asp Ile Leu Leu Asp Val Leu Gln Lys Gly Gly Lys Asp Ile
            180                 185                 190
Thr Glu Leu Ser Gly Ala Phe Phe Pro Met Thr Leu Pro Tyr Asp Asp
        195                 200                 205
His Leu Ser Gln Ile Asp Ser Ala Leu Ser Ala Gln Ala Arg Thr Leu
    210                 215                 220
Asn Gly Val Trp Asn Thr Leu Thr Asp Thr Thr Ala Gln Ala Val Ser
225                 230                 235                 240
Glu Gln Thr Ser Asn Thr Asn Thr Arg Lys Leu Phe Ala Ala Gln Asp
                245                 250                 255
Gly Asn Gln Asp Thr Phe Phe Ser Gly Asn Thr Phe Tyr Phe Lys Ala
            260                 265                 270
Val Gly Phe Ser Gly Gln Pro Met Val Tyr Leu Ser Gln Tyr Thr Ser
        275                 280                 285
Gly Asn Gly Ile Val Gly Ala Gln Leu Ile Ala Gly Asn Pro Asp Gln
    290                 295                 300
Ala Ala Ala Ala Ile Val Ala Pro Leu Lys Leu Thr Trp Ser Met Ala
305                 310                 315                 320
Lys Gln Cys Tyr Tyr Leu Val Ala Pro Asp Gly Thr Thr Met Gly Asp
                325                 330                 335
Gly Asn Val Leu Thr Gly .Cys Phe Leu Arg Gly Asn Ser Pro Thr Asn -174- SUBSTTTUTE SHEET (RULE 26) 340 345 350 Pro Asp Lys Asp Gly lis Phe Ala Gin Val Ala Asn Lys Ser Gly Ser 355 350 365 Thr Gin Pro Leu' Pro Ser Phe His Leu Pro Val Thr Leu Glu His Ser 370 375 330 Glu Asn Lys Asp Gin T/r Tyr Leu Lys Thr Glu Gin Gly T/r lie Thr 335 390 395 400 Val Asp Ser Ser Gly Gin Ser Asn Trp Lys Asn Ala Leu Val Tie Asn 405 410 415 Gly Thr Lys Asp Lys Gly Leu Leu Leu Thr Phe Cys Ser Asp Ser Ser 420 425 .430 Gly Thr Pro Thr Asn Pro Asp Asp Val He Pro Pro Ala lie Asn Asp 435 440 445 lie Pro Ser Pro Pro Ala Arg Glu Thr Leu Ser Leu Thr Pro Val Ser 450 455 460 T/r Gin Leu Mec Thr Asn Pro Ala Pro Thr Glu Asp Asp lie Thr Asn 465 470 475 430 His Tyr Gly Phe Asn Gly Ala Ser Leu Arg Ala Ser Pro Leu Ser Thr 435 490 W4 » 495 Ser Glu Leu Thr Ser Lys Leu Asn Ser lie Asp Thr Phe Cys Glu Lys 500 505 510 Thr Arg Leu Ser Phe Asn Gin Leu Mec Asp -Leu Thr Ala Gin Gin Ser 515 520 525 T/r Ser Gin Ser Ser He Asp Ala Lys Ala Ala Ser Arg T/r Val Arg 530 535 540 Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tyr Gly Ala Ala T/r 545 550 555 560 Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gin Tyr Leu Trp He 565 570 575 Gin Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp Thr Val Val Ala 580 535 590 Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser Ser Gin Thr Gly 595 600 605 Leu Ser Phe Glu Glu Leu Asp Trp Leu He Ala Asn Ala Ser Arg Ser 610 615 620 Val Pro Asp His His Asp Lys He Val Leu Asp Lys Pro Val Leu Glu 625 530 635 640 Ala Leu Ala Glu T/r Val Ser Leu Lys Gin Arg Tyr Gly Leu Asp Ala 645 550 555 Asn Thr Phe Ala Thr Phe He Ser Ala Val Asn Pro Tyr Thr Pro Asp 650 665 570 Gin Thr Pro Ser Phe T/r Glu Thr Ala Phe Arg Ser Ala Asp Gly Asn 575 630 635 His Val He Ala Leu Gly Thr Glu Val Lys T/r Ala Glu Asn Glu Gin -175- 590 372 G0 Asp GLu Leu Ala Ala lie Cys C s Lys Ala Lau iy V i Thr Ser Asp 705 710 "15 Glu Leu Leu Arg He Gly Arg Tyr Cys Phe ly Asn Ala Giy Ser Phe 725 730 735 hr Leu Asp Glu Tyr Thr Ala Ser Gin Leu Tvr Arg 
Phe Gly 7-10 745 750 Pro Arg Leu Phe Gly Leu Thr Phe Ala Gin Ala Glu lie Lau Trp Ai •g 755 750 755 Lau Mec Glu Giy Gly Lys Asp He Leu Lau Gin Gin Leu Gly Gin Ala 770 775 730 Lys Ser Lau Gin Pro Leu Ala He Leu Arg Arg Thr Glu Gin Val Leu 785 790 795 300 Asp Trp Mec Ser Ser Val Asn Leu Ser Leu Thr Tyr Leu Gin Gly Mec 305 810 315 Val Ser Thr Gin Trp Ser Gly Thr Ala Thr Ala Glu Mec Phe Asn Phe 320 825 330 Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gin Ala Ala Thr Lys Glu 335 840 345 Thr Mec Asp Ser Ala Leu Gin Gin Lys Val Leu Arg Ala Leu Ser Ala 850 855 350 Gly Phe Gly He Lys Ser Asn Val Mec Gly He Val Thr Phe Trp Leu 855 370 875 330 Glu Lys lie Thr He Gly Ser Asp Asn Pro Phe Thr Leu Ala Asn Tyr 885 890 395 Trp His Asp Ha Gin Thr Lau Phe Ser His Asp Asn Ala Thr Leu Glu 900 905 910 Ser Leu Gin Thr Asp Thr Ser Lau Val He Ala Thr Gin Gin Lau Ser 915 920 925 Gin Leu Val Leu Ila Val Lys Trp Leu Ser Leu Thr Glu Gin Asp Lau 930 935 940 Gin Leu Lau Thr Thr Tyr Pro Glu Arg Leu He Asn Gly He Thr Asn 945 950 955 960 Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu Ser Arg Phe Lys 965 970 975 Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp Glu Ala Mec Arg Cys 980 985 990 Phe Asp Gin Leu Asn Ala Asn Asp Mec Thr Thr Glu Asn Ala Gly Ser 995 1000 1005 Leu Ila Ala Thr Leu Tyr Glu Mec Asp Lys Gly Thr Giy Ala Gin Val 1010 1015 1020 Asn Thr Lau Leu Leu Gly Glu Asn Asn Trp Pro Lys Ser Phe Thr Sar 1025 1030 1035 1040 Leu Trp Gin Leu Leu Thr Trp Leu Arg Val Gly Gin Arg Lau Asn Val -176- :0 5 1050 Ser Thr Thr Leu Gly Asn Leu Leu Ser Mec Mec Gin Ala r ro 1060 i C 65 LO-'' Ala Ala Glu Ser Ser Ala Leu Lsu Ala Ser Val Ala Gin Asn Leu e^ 1075 1080 1035 Ala Ala lie Ser Asn Arg Gin 1090 1095 (2) INFORMATION FOR 3EQ ID NO : 35 (l) SEQUENCE CHARACTERISTICS: (A) LENGTH: 603 ammo acids (B) TYPE: amino acid '(C) TOPOLOGY : linear ii) MOLECULE TYPE : procein (xi) SEQUENCE DESCRIPTION : SEQ ID NO : 35 : Pro Leu Ser Thr Ser Glu Leu Thr Ser Lys Leu Asn Ser lie Asp Thr 1 5 10 15 Phe Cys Glu Lys Thr Arg Lau Ser Phe 
Asn Gin Leu Mec Asp Leu Thr 20 25 30 Ala Gin Gin Ser Tyr Ser Gin Ser Ser lie Asp Ala Lys Ala Ala Ser 35 40 45 Arg Tyr Val Arg Phe Gly Glu Thr Thr Pro Thr Arg Val Asn Val Tvr 50 55 50 Gly Ala Ala Tyr Leu Asn Ser Thr Leu Ala Asp Ala Ala Asp Gly Gin 65 70 75 30 Tyr Leu Trp lie Gin Thr Asp Gly Lys Ser Leu Asn Phe Thr Asp Asp 35 90 95 Thr Val Val Ala Leu Ala Gly Arg Ala Glu Lys Leu Val Arg Leu Ser 100 105 110 Ser Gin Thr Gly Leu Ser Phe Glu Glu Leu Asp Trp Leu lie Ala Asn 115 120 125 Ala Ser Arg Ser Val Pro Asp His His Asp Lys lie Val Leu Asp Lys 130 135 140 Pro Val Leu Glu Ala Leu Ala Glu Tyr Val Ser Leu Lys Gin Arg Tyr 145 150 155 150 Gly Leu Asp Ala Asn Thr Phe Ala Thr Phe lie Ser Ala Val Asn Pro 105 170 175 Tyr Thr Pro Asp Gin Thr Pro Ser Phe Tyr Glu Thr Ala Phe Arg Ser 130 135 190 Ala Asp Gly Asn His Val lie Ala Leu Gly Thr Glu Val Lys Tyr Ala 195 200 205 Glu Asn Glu Gin Asp Glu Leu Ala Ala lie Cys cys Lys Ala Leu Giy 210 215 220 -177- SUBSTmJTESHEET(RULE26 Val Thr isc Asp Glu Leu Leu Arg Ha Giy Arg Tyr Cys Pha Giy Asn 225 -30 235 240 Ala Giy Arg ?he Thr Leu Asp Giu Tyr Thr Ala Ser Gin Leu Tyr Arg 245 250 255 Phe Giy Ala lie Pro Arg Leu Pha Giy Lau Thr Phe Ala Gin Ala Giu 250 255 270 I la Leu Trp Arg Lau Mec Glu Giy Giy Lys Asp I la Lau Leu Gin Gin 275 230 285 :xx Giy Gin Ala Lys Ser Leu Gin Pro Leu Ala lie Leu Arg Arg Thr 290 295 300 Glu Gin Val Leu Asp Trp Mec Ser Pro Val Asn Leu Ser Leu Thr Tyr 305 310 315 320 Leu Gin Giy Mec Val Ser Thr Gin Trp Ser Giy Thr Ala Thr Ala Glu 325 330 335 Mec Phe Asn Phe Leu Glu Asn Val Cys Asp Ser Val Asn Ser Gin Ala 340 345 350 Xxx Thr Lys Glu Thr Mec Asp Ser Ala Leu Gin Gin Lys Val Leu Arg 355 360 365 Ala Leu Ser Ala Giy Phe Giy lie Lys Ser Asn Val Mec Giy lie Val 370 375 330 Thr Phe Trp Leu Glu Lys lie Thr lie Giy Arg Asp Asn Pro Phe Thr 33 390 395 400 Leu Ala Asn Tyr Trp His Asp He Gin Thr Leu Phe Ser His Asp Asn 405 410 415 Ala Thr Leu Glu Ser Leu Gin Thr Asp Thr Ser Leu Val He Ala Thr 420 425 430 Gin Gin Leu Ser Gin Leu Val 
Leu lie Val Lys Trp Val Ser Leu Thr 435 440 445 Glu Gin Asp Leu Gin Leu Leu Thr Thr Tyr Pro Glu Arg Leu lie Asn 450 455 460 Giy He Thr Asn Val Pro Val Pro Asn Pro Glu Leu Leu Leu Thr Leu 465 470 475 430 Ser Arg Phe Lys Gin Trp Glu Thr Gin Val Thr Val Ser Arg Asp Glu 485 490 495 Ala Mec Arg Cys Phe Asp Gin Leu Asn Ala Asn Asp Mec Thr Thr Glu 500 505 510 Asn Ala Giy Ser Leu He Ala Thr Leu Tyr Glu Mec Asp Lys Giy Thr 515 520 525 Giy Ala Gin Val Asn Thr Leu Leu Leu Giy Glu Asn Asn Trp Pro Lys 530 535 540 Ser Phe Thr Ser Leu Trp Gin Leu Leu Thr Trp Lsu Arg Val Giy Gin 545 550 555 560 Arg Leu Asn Val Giy Ser Thr Thr Leu Giy Asn Leu Leu Ser Mec Mec 565 570 575 -178- SUBSTmjTE SHEET(RULE 26) Gin Ala Asp Pro Ala Ala Giu er ≤er Ala L=u Leu Ala Sar Val Aia 530 5 a 5 590 Gin Asn Leu Ser Ala Ala lie Ser Asn Arg Gin ' 595 500 !2) INFORMATION FOR SEQ ID MO : 36.: ( l ) SEQUENCE CHARACTERISTICS : (A) LENGTH: 2557 base pairs (Bt TYPE: nucleic acid (C) TOPOLOGY: linear (ii) MOLECULE TYPE : DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: GAATTCGGCT TGCGTTTAAT ATTGATGATG TCTCGCTCTT CCGCCTGCTT AAAATTACCG 60 ACCATGA AA TAAAGATGGA AAAATTAAAA ATAACCTAAA GAATCTTTCC AATTTATATA 120 TTGGAAAATT ACTGGCAGAT ATTCATCAAT TAACCATTGA TGAACTGGAT TTATTACTGA 130 TTGCCGTAGG TGAAGGAAAA ACTAATTTAT CCGCTATCAG TGATAAGCAA TTGGCTACCC 2 40 TGATCAGAAA ACTCAATACT ATTACCAGCT GGCTACATAC ACAGAAGTGG AGTGTATTCC 3 00 AGCTATTTAT CATGACCTCC ACCAGCTATA ACAAAACGCT AACGCCTGAA ATTAAGAATT 3 60 TGCTGGATAC CGTCTACCAC GGTTTACAAG GTTTTGATAA AGACAAAGCA GATTTGCTAC 420 ATGTCATGGC CCCCTATATT GCGGCCACCT TGCAATTATC ATCGGAAAAT GTCGCCCACT 480 CGGTACTCCT TTGGGCAGAT AAGTTACAGC CCGGCGACGG CGCAATGACA GCAGAGGGA 54 0 TCTGGGACTG GTTGAATACT AAGTATACGC CGGGTTCATC GGAAGCCGTA GAAACGCAGG 600 AACATATCGT TCAGTATTGT CAGGCTCTGG CACAATTGGA AATGGTTTAC CATTCCACCG 660 GCATCAACGA AAACGCCTTC CGTCTATTTG TGACAAAACC AGAGATGTTT GGCGCTGCAA 720 CTGGAGCAGC GCCCGCGCAT GATGCCCTTT CACTGATTAT GCTGACACGT TTTGCGGATT 730 GGGTGAACGC ACTAGGCGAA AAAGCGTCCT CGGTGCTAGC 
GGCATTTGAA GCTAACTCGT 340 TAACGGCAGA ACAACTGGCT GATGCCATGA ATCTTGATGC TAATTTGCTG TTGCAAGCCA 900 GTATTCAAGC ACAAAATCAT CAACATCTTC CCCCAGTAAC TCCAGAAAAT GCGTTCTCCT 9 60 GTTGGACATC TATCAATACT ATCCTGCAAT GGGTTAATGT CGCACAACAA TTGAAATGTC 1020 GCCCCACAGG GCGTTTCCGC TTTGGTCGGG CTGGATTATA TTCAATCAAT GAAAGAGACA 1030 CCGACCTATG CCCAGTGGGA AAACGCGGCA GGCGTATTAA CCGCCGGGTT GAATTCAACA 1 140 ACAGGCTAAT ACATTACAAC GCTTTTCTGG ATGAATCTCG CAGTGCCCCA TTAAGCACCT 1200 ACTATATCCG TCAAGTCGCC AAGGCAGCGG CGGCTATTAA AAGCCGTGAT GACTTGTATC 1260 AATACTTACT GATTGATAAT CAGGTTTCTG CGGCAATAAA AACCACCCGG ATCGCCGAAG 13 20 CCATTGCCAG TATTCAACTG TACGTCAACC GGGCATTGGA AAATGTGGAA GAAAATGCCA 13 30 ATTCGGGGGT TATCAGCCGC CAATTCTTTA TCGACTGGGA CAAATACAAT AAACGCTACA 1440 GCACTTGGGC GGGTGTTTCT CAATTAGTTT ACTACCCGGA AAACTATATT GATCCGACCA 1500 TGCGTATCGG ACAAACCAAA ATGATGGACG CATTACTGCA ATCCGTCAGC CAAAGCCAAT 15 60 TAAACGCCGA TACCGTCGAA GATGCCTTTA TGTCTTATCT GACATCGTTT GAACAAGTGG 1 620 CTAATCTTAA AGTTATTAGC GCATATCACG ATAATATTAA TAACGATCAA GGGCTGACCT 1 630 ATTTTATCGG ACTCAGTGAA ACTGATGCCG GTGAATATTA TTGGCGCAGT GTCGATCACA 40 GTAAATTCAA CGACGGTAAA TTCGCGGCTA ATGCCTGGAG TGAATGGCAT AAAATTGATT 1 300 GTCCAATTAA CCCTTATAAA AGCACTATCC GTCCAGTGAT ATATAAATCC CGCCTGTATC 1 360 -179- AACΑΑΛΑ-G GAGATC CCA AACAGACrtuC AAATAG AA JT. 
X '„ ~- ·> AAACTGAAAC GGATTATCGT TATGAAC7AA AATTGGCCCA TATC GCTAT 1330 GGAATACGCC AATCACCTTT GATGTCAATA AAAAAATATC CGAGCTAAAA CTCGAAAAAA 2040 ATAGAGCGCC CGGACTCTAT ATCAAGGTGA AGATACGTTG CTGGTGATGT 2100 TTTATAACCA ACAAGACACA CTAGATAGTT ATAAAAACCC TTCAATGCAA GGACTATATA 2150 TCTTTCCTGA TATGGCATCC AAAGATATGA CCCCAGAACA GAGCAATGTT TATCGGGATA 2220 ATAGCTATCA ACAATTTGAT ACCAATAATG TCAGAAGAGT GAATAACCGC TATGCAGAGG 2230 ATTATGAGAT TCCTTCTTCG GTAAGTAGCC GTAAAGACTA TGGTTGGGGA GATTATTACC 2340 TCAGCATGGT ATATAACGGA GATATTCCAA CTATCAATTA CAAAGCCGCA TCAAGTGATT 2400 TAAAAATTTA TATTTCACCA AAATTAAGAA TTATTCATAA TGGATATGAA VJACAGAAGC 2460 GCAATCAATG CAATTTGATG AATAAATATG GCAAACTAGG TGATAAATTT ATTGTGTATA 2520 CCAGCCTGGG CGTTAATCCG AATAATAAGC CGAATTC 2557 INFORMATION FOR SEQ ID NO: 37: (i) SEQUENCE CHARACTERISTICS (A) LENGTH: 345 amino (B) TYPE: amino acids (C) TOPOLOGY: linear (ii) MOLECULE TYPE: procein (parcial) (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 Ala Phe Asn lie Asp Asp Val Ser Leu Phe Arg Leu Leu Lys lie Thr 1 5 10 15 Asp His Asp Asn Lys Asp Gly Lys He Lys Asn Asn Leu Lys Asn Leu 20 25 30 Ser Asn Leu Tyr He Gly Lys Leu Leu Ala Asp lie His Gin Leu Thr 35 40 45 Asp Glu Leu Asp Leu Leu Leu lie Ala Val Gly Glu Gly Lys Thr 50 55 60 Asn Leu Ser Ala Ila Ser Asp Lys Gin Leu Ala Thr Leu lie Arg Lys 05 70 75 30 Leu Asn Thr Ila Thr Ser Trp Leu His Thr Gin Lys Trp Ser Val Phe 85 90 95 Gin Leu Phe lie Mec Thr Ser Thr Ser Tyr Asn Lys Thr Lau Thr Pro 100 105 110 Glu He Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gin Gly Phe 115 120 125 Asp Lys Asp Lys Ala Asp Leu Leu His Val Mec Ala Pro Tyr He Ala 130 135 140 Ala Thr Leu Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Leu Leu 145 150 155 150 Trp Ala Asp Lys Leu Gin Pro Gly Asp Gly Ala Mec Thr Ala Glu Gly 165 - 170 175 Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu Ala -180- SUBSTUUTE SHEET (RULE 26) i30 Val Glu Thr Gin Glu His He V l in Ήη Ala -eu Ala 195 200 205 Lau Glu Mec Val Tyr His Ser hr Gly lie Asn Giu Asn Ala ?he 210 .215 
220 Lau Pha Val Thr Lys Pro Glu Mec ?he Gly Ala Ala Thr Gly Ala Aia 225 230 235 240 Pro Ala His Asp Ala Leu Ser Leu lie Mec Leu Thr Arg Phe Aia ASD 245 250 235 Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Vai Leu Ala Aia Phe 260 265 270 Glu Ala Asn Ser Lau Thr Ala Glu Gin Leu Ala Asp Ala Mec Asn Leu 275 280 235 Asp Ala Asn Leu Leu Leu Gin Ala Ser He Gin Ala Gin Asn His Gin 290 295 300 His Lau Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr Ser 305 310 315 320 lie Asn Thr He Leu Gin Trp Val Asn Val Ala Gin Gin Leu Lys Cys 325 330 335 Arg Pro Thr Gly Arg Phe Arg Phe Gly Arg Ala Gly Leu Tyr Ser lie 340 345 350 Asn Glu Arg Asp Thr Asp Lau Cys Pro Val Gly Lys Arg Gly Arg Arg 355 360 365 Ila Asn Arg Arg Val Glu Phe Asn Asn Arg Leu He His Tyr Asn Ala 370 375 380 Pha Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr He Arg 385 390 395 400 Gin Val Ala Lys Ala Ala Ala Ala He Lys Ser Arg. Asp Asp Lau Tyr 405 410 415 Gin Tyr Leu Leu He Asp Asn Gin Val Ser Ala Ala He Lys Thr Thr 420 425 430 Arg Ha Ala Glu Ala Ila Ala Sar He Gin Lau Tyr Val Asn Arg Ala 435 440 445 Lau Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val He Ser Arg Gin 450 455 460 Pha Phe He Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala 465 470 475 480 Gly Val Ser Gin Leu Val Tyr Tyr Pro Glu Asn Tyr He Asp Pro Thr 485 490 495 Mec Arg Ila Gly Gin Thr Lys Mec Mec Asp Ala Leu Leu cln Ser Val 500 505 510 Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Pha Mec Ser 515 520 525 Tyr Leu Thr Ser Phe Glu Gl Val Ala Asn Leu Lys Val He Ser Aia -181- 530 535 540 Tyr His Asp Asn Ila Asn Asn Asp Gin Gly Leu Thr T/r Phe lie Gly 545 550 555 550 Lau Ser Glu Thr Asp Ala Gly Glu Tyr T/r Trp Arg Ser Val Asp His 565 570 575 Ser Lys Phe Asn Asp Gly Lys Phe Ala Aia Asn Ala Trp Ser i Trp 530 585 550 His Lys lie Asp Cys Pro lie Asn Pro T/r Lys Ser Thr lie Arg 595 600 605 Val He Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin Lys 510 615 520 He Thr Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Giu Thr 625 530 635 540 Asp Tyr Arg 
Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr
    645                 650                 655
Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu
            660                 665                 670
Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln
        675                 680                 685
Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu
    690                 695                 700
Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp
705                 710                 715                 720
Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp
                725                 730                 735
Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn
            740                 745                 750
Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys
        755                 760                 765
Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp
    770                 775                 780
Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr
785                 790                 795                 800
Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys
                805                 810                 815
Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys
            820                 825                 830
Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn
        835                 840                 845

(2) INFORMATION FOR SEQ ID NO: 38:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 15 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly
1               5                   10                  15
Lys

(2) INFORMATION FOR SEQ ID NO: 39:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 20 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala
1               5                   10                  15
Ile Ser Pro Ala Lys

(2) INFORMATION FOR SEQ ID NO: 40:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 11 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Ala Asn Ser Leu Tyr Ala Leu Phe Leu Pro Gln
1               5                   10

(2) INFORMATION FOR SEQ ID NO:
41:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 14 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 42:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 19 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu Ala Glu Val Tyr
1               5                   10                  15
Ala Gly Leu Glu

(2) INFORMATION FOR SEQ ID NO: 43:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 11 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Ile Arg Glu Asp Tyr Pro Ala Ser Leu Gly Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 44:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 16 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Asp Asp Ser Gly Asp Asp Asp Lys Val Thr Asn Thr Asp Ile His
1               5                   10                  15
Arg

(2) INFORMATION FOR SEQ ID NO: 45:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 13 amino acids
        (B) TYPE: amino acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (v) FRAGMENT TYPE: N-terminal
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Asp Val Xaa Gly Ser Glu Lys Ala Asn Glu Lys Leu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 46:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 7551 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: double
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: DNA (genomic)
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46 (tcdA):

ATG AAC GAG TCT GTA AAA GAG ATA CCT GAT GTA TTA AAA AGC CAG TGT      48
Met Asn Glu Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser
Gin Cys 1 5 10 15 GGT TTT AAT TGT CTG ACA GAT ATT AGC CAC AGC TCT TTT AAT GAA TTT 96 Gly Phe Asn Cys Leu Thr Asp lie Ser His Ser Ser Phe Asn Glu Phe 20 25 30 CGC CAG CAA GTA TCT GAG CAC CTC TCC TGG TCC GAA ACA CAC GAC TTA 1 4 Arg Gin Gin Vai Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu 35 40 45 TAT CAT GAT GCA CAA CAG GCA CAA AAG GAT AAT CGC CTG TAT GAA GCG 192 Tyr His Asp Ala Gin Gin Ala Gin Lys Asp Asn Arg Leu Tyr Glu Ala 50 55 60 CGT ATT CTC AAA CGC GCC AAT CCC CAA TTA CAA AAT GCG GTG CAT CTT 240 Arg lie Leu Lys Arg Ala Asn Pro Gin Leu Gin Asn Ala Vai His Leu 70 75 30 -185- SUBSTTTUTE SHEET (RULE 26) GCC ATT CTC CCT CCC AAT CCT CAA CTG ATA GCC TAT AAC AAT CAA TTT 2 S3 Aia list Leu Ala Pro Asn Aia Glu Leu lie Gl T r Asn Asn Gin Phe 35 S O 95 AGC GGT AGA GCC AGT CAA TAT GTT GCG CCG GGT ACC GTT TCT TCC ATG 3 3 6 Ser Gly Arg Ala Ser Gin Tyr Val Ala Pre Gly Thr Val Ser Ser Mec 100 105 n o TTC TCC CCC GCC GCT TAT TTG ACT GAA CTT TAT CGT GAA GCA CGC AAT 3 34 Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg -sn 1 15 120 125 TTA CAC GCA AGT GAC TCC GTT TAT TAT CTG GAT ACC CGC CGC CCA GAT 4 3 2 Leu His Aia Ser Asp Ser Val Tyr Tyr Leu ASD Thr Arg Arg Pro Asp 13 0 13 5 1 0 CTC AAA TCA ATG GCG CTC AGT CAG CAA AAT ATG GAT ATA GAA TTA TCC 430 Leu Lys Ser Mac Ala Leu Ser Gin Gin Asn Mec Asp He Glu Leu Ser 14 5 150 155 100 ACA CTC TCT TTG TCC AAT GAG CTG TTA TTG GAA AGC ATT AAA ACT GAA Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser lie Lys Thr Glu 165 170 175 TCT AAA CTG GAA AAC TAT ACT AAA GTG ATG GAA ATG CTC TCC ACT TTC 575 Ser Lys Leu Glu Asn Tyr Thr Lys Val Mec Glu Mec Leu Ser Thr Phe 130 135 190 CGT CCT TCC GCC GCA ACG CCT TAT CAT GAT GCT TAT GAA AAT GTG CGT 524 Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg 195 200 205 GAA GTT ATC CAG CTA CAA GAT CCT GGA CTT GAG CAA CTC AAT GCA TCA 572 Glu Val lie Gin Leu Gin Asp Pro Gly Leu Glu Gin Leu Asn Ala Ser 2 10 215 220 CCG GCA ATT GCC GGG TTG ATG CAT CAA GCC TCC CTA TTG GGT ATT AAC 720 Pro Ala lie 
Ala Gly Leu Mec His Gin Ala Ser Leu Leu Gly lie Asn 225 230 23 5 240 GCT TCA ATC TCG CCT GAG CTA TTT AAT ATT CTG ACG GAG GAG ATT ACC 753 Ala Ser He Ser Pro Glu Leu Phe Asn He Leu Thr Glu Glu He Thr 245 250 255 GAA GGT AAT GCT GAG GAA CTT TAT AAG AAA AAT TTT GGT AAT ATC GAA 3 15 Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn He Glu 260 265 27 0 CCG GCC TCA TTG GCT ATG CCG GAA TAC CTT AAA CGT TAT TAT AAT TTA 364 Pro Ala Ser Leu Ala Mec Pro Glu Tyr Leu Lys Arg Tyr Tyr Asn Leu 275 230 235 AGC GAT GAA GAA CTT AGT CAG TTT ATT GGT AAA GCC AGC AAT TTT GGT 9 12 Ser Asp Glu Glu Leu Ser Gin Phe He Gly Lys Ala Ser Asn Phe Gly 290 295 3 00 CAA CAG GAA TAT AGT AAT AAC CAA CTT ATT ACT CCG GTA GTC AAC AGC 950 Gin Gin Glu Tyr Ser Asn Asn Gin Leu He Thr Pro Val Val Asn Ser 3 05 3 10 3 15 3 20 AGT GAT GGC ACG GTT AAG GTA TAT CGG ATC ACC CGC GAA TAT ACA ACC 1003 Ser Asp Gly Thr Val Lys Val Tyr Arg He Thr Arg Glu Tyr Thr Thr 3 25 3 3 0 3 3 5 .AAT GCT TAT CAA ATG GAT GTG GAG CTA TTT CCC TTC GGT CGT GAG AAT 1 035 Asn Ala Tyr Gin Mec Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn 3 40 3 45 3 50 TAT CGG TTA GAT TAT AAA TTC AAA AAT TTT TAT AAT GCC TCT TAT TTA 1 104 Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu 3 55 3 50 3 65 - 186 - TCC ATC AAG TTA AAT GAT .AAA AGA GAA CTT GTT CGA ACT GAA GGC GCT 1152 jar Ila Lys Lau Asn Asp Lys Arg Giu Lau Val Arg Thr Glu Gly Aid 370 375 330 CCT CAA GTC AAT ATA GAA TAC TCC GCA .AAT ATC ACA TTA AAT ACC GCT 1200 Pro Gin Val Asn Ila Glu Tyr Ser Ala Asn lie Thr Lau Asn Thr Ala 335 390 395 400 GAT ATC ACT CAA CCT TTT GAA ATT GGC CTG ACA CGA GTA CTT CCT TCC 1243 Asp lie Ser Gin Pro Pna Glu Ila Gly Leu Thr Arg Val Lau Pro Ser 405 410 415 GGT TCT TGG GCA TAT GCC GCC GCA AAA TTT ACC GTT GAA GAG TAT AAC 1295 Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn 420 425 430 CAA TAC TCT TTT CTG CTA AAA CTT AAC AAG GCT ATT CGT CTA TCA CGT 1344 Gin Tyr Ser Pha Lau Lau Lys Leu Asn Lys Ala lie Arg Lau Ser Arg 435 440 445 GCG ACA GAA TTG 
TCA CCC ACG ATT CTG GAA GGC ATT GTG CGC ACT GTT 1392 Ala Thr Glu Lau Ser Pro Thr Ila Lau Glu Gly lie Val Arg Sar Val 450 455 460 AAT CTA CAA CTG GAT ATC AAC ACA GAC GTA TTA GGT AAA GTT TTT CTG 1440 Asn Lau Gin Leu Asp Ila Asn Thr Asp Val Leu Gly Lys Val Phe Leu 455 470 475 430 ACT AAA TAT TAT ATG CAG CGT TAT GCT ATT CAT GCT GAA ACT GCC CTG 1433 Thr Lys Tyr Tyr Mec Gin Arg Tyr Ala Ila His Ala Glu Thr Ala Leu 435 490 495 ATA CTA TGC AAC GCG CCT ATT TCA CAA CGT TCA TAT GAT AAT CAA CCT 1536 Ila Leu Cys Asn Ala Pro lie Sar Gin Arg Sar Tyr Asp Asn Gin Pro 500 505 510 AGC CAA TTT GAT CGC CTG TTT AAT ACG CCA TTA CTG AAC GGA CAA TAT 1534 Ser Gin Phe Asp Arg Lau Phe Asn Thr Pro Leu Leu Asn Gly Gin Tyr 515 520 525 TTT TCT ACC GGC GAT GAG GAG ATT GAT TTA AAT TCA GGT AGC ACC GGC 1632 Phe Sar Thr Gly Asp Glu Glu Ila Asp Leu Asn Ser Gly Ser Thr Gly 530 535 540 GAT TGG CGA AAA ACC ATA CTT AAG CGT GCA TTT AAT ATT GAT GAT GTC 1630 Asp Trp Arg Lys Thr Ila Lau Lys Arg Ala Pha Asn Ila Asp Asp Val 545 550 555 560 TCG CTC TTC CGC CTG CTT AAA ATT ACC GAC CAT GAT AAT AAA GAT GGA 1723 Ser Leu Pha Arg Lau Leu Lys Ila Thr Asp His Asp Asn Lys Asp Gly 565 570 575 AAA ATT AAA AAT AAC CTA AAG AAT CTT TCC AAT TTA TAT ATT GGA AAA 1776 Lys lie Lys Asn Asn Leu Lys Asn Lau Sar Asn Lau Tyr Ila Gly Lys ' 530 535 590 TTA CTG GCA GAT ATT CAT CAA TTA ACC ATT GAT GAA CTG GAT TTA TTA 1324 Lau Lau Ala Asp lie His Gin Leu Thr lie Asp Glu Leu Asp Leu Lau 595 500 605 CTG AT GCC GTA GGT GAA GGA AAA ACT AAT TTA TCC GCT ATC AGT GAT L3 Leu lie Ala Val Gly Glu Gly Lys Thr Asn Lau Ser Ala lie Sar Asp 510 515 620 .AAG CAA TTG GCT ACC CTG ATC AGA AAA CTC AAT ACT ATT ACC AGC TGG 1920 Lys Gin Leu Ala Thr Lau lie Arg Lys Leu Asn Thr lie Thr Ser Trp 525 530 635 640 CTA C T ACA CAG AAG TGG AGT GTA TTC CAG CTA TTT ATC ATG ACC TCC 1953 Leu His Thr Gin Lys Trp Ser Val Phe Gin Lau Phe lie Mac Thr sar -L87- 550 ACC ACC TAT AAC AAA ACC CTA ACG CCT GAA ATT AAG AAT TTG CTG GAT 20 Li Thr. 
Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp 660 665 670
ACC GTC TAC CAC GGT TTA CAA GGT TTT GAT AAA GAC AAA GCA GAT TTG 2064
Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu 675 680 685
CTA CAT GTC ATG GCG CCC TAT ATT GCG GCC ACC TTG CAA TTA TCA TCG 2112
Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser 690 695 700
GAA AAT GTC GCC CAC TCG GTA CTC CTT TGG GCA GAT AAG TTA CAG CCC 2160
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro 705 710 715 720
GGC GAC GGC GCA ATG ACA GCA GAA AAA TTC TGG GAC TGG TTG AAT ACT 2208
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 725 730 735
AAG TAT ACG CCG GGT TCA TCG GAA GCC GTA GAA ACG CAG GAA CAT ATC 2256
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile 740 745 750
CTT CAG TAT TGT CAG GCT CTG GCA CAA TTG GAA ATG GTT TAC CAT TCC 2304
Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser 755 760 765
ACC GGC ATC AAC GAA AAC GCC TTC CGT CTA TTT GTC ACA AAA CCA GAG 2352
Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 770 775 780
ATG TTT GGC GCT GCA ACT GGA GCA GCG CCC GCG CAT GAT GCC CTT TCA 2400
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser 785 790 795 800
CTG ATT ATG CTG ACA CGT TTT GCG GAT TGG GTG AAC GCA CTA GGC GAA 2448
Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 805 810 815
AAA GCG TCC TCG GTG CTA GCG GCA TTT GAA GCT AAC TCG TTA ACG GCA 2496
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 820 825 830
GAA CAA CTG GCT GAT GCC ATG AAT CTT GAT GCT AAT TTG CTG TTG CAA 2544
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln 835 840 845
GCC AGT ATT CAA GCA CAA AAT CAT CAA CAT CTT CCC CCA GTA ACT CCA 2592
Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro 850 855 860
GAA AAT GCG TTC TCC TGT TGG ACA TCT ATC AAT ACT ATC CTG CAA TGG 2640
Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp 865 870 875 880
GTT AAT GTC GCA CAA CAA TTG AAT GTC GCC
CCA CAG GGC GTT TCC GCT 2688
Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala 885 890 895
TTG GTC GGG CTG GAT TAT ATT CAA TCA ATG AAA GAG ACA CCG ACC TAT 2736
Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr 900 905 910
GCC CAG TGG GAA AAC GCG GCA GGC GTA TTA ACC GCC GGG TTG AAT TCA 2784
Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 915 920 925
CAA CAG GCT AAT ACA TTA CAC GCT TTT CTC GAT GAA TCT CGC AGT GCC 2832
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala 930 935 940
GCA TTA AGC ACC TAC TAT ATC CGT CAA GTC GCC AAG GCA GCG GCG GCT 2880
Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala 945 950 955 960
ATT AAA AGC CGT GAT GAC TTG TAT CAA TAC TTA CTG ATT GAT AAT CAG 2928
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln 965 970 975
GTT TCT GCG GCA ATA AAA ACC ACC CGG ATC GCC GAA GCC ATT GCC ACT 2976
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser 980 985 990
ATT CAA CTG TAC GTC AAC CGG GCA TTG GAA AAT GTG GAA GAA AAT GCC 3024
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 995 1000 1005
AAT TCG GGG GTT ATC AGC CGC CAA TTC TTT ATC GAC TGG GAC AAA TAC 3072
Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr 1010 1015 1020
AAT AAA CGC TAC AGC ACT TGG GCG GGT GTT TCT CAA TTA GTT TAC TAC 3120
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr 1025 1030 1035 1040
CCG GAA AAC TAT ATT GAT CCG ACC ATG CGT ATC GGA CAA ACC AAA ATG 3168
Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met 1045 1050 1055
ATG GAC GCA TTA CTG CAA TCC GTC AGC CAA AGC CAA TTA AAC GCC GAT 3216
Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp 1060 1065 1070
ACC GTC GAA GAT GCC TTT ATG TCT TAT CTG ACA TCG TTT GAA CAA GTG 3264
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val 1075 1080 1085
GCT AAT CTT AAA GTT ATT AGC GCA TAT CAC GAT AAT ATT AAT AAC GAT 3312
Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp
1090 1095 1100
CAA GGG CTG ACC TAT TTT ATC GGA CTC AGT GAA ACT GAT GCC GGT GAA 3360
Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120
TAT TAT TGG CGC AGT GTC GAT CAC AGT AAA TTC AAC GAC GGT AAA TTC 3408
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 1125 1130 1135
GCG GCT AAT GCC TGG AGT GAA TGG CAT AAA ATT GAT TGT CCA ATT AAC 3456
Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn 1140 1145 1150
CCT TAT AAA AGC ACT ATC CGT CCA GTG ATA TAT AAA TCC CGC CTG TAT 3504
Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr 1155 1160 1165
CTG CTC TGG TTG GAA CAA AAG GAG ATC ACC AAA CAG ACA GGA AAT AGT 3552
Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser 1170 1175 1180
AAA GAT GGC TAT CAA ACT GAA ACG GAT TAT CGT TAT GAA CTA AAA TTG 3600
Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200
GCG CAT ATC CGC TAT GAT GGC ACT TGG AAT ACG CCA ATC ACC TTT GAT 3648
Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp 1205 1210 1215
GTC AAT AAA AAA ATA TCC GAG CTA AAA CTG GAA AAA AAT AGA GCG C 3696
Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230
GGA CTC TAT TGT GCC GGT TAT CAA GGT GAA GAT ACG TTG CTG GTG ATG 3744
Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245
TTT TAT AAC CAA CAA GAC ACA CTA GAT ACT TAT AAA AAC GCT TCA ATG 3792
Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260
CAA GGA CTA TAT ATC TTT GCT GAT ATG GCA TCC AAA GAT ATG ACC CCA 3840
Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280
GAA CAG AGC AAT GTT TAT CGG GAT AAT AGC TAT CAA CAA TTT GAT ACC 3888
Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe Asp Thr 1285 1290 1295
AAT AAT GTC AGA AGA GTG AAT AAC CGC TAT GCA GAG GAT TAT GAG ATT 3936
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile 1300 1305 1310
CCT TCC TCG GTA AGT AGC CGT AAA GAC TAT
GGT TGG GGA GAT TAT TAC 3984
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325
CTC AGC ATG GTA TAT AAC GGA GAT ATT CCA ACT ATC AAT TAC AAA GCC 4032
Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340
GCA TCA AGT GAT TTA AAA ATC TAT ATC TCA CCA AAA TTA AGA ATT ATT 4080
Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile 1345 1350 1355 1360
CAT AAT GGA TAT GAA GGA CAG AAG CGC AAT CAA TGC AAT CTG ATG AAT 4128
His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn 1365 1370 1375
AAA TAT GGC AAA CTA GGT GAT AAA TTT ATT GTT TAT ACT AGC TTG GGG 4176
Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390
GTC AAT CCA AAT AAC TCG TCA AAT AAG CTC ATG TTT TAC CCC GTC TAT 4224
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405
CAA TAT AGC GGA AAC ACC AGT GGA CTC AAT CAA GGG AGA CTA CTA TTC 4272
Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe 1410 1415 1420
CAC CGT GAC ACC ACT TAT CCA TCT AAA GTA GAA GCT TGG ATT CCT GGA 4320
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly 1425 1430 1435 1440
GCA AAA CGT TCT CTA ACC AAC CAA AAT GCC GCC ATT GGT GAT GAT TAT 4368
Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr 1445 1450 1455
GCT ACA GAC TCT CTG AAT AAA CCG GAT GAT CTT AAG CAA TAT ATC TTT 4416
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe 1460 1465 1470
ATG ACT GAC AGT AAA GGG ACT GCT ACT GAT GTC TCA GGC CCA GTA GAG 4464
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485
ATT AAT ACT GCA ATT TCT CCA GCA AAA GTT CAG ATA ATA GTC AAA GCG 4512
Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala 1490 1495 1500
CGT GGC AAG GAG CAA ACT TTT AC GCA GAT AAA GAT GTC TCC ATT CAG 4560
Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln 1505 1510 1515 1520
CCA TCA CCT AGC TTT GAT GAA ATG AAT TAT CAA TTT AAT GCC CTT GAA 4608
Pro
Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu 1525 1530 1535
ATA GAC GGT TCT GGT CTG AAT TTT ATT AAC AAC TCA GCC ACT ATT GAT 4656
Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp 1540 1545 1550
GTT ACT TTT ACC GCA TTT GCG GAG GAT GGC CGC AAA CTG GGT TAT GAA 4704
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565
AGT TTC AGT ATT CCT GTT ACC CTC AAG GTA AGT ACC GAT AAT GCC CTG 4752
Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580
ACC CTG CAC CAT AAT GAA AAT GGT GCG CAA TAT ATG CAA TGG CAA TCC 4800
Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser 1585 1590 1595 1600
TAT CCT ACC CGC CTG AAT ACT CTA TTT GCC CGC CAG TTG GTT GCA CGC 4848
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg 1605 1610 1615
GCC ACC ACC GGA ATC GAT ACA ATT CTG AGT ATG GAA ACT CAG AAT ATT 4896
Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile 1620 1625 1630
CAG GAA CCG CAG TTA GGC AAA GGT TTC TAT GCT ACG TTC GTG ATA CCT 4944
Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 1635 1640 1645
CCC TAT AAC CTA TCA ACT CAT GGT GAT GAA CGT TGG TTT AAG CTT TAT 4992
Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 1650 1655 1660
ATC AAA CAT GTT GTT GAT AAT AAT TCA CAT ATT ATC TAT TCA GGC CAG 5040
Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln 1665 1670 1675 1680
CTA ACA GAT ACA AAT ATA AAC ATC ACA TTA TTT ATT CCT CTT GAT GAT 5088
Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 1685 1690 1695
GTC CCA TTG AAT CAA GAT TAT CAC GCC AAG GTT TAT ATG ACC TTC AAG 5136
Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 1700 1705 1710
AAA TCA CCA TCA GAT GGT ACC TGG TGG GGC CCT CAC TTT GTT AGA GAT 5184
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 1715 1720 1725
GAT AAA GGA ATA GTA ACA ATA AAC CCT AAA TCC ATT TTG ACC CAT TTT 5232
Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740
GAG
AGC GTC AAT GTC CTG AAT AAT ATT AGT AGC GAA CCA ATG GAT TTC 5280
Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 1745 1750 1755 1760
AGC GGC GCT AAC AGC CTC TAT TTC TGG GAA CTG TTC TAC TAT ACC CCG 5328
Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775
ATG CTG GTT GCT CAA CGT TTG CTG CAT GAA CAG AAC TTC GAT GAA GCC 5376
Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala 1780 1785 1790
AAC CGT TCG CTG AAA TAT GTC TGG ACT CCA TCC GGT TAT ATT GTC CAC 5424
Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His 1795 1800 1805
GGC CAG ATT CAG AAC TAC CAG TGG AAC GTC CGC CCG TTA CTG GAA CAC 5472
Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp 1810 1815 1820
ACC ACT TGG AAC AGT GAT CCT TTG GAT TCC GTC GAT CCT GAC GCG GTA 5520
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840
GCA CAG CAC GAT CCA ATG CAC TAC AAA GTT TCA ACT TTT ATG CGT ACC 5568
Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855
TTG GAT CTA TTG ATA GCA CGC GGC GAC CAT GCT TAT CGC CAA CTG GAA 5616
Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu 1860 1865 1870
CGA GAT ACA CTC AAC GAA GCG AAG ATG TGG TAT ATG CAA GCG CTG CAT 5664
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His 1875 1880 1885
CTA TTA GGT GAC AAA CCT TAT CTA CCG CTG AGT ACG ACA TGG AGT GAT 5712
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900
CCA CGA CTA GAC AGA GCC GCG GAT ATC ACT ACC CAA AAT GCT CAC GAC 5760
Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp 1905 1910 1915 1920
AGC GCA ATA GTC GCT CTG CGG CAG AAT ATA CCT ACA CCG GCA CCT TTA 5808
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu 1925 1930 1935
TCA TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC 5856
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile 1940 1945 1950
AAT GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA
TAC 5904
Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr 1955 1960 1965
AAT CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA 5952
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro 1970 1975 1980
ATC TAT GCC ACA CCG GCC GAT CCG AAA GCG TTA CTC AGC GCC GCC GTT 6000
Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000
GCC ACT TCT CAA GGT GGA GGC AAG CTA CCG GAA TCA TTT ATG TCC CTG 6048
Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 2005 2010 2015
TGG CGT TTC CCG CAC ATG CTG GAA AAT GCG CGC GGC ATG GTT AGC CAG 6096
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln 2020 2025 2030
CTC ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC 6144
Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp 2035 2040 2045
GCG GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA 6192
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile 2050 2055 2060
TTG ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC 6240
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala 2065 2070 2075 2080
GAG AAA ACG CTG TTG 3A-A TCC AAA GCG GGA
GCA CAA TCC CGC TTT 6288
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe 2085 2090 2095
GAT AGC TAC GGC AAA CTG TAG GAT GAG AAT ATC AAC GCC GGT GAA AAC 6336
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn 2100 2105 2110
CAA GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT 6384
Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125
CAG GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT AAC ATC 6432
Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile 2130 2135 2140
TTC GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG 6480
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala 2145 2150 2155 2160
ACA GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG 6528
Thr Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175
GAT AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG 6576
Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp 2180 2185 2190
GAG ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT 6624
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala 2195 2200 2205
CAG CTC AAA TCA CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA 6672
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys 2210 2215 2220
ACC AGT CTG AAA ACC CAA CAA GAA CAG ACC CAA TCT CAA TTG GCC TTC 6720
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe 2225 2230 2235 2240
CTG CAA CGT AAG TTC AGC AAT CAG GCG TTA TAC AAC TGG CTG CGT GGT 6768
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly 2245 2250 2255
CGA CTG GCG GCG ATT TAC TTC CAG TTC TAC GAT TTG GCC GTC GCG CGT 6816
Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg 2260 2265 2270
TGC CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT 6864
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 2275 2280 2285
GCC CGC TTC ATT AAA CCG GGC GCC TGG CAG GGA ACC TAT GCC GGT CTG 6912
Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu
2290 2295 2300
CTT GCA GGT GAA ACC TTG ATG CTG AGT CTG GCA CAA ATG GAA GAC GCT 6960
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala 2305 2310 2315 2320
CAT CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG 7008
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 2325 2330 2335
CTG GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC 7056
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 2340 2345 2350
CTC -T CAG GA A T AC .-_-.G CTG CTG .¾uT CAA GOT TCA GGC AGT GCC 7104
Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala 2355 2360 2365
GGC AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACG GAC ACT AAA 7152
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 2370 2375 2380
ACC TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA 7200
Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu 2385 2390 2395 2400
GAT TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC 7248
Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser 2405 2410 2415
GTC ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA 7296
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile 2420 2425 2430
TTG TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG 7344
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu 2435 2440 2445
GCA GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC 7392
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe 2450 2455 2460
AAC GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC 7440
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly 2465 2470 2475 2480
ACG CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA 7488
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys 2485 2490 2495
CAA GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC 7536
Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg 2500 2505 2510
TAC ACC ATT AAA TAA 7551
Tyr Thr Ile Lys ··· 2516
(2) INFORMATION FOR SEQ ID NO: 47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2516 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47 (TcdA)

Features   From   To     Description
Peptide    1      2516   TcdA protein
Peptide    89     1937   TcdAii peptide
Fragment   89     100    S2 N-terminus (SEQ ID NO: 13)
Fragment   284    299    (SEQ ID NO: 38)
Fragment   554    563    (SEQ ID NO: 17)
Fragment   1080   1092   (SEQ ID NO: 23; 12/13)
Fragment   1385   1400   (SEQ ID NO: 18)
Fragment   1473   1497   (SEQ ID NO: 39)
Fragment   1620   1642   (SEQ ID NO: 21; 19/23)
Fragment   1938   1948   (SEQ ID NO: 41)
Peptide    1938   2516   TcdAiii peptide
Fragment   2327   2345   (SEQ ID NO: 42)
Fragment   2393   2403   (SEQ ID NO: 3)

Ser Val Lys Glu Ile Pro Asp Val Leu Lys Ser Gln vs 5 10 15
Cys Leu Thr Asp Ile Ser His Ser Ser Phe Asn Glu Phe 20 25 30
Val Ser Glu His Leu Ser Trp Ser Glu Thr His Asp Leu 40 45
Ala Gln Gln Ala Gln Lys Asp Asn Arg Leu Tyr Glu Ala 55 60
Arg Ile Leu Lys Arg Ala Asn Pro Gln Leu Gln Asn Ala Val His Leu 65 70 75 80
Ala Ile Leu Ala Pro Asn Ala Glu Leu Ile Gly Tyr Asn Asn Gln Phe 85 90 95
Ser Gly Arg Ala Ser Gln Tyr Val Ala Pro Gly Thr Val Ser Ser Met 100 105 110
Phe Ser Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Arg Asn 115 120 125
Leu His Ala Ser Asp Ser Val Tyr Tyr Leu Asp Thr Arg Arg Pro Asp 130 135 140
Leu Lys Ser Met Ala Leu Ser Gln Gln Asn Met Asp Ile Glu Leu Ser 145 150 155 160
Thr Leu Ser Leu Ser Asn Glu Leu Leu Leu Glu Ser Ile Lys Thr Glu 165 170 175
Ser Lys Leu Glu Asn Tyr Thr Lys Val Met Glu Met Leu Ser Thr Phe 180 185 190
Arg Pro Ser Gly Ala Thr Pro Tyr His Asp Ala Tyr Glu Asn Val Arg 195 200 205
Glu Val Ile Gln Leu Gln Asp Pro Gly Leu Glu Gln Leu Asn Ala Ser 210 215 220
Pro Ala Ile Ala Gly Leu Met His Gln Ala Ser Leu Leu Gly Ile Asn 225 230 235 240
Ala Ser Ile Ser Pro Glu Leu Phe Asn Ile Leu Thr Glu Glu Ile Thr 245 250 255
Glu Gly Asn Ala Glu Glu Leu Tyr Lys Lys Asn Phe Gly Asn Ile Glu 260 265 270
Pro Ala Ser Leu Ala Met Pro Glu Tyr Leu
Lys Arg Tyr Tyr Asn Leu 275 280 285
Ser Asp Glu Glu Leu Ser Gln Phe Ile Gly Lys Ala Ser Asn Phe Gly 290 295 300
Gln Gln Glu Tyr Ser Asn Asn Gln Leu Ile Thr Pro Val Val Asn Ser 305 310 315 320
Ser Asp Gly Thr Val Lys Val Tyr Arg Ile Thr Arg Glu Tyr Thr Thr 325 330 335
Asn Ala Tyr Gln Met Asp Val Glu Leu Phe Pro Phe Gly Gly Glu Asn 340 345 350
Tyr Arg Leu Asp Tyr Lys Phe Lys Asn Phe Tyr Asn Ala Ser Tyr Leu 355 360 365
Ser Ile Lys Leu Asn Asp Lys Arg Glu Leu Val Arg Thr Glu Gly Ala 370 375 380
Pro Gln Val Asn Ile Glu Tyr Ser Ala Asn Ile Thr Leu Asn Thr Ala 385 390 395 400
Asp Ile Ser Gln Pro Phe Glu Ile Gly Leu Thr Arg Val Leu Pro Ser 405 410 415
Gly Ser Trp Ala Tyr Ala Ala Ala Lys Phe Thr Val Glu Glu Tyr Asn 420 425 430
Gln Tyr Ser Phe Leu Leu Lys Leu Asn Lys Ala Ile Arg Leu Ser Arg 435 440 445
Ala Thr Glu Leu Ser Pro Thr Ile Leu Glu Gly Ile Val Arg Ser Val 450 455 460
Asn Leu Gln Leu Asp Ile Asn Thr Asp Val Leu Gly Lys Val Phe Leu 465 470 475 480
Thr Lys Tyr Tyr Met Gln Arg Tyr Ala Ile His Ala Glu Thr Ala Leu 485 490 495
Ile Leu Cys Asn Ala Pro Ile Ser Gln Arg Ser Tyr Asp Asn Gln Pro 500 505 510
Ser Gln Phe Asp Arg Leu Phe Asn Thr Pro Leu Leu Asn Gly Gln Tyr 515 520 525
Phe Ser Thr Gly Asp Glu Glu Ile Asp Leu Asn Ser Gly Ser Thr Gly 530 535 540
Asp Trp Arg Lys Thr Ile Leu Lys Arg Ala Phe Asn Ile Asp Asp Val 545 550 555 560
Ser Leu Phe Arg Leu Leu Lys Ile Thr Asp His Asp Asn Lys Asp Gly 565 570 575
Lys Ile Lys Asn Asn Leu Lys Asn Leu Ser Asn Leu Tyr Ile Gly Lys 580 585 590
Leu Leu Ala Asp Ile His Gln Leu Thr Ile Asp Glu Leu Asp Leu Leu 595 600 605
Leu Ile Ala Val Gly Glu Gly Lys Thr Asn Leu Ser Ala Ile Ser Asp 610 615 620
Lys Gln Leu Ala Thr Leu Ile
Arg Lys Leu Asn Thr Ile Thr Ser Trp 625 630 635 640
Leu His Thr Gln Lys Trp Ser Val Phe Gln Leu Phe Ile Met Thr Ser 645 650 655
Thr Ser Tyr Asn Lys Thr Leu Thr Pro Glu Ile Lys Asn Leu Leu Asp 660 665 670
Thr Val Tyr His Gly Leu Gln Gly Phe Asp Lys Asp Lys Ala Asp Leu 675 680 685
Leu His Val Met Ala Pro Tyr Ile Ala Ala Thr Leu Gln Leu Ser Ser 690 695 700
Glu Asn Val Ala His Ser Val Leu Leu Trp Ala Asp Lys Leu Gln Pro 705 710 715 720
Gly Asp Gly Ala Met Thr Ala Glu Lys Phe Trp Asp Trp Leu Asn Thr 725 730 735
Lys Tyr Thr Pro Gly Ser Ser Glu Ala Val Glu Thr Gln Glu His Ile 740 745 750
Val Gln Tyr Cys Gln Ala Leu Ala Gln Leu Glu Met Val Tyr His Ser 755 760 765
Thr Gly Ile Asn Glu Asn Ala Phe Arg Leu Phe Val Thr Lys Pro Glu 770 775 780
Met Phe Gly Ala Ala Thr Gly Ala Ala Pro Ala His Asp Ala Leu Ser 785 790 795 800
Leu Ile Met Leu Thr Arg Phe Ala Asp Trp Val Asn Ala Leu Gly Glu 805 810 815
Lys Ala Ser Ser Val Leu Ala Ala Phe Glu Ala Asn Ser Leu Thr Ala 820 825 830
Glu Gln Leu Ala Asp Ala Met Asn Leu Asp Ala Asn Leu Leu Leu Gln 835 840 845
Ala Ser Ile Gln Ala Gln Asn His Gln His Leu Pro Pro Val Thr Pro 850 855 860
Glu Asn Ala Phe Ser Cys Trp Thr Ser Ile Asn Thr Ile Leu Gln Trp 865 870 875 880
Val Asn Val Ala Gln Gln Leu Asn Val Ala Pro Gln Gly Val Ser Ala 885 890 895
Leu Val Gly Leu Asp Tyr Ile Gln Ser Met Lys Glu Thr Pro Thr Tyr 900 905 910
Ala Gln Trp Glu Asn Ala Ala Gly Val Leu Thr Ala Gly Leu Asn Ser 915 920 925
Gln Gln Ala Asn Thr Leu His Ala Phe Leu Asp Glu Ser Arg Ser Ala 930 935 940
Ala Leu Ser Thr Tyr Tyr Ile Arg Gln Val Ala Lys Ala Ala Ala Ala 945 950 955 960
Ile Lys Ser Arg Asp Asp Leu Tyr Gln Tyr Leu Leu Ile Asp Asn Gln 965 970 975
Val Ser Ala Ala Ile Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Ser 980 985 990
Ile Gln Leu Tyr Val Asn Arg Ala Leu Glu Asn Val Glu Glu Asn Ala 995 1000 1005
Asn Ser Gly Val Ile Ser Arg Gln Phe Phe Ile Asp Trp Asp Lys Tyr 1010 1015 1020
Asn Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Gln Leu Val Tyr Tyr 1025 1030 1035
1040
Pro Glu Asn Tyr Ile Asp Pro Thr Met Arg Ile Gly Gln Thr Lys Met 1045 1050 1055
Met Asp Ala Leu Leu Gln Ser Val Ser Gln Ser Gln Leu Asn Ala Asp 1060 1065 1070
Thr Val Glu Asp Ala Phe Met Ser Tyr Leu Thr Ser Phe Glu Gln Val 1075 1080 1085
Ala Asn Leu Lys Val Ile Ser Ala Tyr His Asp Asn Ile Asn Asn Asp 1090 1095 1100
Gln Gly Leu Thr Tyr Phe Ile Gly Leu Ser Glu Thr Asp Ala Gly Glu 1105 1110 1115 1120
Tyr Tyr Trp Arg Ser Val Asp His Ser Lys Phe Asn Asp Gly Lys Phe 1125 1130 1135
Ala Ala Asn Ala Trp Ser Glu Trp His Lys Ile Asp Cys Pro Ile Asn 1140 1145 1150
Pro Tyr Lys Ser Thr Ile Arg Pro Val Ile Tyr Lys Ser Arg Leu Tyr 1155 1160 1165
Leu Leu Trp Leu Glu Gln Lys Glu Ile Thr Lys Gln Thr Gly Asn Ser 1170 1175 1180
Lys Asp Gly Tyr Gln Thr Glu Thr Asp Tyr Arg Tyr Glu Leu Lys Leu 1185 1190 1195 1200
Ala His Ile Arg Tyr Asp Gly Thr Trp Asn Thr Pro Ile Thr Phe Asp 1205 1210 1215
Val Asn Lys Lys Ile Ser Glu Leu Lys Leu Glu Lys Asn Arg Ala Pro 1220 1225 1230
Gly Leu Tyr Cys Ala Gly Tyr Gln Gly Glu Asp Thr Leu Leu Val Met 1235 1240 1245
Phe Tyr Asn Gln Gln Asp Thr Leu Asp Ser Tyr Lys Asn Ala Ser Met 1250 1255 1260
Gln Gly Leu Tyr Ile Phe Ala Asp Met Ala Ser Lys Asp Met Thr Pro 1265 1270 1275 1280
Glu Gln Ser Asn Val Tyr Arg Asp Asn Ser Tyr Gln Gln Phe Asp Thr 1285 1290 1295
Asn Asn Val Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp Tyr Glu Ile 1300 1305 1310
Pro Ser Ser Val Ser Ser Arg Lys Asp Tyr Gly Trp Gly Asp Tyr Tyr 1315 1320 1325
Leu Ser Met Val Tyr Asn Gly Asp Ile Pro Thr Ile Asn Tyr Lys Ala 1330 1335 1340
Ala Ser Ser Asp Leu Lys Ile Tyr Ile Ser Pro Lys Leu Arg Ile Ile 1345 1350
1355 1360
His Asn Gly Tyr Glu Gly Gln Lys Arg Asn Gln Cys Asn Leu Met Asn 1365 1370 1375
Lys Tyr Gly Lys Leu Gly Asp Lys Phe Ile Val Tyr Thr Ser Leu Gly 1380 1385 1390
Val Asn Pro Asn Asn Ser Ser Asn Lys Leu Met Phe Tyr Pro Val Tyr 1395 1400 1405
Gln Tyr Ser Gly Asn Thr Ser Gly Leu Asn Gln Gly Arg Leu Leu Phe 1410 1415 1420
His Arg Asp Thr Thr Tyr Pro Ser Lys Val Glu Ala Trp Ile Pro Gly 1425 1430 1435 1440
Ala Lys Arg Ser Leu Thr Asn Gln Asn Ala Ala Ile Gly Asp Asp Tyr 1445 1450 1455
Ala Thr Asp Ser Leu Asn Lys Pro Asp Asp Leu Lys Gln Tyr Ile Phe 1460 1465 1470
Met Thr Asp Ser Lys Gly Thr Ala Thr Asp Val Ser Gly Pro Val Glu 1475 1480 1485
Ile Asn Thr Ala Ile Ser Pro Ala Lys Val Gln Ile Ile Val Lys Ala 1490 1495 1500
Gly Gly Lys Glu Gln Thr Phe Thr Ala Asp Lys Asp Val Ser Ile Gln 1505 1510 1515 1520
Pro Ser Pro Ser Phe Asp Glu Met Asn Tyr Gln Phe Asn Ala Leu Glu 1525 1530 1535
Ile Asp Gly Ser Gly Leu Asn Phe Ile Asn Asn Ser Ala Ser Ile Asp 1540 1545 1550
Val Thr Phe Thr Ala Phe Ala Glu Asp Gly Arg Lys Leu Gly Tyr Glu 1555 1560 1565
Ser Phe Ser Ile Pro Val Thr Leu Lys Val Ser Thr Asp Asn Ala Leu 1570 1575 1580
Thr Leu His His Asn Glu Asn Gly Ala Gln Tyr Met Gln Trp Gln Ser 1585 1590 1595 1600
Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Ala Arg 1605 1610 1615
Ala Thr Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Asn Ile 1620 1625 1630
Gln Glu Pro Gln Leu Gly Lys Gly Phe Tyr Ala Thr Phe Val Ile Pro 1635 1640 1645
Pro Tyr Asn Leu Ser Thr His Gly Asp Glu Arg Trp Phe Lys Leu Tyr 1650 1655 1660
Ile Lys His Val Val Asp Asn Asn Ser His Ile Ile Tyr Ser Gly Gln 1665 1670 1675 1680
Leu Thr Asp Thr Asn Ile Asn Ile Thr Leu Phe Ile Pro Leu Asp Asp 1685 1690 1695
Val Pro Leu Asn Gln Asp Tyr His Ala Lys Val Tyr Met Thr Phe Lys 1700 1705 1710
Lys Ser Pro Ser Asp Gly Thr Trp Trp Gly Pro His Phe Val Arg Asp 1715 1720 1725
Asp Lys Gly Ile Val Thr Ile Asn Pro Lys Ser Ile Leu Thr His Phe 1730 1735 1740
Glu Ser Val Asn Val Leu Asn Asn Ile Ser Ser Glu Pro Met Asp Phe 1745 1750
1755 1760
Ser Gly Ala Asn Ser Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 1765 1770 1775
Met Leu Val Ala Gln Arg Leu Leu His Glu Gln Asn Phe Asp Glu Ala 1780 1785 1790
Asn Arg Trp Leu Lys Tyr Val Trp Ser Pro Ser Gly Tyr Ile Val His 1795 1800 1805
Gly Gln Ile Gln Asn Tyr Gln Trp Asn Val Arg Pro Leu Leu Glu Asp 1810 1815 1820
Thr Ser Trp Asn Ser Asp Pro Leu Asp Ser Val Asp Pro Asp Ala Val 1825 1830 1835 1840
Ala Gln His Asp Pro Met His Tyr Lys Val Ser Thr Phe Met Arg Thr 1845 1850 1855
Leu Asp Leu Leu Ile Ala Arg Gly Asp His Ala Tyr Arg Gln Leu Glu 1860 1865 1870
Arg Asp Thr Leu Asn Glu Ala Lys Met Trp Tyr Met Gln Ala Leu His 1875 1880 1885
Leu Leu Gly Asp Lys Pro Tyr Leu Pro Leu Ser Thr Thr Trp Ser Asp 1890 1895 1900
Pro Arg Leu Asp Arg Ala Ala Asp Ile Thr Thr Gln Asn Ala His Asp 1905 1910 1915 1920
Ser Ala Ile Val Ala Leu Arg Gln Asn Ile Pro Thr Pro Ala Pro Leu 1925 1930 1935
Ser Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile 1940 1945 1950
Asn Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr 1955 1960 1965
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr Leu Pro 1970 1975 1980
Ile Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val 1985 1990 1995 2000
Ala Thr Ser Gln Gly Gly Gly Lys Leu Pro Glu Ser Phe Met Ser Leu 2005 2010 2015
Trp Arg Phe Pro His Met Leu Glu Asn Ala Arg Gly Met Val Ser Gln 2020 2025 2030
Leu Thr Gln Phe Gly Ser Thr Leu Gln Asn Ile Ile Glu Arg Gln Asp 2035 2040 2045
Ala Glu Ala Leu Asn Ala Leu Leu Gln Asn Gln Ala Ala Glu Leu Ile 2050 2055 2060
Leu Thr Asn Leu Ser Ile Gln Asp Lys Thr Ile Glu Glu Leu Asp Ala 2065 2070 2075 2080
Glu Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gln Ser Arg Phe 2085 2090 2095
Asp Ser Tyr Gly Lys Leu Tyr Asp Glu Asn Ile Asn Ala Gly Glu Asn 2100 2105 2110
Gln Ala Met Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val 2115 2120 2125
Gln Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn Ile 2130 2135 2140
Phe Gly Phe Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala 2145 2150 2155 2160
Thr Gly Tyr Val
Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala 2165 2170 2175
Asp Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp 2180 2185 2190
Glu Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala 2195 2200 2205
Gln Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys 2210 2215 2220
Thr Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe 2225 2230 2235 2240
Leu Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly 2245 2250 2255
Arg Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg 2260 2265 2270
Cys Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser 2275 2280 2285
Ala Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu 2290 2295 2300
Leu Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala 2305 2310 2315 2320
His Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser 2325 2330 2335
Leu Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser 2340 2345 2350
Leu Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala 2355 2360 2365
Gly Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys 2370 2375 2380
Thr Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu 2385 2390 2395 2400
Asp Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser 2405 2410 2415
Val Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile 2420 2425 2430
Leu Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu 2435 2440 2445
Ala Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe 2450 2455 2460
Asn Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly 2465 2470 2475 2480
Thr Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys 2485 2490 2495
Gln Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg 2500 2505 2510
Tyr Thr Ile Lys 2516

(2) INFORMATION FOR SEQ ID NO: 48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5547 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48 (tcdAii coding region):

CTG ATA GGC TAT AAC AAT CAA TTT AGC GGT AGA GCC ACT CAA TAT GTT 48
Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val 1 5 10 15
GCG CCG GGT ACC GTT TCT TCC ATG TTC TCC CCC GCC GCT TAT TTG ACT 96
Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr 20 25 30
GAA CTT TAT CGT GAA GCA CGC AAT TTA CAC GCA AGC GAC TCC CTT TAT 144
Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr 35 40 45
TAT CTG GAT ACC CGC CGC CCA GAT CTC AAA TCA ATG GCG CTC AGT CAG 192
Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln 50 55 60
CAA AAT ATG GAT ATA GAA TTA TCC ACA CTC TCT TTG TCC AAT GAG CTG 240
Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu 65 70 75 80
TTA TTG GAA AGC ATT AAA ACT GAA TCT AAA CTG GAA AAC TAT ACT AAA 288
Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys 85 90 95
GTG ATG GAA ATG CTC TCC ACT TTC CGT CCT TCC GGC GCA ACG CCT TAT 336
Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr 100 105 110
CAT GAT GCT TAT GAA AAT GTG CGT GAA GTT ATC CAG CTA CAA GAT CCT 384
His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro 115 120 125
GGA CTT GAG CAA CTC AAT GCA TCA CCG GCA ATT GCC GGG TTG ATG CAT 432
Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His 130 135 140
CAA GCC TCC CTA TTG GGT ATT AAC GCT TCA ATC TCG CCT GAG CTA TTT 480
Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe 145 150 155 160
AAT ATT CTG ACG GAG GAG ATT ACC GAA GGT AAT GCT GAG GAA CTT TAT 528
Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr 165 170 175
AAG AAA AAT TTT GGT AAT ATC GAA CCG GCC TCA TTG GCT ATG CCG GAA 576
Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu 180 185 190
TAC CTT AAA CGT TAT TAT AAT TTA AGC GAT GAA GAA CTT AGT CAG TTT 624
Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe 195 200 205
ATT GGT AAA GCC AGC AAT TTT GGT CAA CAG GAA TAT AGT AAT AAC CAA
672 He Giy Lys Ala Ser Asn Phe Giy Gin Gin Glu Tyr Ser Asn Asn Gin 210 215 220 CTT ATT ACT CCG GTA GTC AAC AGC AGT GAT GGC ACG GTT AAG GTA TAT 720 Lau He Thr Pro Val Val Asn Ser Ser Asp Giy Thr Val Lys Val Tyr 225 230 235 240 CGG ATC ACC CGC GAA TAT ACA ACC AAT GCT TAT CAA ATG GAT GTG GAG 763 Arg He Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gin Mec Asp Val Glu 245 250 255 CTA TTT CCC TTC GGT GGT GAG AAT TAT CGG TTA GAT TAT AAA TTC AAA 816 Leu Phe Pro Phe Giy Giy Glu Asn Tyr Arg Lau Asp Tyr Lys Phe Lys 260 265 270 .AAT TTT TAT AAT GCC TCT TAT TTA TCC ATC AAG TTA AAT GAT AAA AGA 364 Asn Phe Tyr Asn Ala Ser Tyr Leu Ser He Lys Leu Asn Asp Lys Arg 275 280 285 GAA CTT GTT CGA ACT GAA GGC GCT CCT CAA GTC AAT ATA GAA TAC TCC 912 Giu Leu Val Arg Thr Glu Giy Ala Pro Gin Val Asn He Glu Tyr Ser 290 295 300 GCA AAT ATC ACA TTA AAT ACC GCT GAT ATC AGT CAA CCT TTT GAA ATT 960 Ala Asn He Thr Leu Asn Thr Ala Asp He Ser Gin Pro Phe Glu He -202- 05 10 315 3 ACA CGA GTA CTT CCT TCC GGT TCT TGG GCA TAT GCC A 003 Gly Lsu Thr Arg Val Leu Pro Ser Gly Ser Trp Al Tyr Ala Ala Ala 325 330 335 AAA TTT ACC GTT CAA GAG TAT AAC CAA C TCT TTT CTG CTA .AAA CTT 1056 Lys Phe Thr Val Glu Glu Tyr Asn Gin Tyr Ser Phe Leu Leu Lys Leu 340 345 350 AAC AAG GCT ATT CGT CTA TCA CGT GCC ACA GAA TTG TCA CCC ACC ATT 1104 Asn Lvs Ala He Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr lie 355 360 365 CTG GAA GGC ATT GTC CGC AGT GTT AAT CTA CAA CTG GAT ATC AAC ACA 1152 Leu Glu Gly lie Val Arg Ser Val Asn Leu Gin Leu Asp He Asn Thr 370 375 330 GAC GTA TTA GGT AAA GTT TTT CTG ACT AAA TAT TAT ATG CAG CGT TAT 1200 Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Mec Gin Arg Tyr 385 390 395 400 GCT ATT CAT GCT GAA ACT GCC CTG ATA CTA TGC AAC GCG CCT ATT TCA 1248 Ala lie His Ala Glu Thr Ala Leu He Leu Cys Asn Ala Pro He Ser 405 " 410 415 CAA CGT TCA TAT GAT AAT CAA CCT AGC CAA TTT GAT CGC CTG TTT AAT 1296 Gin Arg Ser Tyr Asp Asn Gin Pro Ser Gin Phe Asp Arg Leu Phe Asn 420 425 430 ACG CCA TTA CTG AAC GGA CAA TAT TTT TCT ACC GGC GAT 
GAG GAG ATT 1344 Thr Pro Leu Leu Asn Gly Gin Tyr Phe Ser Thr Gly Asp Glu Glu He 435 440 445 GAT TTA AAT TCA GGT AGC ACC GGC GAT TGG CGA AAA ACC ATA CTT AAG 1392 Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr He Leu Lys 450 455 460 CGT GCA TTT AAT ATT GAT GAT GTC TCG CTC TTC CGC CTG CTT AAA ATT 1440 Arg Ala Phe Asn He Asp Asp Val Ser Leu Phe Arg Leu Leu Lys He 455 470 475 480 ACC GAC CAT GAT AAT AAA GAT GGA AAA ATT AAA AAT AAC CTA AAG AAT 1488 Thr Asp His Asp Asn Lys Asp Gly Lys He Lys Asn Asn Leu Lys Asn 485 490 495 CTT TCC AAT TTA TAT ATT GGA AAA TTA CTG GCA GAT ATT CAT CAA TTA 1536 Leu Ser Asn Leu Tyr He Gly Lys Leu Leu Ala Asp He His Gin Leu 500 505 510 ACC ATT GAT GAA CTG GAT TTA TTA CTG ATT GCC GTA GGT GAA GGA AAA 1584 Thr He Asp Glu Leu Asp Leu Leu Leu He Ala Val Gly Glu Gly Lys 515 520 525 ACT AAT TTA TCC GCT ATC AGT GAT AAG CAA TTG GCT ACC CTG ATC AGA 1632 Thr Asn Leu Ser Ala He Ser Asp Lys Gin Leu Ala Thr Leu He Arg 530 535 540 AAA CTC AAT ACT ATT ACC AGC TGG CTA CAT ACA CAG AAG TGG AGT GTA 1680 Lys Leu Asn Thr He Thr Ser Trp Leu His Thr Gin Lys Trp Ser al 545 550 555 560 TTC CAG CTA TTT ATC ATG ACC TCC ACC AGC TAT AAC AAA ACG CTA ACG 1723 Phe Gin Leu Phe He Mec Thr Ser Thr Ser Tyr Asn Lys Thr' Leu Thr 555 570 575 CCT GAA ATT AAG AAT TTG CTG GAT ACC GTC TAC CAC GGT TTA CAA GGT 1776 Pro Glu He Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gin Gly 580 585 590 TTT GAT AAA GAC AAA GCA GAT TTG CTA CAT GTC ATG GCG CCC TAT ATT 132 -203 - .-he As? Lys Asp Lys Aia Asp Leu Leu His Val Mec Aia Pro Tyr H 555 όΰϋ 005 GCG GCC ACC TTG C.AA TTA TCA TCG G A AAT GTC GCC CAC TCG GTA CTC 1372 Aia Aia Thr Leu Gin Leu Ser Ser Glu Asn Val Ala His Ser Val Leu 610 615 620 CTT TGG GCA G.AT AAG TTA CAG CC GCv. GAC GGC GCA ATG ACA OCA GAA 192 C Leu Trp Ala Asp Lys Leu Gin Pro Gly Asp Gly Aia Mec Thr Aia Glu z 2 ~. 
630 635 640 AAA TTC TGG GAC TGG TTG AAT ACT AAG TAT ACG CCG GGT TCA TCG G A 1563 Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu 045 50 655 GCC GTA GAA ACG CAG GAA CAT ATC CTT CAG TAT TGT CAG GCT CTG GCA 2016 Ala Val Glu Thr Gin Glu His lie Val Gin Tyr Cys Gin Ala Leu Aia 660 655 670 C.AA TTG GAA ATG GTT TAC CAT TCC ACC GGC ATC AAC GAA AAC GCC TTC 2064 Gin Leu Glu Mec Val Tyr His Ser Thr Gly lie Asn Glu Asn Ala Phe 675 630 535 CGT CTA TTT CTG ACA AAA CCA GAG ATG TTT GGC GCT GCA ACT GGA GCA 2112 Arg Leu Phe Val Thr Lys Pro Glu Mec Phe Gly Ala Ala Thr Glv Ala 690 695 700 GCG CCC GCG CAT GAT GCC CTT TCA CTG ATT ATG CTG ACA CGT TTT GCG 2160 Aia Pro Ala His Asp Ala Leu Ser Leu lie Mec Leu Thr Arg Phe Ala 705 710 715 720 GAT TGG GTG AAC GCA CTA GGC GAA AAA GCG TCC TCG GTG CTA GCG GCA 2203 Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Aia 725 730 735 TTT GAA GCT AAC TCG TTA ACG GCA GAA CAA CTG GCT GAT GCC ATG AAT Phe Glu Ala Asn Ser Leu Thr Ala Glu Gin Leu Ala Asp Ala Mec Asn 740 745 750 CTT GAT GCT AAT TTG CTG TTG CAA GCC ACT ATT CAA GCA CAA AAT CAT 2304 Leu Asp Ala Asn Leu Leu Leu Gin Ala Ser He Gin Ala Gin Asn His 755 760 765 CAA CAT CTT CCC CCA GTA ACT CCA GAA AAT GCG TTC TCC TGT TGG ACA 2352 Gin His Leu Pro Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr 770 775 780 TCT ATC AAT ACT ATC CTG CAA TGG GTT AAT GTC GCA CAA CAA TTG AAT 2400 Ser He Asn Thr lie Leu Gin Trp Val Asn Val Ala Gin Gin Leu Asn 735 790 795 aoo GTC GCC CCA CAG GGC GTT TCC GCT TTG GTC GGG CTG GAT TAT ATT CAA 2443 Val Ala Pro Gin Gly Val Ser Ala Leu Val Gly Leu Asp Tyr lie Gin 805 810 315 TCA ATG AAA GAG ACA CCG ACC TAT GCC CAG TGG GAA AAC GCG GCA GGC 2496 Ser Mec Lys Glu Thr Pro Thr Tyr Ala Gin Trp Glu Asn Ala Ala Gly 820 825 830 GTA TTA ACC GCC GGG TTG AAT TCA CAA CAG GCT AAT ACA TTA CAC GCT 2544 V l Leu Thr Aia Gly Leu Asn Ser Gin Gin Ala Asn Thr Leu His Ala 335 340 345 TTT CTG GAT GAA TCT CGC ACT GCC GCA TTA AGC ACC TAC TAT ATC CGT 2592 Phe Leu Asp Glu Ser Arg Ser Aia 
Ala Leu Ser Thr T r Tyr He Arg 350 355 860 GCC .AAG GCA GCG GCG GCT ATT .AAA AGC CGT GAT GAC TTG TAT 2640 G Ln Va 1 Ala Lys Ala Ala Ala Ala He Lys Ser Arg Asp Asp Leu T/r 5 5 370 375 330 -204- SUBSTTTUTE SHEET (RULE 26) AA TAC TTA CTC ATT GAT AAT G GTT TCT GCC GCA ATA AAA ACC .-.ZZ ii-i Gin Tyr Leu lie As .-.sn Gin Val 3er Ala Aia lie Lys Thr "^r 335 390 3 5 CGG ATC GCC GAA GCC ATT GCC AGT ATT CAA CTG TAC GTC AAC CGG GCA 2" 35 Arg lie Ala Glu Ala lie Ala Ser lie Gin Leu Tyr Val Asn Arg Ala 900 905 910 TTG GAA AAT CTG GAA GAA AAT GCC AAT TCG GGG G T ATC AGC CGC CAA ;~34 Leu Giu Asn Val Glu Glu Asn Ala Asn Ser Gly Val lie Ser Arg Gin 915 920 925 TTC TTT ATC GAC TGG GAC AAA TAC AAT AAA CGC TAC AGC ACT TGG GCG 33 ?he Phe lie Asp Trp Asp Lys Tyr Asn Lys Arg Tvr Ser Thr Trp Ala 930 935 940 GGT GTT TCT CAA TTA GTT TAC TAC CCG GAA AAC TAT ATT GAT CCG ACC 2330 Gly Val Ser Gin Leu Val Tyr Tyr Pro Glu Asn Tyr He Asp Pro Thr 945 950 955 900 ATC CGT ATC GGA CAA ACC AAA ATG ATG GAC GCA TTA CTG CAA TCC GTC .Mec Arg lie Gly Gin Thr Lys Mec Mec Asp Ala Leu Leu Gin Ser Val 9ό5 970 975 AGC CAA AGC CAA TTA AAC GCC GAT ACC GTC GAA GAT GCC TTT ATG TCT Ser Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Mec Ser 980 935 990 TAT C G ACA TCG TTT GAA CAA GTG GCT AAT CTT AAA GTT ATT AGC GCA 3024 Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val lie Ser Ala 995 1000 1005 TAT CAC GAT AAT ATT AAT AAC GAT CAA GGG CTG ACC TAT TTT ATC GGA 30"? 
2 Tyr His Asp Asn lie Asn Asn Asp Gin Gly Leu Thr Tyr Phe lie Gly 1010 1015 1020 CTC AGT GAA ACT GAT GCC GGT GAA TAT TAT TGG CGC AGT GTC GAT CAC 3120 Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040 AGT AAA TTC AAC GAC GGT AAA TTC GCG GCT AAT GCC TGG AGT GAA TGG 3153 Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 1045 1050 1055 CAT AAA ATT GAT TGT CCA ATT AAC CCT TAT AAA AGC ACT ATC CGT CCA 3215 His Lys lie Asp Cys Pro He Asn Pro Tyr Lys Ser Thr lie Arg Pro 1060 1065 1070 GTG ATA TAT AAA TCC CGC CTG TAT CTG CTC TGG TTG GAA CAA AAG GAG 3264 Val He Tyr Lys Ser Arg Leu Tyr Lau Leu Trp Leu Glu Gin Lys Glu 1075 1080 1085 ATC ACC AAA CAG ACA GGA AAT AGT AAA GAT GGC TAT CAA ACT GAA ACG 3312 lie Thr Lys Gin Thr Gly Asn Ser Lys Asp Gly Tyr Gin Thr Glu Thr 1090 1095 1100 GAT TAT CGT TAT GAA CTA AAA TTG GCG CAT ATC CGC TAT GAT GGC ACT 3360 Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His He Arg Tyr Asp Gly Thr 1105 1110 1115 1120 TGG AAT ACG CCA ATC ACC TTT GAT GTC AAT AAA AAA ATA TCC GAG CTA 3403 Trp Asn Thr Pro He Thr Phe Aso Val Asn Lys Lys He Ser Glu Leu 1125 1130 1135 AAA CTG GAA AAA AAT AGA GCG CCC GGA CTC TAT TGT GCC GGT TAT CAA 3455 Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly T Gin 1140 1145 1150 GGT GAA GAT ACG TTG CTG GTG ATG TTT TAT AAC CAA CAA GAC ACA CTA 3504 Gly Glu Asp Thr Leu Leu Val Mec' Phe Tyr Asn Gin Gin Asp Thr Leu 1155 1150 1165 -205- GAT ACT TAT AAA AAC GCT TCA ATG CAA GGA CTA TAT ATC TTT GCT GAT 3552 Asp Ser Tyr Lys Asn la ser Mac Gin Gly Leu Tyr lie Phe Ala Aso 1170 11T5 1130 ATG GCA TCC AAA GAT ATG ACC CCA A CAG AGC AAT GTT TAT CCG GAT 3600 Mec Ala Ser Lys Asp Mec Thr Pro Glu Gin Ser Asn Val Tyr Arg Asp 1135 1190 1195 1200 AAT AGC TAT CAA C.AA TTT GAT ACC AAT AAT GTC AGA AGA GTG .AAT .AAC 3643 Asn Ser Tyr Gin Gin Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 1205 1210 1215 CGC TAT GCA GAG GAT TAT GAG ATT CCT TCC TCG GTA ACT AGC CGT .AAA 3596 Arg Tyr Ala Glu Asp Tyr Glu lie Pro Ser Ser Val Ser Ser 
Arg Lys 1220 1225 1230 GAC TAT GGT TCG GGA GAT TAT TAC CTC AGC ATG GTA TAT AAC GGA GAT 3744 Asp Tyr Gly Trp Gly Asp Tyr Tyr Lau Ser Mec Val Tyr Asn Gly Asp 1235 1240 1245 ATT CCA ACT ATC .AAT TAC AAA GCC GCA TCA ACT GAT TTA AAA ATC TAT 3732 lie Pro Thr lie Asn Tyr Lvs Ala Ala Ser Ser Asp Leu Lys lie Tyr 1250 1255 1260 ATC TCA CCA AAA TTA AGA ATT ATT CAT AAT GGA TAT GAA GGA CAG AAC 3340 lie Ser Pro Lys Leu Arg He He His Asn Gly Tyr Glu Gly Gin Lys 1265 1270 1275 1230 CGC AAT CAA TGC AAT CTG ATG AAT AAA TAT GGC AAA CTA GGT GAT AAA 3333 Arg Asn Gin Cys Asn Leu Mec Asn Lys- Tyr Gly Lys Leu Gly ASD Lys 1285 1290 1295 TTT ATT GTT TAT ACT AGC TTG GGG GTC AAT CCA AAT AAC TCG TCA AAT 3936 Phe He Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1300 1305 1310 AAG CTC ATG TTT TAC CCC GTC TAT CAA TAT AGC GGA AAC ACC ACT GGA 3934 Lys Leu Mec Pha Tyr Pro Val Tyr Gin Tyr Ser Gly Asn Thr Ser Gly 1315 1320 1325 CTC AAT CAA GGG AGA CTA CTA TTC CAC CGT GAC ACC ACT TAT CCA TCT 4032 Leu Asn Gin Gly Arg Lau Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340 AAA GTA GAA GCT TGG ATT CCT GGA GCA AAA CGT TCT CTA ACC AAC CAA 4080 Lys Val Glu Ala Trp lie Pro Gly Ala Lys Arg Ser Leu Thr Asn Gin 1345 1350 1355 1360 AAT GCC GCC ATT GGT GAT GAT TAT GCT ACA GAC TCT CTG AAT AAA CCG 4123 Asn Ala Ala Ha Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 1365 1370 1375 GAT GAT CTT AAG CAA TAT ATC TTT ATG ACT GAC ACT AAA GGG ACT GCT 176 Asp Asp Leu Lys Gin Tyr He Phe Mac Thr Asp Ser Lys Gly Thr Ala 1380 1385 1390 ACT GAT GTC TCA GGC CCA GTA GAG ATT AAT ACT GCA ATT TCT CCA GCA 4224 Thr Asp Val Ser Gly Pro Val Glu He Asn Thr Ala He Ser Pro Ala 1395 1400 1405 AAA GTT CAG ATA ATA GTC AAA GCG GGT GGC AAG GAG CAA ACT TTT ACC 4272 Lys Val Gin He He Val Lys Ala Gly Gly Lys Glu Gin Thr Phe Thr 1410 1415 1420 GCA GAT AAA GAT GTC TCC ATT CAG CCA TCA CCT AGC TTT GAT GAA ATG 4320 Ala Asp Lys Asp Val Ser He Gin Pro Ser Pro Ser Phe Asp Giu Mec 1425 1430 1435 1440 AAT TAT CAA TTT AAT GCC CTT CAA ATA GAC GGT 
TCT GGT CTG AAT TTT 4363 Asn T r Gin Phe Asn Ala Leu. Glu He Asp Gly Ser Gly Leu Asn Phe -206- 1445 1450 1455 ATT AAC AAC TCA CCC ACT ATT S T GTT ACT TTT ACC GCA TTT CCC GAG 4 10 lie Asn Asn Ser Ala Ser lie Asp Vai Thr Phe Thr Ala Phe Aia Giu 1460 1465 1470 GAT GGC CGC AAA CTG GGT TAT GAA AGT TTC ACT ATT CCT GTT ACC CTC 4464 Asp Gly Arg Lys Leu Gly Tyr Giu Ser Phe Ser lie Pro Vai Thr Leu 1475 1430 1485 .-.AG CTA AGT ACC GAT AAT GCC CTG ACC CTG CAC CAT AAT GAA AAT GGT 4512 Lys Vai Ser Thr Asp Asn Aia Leu Thr Leu His His Asn Giu Asn Gly 1490 1495 1500 CCG CAA TAT ATG CAA TGG CAA TCC TAT CGT ACC CGC CTG AAT ACT CTA 4560 Ala Gin Tyr Mec Gin Trp Gin Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520 TTT GCC CGC CAC TTG GTT GCA CGC GCC ACC ACC GGA ATC GAT ACA ATT 4603 Phe Ala Arg Gin Leu Vai Ala Arg Ala Thr Thr Gly lie Asp Thr lie 1525 1530 1535 CTG AGT ATG GAA ACT CAG AAT ATT CAG GAA CCG CAG TTA GGC AAA GGT 4656 Leu Ser Mec Giu Thr Gin Asn He Gin Giu Pro Gin Leu Gly Lys Gly 1540 1545 1550 TTC TAT GCT ACG TTC GTG ATA CCT CCC TAT AAC CTA TCA ACT CAT GGT 4704 Phe Tyr Ala Thr Phe Vai lie Pro Pro Tyr Asn Leu Ser Thr His Gly 1555 1560 1565 GAT GAA CGT TGG TTT AAG CTT TAT ATC AAA CAT GTT GTT GAT AAT .AAT 4752 Asp Giu Arg Trp Phe Lys Leu Tyr He Lys His Vai Vai Asp Asn Asn 1570 1575 1530 TCA CAT ATT ATC TAT TCA GGC CAG CTA ACA GAT ACA AAT ATA AAC ATC 4300 Ser His He He Tyr Ser Gly Gin Leu Thr Asp Thr Asn He Asn He 1535 1590 1595 1600 ACA TTA TTT ATT CCT CTT GAT GAT GTC CCA TTG AAT CAA GAT TAT CAC 43 3 Thr Leu Phe He Pro Leu Asp Asp Vai Pro Leu Asn Gin Asp Tyr His 1605 1610 1615 GCC AAG GTT TAT ATG ACC TTC AAG AAA TCA CCA TCA GAT GGT ACC TGG 4396 Ala Lys Vai Tyr Mec Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp 1620 1625 1630 TGG GGC CCT CAC TTT GTT AGA GAT GAT AAA GGA ATA GTA ACA ATA AAC 49 Trp Gly Pro His Phe Vai Arg Asp Asp Lys Gly He Vai Thr He Asn 1635 1640 1645 CCT AAA TCC ATT TTG ACC CAT TTT GAG AGC GTC AAT GTC CTG AAT AAT 4992 Pro Lys Ser He Leu Thr His Pha Giu Ser Vai 
Asn Vai Leu Asn Asn 1650 1655 1660 ATT AGT AGC GAA CCA ATG GAT TTC AGC GGC GCT AAC AGC CTC TAT TTC 5040 He Ser Ser Giu Pro Mec Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe 1665 1670 1675 1630 TGG GAA CTG TTC TAC TAT ACC CCG ATG CTG GTT GCT CAA CGT TTG CTG 5088 Trp Giu Leu Phe Tyr Tyr Thr Pro Mec Leu Vai Ala Gin Arg Leu Leu 1685 1690 1655 CAT GAA CAG AAC TTC GAT GAA GCC AAC CGT TGG CTG AAA TAT GTC TGG 136 His Giu Gin Asn Phe Asp Giu Ala Asn Arg Trp Leu Lys Tyr Vai Trp 1700 1705 1710 AGT CCA TCC GGT TAT ATT GTC CAC GGC CAG ATT CAG AAC TAC CAG TGG 5134 Ser Pro Ser Gly Tyr He Vai His Gly Gin He Gin Asn Tyr Gin Trp 1715 1720 1725 AAC GTC CGC CCG TTA CTG GAA GAC ACC AGT TGG AAC AGT GAT CCT TTG 5232 -207- SUBSTTTUTE SHEET (RULE 26) Pro L«u Leu Glu Asp Thr Ser Trp Asn ier Asp Pro 1735 1740 GAT TCC GTC GAT CCT GAC GCG GTA GCA CAG CAC GAT CCA ATG CAC TAC 5230 Asp Ser Val Asp Pro Asp Ala Val Ala Gin His ASD Pro Mec His Tyr 1745 1750 1755 1760 AAA GTT TCA ACT TTT ATG CGT ACC TTG GAT CTA TTG ATA GCA CGC GGC 5323 Lys Val Ser Thr Phe Mec Arg Thr Leu Asp Leu Leu lie Ala Arg Gly 1765 1770 1775 GAC CAT GCT TAT CGC CAA CTG GAA CGA GAT ACA CTC AAC GA_A GCG AAG Asp His Ala Tyr Arg Gin Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 1730 1785 1790 ATG TGG TAT ATG CAA GCG CTG CAT CTA TTA GGT GAC AAA CCT TAT CTA 5424 Mec Trp Tyr Mec Gin Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 1795 1800 1305 CCG CTG AGT ACG ACA TGG AGT GAT CCA CGA CTA GAC AGA GCC GCG GAT 5472 Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 1310 1815 1320 ATC ACT ACC CAA AAT GCT CAC GAC AGC GCA ATA GTC GCT CTG CGG CAG 5520 :ia Thr Thr Gin Asn Ala His Asp Ser Ala He Val Ala Leu Arg Gin 1325 1330 1335 1340 .AAT ATA CCT ACA CCG GCA CCT TTA TCA 5547 Asn He Pro Thr Pro Ala Pro Leu Ser 1845 1349 INFORMATION FOR SEQ ID NO: 49: (i) SEQUENCE CHARACTERISTICS (A) LENGTH: 1849 amino (B) TYPE: amino acids (C) STRANDEDNESS : single (D) TOPOLOGY : linear ii) MOLECULE TYPE: protein (xi ) SEQUENCE DESCRIPTION: SEQ ID NO: 9 (TcdAii ) 
: Features From To Description Peptide 1 1849 TcdAii peptide Fragment 1 12 S2 N-terminus (SEQ ID NO: 13) Fragment 196 211 (SEQ ID NO: 38) Fragment 466 475 (SEQ ID NO: 17) Fragment 993 1004 (SEQ ID NO: 23; 12/13) Fragment 1297 1312 (SEQ ID NO: 18) Fragment 1390 1409 (SEQ ID NO: 39) Fragment 1532 1554 (SEQ ID NO: 21; 19/23) Leu Ile Gly Tyr Asn Asn Gln Phe Ser Gly Arg Ala Ser Gln Tyr Val 1 5 10 15 Ala Pro Gly Thr Val Ser Ser Met Phe Ser Pro Ala Ala Tyr Leu Thr 20 25 30 Glu Leu Tyr Arg Glu Ala Arg Asn Leu His Ala Ser Asp Ser Val Tyr 35 40 45 Tyr Leu Asp Thr Arg Arg Pro Asp Leu Lys Ser Met Ala Leu Ser Gln 50 55 60 Gln Asn Met Asp Ile Glu Leu Ser Thr Leu Ser Leu Ser Asn Glu Leu 65 70 75 80 Leu Leu Glu Ser Ile Lys Thr Glu Ser Lys Leu Glu Asn Tyr Thr Lys 85 90 95 Val Met Glu Met Leu Ser Thr Phe Arg Pro Ser Gly Ala Thr Pro Tyr 100 105 110 His Asp Ala Tyr Glu Asn Val Arg Glu Val Ile Gln Leu Gln Asp Pro 115 120 125 Gly Leu Glu Gln Leu Asn Ala Ser Pro Ala Ile Ala Gly Leu Met His 130 135 140 Gln Ala Ser Leu Leu Gly Ile Asn Ala Ser Ile Ser Pro Glu Leu Phe 145 150 155 160 Asn Ile Leu Thr Glu Glu Ile Thr Glu Gly Asn Ala Glu Glu Leu Tyr 165 170 175 Lys Lys Asn Phe Gly Asn Ile Glu Pro Ala Ser Leu Ala Met Pro Glu 180 185 190 Tyr Leu Lys Arg Tyr Tyr Asn Leu Ser Asp Glu Glu Leu Ser Gln Phe 195 200 205 Ile Gly Lys Ala Ser Asn Phe Gly Gln Gln Glu Tyr Ser Asn Asn Gln 210 215 220 Leu Ile Thr Pro Val Val Asn Ser Ser Asp Gly Thr Val Lys Val Tyr 225 230 235 240 Arg Ile Thr Arg Glu Tyr Thr Thr Asn Ala Tyr Gln Met Asp Val Glu 245 250 255 Leu Phe Pro Phe Gly Gly Glu Asn Tyr Arg Leu Asp Tyr Lys Phe Lys 260 265 270 Asn Phe Tyr Asn Ala Ser Tyr Leu Ser Ile Lys Leu Asn Asp Lys Arg 275 280 285 Glu Leu Val Arg Thr Glu Gly Ala Pro Gln Val Asn Ile Glu Tyr Ser 290 295 300 Ala Asn Ile Thr Leu Asn Thr Ala Asp Ile Ser Gln Pro Phe Glu Ile 305 310 315 320 Gly Leu Thr Arg Val Leu Pro Ser Gly Ser Trp Ala Tyr Ala Ala Ala 325 330 335 Lys Phe Thr Val Glu Glu Tyr Asn Gln Tyr Ser Phe Leu Leu Lys Leu 340 345
350 Asn Lys Ala Ile Arg Leu Ser Arg Ala Thr Glu Leu Ser Pro Thr Ile 355 360 365 Leu Glu Gly Ile Val Arg Ser Val Asn Leu Gln Leu Asp Ile Asn Thr 370 375 380 Asp Val Leu Gly Lys Val Phe Leu Thr Lys Tyr Tyr Met Gln Arg Tyr 385 390 395 400 Ala Ile His Ala Glu Thr Ala Leu Ile Leu Cys Asn Ala Pro Ile Ser 405 410 415 Gln Arg Ser Tyr Asp Asn Gln Pro Ser Gln Phe Asp Arg Leu Phe Asn 420 425 430 Thr Pro Leu Leu Asn Gly Gln Tyr Phe Ser Thr Gly Asp Glu Glu Ile 435 440 445 Asp Leu Asn Ser Gly Ser Thr Gly Asp Trp Arg Lys Thr Ile Leu Lys 450 455 460 Arg Ala Phe Asn Ile Asp Asp Val Ser Leu Phe Arg Leu Leu Lys Ile 465 470 475 480 Thr Asp His Asp Asn Lys Asp Gly Lys Ile Lys Asn Asn Leu Lys Asn 485 490 495 Leu Ser Asn Leu Tyr Ile Gly Lys Leu Leu Ala Asp Ile His Gln Leu 500 505 510 Thr Ile Asp Glu Leu Asp Leu Leu Leu Ile Ala Val Gly Glu Gly Lys 515 520 525 Thr Asn Leu Ser Ala Ile Ser Asp Lys Gln Leu Ala Thr Leu Ile Arg 530 535 540 Lys Leu Asn Thr Ile Thr Ser Trp Leu His Thr Gln Lys Trp Ser Val 545 550 555 560 Phe Gln Leu Phe Ile Met Thr Ser Thr Ser Tyr Asn Lys Thr Leu Thr 565 570 575 Pro Glu Ile Lys Asn Leu Leu Asp Thr Val Tyr His Gly Leu Gln Gly 580 585 590 Phe Asp Lys Asp Lys Ala Asp Leu Leu His Val Met Ala Pro Tyr Ile 595 600 605 Ala Ala Thr Leu Gln Leu Ser Ser Glu Asn Val Ala His Ser Val Leu 610 615 620 Leu Trp Ala Asp Lys Leu Gln Pro Gly Asp Gly Ala Met Thr Ala Glu 625 630 635 640 Lys Phe Trp Asp Trp Leu Asn Thr Lys Tyr Thr Pro Gly Ser Ser Glu 645 650 655 Ala Val Glu Thr Gln Glu His Ile Val Gln Tyr Cys Gln Ala Leu Ala 660 665 670 Gln Leu Glu Met Val Tyr His Ser Thr Gly Ile Asn Glu Asn Ala Phe 675 680 685 Arg Leu Phe Val Thr Lys Pro Glu Met Phe Gly Ala Ala Thr Gly Ala 690 695 700 Ala Pro Ala His Asp Ala Leu Ser Leu Ile Met Leu Thr Arg Phe Ala 705 710 715 720 Asp Trp Val Asn Ala Leu Gly Glu Lys Ala Ser Ser Val Leu Ala Ala 725 730 735 Phe Glu Ala Asn Ser Leu Thr Ala Glu Gln Leu Ala Asp Ala Met Asn 740 745 750 Leu Asp Ala Asn Leu Leu Leu Gln Ala Ser Ile Gln Ala Gln Asn His 755 760 765 Gln His Leu Pro
Pro Val Thr Pro Glu Asn Ala Phe Ser Cys Trp Thr 770 775 780 Ser Ile Asn Thr Ile Leu Gln Trp Val Asn Val Ala Gln Gln Leu Asn 785 790 795 800 Val Ala Pro Gln Gly Val Ser Ala Leu Val Gly Leu Asp Tyr Ile Gln 805 810 815 Ser Met Lys Glu Thr Pro Thr Tyr Ala Gln Trp Glu Asn Ala Ala Gly 820 825 830 Val Leu Thr Ala Gly Leu Asn Ser Gln Gln Ala Asn Thr Leu His Ala 835 840 845 Phe Leu Asp Glu Ser Arg Ser Ala Ala Leu Ser Thr Tyr Tyr Ile Arg 850 855 860 Gln Val Ala Lys Ala Ala Ala Ala Ile Lys Ser Arg Asp Asp Leu Tyr 865 870 875 880 Gln Tyr Leu Leu Ile Asp Asn Gln Val Ser Ala Ala Ile Lys Thr Thr 885 890 895 Arg Ile Ala Glu Ala Ile Ala Ser Ile Gln Leu Tyr Val Asn Arg Ala 900 905 910 Leu Glu Asn Val Glu Glu Asn Ala Asn Ser Gly Val Ile Ser Arg Gln 915 920 925 Phe Phe Ile Asp Trp Asp Lys Tyr Asn Lys Arg Tyr Ser Thr Trp Ala 930 935 940 Gly Val Ser Gln Leu Val Tyr Tyr Pro Glu Asn Tyr Ile Asp Pro Thr 945 950 955 960 Met Arg Ile Gly Gln Thr Lys Met Met Asp Ala Leu Leu Gln Ser Val 965 970 975 Ser Gln Ser Gln Leu Asn Ala Asp Thr Val Glu Asp Ala Phe Met Ser 980 985 990 Tyr Leu Thr Ser Phe Glu Gln Val Ala Asn Leu Lys Val Ile Ser Ala 995 1000 1005 Tyr His Asp Asn Ile Asn Asn Asp Gln Gly Leu Thr Tyr Phe Ile Gly 1010 1015 1020 Leu Ser Glu Thr Asp Ala Gly Glu Tyr Tyr Trp Arg Ser Val Asp His 1025 1030 1035 1040 Ser Lys Phe Asn Asp Gly Lys Phe Ala Ala Asn Ala Trp Ser Glu Trp 1045 1050 1055 His Lys Ile Asp Cys Pro Ile Asn Pro Tyr Lys Ser Thr Ile Arg Pro 1060 1065 1070 Val Ile Tyr Lys Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gln Lys Glu 1075 1080 1085 Ile Thr Lys Gln Thr Gly Asn Ser Lys Asp Gly Tyr Gln Thr Glu Thr 1090 1095 1100 Asp Tyr Arg Tyr Glu Leu Lys Leu Ala His Ile Arg Tyr Asp Gly Thr 1105 1110 1115 1120 Trp Asn Thr Pro Ile Thr Phe Asp Val Asn Lys Lys Ile Ser Glu Leu 1125 1130 1135 Lys Leu Glu Lys Asn Arg Ala Pro Gly Leu Tyr Cys Ala Gly Tyr Gln 1140 1145 1150 Gly Glu Asp Thr Leu Leu Val Met Phe Tyr Asn Gln Gln Asp Thr Leu 1155 1160 1165 Asp Ser Tyr Lys Asn Ala Ser Met Gln Gly Leu Tyr Ile Phe Ala Asp 1170
1175 1180 Met Ala Ser Lys Asp Met Thr Pro Glu Gln Ser Asn Val Tyr Arg Asp 1185 1190 1195 1200 Asn Ser Tyr Gln Gln Phe Asp Thr Asn Asn Val Arg Arg Val Asn Asn 1205 1210 1215 Arg Tyr Ala Glu Asp Tyr Glu Ile Pro Ser Ser Val Ser Ser Arg Lys 1220 1225 1230 Asp Tyr Gly Trp Gly Asp Tyr Tyr Leu Ser Met Val Tyr Asn Gly Asp 1235 1240 1245 Ile Pro Thr Ile Asn Tyr Lys Ala Ala Ser Ser Asp Leu Lys Ile Tyr 1250 1255 1260 Ile Ser Pro Lys Leu Arg Ile Ile His Asn Gly Tyr Glu Gly Gln Lys 1265 1270 1275 1280 Arg Asn Gln Cys Asn Leu Met Asn Lys Tyr Gly Lys Leu Gly Asp Lys 1285 1290 1295 Phe Ile Val Tyr Thr Ser Leu Gly Val Asn Pro Asn Asn Ser Ser Asn 1300 1305 1310 Lys Leu Met Phe Tyr Pro Val Tyr Gln Tyr Ser Gly Asn Thr Ser Gly 1315 1320 1325 Leu Asn Gln Gly Arg Leu Leu Phe His Arg Asp Thr Thr Tyr Pro Ser 1330 1335 1340 Lys Val Glu Ala Trp Ile Pro Gly Ala Lys Arg Ser Leu Thr Asn Gln 1345 1350 1355 1360 Asn Ala Ala Ile Gly Asp Asp Tyr Ala Thr Asp Ser Leu Asn Lys Pro 1365 1370 1375 Asp Asp Leu Lys Gln Tyr Ile Phe Met Thr Asp Ser Lys Gly Thr Ala 1380 1385 1390 Thr Asp Val Ser Gly Pro Val Glu Ile Asn Thr Ala Ile Ser Pro Ala 1395 1400 1405 Lys Val Gln Ile Ile Val Lys Ala Gly Gly Lys Glu Gln Thr Phe Thr 1410 1415 1420 Ala Asp Lys Asp Val Ser Ile Gln Pro Ser Pro Ser Phe Asp Glu Met 1425 1430 1435 1440 Asn Tyr Gln Phe Asn Ala Leu Glu Ile Asp Gly Ser Gly Leu Asn Phe 1445 1450 1455 Ile Asn Asn Ser Ala Ser Ile Asp Val Thr Phe Thr Ala Phe Ala Glu 1460 1465 1470 Asp Gly Arg Lys Leu Gly Tyr Glu Ser Phe Ser Ile Pro Val Thr Leu 1475 1480 1485 Lys Val Ser Thr Asp Asn Ala Leu Thr Leu His His Asn Glu Asn Gly 1490 1495 1500 Ala Gln Tyr Met Gln Trp Gln Ser Tyr Arg Thr Arg Leu Asn Thr Leu 1505 1510 1515 1520 Phe Ala Arg Gln Leu Val Ala Arg Ala Thr Thr Gly Ile Asp Thr Ile 1525 1530 1535 Leu Ser Met Glu Thr Gln Asn Ile Gln Glu Pro Gln Leu Gly Lys Gly 1540 1545 1550 Phe Tyr Ala Thr Phe Val Ile Pro Pro Tyr Asn Leu Ser Thr His Gly 1555 1560 1565 Asp Glu Arg Trp Phe Lys Leu Tyr Ile Lys His Val Val Asp Asn Asn 1570 1575 1580 Ser His Ile Ile Tyr Ser Gly Gln Leu Thr Asp Thr Asn Ile
Asn Ile 1585 1590 1595 1600 Thr Leu Phe Ile Pro Leu Asp Asp Val Pro Leu Asn Gln Asp Tyr His 1605 1610 1615 Ala Lys Val Tyr Met Thr Phe Lys Lys Ser Pro Ser Asp Gly Thr Trp 1620 1625 1630 Trp Gly Pro His Phe Val Arg Asp Asp Lys Gly Ile Val Thr Ile Asn 1635 1640 1645 Pro Lys Ser Ile Leu Thr His Phe Glu Ser Val Asn Val Leu Asn Asn 1650 1655 1660 Ile Ser Ser Glu Pro Met Asp Phe Ser Gly Ala Asn Ser Leu Tyr Phe 1665 1670 1675 1680 Trp Glu Leu Phe Tyr Tyr Thr Pro Met Leu Val Ala Gln Arg Leu Leu 1685 1690 1695 His Glu Gln Asn Phe Asp Glu Ala Asn Arg Trp Leu Lys Tyr Val Trp 1700 1705 1710 Ser Pro Ser Gly Tyr Ile Val His Gly Gln Ile Gln Asn Tyr Gln Trp 1715 1720 1725 Asn Val Arg Pro Leu Leu Glu Asp Thr Ser Trp Asn Ser Asp Pro Leu 1730 1735 1740 Asp Ser Val Asp Pro Asp Ala Val Ala Gln His Asp Pro Met His Tyr 1745 1750 1755 1760 Lys Val Ser Thr Phe Met Arg Thr Leu Asp Leu Leu Ile Ala Arg Gly 1765 1770 1775 Asp His Ala Tyr Arg Gln Leu Glu Arg Asp Thr Leu Asn Glu Ala Lys 1780 1785 1790 Met Trp Tyr Met Gln Ala Leu His Leu Leu Gly Asp Lys Pro Tyr Leu 1795 1800 1805 Pro Leu Ser Thr Thr Trp Ser Asp Pro Arg Leu Asp Arg Ala Ala Asp 1810 1815 1820 Ile Thr Thr Gln Asn Ala His Asp Ser Ala Ile Val Ala Leu Arg Gln 1825 1830 1835 1840 Asn Ile Pro Thr Pro Ala Pro Leu Ser 1845 1849 (2) INFORMATION FOR SEQ ID NO: 50: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1740 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 (TcdAiii coding region) TTG CGC AGC GCT AAT ACC CTG ACT GAT CTC TTC CTG CCG CAA ATC AAT 48 Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gln Ile Asn 1 5 10 15 GAA GTG ATG ATG AAT TAC TGG CAG ACA TTA GCT CAG AGA GTA TAC AAT 96 Glu Val Met Met Asn Tyr Trp Gln Thr Leu Ala Gln Arg Val Tyr Asn 20 25 30 CTG CGT CAT AAC CTC TCT ATC GAC GGC CAG CCG TTA TAT CTG CCA ATC 144 Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Tyr
Leu Pro H i 35 40 45 TAT GCC ACA CCG GCC GAT AAA. TTA CTC AGC GCC CC GTT GCC 5 Tyr Aia Thr Pro Ala Asp Pro Lys Al Leu Leu Ser Ai Aia Val Aia 50 55 60 ACT TCT CAA GGT GGA AAG C A. CCG G A ~.— . TTT A.TG TCC i TGG Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Mec Ser Leu Trp 1U i 5 70 30 CGT TTC CCG CAC ATG CTG GAA AAT CGC GGC ATG GTT AGC Arg Phe Pro His Mec Leu Glu Asn Ala Arg Gly Mec V i Ser Gin Leu 85 90 95 il J< ACC CAG TTC GGC TCC ACG TTA CAA AAT ATT ATC GAA CGT CAG GAC GCG Thr Gin Pha Gly Ser Thr Leu Gin Asn He lis Glu Arg Gin Asp Ala 100 105 110 0 GAA GCG CTC AAT GCG TTA TTA CAA AAT CAG GCC GCC GAG CTG ATA TTG Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu He Leu 115 120 125 ACT AAC CTG AGC ATT CAG GAC AAA ACC ATT GAA GAA TTG GAT GCC GAG 5 Thr Asn Leu Ser lie Gin Asp Lys Thr lie Giu Glu Leu Asp Ala Giu 130 135 140 AAA ACG GTG TTG GAA AAA TCC AAA GCG GGA GCA C A TCG CGC TTT GAT Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe Asp 0 145 150 155 160 AGC TAC GGC AAA CTG TAC GAT GAG AAT ATC .AAC GCC GGT GAA AAC C A Ser T r Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn Gin 165 170 175 5 GCC ATG ACG CTA CGA GCG TCC GCC GCC GGG CTT ACC ACG GCA GTT CAG Ala Mec Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Aia Val Gin 130 135 190 0 GCA TCC CGT CTG GCC GGT GCG GCG GCT GAT CTG GTG CCT 'AAC ATC TTC Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He Phe 195 200 205 GGC TTT GCC GGT GGC GGC AGC CGT TGG GGG GCT ATC GCT GAG GCG ACA 5 Gly Pha Ala Gly Gly Gly Ser Arg Trp Gly Ala He Ala Glu Ala Thr 210 215 220 GGT TAT GTG ATG GAA TTC TCC GCG AAT GTT ATG AAC ACC GAA GCG GAT Gly Tyr Val Mac Glu Phe Ser Ala Asn Val Mec Asn Thr Glu Ala Asp 0 225 230 235 240 AAA ATT AGC CAA TCT GAA ACC TAC CGT CGT CGC CGT CAG GAG TGG GAG 753 Lys Ha Ser Gin Ser Glu Thr Tyr Arg Arg Arg Arg Gin Glu Trp Glu 245 250 255 5 ATC CAG CGG AAT AAT GCC GAA GCG GAA TTG AAG CAA ATC GAT GCT CAG 315 He Gin Arg Asn Asn Ala Glu Ala Glu Leu Lys Gin He Asp Ala Gin 260 265 270 0 CTC AAA TCA 
CTC GCT GTA CGC CGC GAA GCC GCC GTA TTG CAG AAA ACC 364 Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gin Lys Thr 275. 230 235 AGT CTG AAA ACC CAA CAA GAA CAG ACC C A TCT CAA TTG GCC TTC CTG 912 5 Ser Leu Lys Thr Gin Gin Glu Gin Thr Gin Ser Gin Leu Ala Phe Leu 290 295 300 CAA CGT .AAG TTC AGC AAT CAG GCG TTA TAC .AAC TGG CTG CGT GGT CGA 960 Gin Arg Lys Phe Ser Asn Gin Ala Leu Tyr Asn Trp Leu Arg Gly Arg 0 305 310 315 320 -214- ΓΤ J^- j C ΛΤΤ ..-.C XT'- -.-.G TTC Trw. Ι_ΙΛΤ ITG G GTC GCG CGT TCC Liu Aia Al lie Tyr Phe Gin Phe Tyr Asp Leu Ala Val Aia Arg Cvs 325 330 335 CTG ATG GCA GAA CAA GCT TAC CGT TGG GAA CTC AAT GAT GAC TCT GCC 1056 Leu Mec Ala Glu Gin Ala Tyr Arg Trp Glu Leu Asn As.p Asp Ser Aia 340 345 350 CGC TTC ATT AAA CCG GGC GCC TGG CAG GCA ACC TAT GCC GGT CTG CTT 1104 Arg Phe He Lys Pro Giy Aia Trp Gin Gly Thr Tyr Aia Gly Leu Leu 355 350 365 GCA GGT GAA ACC TTG ATG CTG ACT CTG GCA CAA ATG GAA GAC GCT CAT 1.152 Aia Gly Glu Thr Leu Mec Leu Ser Leu Ala Gin Mec Glu Asp Ala His 370 375 330 CTG AAA CGC GAT AAA CGC GCA TTA GAG GTT GAA CGC ACA GTA TCG CTG 1200 Leu Lys Arg Asp Lys Arg Ala Leu Giu Val Glu Arg Thr Val Ser Leu 335 390 395 400 GCC GAA GTT TAT GCA GGA TTA CCA AAA GAT AAC GGT CCA TTT TCC CTG 1243 Ala Giu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu 405 410 415 GCT CAG GAA ATT GAC AAG CTG GTG AGT CAA GGT TCA GGC AGT GCC GGC 1296 Ala Gin Glu lie Asp Lys Leu Val Ser Gin Gly Ser Gly Ser Ala Gly 420 425 430 AGT GGT AAT AAT AAT TTG GCG TTC GGC GCC GGC ACQ GAC ACT AAA ACC 1344 Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr 435 440 445 TCT TTG CAG GCA TCA GTT TCA TTC GCT GAT TTG AAA ATT CGT GAA GAT 1352· Ser Leu Gin Ala Ser Val Ser Phe Ala Asp Leu Lys He Arg Glu Asp 450 455 460 TAC CCG GCA TCG CTT GGC AAA ATT CGA CGT ATC AAA CAG ATC AGC GTC 1 40 Tyr Pro Ala Ser Leu Gly Lys He Arg Arg lie Lys Gin He Ser Val 465 470 475 430 ACT TTG CCC GCG CTA CTG GGA CCG TAT CAG GAT GTA CAG GCA ATA TTG 1433 Thr Leu Pro Ala Leu Leu Gly Pro Tyr 
Gin Asp Val Gin Ala He Leu 485 490 495 TCT TAC GGC GAT AAA GCC GGA TTA GCT AAC GGC TGT GAA GCG CTG GCA Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala 500 505 510 GTT TCT CAC GGT ATG AAT GAC AGC GGC CAA TTC CAG CTC GAT TTC AAC 1534 Val Ser His Gly Mec Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn 515 520 525 GAT GGC AAA TTC CTG CCA TTC GAA GGC ATC GCC ATT GAT CAA GGC ACG 1632 Asp Gly Lys Phe Leu Pro Phe Glu Gly He Ala He As Gin Gly Thr 530 535 540 CTG ACA CTG AGC TTC CCA AAT GCA TCT ATG CCG GAG AAA GGT AAA CAA 1630 Leu Thr Leu Ser Phe Pro Asn Ala Ser Mec Pro Glu Lys Gly Lys Gin 545 550 555 560 GCC ACT ATG TTA AAA ACC CTG AAC GAT ATC ATT TTG CAT ATT CGC TAC i"23 Ala Thr Mec Leu Lys Thr Leu Asn Asp He lie Leu His He Arg Tyr 565 570 575 (2) JFORMATION FOR SEQ ID MO : 51 : -215- SUBST1TUTE SHEET (RULE 26) (A) LENGTH: 5~'? ammo acids (3) TYPE: amino acids (C) STRANDEDMESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein !xi) SEQUENCE DESCRIPTION : SEQ ID NO : 51 (TcdAiii) : I (J Leu Arg Ser Ala Asn Thr Leu Thr Asp Leu Phe Leu Pro Gin lie Asn 1 5 10 15 Glu Val Mec Mec Asn Tyr Trp Gin Thr Leu Ala Gin Arg Val Tyr Asn 20 25 30 Leu Arg His Asn Leu Ser He Asp Gly Gin Pro Leu Tyr Leu Pro lie 35 40 45 Tyr Ala Thr Pro Ala Asp Pro Lys Ala Leu Leu Ser Ala Ala Val Ala 50 55 60 Thr Ser Gin Gly Gly Gly Lys Leu Pro Glu Ser Phe Mec Ser Leu Trp 55 70 75 30 5 Arg Phe Pro His Mec Leu Glu Asn Ala Arg Gly Mec Val Ser Gin Leu 85 90 95 Thr Gin Phe Gly Ser Thr Leu Gin Asn lie He Glu Arg Gin Asp Ala 100 105 110 Glu Ala Leu Asn Ala Leu Leu Gin Asn Gin Ala Ala Glu Leu lie Leu 115 120 125 Thr Asn Leu Ser lie Gin Asp Lys Thr He Glu Glu Leu Asp Ala Glu 130 135 140 Lys Thr Val Leu Glu Lys Ser Lys Ala Gly Ala Gin Ser Arg Phe Asp 145 150 155 160 0 Ser Tyr Gly Lys Leu Tyr Asp Glu Asn He Asn Ala Gly Glu Asn Gin 165 170 175 Ala Mec Thr Leu Arg Ala Ser Ala Ala Gly Leu Thr Thr Ala Val Gin 5 180 185 190 Ala Ser Arg Leu Ala Gly Ala Ala Ala Asp Leu Val Pro Asn He Phe 195 200 205 0 Gly Phe 
Ala Gly Gly Gly Ser Arg Trp Gly Ala Ile Ala Glu Ala Thr 210 215 220 Gly Tyr Val Met Glu Phe Ser Ala Asn Val Met Asn Thr Glu Ala Asp 225 230 235 240 Lys Ile Ser Gln Ser Glu Thr Tyr Arg Arg Arg Arg Gln Glu Trp Glu 245 250 255 Ile Gln Arg Asn Asn Ala Glu Ala Glu Leu Lys Gln Ile Asp Ala Gln 260 265 270 Leu Lys Ser Leu Ala Val Arg Arg Glu Ala Ala Val Leu Gln Lys Thr 275 280 285 Ser Leu Lys Thr Gln Gln Glu Gln Thr Gln Ser Gln Leu Ala Phe Leu 290 295 300 Gln Arg Lys Phe Ser Asn Gln Ala Leu Tyr Asn Trp Leu Arg Gly Arg 305 310 315 320 Leu Ala Ala Ile Tyr Phe Gln Phe Tyr Asp Leu Ala Val Ala Arg Cys 325 330 335 Leu Met Ala Glu Gln Ala Tyr Arg Trp Glu Leu Asn Asp Asp Ser Ala 340 345 350 Arg Phe Ile Lys Pro Gly Ala Trp Gln Gly Thr Tyr Ala Gly Leu Leu 355 360 365 Ala Gly Glu Thr Leu Met Leu Ser Leu Ala Gln Met Glu Asp Ala His 370 375 380 Leu Lys Arg Asp Lys Arg Ala Leu Glu Val Glu Arg Thr Val Ser Leu 385 390 395 400 Ala Glu Val Tyr Ala Gly Leu Pro Lys Asp Asn Gly Pro Phe Ser Leu 405 410 415 Ala Gln Glu Ile Asp Lys Leu Val Ser Gln Gly Ser Gly Ser Ala Gly 420 425 430 Ser Gly Asn Asn Asn Leu Ala Phe Gly Ala Gly Thr Asp Thr Lys Thr 435 440 445 Ser Leu Gln Ala Ser Val Ser Phe Ala Asp Leu Lys Ile Arg Glu Asp 450 455 460 Tyr Pro Ala Ser Leu Gly Lys Ile Arg Arg Ile Lys Gln Ile Ser Val 465 470 475 480 Thr Leu Pro Ala Leu Leu Gly Pro Tyr Gln Asp Val Gln Ala Ile Leu 485 490 495 Ser Tyr Gly Asp Lys Ala Gly Leu Ala Asn Gly Cys Glu Ala Leu Ala 500 505 510 Val Ser His Gly Met Asn Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn 515 520 525 Asp Gly Lys Phe Leu Pro Phe Glu Gly Ile Ala Ile Asp Gln Gly Thr 530 535 540 Leu Thr Leu Ser Phe Pro Asn Ala Ser Met Pro Glu Lys Gly Lys Gln 545 550 555 560 Ala Thr Met Leu Lys Thr Leu Asn Asp Ile Ile Leu His Ile Arg Tyr 565 570 575 Thr Ile Lys ··· 579 (2) INFORMATION FOR SEQ ID NO: 52: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 5532 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION:
SEQ ID MO: 52 (TcdA^ coding regi n! TTT ATA CAA GGT TAT ACT GAT CTG TTT GGT AAT CGT GCT GAT AAC TAT 43 Phe He Gin Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr i 5 10 15 ■ZCC GCG CCG GGC TCG GTT GCA TCG ATG TTC TCA CCG GCG GCT TAT TTG 96 -217- SUBSTTTUTE SHEET (RULE 26) Ala A a Pro Ziy Ser a a Ser Mec e Ser Pro Ala Ala Tyr Leu 20 25 30 ACG GAA TTG TAC CGT GAA GCC AAA AAC TTG CAT GAC ACC AGC TCA ATT 144 Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser He 35 40 45 TAT TAC CTA GAT AAA CGT CCC CCG GAT TTA GCA AGC TTA ATG CTC AGC 132 T r T/r Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser 50 55 50 CAG AAA AAT ATG GAT GAG GAA ATT TCA ACG CTG GCT CTC TCT AAT GAA 240 Gin Lys Asn Mec Asp Giu Glu lie Ser Thr Leu Ala Leu Ser Asn Glu 55 "0 75 30 TTG TGC CTT GCC GGG ATC GAA ACA AAA ACA GGA AAA TCA CAA GAT GAA 233 Leu Cys Leu Ala Gly lie Glu Thr Lys Thr Gly Lys Ser Gin Asp Glu 85 90 95 GTG ATG GAT ATG TTG TCA ACT TAT CGT TTA ACT GGA GAG ACA CCT TAT 336 Val Mec Asp Mec Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr 100 105 HO CAT CAC GCT TAT GAA ACT CTT CGT GAA ATC GTT CAT GAA CGT GAT CCA 334 His His Ala Tyr Glu Thr Val Arg Glu lie Val His Glu Arg Asp Pro 115 120 125 GGA TTT CGT CAT TTG TCA CAG GCA CCC ATT GTT GCT GCT AAG CTC GAT 432 Gly Phe Arg His Leu Ser Gin Ala Pro He Val Ala Ala Lys Leu Asp 130 135 140 CCT GTG ACT TTG TTG GGT ATT AGC TCC CAT ATT TCG CCA GAA CTG TAT 480 Pro Val Thr Leu Leu Gly lie Ser Ser His He Ser Pro Glu Leu Tyr 145 150 155 160 AAC TTG CTG ATT GAG GAG ATC CCG GAA AAA GAT GAA GCC GCG CTT GAT 528 Asn Leu Leu He Glu Glu lie Pro Glu Lys Asp Glu Ala Ala Leu Asp 165 170 175 ACG CTT TAT AAA ACA AAC TTT GGC GAT ATT ACT ACT GCT CAG TTA ATG 576 Thr Leu Tyr Lys Thr Asn Phe Gly Asp He Thr Thr Ala Gin Leu Mec 180 135 190 TCC CCA ACT TAT CTG GCC CGG TAT TAT GGC GTC TCA CCG GAA GAT ATT 624 Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp He 195 200 205 GCC TAC GTG ACG ACT TCA TTA TCA CAT GTT GGA TAT AGC AGT GAT ATT 672 Ala Tyr Val Thr 
Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp lis 210 215 220 CTG GTT ATT CCG TTG GTC GAT GGT GTG GGT AAG ATG GAA GTA GTT CGT 720 Leu Val He Pro Leu Val Asp Gly Val Gly Lys Mec Glu Val Val Arg 225 230 235 240 GTT ACC CGA ACA CCA TCG GAT AAT TAT ACC AGT CAG ACG AAT TAT ATT 763 Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gin Thr Asn Tyr He 245 250 255 GAG CTG TAT CCA CAG GGT GGC GAC AAT TAT TTG ATC AAA TAC AAT CTA 316 Glu Leu Tyr Pro Gin Gly Gly Asp Asn Tyr Leu He Lys Tyr Asn Leu 260 265 270 AGC AAT AGT TTT GGT TTG GAT GAT TTT TAT CTG CAA TAT AAA GAT GGT 364 Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gin Tyr Lys Asp Gly 275 280 285 TCC GCT GAT TCG ACT GAG ATT GCC CAT AAT CCC TAT CCT GAT ATG GTC 912 Ser Ala Asp Trp Thr Glu He Ala His Asn Pro Tyr Pro Asp Mec Val 290 295 300 -218- ATA AAT AAG TAT .--A T A CAG GC ACA ATC .AAA CGT AGT GAC TCT 360 :le Asn Gin Lys Tyr Giu Ser Gin Ala Thr lie Lys Arg Ser Asp Ser 305 310 315 32C "AC AAT ATA CTC AGT ATA GGG TTA CAA AGA TGG CAT AGC GGT AGT TAT 1003 Asp Asn lie Leu Ser lie Giy Leu Gin Arg Trp His Ser Giy Ser Tyr 325 330 335 AAT TTT GCC GCC GCC AAT TTT AAA ATT GAC C A TAC TCC CCG AAA GCT 1056 Asn Phe Ala Ala Ala Asn Phe Lys He Asp Gin T/r Ser Pro Lys Ala 340 345 350 TTC C G CTT AAA ATG AAT AAG GCT ATT CGG TTG CTC AAA GCT ACC GGC 1104 Phe Leu Leu Lys Mec Asn Lys Ala He Arg Leu Leu Lys Ala Thr Giy 355 360 365 CTC TCT TTT GCT ACG TTG GAG CGT ATT GTT GAT AGT GTT AAT AGC ACC - 52 Leu Ser Phe Ala Thr Leu Glu Arg He Val Asp Ser Val Asn Ser Thr 370 375 380 2U AAA TCC ATC ACG GTT GAG GTA TTA AAC AAG GTT TAT CGG GTA AAA TTC 1200 Lys Ser He Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 335 390 395 400 25 "TAT ATT GAT CGT TAT GGC ATC AGT GAA GAG ACA GCC GCT ATT TTG GCT 1248 Tyr lis Asp Arg Tyr Giy He Ser Glu Glu Thr Ala Ala He Leu Ala 405 410 415 _AT ATT AAT ATC TCT CAG CAA GCT GTT GGC AAT CAG CTT AGC CAG TTT 1296 sn lie Asn He Ser Gin Gin Ala Val Giy Asn Gin Leu Ser Gin Phe 420 425 430 GAG CAA CTA TTT AAT CAC CCG CCG CTC AAT GGT ATT 
CGC TAT GAA ATC 1344 Glu Gin Leu Phe Asn His Pro Pro Leu Asn Giy He Arg Tyr Glu He 435 440 445 AGT GAG GAC AAC TCC AAA CAT CTT CCT AAT CCT GAT CTG AAC CTT AAA 1392 Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 450 455 460 40 CCA GAC AGT ACC GGT GAT GAT CAA CGC AAG GCG GTT TTA AAA CGC GCG 1440 Pro Asp Ser Thr Giy Asp Asp Gin Arg Lys Ala Val Leu Lys Arg Ala 465 470 475 480 45 TTT CAG GTT AAC GCC AGT GAG TTG TAT CAG ATG TTA TTG ATC ACT GAT 1 83 Phe Gin Val Asn Ala Ser Glu Leu Tyr Gin Mec Leu Leu He Thr Asp 485 490 495 CGT AAA GAA GAC GGT GTT ATC AAA AAT AAC TTA GAG AAT TTG TCT GAT 1536 50 Arg Lys Glu Asp Giy Val He Lys Asn Asn Leu Glu Asn Leu Ser Asp 500 505 510 CTG TAT TTG GTT AGT TTG CTG GCC CAG ATT CAT AAC CTG ACT ATT GCT 1534 Leu Tyr Leu Val Ser Leu Leu Ala Gin He His Asn Leu Thr He Ala 55 515 520 525 GAA TTG AAC ATT TTG TTG GTG ATT TGT GGC TAT GGC GAC ACC AAC ATT 1532 Glu Leu Asn He Leu Leu Val He Cys Giy Tyr Giy Asp Thr Asn He 530 535 540 6U TAT CAG ATT ACC GAC GAT AAT TTA GCC AAA ATA GTG GAA ACA TTG TTG 1630 T/r Gin He Thr Asp Asp Asn Leu Ala Lys He Val Glu Thr Leu Leu 545 550 555 560 ή TGG ATC ACT CAA TGG TTG AAG ACC CAA AAA TGG ACA GTT ACC GAC CTG i"23 Trp He Thr Gin Trp Leu Lys Thr Gin Lys Trp Thr Val Thr Asp Leu 565 570 575 TTT CTG ATG ACC ACG GCC ACT TAC AGC ACC ACT TTA ACG CCA G A ATT 1~76 70 Phe Leu Mec Thr Thr Ala Thr Tyr Ser Thr Thr Leu Thr Pro Glu He 580 585 590 -219- SUBSTITUTE SHEET (RULE 26} AGC .AAT CTG ACG GCT ACC TTG TCT TCA ACT TTG CAT GGC AAA GAG AGT 132-1 ier Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Giu ier 595 500 605 CTG ATT GGG GAA GAT CTG AAA AGA GCA ATG GCG CCT TGC TTC ACT TCG 1372 L<=u He Gly Glu Asp Leu Lys Arg Ala Mac Ala Pro Cys Phe Thr Ser 610 615 620 GCT TTG CAT TTG ACT TCT CAA GAA GTT GCG TAT GAC CTG CTG TTG TGG 1920 Ala Leu His Liu Thr Ser Gin Glu Val Ala Tyr Asp Leu Leu Leu Trp 625 630 535 640 ATA GAC CAG ATT AA CCG GCA CAA ATA ACT GTT GAT GGG TTT TGG GAA IS 63 lie Asp Gin He Gin Pro Ala Gin lie Thr 
Val Asp Gly Phe Trp Glu 645 650 555 GAA GTG CAA ACA ACA CCA ACC AGC TTG AAG GTG ATT ACC TTT GCT CAG 2015 Glu Val Gin Thr Thr Pro Thr Ser Leu Lys Val lis Thr Phe Ala Gin 660 665 670 GTG CTG GCA CAA TTG ACC CTG ATC TAT CGT CGT ATT GGG TTA AGT GAA 2064 Val Leu Ala Gin Leu Ser Leu He Tyr Arg Arg lie Gly Leu Ser Giu 675 630 635 ACG GAA CTG TCA CTG ATC GTG ACT CAA TCT TCT CTG CTA GTG GCA GGC 2112 Thr Glu Leu Ser Leu He Val Thr Gin Ser Ser Leu Leu Val Ala Gly 690 695 700 AAA AGC ATA CTG GAT CAC GGT CTG TTA ACC CTG ATG GCC TTG GAA GGT 2150 Lys Ser He Leu Asp His Gly Leu Leu Thr Lau Mec Ala Leu Glu Gly 705 710 715 720 TTT CAT ACC TGG GTT AAT GGC TTG GGG CAA CAT GCC TCC TTG ATA TTG 2203 Phe His Thr Trp Val Asn Gly Leu Gly Gin His Ala Ser Leu He Leu 725 730 · 735 GCG GCG TTG AAA GAC GGA GCC TTG ACA GTT ACC GAT GTA GCA CAA GCT 2256 Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gin Ala 740 745 750 ATG AAT AAG GAG GAA TCT CTC CTA CAA ATG GCA GCT AAT CAG GTG GAG 2304 Mac Asn Lys Glu Glu Ser Leu Leu Gin Mac Ala Ala Asn Gin Val Glu 755 760 765 AAG GAT CTA ACA AAA' CTG ACC AGT TGG ACA CAG ATT GAC GCT ATT CTG 23 2 Lys Asp Leu Thr Lys Lau Thr Ser Trp Thr Gin He Asp Ala He Leu 770 775 780 CAA TGG TTA CAG ATG TCT TCG GCC TTG GCG GTT TCT CCA CTG GAT CTG 2400 Gin Trp Lau Gin Mac Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu 735 790 795 300 GCA GGG ATG ATG GCC CTG AAA TAT GGG ATA GAT CAT AAC TAT GCT GCC 2 48 Ala Gly Mec Mec Ala Leu Lys Tyr Gly He Asp His Asn Tyr Ala Ala 305 810 315 TGG CAA GCT GCG GCG GCT GCG CTG ATG GCT GAT CAT GCT AAT CAG GCA 2495 Trp Gin Ala Ala Ala Ala Ala Lau Mac Ala Asp His Ala Asn Gin Ala 820 825 830 CAG AAA AAA CTG GAT GAG ACG TTC AGT AAG GCA TTA TGT AAC TAT TAT 2 4 Gin Lys Lys Lau Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn Tyr Tyr 335 840 845 ATT AAT GCT GTT GTC GAT AGT GCT GCT GGA GTA CGT GAT CGT AAC GGT 2592 He Asn Ala Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly 350 855 860 TTA TAT ACC TAT TTG CTG ATT GAT AAT CAG GTT TCT GCC GAT GTG ATC 2640 Leu T r 
Thr Tyr Leu Leu He Asp Asn Gin Val Ser Ala Asp Val He -220- 365 370 375 ACT TC COT ATT OCA GAA GCT ATC GCC GGT ATT CAA CTG TAC GTT AAC 2633 Thr Ser Arg lie Al Glu Ala ie Aia Giy lie Gin Leu Tyr i Asn 385 390 395 CGG GCT TTA AAC CCA GAT GAA GGT CAG CTT GCA TCG GAC GTT AGT ACC 2736 Arg Ala Leu Asn Arg Asp Glu Giy Gin Leu Ala Ser Asp Val Ser Thr 900 905 910 CGT CAG TTC TTC ACT GAC TGG GAA CGT TAC AAT AAA CGT TAC AGT ACT 273-1 Arg Gin Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr 915 920 925 TGG GCT GGT GTC TCT GAA CTG GTC TAT TAT CCA GAA AAC TAT G T GAT 2332 Trp Ala Giy Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asp 930 935 940 CCC ACT CAG CGC ATT GGG CAA ACC AAA ATG ATG GAT GCG CTG TTC CAA 2330 Pro Thr Gin Arg lie Giy Gin Thr Lys Mec Mec Asp Ala Leu Leu Gin 945 950 955 960 TCC ATC AAC CAG AGC CAG CTA AAT GCG GAT ACG GTG GAA GAT GCT TTC 2923 Ser lie Asn Gin Ser Gin Leu Asn Ala Asp Thr Val Giu Asp Ala Phe 955 970 975 AAA ACT TAT TTG ACC AGC TTT GAG CAG GTA GCA AAT CTG AAA GTA ATT 2975 Lys Thr Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He 930 985 990 AGT GCT TAC CAC GAT AAT GTG AAT GTG GAT CAA GGA TTA ACT TAT TTT 3024 Ser Ala Tyr His Asp Asn Val Asn Val Asp Gin Giy Leu Thr Tyr Phe 995 1000 1005 ATC GGT ATC GAC CAA GCA GCT CCG GGT ACG TAT TAC TGG CGT AGT GTT 3072 He Giy He Asp Gin Ala Ala Pro Giy Thr Tyr Tyr Trp Arg Ser Val 1010 1015 1020 GAT CAC AGC AAA TGT GAA AAT GGC AAG TTT GCC GCT AAT GCT TGG GGT 3120 Asp His Ser Lys Cys Glu Asn Giy Lys Phe Ala Ala Asn Ala Trp Giy 1025 1030 1035 1040 GAG TGG AAT AAA ATT ACC TGT GCT GTC AAT CCT TGG AAA AAT ATC ATC 3168 Glu Trp Asn Lys lie Thr Cys Ala Val Asn Pro Trp Lys Asn He He 1045 1050 1055 CGT CCG GTT GTT TAT ATG TCC CGC TTA TA CTG CTA TGG CTG GAG CAG 3215 Arg Pro Vai Val Tyr Mec Ser Arg Leu Tyr Leu Leu Trp Leu Glu Gin 1060 1065 1070 CAA TCA AAG AAA AGT GAT GAT GGT AAA ACC ACG ATT TAT CAA TAT AAC 3264 Gin Ser Lys Lys Ser Asp Asp Giy Lys Thr Thr lie Tyr Gin Tyr Asn 1075 1030 1085 TTA AAA CTG GCT CAT ATT 
CGT TAC GAC GGT AGT TGG AAT ACA CCA TTT 3312 Leu Lys Leu Ala His He Arg -Tyr Asp Giy Ser Trp Asn Thr Pro Phe 1090 1095 1100 ACT TTT GAT GTG ACA GAA AAG GTA AAA AAT TAC ACG TCG AGT ACT GAT 3360 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120 GCT GCT GAA TCT TTA GGG TTG TAT TGT ACT GGT TAT CAA GGG GAA GAC 3403 Ala Ala Glu Ser Leu Giy Leu Tyr Cys Thr Giy Tyr Gin Giy Glu Asp 1125 1130 1135 ACT CTA TTA GTT ATG TTC TAT TCG ATG CAG AGT AGT TAT AGC TCC TAT 3456 Thr Leu Leu Val Mec Phe Tyr Ser Mec Gin Ser Ser Tyr Ser Ser r/r 1140 1145 1150 ACC GAT AAT AAT GCG CCG GTC ACT GGG CTA TAT ATT TTC GCT GAT ATG 3504 -221- Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr :'ie ?he Ala As Mec 1155 1 i 50 1165 TCA TCA GAC AAT A G ACG AAT GCA CAA GCA ACT AAC TAT TGG AAT AAC Ser Ser Asp Asn Mec Thr Asn Ala Gin Ala Thr Asn Tyr Trp Asn Asn 1170 1175 1130 AGT TAT COG CAA TTT GAT ACT GTG ATG GCA GAT CCG GAT AGC GAC AAT 3600 Ser T Pro Gin Phe Asp Thr Val Mec Ala Asp Pro Asp Ser Asp Asn 1135 1190 1195 1200 GTC ATA ACC AGA A A GTT AAT AAC CGT TAT GCG GAG GAT TAT 3643 Lvs Lys Val lie Thr Arg Arg Val Asn Asn Arg Tyr Ala Glu Asp T/r 1205 1210 1215 GAA ATT CCT TCC TCT GTG ACA AGT AAC AGT AAT TAT TCT TGG GGT GAT 3696 Glu lie Pro Ser Ser Val Thr Ser Asn Ser Asn Tyr Ser Trp Gly \≤ 1220 1225 1230 CAC AGT TTA ACC ATG CTT TAT GGT GGT AGT GTT CCT AAT ATT ACT TTT 744 His Ser Leu Thr Mec Leu Tyr Gly Gly Ser Val Pro Asn lie Thr Phe 1235 1240 1245 GAA TCC GCG GCA GAA GAT TTA AGG CTA TCT ACC AAT ATG GCA TTG AGT 3792 Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Mec Ala Leu Ser 1250 1255 1250 ATT ATT CAT AAT GGA TAT GCG GGA ACC CGC CGT ATA CAA TGT AAT CTT 3840 lie lie His Asn Gly Tyr Ala Gly Thr Arg Arg He Gin Cys Asn Leu 1255 1270 1275 1230 ATG AAA CAA TAC GCT TCA TTA GGT GAT AAA TTT ATA ATT TAT GAT TCA 3333 Mec Lys Gin Tyr Ala Ser Leu Gly Asp Lys Phe lie lie Tyr Asp Ser 1235 1290 1295 TCA TTT GAT GAT GCA AAC CGT TTT AAT CTG GTG CCA TTG TTT AAA TTC 3936 Ser Phe Asp Asp Ala Asn 
Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310 GGA AAA GAC GAG AAC TCA GAT GAT AGT ATT TGT ATA TAT AAT GAA AAC 3984 Gly Lys Asp Glu Asn Ser Asp Asp Ser lie Cys lie Tyr Asn Glu Asn 1315 1320 1325 CCT TCC TCT GAA GAT AAG AAG TGG TAT TTT TCT TCG AAA GAT GAC AAT 4032 Pro Ser Ser Glu Asp Lys Lys Trp Tyr Phe Ser Ser Lys Asp Asp Asn 1330 1335 1340 AAA ACA GCG GAT TAT AAT GGT GGA ACT CAA TGT ATA GAT GCT GGA ACC 4030 Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cys lie Asp Ala Gly Thr 1345 1350 1355 1360 AGT AAC AAA GAT TTT TAT TAT AAT CTC CAG GAG ATT GAA GTA ATT AGT 123 Ser Asn Lys Asp Phe Tyr Tyr Asn Leu Gin Glu lie Glu Val lie Ser 1365 1370 1375 GTT ACT GGT GGG TAT TGG TCG AGT TAT AAA ATA TCC AAC CCG ATT AAT 176 Val Thr Gly Gly Tyr Trp Ser Ser Tyr Lys I la Ser Asn Pro lie Asn 1380 1385 1390 ATC AAT ACG GGC ATT GAT AGT GCT AAA GTA AAA GTC ACC GTA AAA GCG 4224 lie Asn Thr Gly lie Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405 GGT GGT GAC GAT CAA ATC TTT ACT GCT GAT AAT AGT ACC TAT GTT CCT 4272 Gly Gly Asp Asp Gin lie Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420 CAG CAA CCG GCA CCC AGT TTT GAG GAG ATG ATT TAT CAG TTC .AAT AAC 4320 Gin Gin Pro Ala Pro Ser Phe Glu Glu Mec lie T/r Gin Phe Asn Asn 1425 1430 1435 1440 _ ^ -> i _ CTG ACA ATA GAT T AAG AAT TTA AAT TTC ATC GAC AAT CAG GCA -AT ci Leu Thr He Asp Cys Lys Asn Leu Asn Phe lie Asp Asn Gin Ala His 1445 1450 1455 ATT GAG ATT GAT TTC ACC GCT ACG GCA CAA GAT GGC CGA TTC TTG GGT 44 id lie Glu lie Asp Phe Thr Ala Thr Ala Gin Asp Giy Arg Phe Leu Gly 1460 1405 1470 GCA GAA ACT TTT ATT ATC CCG GTA ACT AAA AAA GTT CTC GGT ACT GAG 4464 Aia Giu Thr Phe lie lie Pro 7a1 Thr Lys Lys Val Leu Gly Thr Glu 1475 1430 1435 AAC GTG ATT GCG TTA TAT AGC GAA AAT AAC GGT GTT CAA TAT ATG CAA 4 12 Asn Val lie Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Met Gin 1490 1495 1500 ATT GGC GCA TAT CGT ACC CGT TTG AAT ACG TTA TTC GCT CAA CAG TTG 4 60 lie Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 1505 1510 1515 
1520 GTT AGC CGT GCT AAT CGT GGC ATT GAT GCA GTG CTC AGT ATG GAA ACT 4603 Val Ser Arg Ala Asn Arg Giy lie Asp Ala Val Leu Ser Mec Glu Thr 1525 1530 1535 CAG AAT ATT CAG GAA CCG CAA TTA GGA GCG GGC ACA TAT GTG CAG CTT 4655 Gin Asn lie Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550 GTG TTG GAT AAA TAT GAT GAG TCT ATT CAT GGC ACT AAT AAA AGC TTT 4704 Val Leu Asp Lys Tyr Asp Glu Ser lie His Giy Thr Asn Lys Ser Phe 1555 1560 1565 GCT ATT GAA TAT GTT GAT ATA TTT AAA GAG AAC GAT AGT TTT GTG ATT 4752 Ala lie Glu Tyr Val Asp lie Phe Lys Glu Asn Asp Ser Phe Val lie 1570 1575 1530 TAT CAA GGA GAA CTT AGC GAA ACA AGT CAA ACT GTT GTG AAA GTT TTC 4300 Tyr Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1535 1590 1595 1600 TTA TCC TAT TTT ATA GAG GCG ACT GGA AAT AAG AAC CAC TTA TGG GTA 434 Leu Ser Tyr Phe He Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 1605 1610 1615 CGT GCT AAA TAC CAA AAG GAA ACG ACT GAT AAG ATC TTG TTC GAC CGT 4396 Arg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys He Leu Phe Asp Arg 1620 1625 1630 ACT GAT GAG AAA GAT CCG CAC GGT TGG TTT CTC AGC GAC GAT CAC AAG 4944 Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1645 ACC TTT AGT GGT CTC TCT TCC GCA CAG GCA TTA AAG AAC GAC AGT GAA 4992 Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1560 CCG ATG GAT TTC TCT GGC GCC AAT GCT CTC TAT TTC TGG GAA CTG TTC 5040 Pro Mec Asp Phe Ser Gly Ala Asn Ala Lau Tyr Phe Trp Glu Leu Phe 1665 1570 1675 1630 TAT TAC ACG CCG ATG ATG ATG GCT CAT CGT TTG TTG CAG GAA CAG AAT 5033 Tyr Tyr Thr Pro Mec Mec Mec Ala His Arg Leu Leu Gin Glu Gin Asn 1535 1690 1595 TTT GAT GCG GCG AAC CAT TGG TTC CGT TAT GTC TGG AGT CCA TCC GGT 5136 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710 TAT ATC GTT GAT GGT AAA ATT GCT ATC TAC CAC TGG AAC GTG CGA CCG 5134 Tyr He Val Asp Gly Lys He Ala He Ty His Trp Asn Val Arg Pro 1715 1720 1725 - 223 - CTG- GAA AA GAC ACC AGT TGG AAT GCA CAA CAA CTG GAC TCC ACC 
GAT 5232 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gln Gln Leu Asp Ser Thr Asp 1730 1735 1740 CCA GAT GCT GTA GCC CAA GAT GAT CCG ATG CAC TAC AAG CTG GCT ACC 5280 Pro Asp Ala Val Ala Gln Asp Asp Pro Met His Tyr Lys Val Ala Thr 1745 1750 1755 1760 TTT ATG GCG ACG TTG GAT CTG CTA ATG GCC CGT GGT GAT GCT GCT TAC 5328 Phe Met Ala Thr Leu Asp Leu Leu Met Ala Arg Gly Asp Ala Ala Tyr 1765 1770 1775 CGC CAG TTA GAG CGT GAT ACG TTG GCT GAA GCT AAA ATG TGG TAT ACA 5376 Arg Gln Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Met Trp Tyr Thr 1780 1785 1790 CAG GCG CTT AAT CTG TTG GGT GAT GAG CCA CAA GTG ATG CTG AGT ACG 5424 Gln Ala Leu Asn Leu Leu Gly Asp Glu Pro Gln Val Met Leu Ser Thr 1795 1800 1805 ACT TGG GCT AAT CCA ACA TTG GGT AAT GCT GCT TCA AAA ACC ACA CAG 5472 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gln 1810 1815 1820 C G GTT CGT CAG CAA GTG CTT ACC CAG TTG CGT CTC AAT AGC AGG GTA 5520 Gln Val Arg Gln Gln Val Leu Thr Gln Leu Arg Leu Asn Ser Arg Val 1825 1830 1835 1840 AAA ACC CCG TTG Lys Thr Pro Leu 1844 (2) INFORMATION FOR SEQ ID NO: 53: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 1844 amino acids (B) TYPE: amino acids (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:
53 (TcbAii): Features From To Description Peptide 1 1844 TcbAii peptide Fragment 1 11 (SEQ ID NO:1) Fragment 978 990 (SEQ ID NO:23) Fragment 1387 1401 (SEQ ID NO:22) Fragment 1484 1505 (SEQ ID NO:24) Fragment 1527 1552 (SEQ ID NO:21) Phe Ile Gln Gly Tyr Ser Asp Leu Phe Gly Asn Arg Ala Asp Asn Tyr 1 5 10 15 Ala Ala Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala Tyr Leu 20 25 30 Thr Glu Leu Tyr Arg Glu Ala Lys Asn Leu His Asp Ser Ser Ser Ile 35 40 45 Tyr Tyr Leu Asp Lys Arg Arg Pro Asp Leu Ala Ser Leu Met Leu Ser 50 55 60 Gln Lys Asn Met Asp Glu Glu Ile Ser Thr Leu Ala Leu Ser Asn Glu 65 70 75 80 Leu Cys Leu Ala Gly Ile Glu Thr Lys Thr Gly Lys Ser Gln Asp Glu 85 90 95 Val Met Asp Met Leu Ser Thr Tyr Arg Leu Ser Gly Glu Thr Pro Tyr 100 105 110 His His Ala Tyr Glu Thr Val Arg Glu Ile Val His Glu Arg Asp Pro 115 120 125 Gly Phe Arg His Leu Ser Gln Ala Pro Ile Val Ala Ala Lys Leu Asp 130 135 140 Pro Val Thr Leu Leu Gly Ile Ser Ser His Ile Ser Pro Glu Leu Tyr 145 150 155 160 Asn Leu Leu Ile Glu Glu Ile Pro Glu Lys Asp Glu Ala Ala Leu Asp 165 170 175 Thr Leu Tyr Lys Thr Asn Phe Gly Asp Ile Thr Thr Ala Gln Leu Met 180 185 190 Ser Pro Ser Tyr Leu Ala Arg Tyr Tyr Gly Val Ser Pro Glu Asp Ile 195 200 205 Ala Tyr Val Thr Thr Ser Leu Ser His Val Gly Tyr Ser Ser Asp Ile 210 215 220 Leu Val Ile Pro Leu Val Asp Gly Val Gly Lys Met Glu Val Val Arg 225 230 235 240 Val Thr Arg Thr Pro Ser Asp Asn Tyr Thr Ser Gln Thr Asn Tyr Ile 245 250 255 Glu Leu Tyr Pro Gln Gly Gly Asp Asn Tyr Leu Ile Lys Tyr Asn Leu 260 265 270 Ser Asn Ser Phe Gly Leu Asp Asp Phe Tyr Leu Gln Tyr Lys Asp Gly 275 280 285 Ser Ala Asp Trp Thr Glu Ile Ala His Asn Pro Tyr Pro Asp Met Val 290 295 300 Ile Asn Gln Lys Tyr Glu Ser Gln Ala Thr Ile Lys Arg Ser Asp Ser 305 310 315 320 Asp Asn Ile Leu Ser Ile Gly Leu Gln Arg Trp His Ser Gly Ser Tyr 325 330 335 Asn Phe Ala Ala Ala Asn Phe Lys Ile Asp Gln Tyr Ser Pro Lys Ala 340 345 350 Phe Leu Leu Lys Met Asn Lys Ala Ile Arg Leu Leu Lys Ala Thr Gly 355 360 365 Leu Ser Phe
Ala Thr Leu Glu Arg He Val Asp Ser Val Asn Ser Thr 370 375 a 0 Lys Ser He Thr Val Glu Val Leu Asn Lys Val Tyr Arg Val Lys Phe 3 35 3 90 3 95 400 Tyr He Asp Arg Tyr Gly He Ser Glu Glu Thr Ala Ala He Leu Ala 405 410 4 15 Asn He Asn He Ser Gin Gin Ala Val Gly Asn Gin Leu Ser Gin Phe 420 425 430 Glu Gin Leu Phe Asn His Pro Pro Leu Asn Gly He Arg Tyr Glu He 43 5 440 445 Ser Glu Asp Asn Ser Lys His Leu Pro Asn Pro Asp Leu Asn Leu Lys 450 455 450 - 2 2 5 - Pr Asp Ser Thr Z y Asp Asp Gin Arg Lys l Val Leu Lys Arg Ala 465 . 4 ~ 0 4.5 430 Phe Gin Val Asn Ala Ser Glu Leu T/r Gin Mec Leu Leu lie Thr Aso 485 490 ^95 Arg Lys Glu Asp Gly Val lie Lys Asn Asn Leu Glu Asn Leu Ser Asp 500 505 510 Leu T r Leu Val Ser Leu Leu Ala Gin He His Asn Leu Thr lie Ala 515 520 525 Glu Leu Asn lie Leu Leu Val lie Cys Gly T/r Gly Asp Thr Asn lie 530 535 540 T/r Gin lie Thr Asp Asp Asn Leu Ala Lys lie Val Glu Thr Leu Leu 545 S50 555 500 Trp lie Thr Gin Trp Leu Lys Thr Gin Lys Trp Thr Val Thr Asp Leu 505 570 575 Phe Leu Met Thr Thr Ala Thr T/r Ser Thr Thr Leu Thr Pro Glu He 530 585 590 Ser Asn Leu Thr Ala Thr Leu Ser Ser Thr Leu His Gly Lys Glu Ser 595 500 505 Leu lie Gly Glu Asp Leu Lys Arg Ala Mec Ala Pro Cys Phe Thr Ser 610 615 620 Ala Leu His Leu Thr Ser Gin Glu Val Ala Tyr Asp Leu Leu Leu Trp 625 630 635 640 lie Asp Gin He Gin Pro Ala Gin He Thr Val Asp Gly Phe Trp Glu 645 650 655 Glu Val Gin Thr Thr Pro Thr Ser Leu Lys Val He Thr Phe Ala Gin 660 665 670 Val Leu Ala Gin Leu Ser Leu He Tyr Arg Arg He Gly Leu Ser Glu 675 680 685 Thr Glu Leu Ser Leu He Val Thr Gin Ser Ser Leu Leu Val Ala Gly 690 695 700 Lys Ser He Leu Asp His Gly Leu Leu Thr Leu Mec Ala Leu Glu Gly 705 710 715 720 Phe His Thr Trp Val Asn Gly Leu Gly Gin His Ala Ser Leu He Leu 725 730 735 Ala Ala Leu Lys Asp Gly Ala Leu Thr Val Thr Asp Val Ala Gin Ala 740 745 750 Mec Asn Lys Glu Glu Ser Leu Leu Gin Mec Ala Ala Asn Gin Val Glu 755 750 765 Lys Asp Leu Thr Lys Leu Thr Ser Trp Thr Gin He Asp Ala He Leu 770 775 780 Gin Trp Leu Gin Mec 
Ser Ser Ala Leu Ala Val Ser Pro Leu Asp Leu 735 790 795 300 Ala Gly Mec Mec Ala Leu Lys T/r Gly lie Asp His Asn T/r Ala Ala 305 310 315 Trp Gin Aia Ala Ala Ala Ala Leu Mec Ala Asp His Ala Asn Gin Ala 320 325 330 Gin Lys Lys Leu Asp Glu Thr Phe Ser Lys Ala Leu Cys Asn T/r T/r 335 840 845 - 22 6 - SUBSTmiTE SHEET(RULE 26} lie Asn Aia Val Val Asp Ser Ala Ala Gly Val Arg Asp Arg Asn Gly 350 355 360 Leu Tyr Thr T/r Leu Leu He Asp Asn Gin Val Ser Ala Asp Val lie 305 370 375 330 Thr Ser Arg Ha Ala Glu Ala lie Ala Gly He Gin Leu Tyr Val Asn 335 390 395 Arg Ala Leu Asn Arg Asp Glu Gly Gin Leu Ala Ser Asp Val Ser Thr 900 905 910 Arg Gin Phe Phe Thr Asp Trp Glu Arg Tyr Asn Lys Arg Tyr Ser Thr 915 920 925 Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr Val Asp 930 935 940 Pro Thr Gin Arg He Gly Gin Thr Lys Mec Mec Asp Ala Leu Leu Gin 9 5 950 955 960 Ser I la Asn Gin Ser Gin Leu Asn Ala Asp Thr Val Glu Asp Ala Phe 965 970 975 Lys Thr Tyr Leu Thr Ser Phe Glu Gin Val Ala Asn Leu Lys Val He 930 985 990 Ser Ala Tyr His Asp Asn Val Asn Val Asp Gin Gly Leu Thr Tyr Phe 995 1000 1005 He Gly He Asp Gin Ala Ala Pro Gly Thr Tyr Tyr Trp Arg Ser Val 1010 1015 1020 Asp His Ser Lys Cys Glu Asn Gly Lys Phe Ala Ala Asn Ala Trp Gly 1025 1030 1035 1040 Glu Trp Asn Lys He Thr Cys Ala Val Asn Pro Trp Lys Asn He He 1045 1050 1055 Arg Pro Val Val Tyr Mec Ser Arg Lau Tyr Leu Leu Trp Leu Glu Gin 1060 1065 1070 Gin Ser Lys Lys Ser Asp Asp Gly Lys Thr Thr Ha Tyr Gin Tyr Asn 1075 1080 1085 Leu Lys Lau Ala His lie Arg Tyr Asp Gly Ser Trp Asn Thr Pro Phe 1090 1095 1100 Thr Phe Asp Val Thr Glu Lys Val Lys Asn Tyr Thr Ser Ser Thr Asp 1105 1110 1115 1120 Ala Ala Glu Sar Lau Gly Leu Tyr Cys Thr Gly Tyr Gin Gly Glu Asp 1125 1130 1135 Thr Leu Lau Val Mec Phe Tyr Ser Mec Gin Ser Ser Tyr Ser Ser Tyr 1140 1145 1150 Thr Asp Asn Asn Ala Pro Val Thr Gly Leu Tyr I la Pha Ala Asp Mec 1155 1160 1165 Ser Ser Asp Asn Mec Thr Asn Ala Gin Ala Thr Asn T/r Trp Asn Asn 1170 1175 1130 Ser T/r Pro Gin Phe Asp Thr Val Mec 
Ala Asp Pro Asp Ser Asp Asn 1135 1190 1195 1200 Lys Lys Val He Thr Arg Arg Val Asn Asn Arg T/r Ala Glu Asp T/r 1205 1210 1215 Glu He Pro Ser Ser Val-Thr Ser Asn Ser Asn T/r Ser Trp Gly Asp His Ser Leu Thr Met Leu T r Gly Gly Ser Val Pro Asn lie Thr Ph<= 1235 1240 1245 Glu Ser Ala Ala Glu Asp Leu Arg Leu Ser Thr Asn Mec Ala Leu Ser 1250 1255 1260 lis He His Asn Gly T/r Ala Gly Thr Arg Arg He Gin Cys Asn Leu 1255 1270 1275 1230 Mec Lys Gin T/r Ala Ser Leu Gly Asp Lys Phe He He T/r ASD Ser 1235 1290 1295 Ser Phe Asp Asp Ala Asn Arg Phe Asn Leu Val Pro Leu Phe Lys Phe 1300 1305 1310 Gly Lys Asp Glu Asn Ser Asp Asp Ser He Cys He Tyr Asn Glu Asn 1315 1320 1325 Pro Ser Ser Glu Asp Lys Lys Trp T/r Phe Ser Ser Lys Asp ASD Asn 1330 1335 1340 Lys Thr Ala Asp Tyr Asn Gly Gly Thr Gin Cvs He Asp Ala Gly Thr 1345 1350 1355 1350 Ser Asn Lys Asp Phe T/r Tyr Asn Leu Gin Glu He Glu Val He Ser 1365 1370 1375 Val Thr Gly Gly T/r Trp Ser Ser Tyr Lys He -Ser Asn Pro He Asn 1380 1335 1390 He Asn Thr Gly He Asp Ser Ala Lys Val Lys Val Thr Val Lys Ala 1395 1400 1405 Gly Gly Asp Asp Gin He Phe Thr Ala Asp Asn Ser Thr Tyr Val Pro 1410 1415 1420 Gin Gin Pro Ala Pro Ser Phe Glu Glu Mec He Tyr Gin Phe Asn Asn 1425 1430 1435 1440 Leu Thr He Asp Cys Lys Asn Leu Asn Phe He Asp Asn Gin Ala His 1445 1450 1455 He Glu He Asp Phe Thr Ala Thr Ala Gin Asp Gly Arg Phe Leu Gly 1460 1465 1470 Ala Glu Thr Phe He He Pro Val Thr Lys Lys Val Leu Gly Thr Glu 1475 1480 1485 Asn Val He Ala Leu Tyr Ser Glu Asn Asn Gly Val Gin Tyr Mec Gin 1490 1495 1500 He Gly Ala Tyr Arg Thr Arg Leu Asn Thr Leu Phe Ala Gin Gin Leu 1505 1510 1515 1520 Val Ser Arg Ala Asn Arg Gly He Asp Ala Val Leu Ser Mec Glu Thr 1525 1530 1535 Gin Asn He Gin Glu Pro Gin Leu Gly Ala Gly Thr Tyr Val Gin Leu 1540 1545 1550 Val Leu Asp Lys Tyr Asp Glu Ser He His Gly Thr Asn Lys Ser Phe 1555 1560 1565 Ala He Glu T/r Val Asp He Phe Lys Glu Asn Asp Ser Phe Val He 1570 1575 1530 T/r Gin Gly Glu Leu Ser Glu Thr Ser Gin Thr Val Val Lys Val Phe 1535 1590 1595 
1600 - 228 - SUBSTTTUTE SHEET (RULE 26) Lau 3er Tyr Phe I la Glu Ala Thr Gly Asn Lys Asn His Leu Trp Val 1605 1510 1615 Arg Ala Lys Tyr Gin Lys Glu Thr Thr Asp Lys He Leu Phe Asp Arg 1620 1525 1530 Thr Asp Glu Lys Asp Pro His Gly Trp Phe Leu Ser Asp Asp His Lys 1635 1640 1545 Thr Phe Ser Gly Leu Ser Ser Ala Gin Ala Leu Lys Asn Asp Ser Glu 1650 1655 1660 Pro Mec Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe 1665 1570 1575 1530 Tyr Tyr Thr Pro Mec Mec Mec Ala His Arg Leu Leu Gin Glu Gin Asn 1685 1690 1695 Phe Asp Ala Ala Asn His Trp Phe Arg Tyr Val Trp Ser Pro Ser Gly 1700 1705 1710 Tyr lie Val Asp Gly Lys He Ala H e Tyr His Trp Asn Val Arg Pro 1715 1720 1725 Leu Glu Glu Asp Thr Ser Trp Asn Ala Gin Gin Leu Asp Ser Thr Asp 1730 1735 1740 Pro Asp Ala Val Ala Gin Asp Asp Pro Mec His Tyr Lys Val Ala Thr 1745 1750 1755 1750 Phe Mec Ala Thr Leu Asp Leu Leu Mec Ala Arg Gly Asp Ala Ala Tyr 1765 1770 1775 Arg Gin Leu Glu Arg Asp Thr Leu Ala Glu Ala Lys Mec Trp Tyr Thr 1780 1785 1790 Gin Ala Leu Asn Leu Leu Gly Asp Glu Pro Gin Val Mec Leu Ser Thr 1795 1800 1805 Thr Trp Ala Asn Pro Thr Leu Gly Asn Ala Ala Ser Lys Thr Thr Gin 1310 1815 1820 Gin Val Arg Gin Gin Val Leu Thr Gin Lau Arg Leu Asn Ser Arg Val 1825 1830 1835 1340 Lvs Thr Pro Leu 1844 (2) INFORMATION FOR SEQ ID NO: 5 4 : (i) SEQUENCE CHARACTERISTICS : (A) LENGTH: 1722 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS : double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5 4 (TcbAj.j.i coding region CTA GGA ACA GCC AAT TCC CTG ACC GCT TTA TTC CTG CCG CAG G A AAT 43 Leu Gly Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gin Glu Asn 1 5 10 15 AGC AAG CTC AAA GGC TAC TGG CGG ACA CTG GCG CAG CGT ATG TTT AAT 96 Ser Lys Leu Lys Gly Tyr Trp Arg Thr Leu Ala Gin Arg Mec Phe Asn 20 25 30 - 2 2 9 - Leu Arg His Asn Leu Ser l e Asp Giy Gin Pro Leu Ser Leu Pro Leu 35 40 45 TAT GCT AAA CCG GCT GAT CCA AAA GCT TTA CTG ACT GCG GCG GTT TC 192 Tyr Aia Lys Pro Ala Asp Pro Lys Aia Leu 
Leu Ser Ala Ala Val Ser 50 55 50 GCT TCT CAA GGG GGA GCC GAC TTG CCG AAG GCG CCG CTG ACT ATT CAC 240 Ala Ser Gin Gly Gly Ala Asp Leu Pro Lys Ala Pro Leu Thr lie His 65 70 75 30 CGC TTC CCT CAA ATG CTA GAA GGG GC.A CGG GGC TTG GTT AAC CAG CTT 233 Arg Phe Pro Gin Met Leu Glu Gly Ala Arg Gly Leu Val Asn Gin Leu 35 90 95 ATA CAG TTC GGT ACT TCA CTA TTG GGG TAC AGT GAG CGT CAG GAT GCG 33 S lie Gin Phe Gly Ser Ser Leu Leu Gly Tyr Ser Glu Arg Gin Asp Ala 100 105 no GAA GCT ATG AGT CAA CTA CTG CAA ACC CAA GCC AGC GAG TTA ATA CTG 334 Glu Ala Mec Ser Gin Leu Leu Gin Thr Gin Ala Ser Glu Leu He Leu 115 120 125 ACC AGT ATT CGT ATG CAG GAT AAC CAA TTG GCA GAG CTG GAT TCG GAA 432 Thr Ser lie Arg Mec Gin Asp Asn Gin Leu Ala Glu Leu Asp Ser Glu 130 135 140 .AAA ACC GCC TTG CAA GTC TCT TTA GCT GGA GTG CAA CAA CGG TTT GAC 430 Lys Thr Ala Leu Gin Val Ser Leu Ala Gly Val Gin Gin Arg Phe Asp 145 150 155 150 AGC TAT AGC CAA CTG TAT GAG GAG AAC ATC AAC GCA GGT GAG CAG CGA 523 Ser Tyr Ser Gin Leu Tyr Glu Glu Asn lie Asn Ala Gly Glu Gin Arg 165 170 175 GCG CTG GCG TTA CGC TCA GAA TCT GCT ATT GAG TCT CAG GGA GCG CAG 575 Ala Leu Ala Leu Arg Ser Glu Ser Ala lie Glu Ser Gin Gly Ala Gin 180 135 190 ATT TCC CGT ATG GCA GGC GCG GGT GTT GAT ATG GCA CCA AAT ATC TTC 52 He Ser Arg Mec Ala Gly Ala Gly Val Asp Met Ala Pro Asn He Phe 195 200 205 GGC CTG GCT GAT GGC GGC ATG CAT TAT GGT GCT ATT GCC TAT GCC ATC 572 Gly Leu Ala Asp Gly Gly Mec His Tyr Gly Ala He Ala Tyr Ala He 210 215 220 GCT GAC GGT ATT GAG TTG AGT GCT TCT GCC AAG ATG GTT GAT GCG GAG 720 Ala Asp Gly He Glu Leu Ser Ala Ser Ala Lys Mec Val Asp Ala Glu 225 230 235 240 AAA GTT GCT CAG TCG GAA ATA TAT CGC CGT CGC CGT CAA GAA TGG AAA 758 Lys Val Ala Gin Ser Glu He Tyr Arg Arg Arg Arg Gin Glu Trp Lys 245 250 255 ATT CAG CGT GAC AAC GCA CAA GCG GAG ATT AAC CAG TTA AAC GCG CAA 315 He Gin Arg Asp Asn Ala Gin Ala Glu He Asn Gin Leu Asn Ala Gin 260 265 270 CTG GAA TCA CTG TCT ATT CGC CGT GAA GCC GCT GAA ATG CAA AAA GAG 354 Leu Glu Ser Leu Ser He Arg 
Arg Glu Ala Ala Glu Mec Gin Lys Glu 275 280 285 TAC CTG AAA ACC CAG CAA GCT CAG GCG CAG GCA CAA CTT ACT TTC TTA 912 Tyr Leu Lys Thr Gin Gin Ala Gin Ala Gin Ala Gin Leu Thr Phe Leu 290 295 300 AGA AGC AAA TTC AGT AAT CAA GCG TTA TAT AGT TGG TTA CGA GGG CGT 950 Arg Ser Lys Phe Ser Asn Gin Ala Leu Tyr Ser Trp Leu Arg Gly Arg 305 310 315 320 -230- TTG TTA OCT ATT TAT TTC CAG T TAT GAC TTG GCC GTA TCA CGT TCC IC 3 Leu Ser Gly lie Tyr Phe Gin Fhe Tyr Asp Leu Ala 7 1 Ser Arg vs 325 330 335 CTG ATG GCA GAG CAA TCC TAT CAA TGG GAA GCT AAT GAT AAT TCC ATT 10 6 Leu Mec Ala Glu Gin Ser Tyr Gin Trp Giu Ala Asn Asp Asn Ser lie 340 345 350 AGC TTT GTC AAA CCG GGT GCA TGG CAA GGA ACT TAC GCC GGC TTA TTG 110-4 •sr Phe 7 1 Lys Pro Gly Ala Trp Gin Gly Thr Tyr Ala Gly Leu Leu 355 360 365 TGT GGA GAA GCT TTG ATA CAA AAT CTG GCA CAA ATG GAA GAG GCA TAT 1152 cvs Giy Glu Ala Leu He Gin Asn Leu Ala Gin Mec Glu Glu Ala Tyr 370 375 380 G AAA TGG GAA TCT CGC GCT TTG GAA GTA GAA CGC ACC GTT TCA TTG 1200 Leu Lys Trp Glu Ser Arg Ala Leu Glu Val Glu Arg Thr 7al Ser L=»u 335 390 395 400 GCA GTG GTT TAT GAT TCA CTG GAA GGT AAT GAT CGT TTT AAT TTA GCG 12 3 Ala Val Val Tyr Asp Ser Leu Glu Gly Asn Asp Arg Phe Asn Leu Ala 405 410 415 GAA CAA ATA CCT GCA TTA TTG GAT AAG GGG GAG GGA ACA GCA GGA ACT 1296 Glu Gin lie Pro Ala Leu Leu Asp Lys Gly Glu Gly Thr Ala Gly Thr 420 425 430 AAA GAA AAT GGG TTA TCA TTG GCT AAT GCT ATC CTG TCA GCT TCG GTC 1344 Lys Glu Asn Gly Leu Ser Leu Ala Asn Ala. 
He Leu Ser Ala Ser Val 435 440 445 AAA TTG TCC GAC TTG AAA CTG GGA ACG GAT TAT CCA GAC AGT ATC GTT 1392 Lys Leu Ser Asp Leu Lys Leu Gly Thr Asp Tyr Pro Asp Ser He Val 450 455 460 GGT AGC AAC AAG GTT CGT CGT ATT AAG CAA ATC AGT GTT TCG CTA CCT 1440 Gly Ser Asn Lys Val Arg Arg He Lys Gin He Ser Val Ser Leu Pro 465 470 475 430 GCA TTG GTT GGG CCT TAT CAG GAT GTT CAG GCT ATG CTC AGC TAT GGT 1438 Ala Leu Val Gly Pro Tyr Gin Asp Val Gin Ala Mec Leu Ser Tyr Gly 485 490 495 GGC AGT ACT CAA TTG CCG AAA GGT TGT TCA GCG TTG GCT GTG TCT CAT 1536 Gly Ser Thr Gin Leu Pro Lys Gly cys Ser Ala Leu Ala Val ser His 500 505 510 GGT ACC AAT GAT AGT GGT CAG TTC CAG TTG GAT TTC AAT GAC GGC AAA 1534 Gly Thr Asn Asp Ser Gly Gin Phe Gin Leu Asp Phe Asn Asp Gly Lys 515 520 525 GAT GAT CAG GGT Asp Asp Gin Gly 530 535 540 CAG AAA GCA ATA Gin Lys Ala He 545 550 555555 560 TAT ACC ATC CGT 1722 Leu His He Arg TTyyrr TThhrr HHee AArrgg ··· 565 570 573 (2) INFORMATION FOR SEQ ID NO: 55: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 573 amino acids (B) TYPE: amino acids -231- SUBSTnUTESHEET (RULE 26) i d STPANDEDNESS : single L729 CAA TTG GCA GCG AAA CCC ACA ATC ACG GTA CCA C A AAA GAT TCC CCG I " i 577 Gin Leu Aia Gly Lys Pro Thr lie Thr Vai Pro Gin Lys Asp Ser Pro 177" CTG GCG GCG GAT ATT CTG AGT TTG CTG CAA GCG CTA AGT GCG ATT GCT 1324 5 9 3 Leu Ala Aia Asp lie Leu Ser Leu Leu Gin Ala Leu Ser Ala lie Aia 508 1325 CAA TGG CAA CAA CAG CAC GAT TTA GAA TTT TCA GCA CTG CTT TTG CTG 1372 609 Gin Trp Gin Gin Gin His Asp Leu Glu Phe Ser Ala Leu Leu Leu Leu 524 1373 TTG AGT GAC AAC CCT ATT TCT ACC TCG CAG GGC ACT GAC GAT CAA TTG 1920 625 Leu Ser Asp Asn Pro lie Ser Thr Ser Gin Gly Thr Asp Asp Gin Leu 640 1921 AAC TTT ATC CGT CAA GTG TGG CAG AAC CTA GGC AGT ACG TTT GTG GGT 1963 641 Asn Phe lie Arg Gin Vai Trp Gin Asn Leu Gly Ser Thr Phe Vai Gly 656 1569 GCA ACA TTG TTG TCC CGC AGT GGG GCA CCA TTA GTC GAT ACC AAC GGC 2016 657 Ala Thr Leu Leu Ser Arg Ser Gly Ala Pro Leu Vai Asp Thr Asn Gly 672 2017 CAC GCT ATT 
GAC TGG TTT GCT CTG CTC TCA GCA GGT AAT AGT CCG CTT 2064 673 His Ala lie Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 633 2065 ATC GAT AAG GTT GGT CTG GTG ACT GAT GCT GGC ATA CAA AGT CTT ATA 2112 639 lie Asp Lys Vai Gly Leu Vai Thr Asp Ala Gly He Gin Ser Vai He 704 2113 GCA ACG GTG GTC AAT ACA CAA AGC TTA TCT GAT GAA GAT AAG AAG CTG 2160 705 Ala Thr Vai Vai Asn Thr Gin Ser Leu Ser Asp Glu Asp Lys Lys Leu 720 2161 GCA ATC ACT ACT CTG ACT AAT ACG TTG AAT CAG GTA CAG AAA ACT CAA 2203 721 Ala He Thr Thr Leu Thr Asn Thr Leu Asn Gin Vai Gin Lys Thr Gin 30 2209 CAG GGC GTG GCC GTC AGT CTG TTG GCG CAG ACT CTG AAC GTG AGT CAG 2256 737 Gin Gly Vai Ala Vai Ser Leu Leu Ala Gin Thr Leu Asn Vai Ser Gin 752 2257 TCA CTG CCT GCG TTA TTG TTG CGC TGG AGT GGA CAA ACA ACC TAC CAG 2304 753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gin Thr Thr Tyr Gin 763 2305 TGG TTG AGT GCG ACT TGG GCA TTG AAG GAT GCC GTT AAG ACT GCC GCC 2352 759 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Vai Lys Thr Ala Ala 734 2353 GAT ATT CCC GCT GAC TAT CTG CGT CAA TTA CGT GAA GTG GTA CGC CGC 2400 735 Asp He Pro Ala Asp Tyr Leu Arg Gin Leu Arg Glu Vai Vai Arg Arg 300 2401 TCC TTG TTG ACC CAA CAA TTC ACG CTG AGT CCT GCA ATC GTG CAA ACC 2443 301 Ser Leu Leu Thr Gin Gin Phe Thr Leu Ser Pro Ala Mec Vai Gin Thr 3.6 -236- SUBSTmjTE SHEET (RULE 26) 2449 TTG TG GAG TAT CCA GCC TAT TTT GGC GCT TCC CCA G A ACA CTC A 24;·: 317 Liu Leu Asp Tyr Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 53 j 249" GAT ATC ACT TTG TGG ATG CTT TAT ACC CTG AGC TGT TAT AGC GAT TTA 25 ·: 833 Asp He Ser Leu Trp Mec Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 343 545 TTG CTC CAA ATG GGT GAA GCT GGT GGT ACC GAA GAT GAT GTA CTG GCC 343 Leu Leu Gin Mec Gly Glu Ala Gly Gly Thr Glu Asp Asp Val Leu Ala 2593 TAC TTA CGC ACA GCT AAT GCT ACC ACA CCG TTG AGC CAA TCT GAT GCT 254 0 a<55 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gin Ser Asp Ala 330 26 1 GCA CAG ACG TTG GCA ACG CTA TTG GGT TGG GAG GTT AAC GAG TTG CAA 2533 33 1 Ala Gin Thr Leu Ala 
Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gin 396 2639 GCC GCT TGG TCG GTA TTG GGC GGG ATT GCC AAA ACC ACA CCG CAA CTG 2736 397 Ala Ala Trp Ser Val Leu Gly Gly lie Ala Lys. Thr Thr Pro Gin Leu 512 2737 GAT GCG CTT CTG CGT TTG CAA CAG GCA CAG AAC CAA ACT GGT CTT GGC 2734 913 Asp Ala Leu Leu Arg Leu Gin Gin Ala Gin Asn Gin Thr Gly Leu Gly 923 2735 GTT ACA CAG CAA CAG CAA GGC TAT CTC CTG AGT CGT GAC AGT GAT TAT 2332 929 Val Thr Gin Gin Gin Gin Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944 2333 ACC CTT TGG CAA AGC ACC GGT CAG GCG CTG GTG GCT GGC GTA TCC CAT 2330 945 Thr Leu Trp Gin Ser Thr Gly Gin Ala Leu Val Ala Gly Val Ser His 960 2881 GTC AAG GGC AGT AAC TGA 2898 961 Val Lys Gly Ser Asn End 966 INFORMATION FOR SEQ ID NO: 57 (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 965 amino (B) TYPE: amino acid tC) TOPOLOGY: linea (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 (TccA pepcide) Features From To Description 1 10 SEQ ID NO: 8 1 1 M Meecc A Assnn G Giinn L Leeuu Ala Ser Pro Leu He Ser Arg Thr Glu Glu He His 16 17 Asn Leu Pro Gly Lys Leu Thr Asp Leu Gly Tyr Thr Ser Val Phe Asp 32 33 Val Val Arg Mec Pro Arg Glu Arg Phe He Arg Glu His Arg Ala Asp 43 49 Leu Gly Arg Ser Ala Glu Lys Mec T r Asp Leu Ala Val Gly T r Ala 64 65 His Gin Val Leu His His Phe Arg Arg Asn Ser Leu Ser Glu Ala V l 30 31 Gin Phe Gly Leu Arg Ser Pro Phe Ser Val Ser Gly Pro Asp Tyr Ala 56 •237 ■ 5 ~ Asn Gin. Phe Leu As Ala Asn Thr Trp Lys Asp Lys Ala ?r3 5ar 111 113 Gly Ser Pro Glu Ala Asn Asp Ala Pro al Ala Tyr Lau Thr His lie 123 129. 
Tyr Gin Leu Ala Leu Glu Gin Glu Lys Asn Gly Ala Thr Thr lie Mec 4 145 Asn Thr Leu Ala Glu Arg Arg Pro Asp Lau Gly Ala Lau Lau Ha Asn 160 151 Asp Lys Ala Ila Asn Glu 7a 1 He Pro in Lau Gin Lau V l Asn Glu 175 177 lie Leu Ser Lys Ala lie Gin Lys Lys Lau Ser Leu Thr Asp Leu G lu 192 193 Ala V l Asn Ala Arg Leu Ser Thr Thr Arg T r Pro Asn Asn Lau Pro 203 209 Ty His Tyr Gly His Gin Gin He Gin Thr Ala Gin Ser Val Lau Gly 224 225 Thr Thr Leu Gin Asp He Thr Lau Pro Gin Thr Lau Asp Lau Pro Gin 240 241 Asn Phe Trp Ala Thr Ala Lys Gly Lys Leu Ser Asp Thr Thr Ala Ser 256 257 Ala Leu Thr Arg Leu Gin He Mec Ala Ser Gin Phe Ser Pro Glu Gin 2-?2 273 Gin Lys He He Thr Glu Thr V l Gly Gin Asp Phe Tyr Gin Leu Asn 233 239 T r Gly Asp Ser Ser Leu Thr Val Asn Ser Phe Ser Asp Mec Thr He 304 305 Msc Thr Asp Arg Thr Ser Leu Thr Val Pro Gin Val Glu Leu Mec Leu 320 321 Cys Ser Thr Val Gly Gly Ser Thr Val V l Lys Ser Asp Asn V l Ser 335 337 Ser Gly Asp Thr Thr Ala Thr Pro Phe Ala Tyr Gly Ala Arg Phe He 352 353 His Ala Gly Lys Pro Glu Ala Ila Thr Lau Ser Arg Ser Gly Ala Glu 363 359 Ala His Phe Ala Leu Thr Val Asn Asn Leu Thr Asp Asp Lys Lau As 334 335 Arg lie Asn Arg Thr Val Arg Lau Gin Lys Trp Leu Asn Leu Pro T r 400 401 Glu As He Asp Leu Leu Val Thr Ser Ala Met Asp Ala Glu Thr Gly 416 417 Asn Thr Ala Lau Ser Me Asn Asp Asn Thr Leu Arg Mec Leu Gly Val 432 433 Phe Lys His Tyr Gin Ala Lys. 
Tyr Gly Val Ser Ala Lys Gin Pha Ala 443 449 Gly Trp Leu Arg Val Val Ala Pro Phe Ala He Thr Pro Ala Thr Pro 454 465 Phe Leu Asp Gin Val Phe Asn Ser Val Gly Thr Phe Asp Thr Pro Phe 430 431 Val lie Asp Asn Gin Asp Phe Val T r Thr Lau Thr Thr Gly Gly Asp 496 497 Gly Ala Arg Val Lys His He Ser Thr Ala Leu Gly Lau Asn His Arg 512 513 Gin Phe Leu Leu Leu Ala Asp Asn He Ala Arg Gin Gin Gly Asn Val 523 529 Thr Gin Ser Thr Leu Asn Cys Asn Leu Phe Val Val Ser Ala Phe Tyr 544 545 Arg Leu Ala Asn Leu Ala Arg Thr Leu Gly He Asn Pro Glu Sar Phe 560 561 Cys Ala Leu Val Asp Arg Lau Asp Ala Gly Thr Gly Ha V l Trp Gin 576 577 Gin Leu Ala Gly Lys Pro Thr He Thr V l Pro Gin Lys Asp Ser Pro 592 593 Leu Ala Ala Asp He Lau Ser Leu Leu Gin Ala Leu Ser Ala Ila Ala 503 605 Gin Trp Gin Gin Glh His Asp Leu Glu Phe Ser Ala Lau Lau Leu Leu 524 !38- SUBSTTTUTE SHEET (RULE 26 - 5 Leu Ser Asp Asn Pro _ _ e Ser Thr 3er Gin Gly Thr Asp Asp n Leu 641 Asn Phe He Arg Gin Val Trp Gin Asn Leu Gly Ser Thr Phe V l 01- 655 657 Ala Thr Leu Leu Ser Arg Ser ly Ala Pro Leu Val Asp Thr As ly 5~2 £73 His Ala He Asp Trp Phe Ala Leu Leu Ser Ala Gly Asn Ser Pro Leu 633 539 lie As L s Val Gly Leu Val Thr Asp Ala Gly He Gin Ser V l lie 7C4 705 ' Ala Thr V l Val Asn Thr Gin Ser Leu Ser Asp Glu Asp Lys Lys Leu 720 721 Ala lie Thr Thr Leu Thr Asn Thr Leu Asn Gin Val Gin Lys Thr Gin ~ 36 "37 Gin Gly Val Ala Val Ser Leu Leu Ala Gin Thr Leu Asn Val Ser Gin "52 753 Ser Leu Pro Ala Leu Leu Leu Arg Trp Ser Gly Gin Thr Thr Tyr Gin "53 759 Trp Leu Ser Ala Thr Trp Ala Leu Lys Asp Ala Val Lys Thr Ala Ala 734 735 Asp lie Pro Ala Asp Tyr Leu Arg Gin Leu Arg Glu Val Val Arg Arg 300 301 Ser Leu Leu Thr Gin Gin Phe Thr Leu Ser Pro Ala Mec Val Gin Thr 316 317 Leu Leu Asp T r Pro Ala Tyr Phe Gly Ala Ser Ala Glu Thr Val Thr 332 333 Asp lie Ser Leu Trp Mec Leu Tyr Thr Leu Ser Cys Tyr Ser Asp Leu 343 349 Leu Leu Gin Mec Gly Glu Ala Gly Gly Thr Glu Asp As Val Leu Ala 364 365 Tyr Leu Arg Thr Ala Asn Ala Thr Thr Pro Leu Ser Gin Ser Asp Ala 330 
881 Ala Gln Thr Leu Ala Thr Leu Leu Gly Trp Glu Val Asn Glu Leu Gln 896
897 Ala Ala Trp Ser Val Leu Gly Gly Ile Ala Lys Thr Thr Pro Gln Leu 912
913 Asp Ala Leu Leu Arg Leu Gln Gln Ala Gln Asn Gln Thr Gly Leu Gly 928
929 Val Thr Gln Gln Gln Gln Gly Tyr Leu Leu Ser Arg Asp Ser Asp Tyr 944
945 Thr Leu Trp Gln Ser Thr Gly Gln Ala Leu Val Ala Gly Val Ser His 960
961 Val Lys Gly Ser Asn 965
INFORMATION FOR SEQ ID NO: 58
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4698 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58 (tccB)
1 ATG TTA TCG ACA ATG GAA AAA CAA CTG AAT GAA TCC CAG CGT GAT GCG 48
1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16
49 TTG GTG ACT GGC TAT ATG AAT TTT GTG GCG CCG ACG TTG AAA GGC GTC 96
17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32
97 ACT CGT CAG CCG GTG ACG GTG G A GAT TTA TAC GAA TAT TTG CTC ATT 144
33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48
145 GAC CCG GAA GTG GCT GAT GAG GTT GAG ACG ACT CGG GTA GCA CAA GCG 192
49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64
193 ATT GCC AGC ATA CAG CAA TAT ATG ACT CGT CTG GTC AAC GGC TCT GAA 240
65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80
241 CCG GGG CGT CAG GCG ATG GAG CCT TCT ACA GCT AAC GAA TGG CGT GAT 288
81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96
289 AAT GAT AAC CAA TAT GCT ATC TGG GCT GCG GGG GCT GAG GTT CGA AAT 336
97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112
337 TAC GCT GAA AAC TAT ATT TCA CCC ATC ACC CGG CAG GAA AAA AGC CAT 384
113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128
385 TAT TTC TCC GAG CTG GAG ACG ACT TTA AAT CAG AAT CGA CTC GAT CCG 432
129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144
433 GAT CGT GTG CAG GAT GCT GTT TTG GCG TAT CTC AAT GAG TTT GAG GCA 480
145 Asp Arg Val Gln Asp Ala Val Leu Ala Tyr Leu
Asn Glu Phe Glu Ala 160 431 GTG AGT AAT CTA TAT GTG CTC AGT GGT TAT ATT AAT CAG GAT AAA TTT 523 151 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr lie Asn Gin Asp Lys Phe 175 529 GAC CAA GCT ATC TAC TAC TTT ATT GGT CGC ACT ACC ACT AAA CCG TAT 575 177 Asp Gin Ala He Tyr Tyr Phe He Gly Arg Thr Thr Thr Lys Pro Tyr 192 577 CGC TAC TAC TGG CGT CAG ATG GAT TTG AGT AAG AAC CGT CAA GAT CCG 624 193 Arg Tyr Tyr Trp Arg Gin Mec Asp Leu Ser Lys Asn Arg Gin Asp Pro 208 625 GCA GGG AAT CCG GTG ACG CCA AAT TGC TGG AAT GAT TGG CAG GAA ATC 672 209 Ala Gly Asn Pro Vai Thr Pro Asn Cys Trp Asn Asp Trp Gin Glu lie 224 673 ACT TTG CCG CTG TCT GGT GAT ACG GTG CTG GAG CAT ACA GTT CGC CCG 720 225 Thr Lau Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240 721 GTA TTT TAT AAT GAT CGA CTA TAT GTG GCT TGG GTT GAG CGT GAC CCG 3 241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256 769 GCA GTA CAG AAG GAT GCT GAC GGT AAA AAC ATC GGT AAA ACC CAT GCC 316 257 Ala Val Gin Lys Asp Ala Asp Gly Lys Asn He Gly Lys Thr His Ala 2~2 317 TAC AAC ATA AAG TTT GGT TAT AAA CGT TAT GAT GAT ACT TGG ACA GCG 364 273 T r Asn He Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 233 3 5 CCG AAT ACG ACC ACG TTA ATG ACA CAA CAA GCA GGG GAA AGT TCA GAA 912 239 Fro Asn Thr Thr Thr Leu Mec Thr Gin Gin Ala Gly Glu Ser Ser Glu i "4 -240- SUBSTTTUTE SHEET (RULE 26) 913 ACA CAG CGA TCC AGC CTG CTC ATT GAT G A TCT AGC ACC ACA TTG CGC HO 305 Thr Gin Arg Ser Ser Leu Leu lie Asp Glu Ser Ser Thr Thr Leu Arg 220 361 CAA CTT AAT CTG TTG GCT ACC ACC GAT TTT ACT ATC GAT CCG ACG GAG 1003 321 Gin Val Asn Leu Leu Ala Thr Thr Asp Phe Ser lie Asp Pro Thr Glu 336 Ι ϋ 1009 GAA ACG GAC ACT AAC CCG TAT GGC CGC CTA ATG TTG GGG GTG TTT GTC 1055 337 Glu Thr Asp Ser Asn Pro /r Gly Arg Leu Met Leu Gly Val Phe al 352 5 1057 CGT CAA TTT GAA GGT GAT GGG GCC AAT AGA AAA AAT AAA CCC GTT GTT 1104 353 Arg Gin Phe. 
Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Vai 363 1105 TAT GGT TAT CTC TAT TGT GAC TCA GCT TTC AAT CGT CAT GTT CTC AGG 1152 0 369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 334 1153 CCG TTA ACT AAG AAC TTT TTG TTC ACT ACT TAC CGT GAT GAA ACG GAT 1200 385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp -iOO 5 1201 GGT CAA AAC AGC TTG CAA TTT GCG GTA TAC GAT AAA AAG TAT GTA ATT 1243 401 Gly Gin Asn Ser Lau Gin Phe Ala Val Tyr Asp Lys Lys Tyr Val lie 416 0 12 9 ACT AAG GTT GTT ACA GGT GCA ACG GAA GAT CCC GAA AAT ACA GGA TGG 1296 417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432 5 1297 GTA AGT AAA GTT GAT GAC TTG AAA CAA GGC ACT ACT GGG GCC TAT GTG 1344 433 Val Ser Lys Val Asp Asp Leu Lys Gin Gly Thr Thr Gly Ala Tyr Val 448 1345 TAT ATC GAT CAA GAT GGC CTG ACG CTT CAT ATA CAA ACC ACA ACT AAT 1392 0 449 Tyr lie Asp Gin Asp Gly Leu Thr Leu His lie Gin Thr Thr Thr Asn 464 1393 GGG GAT TTT ATT AAC CGT CAT ACG TTT GGA TAT AAC GAT CTT GTA TAT 1440 465 Gly Asp Phe lie Asn Arg His Thr Pha Gly Tyr Asn Asp Leu Val T/r 430 5 14 1 GAT TCT AAG TCT GGT TAT GGT TTC ACG TGG TCA GGA AAT GAA GGT TTT 1483 481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496 0 1489 TAT CTG GAT TAC CAT GAT GGA AAT TAT TAC ACC TTT CAT AAT GCA ATA 1536 497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala He 512 5 1537 ATC AAC TAC TAT CCG TCT GGA TAT GGT GGT GGA TCT GTT CCT AAT GGA L 34 513 He Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 523 1585 ACG TGG GCG TTA GAG CAA AGG ATT AAT GAG GGA' TGG GCT ATT GCT CCC 1632 0 529 Thr Trp Ala Leu Glu Gin Arg He Asn Glu Gly Trp Ala lie Ala Pro 544 1633 CTG CTT GAT ACT CTC CAT ACT GTT ACT GTG AAG GGC AGT TAT ATC GCT i≤3C 545 Leu Leu Asp Thr Leu His Thr Val- Thr Val Lys Gly Ser Tyr He Ala 560 5 -241- 1531 TCG GAA GGG GAA ACA CCT ACC CGT TAT AAT CTG TAT ATT CCA GAT CGT 1~23 5dl Trp Glu Gly Glu Thr Pro Thr Gly T/r Asn Leu Tyr lie Pro Asp Gly 575 17:9 ACC GTG TTG CTA 
GAT TCG TTT GAT .AAA ATA AAT TTT GCT ATT GGT CTT 17" 6 577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ila Asn Phe Ala lie Gly Leu 592 1~77 AAT AAG CTT GAG TCT GTA TTT ACC TCG CCA GAT TGG CCA AC CTA ACC 1324 553 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro Asp Trp Pro Thr Leu Thr 003 1325 ACT ATC AAA AAT TTC ACT AAA ATC GCC GAT AAC CGC AAA TTC TAT CAG 1372 509 Thr lie Lys Asn Phe Ser Lys lie Ala Asp Asn Arg Lys Phe Tyr Glr. 624 1373 GAA ATC AAT GCT GAG ACG GCG GAT GGA CGC AAC CTG TTT AAA CGT TAC 1920 525 Glu lie Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Ty 540 1921 ACT ACT CAA ACT TTC GGA CTT ACC AGC GGT GCG ACT TAT TCT ACA ACT 1953 541 Ser Thr Gin Thr Phe Gly Leu Thr Ser Gly Ala Thr T r Ser Thr Thr 556 1969 TAT ACT TTG TCT GAG GCG GAT TTC TCC ACT GAT CCG GAC AAA AAC TAC 2016 557 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn T/r 672 2017 CTA CAG GTT TGT TTG AAT GTC GTG TGG GAT CAT TAT GAC CGC CCG TCA 2064 673 Leu Gin Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 633 2065 GGG AAA AAA GGG GCT TAT TCT TGG GTC AGT AAG TGG TTT AAC GTC TAT 2112 689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val T/r 704 2113 GTT GCG TTG CAA GAT AGC AAA GCT CCG GAT GCC ATT CCT CGA TTA GTT 2160 705 Val Ala Leu Gin Asp Ser Lys Ala Pro Asp Ala lie Pro Arg Leu Val 20 2161 TCC CGT TAC GAT AGT AAA CGT GGT CTG GTG CAA TAT CTG GAC TTC TGG 2203 721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gin Tyr Leu Asp Phe Trp 736 2209 ACC TCA TCA TTA CCC GCG AAA ACC CGT CTT AAC ACC ACC TTT GTG CGT 2256 737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 52 2257 ACT TTG ATT GAG AAG GCT AAT CTG GGG CTG GAT AGT TTG CTG GAT TAC 2304 753 Thr Leu lie Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp T/r 763 2305 ACC TTG CAG GCA GAT CCT TCT CTG GAA GCA GAT TTA GTG ACT GAC GGC 2352 769 Thr Leu Gin Ala Asp Pro Ser Lau Glu Ala Asp Leu Val Thr Asp Gly 734 2353 AAA AGC GAA CCA ATG GAC TTT AAT GGT TCA AAC GGT CTC TAT TTC TGG 2400 735 Lys Ser Glu Pro Mec Asp Phe Asn Gly Ser Asn 
Gly Leu Tyr Phe Trp ^ 300 2401 GAA TTG TTC TTT CAC CTG CCG TTT TTG GTT GCT ACA CGC TTT GCC .AAC 443 301 Glu Leu Phe Phe His Lau Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 316 2449 GAA CAG CAA TTT TCG CCG GCA CAA AAG AGT TTG CAT TAC ATC TTT GAC 2496 317 Glu Gin Gin Phe Ser Pro Ala Gin Lys Ser Leu His T/r He Phe Asp 33 -242- SUBSTTTUTE SHEET (RULE 26) 97 CCG GCG ATG AAA AAC AAG CCA CAC .AAT SJCC CCG GCT TAT TGG AAT GTA ^ 33 Pro Ala Mec Lys Asn Lys Pro His Asn Ala Pro Ala T r Tr? Asn Val 8 45 CGT CCG TTG GTT GAA GGA .AAC AGC GAT TTG TCA CGT CAT TTG GAC GAT 2592 49 Arg Pro Lau Val Glu Gly Asn Sar Asp Lau Ser Arg His Lau Asp Asp 354 53 TCT ATA GAC CCA GAT ACT CAA GCT TAT GCT CAT CCG GTG ATA TAC CAC 2640 65 Sar lie Asp Pro Asp Thr Gin Ala Tyr Ala His Pro Val lie Tyr Gi 330 20 1 AAA GCG GTG TTT ATT GCC TAT GTC ACT AAC CTG ATT GCT CAG GGA GAT 2633 381 Lys Ala Val Phe lie Ala Tyr Val Ser Asn Leu He Ala Gin Gly Asp 396 2539 ATG TGG TAT CGC CAA TTG ACT CGT GAC GGT CTG ACT CAG GCC CGT G 2736 397 Mec Trp Tyr Arg Gin Leu Thr Arg Asp Gly Leu Thr Gin Ala Arg Val 912 2" 7 TAT TAC AAT CTG GCC GCT GAA TTG CTA GGG CCT CGT CCG GAT GTA TCG 2734 913 Tyr Tyr Asn Leu Ala Ala Glu Lau Lau Gly Pro Arg Pro Asp Val Sar 929 2735 CTG AGT AGC ATT TGG ACG CCG CAA ACC CTG GAT ACC TTA GCA GCC GGG 2332 929 Leu Ser Ser lie Trp Thr Pro Gin Thr Lau Asp Thr Leu Ala Ala Gly 944 2833 CAA AAA GCG GTT TTA CGT GAT TTT GAG CAC CAG TTG GCT AAT AGT GAT 2330 945 Gin Lys Ala Val Leu Arg Asp Phe Glu His Gin Leu Ala Asn Ser Asp 950 2381 ACC GCT TTA CCC GCA TTG CCG GGC CGC AAT GTC AGC TAC TTG AAA CTG 2923 951 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Lau 975 2929 GCA GAT AAT GGC TAC TTT AAT GAA CCG CTC AAT GTT CTG ATG TTG TCT 2975 977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Mec Leu Ser 992 2977 CAC TGG GAT ACG TTG GAT GCA CGG TTA TAC AAT CTG CGT CAT AAC CTG 3024 993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Lau 1003 302S ACC GTT GAT GGC AAG CCG CTT TCG CTG CCG CTG TAT GCT 
GCG CCT GTT 3072 1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024 3073 GAT CCG GTA GCG TTG TTG GCT CAG CGT GCT CAG TCC GGC ACG TTG ACG 3120 1025 Asp Pro Val Ala Leu Leu Ala Gin Arg Ala Gin Ser Gly Thr Lau Thr L040 3121 AAT GGC GTC AGT GGC GCC ATG TTG ACG GTG CCG CCA TAC CGT TTC AGC 3153 1041 Asn Gly Val Ser Gly Ala Mec Leu Thr Val Pro Pro Tyr Arg Phe Sar 1056 3159 GCT ATG TTG CCG CGA GCT TAC AGC GCC GTG GGT ACG TTG ACC AGT TTT 1057 Ala Mec Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Lau Thr Sar Pha 3217 GGT CAG AAC CTG CTT AGT TTG TTG GAA CGT AGC GAA CGA GCC TGT CAA :264 1073 Gly Gin Asn Lau Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gin 1038 -243 - 255 GAA GAG TTG GCG --AA CAG CAA CTG TTG ATG TCC AGC TAT GCC ATC 1035 Glu Glu Leu Aia Gin Gin Gin Leu Leu Asp Mec S=r Ser Tyr Aid :ie 3313 ACG TTG CAA CAA CAG GCG CTG GAT GGA TTG GCG GCA GAT CGT CTG GCG j 36 1105 Thr Leu Gin Gin Gin Ala Leu Asp Gly Leu Ala Aia Asp Arg Leu Aia 112; 336 i CTG CTA GCT ACT CAG GCT ACG GCA CAA CAG CGT CAT GAC CAT TAT TAC 3-103 i 121 Leu Leu Ala Ser Gin Ala Thr Ala Gin Gin Arg His Asp His Ty Ty 1135 3409 ACT C G TAT CAG AAC AAC ATC TCC AGT GCG GAA CAA CTG GTG ATG GAC 3 56 1137 Thr Leu Tyr Gin Asn Asn lie Ser Ser A.La Glu Gin Leu Vai Mec Asp ii52 3457 ACC CAA ACG TCA GCA CAA TCC CTG ATT TCT TCT TCC ACT GGT GTA CAA 350-4 1153 Thr Gin Thr Ser Ala Gin Ser Leu lie Ser Ser Ser Thr Giy Vai Gin ii6o 3505 ACT GCC AGT GGG GCA CTG AAA GTG ATC CCG AAT ATC TTT GGT TTG GCT 3552 1169 Thr Ala Ser Gly Ala Lau Lys Vai lie Pro Asn lie Phe Gly Leu Ala 1134 3553 GAT GGC GGC TCG CGC TAT GAA GGA GTA ACG GAA GCG ATT GCC ATC GGG 3600 1135 Asp Gly Gly Ser Arg Tyr Glu Gly Vai Thr Glu Ala lie Ala lie Gly 1200 3601 TTA ATG GCT GCC GGA CAA GCC ACC AGC GTG GTG GCC GAG CGT CTG GCA 3643 1201 Leu Mec Ala Ala Gly Gin Ala Thr Ser Vai Vai Ala Glu Arg Leu Ala 1216 3649 ACC ACG GAG AAT TAC CGC CGC CGC CGT GAA GAG TGG CAA ATC CAA TAC 3556 1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gin He Gin Tyr 1232 
3697 CAG CAG GCA CAG TCT GAG GTC GAC GCA TTA CAG AAA CAG TTG GAT GCG 3744 1233 Gin Gin Ala Gin Sar Glu Vai Asp Ala Leu Gin Lys Gin Leu Asp Ala 1243 3745 CTG GCA GTG CGC GAG AAA GCA GCT CAA AC TCC CTG CAA CAG GCG AAG 3792 1249 Leu Ala Vai Arg Glu Lys Ala Ala Gin Thr Ser Leu Gin Gin Ala Lys 1264 3793 GCA CAG CAG GTA CAA ATT CGG ACC ATG CTG ACT TAC TTA ACT ACT CGT 3340 1265 Ala Gin Gin Vai Gin lie Arg Thr Mac Leu Thr Tyr Leu Thr Thr Arg 1230 3341 TTC ACC CAG GCG ACT CTG TAC CAG TGG CTG AGT GGT CAA TTA TCC GCG 3333 1231 Phe Thr Gin Ala Thr Lau Tyr Gin Trp Leu Sar Gly Gin Leu Ser Ala 1236 3339 TTG TAT TAT CAA GCG TAT GAT GCC GTG GTT GCT CTC TGC CTC TCC GCC 3936 1297 Leu Tyr Tyr Gin Ala Tyr Asp Ala Vai Vai Ala Leu Cys Leu Ser Ala 1312 3937 CAA GCT TGC TGG CAG TAT GAA TTG GGT GAT TAC GCT ACC ACT TTT ATC 3534 1313 Gin Ala Cys Trp Gin Tyr Glu Lau Gly Asp Tyr Ala Thr Thr Phe lie H I S 3985 CAG ACC GGT ACC TGG AAC GAC CAT TAC CGT GGT TTG CAA GTG GGG GAG 4032 1329 Gin Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gin Vai Gly Glu 1344 4033 ACA CTG CAA CTC AAT TTG CAT CAG ATG GAA GCG GCC TAT TTA GTT CGT 4030 1345 Thr Leu Gin Leu Asn Leu His Gin Mec Glu Ala Ala Tyr Leu Vai Arg .360 4031 CAC G A CGC CGT CTT AAT CTC AT" CGT ACT 3TG TCG CTC AAA AGC CTA 4125 13 1 His Glu Arg Arg Leu Αϋη 7ai lie Arg Thr Vai ier Leu Lys Ser Leu ί 3~6 129 TTG GGT GAT GAT GGT TTT GGT AAG TTA AAA ACC GAA GGC AAA GTC GAC 1" 6 137" Lau Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1332 177 TTT CCA TTA AGC GAA AAG CTG TTT GAC AAC GAC TAT CCG GGG CAC TAT 4224 1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1403 4225 TTG CGC CAG ATT AAA ACT GTG TCA GTG ACG TTG CCG ACG TTA GTC GGG 42" 1409 Lau Arg Gin Ila Lys Thr Val Ser Val Thr Lau Pro Thr Lau Val Gly 1424 4273 CCG TAT CAA AAC GTG AAG GCA ACG CTC ACT CAG ACC AGC AGC ACT ATA 4320 1425 Pro Tyr Gin Asn Val Lys Ala Thr Lau Thr Gin Thr Ser Ser Ser lie 1440 4321 TTG TTA GCA GCA GAT ATC ΛΑΤ GGT GTT AAA CGT CTC AAT GAT CCG ACA 4363 
1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456
4369 GGT AAA GAG GGT GAT GCG ACG CAT ATT GTC ACC AAT CTG CGT GCC AGC 4416
1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472
4417 CAG CAG GTG GCG CTC TCT TCT GGC ATT AAT GAT GCC GGT AGC TTT GAG 4464
1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488
4465 TTG CGT TTG GAA GAT GAG CGC TAT CTA TCA TTT GAG GGG ACT GGA GCT 4512
1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504
4513 GTT TCC AAA TGG ACT CTT AAC TTC CCG CGT TCT GTG GAT GAG CAT ATT 4560
1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520
4561 GAC GAT AAG ACA TTG AAA GCG GAT GAG ATG CAG GCC GCA CTG TTG GCG 4608
1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536
4609 AAT ATG GAT GAT GTG CTG GTG CAG GTG CAT TAT ACC GCC TGC GAC GGC 4656
1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552
4657 GGC GCC AGT TTC GCA AAC CAG GTC AAG AAA ACA CTC TCT TAA 4698
1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser End 1566
(2) INFORMATION FOR SEQ ID NO: 59
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1565 amino acids
(B) TYPE: amino acid
(C) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59 (TccB peptide)
Features From To Description
il SEQ ID NO:
1 Met Leu Ser Thr Met Glu Lys Gln Leu Asn Glu Ser Gln Arg Asp Ala 16
17 Leu Val Thr Gly Tyr Met Asn Phe Val Ala Pro Thr Leu Lys Gly Val 32
33 Ser Gly Gln Pro Val Thr Val Glu Asp Leu Tyr Glu Tyr Leu Leu Ile 48
49 Asp Pro Glu Val Ala Asp Glu Val Glu Thr Ser Arg Val Ala Gln Ala 64
65 Ile Ala Ser Ile Gln Gln Tyr Met Thr Arg Leu Val Asn Gly Ser Glu 80
81 Pro Gly Arg Gln Ala Met Glu Pro Ser Thr Ala Asn Glu Trp Arg Asp 96
97 Asn Asp Asn Gln Tyr Ala Ile Trp Ala Ala Gly Ala Glu Val Arg Asn 112
113 Tyr Ala Glu Asn Tyr Ile Ser Pro Ile Thr Arg Gln Glu Lys Ser His 128
129 Tyr Phe Ser Glu Leu Glu Thr Thr Leu Asn Gln Asn Arg Leu Asp Pro 144
145 Asp Arg Val Gln
Asp Ala Val Leu Ala Tyr Leu Asn Glu Phe Glu Ala 160
161 Val Ser Asn Leu Tyr Val Leu Ser Gly Tyr Ile Asn Gln Asp Lys Phe 176
177 Asp Gln Ala Ile Tyr Tyr Phe Ile Gly Arg Thr Thr Thr Lys Pro Tyr 192
193 Arg Tyr Tyr Trp Arg Gln Met Asp Leu Ser Lys Asn Arg Gln Asp Pro 208
209 Ala Gly Asn Pro Val Thr Pro Asn Cys Trp Asn Asp Trp Gln Glu Ile 224
225 Thr Leu Pro Leu Ser Gly Asp Thr Val Leu Glu His Thr Val Arg Pro 240
241 Val Phe Tyr Asn Asp Arg Leu Tyr Val Ala Trp Val Glu Arg Asp Pro 256
257 Ala Val Gln Lys Asp Ala Asp Gly Lys Asn Ile Gly Lys Thr His Ala 272
273 Tyr Asn Ile Lys Phe Gly Tyr Lys Arg Tyr Asp Asp Thr Trp Thr Ala 288
289 Pro Asn Thr Thr Thr Leu Met Thr Gln Gln Ala Gly Glu Ser Ser Glu 304
305 Thr Gln Arg Ser Ser Leu Leu Ile Asp Glu Ser Ser Thr Thr Leu Arg 320
321 Gln Val Asn Leu Leu Ala Thr Thr Asp Phe Ser Ile Asp Pro Thr Glu 336
337 Glu Thr Asp Ser Asn Pro Tyr Gly Arg Leu Met Leu Gly Val Phe Val 352
353 Arg Gln Phe Glu Gly Asp Gly Ala Asn Arg Lys Asn Lys Pro Val Val 368
369 Tyr Gly Tyr Leu Tyr Cys Asp Ser Ala Phe Asn Arg His Val Leu Arg 384
385 Pro Leu Ser Lys Asn Phe Leu Phe Ser Thr Tyr Arg Asp Glu Thr Asp 400
401 Gly Gln Asn Ser Leu Gln Phe Ala Val Tyr Asp Lys Lys Tyr Val Ile 416
417 Thr Lys Val Val Thr Gly Ala Thr Glu Asp Pro Glu Asn Thr Gly Trp 432
433 Val Ser Lys Val Asp Asp Leu Lys Gln Gly Thr Thr Gly Ala Tyr Val 448
449 Tyr Ile Asp Gln Asp Gly Leu Thr Leu His Ile Gln Thr Thr Thr Asn 464
465 Gly Asp Phe Ile Asn Arg His Thr Phe Gly Tyr Asn Asp Leu Val Tyr 480
481 Asp Ser Lys Ser Gly Tyr Gly Phe Thr Trp Ser Gly Asn Glu Gly Phe 496
497 Tyr Leu Asp Tyr His Asp Gly Asn Tyr Tyr Thr Phe His Asn Ala Ile 512
513 Ile Asn Tyr Tyr Pro Ser Gly Tyr Gly Gly Gly Ser Val Pro Asn Gly 528
529 Thr Trp Ala Leu Glu Gln Arg Ile Asn Glu Gly Trp Ala Ile Ala Pro 544
545 Leu Leu Asp Thr Leu His Thr Val Thr Val Lys Gly Ser Tyr Ile Ala 560
561 Trp Glu Gly Glu Thr Pro Thr Gly Tyr Asn Leu Tyr Ile Pro Asp Gly 576
577 Thr Val Leu Leu Asp Trp Phe Asp Lys Ile Asn Phe Ala Ile Gly Leu 592
593 Asn Lys Leu Glu Ser Val Phe Thr Ser Pro
Asp Trp Pro Thr Leu Thr 608
609 Thr Ile Lys Asn Phe Ser Lys Ile Ala Asp Asn Arg Lys Phe Tyr Gln 624
625 Glu Ile Asn Ala Glu Thr Ala Asp Gly Arg Asn Leu Phe Lys Arg Tyr 640
641 Ser Thr Gln Thr Phe Gly Leu Thr Ser Gly Ala Thr Tyr Ser Thr Thr 656
657 Tyr Thr Leu Ser Glu Ala Asp Phe Ser Thr Asp Pro Asp Lys Asn Tyr 672
673 Leu Gln Val Cys Leu Asn Val Val Trp Asp His Tyr Asp Arg Pro Ser 688
689 Gly Lys Lys Gly Ala Tyr Ser Trp Val Ser Lys Trp Phe Asn Val Tyr 704
705 Val Ala Leu Gln Asp Ser Lys Ala Pro Asp Ala Ile Pro Arg Leu Val 720
721 Ser Arg Tyr Asp Ser Lys Arg Gly Leu Val Gln Tyr Leu Asp Phe Trp 736
737 Thr Ser Ser Leu Pro Ala Lys Thr Arg Leu Asn Thr Thr Phe Val Arg 752
753 Thr Leu Ile Glu Lys Ala Asn Leu Gly Leu Asp Ser Leu Leu Asp Tyr 768
769 Thr Leu Gln Ala Asp Pro Ser Leu Glu Ala Asp Leu Val Thr Asp Gly 784
785 Lys Ser Glu Pro Met Asp Phe Asn Gly Ser Asn Gly Leu Tyr Phe Trp 800
801 Glu Leu Phe Phe His Leu Pro Phe Leu Val Ala Thr Arg Phe Ala Asn 816
817 Glu Gln Gln Phe Ser Pro Ala Gln Lys Ser Leu His Tyr Ile Phe Asp 832
833 Pro Ala Met Lys Asn Lys Pro His Asn Ala Pro Ala Tyr Trp Asn Val 848
849 Arg Pro Leu Val Glu Gly Asn Ser Asp Leu Ser Arg His Leu Asp Asp 864
865 Ser Ile Asp Pro Asp Thr Gln Ala Tyr Ala His Pro Val Ile Tyr Gln 880
881 Lys Ala Val Phe Ile Ala Tyr Val Ser Asn Leu Ile Ala Gln Gly Asp 896
897 Met Trp Tyr Arg Gln Leu Thr Arg Asp Gly Leu Thr Gln Ala Arg Val 912
913 Tyr Tyr Asn Leu Ala Ala Glu Leu Leu Gly Pro Arg Pro Asp Val Ser 928
929 Leu Ser Ser Ile Trp Thr Pro Gln Thr Leu Asp Thr Leu Ala Ala Gly 944
945 Gln Lys Ala Val Leu Arg Asp Phe Glu His Gln Leu Ala Asn Ser Asp 960
961 Thr Ala Leu Pro Ala Leu Pro Gly Arg Asn Val Ser Tyr Leu Lys Leu 976
977 Ala Asp Asn Gly Tyr Phe Asn Glu Pro Leu Asn Val Leu Met Leu Ser 992
993 His Trp Asp Thr Leu Asp Ala Arg Leu Tyr Asn Leu Arg His Asn Leu 1008
1009 Thr Val Asp Gly Lys Pro Leu Ser Leu Pro Leu Tyr Ala Ala Pro Val 1024
1025 Asp Pro Val Ala Leu Leu Ala Gln Arg Ala Gln Ser Gly Thr Leu Thr 1040
1041 Asn Gly Val Ser Gly Ala Met Leu Thr Val Pro Pro Tyr Arg Phe Ser 1056
1057 Ala Met Leu Pro Arg Ala Tyr Ser Ala Val Gly Thr Leu Thr Ser Phe 1072
1073 Gly Gln Asn Leu Leu Ser Leu Leu Glu Arg Ser Glu Arg Ala Cys Gln 1088
1089 Glu Glu Leu Ala Gln Gln Gln Leu Leu Asp Met Ser Ser Tyr Ala Ile 1104
1105 Thr Leu Gln Gln Gln Ala Leu Asp Gly Leu Ala Ala Asp Arg Leu Ala 1120
1121 Leu Leu Ala Ser Gln Ala Thr Ala Gln Gln Arg His Asp His Tyr Tyr 1136
1137 Thr Leu Tyr Gln Asn Asn Ile Ser Ser Ala Glu Gln Leu Val Met Asp 1152
1153 Thr Gln Thr Ser Ala Gln Ser Leu Ile Ser Ser Ser Thr Gly Val Gln 1168
1169 Thr Ala Ser Gly Ala Leu Lys Val Ile Pro Asn Ile Phe Gly Leu Ala 1184
1185 Asp Gly Gly Ser Arg Tyr Glu Gly Val Thr Glu Ala Ile Ala Ile Gly 1200
1201 Leu Met Ala Ala Gly Gln Ala Thr Ser Val Val Ala Glu Arg Leu Ala 1216
1217 Thr Thr Glu Asn Tyr Arg Arg Arg Arg Glu Glu Trp Gln Ile Gln Tyr 1232
1233 Gln Gln Ala Gln Ser Glu Val Asp Ala Leu Gln Lys Gln Leu Asp Ala 1248
1249 Leu Ala Val Arg Glu Lys Ala Ala Gln Thr Ser Leu Gln Gln Ala Lys 1264
1265 Ala Gln Gln Val Gln Ile Arg Thr Met Leu Thr Tyr Leu Thr Thr Arg 1280
1281 Phe Thr Gln Ala Thr Leu Tyr Gln Trp Leu Ser Gly Gln Leu Ser Ala 1296
1297 Leu Tyr Tyr Gln Ala Tyr Asp Ala Val Val Ala Leu Cys Leu Ser Ala 1312
1313 Gln Ala Cys Trp Gln Tyr Glu Leu Gly Asp Tyr Ala Thr Thr Phe Ile 1328
1329 Gln Thr Gly Thr Trp Asn Asp His Tyr Arg Gly Leu Gln Val Gly Glu 1344
1345 Thr Leu Gln Leu Asn Leu His Gln Met Glu Ala Ala Tyr Leu Val Arg 1360
1361 His Glu Arg Arg Leu Asn Val Ile Arg Thr Val Ser Leu Lys Ser Leu 1376
1377
Leu Gly Asp Asp Gly Phe Gly Lys Leu Lys Thr Glu Gly Lys Val Asp 1392
1393 Phe Pro Leu Ser Glu Lys Leu Phe Asp Asn Asp Tyr Pro Gly His Tyr 1408
1409 Leu Arg Gln Ile Lys Thr Val Ser Val Thr Leu Pro Thr Leu Val Gly 1424
1425 Pro Tyr Gln Asn Val Lys Ala Thr Leu Thr Gln Thr Ser Ser Ser Ile 1440
1441 Leu Leu Ala Ala Asp Ile Asn Gly Val Lys Arg Leu Asn Asp Pro Thr 1456
1457 Gly Lys Glu Gly Asp Ala Thr His Ile Val Thr Asn Leu Arg Ala Ser 1472
1473 Gln Gln Val Ala Leu Ser Ser Gly Ile Asn Asp Ala Gly Ser Phe Glu 1488
1489 Leu Arg Leu Glu Asp Glu Arg Tyr Leu Ser Phe Glu Gly Thr Gly Ala 1504
1505 Val Ser Lys Trp Thr Leu Asn Phe Pro Arg Ser Val Asp Glu His Ile 1520
1521 Asp Asp Lys Thr Leu Lys Ala Asp Glu Met Gln Ala Ala Leu Leu Ala 1536
1537 Asn Met Asp Asp Val Leu Val Gln Val His Tyr Thr Ala Cys Asp Gly 1552
1553 Gly Ala Ser Phe Ala Asn Gln Val Lys Lys Thr Leu Ser 1565
(2) INFORMATION FOR SEQ ID NO: 60
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3122 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60 (tccC)
1 ATG AGT CCG TCT GAG ACT ACT CTT TAT ACT CAA ACC CCA AC GTC AGC 48
1 Met Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gln Thr Pro Thr Val Ser 16
49 GTG TTA GAT AAT CGC GGT CTG TCC ATT CGT GAT ATT GGT TTT CAC CGT 96
17 Val Leu Asp Asn Arg Gly Leu Ser Ile Arg Asp Ile Gly Phe His Arg 32
97 ATT GTA ATC GGG GGG GAT ACT GAC ACC CGC GTC ACC CGT CAC CAG TAT 144
33 Ile Val Ile Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gln Tyr 48
145 GAT GCC CGT GGA CAC CTG AAC TAC AGT ATT GAC CCA CGC TTG TAT GAT 192
49 Asp Ala Arg Gly His Leu Asn Tyr Ser Ile Asp Pro Arg Leu Tyr Asp 64
193 GCA AAG CAG GCT GAT AAC TCA GTA AAG CCT AAT TTT GTC TGG CAG CAT 240
65 Ala Lys Gln Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gln His 80
241 GAT CTG GCC GGT CAT GCC CTG CGG ACA GAG AGT GTC GAT GCT GGT CGT 288
81 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96
289 ACT GTT GCA TTG AAT GAT ATT GAA GGT
CGT TCG GTA ATG ACA ATG AAT 336 97 Thr Val Ala Leu Asn Asp He Glu Gly Arg Ser Val Mec Thr Mec Asn 112 337 GCG ACC GGT GTT CGT CAG ACC CGT CGC TAT GAA GGC AAC ACC TTG CCC 334 113 Ala Thr Gly Val Arg Gin Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 123 335 GGT CGC TTG TTA TCT GTG AGC GAG CAA GTT TTC AAC CAA GAG AGT GCT 432 129 Giy Arg Leu Leu Ser Val Ser Glu Gin Val Phe Asn Gin Glu Ser Ala 144 433 AAA GTG ACA GAG CGC TTT ATC TGG GCT GGG AAT ACA ACC TCG GAG .AAA 430 145 Lys Val Thr Glu Arg Phe He Trp Ala Gly Asn Thr Thr Ser Glu Lys 160 431 GAG TAT AAC CTC TCC GGT CTG TCT ATA CGC CAC TAC GAC ACA GCG GGA 523 161 Glu Tyr Asn Leu Ser Gly Leu Cys He Arg His Tyr Asp Thr Ala Gly 176 529 GTG ACC CGG TTG ATG AGT CAG TCA CTG GCG GGC GCC ATG CTA TCC CAA 5 177 Val Thr Arg Leu Mec Ser Gin Ser Leu Ala Gly Ala Mec Leu Ser Gin HI 577 TCT CAC CAA TTG CTG GCG GAA GGG CAG GAG GCT AAC TGG AGC GGT GAC 624 193 Ser His Gin Leu Leu Ala Glu Gly Gin Glu Ala Asn Trp Ser Gly Asp 203 -249- SUBSTmjTE SHEET (RULE 26) GAC GAA ACT GTC TCC CAG CCA ATG C G GCA AGT GAG GTC TAT ACG CA 205 Asp Glu Thr Val Trp Gin Gly Mec Leu Ala ier Glu Val Tyr Thr Thr _·_.-! 
57 3 C.AA AGT ACC ACT AAT GCC ATC GGG GCT TTA CTG ACC CAA ACC GAT GCG 7 20 22 5 Gin Ser Thr Thr Asn Ala lie Gly Ala Leu Leu Thr Gin Thr Asp Ala 2 40 " 2 1 AAA GGC A.-.T ATT CAG CGT CTG GCT TAT GAC ATT GCC GGT CAG TTA AAA " -' 3 2 4 1 Lys Gly Asn lie Gin Arg Leu Ala Tyr Asp lie Ala Gly Gin L÷u Lys 2 56 7 5 9 GGG AGT TGG TTG ACG GTC AAA GGC CAG AGT GAA CAG CTG ATT GTT AAG 3 16 2 57 Gly Ser Trp Leu Thr Val Lys Gly Gin Ser Glu Gin Val He Val Ly 3 17 TCC CTG AGC TGG TCA GCC GCA GGT CAT AAA TTG CGT GAA GAG CAC GGT 3 64 273 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 2 3 a 365 AAC GGC GTG GTT ACG GAG TAC AGT TAT GAG CCG GAA ACT CAA CGT CTG 5 12 289 Asn Gly Val Val Thr Glu Tyr Ser Tyr Glu Pro Glu Thr Gin Arg Leu 3 04 9 13 ATA GGT ATC ACC ACC CGG CGT GCC GAA GGG AGT CAA TCA GGA GCC AGA 9 50 3 0 5 He Gly lie Thr Thr Arg Arg Ala Glu Gly Ser Gin Ser Gly Ala Arg 3 20 9 61 GTA TTG CAG GAT CTA CGC TAT AAG TAT GAT CCG GTG GGG AAT GTT ATC 1008 3 2 1 Val Leu Gin Asp Lau Arg Tyr Lys Tyr Asp Pro Val Gly Asn Val He 3 3 6 1009 AGT ATC CAT AAT GAT GCC GAA GCT ACC CGC TTT TGG CGT AAT CAG AAA 1056 3 37 Ser lie His Asn Asp Ala Glu Ala Thr Arg Phe Trp Arg Asn Gin Lys 3 52 1057 GTG GAG CCG GAG AAT CGC TAT GTT TAT GAT TCT CTG TAT CAG CTT ATG 1 104 3 53 Val Glu Pro Glu Asn Arg Tyr Val Tyr Asp Ser Lau Tyr Gin Leu Mec 3 63 1105 AGT GCG ACA GGG CGT GAA ATG GCT AAT ATC GGT CAG CAA AGC AAC CAA 1 152 3 59 Ser Ala Thr Gly Arg Glu Mec Ala Asn Ila Gly Gin Gin Ser Asn Gin- 3 34 1153 CTT CCC TCA CCC GTT ATA CCT GTT CCT ACT GAC GAC AGC ACT TAT ACC 1 200 3 85 Leu Pro Ser Pro Val Ila Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 4 00 1201 AAT TAC CTT CGT, ACC TAT ACT TAT GAC CGT GGC GGT AAT TTG GTT CAA 1243 40 1 Asn Tyr Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gin 4 16 12 9 ATC CGA CAC AGT TCA CCC GCG ACT CAA AAT AGT TAC ACC ACA GAT ATC 1255 4 17 Ha Arg His Ser Ser Pro Ala Thr Gin Asn Ser Tyr Thr Thr Asp He 4 3 2 1297 ACC GTT TCA AGC CGC AGT AAC CGG GCG GTA TTG AGT ACA TTA ACG ACA 1 
44 43 3 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 4 43 13 45 GAT CCA ACC CGA GTG GAT GCG CTA TTT GAT TCC GGC GGT CAT CAG AAG 1 3 92 449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gin Lys 4 64 13 9 3 ATG TTA ATA CCG GGG CAA AAT CTG GAT TGG AAT ATT CGG GGT GAA TTG L 40 -250- 455 Mac Leu lis Pro Giy Gin Asn Lau Asp Trp Asn rie Arg Giy Giu Le 1441 CAA CGA GTC ACA CCG GTG AGC CGT GAA AAT AGC ACT GAC AGT G A TGG 1433 431 Gin Arg Val Thr Pro Val Sar Arg Glu Asn Ser Ser Asp Sar Giu Trp 496 1439 TAT CGC TAT AGC AGT GAT GGC ATG CGG CTG CTA AAA GTG AGT G A CAG 1535 497 T r Arg T/r Ser Ser Asp Giy Mec Arg Lau Leu Lys Val Ser Giu Gin Si^ 1537 CAG ACG GGC AAC AGT ACT CAA GTA CAA CGG GTG ACT TAT CTG CCG GGA L 53 513 Gin Thr Giy Asn Ser Thr Gin Val Gin Arg Val Thr T/r Leu Pro Giy 523 1535 TTA GAG CTA CGG ACA ACT GGG GTT GCA GAT AAA ACA ACC GAA GAT TTG 1532 529 Leu Glu Leu Arg Thr Thr Giy Val Ala Asp Lys Thr Thr Glu Asp Lau 544 1533 CAG GTG ATT ACG GTA GGT GAA GCG GGT CGC GCA CAG GTA AGG GTA TTG 1630 545 Gin Val lie Thr Val Giy Glu Ala Giy Arg Ala Gin Val Arg Val Lau 560 13 1 CAC TGG GAA AGT GGT AAG CCG ACA GAT ATT GAC AAC AAT CAG GTG CGC 1723 561 His Trp Glu Ser Giy Lys Pro Thr Asp lie Asp Asn Asn Gin Val Arg 576 1729 TAC AGC TAC GAT AAT CTG CTT GGC TCC AGC CAG CTT GAA CTG GAT AGC 1 76 577 Tyr Ser Tyr Asp Asn Leu Leu Giy Ser Ser Gin Leu Glu Leu Asp Sar 592 1777 GAA GGG CAG ATT CTC AGT CAG GAA GAG TAT TAT CCG TAT GGC GGT ACG 1324 593 Glu Giy Gin lie Leu Ser Gin Glu Glu Tyr T/r Pro Tyr Giy Giy Thr 603 1325 GCG ATA TGG GCG GCG AGA AAT CAG ACA GAA GCC AGC TAC AAA TTT ATT 13" 2 509 Ala lie Trp Ala Ala Arg Asn Gin Thr Glu Ala Ser Tyr Lys Phe lie 624 1373 CGT TAC TCC GGT AAA GAG CGG GAT GCC ACT GGA TTG TAT TAT TAC GGC 1920 525 Arg Tyr Ser Giy Lys Glu Arg Asp Ala Thr Giy Leu Tyr Tyr Tyr Giy 540 1521 TAC CGT TAT TAT CAA CCT TGG GTG GGT CGA TGG TTG AGT GCT GAT CCG 1 6 641 Tyr Arg Tyr Tyr Gin Pro Trp Val Giy Arg Trp Leu Ser Ala Asp Pro 656 1969 GCG GGA ACC 
GTG GAT GGG CTG AAT TTG TAC CGA ATG GTG AGG AAT AAC 2016 557 Ala Giy Thr Val Asp Giy Leu Asn Leu Tyr Arg Mec Val Arg Asn Asn 672 2017 CCC ATC ACA TTG ACT GAC CAT GAC GGA TTA GCA CCG TCT CCA AAT AGA 2064 673 Pro lie Thr Leu Thr Asp His Asp Giy Lau Ala Pro Ser Pro Asn Arg 633 2065 AAT CGA AAT ACA TTT TGG TTT GCT TCA TTT TTG TTT CGT AAA CCT T 11 639 Asn Arg Asn Thr Phe Trp Phe Ala Ser Phe Leu Phe Arg Lys Pro .-.sp "04 2113 GAG GGA ATG TCC GCG TCA ATG AGA CGG GGA CAA AAA ATT GGC AGA GCC 2160 705 Glu Giy Mec Ser Ala Ser Mec Arg Arg Giy Gin Lys He Giy Arg Ala "20 2151 ATT GCC GGC GGG ATT GCG ATT GGC GGT CTT GCG GCT ACC ATT GCC GCT 2 3 -721 He Ala Giy Giy He Ala .lie Giy Giy Lau Ala Ala Thr lie Ala Ala " 36 251- SUBSTmjTE SHEET(RULE 26) 2209 ACC GCT GGC CCG CCT ATC CCC CTC ATT CTG GGC CTT GCG CCC CTA CCC = "37 Thr- Ala Gly Ala Ala lie Pro Val lis Lau Gly Val Aia Aia Vai civ " ; 2257 GCG GGG ATT GGC GCG TTG ATG GGA TAT AAC GTC GGT AGC CTG CTG GAA 2204 "53 Ala Gly lis Gly Ala Leu Mec Gly Tyr Asn Val Gly Ser Lau Leu Giu "65 2305 AAA GGC CGG CCA TTA CTT GCT CGA CTC CTA CAG GCG AAA TCC ACC TTA 2352 769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gin Gly Lys Ser Thr Leu. 
34 2353 GTA CAG TCG GCG GCT GGC GCG GCT GCC GGA GCG AGT TCA GCC GCG CCT 240C 735 Val Gin Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 300 2401 TAT GGC GCA CGG GCA CAA GGT GTC GGT GTT GCA TCA GCC GCC GGG GCC 2443 301 Tyr Gly Ala Arg Ala Gin Gly Val Gly Val Ala Ser Ala Ala Gly Ala 3i5 2449 GTA ACA GGG GCT GTG GGA TCA TGG ATA AAT AAT GCT GAT CCG GGG ATT 2496 317 al Thr Gly Ala Val Gly Ser Trp lie Asn Asn Ala Asp Arg Gly lie 332 2 97 GGC GGC GCT ATT GGG GCC GGG AGT GCG GTA GGC ACC ATT GAT ACT ATG 2544 333 Gly Gly Ala lie Gly Ala Gly Ser Ala Val Gly Thr He Asp Thr Mec 343 2545 TTA GGG ACT GCC TCT ACC CTT ACC CAT GAA GTC GGG GCA GCG GCG GGT 2592 349 Leu Gly Thr Ala Ser Thr Leu Thr His Glu Val Gly Ala Ala Ala Gly 354 2593 GGG GCG GCG GGT GGG ATG ATC ACC GGT ACG CAA GGG AGT ACT CGG GCA 2540 365 Gly Ala Ala Gly Gly Mec He Thr Gly Thr Gin Gly Ser Thr Arg Ala 330 2641 GGT ATC CAT GCC GGT ATT GGC ACC TAT TAT GGC TCC TGG ATT GGT TTT 2533 331 Gly lie His Ala Gly He Gly Thr Tyr Tyr Gly Ser Trp He Gly Phe 396 2639 GGT TTA GAT GTC GCT AGT AAC CCC GCC GGA CAT TTA GCG AAT TAC GCA 2735 897 Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala 912 2737 GTG GGT TAT GCC GCT GGT TTG GGT GCT GAA ATG GCT GTC AAC AGA ATA 2734 913 Val Gly Tyr Ala Ala Gly Leu Gly Ala Glu Mec Ala Val Asn Arg He 923 2785 ATG GGT GGT GGA TTT TTG AGT AGG CTC TTA GGC CGG GTT GTC AGC CCA 2332 929 Mec Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser Pro 944 2333 TAT GCC GCC GGT TTA GCC AGA CAA TTA GTA CAT TTC AGT GTC GCC AGA 2330 545 Tyr Ala Ala Gly Leu Ala Arg Gin Leu Val His Phe Ser Val Ala Arg 550 2831 CCT GTC TTT GAG CCG ATA TTT AGT GTT CTC GGC GGG CTT GTC GGT GGT 2923 961 Pro Val Phe Glu Pro lie Phe Ser Val Leu Gly Gly Leu Val Gly Gly 575 2929 ATT GGA ACT GGC CTG CAC AGA GTG ATG GGA AGA GAG AGT TGG ATT TCC 29" 5 977 He Gly Thr Gly Leu His Arg Val Mec Gly Arg Glu Ser Trp He Ser 552 2977 AGA GCG TTA AGT GCT GCC GGT AGT GGT ATA GAT CAT GTC GCT GCC ATC Γ 4 -252- SUBSTTTUTE SHEET (RULE 26) 
.-53 Arg Ala Leu Ser Ala Ala Gly Ser Z x y lie Asp His 7 i Aia Giy Ms- ; ; : · 3025 ATT GGT .AAT CAG ATC AGA GGC AGG GTC TTG ACC ACA ACC GGG ATC OCT 1009 He Gly Asn Gin lie Arg Gly Arg Val Leu Thr Thr Thr Gly lis Ala :024 30~3 .AAT GCG ATA GAC TAT GGC ACC AGT GCT GTG GGA GCC GCA CGA CCA G T 120 1025 Asn Aia lie Asp Tyr Gly Thr Ser Aia Val Gly Ala Aia Arg Arg Val 1040 10 312 i TTT TCT TTG TAA 3132 1041 Phe Ser Leu End 1043 15 (2) INFORMATION FOR SEQ ID NO: 61 (ii SEQUENCE CHARACTERISTICS: (A) LENGTH: 1043 amino acids ( B ) TYPE: amino acid 20 (C) TOPOLOGY: linear (ii) MOLECULE TYPE: procein 25 (XI) SEQUENCE DESCRIPTION: SEQ ID NO : 61 (TccC pepcidei 1 Mec Ser Pro Ser Glu Thr Thr Leu Tyr Thr Gin Thr Pro Thr Val Ser 15 17 Val Leu Asp Asn Arg Gly Leu Ser lis Arg Asp lie Gly Phe His Arg 32 30 33 lie Val He Gly Gly Asp Thr Asp Thr Arg Val Thr Arg His Gin Tyr 43 49 Asp Ala Arg Gly His Leu Asn Tyr Ser He Asp Pro Arg Leu Tyr Asp 64 35 55 Ala Lys Gin Ala Asp Asn Ser Val Lys Pro Asn Phe Val Trp Gin His 30 31 Asp Leu Ala Gly His Ala Leu Arg Thr Glu Ser Val Asp Ala Gly Arg 96 97 Thr Val Ala Leu Asn Asp lie Glu Gly Arg Ser Val Mec Thr Mec Asn 112 40 113 Ala Thr Gly Val Arg Gin Thr Arg Arg Tyr Glu Gly Asn Thr Leu Pro 123 129 Gly Arg Leu Leu Ser Val Ser Glu Gin Val Phe Asn Gin Glu Ser Ala 144 45 145 Lys Val Thr Glu Arg Phe He Trp Ala Gly Asn Thr Thr Ser Glu Lys 150 161 Glu Tyr Asn Leu Ser Gly Leu Cys He Arg His Tyr Asp Thr Ala Gly l"5 177 Val Thr Arg Leu Mec Ser Gin Ser Lau Ala Gly Ala Mec Leu Ser Gin 192 50 193 Ser His Gin Leu Leu Ala Glu Gly Gin Glu Ala Asn Trp Ser Gly Asp 203 209 Asp Glu Thr Val Trp Gin Gly Mec Leu Ala Ser Glu Val Tyr Thr Thr 224 55 225 Gin Ser Thr Thr Asn Ala He Gly Ala Leu Leu Thr Gin Thr Asp Ala 240 241 Lys Gly Asn He Gin Arg Leu Ala Tyr Asp Ha Ala Gly Gin Leu Lys 256 257 Gly Ser Trp Leu Thr Val Lys Gly Gin Sar Glu Gin Val He Val Lys 2" ή 2'73 Ser Leu Ser Trp Ser Ala Ala Gly His Lys Leu Arg Glu Glu His Gly 233 239 Asn Gly Val Val Thr Glu T r Ser Tyr Glu Pro 
Glu Thr Gin Arg Leu 304 65 305 He Gly He Thr Thr Arg Arg Ala Glu Gly Ser Gin Ser Gly Ala Arg 320 -253 - 32 , Val Leu GLn Asp Leu Arg Tyr Lys T* r Asp Pro Val Gly Asn Val I.÷ .■ .6 32" Ser lie His Asn Asp Ala Giu Ala Thr Arg Phe Trp Arg Asn Gin Lys 5 353 Val Glu Pro Giu Asn Arg T Val Tyr Asp Ser Leu Tyr Gin Leu Mec 363 359 Ser Ala Thr Gly Arg Glu Mac Ala Asn lie Gly Gin Gin Ser Asn Gin 334 10 325 Leu Pro Ser Pro Val lie Pro Val Pro Thr Asp Asp Ser Thr Tyr Thr 400 -101 Asn T r Leu Arg Thr Tyr Thr Tyr Asp Arg Gly Gly Asn Leu Val Gin 4i5 417 lie Arg His Ser Ser Pro Ala Thr Gin Asn Ser Tyr Thr Thr Asp "is 15 433 Thr Val Ser Ser Arg Ser Asn Arg Ala Val Leu Ser Thr Leu Thr Thr 449 Asp Pro Thr Arg Val Asp Ala Leu Phe Asp Ser Gly Gly His Gin Lys 464 20 465 Mec Leu He Pro Gly Gin Asn Leu Asp . Trp Asn lie Arg Gly Glu Leu 430 431 Gin Arg Val Thr Pro Val Ser Arg Glu Asn Ser Ser As Ser Glu Trp 49-5 497 Tyr Arg Tyr Ser Ser Asp Gly Mec Arg Leu Leu Lys Val Ser Glu Gin 1"1 25 513 Gin Thr Gly Asn Ser Thr Gin Val Gin Arg Val Thr Tyr Lau Pro Gly 523 529 Leu Glu Leu Arg Thr Thr Gly Val Ala Asp Lys Thr Thr Giu Asp Leu 544 30 545 Gin Val He Thr Val Gly Glu Ala Gly Arg Ala Gin Val Arg Val Leu 550 551 His Trp Glu Ser Gly Lys Pro Thr Asp Ila Asp Asn Asn Gin Val Arg 5" 6 577 Tyr Ser Tyr Asp Asn Leu Leu Gly Ser Ser Gin Leu Glu Leu Asp Ser 592 35 593 Glu Gly Gin Ila Leu Ser Gin Glu Glu Tyr Tyr Pro Tyr Gly Gly Thr 503 509 Ala He Trp Ala Ala Arg Asn Gin Thr Glu Ala Ser Tyr Lys Phe He 624 40 525 Arg Tyr Sar Gly Lys Glu Arg Asp Ala Thr Gly Leu Tyr Tyr Tyr Gly 640 641 Tyr Arg Tyr Tyr Gin Pro Trp Val Gly Arg Trp Lau Ser Ala Asp Pro 656 657 Ala Gly Thr Val Asp Gly Leu Asn Lau Tyr Arg Mec Val Arg Asn Asn 672 45 573 Pro He Thr Leu Thr Asp His Asp Gly Leu Ala Pro Ser Pro Asn Arg 633 639 Asn Arg Asn Thr Pha Trp Pha Ala Sar Pha Lau Pha Arg Lys Pro Asp 704 50 705 Glu Gly Mac Sar Ala Ser Met Arg Arg Gly Gin Lys Ila Gly Arg Ala 720 721 He Ala Gly Gly He Ala He Gly Gly Leu Ala Ala Thr He Ala Ala 736 37 Thr 
Ala Gly Ala Ala He Pro Val He Leu Gly Val Ala Ala Val Gly "52 55 753 Ala Gly He Gly Ala Leu Mec Gly Tyr Asn Val Gly Ser Leu Leu Giu "53 769 Lys Gly Gly Ala Leu Leu Ala Arg Leu Val Gin Gly Lys Ser Thr Leu "34 (SO "35 Val Gin Ser Ala Ala Gly Ala Ala Ala Gly Ala Ser Ser Ala Ala Ala 300 301 Tyr Gly Ala Arg Ala Gin Gly Val Gly Val Ala Ser Ala Ala Gly Ala 316 317 al Thr Gly Ala Val Gly Ser Trp Ha Asn Asn Aia Asp Arg Gly He 322 ή5 333 Gly Gly Ala He Gly Ala Gly Ser Ala Val Gly Thr Ha Asp Thr Mec ^ 5 -254- SUBSTTTUTE SHEET (RULE 26) 549 Leu Gly Thr Al 5er Thr Leu Thr His Va 1 Gly Ala Al Ai J - V 355 Gly Ala Ala Gly Gly Mec He Thr Gly Thr Gin Gly Ser Thr A g Al ; 3 331 Gly lie His Ala Gly He Gly Thr Tyr Tyr Gly Ser Trp He Gly Phe 355 35? Gly Leu Asp Val Ala Ser Asn Pro Ala Gly His Leu Ala Asn Tyr Ala i ;·ΐ3 V l Gly Tyr Ala Ala Gly Leu Gly Ala Glu Mec Ala Val Asn Arg He 923 525 Mec Gly Gly Gly Phe Leu Ser Arg Leu Leu Gly Arg Val Val Ser P o 544 345 T r Ala Ala Gly Leu Ala Arg Gin Leu Val His Phe Ser Val Ala Arg 550 5ό1 Pro V l Phe Glu Pro He Phe Ser Val Leu Gly Gly Leu Val Gly Gly 9"5 377 lie Gly Thr Gly Leu His Arg Val Mec Gly Arg Glu Ser Trp He Ser 552 553 Arg Ala Leu Ser Ala Ala Gly Ser Gly He Asp His al Ala Gly Mec IOCS 1009 lie Gly Asn Gin He Arg Gly Arg Val Leu Thr Thr Thr Gly He Ala 1024 1025 Asn Ala He Asp T r Gly Thr Ser Ala Val Gly Ala Ala Arg Arg V l 1040 10-11 Phe Ser Leu 1043 -255-
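The listing above pairs each codon triplet with a three-letter residue name. That pairing can be spot-checked against the standard genetic code, which is useful because the scanned text carries OCR damage. The following is a minimal sketch only: the table is the standard genetic code (background knowledge, not stated in the listing), and the names `CODON_TABLE` and `translate` are illustrative. The final row of the listing (`TTT TCT TTG TAA`, read as `Phe Ser Leu End`) and the stated figures (1043 amino acids, last nucleotide position 3132) serve as the check.

```python
from itertools import product

# Standard genetic code, enumerated with bases in T, C, A, G order
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Three-letter names as used in the listing; stop is written "End" there.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}

# Map every DNA codon (e.g. "TTT") to its three-letter residue name.
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate(codons):
    """Translate whitespace-separated DNA codons into three-letter residue names."""
    return [CODON_TABLE[codon] for codon in codons.upper().split()]


# Final row of the nucleotide listing.
print(" ".join(translate("TTT TCT TTG TAA")))  # Phe Ser Leu End

# 1043 residues plus one stop codon should span 3132 nucleotides.
print(3 * (1043 + 1))  # 3132
```

A row whose translation disagrees with the printed residues points to a scanning error in either the codons or the residue names, not necessarily to an error in the original sequence.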

Claims (8)

CLAIMS:
1. A polynucleotide that is operably associated with a heterologous promoter, wherein said polynucleotide encodes a protein that has toxin activity against an insect pest wherein a nucleotide molecule that codes for said protein maintains hybridization with the complement of nucleic acid sequence ID NO:46 after hybridization and wash, and wherein said hybridization is conducted at 60°C in solution containing 10% w/v PEG (polyethylene glycol, M.W. approximately 8000), 0.6X SSC, 7% w/v SDS, 10 mM sodium phosphate buffer, 5 mM EDTA, and 100 mg/ml denatured salmon sperm DNA, said wash is conducted in 0.25X SSC and 0.2% SDS at 60°C.
2. The polynucleotide of claim 1 wherein said protein comprises the amino acid sequence of SEQ ID NO:47.
3. A transgenic plant cell comprising a polynucleotide of claim 1.
4. A recombinant protein having toxin activity against an insect pest wherein a nucleotide sequence that codes for said protein is the nucleotide molecule as defined in claim 1.
5. The recombinant protein of claim 4 wherein said protein comprises amino acid sequence ID NO:47.
6. A method of controlling an insect pest wherein said method comprises feeding a protein of Claim 4 or Claim 5 to said pest.
7. The method of Claim 6, wherein said protein is produced by and is present in a transgenic plant that is accessible to said pest.
8. A method of protecting a plant from an insect which comprises spraying on the plant a toxin comprising the amino acid sequence of SEQ ID NO:47.

For the Applicants, REINHOLD COHN AND PARTNERS
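The salt conditions recited in claim 1 (0.6X SSC for hybridization, 0.25X SSC for the wash) can be put in molar terms. This is a minimal sketch that assumes the conventional definition of 1X SSC as 150 mM NaCl and 15 mM trisodium citrate; that definition is background knowledge and is not stated in the claim. The names `SSC_1X_MM` and `ssc_molarity` are illustrative.

```python
# Conventional 1X SSC composition in mM (assumption: not recited in the claim).
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_molarity(fold):
    """Return the mM concentration of each SSC component at the given fold strength."""
    return {salt: round(conc * fold, 3) for salt, conc in SSC_1X_MM.items()}


hybridization = ssc_molarity(0.6)   # claim 1 hybridization step
wash = ssc_molarity(0.25)           # claim 1 wash step

print(hybridization)  # {'NaCl': 90.0, 'sodium citrate': 9.0}
print(wash)           # {'NaCl': 37.5, 'sodium citrate': 3.75}
```

Both steps run at 60°C, so the wash, with the lower salt concentration, is the more stringent of the two. The figures cover SSC only; the SDS and sodium phosphate in the hybridization solution add further sodium that is not counted here.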
IL121243A 1995-11-06 1996-11-06 Recombinant insecticidal protein toxin from photorhabdus, polynucleotide encoding said protein and method of controlling pests using said protein IL121243A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US725595P 1995-11-06 1995-11-06
US60842396A 1996-02-28 1996-02-28
US70548496A 1996-08-29 1996-08-29
PCT/US1996/018003 WO1997017432A1 (en) 1995-11-06 1996-11-06 Insecticidal protein toxins from photorhabdus

Publications (2)

Publication Number Publication Date
IL121243A0 IL121243A0 (en) 1998-01-04
IL121243A true IL121243A (en) 2010-05-31

Family

ID=27358315

Family Applications (1)

Application Number Title Priority Date Filing Date
IL121243A IL121243A (en) 1995-11-06 1996-11-06 Recombinant insecticidal protein toxin from photorhabdus, polynucleotide encoding said protein and method of controlling pests using said protein

Country Status (13)

Country Link
EP (1) EP0797659A4 (en)
JP (2) JP3482214B2 (en)
KR (1) KR100354530B1 (en)
AU (1) AU729228B2 (en)
BR (1) BR9606889A (en)
CA (1) CA2209659C (en)
HU (1) HUP9900768A3 (en)
IL (1) IL121243A (en)
MX (1) MX9705101A (en)
PL (1) PL186242B1 (en)
RO (1) RO121280B1 (en)
SK (1) SK93197A3 (en)
WO (1) WO1997017432A1 (en)

Families Citing this family (141)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9618083D0 (en) * 1996-08-29 1996-10-09 Mini Agriculture & Fisheries Pesticidal agents
EP0915909B1 (en) * 1997-05-05 2007-06-13 Dow AgroSciences LLC Insecticidal protein toxins from xenorhabdus
US5997269A (en) * 1997-06-20 1999-12-07 Mycogen Corporation Means for discovering microbes
AUPO808897A0 (en) 1997-07-17 1997-08-14 Commonwealth Scientific And Industrial Research Organisation Toxin genes from the bacteria xenorhabdus nematophilus and photohabdus luminescens
US6281413B1 (en) 1998-02-20 2001-08-28 Syngenta Participations Ag Insecticidal toxins from Photorhabdus luminescens and nucleic acid sequences coding therefor
JP2002504336A (en) * 1998-02-20 2002-02-12 ノバルティス アクチエンゲゼルシャフト Insecticidal toxins from Photolabdus
US6174860B1 (en) 1999-04-16 2001-01-16 Novartis Ag Insecticidal toxins and nucleic acid sequences coding therefor
GB9901499D0 (en) * 1999-01-22 1999-03-17 Horticulture Res Int Biological control
AUPP911399A0 (en) * 1999-03-10 1999-04-01 Commonwealth Scientific And Industrial Research Organisation Plants and feed baits for controlling damage
EP1069134A1 (en) * 1999-07-15 2001-01-17 Wisconsin Alumni Research Foundation Photorhabdus luminescens strains
WO2001016305A2 (en) * 1999-09-02 2001-03-08 Agresearch Limited Nucleotide sequences encoding an insectidal protein complex from serratia
FR2803592A1 (en) 2000-01-06 2001-07-13 Aventis Cropscience Sa NOVEL DERIVATIVES OF 3-HYDROXYPICOLINIC ACID, PROCESS FOR THEIR PREPARATION AND FUNGICIDAL COMPOSITIONS CONTAINING SAME
US8440880B2 (en) 2000-06-30 2013-05-14 Monsanto Technology Llc Xenorhabdus sp. genome sequences and uses thereof
FR2815969B1 (en) 2000-10-30 2004-12-10 Aventis Cropscience Sa TOLERANT PLANTS WITH HERBICIDES BY METABOLIC BYPASS
ES2311731T3 (en) * 2002-06-28 2009-02-16 Dow Agrosciences Llc PROTEINS OF PESTICIDE ACTION AND POLYNUCLEOTIDES THAT CAN BE OBTAINED FROM PAENIBACILLUS SPECIES.
BRPI0406856A (en) 2003-01-21 2005-12-27 Dow Agrosciences Llc Mixture and compatibility of tc proteins for pest control
AR042717A1 (en) 2003-01-21 2005-06-29 Dow Agrosciences Llc XENORHABDUS CT PROTEINS AND GENES FOR PEST CONTROL
US7319142B1 (en) 2004-08-31 2008-01-15 Monsanto Technology Llc Nucleotide and amino acid sequences from Xenorhabdus and uses thereof
EP2195436A2 (en) * 2007-08-31 2010-06-16 BASF Plant Science GmbH Pathogen control genes and methods of use in plants
TW201142029A (en) 2009-11-24 2011-12-01 Univ Leuven Kath Banana promoters
EA201290572A1 (en) 2009-12-23 2013-05-30 Байер Интеллектуэль Проперти Гмбх PLANTS RESISTANT TO HERBICIDES - HPPD INHIBITORS
CA2785211C (en) 2009-12-23 2018-12-11 Bayer Intellectual Property Gmbh Plants tolerant to hppd inhibitor herbicides
CN102762725A (en) 2009-12-23 2012-10-31 拜尔知识产权有限公司 Plants tolerant to hppd inhibitor herbicides
AR079972A1 (en) 2009-12-23 2012-03-07 Bayer Cropscience Ag TOLERANT PLANTS TO INHIBITING HERBICIDES OF HPPD
UY33140A (en) 2009-12-23 2011-07-29 Bayer Cropscience Ag TOLERANT PLANTS TO INHIBITING HERBICIDES OF HPPD
AU2011212538B2 (en) 2010-02-02 2014-12-04 BASF Agricultural Solutions Seed US LLC Soybean transformation using HPPD inhibitors as selection agents
US20120088719A1 (en) 2010-09-20 2012-04-12 Jerald Coleman Ensign Mosquitocidal Xenorhabdus, Lipopeptide And Methods
ES2588802T3 (en) 2010-11-10 2016-11-04 Bayer Cropscience Ag HPPD variants and usage procedures
MX2013010821A (en) 2011-03-25 2013-10-17 Bayer Ip Gmbh Use of n-(1,2,5-oxadiazol-3-yl)benzamides for controlling unwanted plants in areas of transgenic crop plants being tolerant to hppd inhibitor herbicides.
AU2012234449B2 (en) 2011-03-25 2016-05-12 Bayer Intellectual Property Gmbh Use of N-(tetrazol-4-yl)- or N-(triazol-3-yl)arylcarboxamides or their salts for controlling unwanted plants in areas of transgenic crop plants being tolerant to hppd inhibitor herbicides
KR101246707B1 (en) * 2012-08-01 2013-03-25 ㈜엠알이노베이션 Nematocide compound containing photorhabdus temperata subsp. temperata
UA119532C2 (en) 2012-09-14 2019-07-10 Байєр Кропсайєнс Лп Hppd variants and methods of use
EP2964767B1 (en) 2013-03-07 2019-12-18 BASF Agricultural Solutions Seed US LLC Toxin genes and methods for their use
MX2016011745A (en) 2014-03-11 2017-09-01 Bayer Cropscience Lp Hppd variants and methods of use.
JP6873979B2 (en) 2015-09-11 2021-05-19 バイエル・クロップサイエンス・アクチェンゲゼルシャフト HPPD mutants and usage
US11091772B2 (en) 2016-11-23 2021-08-17 BASF Agricultural Solutions Seed US LLC AXMI669 and AXMI991 toxin genes and methods for their use
EP3559241A1 (en) 2016-12-22 2019-10-30 Basf Agricultural Solutions Seed Us Llc Use of cry14 for the control of nematode pests
WO2018136611A1 (en) 2017-01-18 2018-07-26 Bayer Cropscience Lp Use of bp005 for the control of plant pathogens
CN110431234B (en) 2017-01-18 2024-04-16 巴斯夫农业种子解决方案美国有限责任公司 BP005 toxin gene and methods of use thereof
US11708565B2 (en) 2017-03-07 2023-07-25 BASF Agricultural Solutions Seesi US LLC HPPD variants and methods of use
US20210032651A1 (en) 2017-10-24 2021-02-04 Basf Se Improvement of herbicide tolerance to hppd inhibitors by down-regulation of putative 4-hydroxyphenylpyruvate reductases in soybean
BR112020008092A2 (en) 2017-10-24 2020-09-15 BASF Agricultural Solutions Seed US LLC method for checking tolerance to a GM herbicide and soy plant
AR119426A1 (en) 2019-07-22 2021-12-15 Bayer Ag 5-AMINO PYRAZOLES AND TRIAZOLES AS PESTICIDES
US20220274947A1 (en) 2019-07-23 2022-09-01 Bayer Aktiengesellschaft Novel heteroaryl-triazole compounds as pesticides
UY38794A (en) 2019-07-23 2021-02-26 Bayer Ag NOVEL HETEROARYL-TRIAZOLE COMPOUNDS AS PESTICIDES
EP3701796A1 (en) 2019-08-08 2020-09-02 Bayer AG Active compound combinations
US20220403410A1 (en) 2019-09-26 2022-12-22 Bayer Aktiengesellschaft Rnai-mediated pest control
MX2022003964A (en) 2019-10-02 2022-04-25 Bayer Ag Active compound combinations comprising fatty acids.
AU2020363864A1 (en) 2019-10-09 2022-04-21 Bayer Aktiengesellschaft Novel heteroaryl-triazole compounds as pesticides
DK4041721T3 (en) 2019-10-09 2024-05-27 Bayer Ag PREVIOUSLY UNKNOWN HETEROARYLTRIAZOLE COMPOUNDS AS PESTICIDES
WO2021089673A1 (en) 2019-11-07 2021-05-14 Bayer Aktiengesellschaft Substituted sulfonyl amides for controlling animal pests
WO2021097162A1 (en) 2019-11-13 2021-05-20 Bayer Cropscience Lp Beneficial combinations with paenibacillus
TW202134226A (en) 2019-11-18 2021-09-16 德商拜耳廠股份有限公司 Novel heteroaryl-triazole compounds as pesticides
EP4061131A1 (en) 2019-11-18 2022-09-28 Bayer Aktiengesellschaft Active compound combinations comprising fatty acids
TW202136248A (en) 2019-11-25 2021-10-01 德商拜耳廠股份有限公司 Novel heteroaryl-triazole compounds as pesticides
CN115335392A (en) 2020-01-31 2022-11-11 成对植物服务股份有限公司 Suppression of plant shade-avoidance response
CN115551839B (en) 2020-02-18 2024-06-04 拜耳公司 Heteroaryl-triazole compounds as pesticides
EP3708565A1 (en) 2020-03-04 2020-09-16 Bayer AG Pyrimidinyloxyphenylamidines and the use thereof as fungicides
CA3180157A1 (en) 2020-04-16 2021-10-21 Pairwise Plants Services, Inc. Methods for controlling meristem size for crop improvement
WO2021209490A1 (en) 2020-04-16 2021-10-21 Bayer Aktiengesellschaft Cyclaminephenylaminoquinolines as fungicides
BR112022021264A2 (en) 2020-04-21 2023-02-14 Bayer Ag 2-(HET)ARYL SUBSTITUTED HETEROCYCLIC DERIVATIVES AS PESTICIDES
TW202208347A (en) 2020-05-06 2022-03-01 德商拜耳廠股份有限公司 Novel heteroaryl-triazole compounds as pesticides
JP2023538713A (en) 2020-05-06 2023-09-11 バイエル、アクチエンゲゼルシャフト Pyridine(thio)amide as a fungicidal compound
US20230180756A1 (en) 2020-05-12 2023-06-15 Bayer Aktiengesellschaft Triazine and pyrimidine (thio)amides as fungicidal compounds
WO2021233861A1 (en) 2020-05-19 2021-11-25 Bayer Aktiengesellschaft Azabicyclic(thio)amides as fungicidal compounds
EP4156909A1 (en) 2020-06-02 2023-04-05 Pairwise Plants Services, Inc. Methods for controlling meristem size for crop improvement
CN115803320A (en) 2020-06-04 2023-03-14 拜耳公司 Heterocyclyl pyrimidines and triazines as novel fungicides
WO2021249995A1 (en) 2020-06-10 2021-12-16 Bayer Aktiengesellschaft Azabicyclyl-substituted heterocycles as fungicides
CN115943212A (en) 2020-06-17 2023-04-07 成对植物服务股份有限公司 Method for controlling meristem size to improve crop plants
EP4167738A1 (en) 2020-06-18 2023-04-26 Bayer Aktiengesellschaft Composition for use in agriculture
KR20230026388A (en) 2020-06-18 2023-02-24 바이엘 악티엔게젤샤프트 3-(pyridazin-4-yl)-5,6-dihydro-4H-1,2,4-oxadiazine derivatives as fungicides for crop protection
UY39276A (en) 2020-06-19 2022-01-31 Bayer Ag USE OF 1,3,4-OXADIAZOL-2-ILPYRIMIDINE COMPOUNDS TO CONTROL PHYTOPATHOGENIC MICROORGANISMS, METHODS OF USE AND COMPOSITIONS.
UY39275A (en) 2020-06-19 2022-01-31 Bayer Ag 1,3,4-OXADIAZOLE PYRIMIDINES AS FUNGICIDES, PROCESSES AND INTERMEDIARIES FOR THEIR PREPARATION, METHODS OF USE AND USES OF THE SAME
WO2021255089A1 (en) 2020-06-19 2021-12-23 Bayer Aktiengesellschaft 1,3,4-oxadiazole pyrimidines and 1,3,4-oxadiazole pyridines as fungicides
BR112022025692A2 (en) 2020-06-19 2023-02-28 Bayer Ag 1,3,4-OXADIAZOLES AND THEIR DERIVATIVES AS FUNGICIDES
EP3929189A1 (en) 2020-06-25 2021-12-29 Bayer Animal Health GmbH Novel heteroaryl-substituted pyrazine derivatives as pesticides
CN116033828A (en) 2020-07-02 2023-04-28 拜耳公司 Heterocyclic derivatives as pest control agents
WO2022033991A1 (en) 2020-08-13 2022-02-17 Bayer Aktiengesellschaft 5-amino substituted triazoles as pest control agents
WO2022053453A1 (en) 2020-09-09 2022-03-17 Bayer Aktiengesellschaft Azole carboxamide as pest control agents
WO2022058327A1 (en) 2020-09-15 2022-03-24 Bayer Aktiengesellschaft Substituted ureas and derivatives as new antifungal agents
EP3974414A1 (en) 2020-09-25 2022-03-30 Bayer AG 5-amino substituted pyrazoles and triazoles as pesticides
EP3915971A1 (en) 2020-12-16 2021-12-01 Bayer Aktiengesellschaft Phenyl-s(o)n-phenylamidines and the use thereof as fungicides
EP4262394A1 (en) 2020-12-18 2023-10-25 Bayer Aktiengesellschaft Use of dhodh inhibitor for controlling resistant phytopathogenic fungi in crops
WO2022129190A1 (en) 2020-12-18 2022-06-23 Bayer Aktiengesellschaft (hetero)aryl substituted 1,2,4-oxadiazoles as fungicides
WO2022129188A1 (en) 2020-12-18 2022-06-23 Bayer Aktiengesellschaft 1,2,4-oxadiazol-3-yl pyrimidines as fungicides
WO2022129196A1 (en) 2020-12-18 2022-06-23 Bayer Aktiengesellschaft Heterobicycle substituted 1,2,4-oxadiazoles as fungicides
EP4036083A1 (en) 2021-02-02 2022-08-03 Bayer Aktiengesellschaft 5-oxy substituted heterocycles as pesticides
BR112023015909A2 (en) 2021-02-11 2023-11-21 Monsanto Technology Llc METHODS AND COMPOSITIONS FOR MODIFYING CYTOKININ OXIDASE LEVELS IN PLANTS
CN117203227A (en) 2021-02-25 2023-12-08 成对植物服务股份有限公司 Methods and compositions for modifying root architecture in plants
BR112023019400A2 (en) 2021-03-30 2023-12-05 Bayer Ag 3-(HETERO)ARYL-5-CHLORODIFLOROMETHYL-1,2,4-OXADIAZOLE AS A FUNGICIDE
BR112023019788A2 (en) 2021-03-30 2023-11-07 Bayer Ag 3-(HETERO)ARYL-5-CHLORODIFLOROMETHYL-1,2,4-OXADIAZOLE AS A FUNGICIDE
EP4334315A1 (en) 2021-05-06 2024-03-13 Bayer Aktiengesellschaft Alkylamide substituted, annulated imidazoles and use thereof as insecticides
EP4337661A1 (en) 2021-05-12 2024-03-20 Bayer Aktiengesellschaft 2-(het)aryl-substituted condensed heterocycle derivatives as pest control agents
EP4355083A1 (en) 2021-06-17 2024-04-24 Pairwise Plants Services, Inc. Modification of growth regulating factor family transcription factors in soybean
UY39827A (en) 2021-06-24 2023-01-31 Pairwise Plants Services Inc MODIFICATION OF UBIQUITIN LIGASE E3 HECT GENES TO IMPROVE PERFORMANCE TRAITS
US20230027468A1 (en) 2021-07-01 2023-01-26 Pairwise Plants Services, Inc. Methods and compositions for enhancing root system development
CA3229056A1 (en) 2021-08-12 2023-02-16 Pairwise Plants Services, Inc. Modification of brassinosteroid receptor genes to improve yield traits
EP4384016A1 (en) 2021-08-13 2024-06-19 Bayer Aktiengesellschaft Active compound combinations and fungicide compositions comprising those
CA3229224A1 (en) 2021-08-17 2023-02-23 Pairwise Plants Services, Inc. Methods and compositions for modifying cytokinin receptor histidine kinase genes in plants
EP4392419A1 (en) 2021-08-25 2024-07-03 Bayer Aktiengesellschaft Novel pyrazinyl-triazole compounds as pesticides
WO2023034731A1 (en) 2021-08-30 2023-03-09 Pairwise Plants Services, Inc. Modification of ubiquitin binding peptidase genes in plants for yield trait improvement
AR126938A1 (en) 2021-09-02 2023-11-29 Pairwise Plants Services Inc METHODS AND COMPOSITIONS TO IMPROVE PLANT ARCHITECTURE AND PERFORMANCE TRAITS
EP4144739A1 (en) 2021-09-02 2023-03-08 Bayer Aktiengesellschaft Anellated pyrazoles as parasiticides
CA3232804A1 (en) 2021-09-21 2023-03-30 Pairwise Plants Services, Inc. Methods and compositions for reducing pod shatter in canola
CA3237641A1 (en) 2021-10-04 2023-04-13 Pairwise Plants Services, Inc. Methods for improving floret fertility and seed yield
AR127300A1 (en) 2021-10-07 2024-01-10 Pairwise Plants Services Inc METHODS TO IMPROVE FLOWER FERTILITY AND SEED YIELD
WO2023078915A1 (en) 2021-11-03 2023-05-11 Bayer Aktiengesellschaft Bis(hetero)aryl thioether (thio)amides as fungicidal compounds
WO2023099445A1 (en) 2021-11-30 2023-06-08 Bayer Aktiengesellschaft Bis(hetero)aryl thioether oxadiazines as fungicidal compounds
AR127904A1 (en) 2021-12-09 2024-03-06 Pairwise Plants Services Inc METHODS TO IMPROVE FLOWER FERTILITY AND SEED YIELD
AR128372A1 (en) 2022-01-31 2024-04-24 Pairwise Plants Services Inc SUPPRESSION OF THE SHADE AVOIDANCE RESPONSE IN PLANTS
WO2023148031A1 (en) 2022-02-01 2023-08-10 Globachem Nv Methods and compositions for controlling pests in cotton
WO2023148028A1 (en) 2022-02-01 2023-08-10 Globachem Nv Methods and compositions for controlling pests
WO2023148036A1 (en) 2022-02-01 2023-08-10 Globachem Nv Methods and compositions for controlling pests in soybean
WO2023148035A1 (en) 2022-02-01 2023-08-10 Globachem Nv Methods and compositions for controlling pests in rice
WO2023148029A1 (en) 2022-02-01 2023-08-10 Globachem Nv Methods and compositions for controlling pests in cereals
WO2023148030A1 (en) 2022-02-01 2023-08-10 Globachem Nv Methods and compositions for controlling pests in corn
WO2023168217A1 (en) 2022-03-02 2023-09-07 Pairwise Plants Services, Inc. Modification of brassinosteroid receptor genes to improve yield traits
WO2023192838A1 (en) 2022-03-31 2023-10-05 Pairwise Plants Services, Inc. Early flowering rosaceae plants with improved characteristics
WO2023196886A1 (en) 2022-04-07 2023-10-12 Pairwise Plants Services, Inc. Methods and compositions for improving resistance to fusarium head blight
WO2023205714A1 (en) 2022-04-21 2023-10-26 Pairwise Plants Services, Inc. Methods and compositions for improving yield traits
WO2023215704A1 (en) 2022-05-02 2023-11-09 Pairwise Plants Services, Inc. Methods and compositions for enhancing yield and disease resistance
WO2023213626A1 (en) 2022-05-03 2023-11-09 Bayer Aktiengesellschaft Use of (5s)-3-[3-(3-chloro-2-fluorophenoxy)-6-methylpyridazin-4-yl]-5-(2-chloro-4-methylbenzyl)-5,6-dihydro-4h-1,2,4-oxadiazine for controlling unwanted microorganisms
AU2023263693A1 (en) 2022-05-03 2024-10-31 Bayer Aktiengesellschaft Crystalline forms of (5s)-3-[3-(3-chloro-2-fluorophenoxy)-6-methylpyridazin-4-yl]-5-(2-chloro-4-methylbenzyl)-5,6-dihydro-4h-1,2,4-oxadiazine
US20230416767A1 (en) 2022-05-05 2023-12-28 Pairwise Plants Services, Inc. Methods and compositions for modifying root architecture and/or improving plant yield traits
AR129709A1 (en) 2022-06-27 2024-09-18 Pairwise Plants Services Inc METHODS AND COMPOSITIONS TO MODIFY SHADE ESCAPE IN PLANTS
AR129748A1 (en) 2022-06-29 2024-09-25 Pairwise Plants Services Inc METHODS AND COMPOSITIONS FOR CONTROLLING MERISTEM SIZE FOR CROP IMPROVEMENT
US20240000031A1 (en) 2022-06-29 2024-01-04 Pairwise Plants Services, Inc. Methods and compositions for controlling meristem size for crop improvement
WO2024030984A1 (en) 2022-08-04 2024-02-08 Pairwise Plants Services, Inc. Methods and compositions for improving yield traits
WO2024036240A1 (en) 2022-08-11 2024-02-15 Pairwise Plants Services, Inc. Methods and compositions for controlling meristem size for crop improvement
WO2024054880A1 (en) 2022-09-08 2024-03-14 Pairwise Plants Services, Inc. Methods and compositions for improving yield characteristics in plants
EP4295688A1 (en) 2022-09-28 2023-12-27 Bayer Aktiengesellschaft Active compound combination
WO2024068519A1 (en) 2022-09-28 2024-04-04 Bayer Aktiengesellschaft 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide
WO2024068518A1 (en) 2022-09-28 2024-04-04 Bayer Aktiengesellschaft 3-heteroaryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide
WO2024068517A1 (en) 2022-09-28 2024-04-04 Bayer Aktiengesellschaft 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide
WO2024068520A1 (en) 2022-09-28 2024-04-04 Bayer Aktiengesellschaft 3-(hetero)aryl-5-chlorodifluoromethyl-1,2,4-oxadiazole as fungicide
EP4385326A1 (en) 2022-12-15 2024-06-19 Kimitec Biogorup Biopesticide composition and method for controlling and treating broad spectrum of pests and diseases in plants
WO2024137438A2 (en) 2022-12-19 2024-06-27 BASF Agricultural Solutions Seed US LLC Insect toxin genes and methods for their use
KR20240100589A (en) * 2022-12-22 2024-07-02 주식회사 남보 Photorhabdus cinerea NB-YG4-3 strain, composition and control method for wilt disease using the same
WO2024173622A1 (en) 2023-02-16 2024-08-22 Pairwise Plants Services, Inc. Methods and compositions for modifying shade avoidance in plants
US20240294933A1 (en) 2023-03-02 2024-09-05 Pairwise Plants Services, Inc. Methods and compositions for modifying shade avoidance in plants
US20240301438A1 (en) 2023-03-09 2024-09-12 Pairwise Plants Services, Inc. Modification of brassinosteroid signaling pathway genes for improving yield traits in plants

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL76312A (en) * 1984-09-05 1991-06-10 Biotech Australia Pty Ltd Xenocoumacins and derivatives thereof,their preparation and pharmaceutical compositions containing them
US5254799A (en) * 1985-01-18 1993-10-19 Plant Genetic Systems N.V. Transformation vectors allowing expression of Bacillus thuringiensis endotoxins in plants
US5039523A (en) * 1988-10-27 1991-08-13 Mycogen Corporation Novel Bacillus thuringiensis isolate denoted B.t. PS81F, active against lepidopteran pests, and a gene encoding a lepidopteran-active toxin
JPH09500264A (en) * 1993-06-25 1997-01-14 コモンウェルス サイエンティフィック アンド インダストリアル リサーチ オーガニゼーション Xenorhabdus nematophilus virulence gene
AU7513994A (en) * 1993-07-27 1995-02-28 Agro-Biotech Corporation Novel fungicidal properties of metabolites, culture broth, stilbene derivatives and indole derivatives produced by the bacteria (xenorhabdus) and (photorhabdus) spp.
GB9618083D0 (en) * 1996-08-29 1996-10-09 Mini Agriculture & Fisheries Pesticidal agents

Also Published As

Publication number Publication date
IL121243A0 (en) 1998-01-04
WO1997017432A1 (en) 1997-05-15
AU729228B2 (en) 2001-01-25
JP2004089189A (en) 2004-03-25
JP3482214B2 (en) 2003-12-22
KR19980701244A (en) 1998-05-15
PL186242B1 (en) 2003-12-31
EP0797659A1 (en) 1997-10-01
CA2209659A1 (en) 1997-05-15
EP0797659A4 (en) 1998-11-11
JP3657593B2 (en) 2005-06-08
BR9606889A (en) 1997-10-28
RO121280B1 (en) 2007-02-28
JP2002509424A (en) 2002-03-26
MX9705101A (en) 1997-10-31
CA2209659C (en) 2008-01-15
KR100354530B1 (en) 2003-01-06
HUP9900768A2 (en) 1999-06-28
HUP9900768A3 (en) 2002-10-28
AU1050997A (en) 1997-05-29
SK93197A3 (en) 1998-05-06
PL321212A1 (en) 1997-11-24

Similar Documents

Publication Publication Date Title
CA2209659C (en) Insecticidal protein toxins from photorhabdus
AU2829997A (en) Insecticidal protein toxins from photorhabdus
US6048838A (en) Insecticidal protein toxins from xenorhabdus
EP2142009B1 (en) Hemipteran- and coleopteran- active toxin proteins from bacillus thuringiensis
US6528484B1 (en) Insecticidal protein toxins from Photorhabdus
US7569748B2 (en) Nucleic acid encoding an insecticidal protein toxin from photorhabdus
US6280722B1 (en) Antifungal Bacillus thuringiensis strains
ES2348509T5 (en) New insecticidal proteins from Bacillus thuringiensis
AU9712501A (en) Insecticidal protein toxins from photorhabdus
MXPA99001288A (en) Insecticidal protein toxins from xenorhabdus
UA82485C2 (en) Insecticide protein toxins from photorhabdus
AU2013273706A1 (en) Hemipteran- and Coleopteran- active toxin proteins from Bacillus thuringiensis

Legal Events

Date Code Title Description
FF Patent granted
MM9K Patent not in force due to non-payment of renewal fees