WO2010016071A2 - Identification of genomic signature for differentiating highly similar sequence variants of an organism - Google Patents
- Publication number
- WO2010016071A2 (PCT/IN2009/000442)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- genotype
- hbv
- polynucleotide sequence
- genotypes
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/70—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
- C12Q1/701—Specific hybridization probes
- C12Q1/702—Specific hybridization probes for retroviruses
- C12Q1/703—Viruses associated with AIDS
- C12Q1/706—Specific hybridization probes for hepatitis
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the primer extension reaction was performed using the oligonucleotide as set forth in SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37; SEQ ID NO: 38; SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41 and SEQ ID NO: 42.
- Another embodiment of the present invention relates to the method of detecting one or more HIV genotypes in a sample using the HIV genomic signature as disclosed in the instant invention, wherein the step of analyzing the presence of genomic signature specific for HIV genotype further comprises selecting a pair of oligonucleotides to amplify the specific single nucleotide variation, performing a primer extension reaction to generate extended primer and analyzing the extended primer to identify the genomic signature of the HIV genotype.
- Another embodiment of the present invention relates to the GSP system (600) as disclosed in the instant invention, wherein the GSP system (600) is communicatively coupled to a sequential database (610) that stores multiple sequences for each of the genotypes.
- HBV DNA sequences were retrieved from the Hepatitis Virus Database (http://s2asO2.genes.nig.ac.jp/), and a phylogenetic tree was constructed using the Neighbor Joining Phylogenetic method.
- Phylogenetic analysis of the various Hepatitis B virus sequences from the database reveals eight genotypes of HBV isolates infecting humans. The isolates clustering together on the phylogram show similarity in their sequences. The DNA sequences were seen to cluster on the basis of the different gene types or genotypes of the HBV isolate.
- a genotype specific consensus sequence was generated from NCBI reference strains belonging to the particular HBV genotype.
- the eight consensus sequences, representing eight different HBV genotypes were then aligned together to identify individual sites (nucleotide positions) on HBV genome that are unique for a particular genotype. That is to say, that the nucleotide is present in only one genotype at a specific site whereas the other genotypes have a different nucleotide at that site.
- 132 unique sites were identified from the Multiple Sequence Alignment of reference DNA sequences of HBV as provided in the database (NCBI). The distribution of these sites was different for different HBV genotypes.
- the predictive value (probability of classification) for each site was calculated using the formula:
- Step 3: 527 HBV non-genotype-A sequences were taken. The 6 HBV-A sites, given previously, were looked up separately in each genotype (Table 6).
- Site 2 (a single site corresponding to position 439 of representative sequence SEQ ID NO: 1) is sufficient to classify an HBV-A sequence correctly, with the least probability of misclassifying an HBV-non-A genotype sequence as HBV-A genotype.
- Step 5: The probability of misclassification in HBV-A and HBV-non-A genotypes taken together for Site 2 is as given below:
- nucleic acid is added to the sample that leads to selective binding of nucleic acid to glass fiber of filter tube.
- the nucleic acid remains bound while a series of rapid "wash and spin" steps remove contaminating cellular components. Finally, the nucleic acids are eluted in low salt concentrated buffer.
Abstract
The present invention provides a process of identification of genomic signature for differentiating highly similar sequence variants of an organism. The present invention also provides unique genomic signature of an organism that can be used as a tool or an identifier of a species of an organism for differentiating highly similar sequence variants of the organism. The present invention further provides the nucleotide probe and/or primers useful for identifying the genomic signature of an organism.
Description
IDENTIFICATION OF GENOMIC SIGNATURE FOR DIFFERENTIATING HIGHLY SIMILAR SEQUENCE VARIANTS OF AN ORGANISM
FIELD OF INVENTION
The present invention relates to identification of genomic signature for differentiating highly similar sequence variants of an organism.
BACKGROUND OF THE INVENTION
More than 350 million people in the world are chronically infected with Hepatitis B Virus (HBV), and approximately 500,000 to 1,000,000 deaths worldwide are attributed to it (Lavanchy D. Worldwide epidemiology of HBV infection, disease burden and vaccine prevention. J Clin Virol (2005), 34). India alone has the world's second largest pool of carriers, with a prevalence rate of ~2-4%, of which 10% are highly infectious (http://www.whoindia.org). To date, 8 HBV genotypes have been described, each one from specific geographical regions of the world. HBV genotypes have been reported to influence the progression of disease, development of chronicity, risk of development of hepatocellular carcinoma, evolution of mutant strains and response to antiviral therapy. Thus knowledge about genotypes may help in prognostication, treatment and development of prevention strategies. Similarly, owing to immune selection pressure, HBV undergoes a number of mutations which again have an effect on the severity of disease and response to therapy. The Indian subcontinent is home to a large repertoire of genetically variable hepatitis B viruses that may co-circulate. The existence of such mixed infections with different genotypes and/or mutants may have a significant effect on the natural course of disease.
Conventional techniques used thus far by various researchers are prone to variation in results due to experimental constraints, and are limited in their ability to recognize strains with a different genetic make-up.
SUMMARY OF THE INVENTION
The present invention relates to a process of identification of genomic signature for differentiating highly similar sequence variants of an organism. The process disclosed in the present invention is useful for detection of the genotype of an organism. Genomic signatures of genotypes of HBV and HIV, obtained using the process disclosed in the present invention, are provided. The process of identification of a genomic signature is useful for identification of genomic signatures of any variants of any organism (Figure 1).
One aspect of the present invention relates to a method of generating a genomic signature of a genotype within a species of an organism, said method comprising
a. providing a consensus polynucleotide sequence of each of the genotypes, wherein the consensus sequence is obtained from a small subset of polynucleotide sequences of said genotype, wherein said subset comprises 5 to 10% of total polynucleotide sequences of said genotype,
b. aligning the consensus polynucleotide sequences of said genotypes,
c. obtaining all unique sites having a single nucleotide variation at a specific nucleotide position in the consensus polynucleotide sequence,
d. validating said unique sites with a larger subset comprising the polynucleotide sequences of said genotype,
e. selecting at least one unique site to generate the genomic signature of said genotype, wherein the frequency of occurrence of said unique site is in the range of 0.85-1.0 in said genotype and 0.001-0.05 in other genotypes, wherein the genomic signature is specific to said genotype and different in other genotypes.
Another aspect of the present invention relates to a method of detecting one or more HBV genotypes in a sample using the HBV genomic signature, said method comprising providing a sample; and analyzing the presence of genomic signature specific for HBV genotype wherein the genomic signature of a specific HBV genotype is the presence of specific single nucleotide variation at one or more specific nucleotide positions in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1; whereby the presence of adenine at nucleotide position 439 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-A genotype, the presence of guanine at nucleotide position 411 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-B genotype, the presence of adenine at nucleotide position 494 and guanine at nucleotide position 2786 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-C genotype, the presence of adenine at nucleotide position 293 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-D genotype, the presence of thymine at nucleotide position 2964 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-E genotype, the presence of adenine at nucleotide position 631 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-F genotype, the presence of adenine at nucleotide position 1003 and guanine at nucleotide position 1620 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-G genotype, the presence of guanine at nucleotide position 1321 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-H genotype.
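The frequency criterion of the generating method (a unique site occurring at 0.85-1.0 within the genotype and at no more than 0.05 in the other genotypes) can be illustrated with a minimal Python sketch. The function names, the toy sequences, and the decision to treat the stated 0.001 lower bound for other genotypes as non-binding are assumptions made for illustration; sequences are assumed to be already aligned to a common coordinate system.

```python
# Thresholds quoted in the selection step of the method above.
MIN_IN_GENOTYPE = 0.85   # lower bound of the 0.85-1.0 range
MAX_IN_OTHERS = 0.05     # upper bound of the 0.001-0.05 range


def base_frequency(sequences, position, base):
    """Fraction of sequences carrying `base` at the 1-based `position`."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if seq[position - 1].upper() == base)
    return hits / len(sequences)


def passes_frequency_criterion(own, others, position, base):
    """True if `base` at `position` is frequent in the genotype and rare elsewhere.

    `own` and `others` are hypothetical lists of aligned sequences for the
    genotype of interest and for all remaining genotypes, respectively.
    """
    f_own = base_frequency(own, position, base)
    f_other = base_frequency(others, position, base)
    return f_own >= MIN_IN_GENOTYPE and f_other <= MAX_IN_OTHERS


# Toy data (not real HBV sequences): 9 of 10 genotype members carry A at position 4.
own_toy = ["ACGA"] * 9 + ["ACGT"]
others_toy = ["ACGT"] * 20
print(passes_frequency_criterion(own_toy, others_toy, 4, "A"))
```

A site that is conserved across all genotypes, such as position 1 in the toy data, fails the check because its frequency in the other genotypes is far above 0.05.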
Yet another preferred aspect of the present invention discloses a method of detecting HBV-A and HBV-D genotypes in a sample using the HBV-A and HBV-D genomic signature, said method comprising providing a sample; and analyzing the presence of genomic signatures specific for HBV-A and HBV-D genotypes, wherein the genomic signature for HBV-A genotype is the presence of specific single nucleotide variation at nucleotide position 439 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1 and the genomic signature for HBV-D genotype is the presence of specific single nucleotide variation at nucleotide position 293 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 439 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-A genotype, and the presence of adenine at nucleotide position 293 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-D genotype.
Yet another aspect of the present invention relates to a method of detecting HBV-C genotype in a sample using the HBV-C genomic signature, said method comprising providing a sample; and analyzing the presence of genomic signature specific for HBV-C genotype, wherein the genomic signature for HBV-C genotype is the presence of specific single nucleotide variation at nucleotide positions 494 and 2786 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 494 and guanine at nucleotide position 2786 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-C genotype.
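The detection aspects above amount to a lookup of observed bases against a small table of signature sites. The following Python sketch shows one way such a lookup could be written. The positions and bases are copied from the method-of-detection paragraph of this summary; the table name, the function name, and the input format (a mapping from SEQ ID NO: 1 position to observed base) are hypothetical and are not taken from the specification.

```python
# Signature bases at 1-based positions of SEQ ID NO: 1, as listed in the
# method-of-detection paragraph of the summary.
HBV_SIGNATURES = {
    "A": {439: "A"},
    "B": {411: "G"},
    "C": {494: "A", 2786: "G"},
    "D": {293: "A"},
    "E": {2964: "T"},
    "F": {631: "A"},
    "G": {1003: "A", 1620: "G"},
    "H": {1321: "G"},
}


def call_genotypes(bases_by_position):
    """Return the genotypes whose every signature site matches the observation.

    `bases_by_position` is a hypothetical input: a dict mapping a 1-based
    position in SEQ ID NO: 1 coordinates to the base observed there.
    Positions that were not interrogated are simply absent from the dict.
    """
    calls = []
    for genotype, sites in HBV_SIGNATURES.items():
        if all(bases_by_position.get(pos, "").upper() == base
               for pos, base in sites.items()):
            calls.append(genotype)
    return calls


print(call_genotypes({439: "A", 293: "G"}))
print(call_genotypes({494: "A", 2786: "G"}))
```

Genotypes defined by two sites are called only when both sites match, so an observation of adenine at position 494 alone does not yield an HBV-C call in this sketch.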
One aspect of the present invention relates to an oligonucleotide set selected from a group consisting of set 11: SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 33; set 12: SEQ ID NO: 15, SEQ ID NO: 16 and SEQ ID NO: 34; set 13: SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 35; set 14: SEQ ID NO: 19, SEQ ID NO: 20 and SEQ ID NO: 36; set 15: SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 37; set 16: SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 38; set 17: SEQ ID NO: 25, SEQ ID NO: 26 and SEQ ID NO: 39; set 18: SEQ ID NO: 27, SEQ ID NO: 28 and SEQ ID NO: 40; set 19: SEQ ID NO: 29, SEQ ID NO: 30 and SEQ ID NO: 41; and set 20: SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 42, wherein the oligonucleotides detect one or more HBV genotypes in a sample.
Yet another aspect of the present invention relates to a DNA fragment comprising the genomic signature of HBV genotypes, wherein the nucleotide sequence of the fragment is selected from a group consisting of SEQ ID NO: 55 to SEQ ID NO: 64.
Yet another aspect of the present invention discloses a method of detecting one or more HIV genotypes in a sample using the HIV genomic signature, said method comprising providing a sample; and analyzing the presence of the genomic signature specific for HIV genotype wherein the genomic signature of a specific HIV genotype is the presence of specific single nucleotide variation at one or more specific nucleotide positions in the representative HIV polynucleotide sequence as set forth in SEQ ID NO: 2; whereby the presence of thymine and cytosine at nucleotide positions 1440 and 4244 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-A1 genotype, the presence of cytosine at nucleotide positions 4462 and 7304 in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-A2 genotype, the presence of adenine and guanine at nucleotide positions 2040 and 9001 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-B genotype, the presence of cytosine and thymine at nucleotide positions 2480 and 6395 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-C genotype, the presence of adenine and guanine at nucleotide positions 4158 and 8180 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-D genotype, the presence of cytosine and guanine at nucleotide positions 1679 and 7169 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-F1 genotype, the presence of adenine and cytosine at nucleotide positions 3456 and 7985 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-F2 genotype, the presence of thymine and guanine at nucleotide positions 2993 and 6574 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-G genotype, the presence of adenine and cytosine at nucleotide positions 3973 and 8493 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-H genotype, the presence of adenine and guanine at nucleotide positions 2589 and 6769 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-J genotype, and the presence of adenine at nucleotide positions 3428 and 9134 in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-K genotype.
Another aspect of the present invention relates to a genomic signature processing (GSP) system (600) comprising: at least one processor (606); a memory (608) coupled to the processor (606), the memory (608) comprising one or more processor executable instructions; a preliminary signature generator (618) to generate preliminary signatures for genotypes, wherein the preliminary signatures are generated based at least on a set of unique sites identified for the genotypes; and a genomic signature generator (620) to generate genomic signatures for each of the genotypes, wherein the genomic signatures are based on at least one informative site identified for each of the genotypes, and wherein the informative site is identified such that a probability of two or more genotypes having a same nucleotide at a same informative site is minimal.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the process of Genomic Signature Identification.
1. Subtypes (A-F)
2. Multiple Sequence Alignment of Subtypes (A-F)
3. Consensus sequences of individual subtypes (A-F)
4. Multiple Sequence Alignment of Consensus Sequences (A-F)
5. Identification of Unique and Shared Sites for Consensus Subtypes (A-F)
6. Site Short listing & Statistical Significance Analysis.
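Steps 3 to 5 of Figure 1 (building a consensus per subtype and then finding sites unique to one consensus) can be sketched in Python as follows. This is an illustrative sketch only: it assumes the sequences have already been aligned to equal length by an external multiple sequence alignment tool, it uses a simple majority vote for the consensus, and it treats a site as unique when the base occurs in exactly one consensus at that column. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Majority base per column for equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def unique_sites(consensus_by_genotype):
    """Map genotype -> {1-based column: base} for bases seen in that genotype only."""
    names = list(consensus_by_genotype)
    columns = zip(*(consensus_by_genotype[name] for name in names))
    result = {name: {} for name in names}
    for pos, col in enumerate(columns, start=1):
        counts = Counter(col)
        for name, base in zip(names, col):
            # A gap is not counted as a signature base.
            if base != "-" and counts[base] == 1:
                result[name][pos] = base
    return result


# Toy alignment, not real viral sequence data.
print(consensus(["ACGT", "ACGT", "ACGA"]))
print(unique_sites({"A": "ACGT", "B": "ACGA", "C": "TCGA"}))
```

The sites returned here are only candidates; the shortlisting and statistical analysis of step 6 would still have to be applied to them.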
Figure 2 shows a segment of aligned consensus sequences of different HIV genotypes showing the identification of a unique site (boxed) at nucleotide position 1440.
Figure 3 shows Multiple Sequence Alignment of HBV consensus sequences (Genotype A-H) showing the ten informative sites (boxed). The description of the informative site, in the figure, gives the genotype of the HBV for which the particular site is 'informative'. The bases within parentheses show the base present in the said genotype sequence vis-à-vis the base present in all other HBV genotype sequences.
DETAILED DESCRIPTION OF THE INVENTION
The term "Genomic signature(s)" used herein can be interchangeably used as "Gene Signature(s)" or "Genetic signatures" or "genomic patterns" or "genome signature" or "nucleotide patterns" or "nucleotide signatures" or "base patterns" or "base signatures" or "informative site".
The term "Genomic signature(s)" means, but is not limited to, a combination or pattern of unique nucleotide bases across the organism's genome which enables its identification, differentiation and characterization from other organisms.
The term "genotype" used herein can be interchangeably used as "variant" or "subtype" or "strain" or "serotype".
The term "preliminary signature" used herein refers to all the unique sites identified in the small subset of the genotype.
The term "informative site" used herein means a nucleotide at a specified position in the genome sequence which serves to act as the identifier for that genotype or strain from other closely related sequences of different genotype or strain (Figure 2).
The present invention provides a process of identification of genomic signature for differentiating highly similar sequence variants of an organism. The present invention also provides unique genomic signature of an organism that can be used as a tool or an identifier of a species of an organism. The present invention further provides the nucleotide probe and/or primers useful for identifying the genomic signature.
The tool of genomic signature disclosed in the present invention can be used to detect the microorganism in a sample including but not limited to whole blood, plasma, serum, saliva, sputum, tissue, DNA, RNA, hair, and soil.
The process disclosed in the present invention can be used to identify the genomic signature of microorganisms such as viruses, for example Hepatitis, Dengue, HSV, Rotavirus, TMV, Ebola, Polio and other similar viruses; bacteria such as Mycobacterium tuberculosis, Streptococcus, Pseudomonas, Shigella, Campylobacter, Salmonella and other similar bacterial pathogens; and protozoa such as Giardia lamblia, Naegleria fowleri, Acanthamoeba spp., Entamoeba histolytica, Cryptosporidium parvum, Cyclospora cayetanensis, Isospora belli and other similar protozoan pathogens. The process for identification of unique genomic signature disclosed in the present invention comprises genotyping by studying a specific single nucleotide base.
The genomic signatures were identified and tested using the Hepatitis B virus as a model.
The present invention provides a novel method of detection of a genotype within a species of an organism using genomic signature of the genotype, wherein the genomic signature is a single nucleotide variation at a specific nucleotide position in the genomic DNA of the genotype.
Surprisingly, it was found that the presence of a specific single nucleotide at a unique site or specific position or location in the genome of the genotype provides identity to the particular genotype, whereby the specific nucleotide is the genomic signature of the genotype. The present invention further discloses that the genomic signature of certain genotypes is defined by the presence of two specific single nucleotides at two specific locations. Therefore it was unexpected that a genotype within a species of an organism can be detected accurately by knowing the presence or absence of a genomic signature of the genotype, wherein the genomic signature is one or two specific nucleotides present at specific location(s).
The present invention provides a method for generating the genomic signature of HBV and HIV genotypes. These are model systems to demonstrate the generation of genomic signature. However, genomic signature of any other genotypes of a species within the organism can be generated using the method disclosed in the present invention. The method of generating a genomic signature of a genotype as provided in the present invention encompasses generation of specific genomic signature of any genotype within the species of an organism.
In one embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-A genotype is adenine at position 439 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which corresponds to nucleotide position 436 in the consensus polynucleotide sequence of HBV-A as set forth in SEQ ID NO: 3.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-B genotype is guanine at position 411 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which corresponds to nucleotide position 410 in the consensus polynucleotide sequence of HBV-B as set forth in SEQ ID NO: 4.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-C genotype is adenine at positions 494 and 2786 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which correspond to nucleotide positions 491 and 2741 respectively in the consensus polynucleotide sequence of HBV-C as set forth in SEQ ID NO: 5.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-D genotype is adenine at position 293 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which corresponds to nucleotide position 292 in the consensus polynucleotide sequence of HBV-D as set forth in SEQ ID NO: 6.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-E genotype is thymine at position 2964 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which corresponds to nucleotide position 2916 in the consensus polynucleotide sequence of HBV-E as set forth in SEQ ID NO: 7.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-F genotype is adenine at position 631 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which corresponds to nucleotide position 628 in the consensus polynucleotide sequence of HBV-F as set forth in SEQ ID NO: 8.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-G genotype is guanine at position 1003 and guanine at position 1620 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which correspond to nucleotide positions 1000 and 1617 respectively in the consensus polynucleotide sequence of HBV-G as set forth in SEQ ID NO: 9.
In another embodiment the present invention discloses genomic signature of HBV genotype, wherein the genomic signature of HBV-H genotype is guanine at position 1321 in the representative polynucleotide sequence of HBV genome as set forth in SEQ ID NO: 1, which corresponds to nucleotide position 1321 in the consensus polynucleotide sequence of HBV-H as set forth in SEQ ID NO: 10.
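Each embodiment above pairs a position in the representative sequence (SEQ ID NO: 1) with a different position in the genotype consensus, for example 439 and 436 for HBV-A. One plausible reading is that the two numbers are the same alignment column counted in two sequences that differ by insertions or deletions upstream. The Python sketch below converts positions under that assumption; the function name and the toy alignment rows are hypothetical, and the real alignments are not reproduced here.

```python
def map_position(aligned_ref, aligned_target, ref_pos):
    """Map a 1-based ungapped position in the reference to the target sequence.

    Both inputs are rows of the same alignment (equal length, '-' for gaps).
    Returns None if the reference position is aligned to a gap in the target.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned rows must have equal length")
    ref_count = 0
    target_count = 0
    for ref_base, target_base in zip(aligned_ref, aligned_target):
        if ref_base != "-":
            ref_count += 1
        if target_base != "-":
            target_count += 1
        if ref_base != "-" and ref_count == ref_pos:
            return target_count if target_base != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference")


# Toy rows: the target lacks two bases that the reference has upstream.
print(map_position("ACGTAC", "A--TAC", 4))
```

In the toy rows, reference position 4 maps to target position 2 because two reference bases upstream have no counterpart in the target, which is the same kind of offset as that between positions 439 and 436.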
In one embodiment of the present invention a genome signature is identified for detection of HBV and for identification of each of the eight genotypes of HBV. Based on the degree of conservation of each nucleotide present in the signature sequence, when scanned against all publicly available HBV genome database sequences, certain nucleotide sites are found to predict the strain's character more reliably. The sites are selected on the basis of their probability of giving a correct prediction. The sites showing the highest probability of a correct prediction, either individually or in combination with other sites, are selected. This has been done by applying various statistical tests on the selected sites.
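The selection of sites "individually or in combination" can be illustrated with a small Python sketch that compares single sites against pairs of sites. The statistical tests actually applied are not specified in this passage, so plain classification accuracy on labelled sequences is used here as an assumed stand-in; the function names, the candidate list format and the toy data are hypothetical.

```python
from itertools import combinations


def accuracy(rule, labelled, genotype):
    """Fraction of sequences for which 'rule matches' agrees with 'is genotype'.

    `rule` is a tuple of (1-based position, base) pairs that must all match.
    `labelled` is a list of (genotype label, aligned sequence) pairs.
    """
    correct = 0
    for label, seq in labelled:
        matches = all(seq[pos - 1] == base for pos, base in rule)
        correct += matches == (label == genotype)
    return correct / len(labelled)


def best_rule(candidates, labelled, genotype, max_sites=2):
    """Pick the single site or site combination with the highest accuracy.

    Rules with fewer sites are listed first, so ties favour the smaller rule.
    """
    rules = [combo
             for size in range(1, max_sites + 1)
             for combo in combinations(candidates, size)]
    return max(rules, key=lambda rule: accuracy(rule, labelled, genotype))


# Toy data: neither site alone separates genotype A, but the pair does.
toy = [("A", "GAT"), ("A", "GAT"), ("B", "GAG"), ("B", "GCT")]
sites = [(2, "A"), (3, "T")]
print(best_rule(sites, toy, "A"))
```

In the toy data each single site misclassifies one non-A sequence, while the two sites together classify every sequence correctly, which mirrors the case of genotypes whose signature is defined by two positions.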
Unlike other available tools for genotyping, the proposed approach is not dependent on a single region or site. It takes into account the sites across the entire genome and is therefore a better predictor of the mixed and recombinant nature of the infecting strain.
As sequence information of additional strains is made available, more sites will be added to the genomic signatures for genotyping. This will take into account any population to population nucleotide variation, such that the best possible signatures defining the circulating strains in a particular population can be identified and validated.
Another embodiment of the present invention provides a process for detection and/or genetic characterization of HBV in a biological sample. A set of sites (nucleotide bases) is identified by one or multiple nucleotide probes/primers in the genomic sequence of a clinical HBV isolate through hybridization. The sites are conserved across a single genotype of HBV and span the entire genome. Eight such genomic signatures (corresponding to 10 informative sites) are therefore identified, which can individually and collectively characterize the isolate. Once identified, these sites are interpreted by an algorithm which scores them and then characterizes the clinical HBV isolate. These genomic signature sites can form the basis of various tools for HBV strain detection and characterization, e.g. hybridization, enzymatic profiling, mass spectrometry, isothermal amplification, microarray, etc.
Additionally, the concept of genomic signature based identification and characterization can be applied in other pathogen and host genomes as well.
Whole genome profiling based approach enables not only genotype identification, but also delineation of mixed and recombinant types, in addition to identification of various mutations on a single platform at one instance. The tool of genomic signature disclosed in the present invention is useful for disease prognosis and patient management. The present approach describes the concept, design and future applications for characterization of micro-organisms present in a sample.
The genomic signature tool disclosed in the present invention can be utilized for identification of pathogens, preclinical detection of infections, diagnosis of disease, biomarker detection, toxicology testing, positive selection, phylogenetic prediction, drug discovery and disease prognosis.
Yet another embodiment of the present invention discloses a method of generating a genomic signature of a genotype within a species of an organism, said method comprising: providing a consensus polynucleotide sequence of each of the genotypes, wherein the consensus sequence is obtained from a small subset of polynucleotide sequences of said genotype, wherein said subset comprises 5 to 10% of total polynucleotide sequences of said genotype; aligning the consensus polynucleotide sequences of said genotypes; obtaining all unique sites having a single nucleotide variation at a specific nucleotide position in the consensus polynucleotide sequence; validating said unique sites with a larger subset comprising the polynucleotide sequences of said genotype; selecting at least one unique site to generate the genomic signature of said genotype, wherein the frequency of occurrence of said unique site is in the range of 0.85-1.0 in said genotype and 0.001-0.05 in other genotypes, wherein the genomic signature is specific to said genotype and different in other genotypes.
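The final selection step of this method can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the implementation used in the invention: the function names, the data layout (a mapping from genotype label to a list of aligned, equal-length sequences) and the toy sequences are hypothetical. Only the frequency ranges come from the text, and the sketch treats the range for other genotypes simply as an upper bound of 0.05.

```python
from typing import Dict, List, Tuple


def base_frequency(seqs: List[str], pos: int, base: str) -> float:
    """Fraction of sequences carrying `base` at 0-based alignment column `pos`."""
    return sum(1 for s in seqs if s[pos] == base) / len(seqs)


def select_signature_sites(
    candidates: List[Tuple[int, str]],
    by_genotype: Dict[str, List[str]],
    target: str,
    min_in: float = 0.85,   # lower bound of the 0.85-1.0 range stated in the text
    max_out: float = 0.05,  # upper bound of the range stated for other genotypes
) -> List[Tuple[int, str]]:
    """Keep candidate (column, base) pairs that are frequent in `target`
    and rare in all other genotypes pooled together (pooling is an assumption)."""
    others = [s for g, seqs in by_genotype.items() if g != target for s in seqs]
    kept = []
    for pos, base in candidates:
        f_in = base_frequency(by_genotype[target], pos, base)
        f_out = base_frequency(others, pos, base)
        if f_in >= min_in and f_out <= max_out:
            kept.append((pos, base))
    return kept


# Hypothetical toy data: two genotypes, three aligned sequences each.
toy = {
    "X": ["ACGT", "ACGT", "ACGA"],
    "Y": ["TCGA", "TCGA", "TCGA"],
}
selected = select_signature_sites([(0, "A"), (3, "T")], toy, "X")
print(selected)
```

In this toy data the first column passes both thresholds for genotype X, while the last column is present in only two of the three X sequences and is therefore rejected.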
Yet another embodiment of the present invention relates to the method of generating a genomic signature of a genotype within a species of an organism, said method comprising providing a consensus polynucleotide sequence of each of the genotypes, wherein the consensus sequence is obtained from a small subset of polynucleotide sequences of said genotype, wherein said subset comprises 5 to 10% of total polynucleotide sequences of said genotype, aligning the consensus polynucleotide sequences of said genotypes, obtaining all unique sites having a single nucleotide variation at a specific nucleotide position in the consensus polynucleotide sequence, validating said unique sites with a larger subset comprising the polynucleotide sequences of said genotype, selecting at least one unique site to generate the genomic signature of said genotype, wherein the frequency of occurrence of said unique site is in the range of 0.85-1.0 in said genotype and 0.001-.05 in other genotypes, wherein the genomic signature is specific to said genotype and different in other genotypes as disclosed in the instant invention, wherein the organism is selected from a group consisting of virus, bacteria and protozoan.
Yet another embodiment of the present invention relates to a method of generating a genomic signature of a genotype within a species of an organism, said method comprising: providing a consensus polynucleotide sequence of each of the genotypes, wherein the consensus sequence is obtained from a small subset of polynucleotide sequences of said genotype, wherein said subset comprises 5 to 10% of total polynucleotide sequences of said genotype; aligning the consensus polynucleotide sequences of said genotypes; obtaining all unique sites having a single nucleotide variation at a specific nucleotide position in the consensus polynucleotide sequence; validating said unique sites with a larger subset comprising the polynucleotide sequences of said genotype; selecting at least one unique site to generate the genomic signature of said genotype, wherein the frequency of occurrence of said unique site is in the range of 0.85-1.0 in said genotype and 0.001-0.05 in other genotypes, wherein the genomic signature is specific to said genotype and different in other genotypes; wherein the polynucleotide sequence is the genomic sequence of the organism.
Another embodiment of the present invention relates to the method of generating a genomic signature as disclosed in the instant invention, wherein the virus is selected from a group consisting of Hepatitis Viruses, Dengue, Herpes Simplex Virus, Rotavirus, Tobacco Mosaic Virus, Ebola, Polio, Influenza virus, Japanese encephalitis and Human Immunodeficiency Virus.
Another embodiment of the present invention relates to the method of generating a genomic signature as disclosed in the instant invention, wherein the virus is selected from a group consisting of Hepatitis B Virus, Hepatitis C Virus and Human Immunodeficiency Virus.
Yet another embodiment of the present invention relates to the method of generating a genomic signature as disclosed in the instant invention, wherein the bacteria are selected from a group consisting of Mycobacterium, Streptococcus, Pseudomonas, Shigella, Campylobacter, and Salmonella.
Yet another embodiment of the present invention relates to the method of generating a genomic signature as disclosed in the instant invention, wherein the protozoan is selected from a group consisting of Giardia Lamblia, Naegleria Fowleri, Acanthamoeba Spp, Entamoeba Histolytica, Cryptosporidium Parvum, Cyclospora Cayetanensis and Isospora Belli.
Another embodiment of the present invention recites a method of detecting one or more HBV genotypes in a sample using the HBV genomic signature, said method comprising: providing a sample; and analyzing the presence of genomic signature specific for HBV genotype, wherein the genomic signature of a specific HBV genotype is the presence of specific single nucleotide variation at one or more specific nucleotide positions in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1; whereby the presence of adenine at nucleotide position 439 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-A genotype, the presence of guanine at nucleotide position 411 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-B genotype, the presence of adenine at nucleotide position 494 and guanine at nucleotide position 2786 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-C genotype, the presence of adenine at nucleotide position 293 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-D genotype, the presence of thymine at nucleotide position 2964 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-E genotype, the presence of adenine at nucleotide position 631 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-F genotype, the presence of adenine at nucleotide position 1003 and guanine at nucleotide position 1620 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-G genotype, the presence of guanine at nucleotide position 1321 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-H genotype.
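The detection rule recited in this embodiment amounts to a lookup of base calls at ten positions. The following Python sketch illustrates that lookup under stated assumptions: the names `HBV_SIGNATURES` and `detect_genotypes` are hypothetical, the base calls are assumed to be already expressed in SEQ ID NO: 1 coordinates (1-based), and for the two-site genotypes (C and G) both bases are required, following the wording "adenine at ... and guanine at ..." in the embodiment above.

```python
from typing import Dict, List

# Positions (1-based, SEQ ID NO: 1 coordinates) and bases as recited
# in the detection embodiment above.
HBV_SIGNATURES: Dict[str, Dict[int, str]] = {
    "A": {439: "A"},
    "B": {411: "G"},
    "C": {494: "A", 2786: "G"},
    "D": {293: "A"},
    "E": {2964: "T"},
    "F": {631: "A"},
    "G": {1003: "A", 1620: "G"},
    "H": {1321: "G"},
}


def detect_genotypes(calls: Dict[int, str]) -> List[str]:
    """Return every genotype whose signature bases are all present in `calls`.

    `calls` maps an interrogated position to the observed base; positions
    that were not interrogated are simply absent from the mapping.
    """
    hits = []
    for genotype, sites in HBV_SIGNATURES.items():
        if all(calls.get(pos) == base for pos, base in sites.items()):
            hits.append(genotype)
    return hits


# Hypothetical base calls from a sample; the second genotype C site does not match.
calls = {439: "A", 293: "A", 494: "A", 2786: "C"}
detected = detect_genotypes(calls)
print(detected)
```

Returning a list rather than a single label lets a sample that carries the signatures of more than one genotype, such as a mixed HBV-A and HBV-D infection, be reported as such.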
Yet another embodiment of the present invention discloses a method of detecting HBV-A and HBV-D genotypes in a sample using the HBV-A and HBV-D genomic signature, said method comprising providing a sample; and analyzing the presence of genomic signatures specific for HBV-A and HBV-D genotypes, wherein the genomic signature for HBV-A genotype is the presence of specific single nucleotide variation at nucleotide position 439 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1 and the genomic signature for HBV-D genotype is the presence of specific single nucleotide variation at nucleotide position 293 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 439 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-A genotype, and the presence of adenine at nucleotide position 293 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV D genotype.
Yet another embodiment of the present invention disclose a method of detecting HBV-C genotype in a sample using the HBV-C genomic signature, said method comprising providing a sample; and analyzing the presence of genomic signature specific for HBV-C genotype, wherein the genomic signature for HBV-C genotype is the presence of specific single nucleotide variation at nucleotide positions 494 and 2786 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 494 and guanine at nucleotide position 2786 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-C genotype.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the sample is selected from a group consisting of blood, plasma, serum, saliva, sputum, tissue, DNA and hair, preferably DNA.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein analyzing the presence of genomic signature specific for HBV genotype is performed using a hybridization-based method, an enzyme-based method, a post-amplification method based on physical properties of DNA, or a sequencing method.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein hybridization-based method is selected from a group consisting of dynamic allele-specific hybridization, molecular beacons, SNP microarrays.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein enzyme-based method is selected from a group consisting of restriction fragment length polymorphism, PCR-based method, flap endonuclease, primer extension, 5'-nuclease, or oligonucleotide ligase assay.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the post-amplification method is selected from a group consisting of single strand conformation polymorphism, temperature gradient gel electrophoresis, denaturing high performance liquid chromatography, high-resolution melting of the entire amplicon or SNPlex.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the step of analyzing the presence of genomic signature specific for HBV genotype further comprises selecting a pair of oligonucleotides to amplify the specific single nucleotide variation, performing a primer extension reaction to generate extended primer and analyzing the extended primer to identify the genomic signature of the HBV genotype.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the extended primer is analyzed using mass spectrometry, differential labeling method or differential size fractionation method.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the oligonucleotides are selected from a group consisting of set 1 SEQ ID NO: 13 and SEQ ID NO 14; set 2 SEQ ID NO: 15 and SEQ ID NO: 16; set 3 SEQ ID NO: 17 and SEQ ID NO: 18; set 4 SEQ ID NO: 19 and SEQ ID NO: 20; set 5 SEQ ID NO: 21 and SEQ ID NO: 22; set 6 SEQ ID NO: 23 and SEQ ID NO: 24; set 7 SEQ ID NO: 25 and SEQ ID NO: 26; set 8 SEQ ID NO: 27 and SEQ ID NO: 28; set 9 SEQ ID NO: 29 and SEQ ID NO: 30 and set 10 SEQ ID NO: 31 and SEQ ID NO: 32.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the primer extension reaction was performed using the oligonucleotide as set forth in SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37; SEQ ID NO: 38; SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41 and SEQ ID NO: 42.
Another embodiment of the present invention relates to oligonucleotide set selected from a group consisting of set 11 SEQ ID NO: 13, SEQ ID NO 14 and SEQ ID NO: 33; set 12 SEQ ID NO: 15, SEQ ID NO: 16 and SEQ ID NO: 34; set 13 SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 35; set 14 SEQ ID NO: 19, SEQ ID NO: 20 and SEQ ID NO: 36; set 15 SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 37; set 16 SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 38; set 17 SEQ ID NO: 25, SEQ ID NO: 26 and SEQ ID NO: 39; set 18 SEQ ID NO: 27, SEQ ID NO: 28 and SEQ ID NO: 40; set 19 SEQ ID NO: 29, SEQ ID NO: 30 and SEQ ID NO: 41; and set 20 SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 42, wherein the oligonucleotides detects one or more HBV genotypes in a sample.
Yet another aspect of the present invention relates to A DNA fragment comprising the genomic signature of HBV genotypes, wherein the nucleotide sequence of the fragment is selected from a group consisting of SEQ ID NO: 55 to SEQ ID NO: 64.
One embodiment of the present invention discloses a method of detecting one or more HIV genotypes in a sample using the HIV genomic signature, said method comprising providing a sample; and analyzing the presence of the genomic signature specific for HIV genotype, wherein the genomic signature of a specific HIV genotype is the presence of specific single nucleotide variation at one or more specific nucleotide positions in the representative HIV polynucleotide sequence as set forth in SEQ ID NO: 2; whereby the presence of thymine and cytosine at nucleotide positions 1440 and 4244 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-A1 genotype, the presence of cytosine at nucleotide positions 4462 and 7304 in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-A2 genotype, the presence of adenine and guanine at nucleotide positions 2040 and 9001 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-B genotype, the presence of cytosine and thymine at nucleotide positions 2480 and 6395 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-C genotype, the presence of adenine and guanine at nucleotide positions 4158 and 8180 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-D genotype, the presence of cytosine and guanine at nucleotide positions 1679 and 7169 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-F1 genotype, the presence of adenine and cytosine at nucleotide positions 3456 and 7985 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-F2 genotype, the presence of thymine and guanine at nucleotide positions 2993 and 6574 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-G genotype, the presence of adenine and cytosine at nucleotide positions 3973 and 8493 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-H genotype, the presence of adenine and guanine at nucleotide positions 2589 and 6769 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-J genotype, and the presence of adenine at nucleotide positions 3428 and 9134 in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-K genotype.
Another embodiment of the present invention relates to the method of detecting one or more HIV genotypes in a sample using the HIV genomic signature as disclosed in the instant invention, wherein the sample is selected from a group consisting of blood, plasma, serum, saliva, sputum, tissue, DNA, RNA and hair, preferably DNA.
Another embodiment of the present invention relates to the method of detecting one or more HIV genotypes in a sample using the HIV genomic signature as disclosed in the instant invention, wherein analyzing the presence of genomic signature specific for HIV genotype is performed using a hybridization-based method, an enzyme-based method, a post-amplification method based on physical properties of DNA, or a sequencing method.
One embodiment of the present invention relates to the method as disclosed in the instant invention, wherein hybridization-based method is selected from a group consisting of dynamic allele-specific hybridization, molecular beacons, SNP microarrays.
Another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein enzyme-based method is selected from a group consisting of restriction fragment length polymorphism, PCR-based method, flap endonuclease, primer extension, 5'-nuclease, or oligonucleotide ligase assay.
Yet another embodiment of the present invention relates to the method as disclosed in the present invention, wherein the post-amplification method is selected from a group consisting of single strand conformation polymorphism, temperature gradient gel electrophoresis, denaturing high performance liquid chromatography, high-resolution melting of the entire amplicon or SNPlex.
Another embodiment of the present invention relates to the method of detecting one or more HIV genotypes in a sample using the HIV genomic signature as disclosed in the instant invention, wherein the step of analyzing the presence of genomic signature specific for HIV genotype further comprises selecting a pair of oligonucleotides to amplify the specific single nucleotide variation, performing a primer extension reaction to generate extended primer and analyzing the extended primer to identify the genomic signature of the HIV genotype.
Yet another embodiment of the present invention relates to the method as disclosed in the present invention, wherein the extended primer is analyzed using mass spectrometry, differential labeling method or differential size fractionation method.
Yet another embodiment of the present invention relates to the method as disclosed in the present invention, wherein the oligonucleotides for detecting HIV-B and HIV-C genotype are selected from a group consisting of set 21 SEQ ID NO: 43 and SEQ ID NO: 44; set 22 SEQ ID NO: 45 and SEQ ID NO: 46; set 23 SEQ ID NO: 47 and SEQ ID NO: 48; set 24 SEQ ID NO: 49 and SEQ ID NO: 50.
Yet another embodiment of the present invention relates to the method as disclosed in the instant invention, wherein the primer extension reaction was performed using the oligonucleotide as set forth in SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53 and SEQ ID NO: 54.
In one embodiment of the present invention is disclosed a genomic signature processing (GSP) system (600) comprising: at least one processor (606); a memory (608) coupled to the processor (606), the memory (608) comprising one or more processor executable instructions; a preliminary signature generator (618) to generate preliminary signatures for genotypes, wherein the preliminary signatures are generated based at least on a set of unique sites identified for the genotypes; and a genomic signature generator (620) to generate genomic signatures for each of the genotypes, wherein the genomic signatures are based on at least one informative site identified for each of the genotypes, and wherein the informative site is identified such that a probability of two or more genotypes having a same nucleotide at a same informative site is minimal.
Another embodiment of the present invention relates to the GSP system (600) as disclosed in the instant invention, wherein the GSP system (600) is communicatively coupled to a sequential database (610) that stores multiple sequences for each of the genotypes.
Yet another embodiment of the present invention relates to the GSP system (600) as disclosed in the instant invention, wherein the genomic signatures are generated based on a comparison of the preliminary signatures of each of the genotypes with the multiple sequences of all the genotypes.
Yet another embodiment of the present invention relates to the GSP system (600) as disclosed in the instant invention comprising a consensus generator (616) to generate consensus sequences for the genotypes, wherein the consensus sequences are generated based on multiple sequences corresponding to each of the genotypes.
Yet another embodiment of the present invention relates to the GSP system (600) as disclosed in the instant invention, wherein the preliminary signatures are generated based on the set of unique sites identified for the genotypes.
Yet another embodiment of the present invention relates to the GSP system (600) as disclosed in the instant invention, wherein the unique sites are identified based on consensus sequences corresponding to the genotypes.
Detailed description of Algorithm based detection/characterization
The genomic signature identification for a species involves a combination of multiple sequence alignment, phylogeny (Neighbor Joining) and statistical validation. A combination of unique sites from a multiple sequence alignment of similar sequences was taken as a preliminary signature for a particular species, which was then further validated using phylogeny and statistical approaches. The genomic signature formed is a combination of unique nucleotide bases and shared nucleotide bases. The genomic signature is validated for specificity and accuracy of prediction in a set of similar sequences.
The nucleotide sequences of interest are run on the algorithm which scans for specific nucleotide bases at specific sites. Based on the presence or absence of the base at all interrogated sites the algorithm calculates the score to tabulate the possibility of the sequence matching the signature of any particular micro organism or its subtype.
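The scoring idea described above can be sketched in Python as follows. This is an illustration under assumptions rather than the algorithm developed by the inventors: the scoring rule (fraction of a signature's sites at which the query carries the expected base), the function name `score_sequence` and the toy signatures are all hypothetical, and positions are taken as 1-based coordinates in the query sequence.

```python
from typing import Dict


def score_sequence(seq: str, signatures: Dict[str, Dict[int, str]]) -> Dict[str, float]:
    """Score a query sequence against each signature.

    For every signature, the score is the fraction of its interrogated
    sites (1-based positions) at which `seq` carries the expected base.
    Sites beyond the end of the sequence count as not matched.
    """
    scores = {}
    for label, sites in signatures.items():
        matched = sum(
            1
            for pos, base in sites.items()
            if pos <= len(seq) and seq[pos - 1] == base
        )
        scores[label] = matched / len(sites)
    return scores


# Hypothetical two-site signatures for two made-up subtypes.
toy_signatures = {
    "type1": {2: "C", 5: "G"},
    "type2": {2: "T", 4: "A"},
}
scores = score_sequence("ACGAG", toy_signatures)
print(scores)
```

A query that scores highly against more than one signature would be a candidate for a mixed or recombinant type, which is the case the whole-genome approach is intended to expose.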
The process as described above broadly comprises (1) site identification (2) site validation (3) statistical significance determination of each site and (4) site short listing.
Site Identification
Identification of sites was achieved as described below:
Phylogenetic Analysis: HBV DNA sequences were retrieved from the Hepatitis Virus Database (http://s2asO2.genes.nig.ac.jp/), and a phylogenetic tree was constructed using the Neighbor Joining Phylogenetic method. Phylogenetic analysis of the various Hepatitis B virus sequences from the database reveals eight genotypes of HBV isolates infecting humans. The isolates clustering together on the phylogram show similarity in their sequences. The DNA sequences were seen to cluster on the basis of the different gene types or genotypes of the HBV isolate. A genotype specific consensus sequence was generated from NCBI reference strains belonging to the particular HBV genotype. The eight consensus sequences, representing eight different HBV genotypes (A-H), were then aligned together to identify individual sites (nucleotide positions) on the HBV genome that are unique for a particular genotype. That is to say, the nucleotide is present in only one genotype at a specific site whereas the other genotypes have a different nucleotide at that site. In total, 132 unique sites were identified from the Multiple Sequence Alignment of reference DNA sequences of HBV as provided in the database (NCBI). The distribution of these sites was different for different HBV genotypes. The unique sites (US) and shared sites (SS) for the different genotypes of HBV are as follows:
Genotype A: 13 US and 7 SS
Genotype B: 11 US and 11 SS
Genotype C: 13 US and 7 SS
Genotype D: 6 US and 14 SS
Genotype E: 8 US and 14 SS
Genotype F: 15 US and 56 SS
Genotype G: 48 US and 15 SS
Genotype H: 18 US and 47 SS
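The definition of a unique site given above (a nucleotide present in only one genotype at an alignment column) can be sketched in Python. This is a minimal illustration assuming the consensus sequences are already aligned to equal length; the function name `unique_sites`, the treatment of gap characters and the three short toy sequences are hypothetical, and positions are reported as 1-based alignment columns.

```python
from collections import Counter
from typing import Dict, List, Tuple


def unique_sites(consensus: Dict[str, str]) -> Dict[str, List[Tuple[int, str]]]:
    """For each genotype, list the (1-based column, base) pairs where its
    consensus base occurs in no other genotype's consensus.

    Gap characters ('-') are never reported as unique bases (an assumption).
    """
    labels = list(consensus)
    length = len(consensus[labels[0]])
    result: Dict[str, List[Tuple[int, str]]] = {label: [] for label in labels}
    for col in range(length):
        column = {label: consensus[label][col] for label in labels}
        counts = Counter(column.values())
        for label, base in column.items():
            if base != "-" and counts[base] == 1:
                result[label].append((col + 1, base))
    return result


# Hypothetical aligned consensus sequences for three made-up genotypes.
toy = {"A": "ACGTA", "B": "ACGTC", "C": "ATGTC"}
sites = unique_sites(toy)
print(sites)
```

In the toy alignment, genotype A is unique at column 5 and genotype C at column 2, while genotype B has no unique site and would have to rely on shared sites.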
These sites were analyzed for different parameters like i) distribution across the genome ii) proximity to other informative sites iii) conservation / presence among different isolates of HBV. On this basis, 85 unique and 46 shared sites of HBV genome were short listed. These are as follows;
Unique Sites: The positions given below are with respect to the representative sequence SEQ ID No.1.
A: 139 195 318 319 439 454 489 619 782 788 1185 1853 3035
B: 130 170 348 411 796 1255 2834 2996 3105
C: 52 494 534 636 852 1168 1232 2635 2786 2843
D: 293 2673 2732 2760 3086 3232
E: 1483 2161 2376 2675 2964 3026
F: 40 583 631 1098 1357 1690 1703 2005 2062 2302 2362 3141 3195
G: 929 939 1003 1020 1329 1609 1616 1620 1751 1768 1820 2060 2116 2194 2474 2475 3225
H: 568 1321 1555 1680 2167 2242 2266 2391 2534 2600
Site Validation
Both the unique and shared sites were validated for their reliability in correctly predicting the genotype of an input sequence, using an algorithm developed in-house which predicts the genotype based on the input sequence. The input HBV DNA sequences were obtained from the Hepatitis Virus Database. The genotype information for these DNA sequences was available from the associated literature provided by the submitters. Most of these isolates were genotyped using serological techniques. Wherever the predicted genotype and reported genotype information did not match, the respective sequences were analyzed with online genotyping resources (NCBI genotyping tool, BioAfrica Online HBV genotyping tool) for validation.
Sample data set for validation
174 HBV whole genome sequences were taken from a public database. In a single-blinded study the sequences were provided for validation, but the genotype information of the sequences was masked from the analyst. (Masked dataset means that the HBV sequences were retrieved from the database by a person who did not know the purpose of the study and was asked to remove any information suggesting the genotype of the isolate. The nucleotide sequence of the isolate was then given to the analyst in FASTA format for genotype prediction using the informative sites.) Of these 174 HBV DNA sequences, the genotype prediction for 160 sequences was correct (the genotype information available from the literature and that predicted by our algorithm matched perfectly), whereas for 6 sequences the genotype information available from the literature and that predicted by our algorithm matched only partially or did not match. Of these 6 DNA sequences, 5 were predicted by the algorithm to have two genotypes, whereas the literature cited only one of the genotypes. The case was reversed for one query sequence, where a recombinant/mixed C and D genotype was given in the literature whereas our algorithm predicted a pure C genotype. To cross-check our results, we used the online genotyping tools. The genotyping tools predicted two genotypes, similar to that predicted by our algorithm. The remaining 9 DNA sequences, which were not predicted by the algorithm, were later found to be HBV strain sequences from chimpanzees.
Determination of the Statistical significance of each site
Based on the above mentioned genotype predictions, detailed statistical analysis was performed. Chi square test was performed to check the statistical significance of Individual Sites. A cut off of 0.5 probability was taken and all sites falling below this cut off were removed. Applying these criteria, 62 unique sites were selected. Table 1 shows the distribution of shortlisted sites among different genotypes. These sites were further analyzed using the following dataset.
Dataset
587 HBV DNA sequences were taken from 800 sequences in HBV sequence database. The remaining sequences (304 sequences) were not taken for analysis as they were not assigned any genotype by the investigators who first reported these sequences in literature. The distribution of the sequences in database was as follows (Table 2).
The analysis carried out for identification of statistically significant site(s) among all 6 HBV-A genotype sites is explained below;
Genotype A: The site positions and their respective bases in HBV-A and HBV-non A sequences is given in Table 3.
The distribution of A and non-A sequences is as follows:
HBV Genotype A: 60 sequences
HBV Non-A Genotypes: 527 sequences
Step 1:
The frequency of identification of the expected base amongst the 60 genotype A sequences is summarized in Table 4.
The base positions highlighted showed occurrence in the maximum number of genotype A sequences. These sites were therefore chosen for further analysis.
Step 2:
The predictive value (probability of classification) for each site was calculated using the formula:
Predictive value = (Number of genotype A sequences showing the expected base at the particular site) / (Total number of genotype A sequences)
Site 2858 showed the maximum probability of a correct classification (or minimum probability of misclassification) within genotype HBV-A sequences. (Misclassification: when the base unique to HBV-A is absent at the site in an HBV-A sequence, thereby incorrectly predicting the sequence as non-A.) The method of calculating the probability of misclassification is explained in an example below. Calculations for the other sites were done similarly, and thus the criterion is not explained again in the report.
For example, the HBV-A specific base at site 102 is found in 54 out of 60 HBV-A sequences; thus 6 HBV-A sequences did not have the correct base at this site. The probability of misclassifying a genotype A sequence as non-A by this site is therefore 6/60 = 0.1. This value is determined for all the 6 sites and is given below (Table 5).
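The two per-site quantities used in Step 2 can be written out as a short Python sketch. The function names are hypothetical; the counts are those of the worked example for site 102 (54 of 60 HBV-A sequences showing the expected base).

```python
def classification_probability(with_expected_base: int, total: int) -> float:
    """Predictive value: fraction of genotype sequences showing the expected base."""
    return with_expected_base / total


def misclassification_probability(with_expected_base: int, total: int) -> float:
    """Fraction of genotype sequences lacking the expected base at the site."""
    return (total - with_expected_base) / total


# Counts from the worked example for site 102.
p_correct = classification_probability(54, 60)
p_miss = misclassification_probability(54, 60)
print(p_correct, p_miss)
```

By construction the two values sum to one, so ranking sites by maximum predictive value and by minimum misclassification probability gives the same order within a genotype.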
Step 3: 527 HBV non-A genotype sequences were taken. The 6 HBV-A sites, given previously, were looked up separately in each genotype (Table 6).
The number of times a HBV-A specific site was identified in these sequences is given in column B, C etc (Number of Misclassifications) (Table 6). The probability of misclassification was calculated as explained above. Collectively, if the non-A genotypes were taken as one group of 527 sequences, then the probability of misclassification was calculated as given below (Table 7);
Observation: Site No. 2 (position 439) showed the least probability of misclassification in Non-A genotype sequences.
Step 4: Based on the above observations, site 2 was taken in combination with the rest of the 5 sites one by one (Table 8). The combination of two sites with the least probability of misclassification in A and least probability of misclassification in Non A was taken in combination with a third site. The combination of three sites which gave the least probability of misclassification in A and least probability of misclassification in Non A was taken in combination with the next fourth site. This was continued till addition of further sites either did not lead to any change in the probabilities of (mis)classification or led to an increase in the probability of misclassification and lowering of the probability of classification.
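The stepwise combination of sites in Step 4 resembles a greedy forward selection, which can be sketched in Python as below. This is a simplified illustration under assumptions, not the procedure as actually run: it uses a single criterion (the total number of misclassified sequences across the target and non-target groups) instead of tracking the two probabilities separately, it assumes a sequence is called as the target genotype only if it shows the genotype-specific base at every site in the combination, and all names and toy records are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

# (belongs to target genotype, set of sites showing the genotype-specific base)
Record = Tuple[bool, Set[int]]


def errors(combo: Sequence[int], records: List[Record]) -> int:
    """Count sequences whose call under `combo` disagrees with their true group."""
    wrong = 0
    for is_target, sites in records:
        called = all(s in sites for s in combo)
        if called != is_target:
            wrong += 1
    return wrong


def greedy_select(all_sites: List[int], records: List[Record]) -> List[int]:
    """Start from the best single site and add sites while errors decrease."""
    chosen = [min(all_sites, key=lambda s: errors([s], records))]
    best = errors(chosen, records)
    while True:
        remaining = [s for s in all_sites if s not in chosen]
        if not remaining:
            break
        candidate = min(remaining, key=lambda s: errors(chosen + [s], records))
        score = errors(chosen + [candidate], records)
        if score >= best:  # no change or worse: stop, as described in Step 4
            break
        chosen.append(candidate)
        best = score
    return chosen


# Hypothetical records: three target and three non-target sequences.
toy_records: List[Record] = [
    (True, {1, 2, 3}),
    (True, {2, 3}),
    (True, {2}),
    (False, {1}),
    (False, set()),
    (False, {3}),
]
chosen = greedy_select([1, 2, 3], toy_records)
print(chosen)
```

In the toy data a single site already separates the two groups, so adding any further site only increases the error count and the selection stops after the first site.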
For all the site combinations given in Table 8, only one sequence showed the presence of HBV-A specific bases at all the 6 sites. This sequence was identified as a mixed genotype of C and D by the investigators; however, our analysis as well as other genotype predicting tools available online (which are based on whole genome sequences) predicted this as genotype A.
Observation: Site 2 (a single site corresponding to position 439 of representative sequence SEQ ID No.l) is sufficient for classification of a HBV-A sequence correctly with least probability of misclassification of a HBV-non A genotype sequences as HBV- A genotype. Step 5: The probability of Misclassification in HBV-A and HBV-non A genotypes taken together for Site 2 is as given below;
Number of HBV-non A genotype sequences showing HBV-A specific base at site 2: 1 Number of HBV-A genotype sequences not showing the HBV-A specific base at site 2: 3
Therefore, Total number of misclassifications: 4 out of 587. Therefore, probability of misclassification due to Site 2 in HBV-A and HBV-Non A genotype sequences: 0.00682594. Probability of misclassification by site 2 in genotype A and non-A taken together was investigated as shown (Table 9).
As shown in Table 9, the probability of misclassification increased upon addition of more sites, which implies that site 2 (base position 439) is sufficient to identify genotype A. A similar analysis was performed for the rest of the genotypes as well, and 10 sites were finally short-listed (Table 10). The site and base call description of these ten sites is given in Figure 3.
Significance Analysis of the Informative sites
Each of the 10 sites was checked for its specificity within all the HBV sequences. The frequency of identification of the respective informative sites among the HBV sequences is provided in Table 11.
In Table 11, the number outside the bracket is the total number of sequences with the non-specific site (the informative site of another genotype). The number inside the bracket is the total number of sequences with both specific and non-specific (informative site of another genotype) sites. Overall, the 10 informative sites showed a false positive in 1.7%-6.9% of cases.
Identification of Non Human HBV Genome Sequences from Human HBV Genome Sequences Using Informative Site Analysis
DNA sequences of non-human HBV from Gibbon (accession nos. AJ131568, AJ131574, AJ131569, AJ131571, AJ131572, AJ131573, U46935); Gorilla (accession no. AJ131567); Woolly monkey (accession no. AF046996) and Chimpanzee (accession no. D00220) were taken for analysis. These sequences were compared with the human consensus sequences for the different genotypes (A-H). Multiple sequence alignment of these sequences was done using CLUSTALW software. The short-listed human HBV informative sites were checked on these non-human genome sequences.
Table 12 shows the nucleotide base calls in human as well as non-human HBV genome sequences. The Woolly monkey HBV genome could be differentiated from the human HBV genome, as the base calls at positions 494 and 1620 were different from the base calls present in human HBV sequences. Chimpanzee, Gibbon and Gorilla could not be clearly differentiated based on the 10 informative sites. Four more sites (102_A/B, 454_A, 619_A and 636_C) were added to the informative sites (thus totaling 14 sites) for differentiating Chimpanzee, Gibbon and Gorilla HBV genome sequences from human HBV sequences. With the inclusion of more sites, HBV DNA sequences from the virus infecting Chimpanzees and Gibbons could be differentiated.
SEQUENOM® Assay Design Document
The SEQUENOM® MassARRAY assay was designed for all 10 informative sites. The purpose was to check the robustness of these sites in correctly identifying the genotypes in HBV DNA isolated from clinical samples.
Genomic signature profiling (GSP) system
Figure 4 illustrates an exemplary genomic signature profiling (GSP) system 600. In one implementation, the GSP system 600 includes input/output (I/O) interface(s) 602, one or more processor(s) 604, network interface(s) 606, and a memory 608. The GSP system 600 is communicatively coupled to a sequential database 610. The I/O interfaces 602 can be configured to send and receive information from various I/O media such as a keyboard, a mouse, an external memory, a printer, etc. The processor(s) 604 may be implemented as one or more microprocessors, microcomputers, dual core processors, and so forth. Among other capabilities, the processor(s) 604 can be configured to fetch and execute computer readable instructions stored in the memory 608.
The network interfaces 606 enable the GSP system 600 to communicate to other computing-based devices, such as web servers and external databases. The network interfaces 606 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks (e.g. LAN, cable, etc.) and wireless networks (e.g. WLAN, cellular, satellite, etc.). For the purpose the network interfaces 606 may include one or more ports for connecting a number of computing devices to each other or to another server computer.
The memory 608 may include any computer-readable medium known in the art, for example, volatile random access memory (e.g., RAM) and non-volatile read-only memory (e.g., ROM, flash memory, etc.). As illustrated in Figure 6, the memory 608 may include program modules 612 and program data 614. The program modules 612 generally include routines, programs, objects, components, and data structures that perform particular tasks or implement particular abstract data types.
In one implementation, the program modules 612 include a consensus generator 616, a preliminary signature generator 618, a genomic signature generator 620, and other modules 622. The other modules 622 may include programs that supplement applications on a computing-based device, such as an operating system. The program data 614 includes genomic data 624 and other data 626 required for or generated during operation of the GSP system 600.
In one implementation, the GSP system 600 is employed for identification of genomic signatures for genotypes. Multiple sequences for the genotype under study may be stored in the sequential database 610. The multiple sequences may be stored in the form of clusters. Additionally, the sequential database 610 may also store multiple sequences for genotypes substantially similar to the genotype under study. In one implementation, the sequential database 610 is stored as part of a computing system external to the GSP system 600, and communicates with the GSP system 600 using the network interfaces 606. In another implementation, the sequential database 610 may be stored in an external memory device and may communicate with the GSP system 600 using the I/O interfaces 602. In yet another implementation, the sequential database 610 may be integral to the program data 614.
The clusters of multiple sequences are fetched from the sequential database 610 by the consensus generator 616 to generate a consensus sequence for each of the genotypes, using methods as previously explained. The consensus sequences thus generated by the consensus generator 616 are stored in the other data 626 and are used by the preliminary signature generator 618 to generate preliminary signatures for individual genotypes.
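What a consensus generator of this kind might do can be sketched in a few lines. The majority rule per alignment column and the alphabetical tie-break are assumptions made for illustration; the specification only refers to "methods as previously explained", and the function name `consensus` is hypothetical.

```python
from collections import Counter
from typing import Sequence


def consensus(aligned: Sequence[str]) -> str:
    """Majority base per column of equal-length, pre-aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    bases = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        # Ties are broken alphabetically so the result is deterministic.
        bases.append(min(base for base, n in counts.items() if n == top))
    return "".join(bases)


# Toy cluster of three aligned sequences (hypothetical data).
print(consensus(["ACGT", "ACGA", "ACTT"]))
```

One consensus would be produced per genotype cluster and passed on to the next stage.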
The preliminary signature generator 618 creates a multiple sequence alignment of the consensus sequences to compare the consensus sequences and identifies a set of unique sites of the individual genotypes. Additionally, the preliminary signature generator 618 may also identify a set of shared sites of the individual genotypes. In one implementation, the preliminary signature generator 618 generates a preliminary signature for individual genotypes based on the unique sites, as previously explained. In another implementation, the preliminary signature generator 618 generates a preliminary signature for individual genotypes based on a combination of the unique sites and the shared sites, as previously explained. The preliminary signature thus generated is stored in the other data 626 and is used by the genomic signature generator 620 to generate genomic signature for the individual genotypes.
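The search for unique sites in the aligned consensus sequences can be illustrated as below. This is a sketch under the assumption that a "unique site" is an alignment column at which exactly one genotype carries a base found in no other genotype; gap handling and the function name `unique_sites` are illustrative choices, not taken from the specification.

```python
from collections import Counter
from typing import Dict, List, Tuple


def unique_sites(consensus_by_genotype: Dict[str, str]) -> Dict[str, List[Tuple[int, str]]]:
    """Map each genotype to (1-based position, base) pairs seen in no other genotype."""
    genotypes = list(consensus_by_genotype)
    sequences = [consensus_by_genotype[g] for g in genotypes]
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("consensus sequences must be aligned to the same length")
    result: Dict[str, List[Tuple[int, str]]] = {g: [] for g in genotypes}
    for index, column in enumerate(zip(*sequences)):
        counts = Counter(column)
        for genotype, base in zip(genotypes, column):
            if base != "-" and counts[base] == 1:
                result[genotype].append((index + 1, base))
    return result


# Toy alignment of three consensus sequences (hypothetical data).
toy = {"A": "ACGTA", "B": "ACGTG", "C": "ATGTG"}
print(unique_sites(toy))
```

Shared sites could be collected in the same pass by recording bases whose count is greater than one but smaller than the number of genotypes.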
The genomic signature generator 620 identifies one or more informative sites from the unique and/or shared sites of the individual genotype. For this purpose, the genomic signature generator 620 performs detailed statistical analysis of the consensus sequence of the individual genotype with the multiple sequences for the genotype, stored in the sequential database 610. As previously explained, the statistical analysis is performed to check the statistical significance of the probability of presence of the unique and/or shared sites identified by the preliminary signature generator 618. In one implementation, a Chi-square test is performed to check the statistical significance of the probability of presence of the unique and shared sites. However, it would be appreciated that any other statistical test may also be implemented.
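As an illustration of such a test, the sketch below computes the Pearson Chi-square statistic for a 2x2 table of genotype (A versus non-A) against presence or absence of the specific base. The counts are those quoted earlier for site 2 (3 of 60 HBV-A sequences lacking the base, 1 of 527 non-A sequences carrying it); the choice of an uncorrected Pearson statistic and the 5% critical value for one degree of freedom (3.841) are assumptions, since the specification does not state how the test was set up.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson Chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


CRITICAL_VALUE_DF1_ALPHA_0_05 = 3.841

# Rows: HBV-A, non-A. Columns: specific base present, absent.
statistic = chi_square_2x2(57, 3, 1, 526)
print(f"chi-square = {statistic:.1f}, "
      f"significant at 5%: {statistic > CRITICAL_VALUE_DF1_ALPHA_0_05}")
```

With small expected counts in some cells, an exact test may be preferable in practice; the specification leaves the choice of test open.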
The informative sites, in a consensus sequence of a genotype, are identified such that a probability of any other genotype having a same nucleotide base at a same informative site is minimal. The informative sites in combination with stretches of nucleotide bases form a unique genomic signature for the genotype. The genomic signature for the genotype may be stored in the genomic data 624. The generated genomic signature can be utilized for identification of pathogens, preclinical detection of infections, diagnosis of disease, biomarker detection, toxicology testing, positive selection, phylogenetic prediction, drug recovery, and disease prognosis, etc. as discussed earlier.
The following examples are given by way of illustration of the present invention and therefore should not be construed to limit its scope.
EXAMPLES
It should be understood that the examples described herein are for illustrative purposes only and that various modifications or changes in light of the specification will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and the scope of the appended claims.
Example 1 Detailed description of Algorithm based detection/characterization
The genomic signature identification for a species involves a combination of multiple sequence alignment, phylogeny (Neighbour Joining) and statistical validation. A combination of unique sites from a multiple sequence alignment of similar sequences was taken as a preliminary signature for a particular species, which was then further validated using phylogeny- and statistics-based approaches. The genomic signature formed is a combination of at most two unique nucleotide bases. The genomic signature is validated for specificity and accuracy of prediction in a set of similar sequences.
The nucleotide sequence of interest is run on the algorithm, which scans for specific nucleotide bases at specific informative sites. Based on the presence or absence of the base at all interrogated sites, the algorithm calculates a score to tabulate the possibility of the sequence matching the signature of any particular microorganism or its subtype.
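A scan of this kind can be sketched as a lookup against a signature table. The positions and bases below are the HBV values recited in claim 8 of this document (relative to SEQ ID NO: 1); the scoring rule (fraction of a genotype's sites that match) and the function names are assumptions for illustration, and the base calls are assumed to be already mapped to SEQ ID NO: 1 coordinates.

```python
from typing import Dict, List

# Genotype -> {position in SEQ ID NO: 1: genotype-specific base}, as recited in claim 8.
HBV_SIGNATURES: Dict[str, Dict[int, str]] = {
    "A": {439: "A"},
    "B": {411: "G"},
    "C": {494: "A", 2786: "G"},
    "D": {293: "A"},
    "E": {2964: "T"},
    "F": {631: "A"},
    "G": {1003: "A", 1620: "G"},
    "H": {1321: "G"},
}


def score_genotypes(base_calls: Dict[int, str]) -> Dict[str, float]:
    """Fraction of each genotype's informative sites matched by the base calls."""
    scores = {}
    for genotype, sites in HBV_SIGNATURES.items():
        hits = sum(base_calls.get(pos, "").upper() == base for pos, base in sites.items())
        scores[genotype] = hits / len(sites)
    return scores


def call_genotypes(base_calls: Dict[int, str]) -> List[str]:
    """Genotypes whose complete signature is present (assumed decision rule)."""
    return sorted(g for g, s in score_genotypes(base_calls).items() if s == 1.0)


# Hypothetical base calls for one sample at the ten informative sites.
sample = {439: "A", 411: "T", 494: "T", 2786: "A", 293: "C",
          2964: "C", 631: "C", 1003: "G", 1620: "A", 1321: "A"}
print(call_genotypes(sample))
```

A sample returning more than one genotype under this rule would correspond to a mixed or ambiguous call and would need further inspection.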
Genomic signature of HBV genotype
Genomic signature of HBV genotypes A, B, C, D, E, F, G, and H was generated using the above algorithm. The results are provided in Table 14 and Table 15.
Example 2: Primer designing for investigating the base at the informative site
Step 1: An MSA of consensus HBV sequences from the database was created individually for all genotypes.
Step 2: 5' and 3' 100 bp flanking sequences were taken on either side of the informative site. These flanking sequences were used for primer (PCR1, PCR2 and iPLEX) design. For example, the HBV informative sites and their 20 bp flanking sequences are as below:
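Extracting an informative site with its flanks in the notation used in the listing can be sketched as follows. The function name `site_with_flanks` and its parameters are hypothetical; the toy input is simply the 41-base window of SEQ ID NO: 55, so the site sits at position 21 of the toy string and not at its genomic position 439.

```python
def site_with_flanks(sequence: str, position: int, alt_base: str, flank: int = 20) -> str:
    """Return 'LEFT(ref/alt)RIGHT' for a 1-based position with `flank` bases per side."""
    index = position - 1
    if index - flank < 0 or index + flank >= len(sequence):
        raise ValueError("not enough flanking sequence around the site")
    left = sequence[index - flank:index]
    right = sequence[index + 1:index + 1 + flank]
    return f"{left}({sequence[index]}/{alt_base}){right}"


# Toy window: 20 bases, the informative base, 20 bases (taken from SEQ ID NO: 55).
window = "CTGCTATGCCTCATCTTCTT" + "A" + "TTGGTTCTTCTGGATTATCA"
print(site_with_flanks(window, 21, "G"))
```

For primer design the same slicing would be applied with `flank=100` on the full consensus sequence.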
SEQ ID NO: 55: A_439_A
CTGCTATGCCTCATCTTCTT(A/G)TTGGTTCTTCTGGATTATCA
SEQ ID NO: 56: B_411_B
GCGTTTTATCATCTTCCTCT(G/T)CATCCTGCTGCTATGCCTCA
SEQ ID NO: 57: C_494_C
TTTGTCCTCTACTTCCAGGA(A/T)CATCAACTACCAGCACGGGA
SEQ ID NO: 58: C_2786_C
AATCATTACTTCAAAACTAG(G/A)CATTATTTACATACTCTGTG
SEQ ID NO: 59: D_293_D
TCAATTTTCTAGGGGGAACY(A/C)CCGTGTGTCTTGGCCAAAAT
SEQ ID NO: 60: E_2964_E
ACCACCAATCCTCTGGGATT(T/C)TTTCCCGACCACCAGTTGGA
SEQ ID NO: 61: F_631_F
ATCCCATCATCYTGGGCTTT(A/C)GGAAAATACCTATGGGAGTG
SEQ ID NO: 62: G_1003_G
GGAAAGTCTGTCAACGAATA(A/G)CTGGTCTGTTGGGTTTCGCT
SEQ ID NO: 63: G_1620_G
CTGCACGTTACATGGAAACC(G/A)CCATGAACACCTCTCATCAT
SEQ ID NO: 64: H_1321_H
CTCGCAGCMGGTCTGGAGCG(G/A)ACATTATCGGCACTGACAAC
Similarly the primers were designed in the flanking sequences for each of the 10 sites in the eight HBV genotypes (therefore for 10 sites x 8 genotypes = 80 primers).
Similarly, primers were designed for the HIV genotypes as well:
SEQ ID NO: 65: A1_1440_A1
TTTRAATRTRATGCTRAACA[T/C]AGTGGGGGGACAYCAGGCAG
SEQ ID NO: 66: A1_4244_A1
TCATTCAGGCMCARCCAGA[C/T]ARRAGTGAATCAGARWTAG
SEQ ID NO: 67: A2_4462_A2
GYAATTGGAGAGCMATGGCT[C/A]ATGACTTTAATCTRCCACCT
SEQ ID NO: 68: A2_7304_A2
ACATGGAATTADRCCAGTAG[C/T]ATCAACTCAACTGCTGYTGA
SEQ ID NO: 69: B_2040_B
TACAGAAAGGCAATTTTAGG[A/G]ACCAAAGAAAGACTGTTAAG
SEQ ID NO: 70: B_9001_B
GTGGGAAGCCCTCAAATATT[G/T]GTGGAATCTCCTACAGTATT
SEQ ID NO: 71: C_2480_C
GGCCAAATAAAAGAGGCTCT[C/A]TTAGACACAGGAGCAGATGA
SEQ ID NO: 72: C_6395_C
AACTGGTTAATTAAAAGAAT[T/A]AGGGAAAGAGCAGAAGACAG
SEQ ID NO: 73: D_4158_D
AGACTGARTTACAAGCMATW[A/T]AYCTAGCYTTGCAGGATTCG
SEQ ID NO: 74: D_8180_D
TTGGGARCAGCAGGAAGCAC[G/T]ATGGGCGCASBGTCABTGAC
SEQ ID NO: 75: F1_1679_F1
GAMATSTATAAAAGATGGAT[C/A]ATCCTAGGATTAAATAAAAT
SEQ ID NO: 76: F1_7169_F1
RGCKTGTCCMAAGRTRWCCT[G/T]TGAGCCWATTCCCATACATT
SEQ ID NO: 77: F2_3456_F2
CTATACAATTGCCARAMAAG[A/G]GCAGCTGGACTGTCAATGAT
SEQ ID NO: 78: F2_7985_F2
AKRNWRMYRRBWMHGASAYY[C/T]TCAGACCTRKAGGRGGAGAK
SEQ ID NO: 79: G_2993_G
GARGTCCMATTAGGAATACC[T/A]CAYCCYKSRGGGTTAAAAMA
SEQ ID NO: 80: G_6574_G
ATTATGGRGTACCTGTGTGG[G/A]ARGAYGCARAKRCCMCYCTA
SEQ ID NO: 81: H_3973_H
RGAGTTTGTTAACACCCCTC[A/C]TCTAGTRAAATTATGGTATC
SEQ ID NO: 82: H_8493_H
CTTGGATGGARTGGGAYARA[C/G]AAATTRRCAATTACACAGA
SEQ ID NO: 83: J_2589_J
TTATCAAAGTRAGACAGTAT[A/G]AYGAKRTACYSATAGAAATT
SEQ ID NO: 84: J_6769_J
TGCAKGAWGATATAATCAGT[G/T]TATGGGATGAAAGCTTAAAR
SEQ ID NO: 85: K_3248_K
AAAAATCCAGAWATGGTTWT[A/C]TACCAATACATGGATGATTT
SEQ ID NO: 86: K_9134_K
YAMAGAGCTTTTAGAGCTTT[A/C/T]CTTCACATACCTAGAAGAAT
Similarly, the primers were designed in the flanking sequences for each of the 22 sites in the eleven HIV genotypes (therefore for 22 sites x 11 genotypes = 242 primers). The primer sequences were designed for multiplexing using the MassARRAY iPLEX design software. Similarly, primers can be designed for other pathogens based on the flanking sequences of their respective informative sites.
Example 3
Assays
Nucleic Acid Isolation
Different isolation procedures can be employed based on the source of the starting material.
- Viral nucleic acid extraction
Nucleic acid isolation from the sample involves lysis of the sample by detergent and proteinase K, which releases total nucleic acids. A chaotropic salt such as guanidine HCl was added to the sample, which leads to selective binding of nucleic acid to the glass fiber of the filter tube. The nucleic acid remains bound while a series of rapid "wash and spin" steps remove contaminating cellular components. Finally, the nucleic acids are eluted in low-salt buffer.
- Genomic nucleic acid extraction
Phenol was added to the samples for protein denaturation. Ammonium acetate and ethanol were added to precipitate the DNA. The DNA pellet was washed and re-suspended in low-salt buffer or ultrapure water.
Example 4 Detection of HBV genotypes
Mass spectrometry is an analytical technique that identifies the chemical composition of a sample (nucleotide sequence) on the basis of the mass-to-charge ratio of charged particles. The method employs chemical fragmentation of a sample into charged particles (ions) and measurements of two properties, charge and mass of the resulting particles, the ratio of which is deduced by passing the particles through electric and magnetic fields in a mass spectrometer.
The mass spectrometry assay is based on multiplex polymerase chain reaction followed by a single base primer extension reaction. After the PCR, the remaining nucleotides are deactivated by Shrimp Alkaline Phosphatase treatment. The single base primer extension step is then performed, and the primer extension products are analyzed using MALDI-TOF MS.
Example: Detection of HBV genomic signatures by Mass Spectrometry
Source: HBV viral DNA from HBsAg +ve serum sample.
Step 1: PCR Amplification
Ten ng of DNA was transferred to a 384-well plate and dried at 85°C for 15 min in a thermal cycler. The following oligonucleotides were used for detection of HBV genotypes in the sample.
HBV-A Set 11: SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 33
HBV-B Set 12: SEQ ID NO: 15, SEQ ID NO: 16 and SEQ ID NO: 34
HBV-C Set 13: SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 35
HBV-C Set 14: SEQ ID NO: 19, SEQ ID NO: 20 and SEQ ID NO: 36
HBV-D Set 20: SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 42
HBV-E Set 15: SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 37
HBV-F Set 16: SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 38
HBV-G Set 17: SEQ ID NO: 25, SEQ ID NO: 26 and SEQ ID NO: 39
HBV-G Set 18: SEQ ID NO: 27, SEQ ID NO: 28 and SEQ ID NO: 40
HBV-H Set 19: SEQ ID NO: 29, SEQ ID NO: 30 and SEQ ID NO: 41
The reaction mixture was prepared as follows:
Reagent                        Volume      Final Concentration
Ultra pure water               2.85 μl
10X buffer with 15 mM MgCl2    0.625 μl    1.25X (*1.875 mM MgCl2)
MgCl2 (25 mM)                  0.325 μl    1.625 mM (*Total 3.5 mM)
dNTP Mix (each 25 mM)          0.1 μl      1.625 mM each
Primer Mix [500 nM]            1 μl
Taq polymerase (5 U/μl)        0.1 μl      0.5 U
Total volume                   5 μl
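The buffer and magnesium entries of the reaction mixture follow from the usual dilution relation (final concentration = stock concentration x volume added / total volume). The short check below reproduces those entries for the 5 μl reaction; the function name `final_concentration` is hypothetical and the script is only a worked calculation, not part of the protocol.

```python
def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after diluting `volume_ul` of stock into `total_ul`."""
    if total_ul <= 0 or volume_ul < 0:
        raise ValueError("volumes must be non-negative and total must be positive")
    return stock * volume_ul / total_ul


TOTAL_UL = 5.0
# Volumes listed in the table: water, buffer, MgCl2, dNTP mix, primer mix, polymerase.
volumes_ul = [2.85, 0.625, 0.325, 0.1, 1.0, 0.1]

buffer_x = final_concentration(10.0, 0.625, TOTAL_UL)        # 10X buffer stock
mg_from_buffer = final_concentration(15.0, 0.625, TOTAL_UL)  # mM MgCl2 carried by buffer
mg_added = final_concentration(25.0, 0.325, TOTAL_UL)        # mM from the 25 mM stock
mg_total = mg_from_buffer + mg_added

print(f"volume check: {sum(volumes_ul):.3f} ul")
print(f"buffer {buffer_x:.2f}X, MgCl2 {mg_from_buffer:.3f} + {mg_added:.3f} "
      f"= {mg_total:.1f} mM")
```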
The PCR plate containing the reaction mixture was sealed with plate sealing film. The PCR microplate was centrifuged at 2000 g for 1 min to spin down the PCR mix. The reaction mixture was subjected to the following thermal conditions in a thermocycler.
94 °C 15 min
72 °C 3 min
4 °C forever
(** This may be varied from 30-45 cycles)
Step 2: SAP (Shrimp alkaline phosphatase) treatment for iPLEX MassARRAY Genotyping
The PCR product obtained from step 1 was treated with Shrimp alkaline phosphatase to dephosphorylate the unincorporated dNTPs. The SAP mixture was prepared as follows.
Reagent                Volume (μl)
Ultra pure water       1.53
10X SAP buffer         0.17
SAP enzyme (1 U/μl)    0.3
Total volume           2
The mixture was vortexed for five seconds. 2.0 μl of SAP enzyme solution was added to each well in the 384-well sample microplate as per the sample sheet. The sample microplate was sealed properly with plate sealing film. The SAP-treated PCR reaction was incubated at the following thermal conditions in a thermocycler.
37 °C for 20 minutes
85 °C for 5 minutes
4 °C forever
Step 3: iPLEX Reaction
The iPLEX primer extension reaction was performed as follows. In a tube, the iPLEX reaction mixture was prepared as given below.
Reagent                            Volume (μl)
Ultra pure water                   0.559
10X iPLEX buffer                   0.2
iPLEX termination mix              0.2
Normalized extension primer mix    1
iPLEX enzyme                       0.041
Total volume                       2
The reaction mixture was subjected to the following thermal conditions in a thermocycler.
• 94 °C 30 sec
• 40 cycles of: 94 °C 5 sec, followed by 5 cycles of (52 °C 5 sec, 80 °C 5 sec)
• 72 °C 3 min
• 4 °C forever
Step 4: Resin Cleanup of iPLEX extension product
I Preparation of resin plate
Using the elongated spoon, resin was transferred from the container onto a 384-well 6 mg dimple plate. The resin was spread into the dimple plate using a scraper. Excess resin was removed.
II Addition of resin to the 384-well sample microplate
After the iPLEX reaction, the sample microplate was spun down and the sealing film was removed from the 384-well sample microplate. The 384-well sample microplate was gently placed upside-down over the dimple plate. Holding the 384-well microplate and the dimple plate together, they were flipped over so that the resin fell out of the dimple plate into the wells of the microplate. The sample microplate was spun down.
III Addition of water to the 384-well sample microtiter plate
25 μl of MilliQ water was added into each well of the 384-well sample microplate and the sample microplate was sealed. The microplate was spun down and kept in a plate rotator for half an hour. The sample microtiter plate was centrifuged at 3200 g for five minutes. The iPLEX reaction products were transferred to a SpectroCHIP using the MassARRAY Nanodispenser.
Step 5: MALDI-TOF MS Analysis
MassARRAY Workstation version 3.3 software is used to process and analyze iPLEX SpectroCHIP bioarrays.
Step 6: Identification of Genotype based on the results obtained from SEQUENOM® iPLEX Assay
The result obtained after mass spectrometric analysis of the HBV sites on three samples (marked HBV A, HBV D and HBV A') is given in Table 13.
Step 7: Based on the base calls at all positions, the HBV genotype(s) were identified.
Example 5 Detection of HIV genotypes
Detection of HIV genotypes was performed as described above using the oligonucleotide sets provided below.
Primers for HIV Genotype B
SEQ ID NO: 43 HIV_58.B_2040_B_PCR1
ACGTTGGATGAGCCAAGTAACAAATCCAGC
SEQ ID NO: 44 HIV_58.B_2040_B_PCR2
ACGTTGGATGGTGCCCTTCTTTGCCACAAT
SEQ ID NO: 51 HIV_58.B_2040_B_iPLEX
CTTAACAGTCTTTCTTTGGT
SEQ ID NO: 45 HIV_80.B_9001_B_PCR1
ACGTTGGATGAGGATTGTGGAACTTCTGGG
SEQ ID NO: 46 HIV_80.B_9001_B_PCR2
ACGTTGGATGTCCTGACTCCAATACTGTAG
SEQ ID NO: 52 HIV_80.B_9001_B_iPLEX
ATACTGTAGGAGATTCCAC
Primers for HIV Genotype C
SEQ ID NO: 47 HIV_103.C_2480_C_PCR1
ACGTTGGATGAATCACTCTTTGGCAGCGAC
SEQ ID NO: 48 HIV_103.C_2480_C_PCR2
ACGTTGGATGTACTGTATCATCTGCTCCTG
SEQ ID NO: 53 HIV_103.C_2480_C_iPLEX
TCTGCTCCTGTGTCTAA
SEQ ID NO: 49 HIV_147.C_6395_C_PCR1
ACGTTGGATGGGAAATTGGTAAGACAAAGC
SEQ ID NO: 50 HIV_147.C_6395_C_PCR2
ACGTTGGATGCTCTCATTGCCACTGTCTTC
SEQ ID NO: 54 HIV_147.C_6395_C_iPLEX
TCTTCTGCTCTTTCCCT
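A simple consistency check on the listed oligonucleotides is sketched below: each iPLEX extension primer above appears to be the reverse complement of the 3' flank of the corresponding informative site (SEQ ID NO: 69 to 72), ending at the base adjacent to the site. This is an observation made from the listed sequences, not a statement of how the primers were designed; the helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def anneals_next_to_site(extension_primer: str, three_prime_flank: str) -> bool:
    """True if the primer's 3' end lies on the base adjacent to the site (reverse strand)."""
    return reverse_complement(three_prime_flank).endswith(extension_primer.upper())


# (site, 3' flank from SEQ ID NO: 69-72, iPLEX primer from SEQ ID NO: 51-54)
checks = [
    ("B_2040", "ACCAAAGAAAGACTGTTAAG", "CTTAACAGTCTTTCTTTGGT"),
    ("B_9001", "GTGGAATCTCCTACAGTATT", "ATACTGTAGGAGATTCCAC"),
    ("C_2480", "TTAGACACAGGAGCAGATGA", "TCTGCTCCTGTGTCTAA"),
    ("C_6395", "AGGGAAAGAGCAGAAGACAG", "TCTTCTGCTCTTTCCCT"),
]
for site, flank, primer in checks:
    print(site, anneals_next_to_site(primer, flank))
```

Because the primers sit on the reverse strand, the base added in the single-base extension is the complement of the base written at the informative site in the forward-strand listing.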
Table 1 : Number of short-listed sites amongst different HBV genotypes
Table 2: Distribution of the number of sequences representative of each of the 8 genotypes in the dataset
Table 3: The distribution of Base calls at HBV-A specific sites in HBV-A and HBV-non A sequences
Table 4: Frequency distribution of the occurrence of the specific site among all Genotype A sequences
Table 5: The tables show the probability of A. Classification and B. Misclassification for the 6 genotype A sequences among the HBV-A sequences
A
B
Table 6: Calculation of the probability of occurrence of the individual HBV-A sites in Non HBV-A genotype sequences
Table 7: Probability of misclassification by the individual sites is calculated taking all Non HBV-A genotype sequences in a single group of 527 sequences.
Table 8: Calculation of the probability of misclassification in HBV-A and HBV-nonA genotype sequences. (NA: Non A genotypes)
Table 10: The 10 informative sites identified from the eight HBV genotypes (*Site Name is the site position with respect to the representative HBV sequence SEQ ID No. 1)
Table 11: Frequency of occurrence of the informative sites among HBV sequences of different genotypes
Table 12: Investigation of base calls at the 10 informative sites among non human sequences
Table 13: Base calls generated from Sequenom MassSpectometer analysis of clinical samples
Table 14: Details of the genomic signature of HBV genotypes
Table 15: the genomic signature of HBV and corresponding site in the consensus sequence of HBV genotype
Claims
1. A method of generating a genomic signature of a genotype within a species of an organism, said method comprising a. providing a consensus polynucleotide sequence of each of the genotypes, wherein the consensus sequence is obtained from a small subset of polynucleotide sequences of said genotype, wherein said subset comprises 5 to 10% of total polynucleotide sequences of said genotype, b. aligning the consensus polynucleotide sequences of said genotypes, c. obtaining all unique sites having a single nucleotide variation at a specific nucleotide position in the consensus polynucleotide sequence, d. validating said unique sites with a larger subset comprising the polynucleotide sequences of said genotype, e. selecting at least one unique site to generate the genomic signature of said genotype, wherein the frequency of occurrence of said unique site is in the range of 0.85-1.0 in said genotype and 0.001-0.05 in other genotypes, wherein the genomic signature is specific to said genotype and different in other genotypes.
2. The method of generating a genomic signature as claimed in claim 1, wherein the organism is selected from a group consisting of virus, bacteria and protozoan.
3. The method of generating a genomic signature as claimed in claim 1, wherein the polynucleotide sequence is the genomic sequence of the organism.
4. The method of generating a genomic signature as claimed in claim 2, wherein the virus is selected from a group consisting of Hepatitis Viruses, Dengue, Herpes Simplex Virus, Rotavirus, Tobacco Mosaic Virus, Ebola, Polio, Influenza virus, Japanese encephalitis and Human Immunodeficiency Virus.
5. The method of generating a genomic signature as claimed in claim 2, wherein the virus is selected from a group consisting of Hepatitis B Virus, Hepatitis C Virus and Human Immunodeficiency Virus.
6. The method of generating a genomic signature as claimed in claim 2, wherein the bacteria are selected from a group consisting of Mycobacterium, Streptococcus, Pseudomonas, Shigella, Campylobacter, and Salmonella.
7. The method of generating a genomic signature as claimed in claim 2, wherein the protozoan is selected from a group consisting of Giardia Lamblia, Naegleria Fowleri, Acanthamoeba Spp, Entamoeba Histolytica, Cryptosporidium Parvum, Cyclospora Cayetanensis and Isospora Belli.
8. A method of detecting one or more HBV genotypes in a sample using the HBV genomic signature , said method comprising a. providing a sample; and b. analyzing the presence of genomic signature specific for HBV genotype wherein the genomic signature of a specific HBV genotype is the presence of specific single nucleotide variation at one or more specific nucleotide position in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 439 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-A genotype , the presence of guanine at nucleotide position 411 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-B genotype, the presence of adenine at nucleotide position 494 and guanine at nucleotide position 2786 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-C genotype, the presence of adenine at nucleotide position 293 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-D genotype, the presence of thymine at nucleotide position 2964 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-E genotype, the presence of adenine at nucleotide position 631 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-F genotype, the presence of adenine at nucleotide position 1003 and guanine at nucleotide position 1620 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-G genotype, the presence of guanine at nucleotide position 1321 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-H genotype .
9. A method of detecting HBV-A and HBV-D genotypes in a sample using the HBV-A and HBV-D genomic signature, said method comprising a. providing a sample; and b. analyzing the presence of genomic signatures specific for HBV-A and HBV-D genotypes, wherein the genomic signature for HBV-A genotype is the presence of specific single nucleotide variation at nucleotide position 439 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1 and the genomic signature for HBV-D genotype is the presence of specific single nucleotide variation at nucleotide position 293 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 439 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-A genotype, and the presence of adenine at nucleotide position 293 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV D genotype.
10. A method of detecting HBV-C genotype in a sample using the HBV-C genomic signature, said method comprising a. providing a sample; and b. analyzing the presence of genomic signature specific for HBV-C genotype, wherein the genomic signature for HBV-C genotype is the presence of specific single nucleotide variation at nucleotide positions 494 and 2786 in the representative HBV polynucleotide sequence as set forth in SEQ ID NO: 1, whereby the presence of adenine at nucleotide position 494 and guanine at nucleotide position 2786 in the polynucleotide sequence as set forth in SEQ ID NO: 1 detects HBV-C genotype.
11. The method as claimed in claims 8 to 10, wherein the sample is selected from a group consisting of blood, plasma, serum, saliva, sputum, tissue, DNA and hair, preferably DNA.
12. The method as claimed in claims 8 to 10, wherein analyzing the presence of genomic signature specific for HBV genotype is performed using hybridization-based method, enzyme-based method, post-amplification method based on physical properties of DNA or sequencing method
13. The method as claimed in claim 12, wherein the hybridization-based method is selected from a group consisting of dynamic allele-specific hybridization, molecular beacons, and SNP microarrays.
14. The method as claimed in claim 12, wherein the enzyme-based method is selected from a group consisting of restriction fragment length polymorphism, PCR-based method, flap endonuclease, primer extension, 5'-nuclease, or oligonucleotide ligase assay.
15. The method as claimed in claim 12, wherein the post-amplification method is selected from a group consisting of single strand conformation polymorphism, temperature gradient gel electrophoresis, denaturing high performance liquid chromatography, high-resolution melting of the entire amplicon or SNPlex.
16. The method as claimed in claims 8 to 10, wherein the step of analyzing the presence of genomic signature specific for HBV genotype further comprises selecting a pair of oligonucleotides to amplify the specific single nucleotide variation, performing a primer extension reaction to generate extended primer and analyzing the extended primer to identify the genomic signature of the HBV genotype.
17. The method as claimed in claim 16, wherein the extended primer is analyzed using mass spectrometry, differential labeling method or differential size fractionation method.
18. The method as claimed in claim 16, wherein the oligonucleotides are selected from a group consisting of set 1 SEQ ID NO: 13 and SEQ ID NO 14; set 2 SEQ ID NO: 15 and SEQ ID NO: 16; set 3 SEQ ID NO: 17 and SEQ ID NO: 18; set 4 SEQ ID NO: 19 and SEQ ID NO: 20; set 5 SEQ ID NO: 21 and SEQ ID NO: 22; set 6 SEQ ID NO: 23 and SEQ ID NO: 24; set 7 SEQ ID NO: 25 and SEQ ID NO: 26; set 8 SEQ ID NO: 27 and SEQ ID NO: 28; set 9 SEQ ID NO: 29 and SEQ ID NO: 30 and set 10 SEQ ID NO: 31 and SEQ ID NO: 32.
19. The method as claimed in claim 16, wherein the primer extension reaction was performed using the oligonucleotide as set forth in SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37; SEQ ID NO: 38; SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41 and SEQ ID NO: 42.
20. Oligonucleotide set selected from a group consisting of set 11 SEQ ID NO: 13, SEQ ID NO: 14 and SEQ ID NO: 33; set 12 SEQ ID NO: 15, SEQ ID NO: 16 and SEQ ID NO: 34; set 13 SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 35; set 14 SEQ ID NO: 19, SEQ ID NO: 20 and SEQ ID NO: 36; set 15 SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 37; set 16 SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 38; set 17 SEQ ID NO: 25, SEQ ID NO: 26 and SEQ ID NO: 39; set 18 SEQ ID NO: 27, SEQ ID NO: 28 and SEQ ID NO: 40; set 19 SEQ ID NO: 29, SEQ ID NO: 30 and SEQ ID NO: 41; and set 20 SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 42, wherein the oligonucleotides detect one or more HBV genotypes in a sample.
21. A DNA fragment comprising the genomic signature of HBV genotypes, wherein the nucleotide sequence of the fragment is selected from a group consisting of SEQ ID NO: 55 to SEQ ID NO: 64.
22. A method of detecting one or more HIV genotypes in a sample using the HIV genomic signature, said method comprising a. providing a sample; and b. analyzing the presence of the genomic signature specific for HIV genotype wherein the genomic signature of a specific HIV genotype is the presence of specific single nucleotide variation at one or more specific nucleotide positions in the representative HIV polynucleotide sequence as set forth in SEQ ID NO: 2, whereby the presence of thymine and cytosine at nucleotide positions 1440 and 4244 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-A1 genotype, the presence of cytosine at nucleotide positions 4462 and 7304 in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-A2 genotype, the presence of adenine and guanine at nucleotide positions 2040 and 9001 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-B genotype, the presence of cytosine and thymine at nucleotide positions 2480 and 6395 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-C genotype, the presence of adenine and guanine at nucleotide positions 4158 and 8180 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-D genotype, the presence of cytosine and guanine at nucleotide positions 1679 and 7169 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-F1 genotype, the presence of adenine and cytosine at nucleotide positions 3456 and 7985 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-F2 genotype, the presence of thymine and guanine at nucleotide positions 2993 and 6574 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-G genotype, the presence of adenine and cytosine at nucleotide positions 3973 and 8493 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-H genotype, the presence of adenine and guanine at nucleotide positions 2589 and 6769 respectively in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-J genotype, and the presence of adenine at nucleotide positions 3428 and 9134 in the polynucleotide sequence as set forth in SEQ ID NO: 2 detects HIV-K genotype.
23. The method as claimed in claim 22, wherein the sample is selected from a group consisting of blood, plasma, serum, saliva, sputum, tissue, DNA, RNA and hair, preferably DNA.
24. The method as claimed in claim 22, wherein analyzing the presence of genomic signature specific for HIV genotype is performed using hybridization-based method, enzyme-based method, post-amplification method based on physical properties of DNA or sequencing method.
25. The method as claimed in claim 24, wherein the hybridization-based method is selected from a group consisting of dynamic allele-specific hybridization, molecular beacons and SNP microarrays.
26. The method as claimed in claim 24, wherein the enzyme-based method is selected from a group consisting of restriction fragment length polymorphism, PCR-based method, flap endonuclease, primer extension, 5'-nuclease, or oligonucleotide ligase assay.
27. The method as claimed in claim 24, wherein the post-amplification method is selected from a group consisting of single strand conformation polymorphism, temperature gradient gel electrophoresis, denaturing high performance liquid chromatography, high-resolution melting of the entire amplicon or SNPlex.
28. The method as claimed in claims 22, wherein the step of analyzing the presence of genomic signature specific for HIV genotype further comprises selecting a pair of oligonucleotides to amplify the specific single nucleotide variation, performing a primer extension reaction to generate extended primer and analyzing the extended primer to identify the genomic signature of the HIV genotype.
29. The method as claimed in claim 28, wherein the extended primer is analyzed using mass spectrometry, differential labeling method or differential size fractionation method.
30. The method as claimed in claim 28, wherein the oligonucleotides for detecting HIV-B and HIV-C genotype are selected from a group consisting of set 21 SEQ ID NO: 43 and SEQ ID NO: 44; set 22 SEQ ID NO: 45 and SEQ ID NO: 46; set 23 SEQ ID NO: 47 and SEQ ID NO: 48; set 24 SEQ ID NO: 49 and SEQ ID NO: 50.
31. The method as claimed in claim 12, wherein the primer extension reaction was performed using the oligonucleotide as set forth in SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53 and SEQ ID NO: 54.
32. A genomic signature processing (GSP) system (600) comprising: at least one processor (606); a memory (608) coupled to the processor (606), the memory (608) comprising one or more processor executable instructions; a preliminary signature generator (618) to generate preliminary signatures for genotypes, wherein the preliminary signatures are generated based at least on a set of unique sites identified for the genotypes; and a genomic signature generator (620) to generate genomic signatures for each of the genotypes, wherein the genomic signatures are based on at least one informative site identified for each of the genotypes, and wherein the informative site is identified such that a probability of two or more genotypes having a same nucleotide at a same informative site is minimal.
33. The GSP system (600) as claimed in claim 32, wherein the GSP system (600) is communicatively coupled to a sequential database (610) that stores multiple sequences for each of the genotypes.
34. The GSP system (600) as claimed in claim 33, wherein the genomic signatures are generated based on a comparison of the preliminary signatures of each of the genotypes with the multiple sequences of all the genotypes.
35. The GSP system (600) as claimed in claim 32 comprising a consensus generator (616) to generate consensus sequences for the genotypes, wherein the consensus sequences are generated based on multiple sequences corresponding to each of the genotypes.
36. The GSP system (600) as claimed in claim 32, wherein the preliminary signatures are generated based on the set of unique sites identified for the genotypes.
37. The GSP system (600) as claimed in claim 36, wherein the unique sites are identified based on consensus sequences corresponding to the genotypes.
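The data flow recited in claims 32 to 37 (consensus sequences, then unique sites as preliminary signatures, then informative sites as genomic signatures) can be illustrated with a minimal Python sketch. It is not the implementation described in the specification. The function names, the toy alignments, the 0-based positions, the majority-vote consensus and the ranking rule (fraction of other-genotype sequences sharing the nucleotide) are all assumptions made for illustration only.

```python
from collections import Counter


def consensus(seqs):
    """Majority nucleotide per column of equal-length, pre-aligned sequences.

    Hypothetical stand-in for the consensus generator (616).
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def unique_sites(consensuses):
    """Preliminary signatures: positions where a genotype's consensus
    nucleotide differs from every other genotype's consensus.

    Hypothetical stand-in for the preliminary signature generator (618).
    Positions are 0-based here.
    """
    prelim = {}
    for genotype, cons in consensuses.items():
        others = [c for g, c in consensuses.items() if g != genotype]
        prelim[genotype] = {
            pos: nt
            for pos, nt in enumerate(cons)
            if all(other[pos] != nt for other in others)
        }
    return prelim


def informative_sites(prelim, alignments, top=2):
    """Genomic signatures: keep the unique sites whose nucleotide is least
    often seen at that position in the sequences of the other genotypes.

    Hypothetical stand-in for the genomic signature generator (620); the
    sharing fraction is an assumed proxy for the 'probability of two or
    more genotypes having a same nucleotide at a same informative site'.
    """
    signatures = {}
    for genotype, sites in prelim.items():
        others = [s for g, seqs in alignments.items() if g != genotype for s in seqs]

        def sharing(item):
            pos, nt = item
            return sum(s[pos] == nt for s in others) / len(others)

        ranked = sorted(sites.items(), key=lambda item: (sharing(item), item[0]))
        signatures[genotype] = ranked[:top]
    return signatures


# Toy, made-up alignments (not real HBV or HIV sequences).
alignments = {
    "A": ["ACGTA", "ACGTA", "ACGTT"],
    "B": ["TCGAA", "TCGAA", "TCGTA"],
    "C": ["GCCTA", "GCCTA", "ACCTA"],
}
consensuses = {g: consensus(seqs) for g, seqs in alignments.items()}
preliminary = unique_sites(consensuses)
for genotype, signature in informative_sites(preliminary, alignments).items():
    print(genotype, signature)
```

Each resulting signature is a short list of (position, nucleotide) pairs, which is the same shape as the position and nucleotide pairs recited for the HIV genotypes in claim 22.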
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN1857/DEL/2008 | 2008-08-05 | ||
IN1857DE2008 | 2008-08-05 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2010016071A2 (en) | 2010-02-11 |
WO2010016071A3 (en) | 2010-07-29 |
Family ID=41466823
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IN2009/000442 WO2010016071A2 (en) | 2008-08-05 | 2009-08-05 | Identification of genomic signature for differentiating highly similar sequence variants of an organism |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2010016071A2 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014047286A1 (en) * | 2012-09-19 | 2014-03-27 | The Trustees Of The University Of Pennsylvania | Hepatitis b virus core protein and surface antigen protein and vaccine comprising the same |
WO2014144578A1 (en) * | 2013-03-15 | 2014-09-18 | The Usa, As Represented By The Secretary, Department Of Health And Human Services | Selective detection of hepatitis a, b, c, d, or e viruses or combinations thereof |
US9403879B2 (en) | 2011-02-11 | 2016-08-02 | The Trustees Of The University Of Pennsylvania | Nucleic acid molecule encoding hepatitis B virus core protein and vaccine comprising the same |
CN113215290A (en) * | 2020-02-05 | 2021-08-06 | 厦门大学 | Method and kit for detecting serotype of shigella |
WO2021196357A1 (en) * | 2020-04-02 | 2021-10-07 | 上海之江生物科技股份有限公司 | Method and device for obtaining species-specific consensus sequences of microorganisms and application |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1997040193A2 (en) * | 1996-04-19 | 1997-10-30 | Innogenetics N.V. | Method for typing and detecting hbv |
EP1193313A1 (en) * | 1999-06-15 | 2002-04-03 | Otsuka Pharmaceutical Co., Ltd. | Method for determining hiv-1 subtype |
US20060286566A1 (en) * | 2005-02-03 | 2006-12-21 | Helicos Biosciences Corporation | Detecting apparent mutations in nucleic acid sequences |
2009-08-05: WO application PCT/IN2009/000442 (WO2010016071A2), status: active, Application Filing
Non-Patent Citations (14)
Title |
---|
BAAR DE M P ET AL: "ONE-TUBE REAL-TIME ISOTHERMAL AMPLIFICATION ASSAY TO IDENTIFY AND DISTINGUISH HUMAN IMMUNODEFICIENCY VIRUS TYPE 1 SUBTYPES A,B, AND C AND CIRCULATING RECOMBINANT FORMS AE AND AG" JOURNAL OF CLINICAL MICROBIOLOGY, AMERICAN SOCIETY FOR MICROBIOLOGY, WASHINGTON, DC, US LNKD- DOI:10.1128/JCM.39.5.1895-1902.2001, vol. 39, no. 5, 1 May 2001 (2001-05-01), pages 1895-1902, XP001011572 ISSN: 0095-1137 * |
BUKH J ET AL: "SEQUENCE ANALYSIS OF THE CORE GENE OF 14 HEPATITIS C VIRUS GENOTYPES" PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF USA, NATIONAL ACADEMY OF SCIENCE, WASHINGTON, DC, US, vol. 91, no. 17, 16 August 1994 (1994-08-16), pages 8239-8243, XP002047846 ISSN: 0027-8424 * |
HOELSCHER MICHAEL ET AL: "Detection of HIV-1 subtypes, recombinants, and dual infections in east Africa by a multi-region hybridization assay." AIDS (LONDON, ENGLAND) 18 OCT 2002 LNKD- PUBMED:12370505, vol. 16, no. 15, 18 October 2002 (2002-10-18), pages 2055-2064, XP002581538 ISSN: 0269-9370 * |
HOMMAIS F ET AL: "Single-Nucleotide Polymorphism Phylotyping of Escherichia coli" APPLIED AND ENVIRONMENTAL MICROBIOLOGY, AMERICAN SOCIETY FOR MICROBIOLOGY, US, vol. 71, no. 8, 1 August 2005 (2005-08-01) , pages 4784-4792, XP003017261 ISSN: 0099-2240 * |
HUSSAIN MUNIRA ET AL: "Rapid and sensitive assays for determination of hepatitis B virus (HBV) genotypes and detection of HBV precore and core promoter variants." JOURNAL OF CLINICAL MICROBIOLOGY AUG 2003, vol. 41, no. 8, August 2003 (2003-08), pages 3699-3705, XP002565052 ISSN: 0095-1137 * |
KIJAK ET AL: "Distinguishing molecular forms of HIV-1 in Asia with a high-throughput, fluorescent genotyping assay, MHAbce v.2" VIROLOGY, ACADEMIC PRESS,ORLANDO, US LNKD- DOI:10.1016/J.VIROL.2006.07.055, vol. 358, no. 1, 9 January 2007 (2007-01-09), pages 178-191, XP005762428 ISSN: 0042-6822 * |
KIRSCHBERG OLIVER ET AL: "A multiplex-PCR to identify hepatitis B virus--genotypes A-F." JOURNAL OF CLINICAL VIROLOGY : THE OFFICIAL PUBLICATION OF THE PAN AMERICAN SOCIETY FOR CLINICAL VIROLOGY JAN 2004, vol. 29, no. 1, January 2004 (2004-01), pages 39-43, XP002565051 ISSN: 1386-6532 * |
LIU WEN-CHUN ET AL: "Simultaneous quantification and genotyping of hepatitis B virus for genotypes A to G by real-time PCR and two-step melting curve analysis." JOURNAL OF CLINICAL MICROBIOLOGY DEC 2006, vol. 44, no. 12, December 2006 (2006-12), pages 4491-4497, XP002565053 ISSN: 0095-1137 * |
MOROZOV V ET AL: "Homologous recombination between different genotypes of hepatitis B virus" GENE, ELSEVIER, AMSTERDAM, NL, vol. 260, no. 1-2, 30 December 2000 (2000-12-30), pages 55-65, XP004227524 ISSN: 0378-1119 * |
OKAMOTO H ET AL: "TYPING HEPATITIS B VIRUS BY HOMOLOGY IN NUCLEOTIDE SEQUENCE: COMPARISON OF SURFACE ANTIGEN SUBTYPES" JOURNAL OF GENERAL VIROLOGY, SOCIETY FOR GENERAL MICROBIOLOGY, SPENCERS WOOD, GB, vol. 69, no. 10, 1 October 1988 (1988-10-01), pages 2575-2583, XP000672123 ISSN: 0022-1317 * |
PAI REKHA ET AL: "Sequential multiplex PCR approach for determining capsular serotypes of Streptococcus pneumoniae isolates" JOURNAL OF CLINICAL MICROBIOLOGY JAN 2006, vol. 44, no. 1, 1 January 2006 (2006-01-01), pages 124-131, XP002560719 * |
URWIN R ET AL: "Multi-locus sequence typing: a tool for global epidemiology" TRENDS IN MICROBIOLOGY, ELSEVIER SCIENCE LTD., KIDLINGTON, GB, vol. 11, no. 10, 1 January 2003 (2003-01-01), pages 479-487, XP003017264 ISSN: 0966-842X * |
WEI M ET AL: "Simple subtyping assay for human immunodeficiency virus type 1 subtypes B, C, CRF01-AE, CRF07-BC, and CRF08-BC" JOURNAL OF CLINICAL MICROBIOLOGY, AMERICAN SOCIETY FOR MICROBIOLOGY, WASHINGTON, DC, US LNKD- DOI:10.1128/JCM.42.9.4261-4267.2004, vol. 42, no. 9, 1 September 2004 (2004-09-01), pages 4261-4267, XP002998793 ISSN: 0095-1137 * |
YEH S-H ET AL: "Quantification and genotyping of hepatitis B virus in a single reaction by real-time PCR and melting curve analysis" JOURNAL OF HEPATOLOGY, MUNKSGAARD INTERNATIONAL PUBLISHERS, COPENHAGEN, DK, vol. 41, no. 4, 1 October 2004 (2004-10-01), pages 659-666, XP004586587 ISSN: 0168-8278 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9238679B2 (en) | 2011-02-11 | 2016-01-19 | The Trustees Of The University Of Pennslyvania | Nucleic acid molecule encoding hepatitis B virus core protein and surface antigen protein and vaccine comprising the same |
US9403879B2 (en) | 2011-02-11 | 2016-08-02 | The Trustees Of The University Of Pennsylvania | Nucleic acid molecule encoding hepatitis B virus core protein and vaccine comprising the same |
US9675690B2 (en) | 2011-02-11 | 2017-06-13 | The Trustees Of The University Of Pennsylvania | Nucleic acid molecule encoding hepatitis B virus core protein and surface antigen protein and vaccine comprising the same |
US10195268B2 (en) | 2011-02-11 | 2019-02-05 | The Trustees Of The University Of Pennsylvania | Nucleic acid molecule encoding hepatitis B virus core protein and vaccine comprising the same |
US10695421B2 (en) | 2011-02-11 | 2020-06-30 | The Trustees Of The University Of Pennsylvania | Nucleic acid molecule encoding hepatitis B virus core protein and vaccine comprising the same |
WO2014047286A1 (en) * | 2012-09-19 | 2014-03-27 | The Trustees Of The University Of Pennsylvania | Hepatitis b virus core protein and surface antigen protein and vaccine comprising the same |
EA036030B1 (en) * | 2012-09-19 | 2020-09-16 | Дзе Трастиз Оф Дзе Юниверсити Оф Пенсильвания | Immunogenic vaccine against hbv in a human |
WO2014144578A1 (en) * | 2013-03-15 | 2014-09-18 | The Usa, As Represented By The Secretary, Department Of Health And Human Services | Selective detection of hepatitis a, b, c, d, or e viruses or combinations thereof |
CN113215290A (en) * | 2020-02-05 | 2021-08-06 | 厦门大学 | Method and kit for detecting serotype of shigella |
CN113215290B (en) * | 2020-02-05 | 2022-11-11 | 厦门大学 | Method and kit for detecting serotype of shigella |
WO2021196357A1 (en) * | 2020-04-02 | 2021-10-07 | 上海之江生物科技股份有限公司 | Method and device for obtaining species-specific consensus sequences of microorganisms and application |
EP4116982A4 (en) * | 2020-04-02 | 2023-12-20 | Shanghai Zj Bio-tech Co., Ltd. | Method and device for obtaining species-specific consensus sequences of microorganisms and application |
Also Published As
Publication number | Publication date |
---|---|
WO2010016071A3 (en) | 2010-07-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cobo | Application of maldi-tof mass spectrometry in clinical virology: a review | |
US8980555B2 (en) | Rapid genotyping analysis and devices thereof | |
Wang et al. | Identifying influenza viruses with resequencing microarrays | |
EP1957521B1 (en) | Compositions for use in identification of influenza viruses | |
JP2005504508A5 (en) | ||
AU2016324473B2 (en) | Virome capture sequencing platform, methods of designing and constructing and methods of using | |
WO2008042450A2 (en) | Multiplex detection of respiratory pathogens | |
WO2010016071A2 (en) | Identification of genomic signature for differentiating highly similar sequence variants of an organism | |
CN107735500A (en) | For detecting the grand genome composition and method of breast cancer | |
Li et al. | Rapid identification and metagenomics analysis of the adenovirus type 55 outbreak in Hubei using real-time and high-throughput sequencing platforms | |
Rybicka et al. | Current molecular methods for the detection of hepatitis B virus quasispecies | |
Neverov et al. | Genotyping of measles virus in clinical specimens on the basis of oligonucleotide microarray hybridization patterns | |
CN106414775A (en) | Compositions and methods for metagenome biomarker detection | |
Alavian et al. | A comprehensive overview of hepatitis virus genotyping methods | |
WO2013093992A1 (en) | Oligonucleotide set for detecting hepatitis b virus group and evaluating gene diversity, and method using same | |
Lee | Evidence-Based Evaluation of PCR Diagnostics for SARS-Cov-2 and the Omicron Variants by Sanger Sequencing | |
US20120094274A1 (en) | Identification of swine-origin influenza a (h1n1) virus | |
Ding et al. | Detection of hepatitis B virus genotypes A to D by the fluorescence polarization assay based on asymmetric PCR | |
Nunley et al. | Clinical performance evaluation of a tiling amplicon panel for whole genome sequencing of respiratory syncytial virus | |
Yang | MultiPrime: A Reliable and Efficient Tool for Targeted Next-Generation Sequencing | |
WO2021242819A1 (en) | Compositions and methods of detecting respiratory viruses including coronaviruses | |
EP1576185A1 (en) | Method for detecting mutations | |
Fischer et al. | Diagnosis Using Microarrays | |
JP2009509499A (en) | Multiple polymerase chain reaction for gene sequence analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 09787614; Country of ref document: EP; Kind code of ref document: A2 |
| NENP | Non-entry into the national phase in: | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 09787614; Country of ref document: EP; Kind code of ref document: A2 |