CN108138237B - Assessment of NFkB cell signaling pathway activity using mathematical modeling of target gene expression
- Publication number
- CN108138237B (application number CN201680056337.1A)
- Authority
- CN
- China
- Prior art keywords
- signaling pathway
- nfkb
- cell signaling
- target genes
- activity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/33—Heterocyclic compounds
- A61K31/335—Heterocyclic compounds having oxygen as the only ring hetero atom, e.g. fungichromin
- A61K31/336—Heterocyclic compounds having oxygen as the only ring hetero atom, e.g. fungichromin having three-membered rings, e.g. oxirane, fumagillin
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/33—Heterocyclic compounds
- A61K31/395—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins
- A61K31/41—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins having five-membered rings with two or more ring hetero atoms, at least one of which being nitrogen, e.g. tetrazole
- A61K31/415—1,2-Diazoles
- A61K31/416—1,2-Diazoles condensed with carbocyclic ring systems, e.g. indazole
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/33—Heterocyclic compounds
- A61K31/395—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins
- A61K31/495—Heterocyclic compounds having nitrogen as a ring hetero atom, e.g. guanethidine or rifamycins having six-membered rings with two or more nitrogen atoms as the only ring heteroatoms, e.g. piperazine or tetrazines
- A61K31/4985—Pyrazines or piperazines ortho- or peri-condensed with heterocyclic ring systems
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/56—Compounds containing cyclopenta[a]hydrophenanthrene ring systems; Derivatives thereof, e.g. steroids
- A61K31/57—Compounds containing cyclopenta[a]hydrophenanthrene ring systems; Derivatives thereof, e.g. steroids substituted in position 17 beta by a chain of two carbon atoms, e.g. pregnane or progesterone
- A61K31/573—Compounds containing cyclopenta[a]hydrophenanthrene ring systems; Derivatives thereof, e.g. steroids substituted in position 17 beta by a chain of two carbon atoms, e.g. pregnane or progesterone substituted in position 21, e.g. cortisone, dexamethasone, prednisone or aldosterone
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/04—Peptides having up to 20 amino acids in a fully defined sequence; Derivatives thereof
- A61K38/05—Dipeptides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
- G01N33/574—Immunoassay; Biospecific binding assay; Materials therefor for cancer
- G01N33/57484—Immunoassay; Biospecific binding assay; Materials therefor for cancer involving compounds serving as markers for tumor, cancer, neoplasia, e.g. cellular determinants, receptors, heat shock/stress proteins, A-protein, oligosaccharides, metabolites
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
- G16B5/20—Probabilistic models
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N2800/00—Detection or diagnosis of diseases
- G01N2800/60—Complex ways of combining multiple protein biomarkers for diagnosis
Abstract
The present invention relates to a method comprising inferring activity of an NFkB cellular signaling pathway based at least on measured expression levels of six or more target genes of the NFkB cellular signaling pathway in a sample. The invention also relates to an apparatus comprising a digital processor configured to perform such a method, a non-transitory storage medium storing instructions executable by a digital processing device to perform such a method, and a computer program comprising program code means to cause a digital processing device to perform such a method. The invention also relates to a kit for measuring expression levels.
Description
Technical Field
The present invention relates generally to the fields of bioinformatics, genomic processing, proteomic processing, and related fields. More specifically, the present invention relates to a method comprising inferring activity of an NFkB cellular signaling pathway based at least on measured expression levels of six or more target genes of the NFkB cellular signaling pathway in a sample. The invention also relates to an apparatus comprising a digital processor configured to perform such a method, a non-transitory storage medium storing instructions executable by a digital processing device to perform such a method, and a computer program comprising program code means to cause a digital processing device to perform such a method. The invention further relates to a kit for measuring expression levels.
Background
Genomic and proteomic analyses have substantial realized and potential promise for clinical application in medical fields such as oncology, where a variety of cancers are known to be associated with specific combinations of genomic mutations/variations and/or high or low expression levels of specific genes that play a role in the growth and evolution of cancer, for example in cell proliferation and metastasis.
Nuclear factor-kappaB (NFkB or NF-κB) is an inducible transcription factor that regulates the expression of many genes involved in the immune response. The NFkB cell signaling pathway is a key cell signaling pathway involved in immune, inflammatory and acute phase responses, and it is also involved in the control of cell survival, proliferation and apoptosis. In healthy, non-activated cells, the NFkB cell signaling pathway-associated transcription factors, which consist of dimers derived from five genes (NFKB1 or p50/p105, NFKB2 or p52/p100, RELA or p65, REL and RELB), are predominantly cytoplasmic because of their interaction with inhibitors of NFkB (IkB); they therefore remain transcriptionally inactive, which keeps the NFkB cell signaling pathway inactive. Upon activation of the upstream signaling cascade, IkB becomes phosphorylated and undergoes ubiquitin-dependent degradation by the proteasome, and the NFkB dimer translocates into the nucleus, where it acts as a transcription factor. In addition to this classical NFkB cell signaling pathway, there is an alternative pathway that can initiate NFkB-regulated transcription (see fig. 1; CP = classical pathway; AP = alternative pathway; PS = proteasome; NC = nucleus). The term "NFkB cell signaling pathway" herein preferably refers to any signaling process leading to the transcriptional activity of the above-mentioned NFkB transcription factors.
Although the activity of the NFkB cell signaling pathway is known to be associated with different types of cancer and to support the pro-survival phenotype of cancer cells, there is no clinical assay available to assess NFkB cell signaling pathway activity. It is therefore desirable to improve the possibilities of identifying patients who have a cancer, such as diffuse large B-cell lymphoma (DLBCL), multiple myeloma, another cancer of hematological origin, or a solid tumor such as a breast, melanoma or prostate tumor, that is at least partially driven by a deregulated NFkB cell signaling pathway and is thus likely to respond to inhibitors of the NFkB cell signaling pathway.
Xing, F. Zhou and J. Wang, "Subset of genes targeted by transcription factor NF-κB in TNFα-stimulated human HeLa cells", Functional & Integrative Genomics, Vol. 13, No. 1, pages 143 to 154 (2013) discloses a study that claims to identify direct target genes (DTGs) of NF-κB in TNFα-stimulated HeLa cells by using ChIP-Seq, RNAi and gene expression profiling techniques.
Feuerhake et al., "NFκB activity, function, and target-gene signatures in primary mediastinal large B-cell lymphoma and diffuse large B-cell lymphoma", Blood, Vol. 106, No. 4, pages 1392 to 1399 (2005) discloses that primary mediastinal large B-cell lymphoma (MLBCL) shares clinical and molecular features with classical Hodgkin lymphoma, including nuclear localization of the nuclear factor κB (NFκB) subunit c-REL (reticuloendotheliosis virus oncogene homolog) in a pilot series. The authors analyzed c-REL subcellular localization in additional primary MLBCLs and characterized NFκB activity and function in MLBCL cell lines.
US 2004/0180341A1 relates to the transcriptional regulation of a family of proteins defined by the presence of the RKIP motif. Proteins comprising the RKIP motif regulate kinases involved in signal transduction pathways. Transcriptional regulation of RKIP motif-containing proteins underlies screening assays for identifying substances useful in the modulation of signal transduction pathways subject to RKIP family mediated regulation, as well as the diagnosis and treatment of diseases involving inappropriate activity of pathways subject to RKIP family mediated regulation.
Summary of the Invention
The present invention provides new and improved methods and apparatus as disclosed herein.
According to one main aspect of the present invention, the above problem is solved by a method for inferring the activity of the NFkB cellular signaling pathway using mathematical modeling of target gene expression, said method comprising:
inferring activity of the NFkB cell signaling pathway based at least on expression levels of six or more, e.g., seven, eight, nine, ten, eleven, or twelve or more target genes of the NFkB cell signaling pathway measured in the sample, wherein the inferring comprises:
determining a level of an NFkB Transcription Factor (TF) element in the sample, the NFkB TF element controlling transcription of six or more target genes of the NFkB cellular signaling pathway, the determining based at least in part on evaluating a mathematical model that correlates expression levels of the six or more target genes of the NFkB cellular signaling pathway with the level of the NFkB TF element, wherein the six or more target genes are selected from the group consisting of: BCL2L1, BIRC3, CCL2, CCL3, CCL4, CCL5, CCL20, CCL22, CX3CL1, CXCL1, CXCL2, CXCL3, ICAM1, IL1B, IL6, IL8, IRF1, MMP9, NFKB2, NFKBIA, NFKBIE, PTGS2, SELE, STAT5A, TNF, TNFAIP2, TNIP1, TRAF1, and VCAM1;
inferring activity of the NFkB cellular signaling pathway based on the determined level of the NFkB TF element in the sample,
wherein the inferring is performed by a digital processing device using the mathematical model.
In this context, "level" of a TF element refers to the level of activity of the TF element with respect to transcription of its target gene.
The present invention is based on the inventors' recognition that a suitable way of identifying effects occurring in the NFkB cellular signaling pathway can be based on measuring the signaling output of the pathway, which is, in particular, the transcription of target genes by an NFkB transcription factor (TF) element controlled by the NFkB cellular signaling pathway. This recognition assumes that the TF level in the sample is in a quasi-steady state, which can be detected, in particular, by means of the expression values of the target genes. The NFkB cell signaling pathway targeted herein is known to be associated with different types of cancer and to support the pro-survival phenotype of cancer cells, as it is known to regulate genes that control cell proliferation and cell survival processes. Many different types of human tumors have been found to have deregulated NFkB cell signaling pathway activity. In addition, studies have shown that inhibiting constitutive NFkB cell signaling pathway activity can block the oncogenic potential of cancer cells.
The invention makes it possible to determine the activity of the NFkB cellular signaling pathway by: (i) determining a level of an NFkB TF element in a sample, wherein the determining is based at least in part on evaluating a mathematical model that relates the expression levels of six or more target genes of the NFkB cellular signaling pathway, the transcription of which is controlled by the NFkB TF element, to the level of the NFkB TF element, and (ii) inferring the activity of the NFkB cellular signaling pathway based on the determined level of the NFkB TF element in the sample. This preferably improves the possibilities of identifying patients who have a cancer, such as diffuse large B-cell lymphoma (DLBCL), multiple myeloma, another cancer of hematological origin, or a solid tumor such as a breast, melanoma or prostate tumor, that is at least partially driven by a deregulated NFkB cell signaling pathway and is thus likely to respond to inhibitors of the NFkB cell signaling pathway. An important advantage of the present invention is that it allows the activity of the NFkB cellular signaling pathway to be determined using a single sample, without the need for multiple samples taken at different time points.
Herein, an NFkB Transcription Factor (TF) element is defined as a protein complex containing at least one, or preferably a dimer, of the NFkB members (NFKB1 or p50/p105, NFKB2 or p52/p100, RELA or p65, REL and RELB), which is capable of binding to specific DNA sequences and thereby controlling transcription of target genes.
The mathematical model may be a probabilistic model, preferably a Bayesian network model, based at least in part on conditional probabilities relating the NFkB TF element to the expression levels of six or more target genes of the NFkB cellular signaling pathway measured in the sample, or the mathematical model may be based at least in part on one or more linear combinations of the expression levels of six or more target genes of the NFkB cellular signaling pathway measured in the sample. In particular, the inferring of the activity of the NFkB cell signaling pathway may be performed as described in published international patent application WO2013/011479A2 ("Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression") or as described in published international patent application WO2014/102668A2 ("Assessment of cellular signaling pathway activity using linear combination(s) of target gene expressions"), which are hereby incorporated by reference in their entirety. Further details regarding the use of mathematical models of target gene expression to infer cell signaling pathway activity can be found in Verhaegh W. et al., "Selection of personalized patient therapy through the use of knowledge-based computational models that identify tumor-driving signal transduction pathways", Cancer Research, Vol. 74, No. 11, 2014, pages 2936-.
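The probabilistic variant can be illustrated with a deliberately reduced sketch. It assumes a two-layer network in which a binary TF node (active or inactive) is the only parent of each target gene node, and in which each gene has already been called "high" or "low" by some upstream thresholding step. The function name, the conditional probabilities and the prior are hypothetical illustration values, not parameters taken from the cited applications, whose networks are more elaborate.

```python
from math import prod

# Hypothetical conditional probabilities: P(gene called "high" | TF state).
# Illustrative assumptions only; a calibrated model would learn or curate these.
P_HIGH_GIVEN_ACTIVE = 0.8
P_HIGH_GIVEN_INACTIVE = 0.2


def posterior_tf_active(observed_high, prior_active=0.5):
    """Return P(TF element active | thresholded expression calls).

    observed_high maps a gene symbol to True if its expression was called
    "high". Genes are treated as conditionally independent given the TF state.
    """
    def likelihood(p_high):
        # Probability of the observed calls under one TF state.
        return prod(p_high if high else 1.0 - p_high
                    for high in observed_high.values())

    joint_active = prior_active * likelihood(P_HIGH_GIVEN_ACTIVE)
    joint_inactive = (1.0 - prior_active) * likelihood(P_HIGH_GIVEN_INACTIVE)
    # Bayes' rule: normalise over the two possible TF states.
    return joint_active / (joint_active + joint_inactive)


# Six target genes from the preferred list; five called high, one low.
calls = {"CXCL2": True, "ICAM1": True, "IL6": True,
         "CCL5": True, "NFKBIA": False, "IL8": True}
p_active = posterior_tf_active(calls)
```

With these assumed numbers, each "high" call multiplies the odds of an active TF element by four and each "low" call divides them by four, so a single discordant gene does not overturn the inference.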
The use of probabilistic models, such as bayesian network models, allows the integration of existing information about the NFkB cellular signaling pathway using conditional probability relationships. This also includes the integration of information based on partial (rather than comprehensive) knowledge of the NFkB cell signaling pathway and biological measurements typically measured with some uncertainty. In contrast, using a mathematical model based at least in part on one or more linear combinations of expression levels may provide a very simple and easily calculated way to infer the activity of the NFkB cellular signaling pathway.
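The linear-combination alternative can be sketched in a few lines. The example assumes that expression levels have already been normalised to a common scale and that one weight per target gene is available; the weights, the expression values, the zero threshold and the function names are all hypothetical and serve only to show the shape of the calculation.

```python
def linear_tf_score(expression, weights):
    """Weighted sum of expression levels over the genes listed in `weights`."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(weights[gene] * expression[gene] for gene in weights)


def infer_activity(score, threshold=0.0):
    """Map the TF score to a pathway call using an assumed threshold."""
    return "active" if score > threshold else "passive"


# Hypothetical normalised expression values for six target genes.
expression = {"CXCL2": 1.5, "ICAM1": 0.5, "IL6": 2.0,
              "CCL5": -0.5, "NFKBIA": 1.0, "IL8": 0.5}
# Equal weights are the simplest possible choice and are an assumption here.
weights = {gene: 1.0 for gene in expression}

score = linear_tf_score(expression, weights)
call = infer_activity(score)
```

The trade-off described above is visible here: the calculation is a single weighted sum, but it carries no explicit representation of measurement uncertainty.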
If the sample is a sample of a subject, the inferred activity of the NFkB cellular signaling pathway may be associated with the subject, in particular with a tissue and/or cells and/or a body fluid of the subject, wherein the term "subject" as used herein refers to any living organism. In some embodiments, the subject is an animal, preferably a mammal. In certain embodiments, the subject is a human, preferably a medical subject. If, on the other hand, the sample is taken from, for example, a cell line, a primary cell culture or a tissue culture, the inferred activity is representative of the subject, including any possible treatment effect, only when the original sample from which the cell line, primary cell culture or tissue culture was derived was extracted from the subject and the cell line, primary cell culture or tissue culture was subjected to one or more treatments (e.g., with a drug, or a chemical or physical treatment). Furthermore, a "target gene" can be a "direct target gene" and/or an "indirect target gene" (as described herein).
Particularly suitable target genes are described in the text paragraphs below and in the examples below (see, e.g., table 1).
Thus, according to a preferred embodiment, the target gene is selected from the group consisting of the target genes listed in table 1.
Particularly preferred is a method wherein the six or more target genes are selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8, TNFAIP2, CXCL3, MMP9, NFKB2, CCL20, CCL2, CXCL1, TNF, IRF1, TRAF1, NFKBIE, VCAM1 and BIRC3, preferably wherein the six or more target genes comprise three or more, e.g. four, five, six, seven, eight or nine or more target genes selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8, TNFAIP2, CXCL3, MMP9, NFKB2, CCL20, CCL2 and CXCL1, more preferably wherein the three or more target genes are selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8 and TNFAIP2.
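The nested gene groups in the preceding paragraph lend themselves to a simple panel check. The sketch below is only one possible reading of the selection rule: it counts how many genes of a candidate panel fall into the 19-gene group, into its first 13 entries and into its first 7 entries. The function name and the returned keys are hypothetical.

```python
# Gene symbols in the order given in the text; the 13-gene and 7-gene
# groups are the leading entries of the 19-gene group.
PREFERRED_GENES = (
    "CXCL2", "ICAM1", "IL6", "CCL5", "NFKBIA", "IL8", "TNFAIP2",
    "CXCL3", "MMP9", "NFKB2", "CCL20", "CCL2", "CXCL1",
    "TNF", "IRF1", "TRAF1", "NFKBIE", "VCAM1", "BIRC3",
)
SHORTLIST = frozenset(PREFERRED_GENES[:13])
CORE = frozenset(PREFERRED_GENES[:7])


def check_panel(genes):
    """Report which of the nested preference levels a gene panel satisfies."""
    panel = set(genes)
    return {
        "six_or_more_preferred": len(panel & set(PREFERRED_GENES)) >= 6,
        "three_or_more_shortlist": len(panel & SHORTLIST) >= 3,
        "three_or_more_core": len(panel & CORE) >= 3,
    }


result = check_panel(["CXCL2", "ICAM1", "IL6", "TNF", "IRF1", "TRAF1"])
```

A panel drawn only from the last six entries of the 19-gene group would satisfy the first condition but neither of the narrower ones.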
These target genes were selected based on a novel methodology of extensive literature review by the inventors, indicating that the selected target genes provide a good basis for inferring the activity of the NFkB cell signaling pathway, as detailed in example 2. Advantageously, since multiple target genes are used to infer the activity of the NFkB cellular signaling pathway, a more robust inference of the activity of the NFkB cellular signaling pathway can be achieved.
Another aspect of the invention relates to a method (as described herein), further comprising:
determining whether the NFkB cellular signaling pathway is operating abnormally based on the inferred activity of the NFkB cellular signaling pathway.
By determining whether an NFkB cell signaling pathway functions abnormally, for example, in patients with cancer that is driven at least in part by a deregulated NFkB cell signaling pathway, these patients can be identified in an efficient and reliable manner. In another variation, the methods (as described herein) can be used to identify cell lines derived, for example, from a cancer patient sample, with or without the use of a particular stimulating agent, such as TNF α or LPS.
The present invention also relates to a method (as described herein) further comprising:
recommending prescribing a drug that corrects the abnormal operation of the NFkB cellular signaling pathway,
wherein the recommendation is made only if the NFkB cellular signaling pathway is determined to be abnormally operational based on the inferred activity of the NFkB cellular signaling pathway.
By doing so, it can be better ensured that e.g. drugs relying on inhibitors of the NFkB cell signalling pathway are recommended only to patients with cancers that are likely to respond to these inhibitors. In this way, unnecessary ingestion of these drugs by patients who may not benefit from them can be avoided, while the chances of therapeutic response are optimized.
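The gating logic of the two preceding aspects can be sketched as follows. The example assumes that the inferred activity is available as a single number, such as a posterior probability, and that "abnormal" simply means exceeding a cut-off. The cut-off of 0.5, the class name and the field names are hypothetical; the text does not specify how abnormal operation is decided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayCall:
    activity: float            # inferred pathway activity, e.g. a probability
    abnormal: bool             # whether the pathway is judged to operate abnormally
    recommend_inhibitor: bool  # set only when the pathway is abnormal


def assess(activity, abnormal_above=0.5):
    """Flag abnormal operation and gate the drug recommendation on it."""
    abnormal = activity > abnormal_above
    # The recommendation is made only if abnormal operation was determined.
    return PathwayCall(activity, abnormal, recommend_inhibitor=abnormal)


high = assess(0.9)
low = assess(0.1)
```

Keeping the recommendation as a direct function of the abnormality flag makes it impossible for the sketch to recommend a drug when the pathway was not judged abnormal.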
The invention also relates to a method (as described herein), wherein said inferring comprises:
inferring activity of the NFkB cellular signaling pathway based at least on expression levels of six or more target genes in a set of target genes of the NFkB cellular signaling pathway measured in the sample.
Advantageously, since multiple target genes are used to infer the activity of the NFkB cellular signaling pathway, a more robust inference of the activity of the NFkB cellular signaling pathway can be achieved.
The sample used according to the present invention may be an extracted sample, i.e. a sample that has been extracted from a subject. Examples of samples include, but are not limited to, tissue, biopsy, cells, blood, and/or bodily fluids of a subject. It may be, for example, obtained from a cancer lesion or suspected cancer lesion, or from a metastatic tumor, or from a body cavity in which the liquid present is contaminated with cancer cells (e.g. the pleural or abdominal or bladder cavity), or from other body fluids containing cancer cells, etc., preferably obtained by biopsy procedures or other sample extraction procedures. The cells extracted from the sample may also be tumor cells from a hematologic malignancy (e.g., leukemia or lymphoma). In some cases, the cell sample may also be circulating tumor cells, i.e., tumor cells that have entered the bloodstream, and may be extracted using a suitable separation technique, such as apheresis or conventional venous blood collection. In addition to blood, the body fluid from which the sample is extracted may be urine, gastrointestinal contents or exudate. The term "sample" as used herein also includes situations where e.g. tissue and/or cells and/or body fluids of a subject have been taken from the subject and e.g. have been placed on a microscope slide and a portion of the sample is extracted in order to perform the method, e.g. by Laser Capture Microdissection (LCM) or scraping of cells of interest from the slide, or by fluorescence activated cell sorting techniques. In addition, the term "sample" as used herein also includes situations where, for example, tissue and/or cells and/or bodily fluids of a subject have been removed from the subject and placed on a microscope slide and the method is performed on the slide.
According to another aspect of the invention, an apparatus includes a digital processor configured to perform the inventive method as described herein.
According to another aspect of the invention, a non-transitory storage medium stores instructions executable by a digital processing apparatus to perform the inventive methods as described herein. The non-transitory storage medium may be a computer-readable storage medium, such as a hard disk drive or other magnetic storage medium, an optical disk or other optical storage medium, Random Access Memory (RAM), Read Only Memory (ROM), flash memory or other electronic storage medium, a network server, or the like. The digital processing device may be a handheld device (e.g., a personal data assistant or smart phone), a laptop, a desktop computer, a tablet computer or device, a remote network server, or the like.
According to another aspect of the invention, a computer program comprises program code means for causing a digital processing device to carry out the inventive methods as described herein, when the computer program is run on the digital processing device. The digital processing device may be a handheld device (e.g., a personal data assistant or smart phone), a laptop computer, a desktop computer, a tablet computer or device, a remote network server, or the like.
According to another aspect of the invention, a signal represents an inferred activity of the NFkB cellular signaling pathway, wherein the inferred activity results from performing the method of the invention as described herein. The signal may be a digital signal or an analog signal.
According to another aspect of the invention, a kit for measuring the expression level of six or more, e.g. seven, eight, nine, ten, eleven or twelve or more target genes of an NFkB cell signalling pathway in a sample comprises:
polymerase chain reaction primers directed against six or more NFkB target genes,
a probe for six or more NFkB target genes, and
optionally, an apparatus comprising a digital processor configured to perform the inventive methods as described herein, a non-transitory storage medium storing instructions executable by a digital processing apparatus to perform the inventive methods as described herein, or a computer program comprising program code means for causing a digital processing apparatus to perform the inventive methods as described herein when said computer program is run on said digital processing apparatus,
wherein the six or more NFkB target genes are selected from the group consisting of: BCL2L1, BIRC3, CCL2, CCL3, CCL4, CCL5, CCL20, CCL22, CX3CL1, CXCL1, CXCL2, CXCL3, ICAM1, IL1B, IL6, IL8, IRF1, MMP9, NFKB2, NFKBIA, NFKBIE, PTGS2, SELE, STAT5A, TNF, TNFAIP2, TNIP1, TRAF1, and VCAM1.
Preferably,
the six or more target genes are selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8, TNFAIP2, CXCL3, MMP9, NFKB2, CCL20, CCL2, CXCL1, TNF, IRF1, TRAF1, NFKBIE, VCAM1 and BIRC3, more preferably the six or more target genes comprise three or more, e.g. four, five, six, seven, eight or nine or more target genes selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8, TNFAIP2, CXCL3, MMP9, NFKB2, CCL20, CCL2 and CXCL1, most preferably the three or more target genes are selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8 and TNFAIP2.
Preferably,
the probe comprises at least one of SEQ ID NOs. 46, 49, 52, 55 and 58 and/or the primer comprises at least one of SEQ ID NOs. 44 and 45, 47 and 48, 50 and 51, 53 and 54 and 56 and 57.
According to another aspect of the invention, a kit for measuring the expression level of six or more, e.g. seven, eight, nine, ten, eleven or twelve or more target genes of an NFkB cell signalling pathway in a sample comprises:
one or more components for determining the expression level of six or more target genes of the NFkB cellular signaling pathway, and
an apparatus comprising a digital processor configured to perform the inventive methods as described herein, a non-transitory storage medium storing instructions executable by a digital processing apparatus to perform the inventive methods as described herein, or a computer program comprising program code means for causing a digital processing apparatus to perform the inventive methods as described herein when the computer program is run on the digital processing apparatus,
wherein the one or more components are preferably selected from the group consisting of a microarray chip (e.g., a DNA array chip, an oligonucleotide array chip, a protein array chip), an antibody, a plurality of probes, RNA sequencing, and a set of primers,
wherein the six or more target genes of the NFkB cellular signaling pathway are selected from the group consisting of: BCL2L1, BIRC3, CCL2, CCL3, CCL4, CCL5, CCL20, CCL22, CX3CL1, CXCL1, CXCL2, CXCL3, ICAM1, IL1B, IL6, IL8, IRF1, MMP9, NFKB2, NFKBIA, NFKBIE, PTGS2, SELE, STAT5A, TNF, TNFAIP2, TNIP1, TRAF1, and VCAM1.
Preferably, the six or more target genes are selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8, TNFAIP2, CXCL3, MMP9, NFKB2, CCL20, CCL2, CXCL1, TNF, IRF1, TRAF1, NFKBIE, VCAM1 and BIRC3. More preferably, the six or more target genes comprise three or more, e.g. four, five, six, seven, eight or nine or more, target genes selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8, TNFAIP2, CXCL3, MMP9, NFKB2, CCL20, CCL2 and CXCL1. Most preferably, the three or more target genes are selected from the group consisting of: CXCL2, ICAM1, IL6, CCL5, NFKBIA, IL8 and TNFAIP2.
By means of such a kit, the expression levels of six or more target genes of the NFkB cellular signaling pathway can be determined in a simple and convenient manner. Advantageously, because multiple target genes are used to infer the activity of the NFkB cellular signaling pathway, the inference is more robust than one based on a single target gene.
According to another aspect of the invention, a kit of the invention as described herein is used to perform the method of the invention described herein.
The invention described herein can also be used advantageously, for example, in connection with the following activities:
diagnosis based on the inferred activity of the NFkB cellular signaling pathway;
prognosis based on the inferred activity of the NFkB cellular signaling pathway;
drug prescription based on the inferred activity of the NFkB cellular signaling pathway;
prediction of drug efficacy based on the inferred activity of the NFkB cellular signaling pathway;
prediction of adverse effects based on the inferred activity of the NFkB cellular signaling pathway;
monitoring of drug efficacy;
drug development;
assay development;
pathway research;
cancer staging;
enrollment in a clinical trial based on the inferred activity of the NFkB cellular signaling pathway;
selection of a subsequent test to be performed; and
selection of companion diagnostic tests.
Other advantages will become apparent to those of ordinary skill in the art upon reading and understanding the drawings, the following description and particularly the detailed description provided below.
It shall be understood that the method of claim 1, the device of claim 7, the non-transitory storage medium of claim 8, the computer program of claim 9, the kit of claims 10-14 and the use of the kit of claim 15 have similar and/or identical preferred embodiments, in particular as defined in the dependent claims.
It is to be understood that preferred embodiments of the invention can also be any combination of the dependent claims or the above embodiments with the respective independent claims.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter.
Brief Description of Drawings
Fig. 1 shows schematically and exemplarily the classical and the alternative pathway of NFkB activation (see Dolcet X. et al., "NF-kB in development and progression of human cancer", Virchows Archiv, Vol. 446, No. 5, 2005, pages 475-482).
Fig. 2 schematically and exemplarily shows a mathematical model, in this case a bayesian network model, for modeling transcription programs of the NFkB cell signaling pathway.
Figs. 3 to 6 show training results of the exemplary Bayesian network model based on the evidence support list (evidence curated list) of target genes of the NFkB cellular signaling pathway, the 19 target gene candidate list, the 13 target gene candidate list and the 7 target gene candidate list, respectively (see Tables 1 to 4).
Fig. 7 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes (see Table 1) for diffuse large B-cell lymphoma (DLBCL) samples from GSE34171.
Figs. 8 to 11 show NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes, the 19 target gene candidate list, the 13 target gene candidate list and the 7 target gene candidate list, respectively (see Tables 1 to 4), for normal human bronchial epithelial (NHBE) cell line samples from E-MTAB-1312 that were stimulated with different TNFα concentrations for different stimulation times.
Fig. 12 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes (see Table 1) for THP-1 monocytes.
Figs. 13 to 16 show NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes, the 19 target gene candidate list, the 13 target gene candidate list and the 7 target gene candidate list, respectively (see Tables 1 to 4), for colon samples from GSE4183.
Fig. 17 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes (see Table 1) for breast cancer cell lines from GSE10890.
Fig. 18 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes (see Table 1) for ovarian samples from GSE20565.
Fig. 19 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the evidence support list of target genes (see Table 1) for breast cancer samples from E-MTAB-1006.
Fig. 20 shows the prognosis of glioma patients (GSE16011), depicted in a Kaplan-Meier plot, based on the trained exemplary Bayesian network model using the evidence support list of target genes (see Table 1).
Fig. 21 shows training results of the exemplary linear model based on the evidence support list of target genes (see Table 1).
Fig. 22 shows NFkB cellular signaling pathway activity predictions of the trained exemplary linear model using the evidence support list of target genes (see Table 1) for normal human bronchial epithelial (NHBE) cell line samples from E-MTAB-1312 that were stimulated with different TNFα concentrations for different stimulation times.
Fig. 23 shows NFkB cellular signaling pathway activity predictions of the trained exemplary linear model using the evidence support list of target genes (see Table 1) for colon samples from GSE4183.
Fig. 24 shows training results of the exemplary Bayesian network model based on the broad literature list of putative target genes of the NFkB cellular signaling pathway (see Table 5).
Fig. 25 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the broad literature list of putative target genes of the NFkB cellular signaling pathway (see Table 5) for normal human bronchial epithelial (NHBE) cell line samples from E-MTAB-1312 that were stimulated with different TNFα concentrations for different stimulation times.
Fig. 26 shows NFkB cellular signaling pathway activity predictions of the trained exemplary Bayesian network model using the broad literature list of putative target genes of the NFkB cellular signaling pathway (see Table 5) for colon samples from GSE4183.
Detailed description of the embodiments
The following examples merely illustrate particularly preferred methods and selected aspects in connection therewith. The teaching provided herein may be used to construct several tests and/or kits, e.g., to detect, predict and/or diagnose the abnormal activity of one or more cellular signaling pathways. Furthermore, when the methods described herein are used, drug prescription can advantageously be guided, drug response can be predicted, drug efficacy (and/or adverse effects) can be monitored, and drug resistance can be predicted and monitored, e.g., in order to select subsequent tests to be performed (such as a companion diagnostic test). The following examples are not to be construed as limiting the scope of the invention.
Example 1: mathematical model construction
As described in detail in the published international patent application WO 2013/011479 A2 ("Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression"), by constructing a probabilistic model, e.g., a Bayesian network model, and incorporating conditional probabilistic relationships between the expression levels of six or more target genes of a cellular signaling pathway, herein the NFkB cellular signaling pathway, and the level of a transcription factor (TF) element, herein the NFkB TF element, that controls transcription of the six or more target genes, such a model can be used to determine the activity of the cellular signaling pathway with a high degree of accuracy. Moreover, the probabilistic model can readily be updated to incorporate additional knowledge obtained from later clinical studies, by adjusting the conditional probabilities and/or adding new nodes to the model to represent additional information sources. In this way, the probabilistic model can be updated as appropriate to embody the most recent medical knowledge.
In another easy to comprehend and interpret approach, described in detail in the published international patent application WO 2014/102668 A2 ("Assessment of cellular signaling pathway activity using linear combination(s) of target gene expressions"), the activity of a cellular signaling pathway, herein the NFkB cellular signaling pathway, can be determined by constructing and evaluating a linear or (pseudo-)linear model that incorporates relationships between the expression levels of six or more target genes of the cellular signaling pathway and the level of a transcription factor (TF) element, herein the NFkB TF element, that controls transcription of the six or more target genes, the model being based at least in part on one or more linear combinations of the expression levels of the six or more target genes.
In both approaches, the expression levels of the six or more target genes may preferably be measurements of the level of mRNA, which can be the result of, e.g., (RT)-PCR and microarray techniques using probes associated with the target genes' mRNA sequences, and of RNA sequencing. In another embodiment, the expression levels of the six or more target genes can be measured by protein levels, e.g., the concentrations and/or activity of the proteins encoded by the target genes.
The aforementioned expression levels may optionally be converted in many ways that might or might not suit the application better. For example, four different transformations of the expression levels, e.g., microarray-based mRNA levels, may be:
"continuous data", i.e., expression levels as obtained after pre-processing of microarrays using well-known algorithms such as MAS5.0 and fRMA,
"z-score", i.e., continuous expression levels scaled such that the average across all samples is 0 and the standard deviation is 1,
"discrete", i.e., every expression above a certain threshold is set to 1 and every expression below it to 0 (e.g., the threshold for a probe set may be chosen as the mean of its values in a set of a number of positive and the same number of negative clinical samples),
"fuzzy", i.e., the continuous expression levels are converted to values between 0 and 1 using a sigmoid function of the following form: 1 / (1 + exp((thr - expr) / se)), where expr is the continuous expression level, thr is the threshold as mentioned before, and se is a softening parameter influencing how sharply the values move between 0 and 1.
One of the simplest linear models that can be constructed is a model having a node representing the transcription factor (TF) element, herein the NFkB TF element, in a first layer, and weighted nodes representing direct measurements of the target gene expression levels, e.g., by one probe set that is particularly highly correlated with the particular target gene, e.g., in microarray or (q)PCR experiments, in a second layer. The weights can be based on calculations from a training data set or on expert knowledge. Using only one expression level per target gene is particularly simple in the case where multiple expression levels may be measured per target gene (e.g., in microarray experiments, where one target gene can be measured with multiple probe sets). A specific way of selecting the one expression level used for a particular target gene is to use the expression level of the probe set that best separates the active and passive samples of a training data set. One method of identifying this probe set is to perform a statistical test, e.g., a t-test, and select the probe set with the lowest p-value. By definition, the probe set with the lowest p-value is the one for which the expression levels of the (known) active and passive training samples are least likely to overlap. Another selection method is based on odds ratios. In such a model, one or more expression levels are provided for each of the six or more target genes, and the one or more linear combinations comprise a linear combination including, for each of the six or more target genes, a weighted term, each weighted term being based on only one of the one or more expression levels provided for the respective target gene. If only one expression level is chosen per target gene as described above, the model may be called a "most discriminant probe sets" model.
In an alternative to the "most discriminant probe sets" model, it is possible, in the case where multiple expression levels may be measured per target gene, to make use of all the expression levels provided per target gene. In such a model, one or more expression levels are provided for each of the six or more target genes, and the one or more linear combinations comprise a linear combination of all expression levels of the one or more expression levels provided for the six or more target genes. In other words, for each of the six or more target genes, each of the one or more expression levels provided for the respective target gene may be weighted in the linear combination by its own (individual) weight. This variant may be called an "all probe sets" model. It has the advantage of being relatively simple while making use of all the provided expression levels.
Both models have in common that they can be regarded as "single-layer" models, in which the level of the TF element is calculated based on a linear combination of the expression levels of the one or more probe sets of the six or more target genes.
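As an illustration of the probe set selection described above, the following minimal sketch picks, for one target gene, the probe set that best separates known active from known passive training samples. It assumes a pooled-variance Student's t-test; because every probe set is then tested on the same samples, the degrees of freedom are identical and the lowest p-value corresponds to the largest absolute t statistic, so the statistic is ranked directly. The function names and the probe set identifiers in the usage note are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(active, passive):
    """Student's t statistic with pooled variance for two groups of levels."""
    n1, n2 = len(active), len(passive)
    # Pooled sample variance of the two groups (each group needs >= 2 values).
    sp2 = ((n1 - 1) * variance(active) + (n2 - 1) * variance(passive)) / (n1 + n2 - 2)
    # Note: raises ZeroDivisionError if both groups have zero variance.
    return (mean(active) - mean(passive)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


def most_discriminant_probe_set(probe_sets, active_idx, passive_idx):
    """Return the name of the probe set with the largest |t|.

    probe_sets: dict mapping probe set name -> list of expression levels,
                one value per training sample (hypothetical layout).
    active_idx / passive_idx: sample indices with known pathway status.
    """
    def abs_t(name):
        levels = probe_sets[name]
        act = [levels[i] for i in active_idx]
        pas = [levels[i] for i in passive_idx]
        return abs(pooled_t(act, pas))

    return max(probe_sets, key=abs_t)
```

For example, with two hypothetical probe sets `ps_a` and `ps_b` measured on three passive and three active samples, `most_discriminant_probe_set({"ps_a": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8], "ps_b": [2.0, 3.0, 2.5, 2.6, 3.1, 2.2]}, [3, 4, 5], [0, 1, 2])` selects `ps_a`, whose active and passive levels do not overlap.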
After the level of the TF element, herein the NFkB TF element, has been determined by evaluating the respective model, the determined TF element level can be thresholded in order to infer the activity of the cellular signaling pathway, herein the NFkB cellular signaling pathway. A preferred method of calculating such a threshold is to compare the determined TF element levels wlc of training samples known to have a passive cellular signaling pathway with those of training samples known to have an active cellular signaling pathway. A method that does so and also takes the variance within these groups into account is given by using the threshold:
where σ and μ are the standard deviation and the mean of the determined TF element levels wlc for the training samples. If only a small number of samples is available in the active and/or passive training samples, a pseudocount may be added to the calculated variances, based on the average of the variances of the two groups:
where v is the variance of the determined TF element levels wlc of the groups, x is a positive pseudocount, e.g., 1 or 10, and n_act and n_pas are the numbers of active and passive samples, respectively. The standard deviation σ can then be obtained by taking the square root of the variance v.
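The two equations introduced in the preceding passage are not legible in this text. The following LaTeX is a reconstruction from the symbol definitions given here (σ, μ, v, x, n_act, n_pas). It is believed to correspond to the form used in the cited WO 2014/102668 A2, but it should be read as an assumption rather than as a verbatim quotation. The first line is the threshold; the remaining lines are the pseudocount-adjusted variances.

```latex
\begin{align}
thr &= \frac{\sigma_{wlc_{pas}}\,\mu_{wlc_{act}} + \sigma_{wlc_{act}}\,\mu_{wlc_{pas}}}
            {\sigma_{wlc_{pas}} + \sigma_{wlc_{act}}} \\[4pt]
\tilde{v} &= \frac{v_{wlc_{act}} + v_{wlc_{pas}}}{2} \\[4pt]
\tilde{v}_{wlc_{act}} &= \frac{x\,\tilde{v} + (n_{act}-1)\,v_{wlc_{act}}}{x + n_{act} - 1} \\[4pt]
\tilde{v}_{wlc_{pas}} &= \frac{x\,\tilde{v} + (n_{pas}-1)\,v_{wlc_{pas}}}{x + n_{pas} - 1}
\end{align}
```

Under this form, each group mean is weighted by the standard deviation of the other group, so the threshold lies closer to the mean of the group with the smaller spread.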
To ease interpretation, the threshold can be subtracted from the determined TF element level wlc, resulting in an activity score for the cellular signaling pathway in which negative values correspond to a passive cellular signaling pathway and positive values correspond to an active cellular signaling pathway.
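The thresholding and scoring steps can be sketched as follows. The exact threshold formula is not legible in this text, so the code assumes a commonly used form in which each group mean is weighted by the other group's standard deviation, with variances smoothed toward the average variance by a pseudocount x. Both that formula and all function names are assumptions made for illustration.

```python
import math
from statistics import mean, variance


def smoothed_variances(act, pas, x=1.0):
    """Pseudocount-adjusted variances of the active and passive groups.

    ASSUMED form: each variance is pulled toward the average of the two
    group variances with weight x (a positive pseudocount, e.g. 1 or 10).
    """
    v_act, v_pas = variance(act), variance(pas)
    v_avg = (v_act + v_pas) / 2
    n_act, n_pas = len(act), len(pas)
    sv_act = (x * v_avg + (n_act - 1) * v_act) / (x + n_act - 1)
    sv_pas = (x * v_avg + (n_pas - 1) * v_pas) / (x + n_pas - 1)
    return sv_act, sv_pas


def threshold(act, pas, x=1.0):
    """ASSUMED threshold: group means weighted by the other group's std dev."""
    sv_act, sv_pas = smoothed_variances(act, pas, x)
    s_act, s_pas = math.sqrt(sv_act), math.sqrt(sv_pas)
    return (s_pas * mean(act) + s_act * mean(pas)) / (s_pas + s_act)


def activity_score(wlc, thr):
    """Negative score: passive pathway; positive score: active pathway."""
    return wlc - thr
```

With TF element levels `[4.0, 5.0, 6.0]` for active and `[0.0, 1.0, 2.0]` for passive training samples, both groups have the same variance, so the assumed threshold falls midway between the group means at 3.0, and a new sample with wlc = 4.0 receives a positive score.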
As an alternative to the "single-layer" models described above, a "two-layer" model may also be used in an embodiment. In such a model, a summary value is calculated for every target gene using a linear combination based on the measured intensities of its associated probe sets ("first (bottom) layer"). The calculated summary value is subsequently combined with the summary values of the other target genes of the cellular signaling pathway using a further linear combination ("second (upper) layer"). Again, the weights can be learned from a training data set, based on expert knowledge, or a combination thereof. In other words, in the "two-layer" model, one or more expression levels are provided for each of the six or more target genes, and the one or more linear combinations comprise, for each of the six or more target genes, a first linear combination of all expression levels of the one or more expression levels provided for the respective target gene ("first (bottom) layer"). The model is further based at least in part on a further linear combination including, for each of the six or more target genes, a weighted term, each weighted term being based on the first linear combination for the respective target gene ("second (upper) layer").
In a preferred version of the "two-layer" model, the calculation of the summary values can include defining a threshold for each target gene using the training data and subtracting the threshold from the calculated linear combination, yielding the target gene summary value. The threshold may here be chosen such that a negative target gene summary value corresponds to a down-regulated target gene and a positive target gene summary value corresponds to an up-regulated target gene. It is also possible for the target gene summary values to be transformed, using, e.g., one of the above-described transformations (fuzzy, discrete, etc.), before they are combined in the "second (upper) layer".
After the level of the TF element has been determined by evaluating the "two-layer" model, the determined TF element level can be thresholded as described above in order to infer the activity of the cellular signaling pathway.
In the following, the models described above are collectively denoted as "(pseudo-)linear" models. A more detailed description of the training and use of probabilistic models, e.g., a Bayesian network model, is provided in Example 3 below.
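The transformations referred to here (z-score, discrete, fuzzy) are simple element-wise operations and can be sketched as below. The fuzzy function follows the sigmoid given earlier in this example, 1 / (1 + exp((thr - expr) / se)). The function names, the use of the population standard deviation for the z-score, and the handling of a value exactly equal to the threshold are assumptions.

```python
import math
from statistics import mean, pstdev


def z_score(levels):
    """Scale continuous levels across samples to mean 0 and std dev 1.

    Uses the population standard deviation (an assumption; the sample
    standard deviation would be an equally plausible choice).
    """
    mu, sd = mean(levels), pstdev(levels)
    return [(x - mu) / sd for x in levels]


def discrete(expr, thr):
    """1 if the level is above the threshold, else 0.

    A value exactly equal to the threshold is mapped to 0 here (assumption).
    """
    return 1 if expr > thr else 0


def fuzzy(expr, thr, se):
    """Sigmoid mapping to (0, 1); se controls how soft the transition is."""
    return 1.0 / (1.0 + math.exp((thr - expr) / se))


def mean_threshold(positive, negative):
    """Threshold as the mean over equally many positive and negative samples."""
    if len(positive) != len(negative):
        raise ValueError("expected the same number of positive and negative samples")
    return mean(list(positive) + list(negative))
```

A level equal to the threshold maps to exactly 0.5 under the fuzzy transformation, and a smaller `se` makes the transition between 0 and 1 steeper, approaching the discrete transformation.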
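A minimal sketch of the two-layer evaluation described above follows: a per-gene linear combination of probe set intensities minus a gene-specific threshold (bottom layer), followed by a weighted sum over genes (upper layer). All names, weights and thresholds are hypothetical placeholders; in practice they would come from training data or expert knowledge.

```python
def gene_summary(intensities, probe_weights, gene_threshold):
    """First (bottom) layer: weighted sum of a gene's probe set intensities,
    shifted so that negative means down-regulated and positive up-regulated."""
    combined = sum(w * x for w, x in zip(probe_weights, intensities))
    return combined - gene_threshold


def two_layer_tf_level(sample, probe_weights, gene_thresholds, gene_weights):
    """Second (upper) layer: weighted sum of the per-gene summary values.

    sample:          dict gene -> list of probe set intensities
    probe_weights:   dict gene -> list of weights (same order as intensities)
    gene_thresholds: dict gene -> threshold subtracted in the bottom layer
    gene_weights:    dict gene -> weight of the gene in the upper layer
    """
    level = 0.0
    for gene, intensities in sample.items():
        summary = gene_summary(intensities, probe_weights[gene], gene_thresholds[gene])
        level += gene_weights[gene] * summary
    return level
```

The returned TF element level can then be thresholded in the same way as for the single-layer models. A transformation of the per-gene summary values could be applied inside the loop before the upper-layer weighting.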
Example 2: Selection of target genes
Transcription factors (TFs) are protein complexes (i.e., combinations of proteins bound together in a specific structure) or proteins that are able to regulate transcription from target genes by binding to specific DNA sequences, thereby controlling the transcription of genetic information from DNA to mRNA. The mRNA directly produced through this action of the TF complex is referred to herein as a "direct target gene" (of the transcription factor). Activation of a cellular signaling pathway may also result in more secondary gene transcription, referred to as "indirect target genes". In the following, (pseudo-)linear models or Bayesian network models (as exemplary mathematical models) comprising or consisting of direct target genes, as direct links between cellular signaling pathway activity and mRNA level, are preferred; however, the distinction between direct and indirect target genes is not always evident. Herein, a method for selecting direct target genes using a scoring function based on available scientific literature data is presented. Nevertheless, accidental selection of indirect target genes cannot be ruled out because of limited information as well as biological variation and uncertainty. To select the target genes, two lists of target genes were generated using the MEDLINE database of the National Institutes of Health, accessible at "www.ncbi.nlm.nih.gov/Pubmed" and further referred to herein as "Pubmed".
Publications containing putative NFkB target genes were searched for during the second and third quarters of 2013 using queries such as (NFkB AND "target gene"). The resulting publications were further analyzed manually following the methodology described in more detail below.
Specific cellular signaling pathway mRNA target genes were selected from the scientific literature by using a ranking system in which scientific evidence for a specific target gene was rated according to the type of scientific experiment in which the evidence was accumulated. While some experimental evidence is merely suggestive of a gene being a direct target gene, such as an mRNA increase detected through an increasing probe set intensity on a microarray of a cell line in which the NFkB cellular signaling pathway is known to be active, other evidence can be very strong, such as the combination of an identified NFkB TF binding site, retrieval of this site in a chromatin immunoprecipitation (ChIP) assay after stimulation of the specific cellular signaling pathway in the cell, and an increase in mRNA after specific stimulation of the cellular signaling pathway in a cell line.
Several experimental types for the discovery of target genes of specific cell signaling pathways can be identified in the scientific literature:
1. ChIP experiments in which direct binding of a TF of the cellular signaling pathway of interest to its binding site on the genome is shown. Example: by using chromatin immunoprecipitation (ChIP) technology, putative functional NFkB TF binding sites in the DNA of cell lines with and without induction of NFkB cellular signaling pathway activity, e.g., by stimulation with tumor necrosis factor α (TNFα) or lipopolysaccharide (LPS), were identified as a subset of the binding sites recognized purely on the basis of nucleotide sequence. Putative functionality was identified as ChIP-derived evidence that the TF was found to bind to the DNA binding site.
2. Electrophoretic mobility shift assays (EMSA), which show in vitro binding of a TF to a fragment of DNA containing the binding sequence. Compared with ChIP-based evidence, EMSA-based evidence is less strong, because it cannot be translated to the in vivo situation.
3. Stimulation of the cellular signaling pathway and measurement of mRNA expression using a microarray, RNA sequencing, quantitative PCR or other techniques, using NFkB cellular signaling pathway-inducible cell lines and measuring mRNA profiles at at least one, but preferably several, time points after induction, in the presence of cycloheximide, which inhibits translation to protein, so that the induced mRNAs are assumed to be direct target genes.
4. Similar to 3, but alternatively measuring the expression further downstream through protein abundance measurements, such as Western blot.
5. Identification of TF binding sites in the genome using a bioinformatics approach. Example for the NFkB TF element: using the NFkB binding motif 5'-GGGRNWYYCC-3' (R: A or G, N: any nucleotide, W: A or T, Y: C or T), a software program was run on the human genome sequence, and potential binding sites were identified both in gene promoter regions and in other genomic regions.
6. Similar to 3, only in the absence of cycloheximide.
7. Similar to 4, only in the absence of cycloheximide.
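The bioinformatics approach in item 5 can be illustrated with a small motif scanner. The sketch below translates the motif 5'-GGGRNWYYCC-3' into a regular expression using the IUPAC codes spelled out above and scans both strands of a DNA string. It is a toy illustration of the idea, not the software program used by the inventors; the function names and the 0-based coordinate convention are assumptions.

```python
import re

# IUPAC codes as defined in item 5: R = A/G, N = any, W = A/T, Y = C/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "N": "[ACGT]", "W": "[AT]", "Y": "[CT]"}
MOTIF = "GGGRNWYYCC"

# A lookahead lets overlapping occurrences be reported as well.
PATTERN = re.compile("(?=(" + "".join(IUPAC[b] for b in MOTIF) + "))")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq):
    """Return sorted (start, strand, matched_sequence) tuples.

    start is the 0-based position of the site's leftmost base on the given
    (forward) sequence; matched_sequence is read 5'->3' on its own strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in PATTERN.finditer(seq)]
    rc = reverse_complement(seq)
    for m in PATTERN.finditer(rc):
        start = len(seq) - (m.start() + len(MOTIF))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)
```

For instance, `find_sites("TTGGGACTTTCCAA")` reports a single forward-strand site, `GGGACTTTCC`, starting at position 2. A sequence-only hit of this kind is the weakest evidence type in the list and says nothing about in vivo binding.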
In the simplest form, every potential gene can be given 1 point for each of these experimental approaches in which the gene was identified as a target gene of the NFkB family of transcription factors. Using this relative ranking strategy, a list of the most reliable target genes can be made.
Alternatively, ranking in another way can be used to identify the target genes that are most likely to be direct target genes, by giving a higher number of points to the technology that provides the most evidence for an in vivo direct target gene. In the above list, for experimental method 1), this means 8 points, 7 points for method 2), and down to 1 point for experimental method 8). Such a list may be called a "general list of target genes".
Notwithstanding biological variation and uncertainty, the inventors assumed that direct target genes are the most likely to be induced in a tissue-independent manner. A list of these target genes may be called an "evidence support list of target genes". Such an evidence support list of target genes has been used to construct computational models of the NFkB cellular signaling pathway that can be applied to samples from different tissue sources.
The following illustrates, by way of example, how the selection of the evidence support list of target genes was specifically constructed for the NFkB cellular signaling pathway.
A scoring function was introduced that gave a point for each type of experimental evidence reported in a publication, such as ChIP, EMSA, differential expression, knock-down/knock-out, luciferase gene reporter assay, and sequence analysis. The same type of experimental evidence is sometimes mentioned in multiple publications, which results in a corresponding number of points; e.g., two publications mentioning a ChIP finding result in twice the score given for a single ChIP finding. Further analysis was performed to allow only genes that had diverse types of experimental evidence, and not just one type of experimental evidence, e.g., differential expression. Finally, an evidence score was calculated for all putative NFkB target genes, and all putative NFkB target genes with an evidence score of 5 or higher were selected (as shown in Table 1). The cut-off level of 5 was chosen heuristically as providing sufficiently strong evidence.
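The scoring and filtering rule just described can be sketched in a few lines. The version below gives one point per reported finding (so the same evidence type reported in two publications counts twice), requires at least two distinct evidence types, and applies the cut-off of 5. The data layout, the gene and publication identifiers, and the option to override the points per evidence type are hypothetical.

```python
def evidence_score(findings, points=None):
    """Sum of points over all reported findings for one gene.

    findings: list of (evidence_type, publication_id) tuples.
    points:   optional dict evidence_type -> points; any type not listed
              counts 1 point (the simplest scheme).
    """
    points = points or {}
    return sum(points.get(ev_type, 1) for ev_type, _pub in findings)


def select_target_genes(evidence, cutoff=5, points=None):
    """Keep genes with >= 2 distinct evidence types and score >= cutoff.

    evidence: dict gene -> list of (evidence_type, publication_id) tuples.
    Returns a dict gene -> evidence score for the selected genes.
    """
    selected = {}
    for gene, findings in evidence.items():
        distinct_types = {ev_type for ev_type, _pub in findings}
        score = evidence_score(findings, points)
        if len(distinct_types) > 1 and score >= cutoff:
            selected[gene] = score
    return selected
```

Under this rule, a gene supported only by six differential-expression reports is rejected despite its score of 6, whereas a gene with two ChIP reports plus EMSA, differential expression and luciferase reporter evidence reaches a score of 5 and is kept.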
A further selection of the evidence support list of target genes (listed in Table 1) was made by the inventors. The target genes of the evidence support list that proved to be more probative in determining the activity of the NFkB cellular signaling pathway from the training samples were selected. The selection was made using a combination of the literature evidence score and training results on two different data sets. Three rankings were made. The first was based on the evidence score, with the highest score ranked first and each point less resulting in one rank lower. For the other two, the "soft" odds ratios of all included probe sets (as shown in Table 1) were calculated as described herein using the GSE12195 and E-MTAB-1312 data sets, and the absolute value of the log2-transformed "soft" odds ratio was used to rank the probe sets, with the highest value ranked first, and so on. As described herein, the ABC DLBCL samples from GSE12195 were selected as active NFkB training samples, whereas the normal samples from this data set were selected as inactive NFkB training samples. In addition, a second set of "soft" odds ratios was calculated using samples from E-MTAB-1312, an alternative NFkB training data set, in which the control samples were used as inactive NFkB samples and the samples stimulated with different concentrations of TNFα for 2 hours were selected as active NFkB training samples. A preferred set of target genes, having at least one associated probe set with an average rank less than or equal to 20, was selected for the "19 target gene candidate list". A more preferred set of target genes, with an average rank less than or equal to 15, was selected for the "13 target gene candidate list". A most preferred set of target genes, with an average rank less than or equal to 12, was selected for the "7 target gene candidate list". The 19 target gene candidate list, the 13 target gene candidate list and the 7 target gene candidate list are shown in Tables 2 to 4, respectively.
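The rank-averaging step can be sketched as follows. The sketch works at gene level and takes the "soft" odds ratios as given inputs, since their calculation is described elsewhere. It assumes that the evidence-score rank drops by one for every point below the top score and that the two odds-ratio rankings are plain ordinal rankings of |log2(odds ratio)|. All names and numbers are hypothetical.

```python
import math


def rank_by_evidence(scores):
    """Highest score gets rank 1; each point less gives one rank lower."""
    top = max(scores.values())
    return {gene: top - s + 1 for gene, s in scores.items()}


def rank_by_abs_log2(odds_ratios):
    """Largest |log2(odds ratio)| gets rank 1, the next rank 2, and so on."""
    strength = {g: abs(math.log2(v)) for g, v in odds_ratios.items()}
    ordered = sorted(strength, key=strength.get, reverse=True)
    return {g: i + 1 for i, g in enumerate(ordered)}


def shortlist(scores, odds_a, odds_b, max_avg_rank):
    """Genes whose average of the three ranks is <= max_avg_rank, best first."""
    r1 = rank_by_evidence(scores)
    r2 = rank_by_abs_log2(odds_a)
    r3 = rank_by_abs_log2(odds_b)
    avg = {g: (r1[g] + r2[g] + r3[g]) / 3 for g in scores}
    return sorted((g for g in avg if avg[g] <= max_avg_rank), key=avg.get)
```

Calling `shortlist` with progressively smaller `max_avg_rank` values yields nested lists, in the same way that the 19, 13 and 7 target gene candidate lists are nested.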
Table 1: "Evidence support list of target genes" of the NFkB cellular signaling pathway used in the NFkB cellular signaling pathway models, and associated probe sets used to measure the mRNA expression level of the target genes.
Table 2: "19 target gene candidate list" of target genes of the NFkB cellular signaling pathway, based on the literature evidence score and the "soft" odds ratios for GSE12195 and E-MTAB-1312.
CXCL2
ICAM1
IL6
CCL5
NFKBIA
IL8
TNFAIP2
CXCL3
MMP9
NFKB2
CCL20
CCL2
CXCL1
TNF
IRF1
TRAF1
NFKBIE
VCAM1
BIRC3
Table 3: "13 target gene candidate list" of target genes of the NFkB cellular signaling pathway, based on the literature evidence score and the "soft" odds ratios for GSE12195 and E-MTAB-1312.
CXCL2
ICAM1
IL6
CCL5
NFKBIA
IL8
TNFAIP2
CXCL3
MMP9
NFKB2
CCL20
CCL2
CXCL1
Table 4: "7 target gene candidate list" of target genes of the NFkB cell signaling pathway based on literature evidence scores and "soft" odds ratio of GSE12195 and E-MTAB-1312.
Example 3: training and Using mathematical models
Before a mathematical model can be used to infer the activity of a cell signaling pathway, here the NFkB cell signaling pathway, the model must be properly trained.
If the mathematical model is a probabilistic model, e.g., a Bayesian network model, based at least in part on conditional probabilities relating the NFkB TF element and the expression levels of the six or more target genes of the NFkB cellular signaling pathway measured in a sample, the training may preferably be performed as described in detail in the published international patent application WO 2013/011479 A2 ("Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression").
If the mathematical model is based at least in part on one or more linear combinations of the expression levels of the six or more target genes of the NFkB cellular signaling pathway measured in a sample, the training may preferably be performed as described in detail in the published international patent application WO 2014/102668 A2 ("Assessment of cellular signaling pathway activity using linear combination(s) of target gene expressions").
Herein, an exemplary Bayesian network model as shown in Fig. 2 was first used to model the transcriptional program of the NFkB cellular signaling pathway in a simple manner. The model consists of three types of nodes: (a) a transcription factor (TF) element (with states "absent" and "present") in a first layer 1; (b) target genes TG1, TG2, TGn (with states "down-regulated" and "up-regulated") in a second layer 2; and (c) measurement nodes linked to the expression levels of the target genes in a third layer 3. These can be microarray probe sets PS1,1, PS1,2, PS1,3, PS2,1, PSn,1, PSn,m (with states "low" and "high"), as preferably used herein, but could also be other gene expression measurements such as RNAseq or RT-qPCR.
Suitable implementations of the mathematical model, here the exemplary bayesian network model, are based on microarray data. The model describes how (i) the expression level of a target gene depends on the activation of the TF element, and (ii) how the probe set strength in turn depends on the expression level of the respective target gene. For the latter, probe set intensities were obtained from fRMA-pretreated Affymetrix HG-U133Plus2.0 microarrays, which are widely available from Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/GEO) and Arrayexpress (www.ebi.ac.uk/ArrayExpress).
Since the exemplified bayesian network model is a simplification of the biology of the cell signaling pathway, here the NFkB cell signaling pathway, and is generally noisy as a biological measure, a probabilistic approach was chosen, i.e., the relationship between (i) the TF element and the target gene, and (ii) the target gene and its respective probe set, is described in probabilistic terms. Furthermore, it is speculated that the activity of oncogenic cellular signaling pathways driving tumor growth is not transiently and dynamically altered, but rather is chronically or even irreversibly altered. Thus, an example bayesian network model was developed to account for static cell conditions. For this reason, complex dynamic cellular signaling pathway features were not incorporated into the model.
Once the exemplary Bayesian network model has been built and calibrated (see below), the model can be applied to microarray data of a new sample by entering the probe set measurements as observations in the third layer 3 and inferring backwards in the model what the probability must have been for the TF element to be "present". Herein, "present" is considered to be the phenomenon in which the TF element is bound to the DNA and controls transcription of the target genes of the cell signaling pathway, and "absent" is the case in which the TF element does not control transcription. This probability is thus the primary readout that may be used to indicate the activity of the cell signaling pathway, here the NFkB cell signaling pathway. It can then be translated into the odds of the cell signaling pathway being active by taking the ratio of the probability that it is active to the probability that it is inactive (i.e., the odds are given by p/(1-p), where p is the predicted probability that the cell signaling pathway is active).
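The conversion from the inferred probability to odds can be illustrated with a minimal Python sketch. The function names are hypothetical, and the use of a base-2 logarithm for a signed score is an assumption made only for illustration; the text itself specifies just the ratio p/(1-p).

```python
import math


def odds_from_probability(p: float) -> float:
    """Odds that the pathway is active: p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def log2_odds(p: float) -> float:
    """Signed score (assumed log2 scale): positive when 'present' is more likely."""
    return math.log2(odds_from_probability(p))


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, odds_from_probability(p), log2_odds(p))
```

On such a logarithmic scale, a probability of 0.5 maps to zero, so scores above zero indicate that the TF element is more likely "present" than "absent".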
In the exemplary Bayesian network model, the probabilistic relationships have been made quantitative to allow quantitative probabilistic reasoning. To improve the generalization behavior across tissue types, the parameters describing the probabilistic relationships between (i) the TF element and the target genes have been carefully hand-picked. If the TF element is "absent", the target gene is most likely "down-regulated", so a probability of 0.95 is chosen for this, and a probability of 0.05 is chosen for the target gene being "up-regulated". The latter (non-zero) probability accounts for the (rare) possibility that the target gene is regulated by other factors or is accidentally observed as "up-regulated" (e.g., because of measurement noise). If the TF element is "present", the target gene is considered "up-regulated" with a probability of 0.70 and "down-regulated" with a probability of 0.30. The latter values are chosen this way because there can be several reasons why a target gene is not highly expressed even though the TF element is present, for example because the promoter region of the gene is methylated. In the case where a target gene is not up-regulated by the TF element but down-regulated, the probabilities are chosen in a similar way, but reflecting down-regulation in the presence of the TF element. The parameters describing the relationships between (ii) the target genes and their respective probe sets have been calibrated on experimental data. For the latter, in this example, microarray data from patient samples known to have an active NFkB cell signaling pathway were used, while normal healthy samples from the same dataset were used as samples with an inactive NFkB cell signaling pathway; this could also be done using cell line experiments or other patient samples with a known cell signaling pathway activity status. The resulting conditional probability tables are given by:
For up-regulated target genes
For down-regulated target genes
In these tables, the variables ALi,j, AHi,j, PLi,j, and PHi,j indicate the number of calibration samples with an "absent" (A) or "present" (P) transcription complex that have a "low" (L) or "high" (H) probe set intensity, respectively. Pseudo-counts have been added to avoid extreme probabilities of 0 and 1.
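How such count-based tables feed into the backward inference can be sketched as follows. This is a simplified illustration for up-regulated target genes only, not the calibrated model: the pseudo-count of 1 per cell, the mapping of "absent" calibration samples to the "down" row, the uniform prior of 0.5 on the TF element, and all function names are assumptions. Only the 0.95/0.05 and 0.70/0.30 probabilities come from the text.

```python
# P(target gene state | TF state) for an up-regulated target gene (values from the text).
P_TG_GIVEN_TF = {
    "absent": {"down": 0.95, "up": 0.05},
    "present": {"down": 0.30, "up": 0.70},
}


def probe_cpt(al, ah, pl, ph, pseudo=1.0):
    """P(probe set state | target gene state) from calibration counts.

    Assumption: 'absent' samples (AL, AH) inform the 'down' row and
    'present' samples (PL, PH) the 'up' row; `pseudo` is added to each cell.
    """
    down_total = al + ah + 2 * pseudo
    up_total = pl + ph + 2 * pseudo
    return {
        "down": {"low": (al + pseudo) / down_total, "high": (ah + pseudo) / down_total},
        "up": {"low": (pl + pseudo) / up_total, "high": (ph + pseudo) / up_total},
    }


def posterior_tf_present(genes, prior_present=0.5):
    """Posterior probability that the TF element is 'present'.

    `genes` is a list with one entry per target gene; each entry is a list of
    (cpt, observed_state) pairs, one pair per probe set of that gene.
    """
    likelihood = {}
    for tf_state, tg_dist in P_TG_GIVEN_TF.items():
        total = 1.0
        for probes in genes:
            gene_term = 0.0
            # Marginalize over the hidden target gene state.
            for tg_state, p_tg in tg_dist.items():
                p_obs = 1.0
                for cpt, observed in probes:
                    p_obs *= cpt[tg_state][observed]
                gene_term += p_tg * p_obs
            total *= gene_term
        likelihood[tf_state] = total
    numerator = prior_present * likelihood["present"]
    return numerator / (numerator + (1.0 - prior_present) * likelihood["absent"])


if __name__ == "__main__":
    cpt = probe_cpt(al=9, ah=1, pl=1, ph=9)  # hypothetical calibration counts
    print(posterior_tf_present([[(cpt, "high")]]))
    print(posterior_tf_present([[(cpt, "low")]]))
```

With more target genes and probe sets, the per-gene terms multiply, which is consistent with the observation later in the text that predictions become less extreme when a model contains fewer target genes.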
To discretize the observed probe set intensities, a threshold ti,j was used for each probe set PSi,j, below which the observation is called "low" and above which it is called "high". This threshold was chosen to be the (weighted) median intensity of the probe set in the calibration dataset used. Because of the noise in microarray data, a fuzzy approach was used when comparing an observed probe set intensity to its threshold, by assuming a normal distribution with a standard deviation of 0.25 (on a log2 scale) around the reported intensity and determining the probability mass below and above the threshold. An odds ratio calculated using pseudo-counts in combination with probability masses instead of deterministic measurement values is referred to herein as a "soft" odds ratio.
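The fuzzy comparison of an intensity against its threshold can be sketched in a few lines of Python. The function names are hypothetical; the standard deviation of 0.25 on a log2 scale is the value given in the text.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def soft_low_high(intensity: float, threshold: float, sd: float = 0.25):
    """Probability mass below ('low') and above ('high') the threshold.

    The observed log2 intensity is treated as the mean of a normal
    distribution with standard deviation `sd`.
    """
    p_low = normal_cdf(threshold, intensity, sd)
    return p_low, 1.0 - p_low


if __name__ == "__main__":
    # Hypothetical log2 intensities around a threshold of 8.0.
    for value in (7.5, 8.0, 8.25):
        print(value, soft_low_high(value, threshold=8.0))
```

An observation exactly at the threshold is split evenly between "low" and "high", whereas an observation one standard deviation above it contributes roughly 0.16 to "low" and 0.84 to "high".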
Further details regarding the use of mathematical modeling of target gene expression to infer cell signaling pathway activity can be found in Verhaegh W. et al., "Selection of personalized patient therapy through the use of knowledge-based computational models that identify tumor-driving signal transduction pathways", Cancer Research, Vol. 74, No. 11, 2014, page 2936 et seq.
If, instead of the exemplary Bayesian network described above, a (pseudo-)linear model as described in example 1 above is employed, the weights indicating the sign and magnitude of the correlation between the nodes, and a threshold for calling a node either "absent" or "present", need to be determined before the model can be used to infer cell signaling pathway activity in a test sample. Weights and thresholds may be filled in a priori using expert knowledge, but typically the model is trained using a representative set of training samples, preferably with known ground truth, e.g., expression data of probe sets in samples with a known "present" transcription factor complex (= active cell signaling pathway) or "absent" transcription factor complex (= inactive cell signaling pathway).
A large number of training algorithms (e.g., regression) are known in the art that take the model topology into account and adjust the model parameters, here the weights and thresholds, such that the model output, here the weighted linear score, is optimized. Alternatively, the weights can be calculated directly from the observed expression levels without the need for an optimization algorithm.
The first method, referred to herein as the "black and white" method, boils down to a ternary system in which each weight is an element of the set {-1, 0, 1}. Put in a biological context, -1 and 1 correspond to target genes or probe sets that are down-regulated and up-regulated, respectively, when the cell signaling pathway is active. If a probe set or target gene cannot be statistically shown to be either up- or down-regulated, it receives a weight of 0. In one example, a left-sided and a right-sided two-sample t-test of the expression levels of the samples with an active cell signaling pathway versus the expression levels of the samples with a passive cell signaling pathway can be used to determine whether a probe set or gene is up- or down-regulated given the training data used. If the mean of the active samples is statistically larger than that of the passive samples, i.e., the p-value is below a certain threshold (e.g., 0.3), the target gene or probe set is determined to be up-regulated. Conversely, if the mean of the active samples is statistically lower than that of the passive samples, the target gene or probe set is determined to be down-regulated upon activation of the cell signaling pathway. If the lowest p-value (left- or right-sided) exceeds the aforementioned threshold, the weight of the target gene or probe set can be defined as 0.
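The ternary weight assignment can be sketched as follows. To keep the example free of external dependencies, a one-sided permutation test on the difference of means is used here as a stand-in for the one-sided two-sample t-tests named in the text; this substitution, the function names, the number of permutations, and the example intensities are all assumptions. The p-value threshold of 0.3 is the example value from the text.

```python
import random
from statistics import mean


def one_sided_p(group_a, group_b, n_perm=2000, seed=0):
    """Permutation p-value for the hypothesis mean(group_a) > mean(group_b)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def ternary_weight(active, passive, alpha=0.3):
    """Return 1 (up-regulated), -1 (down-regulated) or 0 (no evidence)."""
    p_up = one_sided_p(active, passive)
    p_down = one_sided_p(passive, active)
    if min(p_up, p_down) > alpha:
        return 0
    return 1 if p_up < p_down else -1


if __name__ == "__main__":
    # Hypothetical log2 intensities of one probe set.
    active = [9.1, 9.4, 8.9, 9.6, 9.2]
    passive = [6.0, 6.3, 5.8, 6.1, 6.4]
    print(ternary_weight(active, passive))
    print(ternary_weight(passive, active))
```

Where SciPy is available, the permutation test could be replaced by a one-sided two-sample t-test, which matches the wording of the text more closely.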
The second method, referred to herein as "log odds" weights, is based on the logarithm (e.g., base e) of the odds ratio. The odds ratio for each target gene or probe set is calculated from the number of positive and negative training samples for which the probe set/target gene level is above and below a corresponding threshold (e.g., the (weighted) median of all training samples). Pseudo-counts may be added to avoid division by zero. A further refinement is to count the samples above/below the threshold in a somewhat more probabilistic manner, by assuming that the probe set/target gene levels are, for example, normally distributed around their observed values with a certain specified standard deviation (e.g., 0.25 on a log2 scale), and counting the probability mass above and below the threshold.
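A log-odds weight for a single probe set can be sketched as below, in both the hard-count and the "soft" variant. The pseudo-count of 1 per cell, the use of the plain (unweighted) median as threshold, the function names, and the example intensities are assumptions for illustration only.

```python
import math
from statistics import median


def mass_above(value, threshold, sd):
    """P(X > threshold) for X normally distributed around `value` with std `sd`."""
    return 0.5 * (1.0 - math.erf((threshold - value) / (sd * math.sqrt(2.0))))


def log_odds_weight(active, passive, threshold=None, sd=0.25, pseudo=1.0, soft=True):
    """Natural log of the odds ratio of 'high' intensity in active vs passive samples."""
    if threshold is None:
        # Assumption: plain median of all training samples.
        threshold = median(list(active) + list(passive))

    def counts(values):
        if soft:
            high = sum(mass_above(v, threshold, sd) for v in values)
        else:
            high = float(sum(v > threshold for v in values))
        return high, len(values) - high

    a_high, a_low = counts(active)
    p_high, p_low = counts(passive)
    odds_active = (a_high + pseudo) / (a_low + pseudo)
    odds_passive = (p_high + pseudo) / (p_low + pseudo)
    return math.log(odds_active / odds_passive)


if __name__ == "__main__":
    # Hypothetical log2 intensities of one probe set.
    active = [10.0, 10.5, 11.0]
    passive = [6.0, 6.5, 7.0]
    print(log_odds_weight(active, passive, soft=False))
    print(log_odds_weight(active, passive, soft=True))
```

A positive weight indicates a probe set that tends to be "high" when the pathway is active, and a negative weight the opposite; the pseudo-counts keep the weight finite even when the two groups are perfectly separated.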
Here, publicly available expression data of activated B-cell like (ABC) diffuse large B-cell lymphoma (DLBCL) cells, which are known to have an active NFkB cell signaling pathway, and of normal cells, which are known to have an inactive NFkB cell signaling pathway, are used as an example (GSE12195, available from the Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/GEO, last accessed October 9, 2013)).
Figs. 3-6 show the training results of the exemplary Bayesian network model based on the evidence support list of target genes, the 19 target gene candidate list, the 13 target gene candidate list, and the 7 target gene candidate list of the NFkB cell signaling pathway, respectively (see tables 1-4). In the figures, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. Activated B-cell like (ABC) DLBCL samples (group 1), which are known to have an active NFkB cell signaling pathway, were designated as NFkB-active training samples, while normal samples (group 7), which are known to have an inactive NFkB cell signaling pathway, were designated as NFkB-inactive training samples. A perfect score was obtained on the training samples for all four models: all ABC samples were predicted to be active and all normal samples to have a passive NFkB cell signaling pathway. As expected, the predictions for the training samples become less extreme as the model contains fewer target genes. In the same dataset, healthy unstimulated memory B cells and naive B cells (groups 5 and 6, respectively) were correctly predicted to have a passive NFkB cell signaling pathway in the models with 19 and 7 target genes, with few exceptions. Most other lymphoma samples, among them germinal center B-cell like (GCB) DLBCL (group 4), follicular lymphoma (group 3), lymphoblastoid cell lines (group 2), and DLBCL samples of unknown subtype (group 8), were predicted to have an active NFkB cell signaling pathway in all four models. For GCB DLBCL, a high proportion of tumors (> 50%) is known to carry mutations in at least one NFkB regulatory gene, which may be associated with aberrant NFkB activity (see, e.g., Compagno M. et al., "Mutations of multiple genes cause deregulation of NF-kappaB in diffuse large B-cell lymphoma", Nature, Vol. 459, No. 7247, 2009, pages 717-721). Still other studies indicate that constitutive NFkB activation is a common feature of most hematological malignancies (see, e.g., Keutgens A. et al., "Deregulated NF-kB activity in hematological malignancies", Biochemical Pharmacology, Vol. 72, No. 9, 2006, pages 1069-1080). (legend: 1-ABC DLBCL; 2-lymphoblastoid cell line; 3-follicular lymphoma; 4-GCB DLBCL; 5-memory B cells; 6-naive B cells; 7-normal; 8-DLBCL unknown subtype)
In the following, validation results of the trained exemplary Bayesian network models using the evidence support list of target genes, the 19 target gene candidate list, the 13 target gene candidate list, and the 7 target gene candidate list, respectively, are shown in Figs. 7-19.
Figure 7 shows the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network model trained using the evidence support list of target genes (see table 1) for diffuse large B-cell lymphoma (DLBCL) samples from GSE34171. In the figure, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. All samples were predicted to have an active NFkB cell signaling pathway. DLBCL is known to exhibit aberrant NFkB activity (see, e.g., Keutgens A. et al., "Deregulated NF-kB activity in hematological malignancies", Biochemical Pharmacology, Vol. 72, No. 9, 2006, pages 1069-1080).
Figures 8-11 show the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network models trained using the evidence support list of target genes, the 19 target gene candidate list, the 13 target gene candidate list, and the 7 target gene candidate list, respectively (see tables 1-4), for normal human bronchial epithelial (NHBE) cell line samples from E-MTAB-1312 that were stimulated with different TNFα concentrations for different stimulation times. In the figures, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. Stimulation for only 0.5 h was apparently too short for the NFkB cell signaling pathway to become active (groups 5, 9 and 13), while stimulation for longer than 4 hours, here 24 hours (groups 8, 12 and 16), appeared to reduce NFkB cell signaling pathway activity. This is expected because of asynchronous NFkB oscillations, whereby longer stimulation times lead to a reduction of the net NFkB cell signaling pathway activity (see, e.g., Nelson D.E. et al., "Oscillations in NF-kappaB Signaling Control the Dynamics of Gene Expression", Science, Vol. 306, No. 5696, 2004, page 708). The model using the evidence support list of target genes showed almost complete inhibition of NFkB cell signaling pathway activity in the 24 hour samples, while the models using the 19 target gene candidate list, the 13 target gene candidate list, and the 7 target gene candidate list showed a less pronounced decrease of NFkB cell signaling pathway activity in the 24 hour samples. All control samples without TNFα stimulation (groups 1-4) were correctly predicted to have a passive NFkB cell signaling pathway in all four models.
Increasing the TNFα concentration does not appear to cause a comparably large difference in cell signaling pathway activity, as is evident from the 2 and 4 hour stimulations; apparently all concentrations tested in this experiment, even the lowest, are sufficient to induce full NFkB cell signaling pathway activity. (legend: 1-no TNFα (0.5 h); 2-no TNFα (2 h); 3-no TNFα (4 h); 4-no TNFα (24 h); 5-low TNFα (0.5 h); 6-low TNFα (2 h); 7-low TNFα (4 h); 8-low TNFα (24 h); 9-medium TNFα (0.5 h); 10-medium TNFα (2 h); 11-medium TNFα (4 h); 12-medium TNFα (24 h); 13-high TNFα (0.5 h); 14-high TNFα (2 h); 15-high TNFα (4 h); 16-high TNFα (24 h))
Figure 12 shows the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network model trained using the evidence support list of target genes (see table 1) for THP-1 monocytes. In the figure, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. The THP-1 cell line is known to exhibit NFkB cell signaling pathway activity upon stimulation with an immune response inducer such as lipopolysaccharide (LPS). The NFkB cell signaling pathway activity in these monocytes was predicted to be low in the absence of LPS stimulation (group 1). Upon stimulation of these cells with LPS, NFkB cell signaling pathway activity increased (group 2). Coenzyme Q10 (CoQ10) has been reported to have anti-inflammatory effects, but its effect on NFkB cell signaling pathway activity has been reported to be limited (see, e.g., "Identification of LPS-inducible genes downregulated by ubiquinone in human THP-1 monocytes", Biofactors, Vol. 36, No. 3, 2010, page 222 et seq.). This is also reflected in the high NFkB cell signaling pathway activity predicted in THP-1 cells treated with CoQ10 and stimulated with LPS (group 3). (legend: 1-no LPS; 2-LPS stimulation; 3-LPS stimulation + coenzyme Q10)
Figures 13-16 show the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network models trained using the evidence support list of target genes, the 19 target gene candidate list, the 13 target gene candidate list, and the 7 target gene candidate list, respectively (see tables 1-4), for colon samples from GSE4183. In the figures, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. With the model using the evidence support list of target genes, most normal samples (group 1) were correctly predicted to have a low probability of NFkB cell signaling pathway activity; with the models using the 19 target gene candidate list, the 13 target gene candidate list, and the 7 target gene candidate list, however, the cell signaling pathway activity of the normal samples appeared to be higher than expected on the basis of the other data sets. On the other hand, all four models predict that all inflammatory bowel disease (IBD) samples (group 2) have a significantly more active NFkB cell signaling pathway than the normal samples, as expected given the inflammatory state of the colon in these patients, since the NFkB cell signaling pathway is activated by inflammation. Furthermore, it is evident from all four models that the NFkB cell signaling pathway is more active in the more advanced colorectal cancer (CRC) samples (group 4) than in the benign adenomas (group 3), possibly because of additional mutations acquired as the tumor progresses to a more malignant state, inflammation caused by the progressing tumor, or leukocyte infiltration resulting from the immune response to late stage cancer.
Potentially, the NFkB cell signaling pathway activity score can be used to predict the likelihood of an adenoma progressing to a malignancy. (legend: 1-Normal colon; 2-IBD; 3-adenoma; 4-CRC)
Figure 17 shows the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network model trained using the evidence support list of target genes (see table 1) for breast cancer cell lines from GSE10890. In the figure, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. Some breast cancer cell lines (groups 1 and 2), for which the pathway driving cell survival and proliferation is still unknown, were identified as having an aberrantly active NFkB cell signaling pathway, which may be responsible for cell survival and proliferation. In the third group, previously identified as having an active Wnt cell signaling pathway (see published international patent application WO2013/011479A2, "Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression"), no samples with an active NFkB cell signaling pathway were found. (legend: 1 and 2-cell lines driven by an unknown cell signaling pathway; 3-cell lines driven by the Wnt cell signaling pathway)
Figure 18 shows the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network model trained using the evidence support list of target genes (see table 1) for ovarian samples from GSE20565. In the figure, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. The proportion of samples predicted to have an active NFkB cell signaling pathway was significantly higher in primary ovarian cancers (groups 1 and 2) than in breast cancer metastases to the ovaries (groups 3 and 4), with p = 6.0e-5 (Fisher exact test). Chronic inflammation is associated with the development and progression of tumors. Recurrent inflammatory processes in the ovary caused by ovulation, endometriosis, and pelvic infections have been shown to be associated with tumors with an active NFkB cell signaling pathway (see, e.g., Alvero A.B., "Recent insights into the role of NF-kappaB in ovarian carcinogenesis", Genome Medicine, Vol. 2, No. 8, 2010, page 56, and Guo R.X. et al., "Increased staining for phosphorylated AKT and nuclear factor-kappaB p65 and their relationship with prognosis in epithelial ovarian cancer", Pathology International, Vol. 58, No. 12, 2008, pages 746-756). This association of ovarian carcinogenesis with inflammatory processes may explain the higher proportion of NFkB-active tumors among those originating in the ovary compared to those originating in the breast. (legend: 1-primary ovarian carcinoma; 2-likely primary ovarian carcinoma; 3-breast carcinoma ovarian metastasis; 4-likely breast carcinoma ovarian metastasis)
Figure 19 shows the NFkB cell signaling pathway activity predictions of the exemplary Bayesian network model trained using the evidence support list of target genes (see table 1) for breast cancer samples from E-MTAB-1006. In the figure, the vertical axis indicates the odds that the TF element is "present" or "absent", which corresponds to the NFkB cell signaling pathway being active or inactive: values above the horizontal axis indicate that the TF element is more likely to be "present"/active, while values below the horizontal axis indicate that the TF element is more likely to be "absent"/inactive. The mean of the logit NFkB odds in the inflammatory breast cancer (IBC) samples (group 1) was significantly higher, with p < 0.01 (two-sample t-test), than in the non-inflammatory breast cancer (nIBC) samples (group 2). This may reflect the more inflammatory state of IBC compared to nIBC. However, the predicted NFkB cell signaling pathway activities of the two groups overlap to a large extent, which makes a clear classification of IBC versus nIBC based on NFkB cell signaling pathway activity very difficult with the model in its current state. (legend: 1-inflammatory breast cancer; 2-non-inflammatory breast cancer)
Fig. 20 shows the overall survival of 272 glioma patients (GSE16011) in a Kaplan-Meier plot, stratified by the predictions of the exemplary Bayesian network model trained using the evidence support list of target genes (see table 1). In the figure, the vertical axis represents the overall survival of the patient group and the horizontal axis represents time (in years). The figure shows that an active NFkB TF element is a prognostic marker of poor overall survival, as indicated by the steeper slope of the corresponding curve. (The patient group with a predicted active NFkB TF element consisted of 113 patients (solid line) and the patient group with a predicted inactive NFkB TF element consisted of 159 patients (dashed line).) The prognostic value of the activity level of the NFkB TF element is also reflected in the hazard ratio of the predicted probability of NFkB activity: 1.83 (95% CI: 1.34-2.49, p = 7.2e-5).
In addition to the exemplary Bayesian network model described above, the transcription program of the NFkB cell signaling pathway was modeled in a simple manner using an exemplary linear model. The structure of this model corresponds to the structure shown in Fig. 3 of published international patent application WO2014/102668A2 ("Assessment of cellular signaling pathway activity using linear combination(s) of target gene expressions"). More specifically, the exemplary linear model consists of a TF node, a layer of target gene expression, and a layer of mRNA expression levels; it uses continuous expression data, all target genes from the evidence support list of target genes (see table 1) with all their probe sets, and does not use pseudo-counts. Training of this model was performed using the same active and passive training samples from GSE12195. It comprised calculating the weights of the connections between the target gene expression levels, here represented by probe set intensities, and the target gene nodes using the "soft log odds" method as described herein, and subsequently calculating the activity score of the transcription factor complex by summing the calculated target gene expression scores, each multiplied by 1 or -1 for up-regulated or down-regulated target genes, respectively.
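The two-layer summation described above can be sketched as follows. The data structure, the function names, the gene labels "A" and "B", and all numeric values are hypothetical; the optional offset is an assumption added only to show how a score could be centered so that its sign separates active from inactive samples.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TargetGene:
    sign: int             # +1 for up-regulated, -1 for down-regulated target genes
    weights: List[float]  # one weight (e.g., a log-odds weight) per probe set


def gene_score(gene: TargetGene, intensities: List[float]) -> float:
    """First layer: weighted sum of the probe set intensities of one gene."""
    if len(intensities) != len(gene.weights):
        raise ValueError("one intensity per probe set is required")
    return sum(w * x for w, x in zip(gene.weights, intensities))


def activity_score(genes: Dict[str, TargetGene],
                   sample: Dict[str, List[float]],
                   offset: float = 0.0) -> float:
    """Second layer: signed sum of the target gene scores, minus an optional offset."""
    total = sum(g.sign * gene_score(g, sample[name]) for name, g in genes.items())
    return total - offset


if __name__ == "__main__":
    genes = {
        "A": TargetGene(sign=1, weights=[1.0, 0.5]),
        "B": TargetGene(sign=-1, weights=[2.0]),
    }
    sample = {"A": [2.0, 4.0], "B": [1.5]}
    print(activity_score(genes, sample))
```

Because the model is a plain weighted sum, it has no probabilistic layer that could dampen conflicting evidence, which is one possible reason why its predictions may differ in detail from those of the Bayesian network model.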
Fig. 21 shows the training results of the exemplary linear model based on the evidence support list of target genes (see table 1). In the figure, the vertical axis represents the activity score, with positive values corresponding to an active NFkB cell signaling pathway and negative values corresponding to an inactive NFkB cell signaling pathway. Activated B-cell like (ABC) DLBCL samples (group 1) and normal samples (group 7) were used as active and passive training samples, respectively. All training samples were correctly classified: all ABC samples were predicted to have an active NFkB cell signaling pathway, while all normal samples were predicted to have a passive NFkB cell signaling pathway. In the same dataset, all but one of the lymphoma samples and cell lines, as well as the DLBCL samples (groups 2-4, 8), were predicted to be NFkB active, similar to the Bayesian network model using the evidence support list of target genes (Fig. 3), which seems plausible in view of the literature evidence described herein. Healthy unstimulated memory B cells and naive B cells (groups 5 and 6) were correctly predicted to lack NFkB activity. (legend: 1-ABC DLBCL; 2-lymphoblastoid cell line; 3-follicular lymphoma; 4-GCB DLBCL; 5-memory B cells; 6-naive B cells; 7-normal; 8-DLBCL unknown subtype)
Figure 22 shows the NFkB cell signaling pathway activity predictions of the exemplary linear model trained using the evidence support list of target genes (see table 1) for normal human bronchial epithelial (NHBE) cell line samples from E-MTAB-1312 that were stimulated with different TNFα concentrations for different stimulation times. In the figure, the vertical axis represents the activity score, with positive values corresponding to an active NFkB cell signaling pathway and negative values corresponding to an inactive NFkB cell signaling pathway. The NHBE cell line was either left unstimulated or stimulated with different TNFα concentrations for different periods of time. All unstimulated healthy cells (groups 1-4) were correctly predicted to have a passive NFkB cell signaling pathway. For all TNFα concentrations, stimulation for only 0.5 h was predicted to be too short for the NFkB cell signaling pathway to become active (groups 5, 9 and 13), similar to the Bayesian network model using the evidence support list of target genes (Fig. 8), while stimulation times of 2-24 hours resulted in an active NFkB cell signaling pathway. The reduction in NFkB cell signaling pathway activity after 4 hours that was observed with the Bayesian network models, in particular the one using the evidence support list of target genes (see Fig. 8), was not reproduced with the trained exemplary linear model. (legend: 1-no TNFα (0.5 h); 2-no TNFα (2 h); 3-no TNFα (4 h); 4-no TNFα (24 h); 5-low TNFα (0.5 h); 6-low TNFα (2 h); 7-low TNFα (4 h); 8-low TNFα (24 h); 9-medium TNFα (0.5 h); 10-medium TNFα (2 h); 11-medium TNFα (4 h); 12-medium TNFα (24 h); 13-high TNFα (0.5 h); 14-high TNFα (2 h); 15-high TNFα (4 h); 16-high TNFα (24 h))
Figure 23 shows the NFkB cell signaling pathway activity predictions of the exemplary linear model trained using the evidence support list of target genes (see table 1) for colon samples from GSE4183. In the figure, the vertical axis represents the activity score, with positive values corresponding to an active NFkB cell signaling pathway and negative values corresponding to an inactive NFkB cell signaling pathway. Most normal samples (group 1) were predicted to have low NFkB cell signaling pathway activity, although the activity scores were higher than expected on the basis of the other data sets. All inflammatory bowel disease samples (group 2) were predicted to have a higher NFkB activity score than the normal samples, which is plausible given the inflammatory state of the colon in these patients, since the NFkB cell signaling pathway is activated as part of the inflammatory response. Furthermore, the adenoma samples (group 3) clearly score lower on average than the more advanced colorectal cancer samples (CRC, group 4), which may be partly explained by additional mutations acquired as the tumor progresses to a more malignant state, inflammation caused by the progressing tumor, or leukocyte infiltration resulting from the immune response to late stage cancer. (legend: 1-normal colon; 2-IBD; 3-adenoma; 4-CRC)
Instead of applying a mathematical model, such as the exemplary Bayesian network model, to mRNA input data from microarrays or RNA sequencing, it may be beneficial in clinical applications to develop dedicated assays to perform the sample measurements, for example on an integrated platform using qPCR to determine the mRNA levels of the target genes. The RNA/DNA sequences of the disclosed target genes can then be used to determine which primers and probes to select on such a platform.
Such a dedicated assay can be validated by using the microarray-based mathematical model as a reference model and verifying whether the developed assay gives similar results on a set of validation samples. Besides a dedicated assay, the same approach can also be used to build and calibrate similar mathematical models using mRNA sequencing data as input measurements.
The set of target genes found to best indicate the activity of a specific cell signaling pathway, based on microarray/RNA sequencing studies using a mathematical model such as the exemplary Bayesian network model, can be translated into a multiplex quantitative PCR assay to be performed on a sample and/or into a computer program that interprets the expression measurements and/or infers the activity of the NFkB cell signaling pathway. To develop such a test for cell signaling pathway activity (e.g., an FDA-approved or CLIA-waived test in a central service laboratory), a standardized test kit needs to be developed, which must be clinically validated in clinical trials to obtain regulatory approval.
The present invention relates to a method comprising inferring activity of an NFkB cellular signaling pathway based at least on measured expression levels of six or more target genes of the NFkB cellular signaling pathway in a sample. The invention also relates to an apparatus comprising a digital processor configured to perform such a method, a non-transitory storage medium storing instructions executable by a digital processing device to perform such a method, and a computer program comprising program code means to cause a digital processing device to perform such a method.
For example, the methods may be used for diagnosis based on the inferred (aberrant) activity of the NFkB cell signaling pathway, prognosis based on the inferred activity of the NFkB cell signaling pathway, enrollment in a clinical trial based on the inferred activity of the NFkB cell signaling pathway, selection of a subsequent test to be performed, selection of a companion diagnostic test, clinical decision support systems, and the like. In this regard, reference is made to published international patent application WO2013/011479A2 ("Assessment of cellular signaling pathway activity using probabilistic modeling of target gene expression"), published international patent application WO2014/102668A2 ("Assessment of cellular signaling pathway activity using linear combination(s) of target gene expressions"), and Verhaegh W. et al., "Selection of personalized patient therapy through the use of knowledge-based computational models that identify tumor-driving signal transduction pathways", Cancer Research, Vol. 74, No. 11, 2014, page 2936 et seq., which describe these applications in more detail and are incorporated herein by reference.
Example 4: Comparison of the evidence support list with the broad literature list
The list of target genes of the NFkB cell signaling pathway constructed based on literature evidence according to the procedure described herein (the "target gene evidence support list", see Table 1) is compared with a "broad literature list" of putative target genes of the NFkB cell signaling pathway that was not constructed according to that procedure. This alternative list is a compilation of genes reported to respond to activity of the NFkB cell signaling pathway, as provided in Thomson-Reuters' MetaCore (last accessed September 6, 2013). The database was queried for genes that are transcriptionally regulated directly downstream of the NFkB protein family, i.e., NFKB1 or p50/p105, NFKB2 or p52/p100, RELA or p65, REL, and RELB. This query produced 343 unique genes. A further selection was made based on the number of published references supporting transcriptional regulation of the respective gene by the NFkB family. Genes with ten or more references were selected for the broad literature list. In other words, the references were not manually curated and no evidence score based on experimental evidence was calculated. This procedure yielded 32 genes, see Table 5. The expression of these genes was measured by 56 probe sets on the Affymetrix HG-U133Plus2.0 microarray platform, which were selected using the Bioconductor plugin in R, plus one manually added probe set for DEFB4A, see Table 5.
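The reference-count selection described above can be sketched as follows. This is a minimal illustration only: the function name, the gene symbols, and the counts are hypothetical placeholders, and no actual MetaCore query or data is reproduced. Only the threshold of ten references is taken from the text.

```python
from typing import Dict, List

# Threshold stated in the text: ten or more supporting references.
MIN_REFERENCES = 10


def select_broad_literature_list(reference_counts: Dict[str, int],
                                 min_references: int = MIN_REFERENCES) -> List[str]:
    """Return the genes with at least `min_references` references, sorted by symbol.

    No manual curation and no evidence scoring is applied; the count alone decides.
    """
    return sorted(gene for gene, n in reference_counts.items() if n >= min_references)


# Hypothetical counts, invented for illustration (not database output).
toy_counts = {"GENE_A": 25, "GENE_B": 10, "GENE_C": 9, "GENE_D": 3, "GENE_E": 14}
selected = select_broad_literature_list(toy_counts)
print(selected)  # ['GENE_A', 'GENE_B', 'GENE_E']
```

Because the filter depends only on the number of references, a gene supported by many weak or indirect reports passes as easily as one supported by direct experimental evidence, which is the contrast with the evidence support list that this example examines.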
Table 5: "Broad literature list" of putative target genes of the NFkB cell signaling pathway used in the NFkB cell signaling pathway model, and the associated probe sets used to measure the mRNA expression levels of the genes.
An exemplary Bayesian network model was then constructed using the procedure described herein. As for the NFkB cell signaling pathway model based on the evidence support list, fRMA-processed data from GSE12195 were used to train the conditional probability tables for the edges between the probe sets and their respective putative target genes, here those of the broad literature list. The training results are shown in Fig. 24. In the figure, the vertical axis indicates the probability that the TF element is "present" or "absent", corresponding to the NFkB cell signaling pathway being active or inactive; values above the horizontal axis indicate that the TF element is more likely "present"/active, and values below the horizontal axis indicate that it is more likely "absent"/inactive. Activated B-cell (ABC) DLBCL samples (group 1) were used as active training samples, and normal samples (group 7) as inactive training samples. The training results in Fig. 24 show the expected clear separation between the inactive (group 7) and active (group 1) training samples. In the same dataset, all other lymphoma and DLBCL samples and the cell lines (groups 2-4 and 8) were predicted to have an active NFkB cell signaling pathway, which is plausible in view of the literature evidence discussed herein. Healthy unstimulated memory B cells and naive B cells (groups 5 and 6) were correctly predicted to lack NFkB cell signaling pathway activity. (Legend: 1-ABC DLBCL; 2-lymphoblastoid cell line; 3-follicular lymphoma; 4-GCB DLBCL; 5-memory B cells; 6-naive B cells; 7-normal; 8-DLBCL unknown subtype).
Next, the exemplary Bayesian network model trained on the broad literature list was tested on a number of datasets.
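The training and inference steps described above can be illustrated with a strongly simplified sketch. It assumes a three-layer network (TF element, target gene, probe set) with one probe set per gene, binary thresholded probe calls, a hand-set table for the edge from the TF element to the target gene, and counting with a pseudocount for the edge between target gene and probe set. It further assumes that the score plotted around the horizontal axis is a log2 odds. All names and numbers are hypothetical, and this is not presented as the model actually used.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical, hand-set table for the TF -> target gene edge:
# P(gene "up" | TF present) and P(gene "up" | TF absent).
P_GENE_UP_GIVEN_TF = {True: 0.9, False: 0.1}


def train_probe_cpt(active: Sequence[bool], inactive: Sequence[bool],
                    pseudocount: float = 1.0) -> Dict[bool, float]:
    """Estimate P(probe "high" | gene up) and P(probe "high" | gene down) by counting.

    Calls from active training samples are treated as observed with the gene "up",
    calls from inactive training samples as observed with the gene "down".
    """
    def frac_high(calls: Sequence[bool]) -> float:
        return (sum(calls) + pseudocount) / (len(calls) + 2 * pseudocount)

    return {True: frac_high(active), False: frac_high(inactive)}


def log2_odds_tf_present(cpts: List[Dict[bool, float]], probes_high: Sequence[bool],
                         prior_present: float = 0.5) -> float:
    """Posterior log2 odds of the TF element being present versus absent."""
    log_odds = math.log2(prior_present / (1.0 - prior_present))
    for cpt, high in zip(cpts, probes_high):
        likelihood = {}
        for tf_present in (True, False):
            # Marginalize over the hidden target gene state.
            p_up = P_GENE_UP_GIVEN_TF[tf_present]
            p_high = p_up * cpt[True] + (1.0 - p_up) * cpt[False]
            likelihood[tf_present] = p_high if high else 1.0 - p_high
        log_odds += math.log2(likelihood[True] / likelihood[False])
    return log_odds


# Toy thresholded probe calls for two probe sets (invented for illustration).
cpts = [
    train_probe_cpt(active=[True, True, True, False], inactive=[False, False, False, True]),
    train_probe_cpt(active=[True, True, True, True], inactive=[False, False, True, False]),
]
score_high = log2_odds_tf_present(cpts, [True, True])
score_low = log2_odds_tf_present(cpts, [False, False])
print(score_high > 0, score_low < 0)  # True True
```

A positive score corresponds to a value above the horizontal axis ("present"/active) and a negative score to a value below it ("absent"/inactive).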
Figure 25 shows the NFkB cell signaling pathway activity predicted by the exemplary Bayesian network model trained using the broad literature list of putative target genes of the NFkB cell signaling pathway (see Table 5) for normal human bronchial epithelial (NHBE) cell line samples from E-MTAB-1312 that were stimulated with different TNFα concentrations for different stimulation times. In the figure, the vertical axis indicates the probability that the TF element is "present" or "absent", corresponding to the NFkB cell signaling pathway being active or inactive; values above the horizontal axis indicate that the TF element is more likely "present"/active, and values below the horizontal axis indicate that it is more likely "absent"/inactive. Unstimulated healthy cells were correctly predicted to have an inactive NFkB cell signaling pathway up to 4 hours (groups 1-3). Unexpectedly, the cells left unstimulated for 24 hours (group 4) were predicted to have an active NFkB cell signaling pathway. The broad literature Bayesian network model further predicts that stimulation for only 0.5 h is too short for the NFkB cell signaling pathway to become active at any TNFα concentration (groups 5, 9 and 13), whereas stimulation times of 2-24 hours result in an active NFkB cell signaling pathway (groups 6-8, 10-12 and 14-16). The decrease in NFkB cell signaling pathway activity after 4 hours that was observed with the Bayesian network model using the evidence support list of target genes (see Fig. 8) was not reproduced with the Bayesian network model trained on the broad literature list.
(Legend: 1-no TNFα (0.5 h); 2-no TNFα (2 h); 3-no TNFα (4 h); 4-no TNFα (24 h); 5-low TNFα (0.5 h); 6-low TNFα (2 h); 7-low TNFα (4 h); 8-low TNFα (24 h); 9-medium TNFα (0.5 h); 10-medium TNFα (2 h); 11-medium TNFα (4 h); 12-medium TNFα (24 h); 13-high TNFα (0.5 h); 14-high TNFα (2 h); 15-high TNFα (4 h); 16-high TNFα (24 h))
Figure 26 shows the NFkB cell signaling pathway activity predicted for colon samples from GSE4183 by the exemplary Bayesian network model trained using the broad literature list of putative target genes of the NFkB cell signaling pathway (see Table 5). In the figure, the vertical axis indicates the probability that the TF element is "present" or "absent", corresponding to the NFkB cell signaling pathway being active or inactive; values above the horizontal axis indicate that the TF element is more likely "present"/active, and values below the horizontal axis indicate that it is more likely "absent"/inactive. All samples except one adenoma sample were predicted by the broad literature Bayesian network model to have an active NFkB cell signaling pathway, which is particularly unexpected for the normal healthy colon samples (group 1). (Legend: 1-normal colon; 2-IBD; 3-adenoma; 4-CRC)
Example 5: primers and probes for use in kits
Preferred PCR primers and probes for use in the kit are listed in Table 6 below. The PCR primers for each gene are designated forward (For) and reverse (Rev) primers, and the probe used to detect the PCR product of each gene is designated Probe. In one non-limiting embodiment, the probes listed in Table 6 are labeled with a 5' FAM dye, an internal ZEN Quencher, and a 3' Iowa Black Fluorescent Quencher (IBFQ).
Table 6: non-limiting examples of primers and probes for a kit for measuring gene expression of NFkB target genes
Preferably, the kit comprises one or more components for measuring the expression level of a control gene, wherein the one or more components comprise PCR primers and probes for at least one control gene listed in Table 7. The PCR primers for each gene are designated forward (F) and reverse (R) primers, and the probe used to detect the PCR product of each gene is designated probe (P or FAM). In one non-limiting embodiment, the probes listed in Table 7 are labeled with a 5' FAM dye, an internal ZEN Quencher, and a 3' Iowa Black Fluorescent Quencher (IBFQ).
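The text does not state how the control gene measurements are used. One common use of control genes in quantitative PCR is normalization of the target gene signal, which can be sketched as follows under the assumptions of a delta-Cq scheme and 100% amplification efficiency. The function names, gene labels, and Cq values are hypothetical and serve only as illustration.

```python
from statistics import mean
from typing import Dict, Iterable


def delta_cq(cq: Dict[str, float], target: str, controls: Iterable[str]) -> float:
    """Cq of the target gene minus the mean Cq of the control genes."""
    return cq[target] - mean(cq[c] for c in controls)


def relative_expression(cq: Dict[str, float], target: str, controls: Iterable[str]) -> float:
    """Expression relative to the controls, 2 ** (-delta Cq).

    Assumes the product doubles every cycle (100% PCR efficiency).
    """
    return 2.0 ** (-delta_cq(cq, target, controls))


# Hypothetical Cq values for one sample (invented for illustration).
sample_cq = {"TARGET_GENE": 24.0, "CONTROL_1": 20.0, "CONTROL_2": 22.0}
print(delta_cq(sample_cq, "TARGET_GENE", ["CONTROL_1", "CONTROL_2"]))             # 3.0
print(relative_expression(sample_cq, "TARGET_GENE", ["CONTROL_1", "CONTROL_2"]))  # 0.125
```

Averaging over more than one control gene makes the normalization less sensitive to variation in any single control.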
Table 7: Oligonucleotide sequences for control genes
Sequence listing:
Seq. No. Gene:
Seq.1 BCL2
Seq.2 BCL2L1
Seq.3 BIRC3
Seq.4 CCL2
Seq.5 CCL3
Seq.6 CCL4
Seq.7 CCL5
Seq.8 CCL20
Seq.9 CCL22
Seq.10 CSF2
Seq.11 CX3CL1
Seq.12 CXCL1
Seq.13 CXCL2
Seq.14 CXCL3
Seq.15 CXCL10
Seq.16 DEFB4A
Seq.17 FASLG
Seq.18 GCLC
Seq.19 ICAM1
Seq.20 IER3
Seq.21 IFNB1
Seq.22 IL12B
Seq.23 IL1B
Seq.24 IL2
Seq.25 IL23A
Seq.26 IL6
Seq.27 IL8
Seq.28 IRF1
Seq.29 MMP9
Seq.30 MYC
Seq.31 NFKB2
Seq.32 NFKBIA
Seq.33 NFKBIE
Seq.34 NOS2
Seq.35 PTGS2
Seq.36 SELE
Seq.37 SOD2
Seq.38 STAT5A
Seq.39 TNF
Seq.40 TNFAIP2
Seq.41 TNIP1
Seq.42 TRAF1
Seq.43 VCAM1
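The sequence listing that follows uses the numeric field identifiers of WIPO Standard ST.25, in which <210> gives the sequence number, <211> the length, and <400> introduces the residues, printed in groups with a running count at the end of each line. A minimal sketch of a parser that checks each declared length against the residues actually present is given below. The function name and the short test entry are hypothetical, and the parser handles only the fields named here.

```python
import re
from typing import Dict, List

# A field line looks like "<210> 1": a three-digit identifier followed by a value.
TAG = re.compile(r"^<(\d{3})>\s*(.*)$")


def parse_listing(text: str) -> List[Dict[str, object]]:
    """Parse entries and record whether the declared length matches the residues."""
    entries: List[Dict[str, object]] = []
    current = None
    in_residues = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG.match(line)
        if match:
            code, value = match.group(1), match.group(2)
            if code == "210":    # start of a new sequence entry
                current = {"id": int(value), "length": None, "residues": ""}
                entries.append(current)
                in_residues = False
            elif code == "211" and current is not None:
                current["length"] = int(value)
            elif code == "400" and current is not None:
                in_residues = True
            continue
        if in_residues and current is not None:
            # Keep letters only; this drops spaces and the trailing running count.
            current["residues"] += "".join(ch for ch in line if ch.isalpha())
    for entry in entries:
        entry["length_ok"] = entry["length"] == len(entry["residues"])
    return entries


# Hypothetical miniature entry, not taken from the listing below.
example = """<210> 1
<211> 12
<212> DNA
<213> Homo sapiens
<400> 1
acgtacgtac gt 12
"""
parsed = parse_listing(example)
print(parsed[0]["id"], parsed[0]["length_ok"])  # 1 True
```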
sequence listing
<110> Philips Intellectual Property & Standards
<120> Assessment of NFkB cellular signaling pathway activity using mathematical modeling of target gene expression
<130> 2015PF00604
<160> 94
<170> SIPOSequenceListing 1.0
<210> 1
<211> 6492
<212> DNA
<213> Homo sapiens
<400> 1
tttctgtgaa gcagaagtct gggaatcgat ctggaaatcc tcctaatttt tactccctct 60
ccccgcgact cctgattcat tgggaagttt caaatcagct ataactggag agtgctgaag 120
attgatggga tcgttgcctt atgcatttgt tttggtttta caaaaaggaa acttgacaga 180
ggatcatgct gtacttaaaa aatacaacat cacagaggaa gtagactgat attaacaata 240
cttactaata ataacgtgcc tcatgaaata aagatccgaa aggaattgga ataaaaattt 300
cctgcatctc atgccaaggg ggaaacacca gaatcaagtg ttccgcgtga ttgaagacac 360
cccctcgtcc aagaatgcaa agcacatcca ataaaatagc tggattataa ctcctcttct 420
ttctctgggg gccgtggggt gggagctggg gcgagaggtg ccgttggccc ccgttgcttt 480
tcctctggga aggatggcgc acgctgggag aacagggtac gataaccggg agatagtgat 540
gaagtacatc cattataagc tgtcgcagag gggctacgag tgggatgcgg gagatgtggg 600
cgccgcgccc ccgggggccg cccccgcacc gggcatcttc tcctcccagc ccgggcacac 660
gccccatcca gccgcatccc gggacccggt cgccaggacc tcgccgctgc agaccccggc 720
tgcccccggc gccgccgcgg ggcctgcgct cagcccggtg ccacctgtgg tccacctgac 780
cctccgccag gccggcgacg acttctcccg ccgctaccgc cgcgacttcg ccgagatgtc 840
cagccagctg cacctgacgc ccttcaccgc gcggggacgc tttgccacgg tggtggagga 900
gctcttcagg gacggggtga actgggggag gattgtggcc ttctttgagt tcggtggggt 960
catgtgtgtg gagagcgtca accgggagat gtcgcccctg gtggacaaca tcgccctgtg 1020
gatgactgag tacctgaacc ggcacctgca cacctggatc caggataacg gaggctggga 1080
tgcctttgtg gaactgtacg gccccagcat gcggcctctg tttgatttct cctggctgtc 1140
tctgaagact ctgctcagtt tggccctggt gggagcttgc atcaccctgg gtgcctatct 1200
gggccacaag tgaagtcaac atgcctgccc caaacaaata tgcaaaaggt tcactaaagc 1260
agtagaaata atatgcattg tcagtgatgt accatgaaac aaagctgcag gctgtttaag 1320
aaaaaataac acacatataa acatcacaca cacagacaga cacacacaca cacaacaatt 1380
aacagtcttc aggcaaaacg tcgaatcagc tatttactgc caaagggaaa tatcatttat 1440
tttttacatt attaagaaaa aaagatttat ttatttaaga cagtcccatc aaaactcctg 1500
tctttggaaa tccgaccact aattgccaag caccgcttcg tgtggctcca cctggatgtt 1560
ctgtgcctgt aaacatagat tcgctttcca tgttgttggc cggatcacca tctgaagagc 1620
agacggatgg aaaaaggacc tgatcattgg ggaagctggc tttctggctg ctggaggctg 1680
gggagaaggt gttcattcac ttgcatttct ttgccctggg ggctgtgata ttaacagagg 1740
gagggttcct gtggggggaa gtccatgcct ccctggcctg aagaagagac tctttgcata 1800
tgactcacat gatgcatacc tggtgggagg aaaagagttg ggaacttcag atggacctag 1860
tacccactga gatttccacg ccgaaggaca gcgatgggaa aaatgccctt aaatcatagg 1920
aaagtatttt tttaagctac caattgtgcc gagaaaagca ttttagcaat ttatacaata 1980
tcatccagta ccttaagccc tgattgtgta tattcatata ttttggatac gcacccccca 2040
actcccaata ctggctctgt ctgagtaaga aacagaatcc tctggaactt gaggaagtga 2100
acatttcggt gacttccgca tcaggaaggc tagagttacc cagagcatca ggccgccaca 2160
agtgcctgct tttaggagac cgaagtccgc agaacctgcc tgtgtcccag cttggaggcc 2220
tggtcctgga actgagccgg ggccctcact ggcctcctcc agggatgatc aacagggcag 2280
tgtggtctcc gaatgtctgg aagctgatgg agctcagaat tccactgtca agaaagagca 2340
gtagaggggt gtggctgggc ctgtcaccct ggggccctcc aggtaggccc gttttcacgt 2400
ggagcatggg agccacgacc cttcttaaga catgtatcac tgtagaggga aggaacagag 2460
gccctgggcc cttcctatca gaaggacatg gtgaaggctg ggaacgtgag gagaggcaat 2520
ggccacggcc cattttggct gtagcacatg gcacgttggc tgtgtggcct tggcccacct 2580
gtgagtttaa agcaaggctt taaatgactt tggagagggt cacaaatcct aaaagaagca 2640
ttgaagtgag gtgtcatgga ttaattgacc cctgtctatg gaattacatg taaaacatta 2700
tcttgtcact gtagtttggt tttatttgaa aacctgacaa aaaaaaagtt ccaggtgtgg 2760
aatatggggg ttatctgtac atcctggggc attaaaaaaa aaatcaatgg tggggaacta 2820
taaagaagta acaaaagaag tgacatcttc agcaaataaa ctaggaaatt tttttttctt 2880
ccagtttaga atcagccttg aaacattgat ggaataactc tgtggcatta ttgcattata 2940
taccatttat ctgtattaac tttggaatgt actctgttca atgtttaatg ctgtggttga 3000
tatttcgaaa gctgctttaa aaaaatacat gcatctcagc gtttttttgt ttttaattgt 3060
atttagttat ggcctataca ctatttgtga gcaaaggtga tcgttttctg tttgagattt 3120
ttatctcttg attcttcaaa agcattctga gaaggtgaga taagccctga gtctcagcta 3180
cctaagaaaa acctggatgt cactggccac tgaggagctt tgtttcaacc aagtcatgtg 3240
catttccacg tcaacagaat tgtttattgt gacagttata tctgttgtcc ctttgacctt 3300
gtttcttgaa ggtttcctcg tccctgggca attccgcatt taattcatgg tattcaggat 3360
tacatgcatg tttggttaaa cccatgagat tcattcagtt aaaaatccag atggcaaatg 3420
accagcagat tcaaatctat ggtggtttga cctttagaga gttgctttac gtggcctgtt 3480
tcaacacaga cccacccaga gccctcctgc cctccttccg cgggggcttt ctcatggctg 3540
tccttcaggg tcttcctgaa atgcagtggt gcttacgctc caccaagaaa gcaggaaacc 3600
tgtggtatga agccagacct ccccggcggg cctcagggaa cagaatgatc agacctttga 3660
atgattctaa tttttaagca aaatattatt ttatgaaagg tttacattgt caaagtgatg 3720
aatatggaat atccaatcct gtgctgctat cctgccaaaa tcattttaat ggagtcagtt 3780
tgcagtatgc tccacgtggt aagatcctcc aagctgcttt agaagtaaca atgaagaacg 3840
tggacgtttt taatataaag cctgttttgt cttttgttgt tgttcaaacg ggattcacag 3900
agtatttgaa aaatgtatat atattaagag gtcacggggg ctaattgctg gctggctgcc 3960
ttttgctgtg gggttttgtt acctggtttt aataacagta aatgtgccca gcctcttggc 4020
cccagaactg tacagtattg tggctgcact tgctctaaga gtagttgatg ttgcattttc 4080
cttattgtta aaaacatgtt agaagcaatg aatgtatata aaagcctcaa ctagtcattt 4140
ttttctcctc ttcttttttt tcattatatc taattatttt gcagttgggc aacagagaac 4200
catccctatt ttgtattgaa gagggattca catctgcatc ttaactgctc tttatgaatg 4260
aaaaaacagt cctctgtatg tactcctctt tacactggcc agggtcagag ttaaatagag 4320
tatatgcact ttccaaattg gggacaaggg ctctaaaaaa agccccaaaa ggagaagaac 4380
atctgagaac ctcctcggcc ctcccagtcc ctcgctgcac aaatactccg caagagaggc 4440
cagaatgaca gctgacaggg tctatggcca tcgggtcgtc tccgaagatt tggcaggggc 4500
agaaaactct ggcaggctta agatttggaa taaagtcaca gaattaagga agcacctcaa 4560
tttagttcaa acaagacgcc aacattctct ccacagctca cttacctctc tgtgttcaga 4620
tgtggccttc catttatatg tgatctttgt tttattagta aatgcttatc atctaaagat 4680
gtagctctgg cccagtggga aaaattagga agtgattata aatcgagagg agttataata 4740
atcaagatta aatgtaaata atcagggcaa tcccaacaca tgtctagctt tcacctccag 4800
gatctattga gtgaacagaa ttgcaaatag tctctatttg taattgaact tatcctaaaa 4860
caaatagttt ataaatgtga acttaaactc taattaattc caactgtact tttaaggcag 4920
tggctgtttt tagactttct tatcacttat agttagtaat gtacacctac tctatcagag 4980
aaaaacagga aaggctcgaa atacaagcca ttctaaggaa attagggagt cagttgaaat 5040
tctattctga tcttattctg tggtgtcttt tgcagcccag acaaatgtgg ttacacactt 5100
tttaagaaat acaattctac attgtcaagc ttatgaaggt tccaatcaga tctttattgt 5160
tattcaattt ggatctttca gggatttttt ttttaaatta ttatgggaca aaggacattt 5220
gttggagggg tgggagggag gaagaatttt taaatgtaaa acattcccaa gtttggatca 5280
gggagttgga agttttcaga ataaccagaa ctaagggtat gaaggacctg tattggggtc 5340
gatgtgatgc ctctgcgaag aaccttgtgt gacaaatgag aaacattttg aagtttgtgg 5400
tacgaccttt agattccaga gacatcagca tggctcaaag tgcagctccg tttggcagtg 5460
caatggtata aatttcaagc tggatatgtc taatgggtat ttaaacaata aatgtgcagt 5520
tttaactaac aggatattta atgacaacct tctggttggt agggacatct gtttctaaat 5580
gtttattatg tacaatacag aaaaaaattt tataaaatta agcaatgtga aactgaattg 5640
gagagtgata atacaagtcc tttagtctta cccagtgaat cattctgttc catgtctttg 5700
gacaaccatg accttggaca atcatgaaat atgcatctca ctggatgcaa agaaaatcag 5760
atggagcatg aatggtactg taccggttca tctggactgc cccagaaaaa taacttcaag 5820
caaacatcct atcaacaaca aggttgttct gcataccaag ctgagcacag aagatgggaa 5880
cactggtgga ggatggaaag gctcgctcaa tcaagaaaat tctgagacta ttaataaata 5940
agactgtagt gtagatactg agtaaatcca tgcacctaaa ccttttggaa aatctgccgt 6000
gggccctcca gatagctcat ttcattaagt ttttccctcc aaggtagaat ttgcaagagt 6060
gacagtggat tgcatttctt ttggggaagc tttcttttgg tggttttgtt tattatacct 6120
tcttaagttt tcaaccaagg tttgcttttg ttttgagtta ctggggttat ttttgtttta 6180
aataaaaata agtgtacaat aagtgttttt gtattgaaag cttttgttat caagattttc 6240
atacttttac cttccatggc tctttttaag attgatactt ttaagaggtg gctgatattc 6300
tgcaacactg tacacataaa aaatacggta aggatacttt acatggttaa ggtaaagtaa 6360
gtctccagtt ggccaccatt agctataatg gcactttgtt tgtgttgttg gaaaaagtca 6420
cattgccatt aaactttcct tgtctgtcta gttaatattg tgaagaaaaa taaagtacag 6480
tgtgagatac tg 6492
<210> 2
<211> 2575
<212> DNA
<213> Homo sapiens
<400> 2
ggaggaggaa gcaagcgagg gggctggttc ctgagcttcg caattcctgt gtcgccttct 60
gggctcccag cctgccgggt cgcatgatcc ctccggccgg agctggtttt tttgccagcc 120
accgcgaggc cggctgagtt accggcatcc ccgcagccac ctcctctccc gacctgtgat 180
acaaaagatc ttccgggggc tgcacctgcc tgcctttgcc taaggcggat ttgaatctct 240
ttctctccct tcagaatctt atcttggctt tggatcttag aagagaatca ctaaccagag 300
acgagactca gtgagtgagc aggtgttttg gacaatggac tggttgagcc catccctatt 360
ataaaaatgt ctcagagcaa ccgggagctg gtggttgact ttctctccta caagctttcc 420
cagaaaggat acagctggag tcagtttagt gatgtggaag agaacaggac tgaggcccca 480
gaagggactg aatcggagat ggagaccccc agtgccatca atggcaaccc atcctggcac 540
ctggcagaca gccccgcggt gaatggagcc actggccaca gcagcagttt ggatgcccgg 600
gaggtgatcc ccatggcagc agtaaagcaa gcgctgaggg aggcaggcga cgagtttgaa 660
ctgcggtacc ggcgggcatt cagtgacctg acatcccagc tccacatcac cccagggaca 720
gcatatcaga gctttgaaca ggtagtgaat gaactcttcc gggatggggt aaactggggt 780
cgcattgtgg cctttttctc cttcggcggg gcactgtgcg tggaaagcgt agacaaggag 840
atgcaggtat tggtgagtcg gatcgcagct tggatggcca cttacctgaa tgaccaccta 900
gagccttgga tccaggagaa cggcggctgg gatacttttg tggaactcta tgggaacaat 960
gcagcagccg agagccgaaa gggccaggaa cgcttcaacc gctggttcct gacgggcatg 1020
actgtggccg gcgtggttct gctgggctca ctcttcagtc ggaaatgacc agacactgac 1080
catccactct accctcccac ccccttctct gctccaccac atcctccgtc cagccgccat 1140
tgccaccagg agaaccacta catgcagccc atgcccacct gcccatcaca gggttgggcc 1200
cagatctggt cccttgcagc tagttttcta gaatttatca cacttctgtg agacccccac 1260
acctcagttc ccttggcctc agaattcaca aaatttccac aaaatctgtc caaaggaggc 1320
tggcaggtat ggaagggttt gtggctgggg gcaggagggc cctacctgat tggtgcaacc 1380
cttacccctt agcctccctg aaaatgtttt tctgccaggg agcttgaaag ttttcagaac 1440
ctcttcccca gaaaggagac tagattgcct ttgttttgat gtttgtggcc tcagaattga 1500
tcattttccc cccactctcc ccacactaac ctgggttccc tttccttcca tccctacccc 1560
ctaagagcca tttaggggcc acttttgact agggattcag gctgcttggg ataaagatgc 1620
aaggaccagg actccctcct cacctctgga ctggctagag tcctcactcc cagtccaaat 1680
gtcctccaga agcctctggc tagaggccag ccccacccag gagggagggg gctatagcta 1740
caggaagcac cccatgccaa agctagggtg gcccttgcag ttcagcacca ccctagtccc 1800
ttcccctccc tggctcccat gaccatactg agggaccaac tgggcccaag acagatgccc 1860
cagagctgtt tatggcctca gctgcctcac ttcctacaag agcagcctgt ggcatctttg 1920
ccttgggctg ctcctcatgg tgggttcagg ggactcagcc ctgaggtgaa agggagctat 1980
caggaacagc tatgggagcc ccagggtctt ccctacctca ggcaggaagg gcaggaagga 2040
gagcctgctg catggggtgg ggtagggctg actagaaggg ccagtcctgc ctggccaggc 2100
agatctgtgc cccatgcctg tccagcctgg gcagccaggc tgccaaggcc agagtggcct 2160
ggccaggagc tcttcaggcc tccctctctc ttctgctcca cccttggcct gtctcatccc 2220
caggggtccc agccaccccg ggctctctgc tgtacatatt tgagactagt ttttattcct 2280
tgtgaagatg atatactatt tttgttaagc gtgtctgtat ttatgtgtga ggagctgctg 2340
gcttgcagtg cgcgtgcacg tggagagctg gtgcccggag attggacggc ctgatgctcc 2400
ctcccctgcc ctggtccagg gaagctggcc gagggtcctg gctcctgagg ggcatctgcc 2460
cctcccccaa cccccacccc acacttgttc cagctctttg aaatagtctg tgtgaaggtg 2520
aaagtgcagt tcagtaataa actgtgttta ctcagtgaaa aaaaaaaaaa aaaaa 2575
<210> 3
<211> 6932
<212> DNA
<213> Homo sapiens
<400> 3
gcatttaaaa gacagcgtga gactcgcgcc ctccggcacg gaaaaggcca ggcgacaggt 60
gtcgcttgaa aagactgggc ttgtccttgc tggtgcatgc gtcgtcggcc tctgggcagc 120
aggtttacaa aggaggaaaa cgacttcttc tagatttttt tttcagtttc ttctataaat 180
caaaacatct caaaatggag acctaaaatc cttaaaggga cttagtctaa tctcgggagg 240
tagttttgtg catgggtaaa caaattaagt attaactggt gttttactat ccaaagaatg 300
ctaattttat aaacatgatc gagttatata aggtatacca taatgagttt gattttgaat 360
ttgatttgtg gaaataaagg aaaagtgatt ctagctgggg catattgtta aagcattttt 420
ttcagagttg gccaggcagt ctcctactgg cacattctcc cattatgtag aatagaaata 480
gtacctgtgt ttgggaaaga ttttaaaatg agtgacagtt atttggaaca aagagctaat 540
aatcaatcca ctgcaaatta aagaaacatg cagatgaaag ttttgacaca ttaaaatact 600
tctacagtga caaagaaaaa tcaagaacaa agctttttga tatgtgcaac aaatttagag 660
gaagtaaaaa gataaatgtg atgattggtc aagaaattat ccagttattt acaaggccac 720
tgatatttta aacgtccaaa agtttgttta aatgggctgt taccgctgag aatgatgagg 780
atgagaatga tggttgaagg ttacatttta ggaaatgaag aaacttagaa aattaatata 840
aagacagtga tgaatacaaa gaagattttt ataacaatgt gtaaaatttt tggccaggga 900
aaggaatatt gaagttagat acaattactt acctttgagg gaaataattg ttggtaatga 960
gatgtgatgt ttctcctgcc acctggaaac aaagcattga agtctgcagt tgaaaagccc 1020
aacgtctgtg agatccagga aaccatgctt gcaaaccact ggtaaaaaaa aaaaaaaaaa 1080
aaaaaaaaag ccacagtgac ttgcttattg gtcattgcta gtattatcga ctcagaacct 1140
ctttactaat ggctagtaaa tcataattga gaaattctga attttgacaa ggtctctgct 1200
gttgaaatgg taaatttatt attttttttg tcatgataaa ttctggttca aggtatgcta 1260
tccatgaaat aatttctgac caaaactaaa ttgatgcaat ttgattatcc atcttagcct 1320
acagatggca tctggtaact tttgactgtt ttaaaaaata aatccactat cagagtagat 1380
ttgatgttgg cttcagaaac atttagaaaa acaaaagttc aaaaatgttt tcaggaggtg 1440
ataagttgaa taactctaca atgttagttc tttgaggggg acaaaaaatt taaaatcttt 1500
gaaaggtctt attttacagc catatctaaa ttatcttaag aaaattttta acaaagggaa 1560
tgaaatatat atcatgattc tgtttttcca aaagtaacct gaatatagca atgaagttca 1620
gttttgttat tggtagtttg ggcagagtct ctttttgcag cacctgttgt ctaccataat 1680
tacagaggac atttccatgt tctagccaag tatactatta gaataaaaaa acttaacatt 1740
gagttgcttc aacagcatga aactgagtcc aaaagaccaa atgaacaaac acattaatct 1800
ctgattattt attttaaata gaatatttaa ttgtgtaaga tctaatagta tcattatact 1860
taagcaatca tattcctgat gatctatggg aaataactat tatttaatta atattgaaac 1920
caggttttaa gatgtgttag ccagtcctgt tactagtaaa tctctttatt tggagagaaa 1980
ttttagattg ttttgttctc cttattagaa ggattgtaga aagaaaaaaa tgactaattg 2040
gagaaaaatt ggggatatat catatttcac tgaattcaaa atgtcttcag ttgtaaatct 2100
taccattatt ttacgtacct ctaagaaata aaagtgcttc taattaaaat atgatgtcat 2160
taattatgaa atacttcttg ataacagaag ttttaaaata gccatcttag aatcagtgaa 2220
atatggtaat gtattatttt cctcctttga gttaggtctt gtgctttttt ttcctggcca 2280
ctaaatttca caatttccaa aaagcaaaat aaacatattc tgaatatttt tgctgtgaaa 2340
cacttgacag cagagctttc caccatgaaa agaagcttca tgagtcacac attacatctt 2400
tgggttgatt gaatgccact gaaacattct agtagcctgg agaagttgac ctacctgtgg 2460
agatgcctgc cattaaatgg catcctgatg gcttaataca catcactctt ctgtgaaggg 2520
ttttaatttt caacacagct tactctgtag catcatgttt acattgtatg tataaagatt 2580
atacaaaggt gcaattgtgt atttcttcct taaaatgtat cagtatagga tttagaatct 2640
ccatgttgaa actctaaatg catagaaata aaaataataa aaaatttttc attttggctt 2700
ttcagcctag tattaaaact gataaaagca aagccatgca caaaactacc tccctagaga 2760
aaggctagtc ccttttcttc cccattcatt tcattatgaa catagtagaa aacagcatat 2820
tcttatcaaa tttgatgaaa agcgccaaca cgtttgaact gaaatacgac ttgtcatgtg 2880
aactgtaccg aatgtctacg tattccactt ttcctgctgg ggttcctgtc tcagaaagga 2940
gtcttgctcg tgctggtttc tattacactg gtgtgaatga caaggtcaaa tgcttctgtt 3000
gtggcctgat gctggataac tggaaaagag gagacagtcc tactgaaaag cataaaaagt 3060
tgtatcctag ctgcagattc gttcagagtc taaattccgt taacaacttg gaagctacct 3120
ctcagcctac ttttccttct tcagtaacaa attccacaca ctcattactt ccgggtacag 3180
aaaacagtgg atatttccgt ggctcttatt caaactctcc atcaaatcct gtaaactcca 3240
gagcaaatca agatttttct gccttgatga gaagttccta ccactgtgca atgaataacg 3300
aaaatgccag attacttact tttcagacat ggccattgac ttttctgtcg ccaacagatc 3360
tggcaaaagc aggcttttac tacataggac ctggagacag agtggcttgc tttgcctgtg 3420
gtggaaaatt gagcaattgg gaaccgaagg ataatgctat gtcagaacac ctgagacatt 3480
ttcccaaatg cccatttata gaaaatcagc ttcaagacac ttcaagatac acagtttcta 3540
atctgagcat gcagacacat gcagcccgct ttaaaacatt ctttaactgg ccctctagtg 3600
ttctagttaa tcctgagcag cttgcaagtg cgggttttta ttatgtgggt aacagtgatg 3660
atgtcaaatg cttttgctgt gatggtggac tcaggtgttg ggaatctgga gatgatccat 3720
gggttcaaca tgccaagtgg tttccaaggt gtgagtactt gataagaatt aaaggacagg 3780
agttcatccg tcaagttcaa gccagttacc ctcatctact tgaacagctg ctatccacat 3840
cagacagccc aggagatgaa aatgcagagt catcaattat ccattttgaa cctggagaag 3900
accattcaga agatgcaatc atgatgaata ctcctgtgat taatgctgcc gtggaaatgg 3960
gctttagtag aagcctggta aaacagacag ttcagagaaa aatcctagca actggagaga 4020
attatagact agtcaatgat cttgtgttag acttactcaa tgcagaagat gaaataaggg 4080
aagaggagag agaaagagca actgaggaaa aagaatcaaa tgatttatta ttaatccgga 4140
agaatagaat ggcacttttt caacatttga cttgtgtaat tccaatcctg gatagtctac 4200
taactgccgg aattattaat gaacaagaac atgatgttat taaacagaag acacagacgt 4260
ctttacaagc aagagaactg attgatacga ttttagtaaa aggaaatatt gcagccactg 4320
tattcagaaa ctctctgcaa gaagctgaag ctgtgttata tgagcattta tttgtgcaac 4380
aggacataaa atatattccc acagaagatg tttcagatct accagtggaa gaacaattgc 4440
ggagactaca agaagaaaga acatgtaaag tgtgtatgga caaagaagtg tccatagtgt 4500
ttattccttg tggtcatcta gtagtatgca aagattgtgc tccttcttta agaaagtgtc 4560
ctatttgtag gagtacaatc aagggtacag ttcgtacatt tctttcatga agaagaacca 4620
aaacatcgtc taaactttag aattaattta ttaaatgtat tataacttta acttttatcc 4680
taatttggtt tccttaaaat ttttatttat ttacaactca aaaaacattg ttttgtgtaa 4740
catatttata tatgtatcta aaccatatga acatatattt tttagaaact aagagaatga 4800
taggcttttg ttcttatgaa cgaaaaagag gtagcactac aaacacaata ttcaatcaaa 4860
atttcagcat tattgaaatt gtaagtgaag taaaacttaa gatatttgag ttaaccttta 4920
agaattttaa atattttggc attgtactaa taccgggaac atgaagccag gtgtggtggt 4980
atgtgcctgt agtcccaggc tgaggcaaga gaattacttg agcccaggag tttgaatcca 5040
tcctgggcag catactgaga ccctgccttt aaaaacaaac agaacaaaaa caaaacacca 5100
gggacacatt tctctgtctt ttttgatcag tgtcctatac atcgaaggtg tgcatatatg 5160
ttgaatgaca ttttagggac atggtgtttt tataaagaat tctgtgagaa aaaatttaat 5220
aaagcaacaa aaattactct tattcttcat tgctttattt caatgacatt ggatagttta 5280
gtcactccca gactctttcc ataccttctt aaagcctctc aaatattgaa ctacagttta 5340
tactccttcc cataagatgc ttcttcattg acacttgtag aacacggggt caacacatca 5400
taaaatctat tatggaatgc ctgagacaag aatcaaacag tccctttagt aagtttgttt 5460
attcacttct ctattgattc attcaagaag tctcatgcca gccccaccta ttggaagaag 5520
gtctgagttt tattcttatc tctttggtat taattctgaa acttagaaag tacactggtt 5580
agcaatgctt gggaccaaca ggttgttctg gtaaataaat ctgtttcata ttgtcagtgc 5640
aacaaaatgt ccccctctgc attatgttat tggtactcaa cacgtccgag tcataactct 5700
gtcctttgct tcttatagag gtattaggtc ttcaagagca gaagtaagac tgtaataggg 5760
aatactcagg ggaaggcagg caaaggctag tcatctaaac cagttctaga tgtctgtata 5820
ggggcagatg gctctgtaag ggcagaaggg aaagacccct tcataagggt cacagctgac 5880
aatcctataa caaaagacag gttaacaaga gaaaaactta acaaatttat ttaatcacag 5940
atttacatca ccggggagcc ttcgtaatga agatccaaaa ttacagggga aactgtgcat 6000
ttttatgctt aggtttgata atgaatggac agccctgaag aatagtgatt ggaaaaaaag 6060
gatatgatct aatgggaata gacacaggtt ggggacccag caaggcctgt ctgttcagat 6120
tattcttggt ctctgtgcag cattccttcc tcctggatat agggcagggc ctgtatggga 6180
tggggatatt ataacctgct atcaagcaag gtaggtcaga gaatttattt atggccagct 6240
cttacatagt taggtgagga aagattagag tactatcttt aagatgtaag tctggcattg 6300
tggaaagatg gttccagttt ctatgaccta ccttggggaa gaggaattca agtttctgtg 6360
gcttgccttc agggagaatg aggctgagac aggagggcag gataacatca gagaaaaact 6420
ttgcttctga ggccttcact ttgggttttc tgagccccaa catctgctag tgttgtaaag 6480
agaacaatta gggaccaagt gaggggagga aagaatccat ctctgcattc tgatgctggg 6540
agacttattt ccttgaaatg caattgattt tgcctctgct aagaggctct gctggctacc 6600
catgtactag ccagtgtcct gcatgggtgc taggctgaat tatttgtaat tgtgcttagg 6660
tgatttgtaa ctcaggtata gggtatttaa atagtaggca ccctttttgc accatgtgtt 6720
ttttttttta tctagttctt gtatactaca gataatattt gaactttgtc atctcactgt 6780
aaaacttttg ttcatttctc attatggtaa taaatagcta ttataaccaa cccatttatt 6840
caaatatgtt atttccctaa gtgttatttt gacattttgt tttggaaaaa ataaatcacc 6900
atagataata aaaaaaaaaa aaaaaaaaaa aa 6932
<210> 4
<211> 760
<212> DNA
<213> Homo sapiens
<400> 4
gaggaaccga gaggctgaga ctaacccaga aacatccaat tctcaaactg aagctcgcac 60
tctcgcctcc agcatgaaag tctctgccgc ccttctgtgc ctgctgctca tagcagccac 120
cttcattccc caagggctcg ctcagccaga tgcaatcaat gccccagtca cctgctgtta 180
taacttcacc aataggaaga tctcagtgca gaggctcgcg agctatagaa gaatcaccag 240
cagcaagtgt cccaaagaag ctgtgatctt caagaccatt gtggccaagg agatctgtgc 300
tgaccccaag cagaagtggg ttcaggattc catggaccac ctggacaagc aaacccaaac 360
tccgaagact tgaacactca ctccacaacc caagaatctg cagctaactt attttcccct 420
agctttcccc agacaccctg ttttatttta ttataatgaa ttttgtttgt tgatgtgaaa 480
cattatgcct taagtaatgt taattcttat ttaagttatt gatgttttaa gtttatcttt 540
catggtacta gtgtttttta gatacagaga cttggggaaa ttgcttttcc tcttgaacca 600
cagttctacc cctgggatgt tttgagggtc tttgcaagaa tcattaatac aaagaatttt 660
ttttaacatt ccaatgcatt gctaaaatat tattgtggaa atgaatattt tgtaactatt 720
acaccaaata aatatatttt tgtacaaaaa aaaaaaaaaa 760
<210> 5
<211> 813
<212> DNA
<213> Homo sapiens
<400> 5
agctggtttc agacttcaga aggacacggg cagcagacag tggtcagtcc tttcttggct 60
ctgctgacac tcgagcccac attccgtcac ctgctcagaa tcatgcaggt ctccactgct 120
gcccttgctg tcctcctctg caccatggct ctctgcaacc agttctctgc atcacttgct 180
gctgacacgc cgaccgcctg ctgcttcagc tacacctccc ggcagattcc acagaatttc 240
atagctgact actttgagac gagcagccag tgctccaagc ccggtgtcat cttcctaacc 300
aagcgaagcc ggcaggtctg tgctgacccc agtgaggagt gggtccagaa atatgtcagc 360
gacctggagc tgagtgcctg aggggtccag aagcttcgag gcccagcgac ctcggtgggc 420
ccagtgggga ggagcaggag cctgagcctt gggaacatgc gtgtgacctc cacagctacc 480
tcttctatgg actggttgtt gccaaacagc cacactgtgg gactcttctt aacttaaatt 540
ttaatttatt tatactattt agtttttgta atttattttc gatttcacag tgtgtttgtg 600
attgtttgct ctgagagttc ccctgtcccc tcccccttcc ctcacaccgc gtctggtgac 660
aaccgagtgg ctgtcatcag cctgtgtagg cagtcatggc accaaagcca ccagactgac 720
aaatgtgtat cggatgcttt tgttcagggc tgtgatcggc ctggggaaat aataaagatg 780
ctcttttaaa aggtaaaaaa aaaaaaaaaa aaa 813
<210> 6
<211> 667
<212> DNA
<213> Homo sapiens
<400> 6
agcacaggac acagctgggt tctgaagctt ctgagttctg cagcctcacc tctgagaaaa 60
cctcttttcc accaatacca tgaagctctg cgtgactgtc ctgtctctcc tcatgctagt 120
agctgccttc tgctctccag cgctctcagc accaatgggc tcagaccctc ccaccgcctg 180
ctgcttttct tacaccgcga ggaagcttcc tcgcaacttt gtggtagatt actatgagac 240
cagcagcctc tgctcccagc cagctgtggt attccaaacc aaaagaagca agcaagtctg 300
tgctgatccc agtgaatcct gggtccagga gtacgtgtat gacctggaac tgaactgagc 360
tgctcagaga caggaagtct tcagggaagg tcacctgagc ccggatgctt ctccatgaga 420
cacatctcct ccatactcag gactcctctc cgcagttcct gtcccttctc ttaatttaat 480
cttttttatg tgccgtgtta ttgtattagg tgtcatttcc attatttata ttagtttagc 540
caaaggataa gtgtccccta tggggatggt ccactgtcac tgtttctctg ctgttgcaaa 600
tacatggata acacatttga ttctgtgtgt tttcataata aaactttaaa ataaaatgca 660
gacagtt 667
<210> 7
<211> 1237
<212> DNA
<213> Homo sapiens
<400> 7
gctgcagagg attcctgcag aggatcaaga cagcacgtgg acctcgcaca gcctctccca 60
caggtaccat gaaggtctcc gcggcagccc tcgctgtcat cctcattgct actgccctct 120
gcgctcctgc atctgcctcc ccatattcct cggacaccac accctgctgc tttgcctaca 180
ttgcccgccc actgccccgt gcccacatca aggagtattt ctacaccagt ggcaagtgct 240
ccaacccagc agtcgtcttt gtcacccgaa agaaccgcca agtgtgtgcc aacccagaga 300
agaaatgggt tcgggagtac atcaactctt tggagatgag ctaggatgga gagtccttga 360
acctgaactt acacaaattt gcctgtttct gcttgctctt gtcctagctt gggaggcttc 420
ccctcactat cctaccccac ccgctccttg aagggcccag attctaccac acagcagcag 480
ttacaaaaac cttccccagg ctggacgtgg tggctcacgc ctgtaatccc agcactttgg 540
gaggccaagg tgggtggatc acttgaggtc aggagttcga gaccagcctg gccaacatga 600
tgaaacccca tctctactaa aaatacaaaa aattagccgg gcgtggtagc gggcgcctgt 660
agtcccagct actcgggagg ctgaggcagg agaatggcgt gaacccggga ggcggagctt 720
gcagtgagcc gagatcgcgc cactgcactc cagcctgggc gacagagcga gactccgtct 780
caaaaaaaaa aaaaaaaaaa aaaatacaaa aattagccgg gcgtggtggc ccacgcctgt 840
aatcccagct actcgggagg ctaaggcagg aaaattgttt gaacccagga ggtggaggct 900
gcagtgagct gagattgtgc cacttcactc cagcctgggt gacaaagtga gactccgtca 960
caacaacaac aacaaaaagc ttccccaact aaagcctaga agagcttctg aggcgctgct 1020
ttgtcaaaag gaagtctcta ggttctgagc tctggctttg ccttggcttt gccagggctc 1080
tgtgaccagg aaggaagtca gcatgcctct agaggcaagg aggggaggaa cactgcactc 1140
ttaagcttcc gccgtctcaa cccctcacag gagcttactg gcaaacatga aaaatcggct 1200
taccattaaa gttctcaatg caaccataaa aaaaaaa 1237
<210> 8
<211> 851
<212> DNA
<213> Homo sapiens
<400> 8
agaatataac agcactccca aagaactggg tactcaacac tgagcagatc tgttctttga 60
gctaaaaacc atgtgctgta ccaagagttt gctcctggct gctttgatgt cagtgctgct 120
actccacctc tgcggcgaat cagaagcagc aagcaacttt gactgctgtc ttggatacac 180
agaccgtatt cttcatccta aatttattgt gggcttcaca cggcagctgg ccaatgaagg 240
ctgtgacatc aatgctatca tctttcacac aaagaaaaag ttgtctgtgt gcgcaaatcc 300
aaaacagact tgggtgaaat atattgtgcg tctcctcagt aaaaaagtca agaacatgta 360
aaaactgtgg cttttctgga atggaattgg acatagccca agaacagaaa gaaccttgct 420
ggggttggag gtttcacttg cacatcatgg agggtttagt gcttatctaa tttgtgcctc 480
actggacttg tccaattaat gaagttgatt catattgcat catagtttgc tttgtttaag 540
catcacatta aagttaaact gtattttatg ttatttatag ctgtaggttt tctgtgttta 600
gctatttaat actaattttc cataagctat tttggtttag tgcaaagtat aaaattatat 660
ttggggggga ataagattat atggactttc ttgcaagcaa caagctattt tttaaaaaaa 720
actatttaac attcttttgt ttatattgtt ttgtctccta aattgttgta attgcattat 780
aaaataagaa aaatattaat aagacaaata ttgaaaataa agaaacaaaa agttcttctg 840
ttaaaaaaaa a 851
<210> 9
<211> 2933
<212> DNA
<213> Homo sapiens
<400> 9
gcagacacct gggctgagac atacaggaca gagcatggat cgcctacaga ctgcactcct 60
ggttgtcctc gtcctccttg ctgtggcgct tcaagcaact gaggcaggcc cctacggcgc 120
caacatggaa gacagcgtct gctgccgtga ttacgtccgt taccgtctgc ccctgcgcgt 180
ggtgaaacac ttctactgga cctcagactc ctgcccgagg cctggcgtgg tgttgctaac 240
cttcagggat aaggagatct gtgccgatcc cagagtgccc tgggtgaaga tgattctcaa 300
taagctgagc caatgaagag cctactctga tgaccgtggc cttggctcct ccaggaaggc 360
tcaggagccc tacctccctg ccattatagc tgctccccgc cagaagcctg tgccaactct 420
ctgcattccc tgatctccat ccctgtggct gtcacccttg gtcacctccg tgctgtcact 480
gccatctccc ccctgacccc tctaacccat cctctgcctc cctccctgca gtcagagggt 540
cctgttccca tcagcgattc ccctgcttaa acccttccat gactccccac tgccctaagc 600
tgaggtcagt ctcccaagcc tggcatgtgg ccctctggat ctgggttcca tctctgtctc 660
cagcctgccc acttcccttc atgaatgttg ggttctagct ccctgttctc caaacccata 720
ctacacatcc cacttctggg tctttgcctg ggatgttgct gacacccaga aagtcccacc 780
acctgcacat gtgtagcccc accagccctc caaggcattg ctcgcccaag cagctggtaa 840
ttccatttca tgtattagat gtcccctggc cctctgtccc ctcttaataa ccctagtcac 900
agtctccgca gattcttggg atttgggggt tttctccccc acctctccac tagttggacc 960
aaggtttcta gctaagttac tctagtctcc aagcctctag catagagcac tgcagacagg 1020
ccctggctca gaatcagagc ccagaaagtg gctgcagaca aaatcaataa aactaatgtc 1080
cctcccctct ccctgccaaa aggcagttac atatcaatac agagactcaa ggtcactaga 1140
aatgggccag ctgggtcaat gtgaagcccc aaatttgccc agattcacct ttcttccccc 1200
actccctttt tttttttttt tttgagatgg agtttcgctc ttgtcaccca cgctggagtg 1260
caatggtgtg gtcttggctt attgaagcct ctgcctcctg ggttcaagtg attctcttgc 1320
ctcagcctcc tgagtagctg ggattacagg ttcctgctac cacgcccagc taatttttgt 1380
atttttagta gagacgaggc ttcaccatgt tggccaggct ggtctcgaac tcctgtcctc 1440
aggtaatccg cccacctcag cctcccaaag tgctgggatt acaggcgtga gccacagtgc 1500
ctggcctctt ccctctcccc accccccccc caactttttt ttttttttat ggcagggtct 1560
cactctgtcg cccaggctgg agtgcagtgg cgtgatctcg gctcactaca acctcgacct 1620
cctgggttca agcgattctc ccaccccagc ctcccaagta gctgggatta caggtgtgtg 1680
ccactacggc tggctaattt ttgtattttt agtagagaca ggtttcacca tattggccag 1740
gctggtcttg aactcctgac ctcaagtgat ccaccttcct tgtgctccca aagtgctgag 1800
attacaggcg tgagctatca cacccagcct cccccttttt ttcctaatag gagactcctg 1860
tacctttctt cgttttacct atgtgtcgtg tctgcttaca tttccttctc ccctcaggct 1920
ttttttgggt ggtcctccaa cctccaatac ccaggcctgg cctcttcaga gtacccccca 1980
ttccactttc cctgcctcct tccttaaata gctgacaatc aaattcatgc tatggtgtga 2040
aagactacct ttgacttggt attataagct ggagttatat atgtatttga aaacagagta 2100
aatacttaag aggccaaata gatgaatgga agaattttag gaactgtgag agggggacaa 2160
ggtggagctt tcctggccct gggaggaagc tggctgtggt agcgtagcgc tctctctctc 2220
tgtctgtggc aggaggcaaa gagtagggtg taattgagtg aaggaatcct gggtagagac 2280
cattctcagg tggttgggcc aggctaaaga ctgggatttg ggtctatcta tgcctttctg 2340
gctgattttt gtagagacgg ggttttgcca tgttacccag gctggtctca aactcctggg 2400
ctcaagcgat cctcctggct cagcctccca aagtgctggg attacaggcg tgagtcactg 2460
cgcctggctt cctcttcctc ttgagaaata ttcttttcat acagcaagta tgggacagca 2520
gtgtcccagg taaaggacat aaatgttaca agtgtctggt cctttctgag ggaggctggt 2580
gccgctctgc agggtatttg aacctgtgga attggaggag gccatttcac tccctgaacc 2640
cagcctgaca aatcacagtg agaatgttca ccttataggc ttgctgtggg gctcaggttg 2700
aaagtgtggg gagtgacact gcctaggcat ccagctcagt gtcatccagg gcctgtgtcc 2760
ctcccgaacc cagggtcaac ctgcctacca caggcactag aaggacgaat ctgcctactg 2820
cccatgaacg gggccctcaa gcgtcctggg atctccttct ccctcctgtc ctgtccttgc 2880
ccctcaggac tgctggaaaa taaatccttt aaaatagtaa aaaaaaaaaa aaa 2933
<210> 10
<211> 800
<212> DNA
<213> Homo sapiens
<400> 10
acacagagag aaaggctaaa gttctctgga ggatgtggct gcagagcctg ctgctcttgg 60
gcactgtggc ctgcagcatc tctgcacccg cccgctcgcc cagccccagc acgcagccct 120
gggagcatgt gaatgccatc caggaggccc ggcgtctcct gaacctgagt agagacactg 180
ctgctgagat gaatgaaaca gtagaagtca tctcagaaat gtttgacctc caggagccga 240
cctgcctaca gacccgcctg gagctgtaca agcagggcct gcggggcagc ctcaccaagc 300
tcaagggccc cttgaccatg atggccagcc actacaagca gcactgccct ccaaccccgg 360
aaacttcctg tgcaacccag attatcacct ttgaaagttt caaagagaac ctgaaggact 420
ttctgcttgt catccccttt gactgctggg agccagtcca ggagtgagac cggccagatg 480
aggctggcca agccggggag ctgctctctc atgaaacaag agctagaaac tcaggatggt 540
catcttggag ggaccaaggg gtgggccaca gccatggtgg gagtggcctg gacctgccct 600
gggccacact gaccctgata caggcatggc agaagaatgg gaatatttta tactgacaga 660
aatcagtaat atttatatat ttatattttt aaaatattta tttatttatt tatttaagtt 720
catattccat atttattcaa gatgttttac cgtaataatt attattaaaa atatgcttct 780
acttgaaaaa aaaaaaaaaa 800
<210> 11
<211> 3338
<212> DNA
<213> Homo sapiens
<400> 11
ataaaaagcc acagatctct ggcggcggca aggggacagc actgagctct gccgcctggc 60
tctagccgcc tgcctggccc ccgccgggac tcttgcccac cctcagccat ggctccgata 120
tctctgtcgt ggctgctccg cttggccacc ttctgccatc tgactgtcct gctggctgga 180
cagcaccacg gtgtgacgaa atgcaacatc acgtgcagca agatgacatc aaagatacct 240
gtagctttgc tcatccacta tcaacagaac caggcatcat gcggcaaacg cgcaatcatc 300
ttggagacga gacagcacag gctgttctgt gccgacccga aggagcaatg ggtcaaggac 360
gcgatgcagc atctggaccg ccaggctgct gccctaactc gaaatggcgg caccttcgag 420
aagcagatcg gcgaggtgaa gcccaggacc acccctgccg ccgggggaat ggacgagtct 480
gtggtcctgg agcccgaagc cacaggcgaa agcagtagcc tggagccgac tccttcttcc 540
caggaagcac agagggccct ggggacctcc ccagagctgc cgacgggcgt gactggttcc 600
tcagggacca ggctcccccc gacgccaaag gctcaggatg gagggcctgt gggcacggag 660
cttttccgag tgcctcccgt ctccactgcc gccacgtggc agagttctgc tccccaccaa 720
cctgggccca gcctctgggc tgaggcaaag acctctgagg ccccgtccac ccaggacccc 780
tccacccagg cctccactgc gtcctcccca gccccagagg agaatgctcc gtctgaaggc 840
cagcgtgtgt ggggtcaggg acagagcccc aggccagaga actctctgga gcgggaggag 900
atgggtcccg tgccagcgca cacggatgcc ttccaggact gggggcctgg cagcatggcc 960
cacgtctctg tggtccctgt ctcctcagaa gggaccccca gcagggagcc agtggcttca 1020
ggcagctgga cccctaaggc tgaggaaccc atccatgcca ccatggaccc ccagaggctg 1080
ggcgtcctta tcactcctgt ccctgacgcc caggctgcca cccggaggca ggcggtgggg 1140
ctgctggcct tccttggcct cctcttctgc ctgggggtgg ccatgttcac ctaccagagc 1200
ctccagggct gccctcgaaa gatggcagga gagatggcgg agggccttcg ctacatcccc 1260
cggagctgtg gtagtaattc atatgtcctg gtgcccgtgt gaactcctct ggcctgtgtc 1320
tagttgtttg attcagacag ctgcctggga tccctcatcc tcatacccac ccccacccaa 1380
gggcctggcc tgagctggga tgattggagg ggggaggtgg gatcctccag gtgcacaagc 1440
tccaagctcc caggcattcc ccaggaggcc agccttgacc attctccacc tgccagggac 1500
agagggggtg gcctcccaac tcaccccagc cccaaaactc tcctctgctg ctggctggtt 1560
agaggttccc tttgacgcca tcccagcccc aatgaacaat tatttattaa atgcccagcc 1620
ccttctgacc catgctgccc tgtgagtact acagtcctcc catctcacac atgagcatca 1680
ggccaggccc tctgcccact ccctgcaacc tgattgtgtc tcttggtcct gctgcagttg 1740
ccagtcaccc cggccacctg cggtgctatc tcccccagcc ccatcctctg tacagagccc 1800
acgcccccac tggtgacatg tcttttcttg catgaggcta gtgtggtgtt tcctggcact 1860
gcttccagtg aggctctgcc cttggttagg cattgtggga aggggagata agggtatctg 1920
gtgactttcc tctttggtct acactgtgct gagtctgaag gctgggttct gatcctagtt 1980
ccaccatcaa gccaccaaca tactcccatc tgtgaaagga aagagggagg taaggaatac 2040
ctgtccccct gacaacactc attgacctga ggcccttctc tccagcccct ggatgcagcc 2100
tcacagtcct taccagcaga gcaccttaga cagtccctgc caatggacta acttgtcttt 2160
ggaccctgag gcccagaggg cctgcaaggg agtgagttga tagcacagac cctgccctgt 2220
gggcccccaa atggaaatgg gcagagcaga gaccatccct gaaggccccg cccaggctta 2280
gtcactgaga cagcccgggc tctgcctccc atcacccgct aagagggagg gagggctcca 2340
gacacatgtc caagaagccc aggaaaggct ccaggagcag ccacattcct gatgcttctt 2400
cagagactcc tgcaggcagc caggccacaa gacccttgtg gtcccacccc acacacgcca 2460
gattctttcc tgaggctggg ctcccttccc acctctctca ctccttgaaa acactgttct 2520
ctgccctcca agaccttctc cttcaccttt gtccccaccg cagacaggac cagggatttc 2580
catgatgttt tccatgagtc ccctgtttgt ttctgaaagg gacgctaccc gggaaggggg 2640
ctgggacatg ggaaagggga agttgtaggc ataaagtcag gggttccctt ttttggctgc 2700
tgaaggctcg agcatgcctg gatggggctg caccggctgg cctggcccct cagggtccct 2760
ggtggcagct cacctctccc ttggattgtc cccgaccctt gccgtctacc tgaggggcct 2820
cttatgggct gggttctacc caggtgctag gaacactcct tcacagatgg gtgcttggag 2880
gaaggaaacc cagctctggt ccatagagag caagacgctg tgctgccctg cccacctggc 2940
ctctgcactc ccctgctggg tgtggcgcag catattcagg aagctcaggg cctggctcag 3000
gtggggtcac tctggcagct cagagagggt gggagtgggc ccaatgcact ttgttctggc 3060
tcttccaggc tgggagagcc ttccaggggt gggacaccct gtgatggggc cctgcctcct 3120
ttgtgaggaa gccgctgggg ccagttggtc ccccttccat ggactttgtt agtttctcca 3180
agcaggacat ggacaaggat gatctaggaa gactttggaa agagtaggaa gactttggaa 3240
agacttttcc aaccctcatc accaacgtct gtgccatttt gtattttact aataaaattt 3300
aaaagtcttg tgaatcaaaa aaaaaaaaaa aaaaaaaa 3338
<210> 12
<211> 1184
<212> DNA
<213> Homo sapiens
<400> 12
cacagagccc gggccgcagg cacctcctcg ccagctcttc cgctcctctc acagccgcca 60
gacccgcctg ctgagcccca tggcccgcgc tgctctctcc gccgccccca gcaatccccg 120
gctcctgcga gtggcactgc tgctcctgct cctggtagcc gctggccggc gcgcagcagg 180
agcgtccgtg gccactgaac tgcgctgcca gtgcttgcag accctgcagg gaattcaccc 240
caagaacatc caaagtgtga acgtgaagtc ccccggaccc cactgcgccc aaaccgaagt 300
catagccaca ctcaagaatg ggcggaaagc ttgcctcaat cctgcatccc ccatagttaa 360
gaaaatcatc gaaaagatgc tgaacagtga caaatccaac tgaccagaag ggaggaggaa 420
gctcactggt ggctgttcct gaaggaggcc ctgcccttat aggaacagaa gaggaaagag 480
agacacagct gcagaggcca cctggattgt gcctaatgtg tttgagcatc gcttaggaga 540
agtcttctat ttatttattt attcattagt tttgaagatt ctatgttaat attttaggtg 600
taaaataatt aagggtatga ttaactctac ctgcacactg tcctattata ttcattcttt 660
ttgaaatgtc aaccccaagt tagttcaatc tggattcata tttaatttga aggtagaatg 720
ttttcaaatg ttctccagtc attatgttaa tatttctgag gagcctgcaa catgccagcc 780
actgtgatag aggctggcgg atccaagcaa atggccaatg agatcattgt gaaggcaggg 840
gaatgtatgt gcacatctgt tttgtaactg tttagatgaa tgtcagttgt tatttattga 900
aatgatttca cagtgtgtgg tcaacatttc tcatgttgaa actttaagaa ctaaaatgtt 960
ctaaatatcc cttggacatt ttatgtcttt cttgtaaggc atactgcctt gtttaatggt 1020
agttttacag tgtttctggc ttagaacaaa ggggcttaat tattgatgtt ttcatagaga 1080
atataaaaat aaagcactta tagaaaaaac tcgtttgatt tttgggggga aacaagggct 1140
acctttactg gaaaatctgg tgatttataa aaaaaaaaaa aaaa 1184
<210> 13
<211> 1234
<212> DNA
<213> Homo sapiens
<400> 13
gagctccggg aatttccctg gcccgggact ccgggctttc cagccccaac catgcataaa 60
aggggttcgc cgttctcgga gagccacaga gcccgggcca caggcagctc cttgccagct 120
ctcctcctcg cacagccgct cgaaccgcct gctgagcccc atggcccgcg ccacgctctc 180
cgccgccccc agcaatcccc ggctcctgcg ggtggcgctg ctgctcctgc tcctggtggc 240
cgccagccgg cgcgcagcag gagcgcccct ggccactgaa ctgcgctgcc agtgcttgca 300
gaccctgcag ggaattcacc tcaagaacat ccaaagtgtg aaggtgaagt cccccggacc 360
ccactgcgcc caaaccgaag tcatagccac actcaagaat gggcagaaag cttgtctcaa 420
ccccgcatcg cccatggtta agaaaatcat cgaaaagatg ctgaaaaatg gcaaatccaa 480
ctgaccagaa ggaaggagga agcttattgg tggctgttcc tgaaggaggc cctgccctta 540
caggaacaga agaggaaaga gagacacagc tgcagaggcc acctggattg cgcctaatgt 600
gtttgagcat cacttaggag aagtcttcta tttatttatt tatttattta tttgtttgtt 660
ttagaagatt ctatgttaat attttatgtg taaaataagg ttatgattga atctacttgc 720
acactctccc attatattta ttgtttattt taggtcaaac ccaagttagt tcaatcctga 780
ttcatattta atttgaagat agaaggtttg cagatattct ctagtcattt gttaatattt 840
cttcgtgatg acatatcaca tgtcagccac tgtgatagag gctgaggaat ccaagaaaat 900
ggccagtgag atcaatgtga cggcagggaa atgtatgtgt gtctattttg taactgtaaa 960
gatgaatgtc agttgttatt tattgaaatg atttcacagt gtgtggtcaa catttctcat 1020
gttgaagctt taagaactaa aatgttctaa atatcccttg gacattttat gtctttcttg 1080
taaggcatac tgccttgttt aatgttaatt atgcagtgtt tccctctgtg ttagagcaga 1140
gaggtttcga tatttattga tgttttcaca aagaacagga aaataaaata tttaaaaata 1200
taaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaa 1234
<210> 14
<211> 1166
<212> DNA
<213> Homo sapiens
<400> 14
gctccgggaa tttccctggc ccggccgctc cgggctttcc agtctcaacc atgcataaaa 60
agggttcgcc gatcttgggg agccacacag cccgggtcgc aggcacctcc ccgccagctc 120
tcccgcttct cgcacagctt cccgacgcgt ctgctgagcc ccatggccca cgccacgctc 180
tccgccgccc ccagcaatcc ccggctcctg cgggtggcgc tgctgctcct gctcctggtg 240
gccgccagcc ggcgcgcagc aggagcgtcc gtggtcactg aactgcgctg ccagtgcttg 300
cagacactgc agggaattca cctcaagaac atccaaagtg tgaatgtaag gtcccccgga 360
ccccactgcg cccaaaccga agtcatagcc acactcaaga atgggaagaa agcttgtctc 420
aaccccgcat cccccatggt tcagaaaatc atcgaaaaga tactgaacaa ggggagcacc 480
aactgacagg agagaagtaa gaagcttatc agcgtatcat tgacacttcc tgcagggtgg 540
tccctgccct taccagagct gaaaatgaaa aagagaacag cagctttcta gggacagctg 600
gaaaggactt aatgtgtttg actatttctt acgagggttc tacttattta tgtatttatt 660
tttgaaagct tgtattttaa tattttacat gctgttattt aaagatgtga gtgtgtttca 720
tcaaacatag ctcagtcctg attatttaat tggaatatga tgggttttaa atgtgtcatt 780
aaactaatat ttagtgggag accataatgt gtcagccacc ttgataaatg acagggtggg 840
gaactggagg gtggggggat tgaaatgcaa gcaattagtg gatcactgtt agggtaaggg 900
aatgtatgta cacatctatt ttttatactt tttttttaaa aaaagaatgt cagttgttat 960
ttattcaaat tatctcacat tatgtgttca acatttttat gctgaagttt cccttagaca 1020
ttttatgtct tgcttgtagg gcataatgcc ttgtttaatg tccattctgc agcgtttctc 1080
tttcccttgg aaaagagaat ttatcattac tgttacattt gtacaaatga catgataata 1140
aaagttttat gaaaaaaaaa aaaaaa 1166
<210> 15
<211> 1227
<212> DNA
<213> Homo sapiens
<400> 15
ctttgcagat aaatatggca cactagcccc acgttttctg agacattcct caattgctta 60
gacatattct gagcctacag cagaggaacc tccagtctca gcaccatgaa tcaaactgcc 120
attctgattt gctgccttat ctttctgact ctaagtggca ttcaaggagt acctctctct 180
agaactgtac gctgtacctg catcagcatt agtaatcaac ctgttaatcc aaggtcttta 240
gaaaaacttg aaattattcc tgcaagccaa ttttgtccac gtgttgagat cattgctaca 300
atgaaaaaga agggtgagaa gagatgtctg aatccagaat cgaaggccat caagaattta 360
ctgaaagcag ttagcaagga aaggtctaaa agatctcctt aaaaccagag gggagcaaaa 420
tcgatgcagt gcttccaagg atggaccaca cagaggctgc ctctcccatc acttccctac 480
atggagtata tgtcaagcca taattgttct tagtttgcag ttacactaaa aggtgaccaa 540
tgatggtcac caaatcagct gctactactc ctgtaggaag gttaatgttc atcatcctaa 600
gctattcagt aataactcta ccctggcact ataatgtaag ctctactgag gtgctatgtt 660
cttagtggat gttctgaccc tgcttcaaat atttccctca cctttcccat cttccaaggg 720
tactaaggaa tctttctgct ttggggttta tcagaattct cagaatctca aataactaaa 780
aggtatgcaa tcaaatctgc tttttaaaga atgctcttta cttcatggac ttccactgcc 840
atcctcccaa ggggcccaaa ttctttcagt ggctacctac atacaattcc aaacacatac 900
aggaaggtag aaatatctga aaatgtatgt gtaagtattc ttatttaatg aaagactgta 960
caaagtagaa gtcttagatg tatatatttc ctatattgtt ttcagtgtac atggaataac 1020
atgtaattaa gtactatgta tcaatgagta acaggaaaat tttaaaaata cagatagata 1080
tatgctctgc atgttacata agataaatgt gctgaatggt tttcaaaata aaaatgaggt 1140
actctcctgg aaatattaag aaagactatc taaatgttga aagatcaaaa ggttaataaa 1200
gtaattataa ctaagaaaaa aaaaaaa 1227
<210> 16
<211> 460
<212> DNA
<213> Homo sapiens
<400> 16
gtggctgaat tctaacctct gtaatgagca ttgcacccaa taccagttct gaactctacc 60
tggtgaccag ggaccaggac ctttataagg tggaaggctt gatgtcctcc ccagactcag 120
ctcctggtga agctcccagc catcagccat gagggtcttg tatctcctct tctcgttcct 180
cttcatattc ctgatgcctc ttccaggtgt ttttggtggt ataggcgatc ctgttacctg 240
ccttaagagt ggagccatat gtcatccagt cttttgccct agaaggtata aacaaattgg 300
cacctgtggt ctccctggaa caaaatgctg caaaaagcca tgaggaggcc aagaagctgc 360
tgtggctgat gcggattcag aaagggctcc ctcatcagag acgtgcgaca tgtaaaccaa 420
attaaactat ggtgtccaaa gatacgcaaa aaaaaaaaaa 460
<210> 17
<211> 1907
<212> DNA
<213> Homo sapiens
<400> 17
agaatcagag agagagagat agagaaagag aaagacagag gtgtttccct tagctatgga 60
aactctataa gagagatcca gcttgcctcc tcttgagcag tcagcaacag ggtcccgtcc 120
ttgacacctc agcctctaca ggactgagaa gaagtaaaac cgtttgctgg ggctggcctg 180
actcaccagc tgccatgcag cagcccttca attacccata tccccagatc tactgggtgg 240
acagcagtgc cagctctccc tgggcccctc caggcacagt tcttccctgt ccaacctctg 300
tgcccagaag gcctggtcaa aggaggccac caccaccacc gccaccgcca ccactaccac 360
ctccgccgcc gccgccacca ctgcctccac taccgctgcc acccctgaag aagagaggga 420
accacagcac aggcctgtgt ctccttgtga tgtttttcat ggttctggtt gccttggtag 480
gattgggcct ggggatgttt cagctcttcc acctacagaa ggagctggca gaactccgag 540
agtctaccag ccagatgcac acagcatcat ctttggagaa gcaaataggc caccccagtc 600
caccccctga aaaaaaggag ctgaggaaag tggcccattt aacaggcaag tccaactcaa 660
ggtccatgcc tctggaatgg gaagacacct atggaattgt cctgctttct ggagtgaagt 720
ataagaaggg tggccttgtg atcaatgaaa ctgggctgta ctttgtatat tccaaagtat 780
acttccgggg tcaatcttgc aacaacctgc ccctgagcca caaggtctac atgaggaact 840
ctaagtatcc ccaggatctg gtgatgatgg aggggaagat gatgagctac tgcactactg 900
ggcagatgtg ggcccgcagc agctacctgg gggcagtgtt caatcttacc agtgctgatc 960
atttatatgt caacgtatct gagctctctc tggtcaattt tgaggaatct cagacgtttt 1020
tcggcttata taagctctaa gagaagcact ttgggattct ttccattatg attctttgtt 1080
acaggcaccg agaatgttgt attcagtgag ggtcttctta catgcatttg aggtcaagta 1140
agaagacatg aaccaagtgg accttgagac cacagggttc aaaatgtctg tagctcctca 1200
actcacctaa tgtttatgag ccagacaaat ggaggaatat gacggaagaa catagaactc 1260
tgggctgcca tgtgaagagg gagaagcatg aaaaagcagc taccaggtgt tctacactca 1320
tcttagtgcc tgagagtatt taggcagatt gaaaaggaca ccttttaact cacctctcaa 1380
ggtgggcctt gctacctcaa gggggactgt ctttcagata catggttgtg acctgaggat 1440
ttaagggatg gaaaaggaag actagaggct tgcataataa gctaaagagg ctgaaagagg 1500
ccaatgcccc actggcagca tcttcacttc taaatgcata tcctgagcca tcggtgaaac 1560
taacagataa gcaagagaga tgttttgggg actcatttca ttcctaacac agcatgtgta 1620
tttccagtgc aattgtaggg gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt gtatgactaa 1680
agagagaatg tagatattgt gaagtacata ttaggaaaat atgggttgca tttggtcaag 1740
attttgaatg cttcctgaca atcaactcta atagtgctta aaaatcattg attgtcagct 1800
actaatgatg ttttcctata atataataaa tatttatgta gatgtgcatt tttgtgaaat 1860
gaaaacatgt aataaaaagt atatgttagg atacaaaaaa aaaaaaa 1907
<210> 18
<211> 3823
<212> DNA
<213> Homo sapiens
<400> 18
ataaaacctg gagcgcagga tcgcgcccag gagcggcgag ctagcggacg caaagactgg 60
gcatgctccg cggcggcgca ggttttggtc acaagtagga agaagccagt gcaccagacc 120
ggcaaagaga agcgggagcc gccgcggcag cgcggccgtg gggtccgccg ccgccgcatc 180
ggagcgggag gaggagcagc ggggagggcg aggccgccgg gccgagagcc gtcccgcctg 240
ctctcggtct tctgccttcg cctccgcgcg gtgcgtcgga cccagggtct gtcacctggg 300
cgccaggggc cgccgccggg gagccggagc gggcaggacc ctccctccgc cgactgcggc 360
ccgagagcgc ccccgcgggg tggagcggca gccgccttct gcgggcggct gagtgtccgt 420
ctcgcgcccg gagcgggcga ccgccgtcag cccggaggag gaggaggagg aggagggggc 480
ggccatgggg ctgctgtccc agggctcgcc gctgagctgg gaggaaacca agcgccatgc 540
cgaccacgtg cggcggcacg ggatcctcca gttcctgcac atctaccacg ccgtcaagga 600
ccggcacaag gacgttctca agtggggcga tgaggtggaa tacatgttgg tatcttttga 660
tcatgaaaat aaaaaagtcc ggttggtcct gtctggggag aaagttcttg aaactctgca 720
agagaagggg gaaaggacaa acccaaacca tcctaccctt tggagaccag agtatgggag 780
ttacatgatt gaagggacac caggacagcc ctacggagga acaatgtccg agttcaatac 840
agttgaggcc aacatgcgaa aacgccggaa ggaggctact tctatattag aagaaaatca 900
ggctctttgc acaataactt catttcccag attaggctgt cctgggttca cactgcccga 960
ggtcaaaccc aacccagtgg aaggaggagc ttccaagtcc ctcttctttc cagatgaagc 1020
aataaacaag caccctcgct tcagtacctt aacaagaaat atccgacata ggagaggaga 1080
aaaggttgtc atcaatgtac caatatttaa ggacaagaat acaccatctc catttataga 1140
aacatttact gaggatgatg aagcttcaag ggcttctaag ccggatcata tttacatgga 1200
tgccatggga tttggaatgg gcaattgctg tctccaggtg acattccaag cctgcagtat 1260
atctgaggcc agataccttt atgatcagtt ggctactatc tgtccaattg ttatggcttt 1320
gagtgctgca tctccctttt accgaggcta tgtgtcagac attgattgtc gctggggagt 1380
gatttctgca tctgtagatg atagaactcg ggaggagcga ggactggagc cattgaagaa 1440
caataactat aggatcagta aatcccgata tgactcaata gacagctatt tatctaagtg 1500
tggtgagaaa tataatgaca tcgacttgac gatagataaa gagatctacg aacagctgtt 1560
gcaggaaggc attgatcatc tcctggccca gcatgttgct catctcttta ttagagaccc 1620
actgacactg tttgaagaga aaatacacct ggatgatgct aatgagtctg accattttga 1680
gaatattcag tccacaaatt ggcagacaat gagatttaag ccccctcctc caaactcaga 1740
cattggatgg agagtagaat ttcgacccat ggaggtgcaa ttaacagact ttgagaactc 1800
tgcctatgtg gtgtttgtgg tactgctcac cagagtgatc ctttcctaca aattggattt 1860
tctcattcca ctgtcaaagg ttgatgagaa catgaaggta gcacagaaaa gagatgctgt 1920
cttgcaggga atgttttatt tcaggaaaga tatttgcaaa ggtggcaatg cagtggtgga 1980
tggttgtggc aaggcccaga acagcacgga gctcgctgca gaggagtaca ccctcatgag 2040
catagacacc atcatcaatg ggaaggaagg tgtgtttcct ggactgatcc caattctgaa 2100
ctcttacctt gaaaacatgg aagtggatgt ggacaccaga tgtagtattc tgaactacct 2160
aaagctaatt aagaagagag catctggaga actaatgaca gttgccagat ggatgaggga 2220
gtttatcgca aaccatcctg actacaagca agacagtgtc ataactgatg aaatgaatta 2280
tagccttatt ttgaagtgta accaaattgc aaatgaatta tgtgaatgcc cagagttact 2340
tggatcagca tttaggaaag taaaatatag tggaagtaaa actgactcat ccaactagac 2400
attctacaga aagaaaaatg cattattgac gaactggcta cagtaccatg cctctcagcc 2460
agcccgtgtg tataatatga agaccaaatg atagaactgt actgttttct gggccagtga 2520
gccagaaatt gattaaggct ttctttggta ggtaaatcta gagtttatac agtgtacatg 2580
tacatagtaa agtatttttg attaacaatg tattttaata acatatctaa agtcatcatg 2640
aactggcttg tacattttta aattcttact ctggagcaac ctactgtcta agcagttttg 2700
taaatgtact ggtaattgta caatacttgc attccagagt taaaatgttt actgtaaatt 2760
tttgttcttt taaagactac ctgggacctg atttattgaa atttttctct ttaaaaacat 2820
tttctctcgt taattttcct ttgtcatttc ctttgttgtc tacattaaat cacttgaatc 2880
cattgaaagt gcttcaaggg taatcttggg tttctagcac cttatctatg atgtttcttt 2940
tgcaattgga ataatcactt ggtcaccttg ccccaagctt tcccctctga ataaataccc 3000
attgaactct gatggctgtt atcaaaggaa cttttctttg tttaaatttg ctgatgcagg 3060
aattaagttt aaacacaact ctatagaaag aaaggagatt attacccaga attcacatgt 3120
agtgattatt aaggacaatt ttttttttta actaaaaaag ttggcggcag gggtgggggg 3180
tggcaatcat ttttcttcct atacatacaa aggatattgt caaaaatggc gttcttctct 3240
tgtggcctgt tattctgatt gctgctgtat acagttttgt cactctttag tttttagtta 3300
agcatactga tagactttcc tctaaaagcc attcactcca gattttacct ggggaatatt 3360
ctacatactg cttactttct ctataaaact catcaataaa tcatgaaagg cactgagttt 3420
tgtaaatcag gaccctaaat gtttaattgt aaataagttt cagataatta ttatagcttt 3480
gcgttgaagt ttgttgtttt ttttctcaac tagttaagtc aactgcttct gaaataactc 3540
tgtattgtag attatgcaga tctttacagg cataaatatt taaactgtaa tatgctaact 3600
tgaagagatt gcaataaagc tgcttcagct aaccctgttt atgtttaaat actagggttt 3660
gttctatatt ttatacatgc attttggatg attaaagaat gcctggtttt cgtttgcaat 3720
ttgcttgtgt aaatcaggtt gtaaaaaggc agataaattg aaatgtttgt ggtatgagga 3780
aataaaagaa tggaattagc tttcattcag aaaaaaaaaa aaa 3823
<210> 19
<211> 3249
<212> DNA
<213> Homo sapiens
<400> 19
caagcttagc ctggccggga aacgggaggc gtggaggccg ggagcagccc ccggggtcat 60
cgccctgcca ccgccgcccg attgctttag cttggaaatt ccggagctga agcggccagc 120
gagggaggat gaccctctcg gcccgggcac cctgtcagtc cggaaataac tgcagcattt 180
gttccggagg ggaaggcgcg aggtttccgg gaaagcagca ccgccccttg gcccccaggt 240
ggctagcgct ataaaggatc acgcgcccca gtcgacgctg agctcctctg ctactcagag 300
ttgcaacctc agcctcgcta tggctcccag cagcccccgg cccgcgctgc ccgcactcct 360
ggtcctgctc ggggctctgt tcccaggacc tggcaatgcc cagacatctg tgtccccctc 420
aaaagtcatc ctgccccggg gaggctccgt gctggtgaca tgcagcacct cctgtgacca 480
gcccaagttg ttgggcatag agaccccgtt gcctaaaaag gagttgctcc tgcctgggaa 540
caaccggaag gtgtatgaac tgagcaatgt gcaagaagat agccaaccaa tgtgctattc 600
aaactgccct gatgggcagt caacagctaa aaccttcctc accgtgtact ggactccaga 660
acgggtggaa ctggcacccc tcccctcttg gcagccagtg ggcaagaacc ttaccctacg 720
ctgccaggtg gagggtgggg caccccgggc caacctcacc gtggtgctgc tccgtgggga 780
gaaggagctg aaacgggagc cagctgtggg ggagcccgct gaggtcacga ccacggtgct 840
ggtgaggaga gatcaccatg gagccaattt ctcgtgccgc actgaactgg acctgcggcc 900
ccaagggctg gagctgtttg agaacacctc ggccccctac cagctccaga cctttgtcct 960
gccagcgact cccccacaac ttgtcagccc ccgggtccta gaggtggaca cgcaggggac 1020
cgtggtctgt tccctggacg ggctgttccc agtctcggag gcccaggtcc acctggcact 1080
gggggaccag aggttgaacc ccacagtcac ctatggcaac gactccttct cggccaaggc 1140
ctcagtcagt gtgaccgcag aggacgaggg cacccagcgg ctgacgtgtg cagtaatact 1200
ggggaaccag agccaggaga cactgcagac agtgaccatc tacagctttc cggcgcccaa 1260
cgtgattctg acgaagccag aggtctcaga agggaccgag gtgacagtga agtgtgaggc 1320
ccaccctaga gccaaggtga cgctgaatgg ggttccagcc cagccactgg gcccgagggc 1380
ccagctcctg ctgaaggcca ccccagagga caacgggcgc agcttctcct gctctgcaac 1440
cctggaggtg gccggccagc ttatacacaa gaaccagacc cgggagcttc gtgtcctgta 1500
tggcccccga ctggacgaga gggattgtcc gggaaactgg acgtggccag aaaattccca 1560
gcagactcca atgtgccagg cttgggggaa cccattgccc gagctcaagt gtctaaagga 1620
tggcactttc ccactgccca tcggggaatc agtgactgtc actcgagatc ttgagggcac 1680
ctacctctgt cgggccagga gcactcaagg ggaggtcacc cgcaaggtga ccgtgaatgt 1740
gctctccccc cggtatgaga ttgtcatcat cactgtggta gcagccgcag tcataatggg 1800
cactgcaggc ctcagcacgt acctctataa ccgccagcgg aagatcaaga aatacagact 1860
acaacaggcc caaaaaggga cccccatgaa accgaacaca caagccacgc ctccctgaac 1920
ctatcccggg acagggcctc ttcctcggcc ttcccatatt ggtggcagtg gtgccacact 1980
gaacagagtg gaagacatat gccatgcagc tacacctacc ggccctggga cgccggagga 2040
cagggcattg tcctcagtca gatacaacag catttggggc catggtacct gcacacctaa 2100
aacactaggc cacgcatctg atctgtagtc acatgactaa gccaagagga aggagcaaga 2160
ctcaagacat gattgatgga tgttaaagtc tagcctgatg agaggggaag tggtggggga 2220
gacatagccc caccatgagg acatacaact gggaaatact gaaacttgct gcctattggg 2280
tatgctgagg ccccacagac ttacagaaga agtggccctc catagacatg tgtagcatca 2340
aaacacaaag gcccacactt cctgacggat gccagcttgg gcactgctgt ctactgaccc 2400
caacccttga tgatatgtat ttattcattt gttattttac cagctattta ttgagtgtct 2460
tttatgtagg ctaaatgaac ataggtctct ggcctcacgg agctcccagt cctaatcaca 2520
ttcaaggtca ccaggtacag ttgtacaggt tgtacactgc aggagagtgc ctggcaaaaa 2580
gatcaaatgg ggctgggact tctcattggc caacctgcct ttccccagaa ggagtgattt 2640
ttctatcggc acaaaagcac tatatggact ggtaatggtt acaggttcag agattaccca 2700
gtgaggcctt attcctccct tccccccaaa actgacacct ttgttagcca cctccccacc 2760
cacatacatt tctgccagtg ttcacaatga cactcagcgg tcatgtctgg acatgagtgc 2820
ccagggaata tgcccaagct atgccttgtc ctcttgtcct gtttgcattt cactgggagc 2880
ttgcactatg cagctccagt ttcctgcagt gatcagggtc ctgcaagcag tggggaaggg 2940
ggccaaggta ttggaggact ccctcccagc tttggaagcc tcatccgcgt gtgtgtgtgt 3000
gtgtatgtgt agacaagctc tcgctctgtc acccaggctg gagtgcagtg gtgcaatcat 3060
ggttcactgc agtcttgacc ttttgggctc aagtgatcct cccacctcag cctcctgagt 3120
agctgggacc ataggctcac aacaccacac ctggcaaatt tgattttttt tttttttcca 3180
gagacggggt ctcgcaacat tgcccagact tcctttgtgt tagttaataa agctttctca 3240
actgccaaa 3249
<210> 20
<211> 1254
<212> DNA
<213> Homo sapiens
<400> 20
ctcacttggc cttacactcc gctcggctca ccatgtgtca ctctcgcagc tgccacccga 60
ccatgaccat cctgcaggcc ccgaccccgg ccccctccac catcccggga ccccggcggg 120
gctccggtcc tgagatcttc accttcgacc ctctcccgga gcccgcagcg gcccctgccg 180
ggcgccccag cgcctctcgc gggcaccgaa agcgcagccg cagggttctc taccctcgag 240
tggtccggcg ccagctgcca gtcgaggaac cgaacccagc caaaaggctt ctctttctgc 300
tgctcaccat cgtcttctgc cagatcctga tggctgaaga gggtgtgccg gcgcccctgc 360
ctccagagga cgcccctaac gccgcatccc tggcgcccac ccctgtgtcc gccgtcctcg 420
agccctttaa tctgacttcg gagccctcgg actacgctct ggacctcagc actttcctcc 480
agcaacaccc ggccgccttc taactgtgac tccccgcact ccccaaaaag aatccgaaaa 540
accacaaaga aacaccaggc gtacctggtg cgcgagagcg tatccccaac tgggacttcc 600
gaggcaactt gaactcagaa cactacagcg gagacgccac ccggtgcttg aggcgggacc 660
gaggcgcaca gagaccgagg cgcatagaga ccgaggcaca gcccagctgg ggctaggccc 720
ggtgggaagg agagcgtcgt taatttattt cttattgctc ctaattaata tttatatgta 780
tttatgtacg tcctcctagg tgatggagat gtgtacgtaa tatttatttt aacttatgca 840
agggtgtgag atgttccccc tgctgtaaat gcaggtctct tggtatttat tgagctttgt 900
gggactggtg gaagcaggac acctggaact gcggcaaagt aggagaagaa atggggagga 960
ctcgggtggg ggaggacgtc ccggctggga tgaagtctgg tggtgggtcg taagtttagg 1020
aggtgactgc atcctccagc atctcaactc cgtctgtcta ctgtgtgaga cttcggcgga 1080
ccattaggaa tgagatccgt gagatccttc catcttcttg aagtcgcctt tagggtggct 1140
gcgaggtaga gggttggggg ttggtgggct gtcacggagc gactgtcgag atcgcctagt 1200
atgttctgtg aacacaaata aaattgattt actgtctgca aaaaaaaaaa aaaa 1254
<210> 21
<211> 840
<212> DNA
<213> Homo sapiens
<400> 21
acattctaac tgcaaccttt cgaagccttt gctctggcac aacaggtagt aggcgacact 60
gttcgtgttg tcaacatgac caacaagtgt ctcctccaaa ttgctctcct gttgtgcttc 120
tccactacag ctctttccat gagctacaac ttgcttggat tcctacaaag aagcagcaat 180
tttcagtgtc agaagctcct gtggcaattg aatgggaggc ttgaatactg cctcaaggac 240
aggatgaact ttgacatccc tgaggagatt aagcagctgc agcagttcca gaaggaggac 300
gccgcattga ccatctatga gatgctccag aacatctttg ctattttcag acaagattca 360
tctagcactg gctggaatga gactattgtt gagaacctcc tggctaatgt ctatcatcag 420
ataaaccatc tgaagacagt cctggaagaa aaactggaga aagaagattt caccagggga 480
aaactcatga gcagtctgca cctgaaaaga tattatggga ggattctgca ttacctgaag 540
gccaaggagt acagtcactg tgcctggacc atagtcagag tggaaatcct aaggaacttt 600
tacttcatta acagacttac aggttacctc cgaaactgaa gatctcctag cctgtgcctc 660
tgggactgga caattgcttc aagcattctt caaccagcag atgctgttta agtgactgat 720
ggctaatgta ctgcatatga aaggacacta gaagattttg aaatttttat taaattatga 780
gttattttta tttatttaaa ttttattttg gaaaataaat tatttttggt gcaaaagtca 840
<210> 22
<211> 2347
<212> DNA
<213> Homo sapiens
<400> 22
ctgtttcagg gccattggac tctccgtcct gcccagagca agatgtgtca ccagcagttg 60
gtcatctctt ggttttccct ggtttttctg gcatctcccc tcgtggccat atgggaactg 120
aagaaagatg tttatgtcgt agaattggat tggtatccgg atgcccctgg agaaatggtg 180
gtcctcacct gtgacacccc tgaagaagat ggtatcacct ggaccttgga ccagagcagt 240
gaggtcttag gctctggcaa aaccctgacc atccaagtca aagagtttgg agatgctggc 300
cagtacacct gtcacaaagg aggcgaggtt ctaagccatt cgctcctgct gcttcacaaa 360
aaggaagatg gaatttggtc cactgatatt ttaaaggacc agaaagaacc caaaaataag 420
acctttctaa gatgcgaggc caagaattat tctggacgtt tcacctgctg gtggctgacg 480
acaatcagta ctgatttgac attcagtgtc aaaagcagca gaggctcttc tgacccccaa 540
ggggtgacgt gcggagctgc tacactctct gcagagagag tcagagggga caacaaggag 600
tatgagtact cagtggagtg ccaggaggac agtgcctgcc cagctgctga ggagagtctg 660
cccattgagg tcatggtgga tgccgttcac aagctcaagt atgaaaacta caccagcagc 720
ttcttcatca gggacatcat caaacctgac ccacccaaga acttgcagct gaagccatta 780
aagaattctc ggcaggtgga ggtcagctgg gagtaccctg acacctggag tactccacat 840
tcctacttct ccctgacatt ctgcgttcag gtccagggca agagcaagag agaaaagaaa 900
gatagagtct tcacggacaa gacctcagcc acggtcatct gccgcaaaaa tgccagcatt 960
agcgtgcggg cccaggaccg ctactatagc tcatcttgga gcgaatgggc atctgtgccc 1020
tgcagttagg ttctgatcca ggatgaaaat ttggaggaaa agtggaagat attaagcaaa 1080
atgtttaaag acacaacgga atagacccaa aaagataatt tctatctgat ttgctttaaa 1140
acgttttttt aggatcacaa tgatatcttt gctgtatttg tatagttaga tgctaaatgc 1200
tcattgaaac aatcagctaa tttatgtata gattttccag ctctcaagtt gccatgggcc 1260
ttcatgctat ttaaatattt aagtaattta tgtatttatt agtatattac tgttatttaa 1320
cgtttgtctg ccaggatgta tggaatgttt catactctta tgacctgatc catcaggatc 1380
agtccctatt atgcaaaatg tgaatttaat tttatttgta ctgacaactt ttcaagcaag 1440
gctgcaagta catcagtttt atgacaatca ggaagaatgc agtgttctga taccagtgcc 1500
atcatacact tgtgatggat gggaacgcaa gagatactta catggaaacc tgacaatgca 1560
aacctgttga gaagatccag gagaacaaga tgctagttcc catgtctgtg aagacttcct 1620
ggagatggtg ttgataaagc aatttagggc cacttacact tctaagcaag tttaatcttt 1680
ggatgcctga attttaaaag ggctagaaaa aaatgattga ccagcctggg aaacataaca 1740
agaccccgtc tctacaaaaa aaatttaaaa ttagccaggc gtggtggctc atgcttgtgg 1800
tcccagctgt tcaggaggat gaggcaggag gatctcttga gcccaggagg tcaaggctat 1860
ggtgagccgt gattgtgcca ctgcatacca gcctaggtga cagaatgaga ccctgtctca 1920
aaaaaaaaaa tgattgaaat taaaattcag ctttagcttc catggcagtc ctcaccccca 1980
cctctctaaa agacacagga ggatgacaca gaaacaccgt aagtgtctgg aaggcaaaaa 2040
gatcttaaga ttcaagagag aggacaagta gttatggcta aggacatgaa attgtcagaa 2100
tggcaggtgg cttcttaaca gccctgtgag aagcagacag atgcaaagaa aatctggaat 2160
ccctttctca ttagcatgaa tgaacctgat acacaattat gaccagaaaa tatggctcca 2220
tgaaggtgct acttttaagt aatgtatgtg cgctctgtaa agtgattaca tttgtttcct 2280
gtttgtttat ttatttattt atttttgcat tctgaggctg aactaataaa aactcttctt 2340
tgtaatc 2347
<210> 23
<211> 1498
<212> DNA
<213> Homo sapiens
<400> 23
accaaacctc ttcgaggcac aaggcacaac aggctgctct gggattctct tcagccaatc 60
ttcattgctc aagtgtctga agcagccatg gcagaagtac ctgagctcgc cagtgaaatg 120
atggcttatt acagtggcaa tgaggatgac ttgttctttg aagctgatgg ccctaaacag 180
atgaagtgct ccttccagga cctggacctc tgccctctgg atggcggcat ccagctacga 240
atctccgacc accactacag caagggcttc aggcaggccg cgtcagttgt tgtggccatg 300
gacaagctga ggaagatgct ggttccctgc ccacagacct tccaggagaa tgacctgagc 360
accttctttc ccttcatctt tgaagaagaa cctatcttct tcgacacatg ggataacgag 420
gcttatgtgc acgatgcacc tgtacgatca ctgaactgca cgctccggga ctcacagcaa 480
aaaagcttgg tgatgtctgg tccatatgaa ctgaaagctc tccacctcca gggacaggat 540
atggagcaac aagtggtgtt ctccatgtcc tttgtacaag gagaagaaag taatgacaaa 600
atacctgtgg ccttgggcct caaggaaaag aatctgtacc tgtcctgcgt gttgaaagat 660
gataagccca ctctacagct ggagagtgta gatcccaaaa attacccaaa gaagaagatg 720
gaaaagcgat ttgtcttcaa caagatagaa atcaataaca agctggaatt tgagtctgcc 780
cagttcccca actggtacat cagcacctct caagcagaaa acatgcccgt cttcctggga 840
gggaccaaag gcggccagga tataactgac ttcaccatgc aatttgtgtc ttcctaaaga 900
gagctgtacc cagagagtcc tgtgctgaat gtggactcaa tccctagggc tggcagaaag 960
ggaacagaaa ggtttttgag tacggctata gcctggactt tcctgttgtc tacaccaatg 1020
cccaactgcc tgccttaggg tagtgctaag aggatctcct gtccatcagc caggacagtc 1080
agctctctcc tttcagggcc aatccccagc ccttttgttg agccaggcct ctctcacctc 1140
tcctactcac ttaaagcccg cctgacagaa accacggcca catttggttc taagaaaccc 1200
tctgtcattc gctcccacat tctgatgagc aaccgcttcc ctatttattt atttatttgt 1260
ttgtttgttt tattcattgg tctaatttat tcaaaggggg caagaagtag cagtgtctgt 1320
aaaagagcct agtttttaat agctatggaa tcaattcaat ttggactggt gtgctctctt 1380
taaatcaagt cctttaatta agactgaaaa tatataagct cagattattt aaatgggaat 1440
atttataaat gagcaaatat catactgttc aatggttctg aaataaactt cactgaag 1498
<210> 24
<211> 822
<212> DNA
<213> Homo sapiens
<400> 24
agttccctat cactctcttt aatcactact cacagtaacc tcaactcctg ccacaatgta 60
caggatgcaa ctcctgtctt gcattgcact aagtcttgca cttgtcacaa acagtgcacc 120
tacttcaagt tctacaaaga aaacacagct acaactggag catttactgc tggatttaca 180
gatgattttg aatggaatta ataattacaa gaatcccaaa ctcaccagga tgctcacatt 240
taagttttac atgcccaaga aggccacaga actgaaacat cttcagtgtc tagaagaaga 300
actcaaacct ctggaggaag tgctaaattt agctcaaagc aaaaactttc acttaagacc 360
cagggactta atcagcaata tcaacgtaat agttctggaa ctaaagggat ctgaaacaac 420
attcatgtgt gaatatgctg atgagacagc aaccattgta gaatttctga acagatggat 480
taccttttgt caaagcatca tctcaacact gacttgataa ttaagtgctt cccacttaaa 540
acatatcagg ccttctattt atttaaatat ttaaatttta tatttattgt tgaatgtatg 600
gtttgctacc tattgtaact attattctta atcttaaaac tataaatatg gatcttttat 660
gattcttttt gtaagcccta ggggctctaa aatggtttca cttatttatc ccaaaatatt 720
tattattatg ttgaatgtta aatatagtat ctatgtagat tggttagtaa aactatttaa 780
taaatttgat aaatataaaa aaaaaaaaaa aaaaaaaaaa aa 822
<210> 25
<211> 1049
<212> DNA
<213> Homo sapiens
<400> 25
aaaacaacag gaagcagctt acaaactcgg tgaacaactg agggaaccaa accagagacg 60
cgctgaacag agagaatcag gctcaaagca agtggaagtg ggcagagatt ccaccaggac 120
tggtgcaagg cgcagagcca gccagatttg agaagaaggc aaaaagatgc tggggagcag 180
agctgtaatg ctgctgttgc tgctgccctg gacagctcag ggcagagctg tgcctggggg 240
cagcagccct gcctggactc agtgccagca gctttcacag aagctctgca cactggcctg 300
gagtgcacat ccactagtgg gacacatgga tctaagagaa gagggagatg aagagactac 360
aaatgatgtt ccccatatcc agtgtggaga tggctgtgac ccccaaggac tcagggacaa 420
cagtcagttc tgcttgcaaa ggatccacca gggtctgatt ttttatgaga agctgctagg 480
atcggatatt ttcacagggg agccttctct gctccctgat agccctgtgg gccagcttca 540
tgcctcccta ctgggcctca gccaactcct gcagcctgag ggtcaccact gggagactca 600
gcagattcca agcctcagtc ccagccagcc atggcagcgt ctccttctcc gcttcaaaat 660
ccttcgcagc ctccaggcct ttgtggctgt agccgcccgg gtctttgccc atggagcagc 720
aaccctgagt ccctaaaggc agcagctcaa ggatggcact cagatctcca tggcccagca 780
aggccaagat aaatctacca ccccaggcac ctgtgagcca acaggttaat tagtccatta 840
attttagtgg gacctgcata tgttgaaaat taccaatact gactgacatg tgatgctgac 900
ctatgataag gttgagtatt tattagatgg gaagggaaat ttggggatta tttatcctcc 960
tggggacagt ttggggagga ttatttattg tatttatatt gaattatgta cttttttcaa 1020
taaagtctta tttttgtggc taaaaaaaa 1049
<210> 26
<211> 1201
<212> DNA
<213> Homo sapiens
<400> 26
aatattagag tctcaacccc caataaatat aggactggag atgtctgagg ctcattctgc 60
cctcgagccc accgggaacg aaagagaagc tctatctccc ctccaggagc ccagctatga 120
actccttctc cacaagcgcc ttcggtccag ttgccttctc cctggggctg ctcctggtgt 180
tgcctgctgc cttccctgcc ccagtacccc caggagaaga ttccaaagat gtagccgccc 240
cacacagaca gccactcacc tcttcagaac gaattgacaa acaaattcgg tacatcctcg 300
acggcatctc agccctgaga aaggagacat gtaacaagag taacatgtgt gaaagcagca 360
aagaggcact ggcagaaaac aacctgaacc ttccaaagat ggctgaaaaa gatggatgct 420
tccaatctgg attcaatgag gagacttgcc tggtgaaaat catcactggt cttttggagt 480
ttgaggtata cctagagtac ctccagaaca gatttgagag tagtgaggaa caagccagag 540
ctgtgcagat gagtacaaaa gtcctgatcc agttcctgca gaaaaaggca aagaatctag 600
atgcaataac cacccctgac ccaaccacaa atgccagcct gctgacgaag ctgcaggcac 660
agaaccagtg gctgcaggac atgacaactc atctcattct gcgcagcttt aaggagttcc 720
tgcagtccag cctgagggct cttcggcaaa tgtagcatgg gcacctcaga ttgttgttgt 780
taatgggcat tccttcttct ggtcagaaac ctgtccactg ggcacagaac ttatgttgtt 840
ctctatggag aactaaaagt atgagcgtta ggacactatt ttaattattt ttaatttatt 900
aatatttaaa tatgtgaagc tgagttaatt tatgtaagtc atatttatat ttttaagaag 960
taccacttga aacattttat gtattagttt tgaaataata atggaaagtg gctatgcagt 1020
ttgaatatcc tttgtttcag agccagatca tttcttggaa agtgtaggct tacctcaaat 1080
aaatggctaa cttatacata tttttaaaga aatatttata ttgtatttat ataatgtata 1140
aatggttttt ataccaataa atggcatttt aaaaaattca gcaaaaaaaa aaaaaaaaaa 1200
a 1201
<210> 27
<211> 1718
<212> DNA
<213> Homo sapiens
<400> 27
gagggtgcat aagttctcta gtagggtgat gatataaaaa gccaccggag cactccataa 60
ggcacaaact ttcagagaca gcagagcaca caagcttcta ggacaagagc caggaagaaa 120
ccaccggaag gaaccatctc actgtgtgta aacatgactt ccaagctggc cgtggctctc 180
ttggcagcct tcctgatttc tgcagctctg tgtgaaggtg cagttttgcc aaggagtgct 240
aaagaactta gatgtcagtg cataaagaca tactccaaac ctttccaccc caaatttatc 300
aaagaactga gagtgattga gagtggacca cactgcgcca acacagaaat tattgtaaag 360
ctttctgatg gaagagagct ctgtctggac cccaaggaaa actgggtgca gagggttgtg 420
gagaagtttt tgaagagggc tgagaattca taaaaaaatt cattctctgt ggtatccaag 480
aatcagtgaa gatgccagtg aaacttcaag caaatctact tcaacacttc atgtattgtg 540
tgggtctgtt gtagggttgc cagatgcaat acaagattcc tggttaaatt tgaatttcag 600
taaacaatga atagtttttc attgtaccat gaaatatcca gaacatactt atatgtaaag 660
tattatttat ttgaatctac aaaaaacaac aaataatttt taaatataag gattttccta 720
gatattgcac gggagaatat acaaatagca aaattgaggc caagggccaa gagaatatcc 780
gaactttaat ttcaggaatt gaatgggttt gctagaatgt gatatttgaa gcatcacata 840
aaaatgatgg gacaataaat tttgccataa agtcaaattt agctggaaat cctggatttt 900
tttctgttaa atctggcaac cctagtctgc tagccaggat ccacaagtcc ttgttccact 960
gtgccttggt ttctccttta tttctaagtg gaaaaagtat tagccaccat cttacctcac 1020
agtgatgttg tgaggacatg tggaagcact ttaagttttt tcatcataac ataaattatt 1080
ttcaagtgta acttattaac ctatttatta tttatgtatt tatttaagca tcaaatattt 1140
gtgcaagaat ttggaaaaat agaagatgaa tcattgattg aatagttata aagatgttat 1200
agtaaattta ttttatttta gatattaaat gatgttttat tagataaatt tcaatcaggg 1260
tttttagatt aaacaaacaa acaattgggt acccagttaa attttcattt cagataaaca 1320
acaaataatt ttttagtata agtacattat tgtttatctg aaattttaat tgaactaaca 1380
atcctagttt gatactccca gtcttgtcat tgccagctgt gttggtagtg ctgtgttgaa 1440
ttacggaata atgagttaga actattaaaa cagccaaaac tccacagtca atattagtaa 1500
tttcttgctg gttgaaactt gtttattatg tacaaataga ttcttataat attatttaaa 1560
tgactgcatt tttaaataca aggctttata tttttaactt taagatgttt ttatgtgctc 1620
tccaaatttt ttttactgtt tctgattgta tggaaatata aaagtaaata tgaaacattt 1680
aaaatataat ttgttgtcaa agtaaaaaaa aaaaaaaa 1718
<210> 28
<211> 3567
<212> DNA
<213> Homo sapiens
<400> 28
agagctcgcc actccttagt cgaggcaaga cgtgcgcccg agccccgccg aaccgaggcc 60
acccggagcc gtgcccagtc cacgccggcc gtgcccggcg gccttaagaa cccggcaacc 120
tctgccttct tccctcttcc actcggagtc gcgctccgcg cgccctcact gcagcccctg 180
cgtcgccggg accctcgcgc gcgaccgccg aatcgctcct gcagcagagc caacatgccc 240
atcactcgga tgcgcatgag accctggcta gagatgcaga ttaattccaa ccaaatcccg 300
gggctcatct ggattaataa agaggagatg atcttccaga tcccatggaa gcatgctgcc 360
aagcatggct gggacatcaa caaggatgcc tgtttgttcc ggagctgggc cattcacaca 420
ggccgataca aagcagggga aaaggagcca gatcccaaga cgtggaaggc caactttcgc 480
tgtgccatga actccctgcc agatatcgag gaggtgaaag accagagcag gaacaagggc 540
agctcagctg tgcgagtgta ccggatgctt ccacctctca ccaagaacca gagaaaagaa 600
agaaagtcga agtccagccg agatgctaag agcaaggcca agaggaagtc atgtggggat 660
tccagccctg ataccttctc tgatggactc agcagctcca ctctgcctga tgaccacagc 720
agctacacag ttccaggcta catgcaggac ttggaggtgg agcaggccct gactccagca 780
ctgtcgccat gtgctgtcag cagcactctc cccgactggc acatcccagt ggaagttgtg 840
ccggacagca ccagtgatct gtacaacttc caggtgtcac ccatgccctc cacctctgaa 900
gctacaacag atgaggatga ggaagggaaa ttacctgagg acatcatgaa gctcttggag 960
cagtcggagt ggcagccaac aaacgtggat gggaaggggt acctactcaa tgaacctgga 1020
gtccagccca cctctgtcta tggagacttt agctgtaagg aggagccaga aattgacagc 1080
ccaggggggg atattgggct gagtctacag cgtgtcttca cagatctgaa gaacatggat 1140
gccacctggc tggacagcct gctgacccca gtccggttgc cctccatcca ggccattccc 1200
tgtgcaccgt agcagggccc ctgggcccct cttattcctc taggcaagca ggacctggca 1260
tcatggtgga tatggtgcag agaagctgga cttctgtggg cccctcaaca gccaagtgtg 1320
accccactgc caagtgggga tggggcctcc ctccttgggt cattgacctc tcagggcctg 1380
gcaggccagt gtctgggttt ttcttgtggt gtaaagctgg ccctgcctcc tgggaagatg 1440
aggttctgag accagtgtat caggtcaggg acttggacag gagtcagtgt ctggcttttt 1500
cctctgagcc cagctgcctg gagagggtct cgctgtcact ggctggctcc taggggaaca 1560
gaccagtgac cccagaaaag cataacacca atcccagggc tggctctgca ctaagagaaa 1620
attgcactaa atgaatctcg ttcccaaaga actaccccct tttcagctga gccctgggga 1680
ctgttccaaa gccagtgaaa tgtgaaggaa agtggggtcc ttcggggcga tgctccctca 1740
gcctcagagg agctctaccc tgctccctgc tttggctgag gggcttggga aaaaaacttg 1800
gcactttttc gtgtggatct tgccacattt ctgatcagag gtgtacacta acatttcccc 1860
cgagctcttg gcctttgcat ttatttatac agtgccttgc tcggcgccca ccaccccctc 1920
aagccccagc agccctcaac aggcccaggg agggaagtgt gagcgccttg gtatgactta 1980
aaattggaaa tgtcatctaa ccattaagtc atgtgtgaac acataaggac gtgtgtaaat 2040
atgtacattt gtctttttat aaaaagtaaa ttgtttataa ggggtgtggc ctttttagag 2100
agaaatttaa cttgtagatg attttacttt ttatggaaac actgatggac ttattattgg 2160
catcccgcct gaacttgact ttggggtgaa cagggacatg catctattat aaaatccttt 2220
cggccaggcg cggtggctca cacctgtaat cccagcactt tgggaggccg agatgggtgg 2280
atcacctgag gtcaggagtt cgagaccagc ctggtgaaac tccatttcta ctaaaaatgc 2340
aaaaattagc tgggcgtggt tgcgggtgct tgtaatccca gctactcagg aggctgaggc 2400
aagagaatcg cttgaacctg ggaggtggag gttgcagtga gccgagaaca tgccattgca 2460
ctccagcccg ggcaccaaaa aaaaaaaaaa aaaaaaaaac ctttcatttg gccgggcatg 2520
gtggcttatg cctgtaatcc tggcactttg ggaggccaag gtgggcagat cacctgaggt 2580
caggagtttg agaccagcct ggccaacatg gtgaaacctc atctctacta aaaatacaaa 2640
aattaggccg ggcacggtgg ctcacgcctg taatcccagc actttgggag gcagaggcgg 2700
gcggatcacg aggtcaggag atcaagacca tcctggctaa cacggtgaaa ccccgtctct 2760
actaaaaata taaaaaatta gccgggccta gtggcgggtg cctgtagtcc cagctactcg 2820
ggaggctgag gcaggagaat ggcatgaacc ccggaggcag agcttgcagt gagccgagat 2880
tgcaccactg cactacagcc tgggcgacag agcgagactc cgtctcaaaa aaaaaaaaaa 2940
aaattagccg ggcctggtgg cgggcgcctg taatcccagc tactgtggag gctgaagcac 3000
aagaatcact tgaacccggg agatggaggt tgcagtgagc tgagactgtg ccactgcact 3060
ccagcctggg tgacaagagt gagactttgt ctcaaaaaaa aaaaaatcct tttgtttatg 3120
ttcacataga caatggcaga aggaggggac attcctgtca taggaacatg cttatataaa 3180
catagtcacc tgtccttgac tatcaccagg gctgtcagtt gattctgggc tcctggggcc 3240
caaggagtgt taagttttga ggcatgtgcc ataggtgatg tgtcctgcta acacacagat 3300
gctgctccaa aaagtcagtt gatatgacac agtcacagac agaacagtca gcagcccaag 3360
aaaggtcctc acggctgctg tgctgggtag cacttgccat ccagtttcta gagtgatgaa 3420
atgctctgtc tgtaccgttc aatacagtag gcactggcac tagccacatg tgccagctaa 3480
gcacttgaaa tgtggccagt gcaataagga attgaacttt taattgcatt taataaactg 3540
tatgtaaata gtcaaaaaaa aaaaaaa 3567
<210> 29
<211> 2387
<212> DNA
<213> Homo sapiens
<400> 29
agacacctct gccctcacca tgagcctctg gcagcccctg gtcctggtgc tcctggtgct 60
gggctgctgc tttgctgccc ccagacagcg ccagtccacc cttgtgctct tccctggaga 120
cctgagaacc aatctcaccg acaggcagct ggcagaggaa tacctgtacc gctatggtta 180
cactcgggtg gcagagatgc gtggagagtc gaaatctctg gggcctgcgc tgctgcttct 240
ccagaagcaa ctgtccctgc ccgagaccgg tgagctggat agcgccacgc tgaaggccat 300
gcgaacccca cggtgcgggg tcccagacct gggcagattc caaacctttg agggcgacct 360
caagtggcac caccacaaca tcacctattg gatccaaaac tactcggaag acttgccgcg 420
ggcggtgatt gacgacgcct ttgcccgcgc cttcgcactg tggagcgcgg tgacgccgct 480
caccttcact cgcgtgtaca gccgggacgc agacatcgtc atccagtttg gtgtcgcgga 540
gcacggagac gggtatccct tcgacgggaa ggacgggctc ctggcacacg cctttcctcc 600
tggccccggc attcagggag acgcccattt cgacgatgac gagttgtggt ccctgggcaa 660
gggcgtcgtg gttccaactc ggtttggaaa cgcagatggc gcggcctgcc acttcccctt 720
catcttcgag ggccgctcct actctgcctg caccaccgac ggtcgctccg acggcttgcc 780
ctggtgcagt accacggcca actacgacac cgacgaccgg tttggcttct gccccagcga 840
gagactctac acccaggacg gcaatgctga tgggaaaccc tgccagtttc cattcatctt 900
ccaaggccaa tcctactccg cctgcaccac ggacggtcgc tccgacggct accgctggtg 960
cgccaccacc gccaactacg accgggacaa gctcttcggc ttctgcccga cccgagctga 1020
ctcgacggtg atggggggca actcggcggg ggagctgtgc gtcttcccct tcactttcct 1080
gggtaaggag tactcgacct gtaccagcga gggccgcgga gatgggcgcc tctggtgcgc 1140
taccacctcg aactttgaca gcgacaagaa gtggggcttc tgcccggacc aaggatacag 1200
tttgttcctc gtggcggcgc atgagttcgg ccacgcgctg ggcttagatc attcctcagt 1260
gccggaggcg ctcatgtacc ctatgtaccg cttcactgag gggcccccct tgcataagga 1320
cgacgtgaat ggcatccggc acctctatgg tcctcgccct gaacctgagc cacggcctcc 1380
aaccaccacc acaccgcagc ccacggctcc cccgacggtc tgccccaccg gaccccccac 1440
tgtccacccc tcagagcgcc ccacagctgg ccccacaggt cccccctcag ctggccccac 1500
aggtcccccc actgctggcc cttctacggc cactactgtg cctttgagtc cggtggacga 1560
tgcctgcaac gtgaacatct tcgacgccat cgcggagatt gggaaccagc tgtatttgtt 1620
caaggatggg aagtactggc gattctctga gggcaggggg agccggccgc agggcccctt 1680
ccttatcgcc gacaagtggc ccgcgctgcc ccgcaagctg gactcggtct ttgaggagcg 1740
gctctccaag aagcttttct tcttctctgg gcgccaggtg tgggtgtaca caggcgcgtc 1800
ggtgctgggc ccgaggcgtc tggacaagct gggcctggga gccgacgtgg cccaggtgac 1860
cggggccctc cggagtggca gggggaagat gctgctgttc agcgggcggc gcctctggag 1920
gttcgacgtg aaggcgcaga tggtggatcc ccggagcgcc agcgaggtgg accggatgtt 1980
ccccggggtg cctttggaca cgcacgacgt cttccagtac cgagagaaag cctatttctg 2040
ccaggaccgc ttctactggc gcgtgagttc ccggagtgag ttgaaccagg tggaccaagt 2100
gggctacgtg acctatgaca tcctgcagtg ccctgaggac tagggctccc gtcctgcttt 2160
ggcagtgcca tgtaaatccc cactgggacc aaccctgggg aaggagccag tttgccggat 2220
acaaactggt attctgttct ggaggaaagg gaggagtgga ggtgggctgg gccctctctt 2280
ctcacctttg ttttttgttg gagtgtttct aataaacttg gattctctaa cctttaaaaa 2340
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 2387
<210> 30
<211> 2379
<212> DNA
<213> Homo sapiens
<400> 30
gacccccgag ctgtgctgct cgcggccgcc accgccgggc cccggccgtc cctggctccc 60
ctcctgcctc gagaagggca gggcttctca gaggcttggc gggaaaaaga acggagggag 120
ggatcgcgct gagtataaaa gccggttttc ggggctttat ctaactcgct gtagtaattc 180
cagcgagagg cagagggagc gagcgggcgg ccggctaggg tggaagagcc gggcgagcag 240
agctgcgctg cgggcgtcct gggaagggag atccggagcg aatagggggc ttcgcctctg 300
gcccagccct cccgctgatc ccccagccag cggtccgcaa cccttgccgc atccacgaaa 360
ctttgcccat agcagcgggc gggcactttg cactggaact tacaacaccc gagcaaggac 420
gcgactctcc cgacgcgggg aggctattct gcccatttgg ggacacttcc ccgccgctgc 480
caggacccgc ttctctgaaa ggctctcctt gcagctgctt agacgctgga tttttttcgg 540
gtagtggaaa accagcagcc tcccgcgacg atgcccctca acgttagctt caccaacagg 600
aactatgacc tcgactacga ctcggtgcag ccgtatttct actgcgacga ggaggagaac 660
ttctaccagc agcagcagca gagcgagctg cagcccccgg cgcccagcga ggatatctgg 720
aagaaattcg agctgctgcc caccccgccc ctgtccccta gccgccgctc cgggctctgc 780
tcgccctcct acgttgcggt cacacccttc tcccttcggg gagacaacga cggcggtggc 840
gggagcttct ccacggccga ccagctggag atggtgaccg agctgctggg aggagacatg 900
gtgaaccaga gtttcatctg cgacccggac gacgagacct tcatcaaaaa catcatcatc 960
caggactgta tgtggagcgg cttctcggcc gccgccaagc tcgtctcaga gaagctggcc 1020
tcctaccagg ctgcgcgcaa agacagcggc agcccgaacc ccgcccgcgg ccacagcgtc 1080
tgctccacct ccagcttgta cctgcaggat ctgagcgccg ccgcctcaga gtgcatcgac 1140
ccctcggtgg tcttccccta ccctctcaac gacagcagct cgcccaagtc ctgcgcctcg 1200
caagactcca gcgccttctc tccgtcctcg gattctctgc tctcctcgac ggagtcctcc 1260
ccgcagggca gccccgagcc cctggtgctc catgaggaga caccgcccac caccagcagc 1320
gactctgagg aggaacaaga agatgaggaa gaaatcgatg ttgtttctgt ggaaaagagg 1380
caggctcctg gcaaaaggtc agagtctgga tcaccttctg ctggaggcca cagcaaacct 1440
cctcacagcc cactggtcct caagaggtgc cacgtctcca cacatcagca caactacgca 1500
gcgcctccct ccactcggaa ggactatcct gctgccaaga gggtcaagtt ggacagtgtc 1560
agagtcctga gacagatcag caacaaccga aaatgcacca gccccaggtc ctcggacacc 1620
gaggagaatg tcaagaggcg aacacacaac gtcttggagc gccagaggag gaacgagcta 1680
aaacggagct tttttgccct gcgtgaccag atcccggagt tggaaaacaa tgaaaaggcc 1740
cccaaggtag ttatccttaa aaaagccaca gcatacatcc tgtccgtcca agcagaggag 1800
caaaagctca tttctgaaga ggacttgttg cggaaacgac gagaacagtt gaaacacaaa 1860
cttgaacagc tacggaactc ttgtgcgtaa ggaaaagtaa ggaaaacgat tccttctaac 1920
agaaatgtcc tgagcaatca cctatgaact tgtttcaaat gcatgatcaa atgcaacctc 1980
acaaccttgg ctgagtcttg agactgaaag atttagccat aatgtaaact gcctcaaatt 2040
ggactttggg cataaaagaa cttttttatg cttaccatct tttttttttc tttaacagat 2100
ttgtatttaa gaattgtttt taaaaaattt taagatttac acaatgtttc tctgtaaata 2160
ttgccattaa atgtaaataa ctttaataaa acgtttatag cagttacaca gaatttcaat 2220
cctagtatat agtacctagt attataggta ctataaaccc taattttttt tatttaagta 2280
cattttgctt tttaaagttg atttttttct attgttttta gaaaaaataa aataactggc 2340
aaatatatca ttgagccaaa tcttaaaaaa aaaaaaaaa 2379
<210> 31
<211> 3125
<212> DNA
<213> Homo sapiens
<400> 31
ccgcaaccag agccgccgcc acggtgagtg gctggattca gacccctggg tggccgggac 60
aagagaaaag agggaggagg gcctttagcg gacagcgcct ggggctggag agcagcagct 120
gcacacagcc ggaaagggcg cgcaggcgac gacactcgga tccacgtcga caccgttgta 180
caaagatacg cggacccgcg ggcgtctaaa attctgggaa gcagaacctg gccggagcca 240
ctagacagag ccgggcctag cccagagaca tggagagttg ctacaaccca ggtctggatg 300
gtattattga atatgatgat ttcaaattga actcctccat tgtggaaccc aaggagccag 360
ccccagaaac agctgatggc ccctacctgg tgatcgtgga acagcctaag cagagaggct 420
tccgatttcg atatggctgt gaaggcccct cccatggagg actgcccggt gcctccagtg 480
agaagggccg aaagacctat cccactgtca agatctgtaa ctacgaggga ccagccaaga 540
tcgaggtgga cctggtaaca cacagtgacc cacctcgtgc tcatgcccac agtctggtgg 600
gcaagcaatg ctcggagctg gggatctgcg ccgtttctgt ggggcccaag gacatgactg 660
cccaatttaa caacctgggt gtcctgcatg tgactaagaa gaacatgatg gggactatga 720
tacaaaaact tcagaggcag cggctccgct ctaggcccca gggccttacg gaggccgagc 780
agcgggagct ggagcaagag gccaaagaac tgaagaaggt gatggatctg agtatagtgc 840
ggctgcgctt ctctgccttc cttagagcca gtgatggctc cttctccctg cccctgaagc 900
cagtcatctc ccagcccatc catgacagca aatctccggg ggcatcaaac ctgaagattt 960
ctcgaatgga caagacagca ggctctgtgc ggggtggaga tgaagtttat ctgctttgtg 1020
acaaggtgca gaaagatgac attgaggttc ggttctatga ggatgatgag aatggatggc 1080
aggcctttgg ggacttctct cccacagatg tgcataaaca gtatgccatt gtgttccgga 1140
caccccccta tcacaagatg aagattgagc ggcctgtaac agtgtttctg caactgaaac 1200
gcaagcgagg aggggacgtg tctgattcca aacagttcac ctattaccct ctggtggaag 1260
acaaggaaga ggtgcagcgg aagcggagga aggccttgcc caccttctcc cagcccttcg 1320
ggggtggctc ccacatgggt ggaggctctg ggggtgcagc cgggggctac ggaggagctg 1380
gaggaggtgg cagcctcggt ttcttcccct cctccctggc ctacagcccc taccagtccg 1440
gcgcgggccc catgggctgc tacccgggag gcgggggcgg ggcgcagatg gccgccacgg 1500
tgcccagcag ggactccggg gaggaagccg cggagccgag cgccccctcc aggacccccc 1560
agtgcgagcc gcaggccccg gagatgctgc agcgagctcg agagtacaac gcgcgcctgt 1620
tcggcctggc gcagcgcagc gcccgagccc tactcgacta cggcgtcacc gcggacgcgc 1680
gcgcgctgct ggcgggacag cgccacctgc tgacggcgca ggacgagaac ggagacacac 1740
cactgcacct agccatcatc cacgggcaga ccagtgtcat tgagcagata gtctatgtca 1800
tccaccacgc ccaggacctc ggcgttgtca acctcaccaa ccacctgcac cagacgcccc 1860
tgcacctggc ggtgatcacg gggcagacga gtgtggtgag ctttctgctg cgggtaggtg 1920
cagacccagc tctgctggat cggcatggag actcagccat gcatctggcg ctgcgggcag 1980
gcgctggtgc tcctgagctg ctgcgtgcac tgcttcagag tggagctcct gctgtgcccc 2040
agctgttgca tatgcctgac tttgagggac tgtatccagt acacctggcg gtccgagccc 2100
gaagccctga gtgcctggat ctgctggtgg acagtggggc tgaagtggag gccacagagc 2160
ggcagggggg acgaacagcc ttgcatctag ccacagagat ggaggagctg gggttggtca 2220
cccatctggt caccaagctc cgggccaacg tgaacgctcg cacctttgcg ggaaacacac 2280
ccctgcacct ggcagctgga ctggggtacc cgaccctcac ccgcctcctt ctgaaggctg 2340
gtgctgacat ccatgctgaa aacgaggagc ccctgtgccc actgccttca ccccctacct 2400
ctgatagcga ctcggactct gaagggcctg agaaggacac ccgaagcagc ttccggggcc 2460
acacgcctct tgacctcact tgcagcacca aggtgaagac cttgctgcta aatgctgctc 2520
agaacaccat ggagccaccc ctgaccccgc ccagcccagc agggccggga ctgtcacttg 2580
gtgatacagc tctgcagaac ctggagcagc tgctagacgg gccagaagcc cagggcagct 2640
gggcagagct ggcagagcgt ctggggctgc gcagcctggt agacacgtac cgacagacaa 2700
cctcacccag tggcagcctc ctgcgcagct acgagctggc tggcggggac ctggcaggtc 2760
tactggaggc cctgtctgac atgggcctag aggagggagt gaggctgctg aggggtccag 2820
aaacccgaga caagctgccc agcacagcag aggtgaagga agacagtgcg tacgggagcc 2880
agtcagtgga gcaggaggca gagaagctgg gcccaccccc tgagccacca ggagggctct 2940
gccacgggca cccccagcct caggtgcact gacctgctgc ctgcccccag cccccttccc 3000
ggaccccctg tacagcgtcc ccacctattt caaatcttat ttaacacccc acacccaccc 3060
ctcagttggg acaaataaag gattctcatg ggaaggggag gacccctcct tcccaactta 3120
tggca 3125
<210> 32
<211> 1579
<212> DNA
<213> Homo sapiens
<400> 32
agcccacagc agtccgtgcc gccgtcccgc ccgccagcgc cccagcgagg aagcagcgcg 60
cagcccgcgg cccagcgcac ccgcagcagc gcccgcagct cgtccgcgcc atgttccagg 120
cggccgagcg cccccaggag tgggccatgg agggcccccg cgacgggctg aagaaggagc 180
ggctactgga cgaccgccac gacagcggcc tggactccat gaaagacgag gagtacgagc 240
agatggtcaa ggagctgcag gagatccgcc tcgagccgca ggaggtgccg cgcggctcgg 300
agccctggaa gcagcagctc accgaggacg gggactcgtt cctgcacttg gccatcatcc 360
atgaagaaaa ggcactgacc atggaagtga tccgccaggt gaagggagac ctggccttcc 420
tcaacttcca gaacaacctg cagcagactc cactccactt ggctgtgatc accaaccagc 480
cagaaattgc tgaggcactt ctgggagctg gctgtgatcc tgagctccga gactttcgag 540
gaaatacccc cctacacctt gcctgtgagc agggctgcct ggccagcgtg ggagtcctga 600
ctcagtcctg caccaccccg cacctccact ccatcctgaa ggctaccaac tacaatggcc 660
acacgtgtct acacttagcc tctatccatg gctacctggg catcgtggag cttttggtgt 720
ccttgggtgc tgatgtcaat gctcaggagc cctgtaatgg ccggactgcc cttcacctcg 780
cagtggacct gcaaaatcct gacctggtgt cactcctgtt gaagtgtggg gctgatgtca 840
acagagttac ctaccagggc tattctccct accagctcac ctggggccgc ccaagcaccc 900
ggatacagca gcagctgggc cagctgacac tagaaaacct tcagatgctg ccagagagtg 960
aggatgagga gagctatgac acagagtcag agttcacgga gttcacagag gacgagctgc 1020
cctatgatga ctgtgtgttt ggaggccagc gtctgacgtt atgagcgcaa aggggctgaa 1080
agaacatgga cttgtatatt tgtacaaaaa aaaagtttta tttttctaaa aaaagaaaaa 1140
agaagaaaaa atttaaaggg tgtacttata tccacactgc acactgcctg gcccaaaacg 1200
tcttattgtg gtaggatcag ccctcatttt gttgcttttg tgaacttttt gtaggggacg 1260
agaaagatca ttgaaattct gagaaaactt cttttaaacc tcacctttgt ggggtttttg 1320
gagaaggtta tcaaaaattt catggaagga ccacatttta tatttattgt gcttcgagtg 1380
actgacccca gtggtatcct gtgacatgta acagccagga gtgttaagcg ttcagtgatg 1440
tggggtgaaa agttactacc tgtcaaggtt tgtgttaccc tcctgtaaat ggtgtacata 1500
atgtattgtt ggtaattatt ttggtacttt tatgatgtat atttattaaa cagattttta 1560
caaatgaaaa aaaaaaaaa 1579
<210> 33
<211> 2581
<212> DNA
<213> Homo sapiens
<400> 33
aatgaatgaa tgaatgaatg agtgaatgaa tcaacgaagg agtgagtcaa ggcccgggaa 60
ccacagactc caagcctacg cagagcccgg gaagggggat tccggagggg cggggcctct 120
ttccggaagc gcccgccggg ggcggggagg gggcggggcc atccgcgtga ggcgaccctg 180
ttggtccgga ggggcggggc gaggaggagg acccgcttgg gcggttcggc tgcccacagt 240
aaccgctggg tggacctggc cagcgctccg aaccttgtcc tcgctgcgcg ccggcccctc 300
ggagccccac agcccgggaa ggaggccgcc gcgggccggg cgcccgctct gccaagcgga 360
cccgcaaccc ggaaaggcgg cgcggcggag cctggagccg gatcctgctc agaccgggcc 420
ccggccggcc agagccgcgg gcatgtcgga ggcgcggaag gggccggacg aggcggagga 480
gagccagtac gactctggca ttgagtctct gcgctctctg cgctccctac ccgagtccac 540
ctcggctcca gcctccgggc cctcggacgg cagcccccag ccctgcaccc atcctccggg 600
acccgtcaag gaaccacagg agaaggaaga cgcggatggg gagcgggctg attccaccta 660
tggctcctcc tcgctcacct acaccctgtc cttgctgggg ggccccgagg ctgaggaccc 720
ggccccacgc ctgccactcc cccacgtggg ggcgctgagc cctcagcagc tggaagcact 780
cacttacatc tccgaggacg gagacacgct ggtccacctg gcagtgattc atgaggcccc 840
agcggtgctg ctctgttgcc tggctttgct gccccaggag gtcctggaca ttcaaaataa 900
cctttaccag acagcactcc atctggctgt acatctggac caaccgggcg cagttcgggc 960
actggtgctg aagggggcca gccgggcact acaggaccgg catggtgaca cagcccttca 1020
tgtggcctgc cagcgccagc acttggcctg tgcccgctgc ctgctggaag ggcggccaga 1080
gccaggcaga ggaacatctc actctctgga cctccagctg caaaactggc aaggtctggc 1140
ttgtctccac attgccaccc ttcagaagaa ccaaccactc atggaattgc tgcttcggaa 1200
tggagctgac attgatgtgc aggagggcac cagtggtaag acagcgctgc acctggctgt 1260
ggaaacccaa gagcggggcc tggtacagtt cctgctccag gctggtgccc aggtagatgc 1320
ccgcatgctg aacgggtgca cacccctgca cctggcagct ggccggggtc tcatgggcat 1380
ctcatccact ctgtgcaagg cgggtgctga ctccctgctg cggaatgtgg aggatgagac 1440
gccccaggac ctgactgagg aatcccttgt ccttttgccc tttgatgacc tgaagatctc 1500
agggaaactg ctgctgtgta ccgactgaag ccaggcaggg tctgggatcc tcagggctcc 1560
acctctccat ctggaagccg gagccataac tgctgcagtt tgggcccagg ctatgtgctc 1620
ttctggtgcc ctagggactg ctgtggccag agcctggggc cagccagtac agtcctgagc 1680
cgaggaggag ggactgcaag tggaagagag ccagtctgga aggaagagct ttccaggtgg 1740
acagggcttc ttggaagacc cccaaagccc caggtatcct gggtgaagcc tgtttgcctc 1800
tcttgaaaat ggcaggtgct cttgttttac ccatgttggg tcagcctgaa actgccaacc 1860
agtaggaagc atggactctc ctgagtgaga agagactgaa ataggagcaa gcagaaccct 1920
gagaggtgtc ccatcttatt gctgttgagg accctgaaac accgttgttt aaagacttca 1980
cacagaaggc tctgaactga gccactgggg aagggaagtt tcagtaacat gacactaaaa 2040
tggcagagac gttaaaaaaa gtttttccct tctagagctg ttttgcgcgc atgcatgtct 2100
gtgtgcattg gggcttttta gacaggcctg ccctgtgact ttgtggtaga ggcagagaga 2160
aggaaattgt cccctgagca cggtagggcc ttgctgggtg gggtcagagg ccagtagttc 2220
caggcctttc tctgtgtcca gcacagaccc ttgtccttgc tgtggaaatg atgagggatg 2280
gagggacaag aggaagaatg agaggacaca cgccctggag ccctcaccac tgccctgggg 2340
gttgccatct ggaggagcct ggggataagg gtaacccagg gaggctgggc acgagggagc 2400
tgactccacg tttttccccc cgttcctcac cttcgagggc cctgctaggt cacccctatg 2460
ctggcatgaa gagcatgggg caataaacca gcacagtctc tgaccacttg gagcgtctca 2520
tccagtgaga gagacagccg ttaaaagcat aaacatccaa ataaagatgc ctttccaagt 2580
t 2581
<210> 34
<211> 4206
<212> DNA
<213> Homo sapiens
<400> 34
ataactttgt agcgagtcga aaactgaggc tccggccgca gagaactcag cctcattcct 60
gctttaaaat ctctcggcca cctttgatga ggggactggg cagttctaga cagtcccgaa 120
gttctcaagg cacaggtctc ttcctggttt gactgtcctt accccgggga ggcagtgcag 180
ccagctgcaa gccccacagt gaagaacatc tgagctcaaa tccagataag tgacataagt 240
gacctgcttt gtaaagccat agagatggcc tgtccttgga aatttctgtt caagaccaaa 300
ttccaccagt atgcaatgaa tggggaaaaa gacatcaaca acaatgtgga gaaagccccc 360
tgtgccacct ccagtccagt gacacaggat gaccttcagt atcacaacct cagcaagcag 420
cagaatgagt ccccgcagcc cctcgtggag acgggaaaga agtctccaga atctctggtc 480
aagctggatg caaccccatt gtcctcccca cggcatgtga ggatcaaaaa ctggggcagc 540
gggatgactt tccaagacac acttcaccat aaggccaaag ggattttaac ttgcaggtcc 600
aaatcttgcc tggggtccat tatgactccc aaaagtttga ccagaggacc cagggacaag 660
cctacccctc cagatgagct tctacctcaa gctatcgaat ttgtcaacca atattacggc 720
tccttcaaag aggcaaaaat agaggaacat ctggccaggg tggaagcggt aacaaaggag 780
atagaaacaa caggaaccta ccaactgacg ggagatgagc tcatcttcgc caccaagcag 840
gcctggcgca atgccccacg ctgcattggg aggatccagt ggtccaacct gcaggtcttc 900
gatgcccgca gctgttccac tgcccgggaa atgtttgaac acatctgcag acacgtgcgt 960
tactccacca acaatggcaa catcaggtcg gccatcaccg tgttccccca gcggagtgat 1020
ggcaagcacg acttccgggt gtggaatgct cagctcatcc gctatgctgg ctaccagatg 1080
ccagatggca gcatcagagg ggaccctgcc aacgtggaat tcactcagct gtgcatcgac 1140
ctgggctgga agcccaagta cggccgcttc gatgtggtcc ccctggtcct gcaggccaat 1200
ggccgtgacc ctgagctctt cgaaatccca cctgaccttg tgcttgaggt ggccatggaa 1260
catcccaaat acgagtggtt tcgggaactg gagctaaagt ggtacgccct gcctgcagtg 1320
gccaacatgc tgcttgaggt gggcggcctg gagttcccag ggtgcccctt caatggctgg 1380
tacatgggca cagagatcgg agtccgggac ttctgtgacg tccagcgcta caacatcctg 1440
gaggaagtgg gcaggagaat gggcctggaa acgcacaagc tggcctcgct ctggaaagac 1500
caggctgtcg ttgagatcaa cattgctgtg ctccatagtt tccagaagca gaatgtgacc 1560
atcatggacc accactcggc tgcagaatcc ttcatgaagt acatgcagaa tgaataccgg 1620
tcccgtgggg gctgcccggc agactggatt tggctggtcc ctcccatgtc tgggagcatc 1680
acccccgtgt ttcaccagga gatgctgaac tacgtcctgt cccctttcta ctactatcag 1740
gtagaggcct ggaaaaccca tgtctggcag gacgagaagc ggagacccaa gagaagagag 1800
attccattga aagtcttggt caaagctgtg ctctttgcct gtatgctgat gcgcaagaca 1860
atggcgtccc gagtcagagt caccatcctc tttgcgacag agacaggaaa atcagaggcg 1920
ctggcctggg acctgggggc cttattcagc tgtgccttca accccaaggt tgtctgcatg 1980
gataagtaca ggctgagctg cctggaggag gaacggctgc tgttggtggt gaccagtacg 2040
tttggcaatg gagactgccc tggcaatgga gagaaactga agaaatcgct cttcatgctg 2100
aaagagctca acaacaaatt caggtacgct gtgtttggcc tcggctccag catgtaccct 2160
cggttctgcg cctttgctca tgacattgat cagaagctgt cccacctggg ggcctctcag 2220
ctcaccccga tgggagaagg ggatgagctc agtgggcagg aggacgcctt ccgcagctgg 2280
gccgtgcaaa ccttcaaggc agcctgtgag acgtttgatg tccgaggcaa acagcacatt 2340
cagatcccca agctctacac ctccaatgtg acctgggacc cgcaccacta caggctcgtg 2400
caggactcac agcctttgga cctcagcaaa gccctcagca gcatgcatgc caagaacgtg 2460
ttcaccatga ggctcaaatc tcggcagaat ctacaaagtc cgacatccag ccgtgccacc 2520
atcctggtgg aactctcctg tgaggatggc caaggcctga actacctgcc gggggagcac 2580
cttggggttt gcccaggcaa ccagccggcc ctggtccaag gtatcctgga gcgagtggtg 2640
gatggcccca caccccacca gacagtgcgc ctggaggccc tggatgagag tggcagctac 2700
tgggtcagtg acaagaggct gcccccctgc tcactcagcc aggccctcac ctacttcctg 2760
gacatcacca cacccccaac ccagctgctg ctccaaaagc tggcccaggt ggccacagaa 2820
gagcctgaga gacagaggct ggaggccctg tgccagccct cagagtacag caagtggaag 2880
ttcaccaaca gccccacatt cctggaggtg ctagaggagt tcccgtccct gcgggtgtct 2940
gctggcttcc tgctttccca gctccccatt ctgaagccca ggttctactc catcagctcc 3000
tcccgggatc acacgcccac agagatccac ctgactgtgg ccgtggtcac ctaccacacc 3060
cgagatggcc agggtcccct gcaccacggc gtctgcagca catggctcaa cagcctgaag 3120
ccccaagacc cagtgccctg ctttgtgcgg aatgccagcg gcttccacct ccccgaggat 3180
ccctcccatc cttgcatcct catcgggcct ggcacaggca tcgcgccctt ccgcagtttc 3240
tggcagcaac ggctccatga ctcccagcac aagggagtgc ggggaggccg catgaccttg 3300
gtgtttgggt gccgccgccc agatgaggac cacatctacc aggaggagat gctggagatg 3360
gcccagaagg gggtgctgca tgcggtgcac acagcctatt cccgcctgcc tggcaagccc 3420
aaggtctatg ttcaggacat cctgcggcag cagctggcca gcgaggtgct ccgtgtgctc 3480
cacaaggagc caggccacct ctatgtttgc ggggatgtgc gcatggcccg ggacgtggcc 3540
cacaccctga agcagctggt ggctgccaag ctgaaattga atgaggagca ggtcgaggac 3600
tatttctttc agctcaagag ccagaagcgc tatcacgaag atatctttgg tgctgtattt 3660
ccttacgagg cgaagaagga cagggtggcg gtgcagccca gcagcctgga gatgtcagcg 3720
ctctgagggc ctacaggagg ggttaaagct gccggcacag aacttaagga tggagccagc 3780
tctgcattat ctgaggtcac agggcctggg gagatggagg aaagtgatat cccccagcct 3840
caagtcttat ttcctcaacg ttgctcccca tcaagccctt tacttgacct cctaacaagt 3900
agcaccctgg attgatcgga gcctcctctc tcaaactggg gcctccctgg tcccttggag 3960
acaaaatctt aaatgccagg cctggcaagt gggtgaaaga tggaacttgc tgctgagtgc 4020
accacttcaa gtgaccacca ggaggtgcta tcgcaccact gtgtatttaa ctgccttgtg 4080
tacagttatt tatgcctctg tatttaaaaa actaacaccc agtctgttcc ccatggccac 4140
ttgggtcttc cctgtatgat tccttgatgg agatatttac atgaattgca ttttacttta 4200
atcaca 4206
<210> 35
<211> 4507
<212> DNA
<213> Homo sapiens
<400> 35
gaccaattgt catacgactt gcagtgagcg tcaggagcac gtccaggaac tcctcagcag 60
cgcctccttc agctccacag ccagacgccc tcagacagca aagcctaccc ccgcgccgcg 120
ccctgcccgc cgctgcgatg ctcgcccgcg ccctgctgct gtgcgcggtc ctggcgctca 180
gccatacagc aaatccttgc tgttcccacc catgtcaaaa ccgaggtgta tgtatgagtg 240
tgggatttga ccagtataag tgcgattgta cccggacagg attctatgga gaaaactgct 300
caacaccgga atttttgaca agaataaaat tatttctgaa acccactcca aacacagtgc 360
actacatact tacccacttc aagggatttt ggaacgttgt gaataacatt cccttccttc 420
gaaatgcaat tatgagttat gtgttgacat ccagatcaca tttgattgac agtccaccaa 480
cttacaatgc tgactatggc tacaaaagct gggaagcctt ctctaacctc tcctattata 540
ctagagccct tcctcctgtg cctgatgatt gcccgactcc cttgggtgtc aaaggtaaaa 600
agcagcttcc tgattcaaat gagattgtgg aaaaattgct tctaagaaga aagttcatcc 660
ctgatcccca gggctcaaac atgatgtttg cattctttgc ccagcacttc acgcatcagt 720
ttttcaagac agatcataag cgagggccag ctttcaccaa cgggctgggc catggggtgg 780
acttaaatca tatttacggt gaaactctgg ctagacagcg taaactgcgc cttttcaagg 840
atggaaaaat gaaatatcag ataattgatg gagagatgta tcctcccaca gtcaaagata 900
ctcaggcaga gatgatctac cctcctcaag tccctgagca tctacggttt gctgtggggc 960
aggaggtctt tggtctggtg cctggtctga tgatgtatgc cacaatctgg ctgcgggaac 1020
acaacagagt atgcgatgtg cttaaacagg agcatcctga atggggtgat gagcagttgt 1080
tccagacaag caggctaata ctgataggag agactattaa gattgtgatt gaagattatg 1140
tgcaacactt gagtggctat cacttcaaac tgaaatttga cccagaacta cttttcaaca 1200
aacaattcca gtaccaaaat cgtattgctg ctgaatttaa caccctctat cactggcatc 1260
cccttctgcc tgacaccttt caaattcatg accagaaata caactatcaa cagtttatct 1320
acaacaactc tatattgctg gaacatggaa ttacccagtt tgttgaatca ttcaccaggc 1380
aaattgctgg cagggttgct ggtggtagga atgttccacc cgcagtacag aaagtatcac 1440
aggcttccat tgaccagagc aggcagatga aataccagtc ttttaatgag taccgcaaac 1500
gctttatgct gaagccctat gaatcatttg aagaacttac aggagaaaag gaaatgtctg 1560
cagagttgga agcactctat ggtgacatcg atgctgtgga gctgtatcct gcccttctgg 1620
tagaaaagcc tcggccagat gccatctttg gtgaaaccat ggtagaagtt ggagcaccat 1680
tctccttgaa aggacttatg ggtaatgtta tatgttctcc tgcctactgg aagccaagca 1740
cttttggtgg agaagtgggt tttcaaatca tcaacactgc ctcaattcag tctctcatct 1800
gcaataacgt gaagggctgt ccctttactt cattcagtgt tccagatcca gagctcatta 1860
aaacagtcac catcaatgca agttcttccc gctccggact agatgatatc aatcccacag 1920
tactactaaa agaacgttcg actgaactgt agaagtctaa tgatcatatt tatttattta 1980
tatgaaccat gtctattaat ttaattattt aataatattt atattaaact ccttatgtta 2040
cttaacatct tctgtaacag aagtcagtac tcctgttgcg gagaaaggag tcatacttgt 2100
gaagactttt atgtcactac tctaaagatt ttgctgttgc tgttaagttt ggaaaacagt 2160
ttttattctg ttttataaac cagagagaaa tgagttttga cgtcttttta cttgaatttc 2220
aacttatatt ataagaacga aagtaaagat gtttgaatac ttaaacactg tcacaagatg 2280
gcaaaatgct gaaagttttt acactgtcga tgtttccaat gcatcttcca tgatgcatta 2340
gaagtaacta atgtttgaaa ttttaaagta cttttggtta tttttctgtc atcaaacaaa 2400
aacaggtatc agtgcattat taaatgaata tttaaattag acattaccag taatttcatg 2460
tctacttttt aaaatcagca atgaaacaat aatttgaaat ttctaaattc atagggtaga 2520
atcacctgta aaagcttgtt tgatttctta aagttattaa acttgtacat ataccaaaaa 2580
gaagctgtct tggatttaaa tctgtaaaat cagtagaaat tttactacaa ttgcttgtta 2640
aaatatttta taagtgatgt tcctttttca ccaagagtat aaaccttttt agtgtgactg 2700
ttaaaacttc cttttaaatc aaaatgccaa atttattaag gtggtggagc cactgcagtg 2760
ttatcttaaa ataagaatat tttgttgaga tattccagaa tttgtttata tggctggtaa 2820
catgtaaaat ctatatcagc aaaagggtct acctttaaaa taagcaataa caaagaagaa 2880
aaccaaatta ttgttcaaat ttaggtttaa acttttgaag caaacttttt tttatccttg 2940
tgcactgcag gcctggtact cagattttgc tatgaggtta atgaagtacc aagctgtgct 3000
tgaataatga tatgttttct cagattttct gttgtacagt ttaatttagc agtccatatc 3060
acattgcaaa agtagcaatg acctcataaa atacctcttc aaaatgctta aattcatttc 3120
acacattaat tttatctcag tcttgaagcc aattcagtag gtgcattgga atcaagcctg 3180
gctacctgca tgctgttcct tttcttttct tcttttagcc attttgctaa gagacacagt 3240
cttctcatca cttcgtttct cctattttgt tttactagtt ttaagatcag agttcacttt 3300
ctttggactc tgcctatatt ttcttacctg aacttttgca agttttcagg taaacctcag 3360
ctcaggactg ctatttagct cctcttaaga agattaaaag agaaaaaaaa aggccctttt 3420
aaaaatagta tacacttatt ttaagtgaaa agcagagaat tttatttata gctaatttta 3480
gctatctgta accaagatgg atgcaaagag gctagtgcct cagagagaac tgtacggggt 3540
ttgtgactgg aaaaagttac gttcccattc taattaatgc cctttcttat ttaaaaacaa 3600
aaccaaatga tatctaagta gttctcagca ataataataa tgacgataat acttcttttc 3660
cacatctcat tgtcactgac atttaatggt actgtatatt acttaattta ttgaagatta 3720
ttatttatgt cttattagga cactatggtt ataaactgtg tttaagccta caatcattga 3780
tttttttttg ttatgtcaca atcagtatat tttctttggg gttacctctc tgaatattat 3840
gtaaacaatc caaagaaatg attgtattaa gatttgtgaa taaattttta gaaatctgat 3900
tggcatattg agatatttaa ggttgaatgt ttgtccttag gataggccta tgtgctagcc 3960
cacaaagaat attgtctcat tagcctgaat gtgccataag actgaccttt taaaatgttt 4020
tgagggatct gtggatgctt cgttaatttg ttcagccaca atttattgag aaaatattct 4080
gtgtcaagca ctgtgggttt taatattttt aaatcaaacg ctgattacag ataatagtat 4140
ttatataaat aattgaaaaa aattttcttt tgggaagagg gagaaaatga aataaatatc 4200
attaaagata actcaggaga atcttcttta caattttacg tttagaatgt ttaaggttaa 4260
gaaagaaata gtcaatatgc ttgtataaaa cactgttcac tgtttttttt aaaaaaaaaa 4320
cttgatttgt tattaacatt gatctgctga caaaacctgg gaatttgggt tgtgtatgcg 4380
aatgtttcag tgcctcagac aaatgtgtat ttaacttatg taaaagataa gtctggaaat 4440
aaatgtctgt ttatttttgt actatttaaa aattgacaga tcttttctga agaaaaaaaa 4500
aaaaaaa 4507
<210> 36
<211> 3875
<212> DNA
<213> Homo sapiens
<400> 36
agctgttctt ggctgacttc acatcaaaac tcctatactg acctgagaca gaggcagcag 60
tgatacccac ctgagagatc ctgtgtttga acaactgctt cccaaaacgg aaagtatttc 120
aagcctaaac ctttgggtga aaagaactct tgaagtcatg attgcttcac agtttctctc 180
agctctcact ttggtgcttc tcattaaaga gagtggagcc tggtcttaca acacctccac 240
ggaagctatg acttatgatg aggccagtgc ttattgtcag caaaggtaca cacacctggt 300
tgcaattcaa aacaaagaag agattgagta cctaaactcc atattgagct attcaccaag 360
ttattactgg attggaatca gaaaagtcaa caatgtgtgg gtctgggtag gaacccagaa 420
acctctgaca gaagaagcca agaactgggc tccaggtgaa cccaacaata ggcaaaaaga 480
tgaggactgc gtggagatct acatcaagag agaaaaagat gtgggcatgt ggaatgatga 540
gaggtgcagc aagaagaagc ttgccctatg ctacacagct gcctgtacca atacatcctg 600
cagtggccac ggtgaatgtg tagagaccat caataattac acttgcaagt gtgaccctgg 660
cttcagtgga ctcaagtgtg agcaaattgt gaactgtaca gccctggaat cccctgagca 720
tggaagcctg gtttgcagtc acccactggg aaacttcagc tacaattctt cctgctctat 780
cagctgtgat aggggttacc tgccaagcag catggagacc atgcagtgta tgtcctctgg 840
agaatggagt gctcctattc cagcctgcaa tgtggttgag tgtgatgctg tgacaaatcc 900
agccaatggg ttcgtggaat gtttccaaaa ccctggaagc ttcccatgga acacaacctg 960
tacatttgac tgtgaagaag gatttgaact aatgggagcc cagagccttc agtgtacctc 1020
atctgggaat tgggacaacg agaagccaac gtgtaaagct gtgacatgca gggccgtccg 1080
ccagcctcag aatggctctg tgaggtgcag ccattcccct gctggagagt tcaccttcaa 1140
atcatcctgc aacttcacct gtgaggaagg cttcatgttg cagggaccag cccaggttga 1200
atgcaccact caagggcagt ggacacagca aatcccagtt tgtgaagctt tccagtgcac 1260
agccttgtcc aaccccgagc gaggctacat gaattgtctt cctagtgctt ctggcagttt 1320
ccgttatggg tccagctgtg agttctcctg tgagcagggt tttgtgttga agggatccaa 1380
aaggctccaa tgtggcccca caggggagtg ggacaacgag aagcccacat gtgaagctgt 1440
gagatgcgat gctgtccacc agcccccgaa gggtttggtg aggtgtgctc attcccctat 1500
tggagaattc acctacaagt cctcttgtgc cttcagctgt gaggagggat ttgaattaca 1560
tggatcaact caacttgagt gcacatctca gggacaatgg acagaagagg ttccttcctg 1620
ccaagtggta aaatgttcaa gcctggcagt tccgggaaag atcaacatga gctgcagtgg 1680
ggagcccgtg tttggcactg tgtgcaagtt cgcctgtcct gaaggatgga cgctcaatgg 1740
ctctgcagct cggacatgtg gagccacagg acactggtct ggcctgctac ctacctgtga 1800
agctcccact gagtccaaca ttcccttggt agctggactt tctgctgctg gactctccct 1860
cctgacatta gcaccatttc tcctctggct tcggaaatgc ttacggaaag caaagaaatt 1920
tgttcctgcc agcagctgcc aaagccttga atcagatgga agctaccaaa agccttctta 1980
catcctttaa gttcaaaaga atcagaaaca ggtgcatctg gggaactaga gggatacact 2040
gaagttaaca gagacagata actctcctcg ggtctctggc ccttcttgcc tactatgcca 2100
gatgccttta tggctgaaac cgcaacaccc atcaccactt caatagatca aagtccagca 2160
ggcaaggacg gccttcaact gaaaagactc agtgttccct ttcctactct caggatcaag 2220
aaagtgttgg ctaatgaagg gaaaggatat tttcttccaa gcaaaggtga agagaccaag 2280
actctgaaat ctcagaattc cttttctaac tctcccttgc tcgctgtaaa atcttggcac 2340
agaaacacaa tattttgtgg ctttctttct tttgcccttc acagtgtttc gacagctgat 2400
tacacagttg ctgtcataag aatgaataat aattatccag agtttagagg aaaaaaatga 2460
ctaaaaatat tataacttaa aaaaatgaca gatgttgaat gcccacaggc aaatgcatgg 2520
agggttgtta atggtgcaaa tcctactgaa tgctctgtgc gagggttact atgcacaatt 2580
taatcacttt catccctatg ggattcagtg cttcttaaag agttcttaag gattgtgata 2640
tttttacttg cattgaatat attataatct tccatacttc ttcattcaat acaagtgtgg 2700
tagggactta aaaaacttgt aaatgctgtc aactatgata tggtaaaagt tacttattct 2760
agattacccc ctcattgttt attaacaaat tatgttacat ctgttttaaa tttatttcaa 2820
aaagggaaac tattgtcccc tagcaaggca tgatgttaac cagaataaag ttctgagtgt 2880
ttttactaca gttgtttttt gaaaacatgg tagaattgga gagtaaaaac tgaatggaag 2940
gtttgtatat tgtcagatat tttttcagaa atatgtggtt tccacgatga aaaacttcca 3000
tgaggccaaa cgttttgaac taataaaagc ataaatgcaa acacacaaag gtataatttt 3060
atgaatgtct ttgttggaaa agaatacaga aagatggatg tgctttgcat tcctacaaag 3120
atgtttgtca gatatgatat gtaaacataa ttcttgtata ttatggaaga ttttaaattc 3180
acaatagaaa ctcaccatgt aaaagagtca tctggtagat ttttaacgaa tgaagatgtc 3240
taatagttat tccctatttg ttttcttctg tatgttaggg tgctctggaa gagaggaatg 3300
cctgtgtgag caagcattta tgtttattta taagcagatt taacaattcc aaaggaatct 3360
ccagttttca gttgatcact ggcaatgaaa aattctcagt cagtaattgc caaagctgct 3420
ctagccttga ggagtgtgag aatcaaaact ctcctacact tccattaact tagcatgtgt 3480
tgaaaaaaaa gtttcagaga agttctggct gaacactggc aacaacaaag ccaacagtca 3540
aaacagagat gtgataagga tcagaacagc agaggttctt ttaaaggggc agaaaaactc 3600
tgggaaataa gagagaacaa ctactgtgat caggctatgt atggaataca gtgttatttt 3660
ctttgaaatt gtttaagtgt tgtaaatatt tatgtaaact gcattagaaa ttagctgtgt 3720
gaaataccag tgtggtttgt gtttgagttt tattgagaat tttaaattat aacttaaaat 3780
attttataat ttttaaagta tatatttatt taagcttatg tcagacctat ttgacataac 3840
actataaagg ttgacaataa atgtgcttat gttta 3875
<210> 37
<211> 1593
<212> DNA
<213> Homo sapiens
<400> 37
gcggtgccct tgcggcgcag ctggggtcgc ggccctgctc cccgcgcttt cttaaggccc 60
gcgggcggcg caggagcggc actcgtggct gtggtggctt cggcagcggc ttcagcagat 120
cggcggcatc agcggtagca ccagcactag cagcatgttg agccgggcag tgtgcggcac 180
cagcaggcag ctggctccgg ttttggggta tctgggctcc aggcagaagc acagcctccc 240
cgacctgccc tacgactacg gcgccctgga acctcacatc aacgcgcaga tcatgcagct 300
gcaccacagc aagcaccacg cggcctacgt gaacaacctg aacgtcaccg aggagaagta 360
ccaggaggcg ttggccaagg gagatgttac agcccagata gctcttcagc ctgcactgaa 420
gttcaatggt ggtggtcata tcaatcatag cattttctgg acaaacctca gccctaacgg 480
tggtggagaa cccaaagggg agttgctgga agccatcaaa cgtgactttg gttcctttga 540
caagtttaag gagaagctga cggctgcatc tgttggtgtc caaggctcag gttggggttg 600
gcttggtttc aataaggaac ggggacactt acaaattgct gcttgtccaa atcaggatcc 660
actgcaagga acaacaggcc ttattccact gctggggatt gatgtgtggg agcacgctta 720
ctaccttcag tataaaaatg tcaggcctga ttatctaaaa gctatttgga atgtaatcaa 780
ctgggagaat gtaactgaaa gatacatggc ttgcaaaaag taaaccacga tcgttatgct 840
gagtatgtta agctctttat gactgttttt gtagtggtat agagtactgc agaatacagt 900
aagctgctct attgtagcat ttcttgatgt tgcttagtca cttatttcat aaacaactta 960
atgttctgaa taatttctta ctaaacattt tgttattggg caagtgattg aaaatagtaa 1020
atgctttgtg tgattgaatc tgattggaca ttttcttcag agagctaaat tacaattgtc 1080
atttataaaa ccatcaaaaa tattccatcc atatactttg gggacttgta gggatgcctt 1140
tctagtccta ttctattgca gttatagaaa atctagtctt ttgccccagt tacttaaaaa 1200
taaaatatta acactttccc aagggaaaca ctcggctttc tatagaaaat tgcacttttt 1260
gtcgagtaat cctctgcagt gatacttctg gtagatgtca cccagtggtt tttgttaggt 1320
caaatgttcc tgtatagttt ttgcaaatag agctgtatac tgtttaaatg tagcaggtga 1380
actgaactgg ggtttgctca cctgcacagt aaaggcaaac ttcaacagca aaactgcaaa 1440
aaggtggttt ttgcagtagg agaaaggagg atgtttattt gcagggcgcc aagcaaggag 1500
aattgggcag ctcatgcttg agacccaatc tccatgatga cctacaagct agagtattta 1560
aaggcagtgg taaatttcag gaaagcagaa gtt 1593
<210> 38
<211> 4314
<212> DNA
<213> Homo sapiens
<400> 38
agacaggata ttcactgctg tggcaaggcc tgtagagagt ttcgaagtta ggaggactca 60
agacggtccc tccctggact tttctgaagg ggctcaaaag atgacacgcg ccagagctgg 120
aaggcgtcgc caattggtcc acttttccct cctccctttt tgcggatgag aaaactgagg 180
cccaggtttg ggatttccag agcccgggat ttcccggcaa cgcccgacaa ccacattccc 240
ccggctattc tgacccgccc cggttccggg acgctccctg ggagccgccg ccgagggcct 300
gctgggactc ccgggggacc ccgccgtcgg ggcagccccc acgcccggcg ccgcccgccg 360
ggaacggccg ccgctgttgc gcacttgcag gggagccggc gactgagggc gaggcaggga 420
gggagcaagc ggggctggga gggctgctgg cgcgggctcg cgcgctgtgt atggtctatc 480
gcaggcagct gacctttgag gaggaaatcg ctgctctccg ctccttcctg tagtaacagc 540
cgccgctgcc gccgccgcca ggaaccccgg ccgggagcga gagccgcggg gcgcagagcc 600
ggcccggctg ccggacggtg cggccccacc aggtgaacgg ccatggcggg ctggatccag 660
gcccagcagc tgcagggaga cgcgctgcgc cagatgcagg tgctgtacgg ccagcacttc 720
cccatcgagg tccggcacta cttggcccag tggattgaga gccagccatg ggatgccatt 780
gacttggaca atccccagga cagagcccaa gccacccagc tcctggaggg cctggtgcag 840
gagctgcaga agaaggcgga gcaccaggtg ggggaagatg ggtttttact gaagatcaag 900
ctggggcact acgccacgca gctccagaaa acatatgacc gctgccccct ggagctggtc 960
cgctgcatcc ggcacattct gtacaatgaa cagaggctgg tccgagaagc caacaattgc 1020
agctctccgg ctgggatcct ggttgacgcc atgtcccaga agcaccttca gatcaaccag 1080
acatttgagg agctgcgact ggtcacgcag gacacagaga atgagctgaa gaaactgcag 1140
cagactcagg agtacttcat catccagtac caggagagcc tgaggatcca agctcagttt 1200
gcccagctgg cccagctgag cccccaggag cgtctgagcc gggagacggc cctccagcag 1260
aagcaggtgt ctctggaggc ctggttgcag cgtgaggcac agacactgca gcagtaccgc 1320
gtggagctgg ccgagaagca ccagaagacc ctgcagctgc tgcggaagca gcagaccatc 1380
atcctggatg acgagctgat ccagtggaag cggcggcagc agctggccgg gaacggcggg 1440
ccccccgagg gcagcctgga cgtgctacag tcctggtgtg agaagttggc cgagatcatc 1500
tggcagaacc ggcagcagat ccgcagggct gagcacctct gccagcagct gcccatcccc 1560
ggcccagtgg aggagatgct ggccgaggtc aacgccacca tcacggacat tatctcagcc 1620
ctggtgacca gcacattcat cattgagaag cagcctcctc aggtcctgaa gacccagacc 1680
aagtttgcag ccaccgtacg cctgctggtg ggcgggaagc tgaacgtgca catgaatccc 1740
ccccaggtga aggccaccat catcagtgag cagcaggcca agtctctgct taaaaatgag 1800
aacacccgca acgagtgcag tggtgagatc ctgaacaact gctgcgtgat ggagtaccac 1860
caagccacgg gcaccctcag tgcccacttc aggaacatgt cactgaagag gatcaagcgt 1920
gctgaccggc ggggtgcaga gtccgtgaca gaggagaagt tcacagtcct gtttgagtct 1980
cagttcagtg ttggcagcaa tgagcttgtg ttccaggtga agactctgtc cctacctgtg 2040
gttgtcatcg tccacggcag ccaggaccac aatgccacgg ctactgtgct gtgggacaat 2100
gcctttgctg agccgggcag ggtgccattt gccgtgcctg acaaagtgct gtggccgcag 2160
ctgtgtgagg cgctcaacat gaaattcaag gccgaagtgc agagcaaccg gggcctgacc 2220
aaggagaacc tcgtgttcct ggcgcagaaa ctgttcaaca acagcagcag ccacctggag 2280
gactacagtg gcctgtccgt gtcctggtcc cagttcaaca gggagaactt gccgggctgg 2340
aactacacct tctggcagtg gtttgacggg gtgatggagg tgttgaagaa gcaccacaag 2400
ccccactgga atgatggggc catcctaggt tttgtgaata agcaacaggc ccacgacctg 2460
ctcatcaaca agcccgacgg gaccttcttg ttgcgcttta gtgactcaga aatcgggggc 2520
atcaccatcg cctggaagtt tgactccccg gaacgcaacc tgtggaacct gaaaccattc 2580
accacgcggg atttctccat caggtccctg gctgaccggc tgggggacct gagctatctc 2640
atctatgtgt ttcctgaccg ccccaaggat gaggtcttct ccaagtacta cactcctgtg 2700
ctggctaaag ctgttgatgg atatgtgaaa ccacagatca agcaagtggt ccctgagttt 2760
gtgaatgcat ctgcagatgc tgggggcagc agcgccacgt acatggacca ggccccctcc 2820
ccagctgtgt gcccccaggc tccctataac atgtacccac agaaccctga ccatgtactc 2880
gatcaggatg gagaattcga cctggatgag accatggatg tggccaggca cgtggaggaa 2940
ctcttacgcc gaccaatgga cagtcttgac tcccgcctct cgccccctgc cggtcttttc 3000
acctctgcca gaggctccct ctcatgaatg tttgaatccc acgcttctct ttggaaacaa 3060
tatgcaatgt gaagcggtcg tgttgtgagt ttagtaaggc tgtgtacact gacacctttg 3120
caggcatgca tgtgcttgtg tgtgtgtgtg tgtgtgtgtc cttgtgcatg agctacgcct 3180
gcctcccctg tgcagtcctg ggatgtggct gcagcagcgg tggcctcttt tcagatcatg 3240
gcatccaaga gtgcgccgag tctgtctctg tcatggtaga gaccgagcct ctgtcactgc 3300
aggcactcaa tgcagccaga cctattcctc ctgggcccct catctgctca gcagctattt 3360
gaatgagatg attcagaagg ggaggggaga caggtaacgt ctgtaagctg aagtttcact 3420
ccggagtgag aagctttgcc ctcctaagag agagagacag agagacagag agagagaaag 3480
agagagtgtg tgggtctatg taaatgcatc tgtcctcatg tgttgatgta accgattcat 3540
ctctcagaag ggaggctggg gttcattttc gagtagtatt ttatacttta gtgaacgtgg 3600
actccagact ctctgtgaac cctatgagag cgcgtctggg cccggccatg tccttagcac 3660
aggggggccg ccggtttgag tgagggtttc tgagctgctc tgaattagtc cttgcttggc 3720
tgcttggcct tgggcttcat tcaagtctat gatgctgttg cccacgtttc ccgggatata 3780
tattctctcc cctccgttgg gccccagcct tctttgcttg cctctctgtt tgtaaccttg 3840
tcgacaaaga ggtagaaaag attgggtcta ggatatggtg ggtggacagg ggccccggga 3900
cttggagggt tggtcctctt gcctcctgga aaaaacaaaa acaaaaaact gcagtgaaag 3960
acaagctgca aatcagccat gtgctgcgtg cctgtggaat ctggagtgag gggtaaaagc 4020
tgatctggtt tgactccgct ggaggtgggg cctggagcag gccttgcgct gttgcgtaac 4080
tggctgtgtt ctggtgaggc cttgctccca accccacacg ctcctccctc tgaggctgta 4140
ggactcgcag tcaggggcag ctgaccatgg aagattgaga gcccaaggtt taaacttctc 4200
tgaagggagg tggggatgag aagaggggtt tttttgtact ttgtacaaag accacacatt 4260
tgtgtaaaca gtgttttgga ataaaatatt tttttcataa aaaaaaaaaa aaaa 4314
<210> 39
<211> 1686
<212> DNA
<213> Homo sapiens
<400> 39
cagacgctcc ctcagcaagg acagcagagg accagctaag agggagagaa gcaactacag 60
accccccctg aaaacaaccc tcagacgcca catcccctga caagctgcca ggcaggttct 120
cttcctctca catactgacc cacggctcca ccctctctcc cctggaaagg acaccatgag 180
cactgaaagc atgatccggg acgtggagct ggccgaggag gcgctcccca agaagacagg 240
ggggccccag ggctccaggc ggtgcttgtt cctcagcctc ttctccttcc tgatcgtggc 300
aggcgccacc acgctcttct gcctgctgca ctttggagtg atcggccccc agagggaaga 360
gttccccagg gacctctctc taatcagccc tctggcccag gcagtcagat catcttctcg 420
aaccccgagt gacaagcctg tagcccatgt tgtagcaaac cctcaagctg aggggcagct 480
ccagtggctg aaccgccggg ccaatgccct cctggccaat ggcgtggagc tgagagataa 540
ccagctggtg gtgccatcag agggcctgta cctcatctac tcccaggtcc tcttcaaggg 600
ccaaggctgc ccctccaccc atgtgctcct cacccacacc atcagccgca tcgccgtctc 660
ctaccagacc aaggtcaacc tcctctctgc catcaagagc ccctgccaga gggagacccc 720
agagggggct gaggccaagc cctggtatga gcccatctat ctgggagggg tcttccagct 780
ggagaagggt gaccgactca gcgctgagat caatcggccc gactatctcg actttgccga 840
gtctgggcag gtctactttg ggatcattgc cctgtgagga ggacgaacat ccaaccttcc 900
caaacgcctc ccctgcccca atccctttat taccccctcc ttcagacacc ctcaacctct 960
tctggctcaa aaagagaatt gggggcttag ggtcggaacc caagcttaga actttaagca 1020
acaagaccac cacttcgaaa cctgggattc aggaatgtgt ggcctgcaca gtgaagtgct 1080
ggcaaccact aagaattcaa actggggcct ccagaactca ctggggccta cagctttgat 1140
ccctgacatc tggaatctgg agaccaggga gcctttggtt ctggccagaa tgctgcagga 1200
cttgagaaga cctcacctag aaattgacac aagtggacct taggccttcc tctctccaga 1260
tgtttccaga cttccttgag acacggagcc cagccctccc catggagcca gctccctcta 1320
tttatgtttg cacttgtgat tatttattat ttatttatta tttatttatt tacagatgaa 1380
tgtatttatt tgggagaccg gggtatcctg ggggacccaa tgtaggagct gccttggctc 1440
agacatgttt tccgtgaaaa cggagctgaa caataggctg ttcccatgta gccccctggc 1500
ctctgtgcct tcttttgatt atgtttttta aaatatttat ctgattaagt tgtctaaaca 1560
atgctgattt ggtgaccaac tgtcactcat tgctgagcct ctgctcccca ggggagttgt 1620
gtctgtaatc gccctactat tcagtggcga gaaataaagt ttgcttagaa aagaaaaaaa 1680
aaaaaa 1686
<210> 40
<211> 4180
<212> DNA
<213> Homo sapiens
<400> 40
ccagggtgat gctgaagatg atgaccttct tccaaggcct ctagagccat cagcctgtgc 60
caggcaccct cgacttgcct agaggccccc aaaagttgca gtccacatca gaggcagagt 120
cagaggcctc catgtcggag gcctcctctg aggacctggt gccacccctg gaggctgggg 180
cagccccata tagggaggag gaagaggcgg cgaagaagaa gaaggagaag aagaagaagt 240
ccaaaggcct ggccaatgtg ttctgcgtct tcaccaaagg gaagaagaag aagggtcagc 300
ccagctcagc ggagcccgag gacgcagccg ggtccaggca ggggctggat ggcccgcccc 360
ccacagtgga ggagctgaag gcggcgctgg agcgcgggca gctggaggcg gcgcggccgc 420
tgctggcgct ggagcgggag ctggcggcgg cggcggcggc gggcggtgtg agcgaggagg 480
agctggtgcg gcgccagagc aaggtggagg cgctgtacga gctgctgcgc gaccaggtgc 540
tgggcgtgct gcggcggccg ctggaggcgc cgcccgagcg gctgcgccag gcgctggccg 600
tggtggcgga gcaggagcgc gaggaccgcc aggcggcggc ggcggggccg gggacctcgg 660
ggctggcggc cacgcgcccg cggcgctggc tgcagctgtg gcggcgcggc gtggcggagg 720
cggccgagga gcgcatgggc cagcggccgg ccgcgggcgc cgaggtcccc gagagcgtct 780
ttctgcactt gggccgcacc atgaaggagg acctggaggc cgtggtggag cggctgaagc 840
cgctgttccc cgccgagttc ggcgtcgtgg cggcctacgc cgagagctac caccagcact 900
tcgcggccca cctggccgcc gtggcgcagt tcgagctgtg cgagcgcgac acctacatgc 960
tgctgctctg ggtgcagaac ctctacccca atgacatcat caacagcccc aagctggtgg 1020
gtgagctgca gggtatgggg ctcgggagcc tcctgccccc caggcagatc cgactgctgg 1080
aggccacatt cctgtccagt gaggcggcca atgtgaggga gttgatggac cgagctctgg 1140
agctagaggc acggcgctgg gctgaggatg tgcctcccca gaggctggac ggccactgcc 1200
acagcgagct ggccatcgac atcatccaga tcacctccca ggcccaggcc aaggccgaga 1260
gcatcacgct ggacttgggc tcacagataa agcgggtgct gctggtggag ctgcctgcgt 1320
tcctgaggag ctaccagcgc gcctttaatg aatttctgga gagaggcaag cagctgacga 1380
attacagggc caatgttatt gccaacatca acaactgcct gtccttccgg atgtccatgg 1440
agcagaattg gcaggtaccc caggacaccc tgagcctcct gctgggcccc ctgggtgagc 1500
tcaagagcca cggctttgac accctgctcc agaacctgca tgaggacctg aagccactgt 1560
tcaagaggtt cacgcacacc cgctgggcgg cccctgtgga gaccctggaa aacatcatcg 1620
ccactgtaga cacgaggctg cctgagttct cagagctgca gggctgtttc cgggaggagc 1680
tcatggaggc cttgcacctg cacctggtga aggagtacat catccaactc agcaaggggc 1740
gcctggtcct caagacggcc gagcagcagc agcagctggc tgggtacatc ctggccaatg 1800
ctgacaccat ccagcacttc tgcacccagc acggctcccc ggcgacctgg ctgcagcctg 1860
ctctccctac gctggccgag atcattcgcc tgcaggaccc cagtgccatc aagattgagg 1920
tggccactta tgccacctgc taccctgact tcagcaaagg ccacctgagc gctatcctgg 1980
ccatcaaggg gaacctatcc aacagtgagg tcaagcgcat ccggagcatc ttggacgtca 2040
gcatgggggc gcaggagccc tcccggcccc tattttccct tataaaggtt ggttagcttt 2100
tcctgtggcc tgacctgcct gtgagtgccc agcaagcctt gggcacaccc cgctgggagc 2160
tgttaagagc agcgctggtt ctcggttcct cccgggtctc ctgtgctctg atgctacttc 2220
tgcctagccc tggcggaggt gcaggccctg tcagctggaa ctggacagac cttggtttgt 2280
ttacatgtcc gatgggggca ggagctccca tcctgggcag ccaaccaggc aacaccaagg 2340
actctttgta aacgatagct gatcgtgtgc acgcaaggaa agaaccagga gggagagtgc 2400
agccaggctc agggatcccc ggacacctct gtccagagcc cctccacagt cggcctcatg 2460
actgtcctcc tcgtgggtgg ggccgagggc cctcttcagc tctctggaga caggggccga 2520
gcctcaccca tctgccctct gcagcccagg gccgccgtga gcgggattca gcaatggtgg 2580
aatggaagac agaactggaa gagaaagaag gaaaagatga gctctcgtct ggcaggggct 2640
tttagggtcc tgtggcgagc tgtgagcacc gccagcatta gacgtcacat ccaggtggcc 2700
ccacggcccc tacaggctgg ccctgcaatg gggccctgag ccctccctct tcatccccca 2760
aggcctcaac tagagggtgg tcccccgagg gcttggtgtc tactaccgaa gggcccaaga 2820
cctcctgggt cctctcaggc tcccccttcc ccaaggcagg gacaggccct gggggtgcca 2880
ccgtgggccc tgccacccag aagtctggct gaggtctggg caggggcagg gcaagcttga 2940
cctctcactg ttgacccttt ggcctctgta tttgtttcct attgccgtga caggtttcca 3000
caaacttcgt ggatcaaaac gaggtcttcc agttctgcgg gtcagaaggc tgacccgggg 3060
ctcaaatctg ggtgtcggca gtcctgcact ccttctggag gctctagggg agaattcatt 3120
tctggccttt tcatttttag aggctgaccg taattcttga cttcaggctc ctccatcttc 3180
agagccagct gtgggtagtt gaatcttttt cccgtcacct cattgaggcc tcccctctcc 3240
tgcctccctc caccactttt tttttttttt ttttgagaca gggtcttgct gtgttgccca 3300
ggctggagtg cagtggcctg gtcatggcat caaggctcac tgcagcctgg acctcctggt 3360
tcaagtgatc ctcttgtctc agtcccctga gacaatcccc cacgcccagc tacatatttt 3420
ttgtggatac agggtctcat tctgttgcct aggcttgtct ggaactcctg ggctcaaggg 3480
atcttgtagc cttagcctcc taaagtgctg ggattatagg catgagtcac tgtacccggc 3540
ctgctctacc gcttttaagg acgcttatga tcacattgcg cctacccaga gaacccaggt 3600
cgtctttcta ttttcaggtc agctgattag ccaccttagt tccatctgca actttagttc 3660
ccactggctg tgtaacctaa catagtcaca ggctctgggg actgtcacgt ggacatcttt 3720
gggaggccgt tattctgccc accgcaccct ccgttcatcc cctgccctgc cgggcacctc 3780
gctctacccc aggaaaatgt gagctcgttt tcctgctcgg catgtgctcc ccctaaggct 3840
ctgctcctcc ctgggcctga aagttccttc tcagcctgag agggggccct tcggactcag 3900
gcatgactca gcccggctga tgcctctgca gtgctgagtc aggatttggg gccggctctc 3960
ttgggtccgt ccccttttcc caggtactgc cttacaaagc tgtggccagg aagtggccgg 4020
tataaaggat gcccaaggtc tttgtacgtg tgtaggagtt agcgtgtttg atattgttaa 4080
tataataata attatttttt agagtactgc ttttgtatgt atgttgaaca ggatccaggt 4140
ttttatagct tgatataaaa cagaattcaa aagtgaaaaa 4180
<210> 41
<211> 2933
<212> DNA
<213> Homo sapiens
<400> 41
agtctccggg gactttccca ggggtggggc ggcccggcca ggcccccggc acttcctcgt 60
cctcggcccg ggtgccctgc ccccgtccag gagccctagg agtgctacgg ggggccggag 120
ccttgcccgg gccgctgccc cgtccctgga ttcggggctg gacgcagcaa gcggggcgct 180
gtgtccccaa gctccccgtc ctcggccagg cgggcaccac ggcaggggct gagctaccct 240
catggaaggg agaggaccgt accggatcta cgaccctggg ggcagcgtgc cctcaggaga 300
ggcatccgca gcttttgagc gcctagtgaa ggagaattcc cggctgaagg aaaaaatgca 360
agggataaag atgttagggg agcttttgga agagtcccag atggaagcga ccaggctccg 420
gcagaaggca gaggagctag tgaaggacaa cgagctgctc ccaccacctt ctccctcctt 480
gggctccttc gaccccctgg ctgagctcac aggaaaggac tcaaatgtca cagcatctcc 540
cacagcccct gcatgcccca gtgacaagcc agcaccagtc cagaagcctc catccagtgg 600
cacctcctct gaatttgaag tggtcactcc tgaggagcag aattcaccag agagcagcag 660
ccatgccaat gcgatggcgc tgggccccct gccccgtgag gacggcaacc tgatgctgca 720
cctgcagcgc ctggagacca cgctgagtgt gtgtgccgag gagccggacc acggccagct 780
cttcacccac ctgggccgca tggccctgga gttcaaccga ctggcatcca aggtgcacaa 840
gaatgagcag cgcacctcca ttctgcagac cctgtgtgag cagcttcgga aggagaacga 900
ggctctgaag gccaagttgg ataagggcct ggaacagcgg gatcaggctg ccgagaggct 960
gcgggaggaa aatttggagc tcaagaagtt gttgatgagc aatggcaaca aagagggtgc 1020
gtctgggcgg ccaggctcac cgaagatgga agggacaggc aagaaggcag tggctggaca 1080
gcagcaggct agtgtgacgg caggtaaggt cccagaggtg gtggccttgg gcgcagccga 1140
gaagaaggtg aagatgctgg agcagcagcg cagtgagctg ctggaagtga acaagcagtg 1200
ggaccagcat ttccggtcca tgaagcagca gtatgagcag aagatcactg agctgcgtca 1260
gaagctggct gatttgcaga agcaggtgac tgacctggag gccgagcggg agcagaagca 1320
gcgtgacttt gaccgcaagc tcctcctggc caagtccaag attgaaatgg aggagaccga 1380
caaggagcag ctgacagcag aggccaagga gctgcgccaa aaggtcaagt acctgcagga 1440
tcagctgagc ccactcaccc gacagcgtga gtaccaggaa aaggagatcc agcggctcaa 1500
caaggccctg gaggaagcac tgagcatcca aaccccgcca tcatctccac caacagcatt 1560
tgggagccca gaaggagcag gggccctcct aaggaaacag gagctggtca cgcagaatga 1620
gttgctgaaa cagcaggtga agatcttcga ggaggacttc cagagggagc gcagtgatcg 1680
tgagcgcatg aatgaggaga aggaagagct gaagaagcaa gtggagaagc tgcaggccca 1740
ggtcaccctg tcaaatgccc agctaaaagc attcaaagat gaggagaagg caagagaagc 1800
cctcagacag cagaagagga aagcaaaggc ctcaggagag cgttaccatg tggagcccca 1860
cccagaacat ctctgcgggg cctaccccta cgcctacccg cccatgccag ccatggtgcc 1920
acaccatggc ttcgaggact ggtcccagat ccgctacccc cctcccccca tggccatgga 1980
gcacccgccc ccactcccca actcgcgcct cttccatctg ccggaataca cctggcgtct 2040
accctgtgga ggggttcgaa atccaaatca gagctcccaa gtgatggacc ctcccacagc 2100
caggcctaca gaaccagagt ctccaaaaaa tgaccgtgag gggcctcagt gagaccagat 2160
tgtgtcattt ggctccacct tcatcttgca gagccagctg atctcagatt gccaagaaac 2220
tagaagccac ttgcacggtg tggccagagc ctcagctgga tgagaggctg agatgggtgg 2280
ccagcttgta caccagtccc tgaactgagc tgtttacagg actggggagg ctccacccag 2340
aaggctttca tttgtactct gctgggagtg actgggaaaa actccttccc tgctgctgag 2400
tggagagagg cctcatccgg ctttgaccca ccatccgttg cagaagcctc caggagcagc 2460
aatcctaaga gtgggaggca gccaagaccc ccttccttca aaacctcccg gaagtggttt 2520
caggccctct agttgccatg accaatttgt gtgtgtgttt aatttttgct tcaagctctg 2580
tagcaggacc tgccccacgc acacccctac ccctctgtga ggagctgtgg gaagtgtggg 2640
tttgtctcca gaacagaaga gaatgatgga tattctggct ctggggccct ctccaccacc 2700
actcacagta gccttgctga agccatcaca gatgggagaa ggccatgcca gccacgtccg 2760
ccgaggggcg ccagcctgaa gctgccaggc cctgaggttc agaccctgga ccccatagct 2820
ggaggcctgt ggtgccagaa gcccagatta gggtggctgt ccatccctgg atagctattt 2880
gcacgaatca tggacataaa tccaagttga agaagatcaa caaaaaaaaa aaa 2933
<210> 42
<211> 4450
<212> DNA
<213> Homo sapiens
<400> 42
gtaaagtggt ggcacctgac cggagcatgt agccacctat gacttcctga gtggggtggt 60
gtggggaggg ccctgaggag tctgtggtcc taggccttgg atttcatcag ggctttcctg 120
ctccttctgg ttgcctcagt cccaggcaag atagggtccc agaatcttac agcccagagg 180
ccacctgttg caacttcttt tttacagatg gagtcatcag aacccactgt gaggaagtga 240
cttctccttg aggtcaccca gacactccaa acagagcaga gcaaaagcgc ctagaacttg 300
aaattttgga cctgtctcca acaccctggg gatttccacc aggaagcctt cagtcaccat 360
ccaggggatt tttatcgcca caaagggtaa ttcctgctcc atccctgctg tgactcagct 420
gtgacgttga accacacaag ccagagagaa gaagataaag tcatcagagc tcctactcac 480
cagagagtga ggcccaggcc aggactccac aaggctggtc ccctgccctg gagcaactta 540
aacaggccct ctggccagcc tggaaccctg agatggcctc cagctcaggc agcagtcctc 600
gcccggcccc tgatgagaat gagtttccct ttgggtgccc tcccaccgtc tgccaggacc 660
caaaggagcc cagggctctc tgctgtgcag gctgtctctc tgagaacccg aggaatggcg 720
aggatcagat ctgccccaaa tgcagagggg aagacctcca gtctataagc ccaggaagcc 780
gtcttcgaac tcaggagaag gctcaccccg aggtggctga ggctggaatt gggtgcccct 840
ttgcaggtgt cggctgctcc ttcaagggaa gcccacagtc tgtgcaagag catgaggtca 900
cctcccagac ctcccaccta aacctgctgt tggggttcat gaaacagtgg aaggcccggc 960
tgggctgtgg cctggagtct gggcccatgg ccctggagca gaacctgtca gacctgcagc 1020
tgcaggcagc cgtggaagtg gcgggggacc tggaggtcga ttgctaccgg gcaccctgct 1080
ccgagagcca ggaggagctg gccctgcagc acttcatgaa ggagaagctt ctggctgagc 1140
tggaggggaa gctgcgtgtg tttgagaaca ttgttgctgt cctcaacaag gaggtggagg 1200
cctcccacct ggccctggcc acctctatcc accagagcca gctggaccgt gagcgcatcc 1260
tgagcttgga gcagagggtg gtggagcttc agcagaccct ggcccagaaa gaccaggccc 1320
tgggcaagct ggagcagagc ttgcgcctca tggaggaggc ctccttcgat ggcactttcc 1380
tgtggaagat caccaatgtc accaggcggt gccatgagtc ggcctgtggc aggaccgtca 1440
gcctcttctc cccagccttc tacactgcca agtatggcta caagttgtgc ctgcggctgt 1500
acctgaatgg agatggcact ggaaagagaa cccatctgtc gctcttcatc gtgatcatga 1560
gaggggagta tgatgcgctg ctgccgtggc ccttccggaa caaggtcacc ttcatgctgc 1620
tggaccagaa caaccgtgag cacgccattg acgccttccg gcctgaccta agctcagcgt 1680
ccttccagag gccccagagt gaaaccaacg tggccagtgg atgcccactc ttcttccccc 1740
tcagcaaact gcagtcaccc aagcacgcct acgtgaagga cgacacaatg ttcctcaagt 1800
gcattgtgga gaccagcact tagggtgggc ggggctcctg agggagctcc aactcagaag 1860
ggagctagcc agaggactgt gatgccctgc ccttggcacc caagacctca gggcacaaag 1920
atgggtgaag gctggcatga tccaagcaag actgaggggt cgacttcggg ctggccatct 1980
ggttaggatg gcaggacgtg ggctgggccc acaaaggcaa agggtccaga aggagacagg 2040
cagagctgct cccctctgca cggaccatgc gacactggga ggccagtgag ccactccggc 2100
cccgaatgtt gaggtggact ctcaccaaat gagaagaaaa tggaaccagg cttggaaccg 2160
taggacccaa gcagagaagc tctcgggcta ggaagatctc tgcagggccg ccagggagac 2220
ctggacacag gcctgctctc tttttctcca gggtcagaaa caggaccggg tggaagggat 2280
ggggtgccag tttgaatgca gtctgtccag gctcgtcatt ggaggtgaac aagcaaaccc 2340
agacggctcc actaggactt caaattgggg gttggatttg aagactttta agtttccttc 2400
cagcccagaa agtctctcat tctaggcctc ctggcccagg tgagtcctag agctacaggg 2460
gttctggaaa cattcaggag cttcctgtcc tcccagctcc tcactcacct tcagtaaccc 2520
ccactggact gacctggtcc acagggcacc tgccaccctg ggcctggcag ctcagcttcc 2580
ccaacacgca ggagcacacc cagcccccac atcctgtgcc tccatcagct aaacaccacg 2640
tcacttcatg caggtgaaac ccagtcactg tgagctccca ggtgcagcca gaggcacctc 2700
aagaagaaga ggggcataaa ctttcctctt cctgcctaga ggccccacct ttggtgcttt 2760
ccagaatccc gtaacacctg attaactgag gcatccactt ctttcagcag actgatcagg 2820
acctccaagc cactgagcaa tgtataaccc caaagaaata atttttagaa tctctttcga 2880
agttttccta aagtgtatgg tttgggagtt gtttgtactg agccaggttt gaaaaggcca 2940
ttgctgagtt tgaggtggtg ccaccagttt tgcaggtggc atcagaggct ggcatgctgg 3000
caggaacatc ccctcttagc cccagtcttc tcttttctat aatgagaccc accccagctt 3060
gcctccctcc ctggcttctc tgaccctcaa aggagatgcc acgcaggaca gactggagag 3120
agaagcctgg gcaatactgc ccgcctgtca tggcctggtg gtggcccaca cctatcttct 3180
caccttggag gccacaccca actttccaaa gaccctgaga caagtcagga ccctactact 3240
ctcccctgct gctttcctga cagcttactc ttcctccacc tgctgagcct gtgccagact 3300
ccattgctca aatcgttagg gttgcttcta taaaaatggg tcagtagccc ttcctgttct 3360
ctccagccca gcatacagga ggatcaaagg aggtgacgga gcatcgtggc acggcagtct 3420
caatgggtca gaaacccagg ccagctgggc tctaagcctg gcatcctgtc atgcttagtc 3480
cttcagctga agatcagagg aagcctctcc cttgccctct ccagctctag gggttttcag 3540
gaggcccaac tgcaataatg gagcactagt gcttttatgt gcaatggtgt cgtccatgca 3600
cgacagcagc aaacattctg gggctgcttt ttattgttcc cacggctgac agcgtggcag 3660
cggagactgt ggaggcagtg gagactgact tcttcctgct gacagctgga tgtcacacat 3720
gaaggtctgg cctagcgagt gatgggtcta ggccctgaaa ctgatgtcct agcaataacc 3780
tcttgatccc tactcaccga gtgttgagcc caagggggga tttgtagaac aagcccccat 3840
gagaaacagc tgttactcta cacttttgat tgcctatttc tgatggcaag agatacatac 3900
tctcttcaaa gagcatgaga tgcagccatt ctttcagcaa agcttcattg acacctgcac 3960
ctgttaactg tgttcgacat tgaagggaga aaggcaagat gtgcactctg gactcaagaa 4020
actcttagtt cagtggagga aatgagcaga taagtagatc attatgattg agagtaggag 4080
aagcttagag aaagcacaga atcccagatc cagctggtga aggagggaag gcttcaggcc 4140
tttaagctca gcctgagaat attgtgaaat gcagaggatg gggaaaaggg aagagtaccg 4200
acttgaaaac ggagagctgt cttggctgag ggcagggtct gtgtggcagg atgggggagg 4260
gagtcagaag ggtcagatga actggagtgt agacagcatc agatgcagca gtgcccaccg 4320
cccccccccc caccccccgc cctgcccaca gagcacctgc tggtaaccct gggcctattg 4380
aaagcaggat gagatgataa tttaagacac tgaatgattt cttttccaac aaagctctat 4440
gttaagtgca 4450
<210> 43
<211> 3220
<212> DNA
<213> Homo sapiens
<400> 43
aaactttttt ccctggctct gccctgggtt tccccttgaa gggatttccc tccgcctctg 60
caacaagacc ctttataaag cacagacttt ctatttcact ccgcggtatc tgcatcgggc 120
ctcactggct tcaggagctg aataccctcc caggcacaca caggtgggac acaaataagg 180
gttttggaac cactattttc tcatcacgac agcaacttaa aatgcctggg aagatggtcg 240
tgatccttgg agcctcaaat atactttgga taatgtttgc agcttctcaa gcttttaaaa 300
tcgagaccac cccagaatct agatatcttg ctcagattgg tgactccgtc tcattgactt 360
gcagcaccac aggctgtgag tccccatttt tctcttggag aacccagata gatagtccac 420
tgaatgggaa ggtgacgaat gaggggacca catctacgct gacaatgaat cctgttagtt 480
ttgggaacga acactcttac ctgtgcacag caacttgtga atctaggaaa ttggaaaaag 540
gaatccaggt ggagatctac tcttttccta aggatccaga gattcatttg agtggccctc 600
tggaggctgg gaagccgatc acagtcaagt gttcagttgc tgatgtatac ccatttgaca 660
ggctggagat agacttactg aaaggagatc atctcatgaa gagtcaggaa tttctggagg 720
atgcagacag gaagtccctg gaaaccaaga gtttggaagt aacctttact cctgtcattg 780
aggatattgg aaaagttctt gtttgccgag ctaaattaca cattgatgaa atggattctg 840
tgcccacagt aaggcaggct gtaaaagaat tgcaagtcta catatcaccc aagaatacag 900
ttatttctgt gaatccatcc acaaagctgc aagaaggtgg ctctgtgacc atgacctgtt 960
ccagcgaggg tctaccagct ccagagattt tctggagtaa gaaattagat aatgggaatc 1020
tacagcacct ttctggaaat gcaactctca ccttaattgc tatgaggatg gaagattctg 1080
gaatttatgt gtgtgaagga gttaatttga ttgggaaaaa cagaaaagag gtggaattaa 1140
ttgttcaaga gaaaccattt actgttgaga tctcccctgg accccggatt gctgctcaga 1200
ttggagactc agtcatgttg acatgtagtg tcatgggctg tgaatcccca tctttctcct 1260
ggagaaccca gatagacagc cctctgagcg ggaaggtgag gagtgagggg accaattcca 1320
cgctgaccct gagccctgtg agttttgaga acgaacactc ttatctgtgc acagtgactt 1380
gtggacataa gaaactggaa aagggaatcc aggtggagct ctactcattc cctagagatc 1440
cagaaatcga gatgagtggt ggcctcgtga atgggagctc tgtcactgta agctgcaagg 1500
ttcctagcgt gtaccccctt gaccggctgg agattgaatt acttaagggg gagactattc 1560
tggagaatat agagtttttg gaggatacgg atatgaaatc tctagagaac aaaagtttgg 1620
aaatgacctt catccctacc attgaagata ctggaaaagc tcttgtttgt caggctaagt 1680
tacatattga tgacatggaa ttcgaaccca aacaaaggca gagtacgcaa acactttatg 1740
tcaatgttgc ccccagagat acaaccgtct tggtcagccc ttcctccatc ctggaggaag 1800
gcagttctgt gaatatgaca tgcttgagcc agggctttcc tgctccgaaa atcctgtgga 1860
gcaggcagct ccctaacggg gagctacagc ctctttctga gaatgcaact ctcaccttaa 1920
tttctacaaa aatggaagat tctggggttt atttatgtga aggaattaac caggctggaa 1980
gaagcagaaa ggaagtggaa ttaattatcc aagttactcc aaaagacata aaacttacag 2040
cttttccttc tgagagtgtc aaagaaggag acactgtcat catctcttgt acatgtggaa 2100
atgttccaga aacatggata atcctgaaga aaaaagcgga gacaggagac acagtactaa 2160
aatctataga tggcgcctat accatccgaa aggcccagtt gaaggatgcg ggagtatatg 2220
aatgtgaatc taaaaacaaa gttggctcac aattaagaag tttaacactt gatgttcaag 2280
gaagagaaaa caacaaagac tatttttctc ctgagcttct cgtgctctat tttgcatcct 2340
ccttaataat acctgccatt ggaatgataa tttactttgc aagaaaagcc aacatgaagg 2400
ggtcatatag tcttgtagaa gcacagaagt caaaagtgta gctaatgctt gatatgttca 2460
actggagaca ctatttatct gtgcaaatcc ttgatactgc tcatcattcc ttgagaaaaa 2520
caatgagctg agaggcagac ttccctgaat gtattgaact tggaaagaaa tgcccatcta 2580
tgtcccttgc tgtgagcaag aagtcaaagt aaaacttgct gcctgaagaa cagtaactgc 2640
catcaagatg agagaactgg aggagttcct tgatctgtat atacaataac ataatttgta 2700
catatgtaaa ataaaattat gccatagcaa gattgcttaa aatagcaaca ctctatattt 2760
agattgttaa aataactagt gttgcttgga ctattataat ttaatgcatg ttaggaaaat 2820
ttcacattaa tatttgctga cagctgacct ttgtcatctt tcttctattt tattcccttt 2880
cacaaaattt tattcctata tagtttattg acaataattt caggttttgt aaagatgccg 2940
ggttttatat ttttatagac aaataataag caaagggagc actgggttga ctttcaggta 3000
ctaaatacct caacctatgg tataatggtt gactgggttt ctctgtatag tactggcatg 3060
gtacggagat gtttcacgaa gtttgttcat cagactcctg tgcaactttc ccaatgtggc 3120
ctaaaaatgc aacttctttt tattttcttt tgtaaatgtt taggtttttt tgtatagtaa 3180
agtgataatt tctggaatta gaaaaaaaaa aaaaaaaaaa 3220
<210> 44
<211> 20
<212> DNA
<213> Homo sapiens
<400> 44
ggcagccttc ctgatttctg 20
<210> 45
<211> 21
<212> DNA
<213> Homo sapiens
<400> 45
ggtggaaagg tttggagtat g 21
<210> 46
<211> 24
<212> DNA
<213> Homo sapiens
<400> 46
cagctctgtg tgaaggtgca gttt 24
<210> 47
<211> 22
<212> DNA
<213> Homo sapiens
<400> 47
ggagttcatc cgtcaagttc aa 22
<210> 48
<211> 19
<212> DNA
<213> Homo sapiens
<400> 48
tttcatctcc tgggctgtc 19
<210> 49
<211> 25
<212> DNA
<213> Homo sapiens
<400> 49
accctcatct acttgaacag ctgct 25
<210> 50
<211> 20
<212> DNA
<213> Homo sapiens
<400> 50
tgcagtaata ctggggaacc 20
<210> 51
<211> 21
<212> DNA
<213> Homo sapiens
<400> 51
gcttcgtcag aatcacgttg g 21
<210> 52
<211> 23
<212> DNA
<213> Homo sapiens
<400> 52
tgaccatcta cagctttccg gcg 23
<210> 53
<211> 19
<212> DNA
<213> Homo sapiens
<400> 53
cctccactcc atcctgaag 19
<210> 54
<211> 20
<212> DNA
<213> Homo sapiens
<400> 54
<210> 55
<211> 23
<212> DNA
<213> Homo sapiens
<400> 55
accaactaca atggccacac gtg 23
<210> 56
<211> 23
<212> DNA
<213> Homo sapiens
<400> 56
tttcaagaca gatcataagc gag 23
<210> 57
<211> 22
<212> DNA
<213> Homo sapiens
<400> 57
agccagagtt tcaccgtaaa ta 22
<210> 58
<211> 23
<212> DNA
<213> Homo sapiens
<400> 58
tgggccatgg ggtggactta aat 23
<210> 59
<211> 18
<212> DNA
<213> Homo sapiens
<400> 59
ccaaccgcga gaagatga 18
<210> 60
<211> 20
<212> DNA
<213> Homo sapiens
<400> 60
<210> 61
<211> 23
<212> DNA
<213> Homo sapiens
<400> 61
ccatgtacgt tgctatccag gct 23
<210> 62
<211> 19
<212> DNA
<213> Homo sapiens
<400> 62
agtcctgagt ccggatgaa 19
<210> 63
<211> 18
<212> DNA
<213> Homo sapiens
<400> 63
cctccctcag tcgtctct 18
<210> 64
<211> 24
<212> DNA
<213> Homo sapiens
<400> 64
tgacggaggg tggcatcaaa tacc 24
<210> 65
<211> 22
<212> DNA
<213> Homo sapiens
<400> 65
gccagcttgt cttcaatgaa at 22
<210> 66
<211> 21
<212> DNA
<213> Homo sapiens
<400> 66
caaagccagc ttctgttcaa g 21
<210> 67
<211> 24
<212> DNA
<213> Homo sapiens
<400> 67
atccaccatg agttggtagg cagc 24
<210> 68
<211> 23
<212> DNA
<213> Homo sapiens
<400> 68
gccaagaaga aagtgaacat cat 23
<210> 69
<211> 20
<212> DNA
<213> Homo sapiens
<400> 69
atagggattc cgggagtcat 20
<210> 70
<211> 24
<212> DNA
<213> Homo sapiens
<400> 70
tcagaacaac agcctgccac ctta 24
<210> 71
<211> 22
<212> DNA
<213> Homo sapiens
<400> 71
tgactccttc aacaccttct tc 22
<210> 72
<211> 18
<212> DNA
<213> Homo sapiens
<400> 72
tgccagtgcg aacttcat 18
<210> 73
<211> 24
<212> DNA
<213> Homo sapiens
<400> 73
ccgggctgtg tttgtagact tgga 24
<210> 74
<211> 17
<212> DNA
<213> Homo sapiens
<400> 74
agccacatca tccctgt 17
<210> 75
<211> 22
<212> DNA
<213> Homo sapiens
<400> 75
cgtagatgtt atgtctgctc at 22
<210> 76
<211> 22
<212> DNA
<213> Homo sapiens
<400> 76
tttagcagca tctgcaaccc gc 22
<210> 77
<211> 24
<212> DNA
<213> Homo sapiens
<400> 77
gaggatttgg aaagggtgtt tatt 24
<210> 78
<211> 21
<212> DNA
<213> Homo sapiens
<400> 78
acagagggct acaatgtgat g 21
<210> 79
<211> 26
<212> DNA
<213> Homo sapiens
<400> 79
acgtcttgct cgagatgtga tgaagg 26
<210> 80
<211> 18
<212> DNA
<213> Homo sapiens
<400> 80
taaaccctgc gtggcaat 18
<210> 81
<211> 27
<212> DNA
<213> Homo sapiens
<400> 81
acatttcgga taatcatcca atagttg 27
<210> 82
<211> 24
<212> DNA
<213> Homo sapiens
<400> 82
aagtagttgg acttccaggt cgcc 24
<210> 83
<211> 17
<212> DNA
<213> Homo sapiens
<400> 83
ccgtggcctt agctgtg 17
<210> 84
<211> 21
<212> DNA
<213> Homo sapiens
<400> 84
ctgctggatg acgtgagtaa a 21
<210> 85
<211> 24
<212> DNA
<213> Homo sapiens
<400> 85
tctctctttc tggcctggag gcta 24
<210> 86
<211> 25
<212> DNA
<213> Homo sapiens
<400> 86
aaatgttaac aaatgtggca attat 25
<210> 87
<211> 20
<212> DNA
<213> Homo sapiens
<400> 87
<210> 88
<211> 20
<212> DNA
<213> Homo sapiens
<400> 88
<210> 89
<211> 22
<212> DNA
<213> Homo sapiens
<400> 89
tgaaaactac ccctaaaagc ca 22
<210> 90
<211> 21
<212> DNA
<213> Homo sapiens
<400> 90
tatccaagac ccaggcatac t 21
<210> 91
<211> 21
<212> DNA
<213> Homo sapiens
<400> 91
tagattcggg caagtccacc a 21
<210> 92
<211> 20
<212> DNA
<213> Homo sapiens
<400> 92
<210> 93
<211> 20
<212> DNA
<213> Homo sapiens
<400> 93
<210> 94
<211> 20
<212> DNA
<213> Homo sapiens
<400> 94
Claims (20)
1. An apparatus for inferring activity of an NFkB cell signaling pathway based at least on measured expression levels of six or more target genes of the NFkB cell signaling pathway in a sample, wherein the apparatus comprises:
means for determining or receiving a level of an NFkB transcription factor (TF) element in the sample, the NFkB TF element controlling transcription of the six or more target genes of the NFkB cell signaling pathway, the determining being based at least in part on evaluating a mathematical model that relates expression levels of the six or more target genes of the NFkB cell signaling pathway to the level of the NFkB TF element, wherein the six or more target genes are selected from the group consisting of: CCL5, CXCL2, ICAM1, IL6, IL8, NFKBIA and TNFAIP2; and
a module that infers the activity of the NFkB cell signaling pathway based on the determined level of the NFkB TF element in the sample;
wherein the inference is performed by a digital processing device using the mathematical model.
2. The apparatus of claim 1, further comprising:
a module for determining whether the NFkB cell signaling pathway is operating abnormally based on the inferred activity of the NFkB cell signaling pathway.
3. The apparatus of claim 2, further comprising:
a module for recommending prescription of a drug that corrects the abnormal operation of the NFkB cell signaling pathway,
wherein the recommendation is made only if the NFkB cell signaling pathway is determined to be operating abnormally based on the inferred activity of the NFkB cell signaling pathway.
4. The apparatus of any one of claims 1-3, wherein the apparatus is used for at least one of the following activities:
performing a diagnosis based on the inferred activity of the NFkB cell signaling pathway;
performing a prognosis based on the inferred activity of the NFkB cell signaling pathway;
prescribing a drug based on the inferred activity of the NFkB cell signaling pathway;
predicting drug efficacy based on the inferred activity of the NFkB cell signaling pathway;
predicting side effects based on the inferred activity of the NFkB cell signaling pathway;
monitoring drug efficacy;
developing a drug;
developing an assay;
pathway research;
cancer staging;
enrolling in a clinical trial based on the inferred activity of the NFkB cell signaling pathway;
selecting a subsequent test to be performed; and
selecting a companion diagnostic test.
5. The apparatus of any one of claims 1-4, wherein the mathematical model is a probabilistic model based at least in part on conditional probabilities relating the NFkB TF element to the measured expression levels of the six or more target genes of the NFkB cell signaling pathway in the sample, or wherein the mathematical model is based at least in part on one or more linear combinations of the measured expression levels of the six or more target genes of the NFkB cell signaling pathway in the sample.
6. The apparatus of claim 5, wherein the mathematical model is a Bayesian network model.
7. An apparatus comprising a digital processor configured to perform a method comprising:
inferring activity of an NFkB cell signaling pathway based at least on measured expression levels of six or more target genes of the NFkB cell signaling pathway in a sample, wherein the inferring comprises:
determining a level of an NFkB transcription factor (TF) element in the sample, the NFkB TF element controlling transcription of the six or more target genes of the NFkB cell signaling pathway, the determining being based at least in part on evaluating a mathematical model that relates expression levels of the six or more target genes of the NFkB cell signaling pathway to the level of the NFkB TF element, wherein the six or more target genes are selected from the group consisting of: CCL5, CXCL2, ICAM1, IL6, IL8, NFKBIA and TNFAIP2; and
inferring the activity of the NFkB cell signaling pathway based on the determined level of the NFkB TF element in the sample;
wherein the inference is performed by a digital processing device using the mathematical model.
8. The apparatus of claim 7, wherein the method further comprises:
determining whether the NFkB cell signaling pathway is operating abnormally based on the inferred activity of the NFkB cell signaling pathway.
9. The apparatus of claim 8, wherein the method further comprises: recommending prescription of a drug that corrects the abnormal operation of the NFkB cell signaling pathway, wherein the recommendation is made only if the NFkB cell signaling pathway is determined to be operating abnormally based on the inferred activity of the NFkB cell signaling pathway.
10. The apparatus of any one of claims 7-9, wherein the mathematical model is a probabilistic model based at least in part on conditional probabilities relating the NFkB TF element to the measured expression levels of the six or more target genes of the NFkB cell signaling pathway in the sample, or wherein the mathematical model is based at least in part on one or more linear combinations of the measured expression levels of the six or more target genes of the NFkB cell signaling pathway in the sample.
11. The apparatus of claim 10, wherein the mathematical model is a Bayesian network model.
12. A non-transitory storage medium storing instructions executable by a digital processing device to perform a method comprising:
inferring activity of an NFkB cell signaling pathway based at least on measured expression levels of six or more target genes of the NFkB cell signaling pathway in a sample, wherein the inferring comprises:
determining a level of an NFkB transcription factor (TF) element in the sample, the NFkB TF element controlling transcription of the six or more target genes of the NFkB cell signaling pathway, the determining being based at least in part on evaluating a mathematical model that relates expression levels of the six or more target genes of the NFkB cell signaling pathway to the level of the NFkB TF element, wherein the six or more target genes are selected from the group consisting of: CCL5, CXCL2, ICAM1, IL6, IL8, NFKBIA and TNFAIP2; and
inferring the activity of the NFkB cell signaling pathway based on the determined level of the NFkB TF element in the sample;
wherein the inference is performed by a digital processing device using the mathematical model.
13. The non-transitory storage medium of claim 12, wherein the method further comprises:
determining whether the NFkB cell signaling pathway is operating abnormally based on the inferred activity of the NFkB cell signaling pathway.
14. The non-transitory storage medium of claim 13, wherein the method further comprises: recommending prescription of a drug that corrects the abnormal operation of the NFkB cell signaling pathway, wherein the recommendation is made only if the NFkB cell signaling pathway is determined to be operating abnormally based on the inferred activity of the NFkB cell signaling pathway.
15. The non-transitory storage medium of any one of claims 12-14, wherein the mathematical model is a probabilistic model based at least in part on conditional probabilities relating the NFkB TF element to the measured expression levels of the six or more target genes of the NFkB cell signaling pathway in the sample, or wherein the mathematical model is based at least in part on one or more linear combinations of the measured expression levels of the six or more target genes of the NFkB cell signaling pathway in the sample.
16. The non-transitory storage medium of claim 15, wherein the mathematical model is a Bayesian network model.
17. A kit for measuring the expression levels of six or more target genes of the NFkB cell signaling pathway in a sample, comprising:
polymerase chain reaction primers directed to the six or more NFkB target genes, and
probes directed to the six or more NFkB target genes,
wherein the six or more NFkB target genes are selected from the group consisting of: CCL5, CXCL2, ICAM1, IL6, IL8, NFKBIA and TNFAIP2.
18. The kit of claim 17, comprising the apparatus of any one of claims 1-11, the non-transitory storage medium of any one of claims 12-16, or a computer program comprising program code means for causing a digital processing device to perform the method as defined in any one of claims 7-11 when the computer program is run by the digital processing device.
19. A kit for measuring the expression levels of six or more target genes of the NFkB cell signaling pathway in a sample, comprising:
one or more components for determining the expression levels of the six or more target genes of the NFkB cell signaling pathway, and
the apparatus of any one of claims 1-11, the non-transitory storage medium of any one of claims 12-16, or a computer program comprising program code means for causing a digital processing device to perform the method as defined in any one of claims 7-11 when the computer program is run by the digital processing device,
wherein the six or more target genes of the NFkB cell signaling pathway are selected from the group consisting of: CCL5, CXCL2, ICAM1, IL6, IL8, NFKBIA and TNFAIP2.
20. The kit of claim 19, wherein the one or more components are selected from the group consisting of: microarray chip, antibodies, multiple probes, RNA sequencing and a set of primers.
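The two model families named in the claims (one or more linear combinations of target gene expression levels, and a probabilistic model built on conditional probabilities relating the TF element to target gene expression) can be illustrated with a minimal Python sketch. This is not the patented implementation: every weight, probability, prior and threshold is a hypothetical placeholder, the function names are invented for illustration, and the probabilistic variant assumes a simple two-layer network with binary states in which the target genes are conditionally independent given the TF element.

```python
from math import prod

# The seven target genes named in the claims; a model uses six or more.
TARGET_GENES = ["CCL5", "CXCL2", "ICAM1", "IL6", "IL8", "NFKBIA", "TNFAIP2"]


def linear_score(expression, weights):
    """Weighted linear combination of measured expression levels.

    `expression` and `weights` map gene symbol -> float. The weights are
    hypothetical; in practice they would come from calibration samples.
    """
    return sum(weights[g] * expression[g] for g in weights)


def tf_posterior(is_high, p_high_active, p_high_inactive, prior_active=0.5):
    """Posterior probability that the TF element is active.

    Assumes binary states: the TF element is active/inactive and each
    target gene is observed as high (True) or low (False). Genes are
    treated as conditionally independent given the TF state.
    """
    like_active = prod(
        p_high_active[g] if is_high[g] else 1.0 - p_high_active[g]
        for g in is_high
    )
    like_inactive = prod(
        p_high_inactive[g] if is_high[g] else 1.0 - p_high_inactive[g]
        for g in is_high
    )
    numerator = like_active * prior_active
    return numerator / (numerator + like_inactive * (1.0 - prior_active))


def infer_activity(posterior, threshold=0.5):
    """Map the TF posterior to a pathway call (threshold is hypothetical)."""
    return "active" if posterior > threshold else "passive"
```

For example, with six of the genes observed as high and placeholder conditional probabilities of 0.8 (TF active) and 0.2 (TF inactive), `infer_activity(tf_posterior(...))` returns `"active"`; with all six observed as low it returns `"passive"`.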
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15181029.8 | 2015-08-14 | ||
EP15181029 | 2015-08-14 | ||
PCT/EP2016/069237 WO2017029215A1 (en) | 2015-08-14 | 2016-08-12 | Assessment of nfkb cellular signaling pathway activity using mathematical modelling of target gene expression |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108138237A (en) | 2018-06-08 |
CN108138237B (en) | 2022-04-05 |
Family
ID=53938129
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201680056337.1A Expired - Fee Related CN108138237B (en) | 2015-08-14 | 2016-08-12 | Assessment of NFkB cell signaling pathway activity using mathematical modeling of target gene expression |
Country Status (10)
Country | Link |
---|---|
US (2) | US11450409B2 (en) |
EP (1) | EP3334837B1 (en) |
JP (1) | JP7028763B2 (en) |
CN (1) | CN108138237B (en) |
AU (1) | AU2016309659A1 (en) |
BR (1) | BR112018002848A2 (en) |
CA (1) | CA2995520A1 (en) |
DK (1) | DK3334837T3 (en) |
ES (1) | ES2861400T3 (en) |
WO (1) | WO2017029215A1 (en) |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3431582A1 (en) | 2017-07-18 | 2019-01-23 | Koninklijke Philips N.V. | Cell culturing materials |
EP3462348A1 (en) | 2017-09-28 | 2019-04-03 | Koninklijke Philips N.V. | Bayesian inference |
US20200240979A1 (en) * | 2017-10-02 | 2020-07-30 | Koninklijke Philips N.V. | Determining functional status of immune cells types and immune response |
EP3462349A1 (en) * | 2017-10-02 | 2019-04-03 | Koninklijke Philips N.V. | Assessment of notch cellular signaling pathway activity using mathematical modelling of target gene expression |
EP3867650A1 (en) * | 2018-10-18 | 2021-08-25 | Fresenius Medical Care Holdings, Inc. | Techniques for modeling parathyroid gland functionality and calcimimetic drug activity |
CN113785075A (en) * | 2019-05-03 | 2021-12-10 | 皇家飞利浦有限公司 | Method for prognosis of high-grade serous ovarian cancer |
EP3812474A1 (en) | 2019-10-22 | 2021-04-28 | Koninklijke Philips N.V. | Methods of prognosis in high-grade serous ovarian cancer |
EP3739588A1 (en) | 2019-05-13 | 2020-11-18 | Koninklijke Philips N.V. | Assessment of multiple signaling pathway activity score in airway epithelial cells to predict airway epithelial abnormality and airway cancer risk |
EP3758005A1 (en) | 2019-06-24 | 2020-12-30 | Koninklijke Philips N.V. | Identification of the cellular function of an active nfkb pathway |
CN110779967B (en) * | 2019-09-18 | 2022-06-17 | 南京农业大学 | NF-kB electrochemical detection method based on traditional glassy carbon electrode |
EP3882363A1 (en) | 2020-03-17 | 2021-09-22 | Koninklijke Philips N.V. | Prognostic pathways for high risk sepsis patients |
WO2021209567A1 (en) | 2020-04-16 | 2021-10-21 | Koninklijke Philips N.V. | Prognostic pathways for viral infections |
EP3978628A1 (en) | 2020-10-01 | 2022-04-06 | Koninklijke Philips N.V. | Prognostic pathways for viral infections |
CN111606976A (en) * | 2020-05-26 | 2020-09-01 | 中国人民解放军军事科学院军事医学研究院 | Small peptide and application thereof in inhibiting opiate addiction and tolerance |
EP3940704A1 (en) | 2020-07-14 | 2022-01-19 | Koninklijke Philips N.V. | Method for determining the differentiation state of a stem cell |
EP3960875A1 (en) | 2020-08-28 | 2022-03-02 | Koninklijke Philips N.V. | Pcr method and kit for determining pathway activity |
EP3965119A1 (en) | 2020-09-04 | 2022-03-09 | Koninklijke Philips N.V. | Methods for estimating heterogeneity of a tumour based on values for two or more genome mutation and/or gene expression related parameter, as well as corresponding devices |
EP3974540A1 (en) | 2020-09-25 | 2022-03-30 | Koninklijke Philips N.V. | Method for predicting immunotherapy resistance |
EP4015651A1 (en) | 2020-12-17 | 2022-06-22 | Koninklijke Philips N.V. | Treatment prediction and effectiveness of anti-tnf alpha treatment in ibd patients |
EP4039825A1 (en) | 2021-02-09 | 2022-08-10 | Koninklijke Philips N.V. | Comparison and standardization of cell and tissue culture |
EP4305206A1 (en) | 2021-03-11 | 2024-01-17 | Koninklijke Philips N.V. | Prognostic pathways for high risk sepsis patients |
TW202340480A (en) * | 2021-11-23 | 2023-10-16 | 義大利商義大利藥品股份有限公司 | Method to detect rna biomarkers |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103649337A (en) * | 2011-07-19 | 2014-03-19 | 皇家飞利浦有限公司 | Assessment of cell signaling pathway activity using probabilistic modeling of target gene expression |
WO2014102668A3 (en) * | 2012-12-26 | 2014-08-21 | Koninklijke Philips N.V. | Assessment of cellular signaling pathway activity using linear combination(s) of target gene expressions |
Family Cites Families (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6004761A (en) | 1986-11-19 | 1999-12-21 | Sanofi | Method for detecting cancer using monoclonal antibodies to new mucin epitopes |
US5436134A (en) | 1993-04-13 | 1995-07-25 | Molecular Probes, Inc. | Cyclic-substituted unsymmetrical cyanine dyes |
US5658751A (en) | 1993-04-13 | 1997-08-19 | Molecular Probes, Inc. | Substituted unsymmetrical cyanine dyes with selected permeability |
US6720149B1 (en) | 1995-06-07 | 2004-04-13 | Affymetrix, Inc. | Methods for concurrently processing multiple biological chip assays |
US5545531A (en) | 1995-06-07 | 1996-08-13 | Affymax Technologies N.V. | Methods for making a device for concurrently processing multiple biological chip assays |
US6146897A (en) | 1995-11-13 | 2000-11-14 | Bio-Rad Laboratories | Method for the detection of cellular abnormalities using Fourier transform infrared spectroscopy |
US6391550B1 (en) | 1996-09-19 | 2002-05-21 | Affymetrix, Inc. | Identification of molecular sequence signatures and methods involving the same |
NZ516848A (en) | 1997-06-20 | 2004-03-26 | Ciphergen Biosystems Inc | Retentate chromatography apparatus with applications in biology and medicine |
JP2001511550A (en) | 1997-07-25 | 2001-08-14 | アフィメトリックス インコーポレイテッド | Method and system for providing a probe array chip design database |
US6953662B2 (en) | 1997-08-29 | 2005-10-11 | Human Genome Sciences, Inc. | Follistatin-3 |
US6020135A (en) | 1998-03-27 | 2000-02-01 | Affymetrix, Inc. | P53-regulated genes |
US6884578B2 (en) | 2000-03-31 | 2005-04-26 | Affymetrix, Inc. | Genes differentially expressed in secretory versus proliferative endometrium |
CN1262337C (en) | 2000-11-16 | 2006-07-05 | 赛弗根生物系统股份有限公司 | Method for analyzing mass spectra |
CN1636068A (en) | 2001-02-16 | 2005-07-06 | 赛弗根生物系统股份有限公司 | Method for correlating gene expression profiles with protein expression profiles |
EP2258872B1 (en) | 2002-03-13 | 2013-08-14 | Genomic Health, Inc. | Gene expression profiling in biopsied tumor tissues |
US7097976B2 (en) | 2002-06-17 | 2006-08-29 | Affymetrix, Inc. | Methods of analysis of allelic imbalance |
AU2003295598B2 (en) | 2002-11-15 | 2009-12-24 | Genomic Health, Inc. | Gene expression profiling of EGFR positive cancer |
US20040231909A1 (en) | 2003-01-15 | 2004-11-25 | Tai-Yang Luh | Motorized vehicle having forward and backward differential structure |
US20040180341A1 (en) * | 2003-03-07 | 2004-09-16 | Brown University | Transcriptional regulation of kinase inhibitors |
CA3061769C (en) | 2003-06-24 | 2021-10-26 | Genomic Health, Inc. | Methods of predicting the likelihood of long-term survival of a human patient with node-negative, estrogen receptor (er) positive, invasive ductal breast cancer without the recurrence of breast cancer |
EP3330875B1 (en) | 2003-07-10 | 2021-12-01 | Genomic Health, Inc. | Expression profile algorithm and test for prognosing breast cancer recurrence |
US7930104B2 (en) | 2004-11-05 | 2011-04-19 | Genomic Health, Inc. | Predicting response to chemotherapy using gene expression markers |
US7754861B2 (en) | 2005-03-23 | 2010-07-13 | Bio-Rad Laboratories, Inc. | Method for purifying proteins |
US20060234911A1 (en) | 2005-03-24 | 2006-10-19 | Hoffmann F M | Method of reversing epithelial mesenchymal transition |
US20090186024A1 (en) | 2005-05-13 | 2009-07-23 | Nevins Joseph R | Gene Expression Signatures for Oncogenic Pathway Deregulation |
KR100806274B1 (en) | 2005-12-06 | 2008-02-22 | 한국전자통신연구원 | Adaptive Execution Method for Multithreaded Processor Based Parallel Systems |
NZ593227A (en) | 2006-01-11 | 2012-10-26 | Genomic Health Inc | Gene expression markers (MYBL2) for colorectal cancer prognosis |
WO2007123772A2 (en) | 2006-03-31 | 2007-11-01 | Genomic Health, Inc. | Genes involved in estrogen metabolism |
WO2007115582A1 (en) | 2006-04-11 | 2007-10-18 | Bio-Rad Pasteur | Hpv detection and quantification by real-time multiplex amplification |
EP2162552A4 (en) | 2007-05-11 | 2010-06-30 | Univ Johns Hopkins | Biomarkers for melanoma |
WO2008157459A2 (en) | 2007-06-14 | 2008-12-24 | The Regents Of The University Of Michigan | Methods and systems for identifying molecular pathway elements |
AU2008304158A1 (en) | 2007-09-28 | 2009-04-02 | Duke University | Individualized cancer treatments |
US7816084B2 (en) | 2007-11-30 | 2010-10-19 | Applied Genomics, Inc. | TLE3 as a marker for chemotherapy |
US8067178B2 (en) | 2008-03-14 | 2011-11-29 | Genomic Health, Inc. | Gene expression markers for prediction of patient response to chemotherapy |
US20110053804A1 (en) | 2008-04-03 | 2011-03-03 | Sloan-Kettering Institute For Cancer Research | Gene Signatures for the Prognosis of Cancer |
CA2730277A1 (en) | 2008-07-08 | 2010-01-14 | Source Precision Medicine, Inc. D/B/A Source Mdx | Gene expression profiling for predicting the survivability of prostate cancer subjects |
EP3831954A3 (en) | 2008-11-17 | 2021-10-13 | Veracyte, Inc. | Methods and compositions of molecular profiling for disease diagnostics |
US8765383B2 (en) | 2009-04-07 | 2014-07-01 | Genomic Health, Inc. | Methods of predicting cancer risk using gene expression in premalignant tissue |
US8911940B2 (en) | 2009-07-31 | 2014-12-16 | The Translational Genomics Research Institute | Methods of assessing a risk of cancer progression |
US8451450B2 (en) | 2009-09-14 | 2013-05-28 | Bio-Rad Laboratories, Inc. | Near real time optical phase conjugation |
WO2011146619A2 (en) | 2010-05-19 | 2011-11-24 | The Regents Of The University Of California | Systems and methods for identifying drug targets using biological networks |
US8703736B2 (en) | 2011-04-04 | 2014-04-22 | The Translational Genomics Research Institute | Therapeutic target for pancreatic cancer cells |
WO2012154567A2 (en) | 2011-05-06 | 2012-11-15 | Albert Einstein College Of Medicine Of Yeshiva University | Human invasion signature for prognosis of metastatic risk |
AU2012275500A1 (en) | 2011-06-27 | 2014-01-16 | Dana-Farber Cancer Institute, Inc. | Signatures and determinants associated with prostate cancer progression and methods of use thereof |
WO2013075059A1 (en) | 2011-11-18 | 2013-05-23 | Vanderbilt University | Markers of triple-negative breast cancer and uses thereof |
US8725426B2 (en) | 2012-01-31 | 2014-05-13 | Genomic Health, Inc. | Gene expression profile algorithm and test for determining prognosis of prostate cancer |
US11309059B2 (en) | 2013-04-26 | 2022-04-19 | Koninklijke Philips N.V. | Medical prognosis and prediction of treatment response using multiple cellular signalling pathway activities |
ES2613521T3 (en) | 2014-01-03 | 2017-05-24 | Koninklijke Philips N.V. | Evaluation of the activity of the PI3K cell signaling path using a mathematical modeling of target gene expression |
2016
- 2016-08-12 WO PCT/EP2016/069237 patent/WO2017029215A1/en active Application Filing
- 2016-08-12 US US15/235,478 patent/US11450409B2/en active Active
- 2016-08-12 CN CN201680056337.1A patent/CN108138237B/en not_active Expired - Fee Related
- 2016-08-12 DK DK16753888.3T patent/DK3334837T3/en active
- 2016-08-12 AU AU2016309659A patent/AU2016309659A1/en not_active Abandoned
- 2016-08-12 ES ES16753888T patent/ES2861400T3/en active Active
- 2016-08-12 EP EP16753888.3A patent/EP3334837B1/en active Active
- 2016-08-12 BR BR112018002848A patent/BR112018002848A2/en not_active IP Right Cessation
- 2016-08-12 CA CA2995520A patent/CA2995520A1/en not_active Abandoned
- 2016-08-12 JP JP2018507571A patent/JP7028763B2/en active Active
2022
- 2022-09-14 US US17/931,919 patent/US20230260595A1/en active Pending
Non-Patent Citations (1)
Title |
---|
Subset of genes targeted by transcription factor NF-κB in TNFα-stimulated human HeLa cells; Yujun Xing et al.; Funct Integr Genomics; 2012-12-18; Vol. 13, No. 1; Figures 1, 3, 4 * |
Also Published As
Publication number | Publication date |
---|---|
US20170046477A1 (en) | 2017-02-16 |
BR112018002848A2 (en) | 2018-11-06 |
US20230260595A1 (en) | 2023-08-17 |
CA2995520A1 (en) | 2017-02-23 |
DK3334837T3 (en) | 2021-02-15 |
ES2861400T3 (en) | 2021-10-06 |
CN108138237A (en) | 2018-06-08 |
AU2016309659A1 (en) | 2018-04-12 |
JP2018522575A (en) | 2018-08-16 |
US11450409B2 (en) | 2022-09-20 |
EP3334837A1 (en) | 2018-06-20 |
WO2017029215A1 (en) | 2017-02-23 |
JP7028763B2 (en) | 2022-03-02 |
EP3334837B1 (en) | 2020-12-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108138237B (en) | Assessment of NFkB cell signaling pathway activity using mathematical modeling of target gene expression | |
AU2019201577B2 (en) | Cancer diagnostics using biomarkers | |
CN110382521B (en) | Method for differentiating tumor-inhibiting FOXO activity from oxidative stress | |
CN107077536B (en) | Evaluation of activity of TGF-beta cell signaling pathway using mathematical modeling of target gene expression | |
KR102023584B1 (en) | PREDICTING GASTROENTEROPANCREATIC NEUROENDOCRINE NEOPLASMS (GEP-NENs) | |
RU2721130C2 (en) | Assessment of activity of cell signaling pathways using a linear combination(s) of target gene expression | |
RU2719194C2 (en) | Assessing activity of cell signaling pathways using probabilistic modeling of expression of target genes | |
CN112795650A (en) | Evaluation of PI3K cell signaling pathway activity using mathematical modeling of target gene expression | |
CN111448325A (en) | Assessment of JAK-STAT3 cell signaling pathway activity using mathematical modeling of target gene expression | |
KR102103886B1 (en) | A method for assessing risk of hepatocellular carcinoma using cpg methylation status of gene | |
CN111183233A (en) | Assessment of Notch cell signaling pathway activity using mathematical modeling of target gene expression | |
KR20140044341A (en) | Molecular diagnostic test for cancer | |
KR101421326B1 (en) | Composition for predicting prognosis of breast cancer and kit comprising the same | |
KR20150090246A (en) | Molecular diagnostic test for cancer | |
US20220389519A1 (en) | Biomarkers predictive of anti-immune checkpoint response | |
AU2018210695A1 (en) | Molecular subtyping, prognosis, and treatment of bladder cancer | |
KR20160018525A (en) | Methods and compositions for reducing immunosupression by tumor cells | |
KR20160117606A (en) | Molecular diagnostic test for predicting response to anti-angiogenic drugs and prognosis of cancer | |
CN101573453A (en) | Methods of predicting distant metastasis of lymph node-negative primary breast cancer using biological pathway gene expression analysis | |
KR20080011287A (en) | Methods and nucleic acids for analyses of cellular proliferative disorders | |
KR20140140069A (en) | Compositions and methods for diagnosis and treatment of pervasive developmental disorder | |
CN111479933A (en) | Assessment of JAK-STAT1/2 cell signaling pathway activity using mathematical modeling of target gene expression | |
TW201013187A (en) | Molecular markers for lung and colorectal carcinomas | |
DK1939287T3 (en) | Method of gene transfer specific to a trophectodermal cell | |
KR20190126812A (en) | Biomarkers for Disease Diagnosis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | | Granted publication date: 20220405 |