CN108124453B - Cas9 retroviral integrase and Cas9 recombinase systems for targeted incorporation of a DNA sequence into the genome of a cell or organism - Google Patents
- Publication number
- CN108124453B (application number CN201680031466.5A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- dna
- protein
- cas9
- seq
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 108091028043 Nucleic acid sequence Proteins 0.000 title claims abstract description 100
- 238000010348 incorporation Methods 0.000 title claims description 7
- 108010061833 Integrases Proteins 0.000 title abstract description 113
- 102100034343 Integrase Human genes 0.000 title abstract description 100
- 108091033409 CRISPR Proteins 0.000 title abstract description 84
- 102000018120 Recombinases Human genes 0.000 title abstract description 44
- 108010091086 Recombinases Proteins 0.000 title abstract description 44
- 230000001177 retroviral effect Effects 0.000 title description 10
- 108020004414 DNA Proteins 0.000 claims abstract description 12
- 108020005004 Guide RNA Proteins 0.000 claims description 51
- 238000000034 method Methods 0.000 claims description 47
- 239000013598 vector Substances 0.000 claims description 39
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 14
- 238000001890 transfection Methods 0.000 claims description 14
- 238000013518 transcription Methods 0.000 claims description 13
- 230000035897 transcription Effects 0.000 claims description 13
- 239000000203 mixture Substances 0.000 claims description 12
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 claims description 9
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 claims description 9
- 238000004520 electroporation Methods 0.000 claims description 3
- 239000013603 viral vector Substances 0.000 claims description 3
- 210000004602 germ cell Anatomy 0.000 claims 3
- 238000002560 therapeutic procedure Methods 0.000 claims 2
- 208000036142 Viral infection Diseases 0.000 claims 1
- 238000002347 injection Methods 0.000 claims 1
- 239000007924 injection Substances 0.000 claims 1
- 210000001161 mammalian embryo Anatomy 0.000 claims 1
- 230000009385 viral infection Effects 0.000 claims 1
- 108090000623 proteins and genes Proteins 0.000 abstract description 177
- 102000004169 proteins and genes Human genes 0.000 abstract description 119
- 101710185494 Zinc finger protein Proteins 0.000 abstract description 26
- 102100023597 Zinc finger protein 816 Human genes 0.000 abstract description 26
- 230000003612 virological effect Effects 0.000 abstract description 25
- 102000012330 Integrases Human genes 0.000 abstract description 20
- 102000008579 Transposases Human genes 0.000 abstract description 10
- 108010020764 Transposases Proteins 0.000 abstract description 10
- 108091032973 (ribonucleotides)n+m Proteins 0.000 abstract description 4
- 238000002744 homologous recombination Methods 0.000 abstract description 3
- 230000006801 homologous recombination Effects 0.000 abstract description 3
- 230000001225 therapeutic effect Effects 0.000 abstract description 2
- 235000018102 proteins Nutrition 0.000 description 115
- 210000004027 cell Anatomy 0.000 description 90
- 150000007523 nucleic acids Chemical class 0.000 description 65
- 102000039446 nucleic acids Human genes 0.000 description 50
- 108020004707 nucleic acids Proteins 0.000 description 50
- 108020001507 fusion proteins Proteins 0.000 description 46
- 102000037865 fusion proteins Human genes 0.000 description 46
- 102000040430 polynucleotide Human genes 0.000 description 43
- 108091033319 polynucleotide Proteins 0.000 description 43
- 239000002157 polynucleotide Substances 0.000 description 43
- 108090000765 processed proteins & peptides Proteins 0.000 description 39
- 235000001014 amino acid Nutrition 0.000 description 38
- 229940024606 amino acid Drugs 0.000 description 37
- 150000001413 amino acids Chemical class 0.000 description 31
- 125000003729 nucleotide group Chemical group 0.000 description 29
- 239000002773 nucleotide Substances 0.000 description 27
- 101000588302 Homo sapiens Nuclear factor erythroid 2-related factor 2 Proteins 0.000 description 24
- 102100031701 Nuclear factor erythroid 2-related factor 2 Human genes 0.000 description 24
- 102000004196 processed proteins & peptides Human genes 0.000 description 24
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 22
- 238000010459 TALEN Methods 0.000 description 21
- 230000014509 gene expression Effects 0.000 description 21
- 230000010354 integration Effects 0.000 description 21
- 238000003752 polymerase chain reaction Methods 0.000 description 21
- 238000003780 insertion Methods 0.000 description 19
- 230000037431 insertion Effects 0.000 description 19
- 108020004705 Codon Proteins 0.000 description 18
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 16
- 238000010362 genome editing Methods 0.000 description 16
- 229920001184 polypeptide Polymers 0.000 description 16
- 229910052725 zinc Inorganic materials 0.000 description 16
- 239000013604 expression vector Substances 0.000 description 15
- 241000588724 Escherichia coli Species 0.000 description 14
- 230000000694 effects Effects 0.000 description 14
- 239000011701 zinc Substances 0.000 description 14
- 230000015572 biosynthetic process Effects 0.000 description 13
- 238000003786 synthesis reaction Methods 0.000 description 13
- 230000004568 DNA-binding Effects 0.000 description 12
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 12
- 101000922348 Homo sapiens C-X-C chemokine receptor type 4 Proteins 0.000 description 11
- 108010073062 Transcription Activator-Like Effectors Proteins 0.000 description 11
- 238000003776 cleavage reaction Methods 0.000 description 11
- 230000007017 scission Effects 0.000 description 11
- 108010040467 CRISPR-Associated Proteins Proteins 0.000 description 10
- 241000713772 Human immunodeficiency virus 1 Species 0.000 description 10
- 230000001580 bacterial effect Effects 0.000 description 10
- 230000027455 binding Effects 0.000 description 10
- 239000007791 liquid phase Substances 0.000 description 10
- 239000013612 plasmid Substances 0.000 description 10
- 102100031650 C-X-C chemokine receptor type 4 Human genes 0.000 description 9
- 108700020129 Human immunodeficiency virus 1 p31 integrase Proteins 0.000 description 9
- 108091034117 Oligonucleotide Proteins 0.000 description 9
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 9
- 230000004927 fusion Effects 0.000 description 9
- 239000000047 product Substances 0.000 description 9
- 230000008685 targeting Effects 0.000 description 9
- 241000894006 Bacteria Species 0.000 description 8
- 108091026890 Coding region Proteins 0.000 description 8
- 108700010070 Codon Usage Proteins 0.000 description 8
- 241000725303 Human immunodeficiency virus Species 0.000 description 8
- 230000006287 biotinylation Effects 0.000 description 8
- 238000007413 biotinylation Methods 0.000 description 8
- 238000013461 design Methods 0.000 description 8
- 238000000746 purification Methods 0.000 description 8
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 7
- Cpf1 Proteins 0.000 description 7
- 241000196324 Embryophyta Species 0.000 description 7
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 7
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 7
- 108091027544 Subgenomic mRNA Proteins 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 238000003556 assay Methods 0.000 description 7
- 230000004048 modification Effects 0.000 description 7
- 238000012986 modification Methods 0.000 description 7
- 125000006850 spacer group Chemical group 0.000 description 7
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 7
- 108010051219 Cre recombinase Proteins 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- 229960002685 biotin Drugs 0.000 description 6
- 239000011616 biotin Substances 0.000 description 6
- 239000000872 buffer Substances 0.000 description 6
- 235000018417 cysteine Nutrition 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 6
- 239000011780 sodium chloride Substances 0.000 description 6
- 239000007790 solid phase Substances 0.000 description 6
- 238000013519 translation Methods 0.000 description 6
- 241001430294 unidentified retrovirus Species 0.000 description 6
- 238000001262 western blot Methods 0.000 description 6
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 5
- 241000713666 Lentivirus Species 0.000 description 5
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 5
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 5
- 241000700605 Viruses Species 0.000 description 5
- 230000003197 catalytic effect Effects 0.000 description 5
- 238000006243 chemical reaction Methods 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 238000000338 in vitro Methods 0.000 description 5
- 108020004999 messenger RNA Proteins 0.000 description 5
- 238000010647 peptide synthesis reaction Methods 0.000 description 5
- 239000000376 reactant Substances 0.000 description 5
- 230000001105 regulatory effect Effects 0.000 description 5
- 241000894007 species Species 0.000 description 5
- 210000001519 tissue Anatomy 0.000 description 5
- 230000007018 DNA scission Effects 0.000 description 4
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 4
- 108010042407 Endonucleases Proteins 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- 241000713333 Mouse mammary tumor virus Species 0.000 description 4
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 4
- 108010029485 Protein Isoforms Proteins 0.000 description 4
- 102000001708 Protein Isoforms Human genes 0.000 description 4
- 241000191967 Staphylococcus aureus Species 0.000 description 4
- 239000007983 Tris buffer Substances 0.000 description 4
- JLCPHMBAVCMARE-UHFFFAOYSA-N DNA oligonucleotide (phosphodiester-linked A/C/G/T deoxynucleotide polymer) Polymers 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 239000003242 anti bacterial agent Substances 0.000 description 4
- 229940088710 antibiotic agent Drugs 0.000 description 4
- 235000020958 biotin Nutrition 0.000 description 4
- 210000004899 c-terminal region Anatomy 0.000 description 4
- 230000000295 complement effect Effects 0.000 description 4
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 4
- 238000011534 incubation Methods 0.000 description 4
- 230000001939 inductive effect Effects 0.000 description 4
- 238000011068 loading method Methods 0.000 description 4
- 230000036961 partial effect Effects 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 229920005989 resin Polymers 0.000 description 4
- 239000011347 resin Substances 0.000 description 4
- 238000006467 substitution reaction Methods 0.000 description 4
- 230000009466 transformation Effects 0.000 description 4
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 4
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 3
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 3
- 101710132601 Capsid protein Proteins 0.000 description 3
- 238000001712 DNA sequencing Methods 0.000 description 3
- 102100031780 Endonuclease Human genes 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 3
- 102100031181 Glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 3
- 241000238631 Hexapoda Species 0.000 description 3
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 3
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 3
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 3
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- 239000012124 Opti-MEM Substances 0.000 description 3
- 241000605894 Porphyromonas Species 0.000 description 3
- 101710145752 Serine recombinase gin Proteins 0.000 description 3
- 241000607768 Shigella Species 0.000 description 3
- 241000193996 Streptococcus pyogenes Species 0.000 description 3
- 125000001931 aliphatic group Chemical group 0.000 description 3
- 210000004102 animal cell Anatomy 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- 230000000692 anti-sense effect Effects 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 230000000903 blocking effect Effects 0.000 description 3
- 239000003153 chemical reaction reagent Substances 0.000 description 3
- 229940088598 enzyme Drugs 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 230000002068 genetic effect Effects 0.000 description 3
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 3
- 238000004128 high performance liquid chromatography Methods 0.000 description 3
- 235000014304 histidine Nutrition 0.000 description 3
- 229910052739 hydrogen Inorganic materials 0.000 description 3
- 239000001257 hydrogen Substances 0.000 description 3
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 3
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 3
- 210000004962 mammalian cell Anatomy 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 230000035772 mutation Effects 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 229950010131 puromycin Drugs 0.000 description 3
- 230000002829 reductive effect Effects 0.000 description 3
- 108091008146 restriction endonucleases Proteins 0.000 description 3
- 230000002441 reversible effect Effects 0.000 description 3
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 3
- 238000010532 solid phase synthesis reaction Methods 0.000 description 3
- 239000000243 solution Substances 0.000 description 3
- 239000000758 substrate Substances 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 241000701161 unidentified adenovirus Species 0.000 description 3
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- MYRTYDVEIRVNKP-UHFFFAOYSA-N 1,2-Divinylbenzene Chemical compound C=CC1=CC=CC=C1C=C MYRTYDVEIRVNKP-UHFFFAOYSA-N 0.000 description 2
- 101710169336 5'-deoxyadenosine deaminase Proteins 0.000 description 2
- 102000055025 Adenosine deaminases Human genes 0.000 description 2
- 229920000936 Agarose Polymers 0.000 description 2
- 244000105975 Antidesma platyphyllum Species 0.000 description 2
- 108091023037 Aptamer Proteins 0.000 description 2
- 108090000565 Capsid Proteins Proteins 0.000 description 2
- 101710197658 Capsid protein VP1 Proteins 0.000 description 2
- 108010051109 Cell-Penetrating Peptides Proteins 0.000 description 2
- 102000020313 Cell-Penetrating Peptides Human genes 0.000 description 2
- 102100023321 Ceruloplasmin Human genes 0.000 description 2
- 238000010442 DNA editing Methods 0.000 description 2
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 2
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 2
- 101100239628 Danio rerio myca gene Proteins 0.000 description 2
- KRHYYFGTRYWZRS-UHFFFAOYSA-N Fluorane Chemical compound F KRHYYFGTRYWZRS-UHFFFAOYSA-N 0.000 description 2
- 241000589601 Francisella Species 0.000 description 2
- 101150106478 GPS1 gene Proteins 0.000 description 2
- 108010014458 Gin recombinase Proteins 0.000 description 2
- 229920000209 Hexadimethrine bromide Polymers 0.000 description 2
- 108091092195 Intron Proteins 0.000 description 2
- 150000008575 L-amino acids Chemical class 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 2
- 241001112693 Lachnospiraceae Species 0.000 description 2
- 241000282553 Macaca Species 0.000 description 2
- 241000829100 Macaca mulatta polyomavirus 1 Species 0.000 description 2
- 101710081079 Minor spike protein H Proteins 0.000 description 2
- 241001494479 Pecora Species 0.000 description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 101710091955 Putative integrase Proteins 0.000 description 2
- 101710118046 RNA-directed RNA polymerase Proteins 0.000 description 2
- 108020004511 Recombinant DNA Proteins 0.000 description 2
- 241000607142 Salmonella Species 0.000 description 2
- 108091027967 Small hairpin RNA Proteins 0.000 description 2
- 108020004459 Small interfering RNA Proteins 0.000 description 2
- 108091081024 Start codon Proteins 0.000 description 2
- 241000193998 Streptococcus pneumoniae Species 0.000 description 2
- 108010076818 TEV protease Proteins 0.000 description 2
- 108010022394 Threonine synthase Proteins 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- 102000006601 Thymidine Kinase Human genes 0.000 description 2
- 108020004440 Thymidine kinase Proteins 0.000 description 2
- 108091028113 Trans-activating crRNA Proteins 0.000 description 2
- 108700009124 Transcription Initiation Site Proteins 0.000 description 2
- 108091023040 Transcription factor Proteins 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- DTQVDTLACAAQTR-UHFFFAOYSA-N Trifluoroacetic acid Chemical compound OC(=O)C(F)(F)F DTQVDTLACAAQTR-UHFFFAOYSA-N 0.000 description 2
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 2
- 108020005202 Viral DNA Proteins 0.000 description 2
- 108700005077 Viral Genes Proteins 0.000 description 2
- 101710108545 Viral protein 1 Proteins 0.000 description 2
- 108010027570 Xanthine phosphoribosyltransferase Proteins 0.000 description 2
- 241000269368 Xenopus laevis Species 0.000 description 2
- 150000001408 amides Chemical class 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- 229910000389 calcium phosphate Inorganic materials 0.000 description 2
- 239000001506 calcium phosphate Substances 0.000 description 2
- 235000011010 calcium phosphates Nutrition 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 210000003763 chloroplast Anatomy 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 238000012790 confirmation Methods 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 238000007796 conventional method Methods 0.000 description 2
- 150000001945 cysteines Chemical class 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 102000004419 dihydrofolate reductase Human genes 0.000 description 2
- 230000005782 double-strand break Effects 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 238000002330 electrospray ionisation mass spectrometry Methods 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 238000003209 gene knockout Methods 0.000 description 2
- 235000009424 haa Nutrition 0.000 description 2
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 2
- 239000012456 homogeneous solution Substances 0.000 description 2
- 102000053523 human CXCR4 Human genes 0.000 description 2
- 238000000126 in silico method Methods 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 238000011081 inoculation Methods 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 239000002502 liposome Substances 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 108091070501 miRNA Proteins 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 238000000520 microinjection Methods 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 230000032965 negative regulation of cell volume Effects 0.000 description 2
- 238000007481 next generation sequencing Methods 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 239000012071 phase Substances 0.000 description 2
- 230000008488 polyadenylation Effects 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 238000001742 protein purification Methods 0.000 description 2
- 230000006798 recombination Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 230000003248 secreting effect Effects 0.000 description 2
- 238000002741 site-directed mutagenesis Methods 0.000 description 2
- 239000004055 small Interfering RNA Substances 0.000 description 2
- 229940031000 streptococcus pneumoniae Drugs 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 239000012096 transfection reagent Substances 0.000 description 2
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 2
- 238000010200 validation analysis Methods 0.000 description 2
- 230000004572 zinc-binding Effects 0.000 description 2
- CXBDYQVECUFKRK-UHFFFAOYSA-N 1-methoxybutane Chemical compound CCCCOC CXBDYQVECUFKRK-UHFFFAOYSA-N 0.000 description 1
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 1
- CKTSBUTUHBMZGZ-SHYZEUOFSA-N 2'-deoxycytidine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 CKTSBUTUHBMZGZ-SHYZEUOFSA-N 0.000 description 1
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- KISUPFXQEHWGAR-RRKCRQDMSA-N 4-amino-5-bromo-1-[(2r,4s,5r)-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]pyrimidin-2-one Chemical compound C1=C(Br)C(N)=NC(=O)N1[C@@H]1O[C@H](CO)[C@@H](O)C1 KISUPFXQEHWGAR-RRKCRQDMSA-N 0.000 description 1
- QCVGEOXPDFCNHA-UHFFFAOYSA-N 5,5-dimethyl-2,4-dioxo-1,3-oxazolidine-3-carboxamide Chemical compound CC1(C)OC(=O)N(C(N)=O)C1=O QCVGEOXPDFCNHA-UHFFFAOYSA-N 0.000 description 1
- CIVGYTYIDWRBQU-UFLZEWODSA-N 5-[(3as,4s,6ar)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]pentanoic acid;pyrrole-2,5-dione Chemical class O=C1NC(=O)C=C1.N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 CIVGYTYIDWRBQU-UFLZEWODSA-N 0.000 description 1
- LUCHPKXVUGJYGU-XLPZGREQSA-N 5-methyl-2'-deoxycytidine Chemical compound O=C1N=C(N)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 LUCHPKXVUGJYGU-XLPZGREQSA-N 0.000 description 1
- 101710154451 60S ribosomal protein L27-A Proteins 0.000 description 1
- 101710187898 60S ribosomal protein L28 Proteins 0.000 description 1
- 102100021671 60S ribosomal protein L29 Human genes 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 208000003200 Adenoma Diseases 0.000 description 1
- 206010001233 Adenoma benign Diseases 0.000 description 1
- 108700015125 Adenovirus DBP Proteins 0.000 description 1
- 108010024878 Adenovirus E1A Proteins Proteins 0.000 description 1
- 229920000856 Amylose Polymers 0.000 description 1
- 241000219194 Arabidopsis Species 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000223651 Aureobasidium Species 0.000 description 1
- 241001213911 Avian retroviruses Species 0.000 description 1
- 108700003860 Bacterial Genes Proteins 0.000 description 1
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 1
- 101150005393 CBF1 gene Proteins 0.000 description 1
- 101710172824 CRISPR-associated endonuclease Cas9 Proteins 0.000 description 1
- 241000244203 Caenorhabditis elegans Species 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 102100025064 Cellular tumor antigen p53 Human genes 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 102000019034 Chemokines Human genes 0.000 description 1
- 108010012236 Chemokines Proteins 0.000 description 1
- 108091092236 Chimeric RNA Proteins 0.000 description 1
- 108020004638 Circular DNA Proteins 0.000 description 1
- 241001112696 Clostridia Species 0.000 description 1
- 241001112695 Clostridiales Species 0.000 description 1
- 241000193163 Clostridioides difficile Species 0.000 description 1
- 241001478240 Coccus Species 0.000 description 1
- 101100329224 Coprinopsis cinerea (strain Okayama-7 / 130 / ATCC MYA-4618 / FGSC 9003) cpf1 gene Proteins 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- 241000701022 Cytomegalovirus Species 0.000 description 1
- 150000008574 D-amino acids Chemical class 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 230000033616 DNA repair Effects 0.000 description 1
- 101710096438 DNA-binding protein Proteins 0.000 description 1
- CKTSBUTUHBMZGZ-UHFFFAOYSA-N Deoxycytidine Natural products O=C1N=C(N)C=CN1C1OC(CO)C(O)C1 CKTSBUTUHBMZGZ-UHFFFAOYSA-N 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 241000255925 Diptera Species 0.000 description 1
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 1
- 206010059866 Drug resistance Diseases 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- 102000002322 Egg Proteins Human genes 0.000 description 1
- 108010000912 Egg Proteins Proteins 0.000 description 1
- 108010061435 Enalapril Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 101000926726 Escherichia coli Dihydrofolate reductase type 7 Proteins 0.000 description 1
- 101001091269 Escherichia coli Hygromycin-B 4-O-kinase Proteins 0.000 description 1
- 241000186394 Eubacterium Species 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 238000012413 Fluorescence activated cell sorting analysis Methods 0.000 description 1
- 241000589602 Francisella tularensis Species 0.000 description 1
- 102100039556 Galectin-4 Human genes 0.000 description 1
- 241000159512 Geotrichum Species 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 1
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 108010002459 HIV Integrase Proteins 0.000 description 1
- 108010056307 Hin recombinase Proteins 0.000 description 1
- 101710103773 Histone H2B Proteins 0.000 description 1
- 102100021639 Histone H2B type 1-K Human genes 0.000 description 1
- 101000721661 Homo sapiens Cellular tumor antigen p53 Proteins 0.000 description 1
- 101000608765 Homo sapiens Galectin-4 Proteins 0.000 description 1
- 206010061598 Immunodeficiency Diseases 0.000 description 1
- 208000029462 Immunodeficiency disease Diseases 0.000 description 1
- 108010025815 Kanamycin Kinase Proteins 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- 125000000174 L-prolyl group Chemical group [H]N1C([H])([H])C([H])([H])C([H])([H])[C@@]1([H])C(*)=O 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 108010054278 Lac Repressors Proteins 0.000 description 1
- 241000186660 Lactobacillus Species 0.000 description 1
- 102000006835 Lamins Human genes 0.000 description 1
- 108010047294 Lamins Proteins 0.000 description 1
- 241000589902 Leptospira Species 0.000 description 1
- 241001148627 Leptospira inadai Species 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 108010007859 Lisinopril Proteins 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 208000030984 MIRAGE syndrome Diseases 0.000 description 1
- PBOUVYGPDSARIS-IUCAKERBSA-N Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(O)=O)CC(C)C PBOUVYGPDSARIS-IUCAKERBSA-N 0.000 description 1
- 241000219470 Mirabilis Species 0.000 description 1
- 241000713869 Moloney murine leukemia virus Species 0.000 description 1
- YNAVUWVOSKDBBP-UHFFFAOYSA-N Morpholine Chemical group C1COCCN1 YNAVUWVOSKDBBP-UHFFFAOYSA-N 0.000 description 1
- 101000686985 Mouse mammary tumor virus (strain C3H) Protein PR73 Proteins 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 241001531296 Muricidae Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 102100038895 Myc proto-oncogene protein Human genes 0.000 description 1
- 101710135898 Myc proto-oncogene protein Proteins 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- 241000588650 Neisseria meningitidis Species 0.000 description 1
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 108091005461 Nucleic proteins Proteins 0.000 description 1
- 241000606856 Pasteurella multocida Species 0.000 description 1
- 108091093037 Peptide nucleic acid Proteins 0.000 description 1
- 241001505332 Polyomavirus sp. Species 0.000 description 1
- 108010087776 Proto-Oncogene Proteins c-myb Proteins 0.000 description 1
- 102000009096 Proto-Oncogene Proteins c-myb Human genes 0.000 description 1
- 241000589517 Pseudomonas aeruginosa Species 0.000 description 1
- 241001453299 Pseudomonas mevalonii Species 0.000 description 1
- 241000589776 Pseudomonas putida Species 0.000 description 1
- 239000012614 Q-Sepharose Substances 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- 241000191023 Rhodobacter capsulatus Species 0.000 description 1
- 241000191043 Rhodobacter sphaeroides Species 0.000 description 1
- 241000187562 Rhodococcus sp. Species 0.000 description 1
- 241000190984 Rhodospirillum rubrum Species 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 241000293871 Salmonella enterica subsp. enterica serovar Typhi Species 0.000 description 1
- 241000293869 Salmonella enterica subsp. enterica serovar Typhimurium Species 0.000 description 1
- 241000909295 Selenomonadales Species 0.000 description 1
- 229920002684 Sepharose Polymers 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 241000607764 Shigella dysenteriae Species 0.000 description 1
- 241000607762 Shigella flexneri Species 0.000 description 1
- 241001466984 Simian T-lymphotropic virus 1 Species 0.000 description 1
- 101100289792 Squirrel monkey polyomavirus large T gene Proteins 0.000 description 1
- 108010085012 Steroid Receptors Proteins 0.000 description 1
- 102000007451 Steroid Receptors Human genes 0.000 description 1
- 108010090804 Streptavidin Proteins 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 241001134658 Streptococcus mitis Species 0.000 description 1
- 241000194019 Streptococcus mutans Species 0.000 description 1
- 241000416934 Streptococcus pyogenes A20 Species 0.000 description 1
- 241000194020 Streptococcus thermophilus Species 0.000 description 1
- 101001091268 Streptomyces hygroscopicus Hygromycin-B 7''-O-kinase Proteins 0.000 description 1
- 108700005078 Synthetic Genes Proteins 0.000 description 1
- 102100029210 Tetratricopeptide repeat protein 37 Human genes 0.000 description 1
- 101710129246 Tetratricopeptide repeat protein 37 Proteins 0.000 description 1
- 241000566950 Thelephoraceae Species 0.000 description 1
- 241000186339 Thermoanaerobacter Species 0.000 description 1
- 241001137870 Thermoanaerobacterium Species 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical group OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 102000005610 Thyroid Hormone Receptors alpha Human genes 0.000 description 1
- 108010045070 Thyroid Hormone Receptors alpha Proteins 0.000 description 1
- 108010068068 Transcription Factor TFIIIA Proteins 0.000 description 1
- 101710150448 Transcriptional regulator Myc Proteins 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- DOFAQXCYFQKSHT-SRVKXCTJSA-N Val-Pro-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DOFAQXCYFQKSHT-SRVKXCTJSA-N 0.000 description 1
- 108010067390 Viral Proteins Proteins 0.000 description 1
- 241000589634 Xanthomonas Species 0.000 description 1
- 241000810153 Youngiibacter fragilis 232.1 Species 0.000 description 1
- PTFCDOFLOPIGGS-UHFFFAOYSA-N Zinc dication Chemical compound [Zn+2] PTFCDOFLOPIGGS-UHFFFAOYSA-N 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 230000002730 additional effect Effects 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 230000009435 amidation Effects 0.000 description 1
- 238000007112 amidation reaction Methods 0.000 description 1
- 125000003368 amide group Chemical group 0.000 description 1
- 150000001412 amines Chemical group 0.000 description 1
- 150000003862 amino acid derivatives Chemical class 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 102000006646 aminoglycoside phosphotransferase Human genes 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 241000510314 bacterium MA2020 Species 0.000 description 1
- 125000000051 benzyloxy group Chemical group [H]C1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])O* 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 229930189065 blasticidin Natural products 0.000 description 1
- 239000007853 buffer solution Substances 0.000 description 1
- 125000001314 canonical amino-acid group Chemical group 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 101150059443 cas12a gene Proteins 0.000 description 1
- 101150038500 cas9 gene Proteins 0.000 description 1
- 238000005277 cation exchange chromatography Methods 0.000 description 1
- 230000001364 causal effect Effects 0.000 description 1
- 238000000423 cell based assay Methods 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 230000009918 complex formation Effects 0.000 description 1
- 238000010668 complexation reaction Methods 0.000 description 1
- 108091036078 conserved sequence Proteins 0.000 description 1
- 244000038559 crop plants Species 0.000 description 1
- 238000002425 crystallisation Methods 0.000 description 1
- 230000008025 crystallization Effects 0.000 description 1
- 210000004748 cultured cell Anatomy 0.000 description 1
- 230000001351 cycling effect Effects 0.000 description 1
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 238000010217 densitometric analysis Methods 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 101150087738 dhfrVII gene Proteins 0.000 description 1
- 238000000502 dialysis Methods 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000004090 dissolution Methods 0.000 description 1
- 239000012153 distilled water Substances 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-N dithiophosphoric acid Chemical group OP(O)(S)=S NAGJZTKCGNOGPW-UHFFFAOYSA-N 0.000 description 1
- 238000009509 drug development Methods 0.000 description 1
- 235000014103 egg white Nutrition 0.000 description 1
- 210000000969 egg white Anatomy 0.000 description 1
- GBXSMTUPTTWBMN-XIRDDKMYSA-N enalapril Chemical compound C([C@@H](C(=O)OCC)N[C@@H](C)C(=O)N1[C@@H](CCC1)C(O)=O)CC1=CC=CC=C1 GBXSMTUPTTWBMN-XIRDDKMYSA-N 0.000 description 1
- 229960000873 enalapril Drugs 0.000 description 1
- 108010078428 env Gene Products Proteins 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 238000006911 enzymatic reaction Methods 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 239000012091 fetal bovine serum Substances 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 229940118764 francisella tularensis Drugs 0.000 description 1
- 230000008014 freezing Effects 0.000 description 1
- 238000007710 freezing Methods 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- IRSCQMHQWWYFCW-UHFFFAOYSA-N ganciclovir Chemical compound O=C1NC(N)=NC2=C1N=CN2COC(CO)CO IRSCQMHQWWYFCW-UHFFFAOYSA-N 0.000 description 1
- 229960002963 ganciclovir Drugs 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000002523 gelfiltration Methods 0.000 description 1
- 238000007429 general method Methods 0.000 description 1
- 230000007614 genetic variation Effects 0.000 description 1
- 238000012268 genome sequencing Methods 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 239000005090 green fluorescent protein Substances 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 208000006454 hepatitis Diseases 0.000 description 1
- 231100000283 hepatitis Toxicity 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 125000001841 imino group Chemical group [H]N=* 0.000 description 1
- 230000007813 immunodeficiency Effects 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 108700032552 influenza virus INS1 Proteins 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 238000011813 knockout mouse model Methods 0.000 description 1
- 210000005053 lamin Anatomy 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- RLAWWYSOJDYHDC-BZSNNMDCSA-N lisinopril Chemical compound C([C@H](N[C@@H](CCCCN)C(=O)N1[C@@H](CCC1)C(O)=O)C(O)=O)CC1=CC=CC=C1 RLAWWYSOJDYHDC-BZSNNMDCSA-N 0.000 description 1
- 229960002394 lisinopril Drugs 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 239000002679 microRNA Substances 0.000 description 1
- 230000025608 mitochondrion localization Effects 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 230000009438 off-target cleavage Effects 0.000 description 1
- 210000000287 oocyte Anatomy 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- 239000003960 organic solvent Substances 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 229940051027 pasteurella multocida Drugs 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- JTJMJGYZQZDUJJ-UHFFFAOYSA-N phencyclidine Chemical compound C1CCCCN1C1(C=2C=CC=CC=2)CCCCC1 JTJMJGYZQZDUJJ-UHFFFAOYSA-N 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- SXADIBFZNXBEGI-UHFFFAOYSA-N phosphoramidous acid Chemical group NP(O)O SXADIBFZNXBEGI-UHFFFAOYSA-N 0.000 description 1
- 239000004033 plastic Substances 0.000 description 1
- 238000007747 plating Methods 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 102000054765 polymorphisms of proteins Human genes 0.000 description 1
- 229920005990 polystyrene resin Polymers 0.000 description 1
- 238000012257 pre-denaturation Methods 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 230000001566 pro-viral effect Effects 0.000 description 1
- TVLSRXXIMLFWEO-UHFFFAOYSA-N prochloraz Chemical compound C1=CN=CN1C(=O)N(CCC)CCOC1=C(Cl)C=C(Cl)C=C1Cl TVLSRXXIMLFWEO-UHFFFAOYSA-N 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 239000003531 protein hydrolysate Substances 0.000 description 1
- 238000000164 protein isolation Methods 0.000 description 1
- 210000001938 protoplast Anatomy 0.000 description 1
- 238000003908 quality control method Methods 0.000 description 1
- 102000005912 ran GTP Binding Protein Human genes 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 108010054624 red fluorescent protein Proteins 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 108020004418 ribosomal RNA Proteins 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 231100000241 scar Toxicity 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 229940007046 shigella dysenteriae Drugs 0.000 description 1
- 230000011664 signaling Effects 0.000 description 1
- 239000002195 soluble material Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 238000010186 staining Methods 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 230000035892 strand transfer Effects 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 101150090749 sulI gene Proteins 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 125000004434 sulfur atom Chemical group 0.000 description 1
- 231100000617 superantigen Toxicity 0.000 description 1
- 230000001629 suppression Effects 0.000 description 1
- 230000002195 synergetic effect Effects 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
- 230000009897 systematic effect Effects 0.000 description 1
- 238000010809 targeting technique Methods 0.000 description 1
- 125000005931 tert-butyloxycarbonyl group Chemical group [H]C([H])([H])C(OC(*)=O)(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 238000010257 thawing Methods 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 238000006276 transfer reaction Methods 0.000 description 1
- 230000009261 transgenic effect Effects 0.000 description 1
- ITMCEJHCFYSIIV-UHFFFAOYSA-N triflic acid Chemical compound OS(=O)(=O)C(F)(F)F ITMCEJHCFYSIIV-UHFFFAOYSA-N 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 241001113232 unclassified Lachnospiraceae Species 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
- 108091005957 yellow fluorescent proteins Proteins 0.000 description 1
- 150000003751 zinc Chemical class 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/111—General methods applicable to biologically active non-coding nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/8509—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
- C07K2319/81—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor containing a Zn-finger domain for DNA binding
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/30—Vector systems comprising sequences for excision in presence of a recombinase, e.g. loxP or FRT
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/80—Vectors containing sites for inducing double-stranded breaks, e.g. meganuclease restriction sites
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Cell Biology (AREA)
- Mycology (AREA)
- Veterinary Medicine (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
- Peptides Or Proteins (AREA)
Abstract
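The present disclosure relates to the use of engineered proteins, such as Cas9, Cpf1, TALE and zinc finger proteins, attached to a viral integrase, recombinase or transposase in order to deliver a DNA sequence of interest (or gene of interest) to a targeted site in the genome of a cell or organism. Using a Cas9 that is inactive for DNA cleavage makes it possible to exploit the ability of the Cas9 protein to target DNA by means of an RNA guide, without causing the DNA breaks that other systems rely on for homologous recombination. Also disclosed is the use of zinc finger proteins or TALEs (engineered proteins that bind specific DNA sequences) attached to a viral integrase or recombinase. The system can be used for laboratory and therapeutic purposes. For example, a gene of interest can be introduced into a cell that carries a gene lacking the ability to produce its gene product, in order to restore the normal gene product in the cell (the gene product can be, for example, a protein or a specialized RNA).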
Description
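Cross-Reference to Related Applications
This application claims the benefit of U.S. Provisional Application No. 62/140,454, filed March 31, 2015, U.S. Provisional Application No. 62/210,451, filed August 27, 2015, and U.S. Provisional Application No. 62/240,359, filed October 12, 2015, the entire contents of each of which are incorporated by reference for all purposes.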
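Introduction
The present disclosure relates to the use of engineered proteins comprising a DNA-binding protein that exhibits genome specificity, such as Cas9 (a CRISPR (clustered regularly interspaced short palindromic repeats) protein), TALE and zinc finger proteins, attached via a linker to a viral integrase (e.g., HIV or MMTV integrase) or a recombinase, in order to deliver a DNA sequence of interest (or gene of interest) to a targeted site in the genome of a cell or organism. Using a Cas9 that is inactive for DNA cleavage makes it possible to exploit the ability of the Cas9 protein to target DNA by means of an RNA guide (gRNA), without causing the DNA breaks that other systems rely on for homologous recombination. Also disclosed is the use of zinc finger proteins or TALEs (engineered proteins that bind specific DNA sequences) attached to a viral integrase or recombinase. The system can be used for laboratory and therapeutic purposes. For example, donor DNA containing a gene of interest can be readily introduced into a host genome without the possibility of off-target cleavage associated with conventional methods. The donor DNA can also be engineered to facilitate "knockout" strategies. A new strategy for improving Cas9 targeting specificity is also discussed. This strategy uses surface-bound dCas9 (Cas9 that is inactive for DNA cleavage) together with guide RNAs and genomic DNA in an assay to discover which guide RNAs provide specific targeting for the Cas9. This is particularly important for in vivo applications of CRISPR/Cas9 and overcomes the limitations of current in silico prediction models, although it can also be combined with in silico prediction models to determine, based on training, which gRNAs will be used in the assay.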
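Background
Recent advances in genome sequencing technologies and analysis methods have markedly accelerated the ability to catalog and map genetic and genomic factors associated with a diverse range of biological functions and diseases. Precise genome targeting technologies are needed to enable systematic reverse engineering of causal genetic variation by allowing selective perturbation of individual genetic elements, as well as to advance synthetic biology, biotechnology and medical applications. Although genome editing technologies such as designer zinc fingers, transcription activator-like effectors (TALEs), CRISPR/Cas9 or meganucleases can be used to produce targeted genome perturbations, there remains a need for new genome engineering technologies that allow DNA sequences (including full gene sequences) to be incorporated at specific locations in a given genome. This would allow the generation of cell lines or transgenic organisms that express engineered genes, or the replacement of dysfunctional genes in subjects in need thereof.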
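Integrases are viral proteins that allow viral nucleic acid to be inserted into a host genome (mammalian, human, mouse, rat, monkey, frog, fish, plant (including crop plants and experimental plants such as Arabidopsis), laboratory or biomedical cell lines or primary cell cultures, C. elegans, fly (Drosophila), etc.). An integrase uses the host's DNA-binding proteins to associate with the host genome in order to incorporate the viral nucleic acid sequence into the host genome. Integrases are found in retroviruses such as HIV (human immunodeficiency virus). An integrase relies on sequences in the viral genes to insert the viral genome into host DNA. Leavitt et al. (Journal of Biological Chemistry, 1993, volume 268, pages 2113-2119) examined the function of HIV1 integrase using site-directed mutagenesis and in vitro studies. Leavitt also identified the sequences of the U5 and U3 HIV1 att sites, which are important for integration of HIV1 DNA (produced after reverse transcription) into the host genome by the viral integrase.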
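The present disclosure improves on current genome editing technologies by allowing a desired nucleic acid (DNA) sequence to be inserted specifically at a designated location in the genome. A recombinant engineered integrase (or recombinase) with DNA-binding ability will bind a given DNA sequence in the genome and recognize a supplied DNA sequence that has integrase recognition domains (e.g., HIV1 (or other retroviral) att sites) and/or homology arms, so as to insert the given nucleic acid sequence into the genome in a site-specific manner. One aspect of the disclosure relates to inserting a DNA sequence of stop codons (UAA, UAG and/or UGA) after the transcription start site of a gene. This will allow effective suppression of gene transcription in the genome of a cell or organism.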
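The stop-codon insert described above can be illustrated with a short Python sketch. It is not the disclosure's design: the helper names, the single-base padding and the toy sequence are assumptions made for illustration. The only fixed fact used is that the RNA stop codons UAA, UAG and UGA correspond to TAA, TAG and TGA in DNA.

```python
# DNA equivalents of the RNA stop codons UAA, UAG and UGA.
STOP_CODONS_DNA = ("TAA", "TAG", "TGA")


def three_frame_stop_cassette(pad="C"):
    """Build a short insert with a stop codon in each forward reading frame.

    Each stop codon is followed by one padding base, so consecutive stop
    codons start at offsets 0, 4 and 8 (frames 0, 1 and 2).
    The padding base is an illustrative assumption.
    """
    return "".join(codon + pad for codon in STOP_CODONS_DNA)


def frames_with_stop(seq):
    """Return the set of forward frames (0, 1, 2) in which seq has a stop codon."""
    hits = set()
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS_DNA:
                hits.add(frame)
                break
    return hits


def insert_at(sequence, position, insert):
    """Return sequence with insert placed before index `position`."""
    return sequence[:position] + insert + sequence[position:]


cassette = three_frame_stop_cassette()
edited = insert_at("ATGGCC", 3, cassette)  # toy sequence, not a real gene
```

Covering all three frames is a design choice for this sketch; the text itself only calls for one or more stop codons.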
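Summary
The present disclosure links DNA targeting technologies (including zinc finger proteins, TALENs, and CRISPR/Cas9 or other CRISPR proteins such as Cpf1) to retroviral integrases to form DNA-targeting integrases. A gene of interest (GOI) can then be supplied together with the DNA-targeting integrase so that it is incorporated into the genome in a targeted manner. The GOI will be designed with homology arms to provide an additional level of specificity for its insertion into the genome.
The present disclosure relates in particular to the use of a variant Cas9 that is inactive for DNA cleavage, linked to a retroviral integrase.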
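The present disclosure includes a system comprising: A) a viral integrase (or bacterial recombinase) covalently linked to, for example, a Cas protein (e.g., Cas9) that is inactive for DNA cleavage. Alternatively, the viral integrase (or recombinase) is covalently linked to a TALE protein or a zinc finger protein, where these proteins are designed to target a specific DNA sequence in the genome. This can be provided in an expression vector or as a purified protein; B) a gene of interest (or DNA sequence of interest) to be incorporated into the desired genome, with or without homology arms. As needed, the GOI or DNA sequence of interest can be modified so that it is recognized by the viral integrase. Additional reagents are required for polynucleotide transfection and/or for introducing the protein into cells. Off-target integration of the DNA sequence is assayed. In one aspect, a marker sequence is engineered into the inserted DNA sequence.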
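Provided herein is a nucleic acid construct comprising, operably linked: a) a first polynucleotide sequence encoding Cas9, inactive Cas9 or Cpf1, or a portion thereof; b) a second polynucleotide sequence encoding an integrase, recombinase or transposase, or a portion thereof; and c) a third polynucleotide sequence encoding a nucleic acid linker; wherein the first polynucleotide sequence comprises 5' and 3' ends and the second polynucleotide sequence comprises 5' and 3' ends, the 3' end of the first polynucleotide is connected to the 5' end of the second polynucleotide by the nucleic acid linker, and the first and second polynucleotides are capable of being expressed as a fusion protein in a cell or organism. In some embodiments, the first polynucleotide sequence comprises any one of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 27-46, 49, 56 or 68, or a sequence having at least 80%, at least 85%, at least 90%, at least 95% or at least 99% identity thereto. In some embodiments, the Cas9, inactive Cas9 or Cpf1 comprises any one of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 50, 52, 69, 72-78 or 86-92, or a sequence having at least 80%, at least 85%, at least 90%, at least 95% or at least 99% identity thereto. In some embodiments, the second polynucleotide sequence comprises any one of SEQ ID NO: 15, 17, 19, 21, 23, 47, 55, 62, 64, 66, 70 or 79, or a sequence having at least 80%, at least 85%, at least 90%, at least 95% or at least 99% identity thereto. In some embodiments, the integrase, recombinase or transposase comprises any one of SEQ ID NO: 16, 18, 20, 22, 24, 25, 26, 48, 63, 65, 67, 71 or 80, or a sequence having at least 80%, at least 85%, at least 90%, at least 95% or at least 99% identity thereto. Also described herein are organisms comprising the nucleic acid construct. Also described herein are organisms comprising the fusion protein, wherein the organism has a modified genome.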
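The identity thresholds above (at least 80%, 85%, 90%, 95% or 99%) can be made concrete with a small Python sketch. It assumes the two sequences have already been aligned to equal length with `-` as the gap character, and it counts identity over all columns that are not gaps in both sequences. That denominator is one convention among several and is not specified by the text; the function names are hypothetical.

```python
# Identity thresholds recited in the text, in percent.
THRESHOLDS = (80, 85, 90, 95, 99)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def highest_threshold_met(identity):
    """Return the largest recited threshold that `identity` reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Toy aligned sequences that differ at one of ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

In practice the alignment itself would come from a dedicated alignment tool, and the reported identity depends on that tool's settings.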
本文提供了如下生物体,所述生物体包括:a)编码Cas9、无活性Cas9或Cpf1的第一多核苷酸序列或其部分:b)编码整合酶、重组酶或转座酶的第二多核苷酸序列或其部分;和c)编码核酸接头的第三多核苷酸序列;其中第一多核苷酸序列包括5'和3'末端且第二多核苷酸序列包括5'和3'末端,并且第一多核苷酸的3'末端通过核酸接头连接至第二多核苷酸的5'末端,并且第一多核苷酸和第二多核苷酸能够在细胞或生物体中表达为融合蛋白。
本文还提供了融合蛋白,所述融合蛋白包括:a)第一蛋白,所述第一蛋白是无催化活性的Cas9、Cas9、TALE蛋白、锌指蛋白或Cpf1蛋白,其中所述第一蛋白被靶向靶DNA序列;b)第二蛋白,所述第二蛋白是整合酶、重组酶或转座酶;和c)接头,所述接头连接第一蛋白与第二蛋白。在一些实施例中,第二蛋白是整合酶;所述整合酶是HIV1整合酶或慢病毒整合酶;接头序列的长度为一个或多个氨基酸;或第一蛋白是无催化活性的Cas9。在一些实施例中,接头序列的长度为4-8个氨基酸;第一蛋白是TALE蛋白;或第一蛋白是锌指蛋白。在一些实施例中,其中融合蛋白包括TALE或锌指蛋白,靶DNA序列的长度为约16至约24个碱基对。在一些实施例中,第一蛋白是Cas9或无催化活性的Cas9,并且其中使用一个或多个引导RNA来靶向具有从约16至约24个碱基对的靶DNA序列。
本文还提供了将DNA序列插入到基因组DNA中的方法,所述方法包括:a)鉴定基因组DNA中的靶序列;b)设计根据权利要求1所述的融合蛋白以结合基因组DNA中的靶序列;c)设计感兴趣的DNA序列以并入基因组DNA中;和d)通过允许融合蛋白和感兴趣的DNA序列进入细胞或生物体的技术,向细胞或生物体提供融合蛋白和感兴趣的DNA序列;其中感兴趣的DNA序列被整合在基因组DNA中的靶序列处。
本文还提供了如下核苷酸载体,所述核苷酸载体包括:a)第一蛋白的第一编码序列,所述第一蛋白是Cas9、无催化活性的Cas9、TALE蛋白、锌指蛋白或Cpf1蛋白,所述第一蛋白被工程化以结合靶DNA序列;b)第二蛋白的第二编码序列,所述第二蛋白是整合酶、重组酶或转座酶;c)在第一编码序列和第二编码序列之间的DNA序列,其在第一蛋白和第二蛋白之间形成氨基酸接头;d)任选地经表达的由被整合酶识别的att位点包围的感兴趣的DNA序列,以及任选地一个或多个引导RNA,其中第一蛋白被靶向确定的DNA序列,并且其中第一蛋白通过氨基酸接头序列连接到第二蛋白。
本文提供了抑制细胞或生物体中基因转录的方法,所述方法包括:a)鉴定基因中的ATG起始密码子;b)用根据权利要求1所述的融合蛋白设计融合蛋白系统,以在基因的ATG起始密码子之后立即与靶序列结合;c)设计感兴趣的DNA序列,所述感兴趣的DNA序列为一个或多个连续终止密码子;和d)通过允许融合蛋白和感兴趣的DNA序列进入细胞或生物体的技术,向细胞或生物体提供融合蛋白和感兴趣的DNA序列;其中感兴趣的DNA序列被整合在基因组DNA中的靶序列处;并且其中基因的转录被抑制。在一些实施例中,第二蛋白是重组酶;所述重组酶是Cre重组酶或其经修饰形式,其中经修饰的Cre重组酶具有组成型重组酶活性。在一个实施例中,所述载体另外包括待在细胞中表达的逆转录酶基因。
本文还提供了如下组合物,所述组合物包括DNA结合蛋白/整合酶融合物的纯化蛋白和长度从约15至约100个碱基对的RNA,其中所述DNA结合蛋白选自工程化到基因组中的靶向DNA序列的Cas9、Cpf1、TALEN和锌指蛋白,并且其中所述整合酶是HIV整合酶、慢病毒整合酶、腺病毒整合酶、逆转录病毒整合酶或MMTV整合酶。
附图说明
参考以下说明、所附权利要求书及附图,本披露的这些和其他特征、方面和优点将变得更好理解,其中:
图1示出了a)示例性无催化活性的Cas9/HIV1整合酶融合蛋白;b)示例性TALE/HIV1整合酶融合蛋白;c)示例性锌指蛋白/HIV1整合酶融合蛋白;和d)示例性Cas9/HIV1整合酶融合蛋白,其被设计到靶向位置处的DNA的相对侧。每种融合蛋白都与DNA的特定靶序列结合。“ZnFn”是锌指蛋白。“整合酶”代表一个整合酶单位或通过例如短氨基酸接头连接的两个整合酶单位。在一些实施例中,整合酶可以被重组酶替代。Cas9可以是有催化活性的或无催化活性的。
图2示出了DNA质粒系统,其包括:包括无催化活性的Cas9/整合酶融合蛋白的载体、包括感兴趣的DNA序列的载体和包括逆转录酶的载体。引导RNA(gRNA)或RNA可以分开提供。另一个载体可以用来表达gRNA。“1或2”指一个整合酶或通过例如氨基酸接头连接的两个整合酶。
图3示出了示例性DNA质粒,其包括核苷酸序列无催化活性的Cas9/整合酶融合蛋白、引导RNA、感兴趣的DNA(基因)序列和逆转录酶。可以将病毒att位点提供给感兴趣的DNA序列,允许将整合酶并入细胞的基因组DNA中。引导RNA(gRNA)或RNA可以分开提供。另一个载体可以用来表达gRNA。“1或2”指一个整合酶或通过例如氨基酸接头连接的两个整合酶。
图4示出了流程图。采用图2和图3中所示的载体的一种示例性方法示于图4中,并且如下所示:1)逆转录酶逆转录从载体表达的具有att位点的感兴趣的DNA序列(可替代地使用具有att位点的线性DNA);2)融合Cas9/整合酶基于引导RNA靶向基因组DNA上的位点;3)整合酶识别感兴趣的DNA序列上的att(LTR)位点,并将DNA于靶向位点整合到基因组中;并且4)进行测定(例如PCR(聚合酶链式反应))以检查感兴趣的DNA序列的正确插入。可以进行测定以检查非特异性整合。
图5示出了使用引导NrF2-sgRNA2和sgRNA3靶向Nrf2的外显子2的Abbie1基因编辑。
图6示出了由Abbie1基因编辑产生的理论数据。
图7示出了使用引导Nrf2-sgRNA3靶向Nrf2的外显子2的Abbie1基因编辑。
图8示出了在合并的Hek293T细胞中Nrf2的Abbie1敲除。
图9示出了在合并的Hek293T细胞中Nrf2的Abbie1敲除。
图10示出了靶向CXCR4外显子2的Abbie1基因编辑。
Figure 11 shows detection of ABBIE1 protein on a Coomassie-stained gel after isolation and purification from E. coli.
具体实施方式
提供以下详细说明以帮助本领域技术人员实践本披露。即使这样,此详细说明不应当解释为不适当地限制本披露,因为本领域的普通技术人员可以在本文讨论的实施例中做出修改和变化,而不脱离本发现的精神或范围。
如本披露和所附权利要求书中使用的,除非上下文另有明确指示,否则单数形式“一个/一种(a/an)”和“所述(the)”包括复数的提及物。如本披露和所附权利要求书中使用的,术语“或”可以是单数的或包容性的。例如,A或B可以是A和B。
内源的
如本文所述的内源核酸、核苷酸、多肽或蛋白质是基于与宿主生物体的关系而定义。内源核酸、核苷酸、多肽或蛋白质是天然存在于宿主生物体内的核酸、核苷酸、多肽或蛋白质。
外源的
如本文所述的外源核酸、核苷酸、多肽或蛋白质是基于与宿主生物体的关系而定义。外源核酸、核苷酸、多肽或蛋白质是非天然存在于宿主生物体内或是在宿主生物中不同的位置的核酸、核苷酸、多肽或蛋白质。
敲除
当外源核酸被转化到宿主生物体(例如通过随机插入或同源重组)导致基因的破坏(例如通过缺失、插入)时,认为基因被敲除。
在敲除基因后,可以降低相应蛋白质的活性。例如,与其中基因未被敲除的相同蛋白质的活性相比,降低至少10%、至少20%、至少30%、至少40%、至少50%、至少60%、至少70%、至少80%、至少90%或100%。
在敲除基因后,与未被敲除的基因相比,所述基因的转录可以降低至少20%、至少30%、至少40%、至少50%、至少60%、至少70%、至少80%、至少90%或100%。
经修饰的
经修饰的生物体是与未经修饰的生物体不同的生物体。例如,经修饰的生物体可以包括本披露的融合蛋白,其导致靶向基因序列的敲除。经修饰的生物体可以具有经修饰的基因组。
经修饰的核酸序列或氨基酸序列不同于未经修饰的核酸序列或氨基酸序列。例如,核酸序列的一个或多个核酸可以被插入、缺失或添加。例如,氨基酸序列的一个或多个氨基酸可以被插入、缺失或添加。
可操作地连接的
在一些实施例中,载体包括与一个或多个控制元件(如启动子和/或转录终止子)可操作地连接的多核苷酸。当核酸序列被放置地与另一核酸序列有功能关系时,所述核酸序列是可操作地连接的。例如,将前序列或分泌性前导子的DNA可操作地连接至多肽的DNA,如果它被表达为参与多肽的分泌的前蛋白的话;将启动子可操作地连接至编码序列,如果它影响序列的转录的话;或将核糖体结合位点可操作地连接至编码序列,如果它被定位地促进翻译的话。可操作地连接的序列可以是连续的,并且在分泌性前导子的情况下是连续的且处于阅读相。
宿主细胞或宿主生物体
宿主细胞可以含有编码本披露的多肽的多核苷酸。在一些实施例中,宿主细胞是多细胞生物体的一部分。在其他实施例中,宿主细胞作为单细胞生物体进行培养。
宿主生物体可以包括任何合适的宿主,例如微生物。可用于本文所述方法的微生物包括例如细菌(例如,大肠杆菌)、酵母(例如,酿酒酵母)和植物。生物体可以是原核的或真核的。生物体可以是单细胞的或多细胞的。
宿主细胞可以是原核的。合适的原核细胞包括但不限于大肠杆菌、乳杆菌属物种、沙门氏菌属物种和志贺菌属的多种实验室菌株中的任一种(例如,如Carrier et al.(1992)J.Immunol.148:1176-1181[Carrier等人(1992)免疫学杂志148:1176-1181];美国专利号6,447,784;和Sizemore et al.(1995)Science 270:299-302[Sizemore等人(1995)科学270:299-302]中所述)。可用于本披露的沙门氏菌属菌株的实例包括但不限于伤寒沙门氏菌和鼠伤寒沙门氏菌。合适的志贺菌属菌株包括但不限于弗氏志贺菌、索氏志贺菌和痢疾志贺菌。典型地,实验室菌株是非致病性菌株。其他合适的细菌的非限制性实例包括但不限于恶臭假单胞菌、铜绿假单胞菌、梅瓦诺氏假单胞菌(Pseudomonas mevalonii)、类球红细菌、荚膜红细菌、深红红螺菌和红球菌属。
在一些实施例中,宿主生物体是真核的。合适的真核宿主细胞包括但不限于酵母细胞、昆虫细胞、植物细胞、真菌细胞和藻细胞。
多核苷酸和多肽[核酸和蛋白质]
本披露的蛋白质可以通过本领域已知的任何方法进行制备。蛋白质可以使用固相肽合成或通过也称为液相肽合成的经典溶液肽合成来合成。使用Val-Pro-Pro、依那普利和赖诺普利作为起始模板,可以使用固相或液相肽合成来合成几个系列的肽类似物如X-Pro-Pro、X-Ala-Pro和X-Lys-Pro,其中X代表任何氨基酸残基。还描述了用于进行与可溶性低聚物支持体偶联的肽和寡核苷酸文库的液相合成的方法。Bayer,Ernst and Mutter,Manfred,Nature 237:512-513(1972)[Bayer,Ernst和Mutter,Manfred,自然237:512-513(1972)];Bayer,Ernst,et al.,J.Am.Chem.Soc.96:7333-7336(1974)[Bayer,Ernst等人,美国化学学会杂志96:7333-7336(1974)];Bonora,Gian Maria,et al.,Nucleic AcidsRes.18:3155-3159(1990)[Bonora,Gian Maria等人,核酸研究18:3155-3159(1990)]。液相合成方法优于固相合成方法的优势在于液相合成方法不需要存在于第一反应物上的如下结构,所述结构适合将反应物附接到固相上。另外,液相合成方法不需要避免可能裂解固相和第一反应物(或中间产物)之间的键的化学条件。此外,均相溶液中的反应相比在非均相固相/液相体系中获得的反应(如固相合成中存在的反应)可以给出更好的产率和更完全的反应。
在低聚物支持的液相合成中,生长的产物附接在大的可溶性聚合物基团上。然后基于相对较大的聚合物附接的产物和未反应的反应物之间的大的尺寸差异,可以将来自合成每个步骤的产物与未反应的反应物分离。这允许反应在均相溶液中发生,并且消除了与传统的液相合成相关的繁琐的纯化步骤。低聚物支持的液相合成也适用于肽的自动液相合成。Bayer,Ernst,et al.,Peptides:Chemistry,Structure,Biology,426-432[Bayer,Ernst等人,肽:化学,结构,生物学,426-432]。
对于固相肽合成,所述程序必需将适当的氨基酸顺序组装成具有所希望序列的肽,而生长的肽的末端连接到不溶性支持体上。通常,肽的羧基末端与聚合物连接,在用裂解剂处理后,所述聚合物可以从中释解出来。在常见的方法中,氨基酸与树脂颗粒结合,并且通过连续添加受保护的氨基酸以产生氨基酸链来以逐步方式产生肽。常使用Merrifield描述的对所述技术的修改。参见例如,Merrifield,J.Am.Chem.Soc.96:2989-93(1964)[Merrifield,美国化学学会会志96:2989-93(1964)]。在自动化固相法中,通过将羧基末端氨基酸加载到共价附接于与二乙烯基苯交联的不溶性聚苯乙烯树脂上的有机接头(例如PAM,4-氧甲基苯基乙酰胺基甲基)上来合成肽。末端胺可以通过用叔丁氧基羰基阻断来保护。羟基和羧基基团常通过用O-苄基基团阻断来保护。合成在自动肽合成仪中完成,如可从应用生物系统公司(Applied Biosystems)(加利福尼亚州福斯特市)获得的自动肽合成仪。合成后,可以从树脂中除去产物。根据已建立的方法,使用氢氟酸或三氟甲基磺酸除去阻断基团。常规合成可产生0.5毫摩尔的肽树脂。裂解和纯化后,典型地产生大约60%至70%的产率。通过例如从有机溶剂如甲基-丁基醚中结晶肽,然后溶解在蒸馏水中,并使用透析(如果主题肽的分子量大于约500道尔顿)或反向高压液相色谱(如果肽的分子量小于500道尔顿,例如使用具有0.1%三氟乙酸和乙腈作为溶剂的C18柱)来完成产物肽的纯化。经纯化的肽可以冻干并以干燥状态保存直到使用。所得肽的分析可以使用分析型高压液相色谱(HPLC)和电喷雾质谱(ES-MS)的常见方法来完成。
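In other cases, the protein is produced by recombinant methods. To produce any of the proteins described herein, host cells transformed with an expression vector containing a polynucleotide encoding such a protein can be used. The host cell can be a higher eukaryotic cell such as a mammalian cell, or a lower eukaryotic cell such as yeast, or the host can be a prokaryotic cell such as a bacterial cell. Introduction of the expression vector into the host cell can be accomplished by a variety of methods, including calcium phosphate transfection, DEAE-dextran-mediated transfection, polybrene, protoplast fusion, liposomes, direct microinjection into the nucleus, scrape loading, biolistic transformation, and electroporation. Large-scale production of proteins from recombinant organisms is a well-established process that is practiced on a commercial scale and is well within the capabilities of those skilled in the art.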
密码子优化
编码多核苷酸的一个或多个密码子可被“偏倚”或“优化”以反映宿主生物体的密码子使用。例如,编码多核苷酸的一个或多个密码子可被“偏倚”或“优化”以反映叶绿体密码子使用或核密码子使用。大多数氨基酸由两个或更多个不同的(简并)密码子编码,并且公认各种生物体优先于其他利用某些密码子。“偏倚”或“优化”密码子可以在整个说明书中互换使用。密码子偏倚可以在不同的植物中有不同的倾斜,包括例如在藻类中,与在烟草中相比较。通常,选择的密码子偏倚反映了用本披露的核酸转化的植物(或其中的细胞器)的密码子使用。
对特定密码子使用偏倚的多核苷酸可以从头合成,或者可以使用常规重组DNA技术例如通过定点诱变方法进行基因修饰,以改变一个或多个密码子,使得它们对叶绿体密码子使用偏倚。
序列一致性百分比
适合于确定核酸或多肽序列之间的序列一致性百分比或序列相似性的算法的一个实例是BLAST算法,其描述于例如Altschul et al.,J.Mol.Biol.215:403-410(1990)[Altschul等人,分子生物学杂志215:403-410(1990)]中。用于执行BLAST分析的软件可通过美国国家生物技术信息中心(National Center for Biotechnology Information)公开地获得。BLAST算法的参数W、T、以及X决定了比对的灵敏度与速度。BLASTN程序(对核苷酸序列来说)使用字长(W)为11、期望值(E)为10、截止值(cutoff)为100、M=5、N=4、以及两条链的比较作为默认值。对于氨基酸序列,BLASTP程序使用字长(W)为3、期望值(E)为10、以及BLOSUM62评分矩阵作为默认值(如在例如Henikoff&Henikoff(1989)Proc.Natl.Acad.Sci.USA,89:10915[Henikoff&Henikoff(1989)美国国家科学院院刊89:10915]中所述的)。除计算序列一致性百分比之外,BLAST算法还可执行两个序列之间的相似性统计分析(例如,如在Karlin&Altschul,Proc.Nat'l.Acad.Sci.USA,90:5873-5787(1993)[Karlin&Altschul,美国国家科学院院刊90:5873-5787(1993)]中所述的)。由BLAST算法提供的相似性的一个量度是最小概率和(P(N)),它提供了由此将偶然发生在两个核苷酸序列或氨基酸序列之间的匹配的概率的指示。例如,若在测试核酸与参比核酸的比较中最小概率和小于约0.1、小于约0.01、或小于约0.001,则所述核酸被认为是与所述参比序列相类似的。
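The identity thresholds used throughout this disclosure (at least 80%, 85%, 90%, 95%, or 99%) can be illustrated with a minimal sketch. This is not the BLAST algorithm described above: it assumes the two sequences have already been aligned to equal length (with `-` marking gaps), and the function names `percent_identity` and `meets_threshold` are hypothetical helpers introduced only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption: inputs are already aligned to equal length, with '-' for gaps.
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-residue example with a single mismatch.
    print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))
    print(meets_threshold("ATGCATGCAT", "ATGCATGCAA", threshold=95.0))
```

For real comparisons, an alignment tool such as BLAST would produce the alignment and its statistics; the sketch only shows how a threshold is applied once an alignment exists.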
本披露包括如下系统,所述系统包括:A)病毒整合酶(或重组酶),其与例如对DNA切割能力无活性的Cas蛋白(例如Cas9)共价连接。可替代地,病毒整合酶(或者细菌或噬菌体重组酶)与TALE蛋白或锌指蛋白共价连接,其中这些蛋白被设计为靶向基因组中的特定DNA序列。
这可以在表达载体中或作为纯化蛋白提供。B)待被并入希望的基因组中的感兴趣的基因(或感兴趣的DNA序列),其具有或不具有同源臂。根据需要,GOI或感兴趣的DNA序列可以经修饰以被病毒整合酶识别。例如,可以将病毒att位点添加到DNA序列的末端。C)多核苷酸转染和/或将蛋白质引入细胞所需的其他试剂。
核酸
术语“多核苷酸”、“核苷酸”、“核苷酸序列”、“核酸”以及“寡核苷酸”在本披露中是可互换使用的。它们是指任何长度的核苷酸(脱氧核糖核苷酸或核糖核苷酸)的聚合形式或其类似物。多核苷酸可以具有任何三维结构并且可以执行任何已知或未知的功能。以下是多核苷酸的非限制性实例:基因或基因片段的编码区或非编码区、由连锁分析定义的多个基因座(一个基因座)、外显子、内含子、信使RNA(mRNA)、转移RNA、核糖体RNA、短干扰RNA(siRNA)、短发夹RNA(shRNA)、微RNA(miRNA)、核糖核酸酶、cDNA、重组多核苷酸、分枝多核苷酸、质粒、载体、分离的任何序列的DNA、分离的任何序列的RNA、核酸探针以及引物。多核苷酸可以包括一个或多个经修饰的核苷酸,诸如甲基化核苷酸和核苷酸类似物。如果存在的话,对核苷酸结构的修饰可以在聚合物组装之前或之后赋予。核苷酸的序列可以被非核苷酸组分中断。多核苷酸可以在聚合之后诸如通过与标记组分轭合来进一步修饰。
引导RNA
在本披露的方面中,术语“嵌合RNA”、“嵌合引导RNA”、“引导RNA”、“单引导RNA”和“合成引导RNA”可互换使用,并且指如下多核苷酸序列,所述多核苷酸序列包括引导序列、tracr序列和tracr配对序列。术语“引导序列”指指定了靶位点的在引导RNA内的约20bp(12-30bp)的序列,并可与术语“引导物”或“间隔子”互换使用。术语“tracr配对序列”也可以与术语“同向重复”互换使用。
野生型
如本文所用的,术语“野生型”是本领域技术人员理解的技术术语并且意指在自然界中出现的典型式的生物体、菌株、基因或特征,与突变体或变体形式区分。
变体
如本文所用的,术语“变体”或“突变体”应理解为意指具有源自自然界中存在的模式的性质展示。关于基因,这些术语指示基因中的许多变化,其使得它不同于野生型基因,包括单核苷酸多态性(SNP)、插入、缺失、基因漂移等。
工程化的
术语“非天然存在的”或“工程化的”是可互换使用的并且是指人造技术的介入。所述术语当提及核酸分子或多肽时意指核酸分子或多肽至少基本上与至少一种其他组分分离,所述至少一种其他组分在自然界中与核酸分子或多肽天然缔合并且如自然界中发现的。
互补的
“互补性”是指核酸通过传统的沃森-克里克或其他非传统类型来与另一个核酸序列形成氢键的能力。互补百分比表示核酸分子中可与第二核酸序列形成氢键(例如,沃森-克里克碱基配对)的残基百分比(例如,10分之5、6、7、8、9、10是50%、60%、70%、80%、90%以及100%互补)。“完全互补的”意指核酸序列的所有连续残基都将与第二核酸序列中同样数目的感兴趣的连续残基形成氢键。如本文所用的“基本上互补的”是指在8、9、10、11、12、13、14、15、16、17、18、19、20、21、22、23、24、25、30、35、40、45、50、或更多个核苷酸的区域上的至少60%、70%、75%、80%、85%、90%、95%、97%、98%、99%、或100%或者之间的百分比的互补程度,或者是指在严格条件下杂交的两个核酸。
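As a worked illustration of the percent complementarity defined above (for example, 5 of 10 residues pairing gives 50%), the following minimal sketch counts Watson-Crick pairs between two equal-length strands read antiparallel. It is an assumption-laden simplification: only canonical A-T/A-U and G-C pairs are counted, non-traditional pairs are ignored, and `percent_complementarity` is a hypothetical helper name.

```python
# Canonical Watson-Crick pairs only (DNA and RNA); wobble and other
# non-traditional pairs are deliberately left out of this sketch.
WATSON_CRICK = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(strand: str, other: str) -> float:
    """Percent of positions in `strand` that pair with `other`.

    Both strands are written 5'->3' and must have equal length; `other`
    is reversed so that the comparison is antiparallel.
    """
    if len(strand) != len(other):
        raise ValueError("strands must have equal length")
    if not strand:
        raise ValueError("strands must not be empty")
    pairs = zip(strand.upper(), reversed(other.upper()))
    paired = sum(1 for a, b in pairs if (a, b) in WATSON_CRICK)
    return 100.0 * paired / len(strand)


if __name__ == "__main__":
    print(percent_complementarity("ATGC", "GCAT"))              # fully complementary
    print(percent_complementarity("AAAAAAAAAA", "TTTTTCCCCC"))  # 5 of 10 positions pair
```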
氨基酸
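Full name, three-letter code, one-letter code
Aspartic acid, Asp, D
Glutamic acid, Glu, E
Lysine, Lys, K
Arginine, Arg, R
Histidine, His, H
Tyrosine, Tyr, Y
Cysteine, Cys, C
Asparagine, Asn, N
Glutamine, Gln, Q
Serine, Ser, S
Threonine, Thr, T
Glycine, Gly, G
Alanine, Ala, A
Valine, Val, V
Leucine, Leu, L
Isoleucine, Ile, I
Methionine, Met, M
Proline, Pro, P
Phenylalanine, Phe, F
Tryptophan, Trp, W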
如本文所用的表述“氨基酸”意在包括天然氨基酸和合成氨基酸,以及D和L氨基酸。“标准氨基酸”意指天然存在的蛋白质/肽中常见的二十种标准L-氨基酸中的任一种。“非标准氨基酸残基”意指除标准氨基酸以外的任何氨基酸,不管它是合成制备还是衍生自天然来源。如本文所用的,“合成氨基酸”包括经化学修饰的氨基酸,包括但不限于盐、氨基酸衍生物(如酰胺)和取代。含在本披露的肽内并且具体在羧基末端或氨基末端的氨基酸可以通过甲基化、酰胺化、乙酰化或用其他化学基团取代来修饰,这些可以改变肽的循环半衰期而不会不利地影响其活性。此外,肽中可以存在或不存在二硫键。
氨基酸可以基于侧链R分为七组:(1)脂肪族侧链;(2)含有羟基(OH)基团的侧链;(3)含有硫原子的侧链;(4)含有酸性或酰胺基团的侧链;(5)含有碱性基团的侧链;(6)含有芳香环的侧链;和(7)脯氨酸,即其中侧链与氨基基团融合的亚氨基酸。
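As used herein, the term "conservative amino acid substitution" is defined as an exchange within one of the following five groups:
I. Small aliphatic, nonpolar or slightly polar residues:
Ala, Ser, Thr, Pro, Gly;
II. Polar, negatively charged residues and their amides:
Asp, Asn, Glu, Gln;
III. Polar, positively charged residues:
His, Arg, Lys;
IV. Large aliphatic, nonpolar residues:
Met, Leu, Ile, Val, Cys;
V. Large aromatic residues:
Phe, Tyr, Trp.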
Unless otherwise provided, the present disclosure employs conventional techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant DNA, which are within the skill of the art. See Sambrook, Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel, et al. eds., (1987)); the series METHODS IN ENZYMOLOGY (Academic Press, Inc.): PCR 2: A PRACTICAL APPROACH (M. J. MacPherson, B. D. Hames and G. R. Taylor eds. (1995)), Harlow and Lane, eds. (1988) ANTIBODIES, A LABORATORY MANUAL, and ANIMAL CELL CULTURE (R. I. Freshney, ed. (1987)).
载体
基因表达载体(基于DNA的或病毒的)将用于在细胞或组织中表达融合整合酶,以及为感兴趣的DNA序列(或基因)提供整合酶或重组酶所需的将DNA(或基因)整合到宿主物种或细胞的基因组中的合适位点。许多基因表达载体在本领域是已知的。载体将用于感兴趣的基因(或感兴趣的DNA序列)。载体可以用本领域已知的许多限制酶来切割。
CRISPR/CAS9
在美国专利8697359、US 8889356和Ran等人(Nature Protocols,2013,volume 8,pages 2281-2308[自然实验手册,2013,第8卷,第2281-2308页])中描述了CRISPR/Cas9。Cas9蛋白利用RNA引导物以结合基因组中的特定DNA序列。可以将RNA引导物(引导RNA)设计成长度为从10至40、从12至35、从15至30或例如从18至22或20个核苷酸。参见Hsu et al,Nature Biotechnology,September 2013,volume 31,pages 827-832[Hsu等人,自然生物技术,2013年9月,第31卷,第827-832页],其使用来自化脓链球菌的Cas9。另一种关键Cas9来自金黄色葡萄球菌(比化脓链球菌的Cas9小的Cas9)。Cas9蛋白利用引导RNA来结合DNA序列的特定区域。
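A catalytically inactive form of Cas9 is described in Guilinger et al., Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification, Nature Biotechnology, April 25, 2014, volume 32, pages 577-582. Guilinger et al. attached the catalytically inactive Cas9 to the FokI enzyme to achieve greater specificity in cutting genomic DNA. This catalytically inactive Cas9 allows Cas9 to bind genomic DNA using an RNA guide while being unable to cut that DNA.
Cas9 is also available in its native wild-type (wt) form and in a human codon-optimized form for better expression of Cas9 constructs in cells (see Mali et al., Science, 2013, volume 339, pages 823-826). Codon optimization of Cas9 can be performed depending on the species used to express it. Depending on whether the integrase/Cas9 fusion protein (also referred to as ABBIE1) is produced in protein form or as a nucleotide expression vector, the optimized or non-optimized (wt) form can be used.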
针对特定DNA序列的RNA引导物可以通过各种基于计算机的工具进行设计。
CRISPR/CPF1
Cpf1是另一种蛋白质,它使用引导RNA来结合基因组DNA中的特定序列。Cpf1还切割DNA,制造交错切口。可以使得Cpf1对于切割能力是无催化活性的。
其他CRISPR蛋白
这些是利用引导RNA来靶向特定DNA序列的蛋白质,并且无论它们是否具有切割DNA的能力。这些蛋白质中的一些可能天然具有其他的酶/催化功能。
TALEN
转录激活因子样效应子核酸酶(TALEN)是融合蛋白,其具有通过融合TAL效应子DNA结合结构域和DNA切割结构域而产生的限制酶。这些试剂能够实现高效、可编程且特异性的DNA切割,并代表原位基因组编辑的强大工具。转录激活因子样效应子(TALE)可以被快速工程化以几乎结合任何DNA序列。如本文所用的,术语TALEN是广泛的,并且包括可在不借助另一TALEN的情况下切割双链DNA的单体TALEN。术语TALEN也用于指一对TALEN中的一个或两个成员,所述TALEN被工程化为一起作用来在相同位点切割DNA。一起作用的TALEN可以被称为左TALEN和右TALEN,其参考了DNA的手性。参见US 8,440,432。
TAL效应子是由黄单胞菌属细菌分泌的蛋白质。DNA结合结构域含有高度保守的33-34个氨基酸的序列,除了第12和第13个氨基酸外。这两个位置是高度可变的(重复可变双残基(RVD)),并显示与特定核苷酸识别的强相关性。氨基酸序列和DNA识别之间的这种简单的关系已经允许通过选择含有适当的RVD的重复区段的组合来工程化特定的DNA结合结构域。
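The simple relationship between RVDs and recognized nucleotides can be sketched as a lookup table. The mapping below (NI for A, HD for C, NG for T, NN for G) is the commonly cited canonical RVD code from the general TALE literature; it is not stated in this disclosure and is used here only as an assumption for illustration, as is the helper name `rvds_for_target`.

```python
# Commonly cited canonical RVD code (an assumption; not given in this document).
# NN is often reported to also tolerate A, which this sketch ignores.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target_dna: str) -> list[str]:
    """Return one RVD per base of the target DNA sequence (5'->3')."""
    try:
        return [RVD_FOR_BASE[base] for base in target_dna.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]!r}") from None


if __name__ == "__main__":
    # Hypothetical 16-bp target, in the 16 to 24 bp range mentioned for TALE fusions.
    print("-".join(rvds_for_target("GCGACGGAAAGAGTAT")))
```

Each RVD in the output corresponds to one 33-34 amino acid repeat segment, so a 16-bp target implies an array of 16 repeats in this simplified view.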
整合酶或重组酶可用于构建在酵母或细胞测定中有活性的杂合整合酶或重组酶。这些试剂在植物细胞和动物细胞中也有活性。TALEN研究使用野生型FokI切割结构域,但是一些随后的TALEN研究也使用具有突变的FokI切割结构域变体,所述突变被设计为改进切割特异性和切割活性。TALEN DNA结合结构域和整合酶或重组酶结构域之间的氨基酸残基数目和两个单独的TALEN结合位点之间的碱基数目都是实现高水平活性的参数。TALEN DNA结合结构域和整合酶或重组酶结构域之间的氨基酸残基数目可以通过在多个TAL效应子重复序列和整合酶或重组酶结构域之间引入间隔子(区别于间隔子序列)来进行修饰。间隔子序列可以是6至102或9至30个核苷酸或15至21个核苷酸。除了在DNA靶向蛋白(Cas9、TALE或锌指蛋白)和整合酶或重组酶之间提供连接之外,这些间隔子通常不会为杂合蛋白提供其他活性。本披露中用于间隔子和用于其他用途的氨基酸是
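The relationship between the amino acid sequence of the TALEN binding domain and DNA recognition allows for designable proteins. In this case, artificial gene synthesis is problematic because of improper annealing of the repetitive sequences found in the TALE binding domain. One solution to this problem is to use a publicly available software program called DNAWorks to find oligonucleotides suitable for assembly in a two-step PCR: oligonucleotide assembly followed by whole-gene amplification. A number of modular assembly schemes for generating engineered TALE constructs have also been reported in the art.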
一旦TALEN基因组装在一起,它们被插入到质粒中;然后使用质粒来转染靶细胞,在靶细胞中基因产物表达并进入细胞核以接近基因组。可以使用TALEN来通过诱导双链断裂(DSB)编辑基因组,所述细胞响应于DNA修复,然而,本披露寻求使用病毒整合酶或者细菌或噬菌体重组酶的力量来将感兴趣的DNA序列插入到基因组中的靶向位点。参见WO2014134412和美国专利8748134的披露。
锌指蛋白
用于结合DNA的锌指蛋白及其设计描述于US 7928195、US 2009/0111188和US7951925中。锌指蛋白以特定顺序利用许多连接的锌指结构域来结合特定DNA序列。锌指蛋白核酸内切酶已经良好建立。
锌指蛋白(ZFP)是可以按序列特异性方式与DNA结合的蛋白。锌指首次在来自非洲爪蟾(滑爪蟾(Xenopus laevis))的卵母细胞的转录因子TFIIIA中被鉴定出来。这类ZFP的单个锌指结构域的长度为约30个氨基酸,并且一些结构研究已经证明它含有β转角(含有两个保守的半胱氨酸残基)和α螺旋(含有两个保守的组氨酸残基),其通过这两个半胱氨酸和这两个组氨酸配位锌原子而保持为特定构象。这类ZFP也被称为C2H2ZFP。还提出了另外类别的ZFP。参见例如,Jiang et al.(1996)J.Biol.Chem.271:10723-10730[Jiang等人(1996)生物化学杂志271:10723-10730]中对Cys-Cys-His-Cys(C3H)ZFP的讨论。迄今为止,已经在几千种已知或假定的转录因子中鉴定出了超过10,000个锌指序列。锌指结构域不仅在DNA识别中有涉及,RNA结合和蛋白质-蛋白质结合中也有涉及。当前的估计是,这类分子将占所有人类基因的约2%。
许多锌指蛋白具有保守的半胱氨酸和组氨酸残基,所述残基四面体地配位每个指结构域中的单个锌原子。具体地,大多数ZFP通过如下通用序列的指组分来表征:-Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His-(SEQ ID NO:49,其中X表示任何氨基酸(C2H2ZFP))。这个最广泛代表的类别的锌配位序列含有具特定间隔的两个半胱氨酸和两个组氨酸。每指的折叠结构含有反向平行的β-转角、指尖区和短的两亲性α-螺旋。金属配位配体与锌离子结合,并且在zif268型锌指的情况下,短的两亲性α-螺旋结合在DNA的大沟中。此外,锌指的结构通过某些保守的疏水性氨基酸残基(例如,直接在第一个保守Cys前的残基和在指的螺旋部分的+4位的残基)以及通过保守的半胱氨酸和组氨酸残基的锌配位来稳定。
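The general C2H2 finger motif given above, -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His-, translates directly into a regular expression over one-letter amino acid codes. The sketch below is illustrative only: `find_c2h2_fingers` is a hypothetical helper, and the example string is an illustrative finger-like sequence chosen as an assumption, not a sequence taken from this disclosure.

```python
import re

# C, then 2-4 of any residue, C, then 12 of any residue, H, then 3-5 of any residue, H.
C2H2_PATTERN = re.compile(r"C[A-Z]{2,4}C[A-Z]{12}H[A-Z]{3,5}H")


def find_c2h2_fingers(protein: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of non-overlapping matches to the C2H2 motif."""
    return [m.span() for m in C2H2_PATTERN.finditer(protein.upper())]


if __name__ == "__main__":
    # Illustrative finger-like sequence (an assumption, not from this document).
    example = "YACPVESCDRRFSRSDELTRHIRIH"
    for start, end in find_c2h2_fingers(example):
        print(start, end, example[start:end])
```

A pattern match only shows that the cysteine and histidine spacing fits the motif; it says nothing about folding or DNA-binding specificity.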
可结合基因组DNA中的特定靶序列的其他DNA结合蛋白
所述蛋白包括与锌指蛋白、TALEN和CRISPR蛋白无关的那些,其可能与各种生物体的基因组DNA中的特定序列结合。这些可以包括转录因子、转录阻遏物、大范围核酸酶、核酸内切酶DNA结合结构域等。
整合酶
在US 2009/0011509中描述了整合酶及其核酸内切酶融合蛋白。引入的整合酶是慢病毒整合酶和HIV1(人类免疫缺陷病毒1)整合酶。本披露将无催化活性的(或有催化活性的)Cas9、TALE或锌指蛋白融合至整合酶,以将所述整合酶靶向用户选择的基因组中的特定DNA区域。
与其他逆转录病毒整合酶一样,HIV-1整合酶能够识别位于长末端重复序列(LTR)的U3和U5区域的病毒DNA末端的特殊特征(Brown,1997)。LTR末端是被认为逆转录病毒整合机器识别(顺式)所需的唯一病毒序列。在鼠类和禽类逆转录病毒中,LTR外缘存在短的不完全的反向重复序列(Reicin等人,1995)。与位于逆转录病毒DNA末端最外侧位置3和4处的近端CA一起(位置1和2是3'末端加工的核苷酸),这些序列对于体外和体内正确的前病毒整合来说都是必需的且足够的。内于CA二核苷酸的序列对于最佳整合酶活性似乎是重要的(Brin&Leis,2002a;Brin&Leis,2002b;Brown,1997)。已显示HIV-1LTR的末端15bp对于体外正确的3'末端加工和链转移反应来说是至关重要的(Reicin等人,1995;Brown,1997)。HIV-1IN使用更长的底物比使用更短的底物更有效,这指示结合相互作用从病毒DNA末端向内延伸至少14-21bp。Brin和Leis(2002a)分析了HIV-1LTR的特定特征,并得出结论,对于IN催化的协同DNA整合来说U3和U5LTR识别序列都是被需要的,即使U5LTR是体外IN加工更有效的底物(Bushman&Craigie,1991;Sherman等人,1992)。IN识别序列的位置17-20是协同DNA整合机制所需要的,但是HIV-1IN在从不变的近端CA二核苷酸延伸的U3和U5末端中容忍相当大的变化(Brin&Leis,2002b)。本披露包括在容纳待整合到基因组中的感兴趣的DNA序列或基因的位置的5'和3'末端含有病毒(逆转录病毒或HIV)LTR区域的DNA载体。LTR区域不必是全长LTR,只要它们与整合酶相互作用以进行正确的整合即可。LTR区域可以经修饰以含有可检测物(例如荧光)、PCR检测或选择性标记(例如抗生素抗性)。载体被设计成被切割和线性化,使得LTR区域在DNA片段的5'和3'末端(通过设计的限制性位点到限制性核酸内切酶)。
整合酶由通过柔性接头连接的三个结构域组成。这些结构域是N末端HH-CC锌结合结构域、催化核心结构域和C末端DNA结合结构域(Lodi et al,Biochemistry,1995,volume34,pages 9826-9833[Lodi等人,生物化学,1995,第34卷,第9826-9833页])。在本披露的一些方面,与Cas9(或其他DNA结合分子)结合的整合酶将不具有C末端结合结构域。在本披露的一个方面,将产生两种不同的融合蛋白,其中一种具有与整合酶的N末端锌结合结构域融合的无催化活性的Cas9(或者TALE或锌指蛋白),并且另一种具有与整合酶的催化核心结构域融合的无催化活性的Cas9(或者TALE或锌指蛋白)。这两种不同的融合蛋白将被设计为如TALE-Fok1或锌指-Fok1系统所见的与基因组DNA的相反链结合。以此方式,当N末端结构域和催化核心接触时,在基因组DNA上的位点处,它将展现整合酶活性。由于整合酶的完整活性也被观察到涉及整合酶的四聚体,所以融合蛋白可以被设计为具有通过柔性接头连接的1、2、3、4个整合酶蛋白,所述柔性接头的长度可以为1至20个氨基酸或长度可以为4-12个氨基酸。
重组酶
在US 8816153和US 2004/0003420中描述了重组酶,包括Cre、Flp、R、Dre、Kw和Gin重组酶。重组酶如Cre重组酶使用LoxP位点以从基因组中切除序列。重组酶可以经修饰以变得在其重组活性方面有组成性活性,并且也变得有较小位点特异性。因此,有可能通过并入本披露的融合蛋白中而将无序列特异性的此类有组成性活性的重组酶蛋白靶向基因组中的特定DNA序列。以此方式,CRISPR/Cas9、TALE或锌指蛋白结构域指定DNA序列,其中重组酶将有助于其重组活性。此类重组酶蛋白可以是野生型的、有组成性活性的或对重组酶活性而言无活性的。Cas9-重组酶如Cas9-Gin或Cas9-Cre可以通过使用接头序列或通过直接融合来产生。
融合蛋白的核定位信号序列(NLS)
信号肽结构域(也称为“NLS”)例如源自于酵母GAL4、SKI3、L29或组蛋白H2B蛋白、多瘤病毒大T蛋白、VP1或VP2衣壳蛋白、SV40 VP1或VP2衣壳蛋白、腺病毒E1a或DBP蛋白、流感病毒NS1蛋白、肝炎病毒核心抗原或哺乳动物核纤层蛋白、c-myc、max、c-myb、p53、c-erbA、jun、Tax、类固醇受体或Mx蛋白(参见Boulikas,Crit.Rev.Eucar.Gene Expression,3,193-227(1993)[Boulikas,真核基因表达关键评论,3,193-227(1993)])、猿猴病毒40(“SV40”)T抗原(Kalderon et.al,Cell,39,499-509(1984)[Kalderon等人,细胞,39,499-509(1984)])或其他具有已知核定位的蛋白质。NLS例如源自于SV40T-抗原,但可以是本领域已知的其他NLS序列。可以使用串联NLS序列。
接头区域
在被合成的融合蛋白/肽之间使用的各种接头将由氨基酸组成。在DNA水平上,这些由如遗传密码中已知的3个碱基对(bp)密码子代表。接头的长度可以是从1至1000个氨基酸,并且可以是之间的任何整数。例如,接头的长度为从1至200个氨基酸,或接头的长度为从1至20个氨基酸。
表达载体
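Many nucleic acids can be introduced into cells to cause expression of a gene. As used herein, the term nucleic acid includes DNA, RNA, and nucleic acid analogs, as well as nucleic acids that are double-stranded or single-stranded (i.e., a sense or an antisense single strand). Nucleic acid analogs can be modified at the base moiety, sugar moiety, or phosphate backbone to improve, for example, the stability, hybridization, or solubility of the nucleic acid. Modifications at the base moiety include deoxyuridine for deoxythymidine, and 5-methyl-2'-deoxycytidine and 5-bromo-2'-deoxycytidine for deoxycytidine. Modifications of the sugar moiety include modification of the 2' hydroxyl of the ribose sugar to form 2'-O-methyl or 2'-O-allyl sugars. The deoxyribose phosphate backbone can be modified to produce morpholino nucleic acids, in which each base moiety is linked to a six-membered morpholino ring, or peptide nucleic acids, in which the deoxyphosphate backbone is replaced by a pseudopeptide backbone and the four bases are retained. See Summerton and Weller (1997) Antisense Nucleic Acid Drug Dev. 7(3):187; and Hyrup et al. (1996) Bioorgan. Med. Chem. 4:5. In addition, the deoxyphosphate backbone can be replaced with, for example, a phosphorothioate or phosphorodithioate backbone, a phosphoroamidite backbone, or an alkyl phosphotriester backbone. The nucleic acid sequence can be operably linked to a regulatory region such as a promoter. The regulatory region can be from any species. As used herein, operably linked refers to the positioning of the regulatory region relative to the nucleic acid sequence in a way that permits or facilitates transcription of the target nucleic acid. Any type of promoter can be operably linked to the nucleic acid sequence. Examples of promoters include, without limitation, tissue-specific promoters, constitutive promoters, and promoters that are responsive or unresponsive to a particular stimulus (e.g., inducible promoters).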
在核酸构建体中可能有用的另外的区域包括但不限于聚腺苷酸化序列、翻译控制序列(例如,内部核糖体进入区段,IRES)、增强子、诱导型元件或内含子。此类调控区可能不是必需的,虽然它们可以通过影响转录、mRNA的稳定性、翻译效率等来增加表达。此类调控区可以如所希望的包括在核酸构建体中以获得细胞中核酸的最佳表达。有时在没有这些另外的元件的情况下可以获得足够的表达。
可使用编码信号肽或选择性标记的核酸构建体。可以使用信号传导(标记)肽,使得编码的多肽被导向特定的细胞位置(例如细胞表面)。此类选择性标记的非限制性实例包括嘌呤霉素、更昔洛韦、腺苷脱氨酶(ADA)、氨基糖苷磷酸转移酶(neo、G418、APH)、二氢叶酸还原酶(DHFR)、潮霉素-B-磷酸转移酶、胸苷激酶(TK)和黄嘌呤-鸟嘌呤磷酸核糖基转移酶(XGPRT)。这些标记对于选择培养物中稳定的转化体是有用的。其他选择性标记包括荧光多肽,如绿色荧光蛋白、红色荧光蛋白或黄色荧光蛋白。
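Nucleic acid constructs can be introduced into any type of cell using a variety of biological techniques known in the art. Non-limiting examples of these techniques include the use of transposon systems, recombinant viruses that can infect cells, or liposomes or other non-viral methods capable of delivering nucleic acids to cells, such as electroporation, microinjection, or calcium phosphate precipitation. A system called Nucleofection™ can also be used.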
可以将核酸并入载体中。载体是广泛的术语,其包括任何特定的DNA区段,被设计为从运载体移动到靶DNA。载体可以被称为表达载体或载体系统,其是使DNA插入到基因组或其他靶向DNA序列(如附加体、质粒或甚至病毒/噬菌体DNA区段)中所需的一组组分。载体最常见地含有包括一个或多个表达控制序列的一个或多个表达盒,其中表达控制序列是分别控制和调控另一个DNA序列或mRNA的转录和/或翻译的DNA序列。
许多不同类型的载体在本领域是已知的。例如,质粒和病毒载体(包括逆转录病毒载体)是已知的。哺乳动物表达质粒典型地具有复制起点、合适的启动子和任选的增强子,以及任何必需的核糖体结合位点、聚腺苷酸化位点、剪接供体和受体位点、转录终止序列和5'侧翼非转录序列。此类载体包括质粒(其也可以是另一种载体的运载体)、腺病毒、腺相关病毒(AAV)、慢病毒(例如,经修饰的HIV-1、SIV或FIV)、逆转录病毒(例如,ASV、ALV或MoMLV)和转座子(P-元件、Tol-2、Frog Prince、piggyBac等)。
用于本披露的细菌和病毒基因以及蛋白质在下文标题为“本披露的序列”的部分中列出。其他病毒整合酶,例如来自小鼠乳腺肿瘤病毒(MMTV)和腺病毒的那些,也可用于本文披露的方法和组合物中。
经编辑的细胞的合并群体被认为是已经接受基因编辑的细胞和没有接受基因编辑的细胞的混合物。
示例性ABBIE1体外测定
1)用引导RNA孵育ABBIE1蛋白;
2)用具有部分LTR的供体DNA孵育ABBIE1/引导RNA以形成起始前复合物;
3)用含有待编辑基因(例如CXCR4)的质粒孵育起始前复合物;并且
4)用于供体DNA整合的PCR和DNA测序确认。
例如,在Gagnon等人,2014,http://labs.mcb.harvard.edu/schier/VertEmbryo/Cas9_Protocols.pdf中描述了Cas9方案。
例如Merkel et al.,Methods,2009,volume 47,pages 243–248[Merkel等人,方法,2009,第47卷,第243-248页]中描述了整合酶活性的测定。
实例
以下实例旨在提供本披露的应用的说明。以下实例并不旨在完全限定或以其它方式限制本披露的范围。本领域技术人员将会理解,本领域已知的许多其他方法可以代替本文具体描述或参考的方法。
实例1:用于表达CAS9-整合酶融合蛋白的DNA载体
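The DNA sequence of catalytically inactive Cas9 is incorporated into an expression vector with a 12, 15, 18, 21, 24, 27, or 30 bp spacer (encoding 4, 5, 6, 7, 8, 9, or 10 amino acids as the linker between Cas9 and the integrase) and HIV1 integrase. In other experiments, recombinases of bacterial or phage origin are used instead of the integrase. These include Hin recombinase (SEQ ID NO:25) and Cre recombinase (SEQ ID NO:26), with or without mutations that allow them to recombine DNA at any other site. A His or cMyc tag (or another sequence used for protein purification) can be included to isolate the fusion protein. The expression vector uses a promoter that will be active in the cells to which the vector is provided. CMV (the cytomegalovirus promoter) is commonly used in expression vectors for mammalian cells. The U6 promoter is also commonly used. In certain embodiments, the T7 promoter can be used for in vitro transcription.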
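The spacer lengths in Example 1 follow from the three-bases-per-codon rule: each in-frame spacer of N bp encodes N/3 linker residues. A minimal sketch of that arithmetic is shown below; `linker_residues` is a hypothetical helper name, and the check that the spacer is a multiple of three reflects the assumption that the linker must keep Cas9 and the integrase in the same reading frame.

```python
def linker_residues(spacer_bp: int) -> int:
    """Number of amino acids encoded by an in-frame spacer of the given length."""
    if spacer_bp <= 0 or spacer_bp % 3 != 0:
        raise ValueError("spacer length must be a positive multiple of 3 bp")
    return spacer_bp // 3


# Spacer lengths listed in Example 1.
SPACERS_BP = [12, 15, 18, 21, 24, 27, 30]

if __name__ == "__main__":
    for bp in SPACERS_BP:
        print(f"{bp} bp spacer -> {linker_residues(bp)} amino acid linker")
```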
实例2:用于表达感兴趣的DNA序列(感兴趣的基因)的DNA载体
感兴趣的DNA序列将被插入到适当的表达载体中,并且位点将被适当地添加到感兴趣的DNA序列中,使得HIV1整合酶将识别所述序列以整合到基因组中。这些位点被称为att位点(U5和U3att位点)(参见Masuda et al,Journal of Virology,1998,volume 72,pages 8396-8402[Masuda等人,病毒学杂志,1998,第72卷,第8396-8402页])。基因组中靶位点的同源臂可以被包括在感兴趣的DNA(基因)序列的5'和3'末端侧翼的区域中(参见Ishii et al,PLOS ONE,September 24,2014,DOI:10.1371/journal.pone.0108236[Ishii等人,公共科学图书馆·综合,2014年9月24日,DOI:10.1371/journal.pone.0108236])。使用重组酶时,可以不包括整合酶识别位点。将包括标记如药物抗性标记(例如杀稻瘟菌素或嘌呤霉素)以检查感兴趣的DNA序列的插入并帮助测定基因组中的随机插入。这些抗性标记可以按从靶向基因组着陆点(landing pad)去除它们的方式进行工程化。例如,将嘌呤霉素抗性基因侧接LoxP位点并引入外源表达的CRE会去除内部序列,留下含有LoxP位点的瘢痕。
实例3:用于逆转录酶表达的DNA载体
逆转录酶也可以在这样的系统中共表达,如在载体中设计的感兴趣的DNA序列(基因)将被表达为RNA,并将必须被转换回DNA而由整合酶进行整合。逆转录酶可以是病毒来源的(例如逆转录病毒如HIV1)。这可以被并入与感兴趣的DNA序列相同的载体内。
实例4:DNA靶向整合酶(或重组酶)与感兴趣的DNA序列的共表达
针对上述载体连同基因组中靶位点所需的Cas9RNA引导物,细胞被电穿孔。在一些实验中,创建了表达所有组分(融合Cas9/整合酶(或重组酶)、Cas9RNA引导物以及具有整合酶识别位点且具有或没有同源臂的感兴趣的DNA序列)的载体。逆转录酶也可以在这样的系统中共表达,如在载体中设计的感兴趣的DNA序列(基因)将被表达为RNA,并将必须被转换回DNA而由整合酶进行整合。逆转录酶可以是病毒来源的(例如逆转录病毒如HIV1)。在其他实验中,在引入细胞之前,感兴趣的DNA序列被线性化。必须设计Cas9RNA引导序列和感兴趣的DNA序列,并在使用之前通过标准分子生物学方案将其插入载体中。
实例5:测试实验和脱靶插入测定
用其中包括感兴趣的基因的以上载体对缺失特定基因的表达的细胞(如来自敲除小鼠模型的小鼠胚胎成纤维细胞)或基因工程化为给定基因敲除的细胞进行转染或电穿孔。使用被设计为覆盖插入的基因以及侧翼基因组序列的嵌合引物组来筛选经编辑的细胞的初始库。然后进行有限稀释克隆(LDC)和/或FACS分析以确保单克隆性。进行下一代测序(NGS)或单核苷酸多态性(SNP)分析作为最终的质量控制步骤以确保对于所设计的编辑而言经分离的克隆是同源的。其他筛选机制可以包括但不限于qRT-PCR和用适当抗体进行的蛋白质印迹。如果蛋白质与细胞的某种表型相关,则可以针对该表型的挽救而检查细胞。针对DNA插入的特异性测定细胞的基因组并找到脱靶插入的相对数目(如果有的话)。
实例6:CAS9连接的整合酶蛋白质表达和分离
针对大肠杆菌或昆虫细胞中的基因表达设计的载体将被并入大肠杆菌或昆虫细胞中并允许在给定的时间段表达。将利用几种设计来产生Cas9(或无活性Cas9)连接的整合酶蛋白。载体还将并入标签(不限于His或cMyc标签)以最终以高纯度和产率分离蛋白质。嵌合蛋白的制备将包括但不限于标准色谱技术。蛋白质也可以被设计具有一个或多个NLS(核定位信号序列)和/或TAT序列。核定位信号允许蛋白质进入细胞核。TAT序列允许蛋白质更容易地进入细胞(它是细胞穿透肽)。可以考虑本领域中的其他细胞穿透肽。在足够的表达时间发生后,将从细胞中收集蛋白质裂解物,并取决于所使用的标签在适当的柱中进行纯化。然后将经纯化的蛋白质置于适当的缓冲溶液中并储存在-20℃或-80℃下。
实例7:使用CAS9-整合酶以在转录起始位点的正上游并入终止密码子
本披露包括建立敲除细胞系或生物体的方法。上述系统与感兴趣的DNA序列一起使用,所述感兴趣的DNA序列是1、3、6、10、15或20个连续的终止密码子,其待恰被置于靶基因的ATG起始位点之后。这将建立有效的基因敲除,因为转录/翻译将在到达ATG起始位点之后的立即终止密码子时终止。另外的终止密码子将有助于防止可能的转录酶通过(如果转录酶绕道第一终止密码子的话)。
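The stop-codon insert described in Example 7 can be sketched as a short sequence-building routine. The sketch makes several assumptions that are not in the source: the stop codons are written in DNA form (TAA, TAG, TGA), they are cycled in that order, the LTR/att ends that a real donor would carry are omitted, and the helper names `stop_cassette` and `first_stop_index` as well as the toy coding sequence are hypothetical.

```python
from itertools import cycle, islice

# DNA-form stop codons (assumed order; the source does not specify one).
STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_cassette(n_codons: int) -> str:
    """Build n consecutive in-frame stop codons, cycling TAA, TAG, TGA."""
    if n_codons < 1:
        raise ValueError("at least one stop codon is required")
    return "".join(islice(cycle(STOP_CODONS), n_codons))


def first_stop_index(orf: str) -> int:
    """Codon index of the first in-frame stop codon, or -1 if there is none."""
    orf = orf.upper()
    for i in range(0, len(orf) - len(orf) % 3, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return -1


if __name__ == "__main__":
    toy_cds = "ATG" + "GCGACGGAAAGAGTATGA"  # hypothetical coding sequence
    edited = toy_cds[:3] + stop_cassette(3) + toy_cds[3:]
    print(edited)
    print(first_stop_index(edited))
```

Because the cassette length is always a multiple of three, the insert itself does not shift the downstream reading frame; the example lengths of 1, 3, 6, 10, 15, or 20 codons correspond to 3 to 60 bp of inserted stop codons.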
实例8:使用ABBIE1(或具有其他特定DNA结合结构域的其他变体)作为纯化蛋白来编辑细胞的基因组
在合适的缓冲液中用具有病毒LTR区域的可插入/可整合的DNA孵育经分离的Abbie1蛋白(与逆转录病毒整合酶连接的其他特异性DNA序列结合蛋白)。(取决于实例用于形成四聚体或其他多聚体)。可替代地,可以将经分离的Abbie1蛋白与引导RNA的预制组合物与可插入的DNA序列组合。包括引导RNA并孵育以并入引导RNA。将Abbie1/DNA制剂转染或电穿孔(或向细胞提供蛋白质的其他技术)到细胞中。允许发生基因组/DNA编辑的时间。检查设计的可插入的DNA序列插入细胞基因组DNA的特定位点中。通过PCR和DNA测序检查非特异性插入。
按照当前计划的,细菌表达载体将是pMAL-c5e,它是来自NEB的停售产品,并且也是Genscript的内部克隆选择之一。克隆密码子优化的Spy Cas9,其具有his标签和与麦芽糖结合蛋白(MBP)标签符合读框的TEV蛋白酶切割位点。ORF在可诱导的Tac启动子之下,并且载体也编码lac阻遏物(LacI)以进行更严格的调控。MBP将仅用作稳定标签而不是纯化标签,因为直链淀粉树脂相当昂贵。将经表达的可溶性物质用Ni亲和色谱纯化,然后将Cas9通过TEV蛋白酶从MBP中释放出来,通过阳离子交换色谱进行纯化,并通过凝胶过滤进行精制。
实例9:融合蛋白的构建体的设计
设计序列特异性锌指结构域、TALE或引导RNA,用于针对靶DNA序列的基于CRISPR的方法。使用选择的在线设计软件。
产生DNA构建体,其具有整合酶、转座酶或重组酶的编码序列;合适的氨基酸接头;适当的锌指、TALE或CRISPR蛋白(例如Cas9、Cpf1);和核定位信号(或线粒体定位信号),以形成位点特异性融合整合酶蛋白。以多个安排设想了这些。如果希望的话,可以包括合适的标签用于蛋白质分离和纯化(例如麦芽糖结合蛋白(MBP)或His标签)。
DNA构建体可以利用本领域常用的哺乳动物细胞启动子或细菌启动子(例如CMV、T7等)。
可以用大肠杆菌作为来源产生重组融合蛋白。通过本领域中的标准方法(例如MBP柱、镍-琼脂糖柱等)分离蛋白质。
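Assemble the donor-RNP complex: duplex the RNA oligonucleotides and mix them with the fusion protein of the invention (when the fusion protein contains an endonuclease-inactive CRISPR-associated protein used for its DNA-binding ability, e.g., ABBIE1). These RNP-forming steps are not necessary for zinc finger domains and TALEs.
1. Mix the donor DNA, which has the appropriate LTR domains and the insertable sequence, with the fusion protein and incubate for 10 minutes. (Alternatively, add the donor DNA after the RNP complex has formed.)
2. Resuspend each RNA oligonucleotide (crRNA and tracrRNA) in nuclease-free IDTE buffer, for example to a final concentration of 100 μM.
3. Mix the two RNA oligonucleotides at equimolar concentrations in a sterile microcentrifuge tube. For example, use the following table to prepare a final duplex concentration of 3 μM:
Component, amount
100 μM crRNA, 3 μL
100 μM tracrRNA, 3 μL
Nuclease-free duplex buffer, 94 μL
Final volume, 100 μL
4. Heat at 95°C for 5 min.
5. Remove from heat and allow to cool to room temperature (15°C-25°C) on the benchtop.
6. If needed, dilute the duplexed RNA to a working concentration (e.g., 3 μM) in nuclease-free duplex buffer.
7. Dilute the fusion protein to a working concentration (e.g., 5 μM) in working buffer (20 mM HEPES, 150 mM KCl, 5% glycerol, 1 mM DTT, pH 7.5).
8. For each transfection, combine 1.5 pmol of duplexed RNA oligonucleotides (step 6) with 1.5 pmol of fusion protein (step 7) in Opti-MEM medium to a final volume of 12.5 μL.
9. Incubate at room temperature for 5 min to assemble the RNP complex.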
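The volumes in the steps above follow from the dilution relation C1·V1 = C2·V2. The sketch below reproduces the duplex concentration from the mixing table and the volumes needed to deliver 1.5 pmol from the example working concentrations (3 μM duplex, 5 μM fusion protein). The function names are hypothetical helpers introduced for illustration.

```python
def final_concentration_uM(stock_uM: float, stock_uL: float, total_uL: float) -> float:
    """C2 = C1 * V1 / V2, with concentrations in uM and volumes in uL."""
    return stock_uM * stock_uL / total_uL


def volume_for_pmol_uL(amount_pmol: float, concentration_uM: float) -> float:
    """Volume (uL) holding the given amount; 1 uM equals 1 pmol per uL."""
    return amount_pmol / concentration_uM


if __name__ == "__main__":
    total_uL = 3 + 3 + 94  # crRNA + tracrRNA + duplex buffer, from the table
    print("duplex concentration (uM):", final_concentration_uM(100, 3, total_uL))
    print("duplex volume for 1.5 pmol (uL):", volume_for_pmol_uL(1.5, 3))
    print("protein volume for 1.5 pmol (uL):", round(volume_for_pmol_uL(1.5, 5), 3))
```

Under these example concentrations, the RNA and protein together occupy less than 1 μL, so most of the 12.5 μL per-transfection volume is Opti-MEM.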
实例10:在96孔板中逆转染GRNA融合蛋白
2.孵育期间(步骤B1),使用不具有抗生素的完全培养基将经培养的细胞稀释至400,000个细胞/mL。
3.在孵育完成时,将25μL转染复合物(来自步骤B1)添加到96孔组织培养板。
4.将125μL经稀释的细胞(来自步骤B2)添加到96孔组织培养板(50,000个细胞/孔;RNP的终浓度将为10nM)。
5.在组织培养箱(37℃,5%CO2)中将含有转染复合物和细胞的板孵育48小时。为了检测中靶突变,使用用适当的引物(供体序列内的引物和围绕靶插入位点的引物)进行的PCR。
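The per-well figures quoted in this example (50,000 cells per well and a final RNP concentration of 10 nM) can be checked with a short calculation. The sketch assumes that the 25 μL of transfection complex carries the full 1.5 pmol of RNP assembled per transfection, which is an inference from the preceding example rather than something stated outright; the function names are hypothetical.

```python
def cells_per_well(density_per_mL: float, volume_uL: float) -> float:
    """Number of cells delivered in the given volume of cell suspension."""
    return density_per_mL * volume_uL / 1000.0


def concentration_nM(amount_pmol: float, volume_uL: float) -> float:
    """Concentration in nM; pmol per uL is uM, so multiply by 1000 for nM."""
    return amount_pmol / volume_uL * 1000.0


if __name__ == "__main__":
    complex_uL, cells_uL = 25.0, 125.0
    print("cells per well:", cells_per_well(400_000, cells_uL))
    # Assumption: the 25 uL complex contains 1.5 pmol of RNP.
    print("final RNP (nM):", round(concentration_nM(1.5, complex_uL + cells_uL), 3))
```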
实例11:用于测试CRISPR/CAS9的特异性的方案
产生与生物素连接的dCas9(无DNA切割活性的Cas9)(dCas9-生物素)。Cas9(化脓链球菌、金黄色葡萄球菌等)。下面描述了生物素化方法。
生物素化方法#1:在N末端或C末端工程化avi标签(约15个残基),像WT(未标签化的)蛋白质一样表达并纯化。使用大肠杆菌生物素连接酶(BirA)和生物素来生物素化avi标签化的Cas9。使用这个方案来生物素化趋化因子。相信关于avi标签技术的IP在几年前就到期了。
生物素化方法#2.1:用琥珀酰亚胺基-酯官能化的生物素可以在表面暴露的赖氨酸残基处并入(不需要酶促反应)。对于和Cas9一样大的蛋白质,这可能是可行的选择。
生物素化方法#2.2:沿着同样的思路,生物素-马来酰亚胺是可商购的,并且它们可以在表面暴露的半胱氨酸处缀合(无酶)。
将完成测试以表征生物素化的Cas9在DNA结合和切割方面的表现。
链霉亲和素包衣的96孔板是可商购的,但也可以内部生产。
将dCas9-生物素结合到塑料板(96孔、24孔、384孔等)上。
为每个孔提供设计的引导RNA。允许引导RNA与Cas9蛋白相互作用的时间。
向每个孔提供基因组DNA或具有靶向序列的DNA。允许Cas9结合到DNA的时间。
用适当的缓冲液洗涤各孔。
提供适配子(DNA寡聚体)。允许时间结合。
限制性消化基因组DNA,以使其更易控制,并且更容易连接适配子。
洗涤各孔。
进行DNA测序以鉴定结合位点(中靶与脱靶)。
实例12:通过ABBIE1进行NRF2编辑
图5示出了使用引导NrF2-sgRNA2和sgRNA3靶向Nrf2的外显子2的Abbie1基因编辑。通过Abbie1编辑,针对敲除,PCR筛选外显子2靶向Nrf2基因座。使用引导NrF2-sgRNA 2和3靶向Nrf2的外显子2的Abbie1转染显示供体在靶向区域的整合。独特的条带被鉴定为1-8。
图6示出了由Abbie1基因编辑产生的理论数据。通过Abbie1系统可视化插入的供体DNA以使用sgRNA 1-3靶向基因组材料的DNA凝胶电泳的表示。黑色条带代表由于PCR方法而产生的背景产物。红色条带代表扩增插入片段和插入片段区域侧翼的遗传材料所产生的独特产物。多个条带代表靶向区域可能的多重插入。
图7示出了使用引导Nrf2-sgRNA3靶向Nrf2的外显子2的Abbie1基因编辑。通过Abbie1编辑,针对敲除,PCR筛选外显子2靶向Nrf2基因座。使用引导NrF2-sgRNA3靶向Nrf2的外显子2表明供体插入,如设计为供体序列的PCR引物和预期插入的相邻位点所指示的。独特的条带被鉴定为1-4。
图8示出了在合并的Hek293T细胞中Nrf2的Abbie1敲除。(A)使用针对55kD同种型的多克隆抗体(圣克鲁兹生物公司(Santa Cruz Bio))进行蛋白质印迹分析,显示在合并的HEK293T群体中Nrf2的敲除。(B)GAPDH(圣克鲁兹生物公司)加载控制。
图9示出了在合并的Hek293T细胞中Nrf2的Abbie1敲除。(A)利用针对Nrf2的单克隆抗体(艾博康公司(Abcam))的蛋白质印迹分析,显示在HEK 293t细胞中Nrf2合并群体的敲除。(B)GAPDH加载控制。(C)光密度测定分析的平均值,显示与对照相比表达比率的降低。
经Abbie1处理的细胞产生独特的PCR条带,指示供体DNA的整合。通过用独特且不同的抗体探测两个同种型的蛋白质印迹分析证实在HEK293T合并细胞系中敲除的表型确认。在两周之内在合并的群体中观察到整合敲除约80%。
实例13:通过ABBIE1进行CXCR4编辑
图10示出了靶向CXCR4外显子2的Abbie1基因编辑。通过Abbie1编辑进行靶向CXCR4的外显子2的PCR筛选。针对感兴趣的区域设计了四组引物。组号2和组号4似乎已经产生了独特的条带,表明在感兴趣的区域供体DNA的整合。
实例14:使用ABBIE1在NRF2基因座处针对敲入实验进行转染。
注意:500ng蛋白质和120ng sgRNA用于单一反应。DNA的量取决于供体构建体的大小。可以在提供/转染/电穿孔到细胞之前、期间或之后将供体DNA(具有LTR序列的DNA)用ABBIE1进行孵育。所有反应均在无菌生物安全柜中进行。
第1天:将人胚肾(HEK 293T)细胞以每孔200,000个HEK293T细胞(ATCC)于补充有10%胎牛血清(欧米茄科学公司(Omega Scientific))的500μL DMEM(吉博科公司(Gibco))中接种在24孔培养板(康宁公司(Corning))中。允许细胞恢复24小时。
第2天:ABBIE1制备:
管1:
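In reduced-serum transfection medium (OptiMEM, Life Technologies), incubate purified ABBIE1 protein (SEQ ID NO:58) and donor DNA (SEQ ID NO:101) at a 1:1 molar ratio for 10 minutes at room temperature. Add sgRNA to the protein/DNA complex at a 1.3-fold molar excess (approximately 120 ng) and continue incubating at room temperature for another 10 minutes. The volume of this mixture is 25 μl.
Tube 2:
Add 2 μL of transfection reagent (RNAiMAX, Life Technologies) to 23 μL of OptiMEM and incubate at room temperature for 10 minutes.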
组合管1和管2(50μl终体积)并在室温下孵育15分钟。
向孔中加入全部50μL转染混合物。
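The molar ratios in this protocol (1:1 protein to donor DNA, 1.3-fold molar excess of sgRNA) require converting the stated masses (500 ng protein, about 120 ng sgRNA) into moles. A minimal sketch of that conversion follows. The molecular weights used in the demonstration are rough placeholder assumptions, not values from this disclosure, and should be replaced with the actual molecular weights of the protein and sgRNA in use; the function names are hypothetical.

```python
def pmol_from_ng(mass_ng: float, mw_g_per_mol: float) -> float:
    """Convert a mass in ng to pmol; ng / (g/mol) gives nmol, times 1000 gives pmol."""
    return mass_ng / mw_g_per_mol * 1000.0


def molar_ratio(mass_a_ng: float, mw_a: float, mass_b_ng: float, mw_b: float) -> float:
    """Moles of A per mole of B."""
    return pmol_from_ng(mass_a_ng, mw_a) / pmol_from_ng(mass_b_ng, mw_b)


if __name__ == "__main__":
    # Placeholder molecular weights (assumptions for illustration only).
    ASSUMED_PROTEIN_MW = 190_000.0  # g/mol, hypothetical Cas9-integrase fusion
    ASSUMED_SGRNA_MW = 33_000.0     # g/mol, hypothetical ~100-nt sgRNA
    print("protein pmol:", round(pmol_from_ng(500, ASSUMED_PROTEIN_MW), 2))
    print("sgRNA pmol:", round(pmol_from_ng(120, ASSUMED_SGRNA_MW), 2))
    print("sgRNA:protein ratio:",
          round(molar_ratio(120, ASSUMED_SGRNA_MW, 500, ASSUMED_PROTEIN_MW), 2))
```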
转染后48小时收获合并的经编辑的细胞的一半,用于验证合并群体中的基因组DNA编辑。经编辑的基因组的验证是通过聚合酶链式反应(PCR)来进行。按照以下所述对靶向区域进行PCR(参见PCR方案),将剩余物接种到6cm培养皿(康宁公司)上并允许其恢复48小时。
第5天:通过蛋白质印迹筛选表型变化。
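Standard western blot analysis of the NrF2 isoforms was performed using primary antibodies targeting the 55 kD isoform (Santa Cruz Bio, sc-722) and the 98 kD isoform (Abcam, ab-62352). GAPDH (Santa Cruz Bio, sc-51907) was used as the loading control.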
实例15:针对NRF2和CXCR4基因座使用ABBIE1进行的基因编辑的检测的PCR条件。
人类Nrf2的登录号
Uniprot:Q16236
Ensembl基因ID:ENSG00000116044
针对Nrf2(外显子2)的编辑靶序列和PAM:用于sgRNA设计1-3。
GCGACGGAAAGAGTATGAGC TGG
TATTTGACTTCAGTCAGCGA CGG
TGGAGGCAAGATATAGATCT TGG
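The three Nrf2 target entries above each pair a protospacer with a PAM. The sketch below checks that each entry has a 20-nt protospacer followed by an NGG PAM, the PAM form generally associated with Streptococcus pyogenes Cas9 (an assumption about which Cas9 these guides were designed for). The labels sgRNA1 to sgRNA3 assume the sequences are listed in design order, which the source does not state explicitly, and `check_target` is a hypothetical helper.

```python
import re

# Target entries as listed: "<protospacer> <PAM>". Labels assume listing order.
NRF2_TARGETS = {
    "Nrf2-sgRNA1": "GCGACGGAAAGAGTATGAGC TGG",
    "Nrf2-sgRNA2": "TATTTGACTTCAGTCAGCGA CGG",
    "Nrf2-sgRNA3": "TGGAGGCAAGATATAGATCT TGG",
}

NGG_PAM = re.compile(r"[ACGT]GG")


def check_target(entry: str, protospacer_len: int = 20) -> bool:
    """True if the entry is a DNA protospacer of the expected length plus an NGG PAM."""
    protospacer, pam = entry.upper().split()
    is_dna = re.fullmatch(r"[ACGT]+", protospacer) is not None
    return is_dna and len(protospacer) == protospacer_len and bool(NGG_PAM.fullmatch(pam))


if __name__ == "__main__":
    for name, entry in NRF2_TARGETS.items():
        print(name, "valid" if check_target(entry) else "invalid")
```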
用于检测Nrf2靶标处的整合的关键引物
引物组1:引物1:5’-GTGTTAATTTCAAACATCAGCAGC-3’,引物2:5’-GACAAGACATCCTTGATTTG-3’
引物组2:引物1:5’-GAGGTTGACTGTGTAAATG-3’,引物2:5’-GATACCAGAGTCACACAACAG-3’
引物组3:引物1:5’-TCTACATTAATTCTCTTGTGC-3’,引物2:5’-GATACCAGAGTCACACAACAG-3’
人类CXCR4的登录号
Uniprot P61073
Ensembl基因ID:ENSG00000121966
针对CXCR4(外显子2)的编辑靶序列和PAM:用于sgRNA设计1。
GGGCAATGGATTGGTCATCC TGG
用于检测CXCR4靶标处的整合的关键引物
引物组1:引物1:5'-TCTACATTAATTCTCTTGTGC-3',引物2:5'-GACAAGACATCCTTGATTTG-3'
引物组2:引物1:5'-TCTACATTAATTCTCTTGTGC-3',引物2:5'-GATACCAGAGTCACACAACAG-3'
引物组3:引物1:5'-GAGGTTGACTGTGTAAATG-3',引物2:5'-GACAAGACATCCTTGATTTG-3'
引物组4:引物1:5'-GAGGTTGACTGTGTAAATG-3’,引物2:5'-GATACCAGAGTCACACAACAG-3’
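Because the annealing temperature depends on the primer sequences, it can help to tabulate basic properties of the primers listed above. The sketch below reports length, GC fraction, and a rough melting temperature from the Wallace rule, Tm ≈ 2·(A+T) + 4·(G+C) °C. The Wallace rule is a coarse approximation introduced here as an assumption; it is not the method used in the source, and the helper names are hypothetical.

```python
# Unique primer sequences appearing in the Nrf2 and CXCR4 primer sets above.
PRIMERS = [
    "GTGTTAATTTCAAACATCAGCAGC",
    "GACAAGACATCCTTGATTTG",
    "GAGGTTGACTGTGTAAATG",
    "GATACCAGAGTCACACAACAG",
    "TCTACATTAATTCTCTTGTGC",
]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for primer in PRIMERS:
        print(f"{primer}  len={len(primer)}  GC={gc_fraction(primer):.2f}  "
              f"Tm~{wallace_tm(primer)} C")
```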
用于检测经整合的供体DNA的PCR循环条件
*注意退火温度会依赖于引物序列而变化
1.预变性: 4min 94℃
2.变性: 30sec 94℃
3.退火: 30sec 55℃
4.延伸: 30sec 72℃
5.转到步骤2:40个循环
6.最终延伸: 4min 72℃
7.最终保持: ∞ 4℃
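The cycling program above can be turned into a rough run-time estimate. The sketch assumes that "40 cycles" means steps 2 to 4 are executed 40 times in total and ignores ramp times between temperatures, so the true run time on an instrument would be longer; `program_minutes` is a hypothetical helper.

```python
def program_minutes(initial_s: int, cycle_steps_s: list[int],
                    n_cycles: int, final_extension_s: int) -> float:
    """Total hold time in minutes, excluding ramping and the final 4 C hold."""
    total_s = initial_s + n_cycles * sum(cycle_steps_s) + final_extension_s
    return total_s / 60.0


if __name__ == "__main__":
    minutes = program_minutes(
        initial_s=4 * 60,            # initial denaturation, 94 C
        cycle_steps_s=[30, 30, 30],  # denature 94 C, anneal 55 C, extend 72 C
        n_cycles=40,                 # assumed to be the total number of cycles
        final_extension_s=4 * 60,    # final extension, 72 C
    )
    print(f"approximate hold time: {minutes:.0f} min")
```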
用于生物素化的Avi标签化的Cas9
用于Cas9生物素化的avi标签的序列
氨基酸序列:
G G D L E G S G L N D I F E A Q K I E W H E*
核酸序列:
GGCGGCGACCTCGAGGGTAGCGGTCTGAACGATATTTTTGAAGCGCAGAAAATTGAATGGCATGAATAA
第一个加下划线部分=Cas9C末端
斜体部分=限制性位点/接头
第二个加下划线部分=avi标签(突出显示生物素化位点)
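The nucleic acid and amino acid sequences given above for the avi tag region can be cross-checked by translating the DNA with the standard genetic code. The sketch below builds the standard codon table in code and compares the translation with the listed amino acid sequence; `translate` is a hypothetical helper name, and the standard (nuclear, table 1) genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Sequences as listed in the document.
AVI_DNA = "GGCGGCGACCTCGAGGGTAGCGGTCTGAACGATATTTTTGAAGCGCAGAAAATTGAATGGCATGAATAA"
AVI_PROTEIN = "G G D L E G S G L N D I F E A Q K I E W H E*".replace(" ", "")

if __name__ == "__main__":
    translated = translate(AVI_DNA)
    print(translated)
    print("matches listed amino acid sequence:", translated == AVI_PROTEIN)
```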
实例16:ABBIE1融合蛋白的表达方案。
含有全长融合蛋白(SEQ ID NO:57)的表达构建体的转化。
从-80℃冰箱取出感受态大肠杆菌细胞。
打开水浴到42℃。
将感受态细胞放入1.5ml管(Eppendorf或类似品)中。为了转化DNA构建体,使用50ul感受态细胞。
保持管在冰上。
将50ng环状DNA添加到大肠杆菌细胞中。在冰上孵育10min以解冻感受态细胞。
将带有DNA和大肠杆菌的管置于42℃水浴中45秒。将管放回冰上2分钟,以减少对大肠杆菌细胞的损害。
添加1ml的LB(没有添加抗生素)。在37℃将管孵育1小时。(可将管孵育30分钟)
在具有适当抗生素的LB板上铺展约100ul所得培养物
约12-16小时后挑取菌落。
接种和扩增
接种含有LB和抗生素的1升瓶
允许细菌培养物生长直至达到0.6OD,然后用1mM终浓度的异丙基β-D-1-硫代吡喃半乳糖苷(IPTG)诱导
允许培养物扩增6-8小时,并以最少两千G力将悬浮的细菌培养物离心10分钟。
在-80℃下冷冻丸粒以在之后进一步加工
蛋白质制备和纯化
所有步骤均在室温下进行。
通过在20mM Tris pH 8.0、300mM NaCl、0.1mg/ml鸡蛋清溶菌酶中进行2个循环的冻融来溶解细胞。以6,000g离心15分钟并保留上清液。
将上清液加载到在20mM Tris pH 8.0、300mM氯化钠中平衡的Ni-IDA琼脂糖柱上。用0至200mM咪唑梯度洗脱蛋白质。通过7%SDS-PAGE鉴定含有融合蛋白的级分。
合并级分并用20mM Tris pH 8.0稀释,以使最终的NaCl浓度为50mM。加载到Q-琼脂糖柱上,并用0到500mM氯化钠梯度洗脱。通过7%SDS-PAGE鉴定含有融合蛋白的级分。
合并级分并用20mM Tris pH 8.0稀释,以使最终的NaCl浓度为100mM。加载到SP-琼脂糖柱上,并用0到500mM氯化钠梯度洗脱。通过7%SDS-PAGE鉴定含有融合蛋白的级分。
合并级分,通过其在280nm处的UV吸光度测量浓度,并通过离心过滤器浓缩至400μg/ml的最终浓度。添加甘油到最终浓度为50%。在-20℃下储存。
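The salt adjustments before the ion-exchange columns are simple dilutions. The sketch below computes how much salt-free diluent (20 mM Tris pH 8.0) is needed to bring pooled fractions to a target NaCl concentration. It assumes the pooled Ni-IDA fractions are still at the 300 mM NaCl of the loading buffer, which the source implies but does not state; the salt concentration of fractions eluted from a gradient would have to be measured or estimated separately. The function name and the 10 mL pool volume are hypothetical.

```python
def diluent_volume_mL(sample_mL: float, start_mM: float, target_mM: float) -> float:
    """Volume of salt-free diluent needed to reach the target concentration."""
    if not 0 < target_mM <= start_mM:
        raise ValueError("target must be positive and no higher than the start")
    return sample_mL * (start_mM / target_mM - 1.0)


if __name__ == "__main__":
    pool_mL = 10.0  # hypothetical pooled fraction volume
    extra = diluent_volume_mL(pool_mL, start_mM=300.0, target_mM=50.0)
    print(f"add {extra:.1f} mL of 20 mM Tris pH 8.0 to {pool_mL:.1f} mL of pool")
```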
尽管本文已经显示并描述了某些实施例,但是本领域的技术人员将清楚的是仅作为举例而提供了此类实施例。本领域的普通技术人员现在将会想到众多变体、改变以及替代,而不背离本披露。应该理解的是,本文描述的本披露的实施例的不同替代方案可以用于实践本披露。预期的是以下权利要求书限定了本披露的范围并且由此覆盖在这些权利要求和它们的等效物的范围内的方法和结构。
本披露的序列
对于下面提供的每个序列,提供以下信息:序列类型(核酸或氨基酸)、来源(例如大肠杆菌)、长度和标识号(如果可获得的话)。
本披露的第一多核苷酸可以编码例如Cas9、Cpf1、TALE或ZnFn蛋白。本披露的第二多核苷酸可以编码例如整合酶、转座酶或重组酶。下面列出的是示例性的第一和第二多核苷酸序列和蛋白质序列以及示例性的接头序列,其可以用于本文所述的组合物(构建体、融合蛋白)和方法中。本披露中可以提供未列于下表1中但可用于本文所述的组合物(构建体、融合蛋白)和方法中的其他多核苷酸序列、蛋白质序列或接头序列。例如,SEQ ID NO:49、SEQ ID NO:57、SEQ ID NO:58和/或其部分。
接头可以是任何长度,例如,长度为3至300个核苷酸,长度为6至60个核苷酸或允许第一和第二多核苷酸融合的任何长度。多肽可以由生物体(例如大肠杆菌)制备,或合成制备,或两者的组合。
示例性的核酸序列:1、3、5、7、9、11、13、15、17、19、21、23、27-47、49、55、56、57、62、64、66、68、70、79、82和83。
示例性的氨基酸序列:2、4、6、8、10、12、14、16、18、20、22、24、25、26、48、50、52、58、63、65、67、69、71、72-78和80。
表1:第一蛋白、第二蛋白或接头
表2:部分序列表
另外的序列
SEQ ID NO:1
名称:嗜热链球菌Csn1cds HQ712120.1
序列:
ATGACTAAGCCATACTCAATTGGACTTGATATTGGAACGAATAGTGTTGGATGGGCTGTAATAACTGATAATTACAAGGTTCCGTCTAAAAAAATGAAAGTCTTAGGAAATACGAGTAAAAAGTATATCAAAAAGAACCTGTTAGGTGTATTACTCTTTGACTCTGGAATCACAGCAGAAGGAAGAAGATTGAAGCGTACTGCAAGAAGACGTTATACTAGACGCCGTAATCGTATCCTTTATTTGCAGGAAATTTTTAGCACGGAGATGGCTACATTAGATGATGCTTTCTTTCAAAGACTTGACGATTCGTTTTTAGTTCCTGATGATAAACGTGATAGTAAGTATCCGATATTTGGAAACTTAGTAGAAGAAAAAGTCTATCATGATGAATTTCCAACTATCTATCATTTAAGGAAATATTTAGCAGATAGTACTAAAAAAGCAGATTTGCGTCTAGTTTATCTTGCATTGGCTCATATGATTAAATATAGAGGTCACTTCTTAATTGAAGGAGAGTTTAATTCAAAAAATAATGATATTCAGAAGAATTTTCAAGACTTTTTGGACACTTATAATGCTATTTTTGAATCGGATTTATCACTTGAGAATAGTAAACAACTTGAGGAAATTGTTAAAGATAAGATTAGTAAATTAGAAAAGAAAGATCGTATTTTAAAACTCTTCCCTGGGGAGAAGAATTCGGGGATTTTTTCAGAGTTTCTAAAGTTGATTGTAGGAAATCAAGCTGATTTTAGGAAATGTTTTAATTTAGACGAAAAAGCCTCCTTACATTTTTCCAAAGAAAGCTATGATGAAGATTTAGAGACTTTGTTAGGTTATATTGGAGATGATTACAGTGATGTCTTTCTCAAAGCAAAGAAACTTTATGATGCTATTCTTTTATCGGGTTTTCTGACTGTAACTGATAATGAGACAGAAGCACCTCTCTCTTCTGCTATGATAAAGCGATATAATGAACACAAAGAAGATTTAGCGTTACTAAAGGAATATATAAGAAATATTTCACTAAAAACGTATAATGAAGTATTTAAAGATGACACCAAAAATGGTTATGCTGGTTATATTGATGGAAAAACAAATCAGGAAGATTTCTACGTATATCTAAAAAACCTATTGGCTGAATTTGAAGGTGCGGATTATTTTCTTGAAAAAATTGATCGAGAAGATTTTTTGAGAAAGCAACGTACATTTGACAATGGTTCGATACCATATCAGATTCATCTTCAAGAAATGAGAGCAATTCTTGATAAGCAAGCTAAATTTTATCCTTTCTTGGCTAAAAATAAAGAAAGAATCGAGAAGATTTTAACCTTCCGAATTCCTTATTATGTAGGTCCACTTGCGAGAGGGAATAGTGATTTTGCCTGGTCAATAAGAAAACGAAATGAAAAAATTACACCTTGGAATTTTGAGGACGTTATTGACAAAGAATCTTCGGCAGAGGCTTTCATTAATCGAATGACTAGTTTTGATTTGTATTTGCCAGAAGAGAAGGTACTTCCAAAGCATAGTCTCTTATACGAAACTTTTAATGTATATAATGAATTAACAAAAGTTAGATTTATTGCCGAAAGTATGAGAGATTATCAATTTTTAGATAGTAAGCAGAAGAAAGATATTGTTAGACTTTATTTTAAAGATAAAAGGAAAGTTACTGATAAGGATATTATTGAATATTTACATGCAATTTATGGGTATGATGGAATTGAATTAAAAGGCATAGAGAAACAGTTTAATTCTAGTTTATCTACTTATCACGATCTTTTAAATATTATTAATGATAAAGAGTTTTTGGATGATAGTTCAAATGAAGCGATTATCGAAGAAATTATCCATACTTTGACAATTTTTGAAGATAGAGAGATGATAAAACAACGTCTTTCAAAATTTGAGAATATATTCGATAAATCCGTTTTGAAAAAGTTATCTCGTAGACATTACACTGGCTGGGGTAAGTTATCTGCTAAGCTTATTAA
TGGTATTCGAGATGAAAAATCTGGTAATACTATTCTTGATTACTTAATTGATGATGGTATTTCTAACCGTAATTTCATGCAACTTATTCACGATGATGCTCTTTCTTTTAAAAAGAAGATACAGAAAGCACAAATTATTGGTGACGAAGATAAAGGTAATATTAAAGAGGTCGTTAAGTCTTTGCCAGGTAGTCCTGCGATTAAAAAAGGTATTTTACAAAGCATAAAAATTGTAGATGAATTGGTCAAAGTAATGGGAGGAAGAAAACCCGAGTCAATTGTTGTTGAGATGGCTCGTGAAAATCAATATACCAATCAAGGTAAGTCTAATTCCCAACAACGCTTGAAACGTTTAGAAAAATCTCTCAAAGAGTTAGGTAGTAAGATACTTAAGGAAAATATTCCTGCAAAACTTTCTAAAATAGACAATAACGCACTTCAAAATGATCGACTTTACTTATACTATCTTCAAAATGGAAAAGATATGTATACCGGAGATGATTTAGATATTGATAGATTAAGTAATTATGATATTGATCATATTATTCCTCAAGCTTTTTTGAAAGATAATTCTATTGACAATAAAGTACTTGTTTCATCTGCTAGTAACCGTGGTAAATCAGATGATTTTCCAAGTTTAGAGGTTGTCAAAAAAAGAAAGACATTTTGGTATCAATTATTGAAATCAAAATTAATTTCTCAACGAAAATTTGATAATCTGACAAAAGCTGAACGGGGAGGATTGTTACCTGAGGACAAAGCTGGTTTTATTCAACGCCAGTTGGTTGAAACACGTCAAATAACAAAACATGTAGCTCGTTTACTTGATGAGAAATTTAATAATAAAAAAGATGAAAATAATAGAGCGGTACGAACAGTAAAAATTATTACCTTGAAATCTACCTTAGTTTCTCAATTTCGTAAGGATTTTGAACTTTATAAAGTTCGTGAAATCAATGATTTTCATCATGCTCATGATGCTTACTTGAATGCCGTTATAGCAAGTGCTTTACTTAAGAAATACCCTAAACTAGAGCCAGAATTTGTGTACGGTGATTATCCAAAATACAATAGTTTTAGAGAAAGAAAGTCCGCTACAGAAAAGGTATATTTCTATTCAAATATCATGAATATCTTTAAAAAATCTATTTCTTTAGCTGATGGTAGAGTTATTGAAAGACCACTTATTGAGGTAAATGAGGAGACCGGCGAATCCGTTTGGAATAAAGAATCTGATTTAGCAACTGTAAGGAGAGTACTCTCTTATCCGCAAGTAAATGTTGTGAAAAAAGTTGAGGAACAGAATCACGGATTGGATAGAGGAAAACCAAAGGGATTGTTTAATGCAAATCTTTCCTCAAAGCCAAAACCAAATAGTAATGAAAATTTAGTAGGTGCTAAAGAGTATCTTGACCCCAAAAAGTATGGGGGGTATGCTGGAATTTCTAATTCTTTTGCTGTTCTTGTTAAAGGGACAATTGAAAAAGGTGCTAAGAAAAAAATAACAAATGTACTAGAATTTCAAGGTATTTCTATTTTAGATAGGATTAATTATAGAAAAGATAAACTTAATTTTTTACTTGAAAAAGGTTATAAAGATATTGAGTTAATTATTGAACTACCTAAATATAGTTTATTTGAACTTTCAGATGGTTCACGTCGTATGTTGGCTAGTATTTTGTCAACGAATAATAAGAGGGGAGAGATTCACAAAGGAAATCAGATTTTTCTTTCACAGAAGTTTGTGAAATTACTTTATCATGCTAAGAGAATAAGTAACACAATTAATGAGAATCATAGAAAATATGTTGAGAACCATAAAAAAGAGTTTGAAGAATTATTTTACTACATTCTTGAGTTTAATGAGAATTATGTTGGAGCTAAAAAGAATGGTAAACTTTTAAACTCTGCCTTTCAATCTTGGCAAAATCATAGTATAGATGAACTCTGTAGTAGTTTTATAGGACCTACCGGAAGTGAAAGAAAGGGGCTATTTGAAT
TAACCTCTCGTGGAAGTGCTGCTGATTTTGAATTTTTAGGTGTTAAAATTCCAAGGTATAGAGACTATACCCCATCATCCCTATTAAAAGATGCCACACTTATTCATCAATCTGTTACAGGCCTCTATGAAACACGAATAGACCTTGCCAAACTAGGAGAGGGTTAA
SEQ ID NO:2
Sequence:
MTKPYSIGLDIGTNSVGWAVITDNYKVPSKKMKVLGNTSKKYIKKNLLGVLLFDSGITAEGRRLKRTARRRYTRRRNRILYLQEIFSTEMATLDDAFFQRLDDSFLVPDDKRDSKYPIFGNLVEEKVYHDEFPTIYHLRKYLADSTKKADLRLVYLALAHMIKYRGHFLIEGEFNSKNNDIQKNFQDFLDTYNAIFESDLSLENSKQLEEIVKDKISKLEKKDRILKLFPGEKNSGIFSEFLKLIVGNQADFRKCFNLDEKASLHFSKESYDEDLETLLGYIGDDYSDVFLKAKKLYDAILLSGFLTVTDNETEAPLSSAMIKRYNEHKEDLALLKEYIRNISLKTYNEVFKDDTKNGYAGYIDGKTNQEDFYVYLKNLLAEFEGADYFLEKIDREDFLRKQRTFDNGSIPYQIHLQEMRAILDKQAKFYPFLAKNKERIEKILTFRIPYYVGPLARGNSDFAWSIRKRNEKITPWNFEDVIDKESSAEAFINRMTSFDLYLPEEKVLPKHSLLYETFNVYNELTKVRFIAESMRDYQFLDSKQKKDIVRLYFKDKRKVTDKDIIEYLHAIYGYDGIELKGIEKQFNSSLSTYHDLLNIINDKEFLDDSSNEAIIEEIIHTLTIFEDREMIKQRLSKFENIFDKSVLKKLSRRHYTGWGKLSAKLINGIRDEKSGNTILDYLIDDGISNRNFMQLIHDDALSFKKKIQKAQIIGDEDKGNIKEVVKSLPGSPAIKKGILQSIKIVDELVKVMGGRKPESIVVEMARENQYTNQGKSNSQQRLKRLEKSLKELGSKILKENIPAKLSKIDNNALQNDRLYLYYLQNGKDMYTGDDLDIDRLSNYDIDHIIPQAFLKDNSIDNKVLVSSASNRGKSDDFPSLEVVKKRKTFWYQLLKSKLISQRKFDNLTKAERGGLLPEDKAGFIQRQLVETRQITKHVARLLDEKFNNKKDENNRAVRTVKIITLKSTLVSQFRKDFELYKVREINDFHHAHDAYLNAVIASALLKKYPKLEPEFVYGDYPKYNSFRERKSATEKVYFYSNIMNIFKKSISLADGRVIERPLIEVNEETGESVWNKESDLATVRRVLSYPQVNVVKKVEEQNHGLDRGKPKGLFNANLSSKPKPNSNENLVGAKEYLDPKKYGGYAGISNSFAVLVKGTIEKGAKKKITNVLEFQGISILDRINYRKDKLNFLLEKGYKDIELIIELPKYSLFELSDGSRRMLASILSTNNKRGEIHKGNQIFLSQKFVKLLYHAKRISNTINENHRKYVENHKKEFEELFYYILEFNENYVGAKKNGKLLNSAFQSWQNHSIDELCSSFIGPTGSERKGLFELTSRGSAADFEFLGVKIPRYRDYTPSSLLKDATLIHQSVTGLYETRIDLAKLGEG
SEQ ID NO:3
Name: Pasteurella multocida Cas9
Sequence:
ATGCAAACAACAAATTTAAGTTATATTTTAGGTTTAGATTTGGGGATCGCTTCTGTAGGTTGGGCTGTCGTTGAAATCAATGAAAATGAAGACCCTATCGGCTTGATTGATGTAGGAGTAAGGATATTTGAGCGTGCTGAGGTACCCAAAACTGGAGAATCTTTAGCACTCTCTCGCCGTCTTGCAAGAAGTACTCGCCGTTTGATACGCCGTCGTGCACACCGTTTACTCCTCGCAAAACGCTTCTTAAAACGTGAAGGTATACTTTCCACAATCGACTTAGAAAAAGGATTACCCAACCAAGCTTGGGAATTACGTGTCGCCGGTCTTGAACGTCGGTTATCCGCCATAGAATGGGGTGCGGTTCTGCTACATTTAATCAAGCATCGAGGTTATCTTTCTAAACGTAAAAATGAATCCCAAACAAACAACAAAGAATTAGGAGCCTTACTCTCTGGAGTGGCACAAAACCATCAATTATTACAATCAGATGACTACCGAACACCAGCAGAGCTCGCACTGAAAAAATTTGCTAAAGAAGAAGGGCATATCCGTAATCAACGAGGTGCCTATACACATACATTTAATCGATTAGACTTATTAGCTGAACTTAACTTGCTTTTTGCTCAACAACATCAGTTTGGTAACCCTCACTGTAAAGAGCATATTCAACAATATATGACAGAATTGCTTATGTGGCAAAAGCCAGCCTTATCTGGTGAGGCAATTTTAAAAATGTTGGGTAAATGTACGCATGAAAAAAATGAGTTTAAAGCAGCAAAACATACCTACAGTGCGGAGCGCTTTGTTTGGCTAACCAAACTCAATAACTTGCGCATTTTAGAAGATGGGGCAGAACGAGCTCTTAATGAAGAAGAACGTCAACTATTGATAAATCATCCGTATGAGAAATCAAAATTAACCTATGCCCAAGTCAGAAAATTGTTAGGGCTTTCCGAACAAGCGATTTTTAAGCATCTACGTTATAGTAAAGAAAACGCAGAATCAGCTACTTTTATGGAGCTTAAAGCTTGGCATGCAATTCGTAAAGCGTTAGAAAATCAAGGATTGAAGGATACTTGGCAAGATCTCGCTAAGAAACCTGACTTACTAGATGAAATTGGTACCGCATTTTCTCTTTATAAAACTGATGAAGATATTCAGCAATATTTGACAAATAAGGTACCGAACTCAGTCATCAATGCATTATTAGTTTCTCTGAATTTCGATAAATTCATTGAGTTATCTTTGAAAAGTTTACGTAAAATCTTGCCCCTAATGGAGCAAGGTAAGCGTTATGATCAAGCTTGTCGTGAAATTTATGGGCATCATTATGGTGAGGCAAATCAAAAAACTTCTCAGCTACTACCAGCTATTCCAGCCCAAGAAATTCGTAATCCTGTTGTTTTACGTACACTTTCACAAGCACGTAAAGTGATCAATGCCATTATTCGTCAATATGGTTCCCCTGCTCGAGTCCATATTGAAACAGGAAGAGAACTTGGGAAATCTTTTAAAGAACGTCGTGAAATTCAAAAACAACAGGAAGATAATCGAACTAAGCGAGAAAGTGCGGTACAAAAATTCAAAGAATTATTTTCTGACTTTTCAAGTGAACCCAAAAGTAAAGATATTTTAAAATTCCGCTTATACGAACAACAGCATGGTAAATGCTTATACTCTGGAAAAGAGATCAATATTCATCGCTTAAATGAAAAGGGTTATGTGGAAATTGATCATGCTTTACCTTTCTCACGGACTTGGGATGATAGTTTTAATAATAAAGTATTAGTTCTTGCCAGCGAAAACCAAAACAAAGGGAATCAAACACCGTATGAATGGCTACAAGGTAAAATAAATTCGGAACGTTGGAAAAACTTTGTTGCTTTAGTACTGGGTAGCCAGTGCAGTGCAGCCAAGAAACAACGATTACTCACTCAAGTTATTGATGATAATAAATTTATTGATAGAAACTTAAATGATACTCGCTATATTGCCCG
ATTCCTATCCAACTATATTCAAGAAAATTTGCTTTTGGTGGGTAAAAATAAGAAAAATGTCTTTACACCAAACGGTCAAATTACTGCATTATTAAGAAGTCGCTGGGGATTAATTAAGGCTCGTGAGAATAATAACCGTCATCATGCTTTAGATGCGATAGTTGTGGCTTGTGCAACACCTTCTATGCAACAAAAAATTACCCGATTTATTCGATTTAAAGAAGTGCATCCATACAAAATAGAAAATAGGTATGAAATGGTGGATCAAGAAAGCGGAGAAATTATTTCACCTCATTTTCCTGAACCTTGGGCTTATTTTAGACAAGAGGTTAATATTCGTGTTTTTGATAATCATCCAGATACTGTCTTAAAAGAGATGCTACCTGATCGCCCACAAGCAAATCACCAGTTTGTACAGCCCCTTTTTGTTTCTCGTGCCCCAACTCGTAAAATGAGTGGTCAAGGGCATATGGAAACAATTAAATCAGCTAAACGCTTAGCAGAAGGCATTAGCGTTTTAAGAATTCCTCTCACGCAATTAAAACCTAATTTATTGGAAAATATGGTGAATAAAGAACGTGAGCCAGCACTTTATGCAGGACTAAAAGCACGCTTGGCTGAATTTAATCAAGATCCAGCAAAAGCGTTTGCTACGCCTTTTTATAAACAAGGAGGGCAGCAGGTCAAAGCTATTCGTGTTGAACAGGTACAAAAATCAGGGGTATTAGTCAGAGAAAACAATGGGGTAGCAGATAATGCCTCTATCGTTCGAACAGACGTATTTATCAAAAATAATAAATTTTTCCTTGTTCCTATCTATACTTGGCAAGTTGCGAAAGGCATCTTGCCAAATAAAGCTATTGTTGCTCATAAAAATGAAGATGAATGGGAAGAAATGGATGAAGGTGCTAAGTTTAAATTCAGCCTTTTCCCGAATGATCTTGTCGAGCTAAAAACCAAAAAAGAATACTTTTTCGGCTATTACATCGGACTAGATCGTGCAACTGGAAACATTAGCCTAAAAGAACATGATGGTGAGATATCAAAAGGTAAAGACGGTGTTTACCGTGTTGGTGTCAAGTTAGCTCTTTCTTTTGAAAAATATCAAGTTGATGAGCTCGGTAAAAATAGACAAATTTGCCGACCTCAGCAAAGACAACCTGTGCGTTAA
SEQ ID NO:4
Sequence:
MQTTNLSYILGLDLGIASVGWAVVEINENEDPIGLIDVGVRIFERAEVPKTGESLALSRRLARSTRRLIRRRAHRLLLAKRFLKREGILSTIDLEKGLPNQAWELRVAGLERRLSAIEWGAVLLHLIKHRGYLSKRKNESQTNNKELGALLSGVAQNHQLLQSDDYRTPAELALKKFAKEEGHIRNQRGAYTHTFNRLDLLAELNLLFAQQHQFGNPHCKEHIQQYMTELLMWQKPALSGEAILKMLGKCTHEKNEFKAAKHTYSAERFVWLTKLNNLRILEDGAERALNEEERQLLINHPYEKSKLTYAQVRKLLGLSEQAIFKHLRYSKENAESATFMELKAWHAIRKALENQGLKDTWQDLAKKPDLLDEIGTAFSLYKTDEDIQQYLTNKVPNSVINALLVSLNFDKFIELSLKSLRKILPLMEQGKRYDQACREIYGHHYGEANQKTSQLLPAIPAQEIRNPVVLRTLSQARKVINAIIRQYGSPARVHIETGRELGKSFKERREIQKQQEDNRTKRESAVQKFKELFSDFSSEPKSKDILKFRLYEQQHGKCLYSGKEINIHRLNEKGYVEIDHALPFSRTWDDSFNNKVLVLASENQNKGNQTPYEWLQGKINSERWKNFVALVLGSQCSAAKKQRLLTQVIDDNKFIDRNLNDTRYIARFLSNYIQENLLLVGKNKKNVFTPNGQITALLRSRWGLIKARENNNRHHALDAIVVACATPSMQQKITRFIRFKEVHPYKIENRYEMVDQESGEIISPHFPEPWAYFRQEVNIRVFDNHPDTVLKEMLPDRPQANHQFVQPLFVSRAPTRKMSGQGHMETIKSAKRLAEGISVLRIPLTQLKPNLLENMVNKEREPALYAGLKARLAEFNQDPAKAFATPFYKQGGQQVKAIRVEQVQKSGVLVRENNGVADNASIVRTDVFIKNNKFFLVPIYTWQVAKGILPNKAIVAHKNEDEWEEMDEGAKFKFSLFPNDLVELKTKKEYFFGYYIGLDRATGNISLKEHDGEISKGKDGVYRVGVKLALSFEKYQVDELGKNRQICRPQQRQPVR
SEQ ID NO:5
Name: Streptococcus mutans Cas9
Sequence:
ATGAAAAAACCTTACTCTATTGGACTTGATATTGGAACCAATTCTGTTGGTTGGGCTGTTGTGACAGATGACTACAAAGTTCCTGCTAAGAAGATGAAGGTTCTGGGAAATACAGATAAAAGTCATATCGAGAAAAATTTGCTTGGCGCTTTATTATTTGATAGCGGGAATACTGCAGAAGACAGACGGTTAAAGAGAACTGCTCGCCGTCGTTACACACGTCGCAGAAATCGTATTTTATATTTGCAAGAGATTTTTTCAGAAGAAATGGGCAAGGTAGATGATAGTTTCTTTCATCGTTTAGAGGATTCTTTTCTTGTTACTGAGGATAAACGAGGAGAGCGCCATCCCATTTTTGGGAATCTTGAAGAAGAAGTTAAGTATCATGAAAATTTTCCAACCATTTATCATTTGCGGCAATATCTTGCGGATAATCCAGAAAAAGTTGATTTGCGTTTAGTTTATTTGGCTTTGGCACATATAATTAAGTTTAGAGGTCATTTTTTAATTGAAGGAAAGTTTGATACACGCAATAATGATGTACAAAGACTGTTTCAAGAATTTTTAGCAGTCTATGATAATACTTTTGAGAATAGTTCGCTTCAGGAGCAAAATGTTCAAGTTGAAGAAATTCTGACTGATAAAATCAGTAAATCTGCTAAGAAAGATAGAGTTTTGAAACTTTTTCCTAATGAAAAGTCTAATGGCCGCTTTGCAGAATTTCTAAAACTAATTGTTGGTAATCAAGCTGATTTTAAAAAGCATTTTGAATTAGAAGAGAAAGCACCATTGCAATTTTCTAAAGATACTTATGAAGAAGAGTTAGAAGTACTATTAGCTCAAATTGGAGATAATTACGCAGAGCTCTTTTTATCAGCAAAGAAACTGTATGATAGTATCCTTTTATCAGGGATTTTAACAGTTACTGATGTTGGTACCAAAGCGCCTTTATCTGCTTCGATGATTCAGCGATATAATGAACATCAGATGGATTTAGCTCAGCTTAAACAATTCATTCGTCAGAAATTATCAGATAAATATAACGAAGTTTTTTCTGATGTTTCAAAAGACGGCTATGCGGGTTATATTGATGGGAAAACAAATCAAGAAGCTTTTTATAAATACCTTAAAGGTCTATTAAATAAGATTGAGGGAAGTGGCTATTTCCTTGATAAAATTGAGCGTGAAGATTTTCTAAGAAAGCAACGTACCTTTGACAATGGCTCTATTCCACATCAGATTCATCTTCAAGAAATGCGTGCTATCATTCGTAGACAGGCTGAATTTTATCCGTTTTTAGCAGACAATCAAGATAGGATTGAGAAATTATTGACTTTCCGTATTCCCTACTATGTTGGTCCATTAGCGCGCGGAAAAAGTGATTTTGCTTGGTTAAGTCGGAAATCGGCTGATAAAATTACACCATGGAATTTTGATGAAATCGTTGATAAAGAATCCTCTGCAGAAGCTTTTATCAATCGTATGACAAATTATGATTTGTACTTGCCAAATCAAAAAGTTCTTCCTAAACATAGTTTATTATACGAAAAATTTACTGTTTACAATGAATTAACAAAGGTTAAATATAAAACAGAGCAAGGAAAAACAGCATTTTTTGATGCCAATATGAAGCAAGAAATCTTTGATGGCGTATTTAAGGTTTATCGAAAAGTAACTAAAGATAAATTAATGGATTTCCTTGAAAAAGAATTTGATGAATTTCGTATTGTTGATTTAACAGGTCTGGATAAAGAAAATAAAGTATTTAACGCTTCTTATGGAACTTATCATGATTTGTGTAAAATTTTAGATAAAGATTTTCTCGATAATTCAAAGAATGAAAAGATTTTAGAAGATATTGTGTTGACCTTAACGTTATTTGAAGATAGAGAAATGATTAGAAAACGTCTAGAAAATTACAGTGATTTATTGACCAAAGAACAAGTGAAAAAGCTGGAAAGACGTCATTATACTGGTTGGGGAAGATTATCAGCTGAGTT
AATTCATGGTATTCGCAATAAAGAAAGCAGAAAAACAATTCTTGATTATCTCATTGATGATGGCAATAGCAATCGGAACTTTATGCAACTGATTAACGATGATGCTCTTTCTTTCAAAGAAGAGATTGCTAAGGCACAAGTTATTGGAGAAACAGACAATCTAAATCAAGTTGTTAGTGATATTGCTGGCAGCCCTGCTATTAAAAAAGGAATTTTACAAAGCTTGAAGATTGTTGATGAGCTTGTCAAAATTATGGGACATCAACCTGAAAATATCGTCGTGGAGATGGCGCGTGAAAACCAGTTTACCAATCAGGGACGACGAAATTCACAGCAACGTTTGAAAGGTTTGACAGATTCTATTAAAGAATTTGGAAGTCAAATTCTTAAAGAACATCCGGTTGAGAATTCACAGTTACAAAATGATAGATTGTTTCTATATTATTTACAAAACGGCAGAGATATGTATACTGGAGAAGAATTGGATATTGATTATCTAAGCCAGTATGATATAGACCATATTATCCCGCAAGCTTTTATAAAGGATAATTCTATTGATAATAGAGTATTGACTAGCTCAAAGGAAAATCGTGGAAAATCGGATGATGTACCAAGTAAAGATGTTGTTCGTAAAATGAAATCCTATTGGAGTAAGCTACTTTCGGCAAAGCTTATTACACAACGTAAATTTGATAATTTGACAAAAGCTGAACGAGGTGGATTGACCGACGATGATAAAGCTGGATTCATCAAGCGTCAATTAGTAGAAACACGACAAATTACCAAACATGTAGCACGTATTCTGGACGAACGATTTAATACAGAAACAGATGAAAACAACAAGAAAATTCGTCAAGTAAAAATTGTGACCTTGAAATCAAATCTTGTTTCCAATTTCCGTAAAGAGTTTGAACTCTACAAAGTGCGTGAAATTAATGACTATCATCATGCACATGATGCCTATCTCAATGCTGTAATTGGAAAGGCTTTACTAGGTGTTTACCCACAATTGGAACCTGAATTTGTTTATGGTGATTATCCTCATTTTCATGGACATAAAGAAAATAAAGCAACTGCTAAGAAATTTTTCTATTCAAATATTATGAACTTCTTTAAAAAAGATGATGTCCGTACTGATAAAAATGGTGAAATTATCTGGAAAAAAGATGAGCATATTTCTAATATTAAAAAAGTGCTTTCTTATCCACAAGTTAATATTGTTAAGAAAGTAGAGGAGCAAACGGGAGGATTTTCTAAAGAATCTATCTTGCCGAAAGGTAATTCTGACAAGCTTATTCCTCGAAAAACGAAGAAATTTTATTGGGATACCAAGAAATATGGAGGATTTGATAGCCCGATTGTTGCTTATTCTATTTTAGTTATTGCTGATATTGAAAAAGGTAAATCTAAAAAATTGAAAACAGTCAAAGCCTTAGTTGGTGTCACTATTATGGAAAAGATGACTTTTGAAAGGGATCCAGTTGCTTTTCTTGAGCGAAAAGGCTATCGAAATGTTCAAGAAGAAAATATTATAAAGTTACCAAAATATAGTTTATTTAAACTAGAAAACGGACGAAAAAGGCTATTGGCAAGTGCTAGGGAACTTCAAAAGGGAAATGAAATCGTTTTGCCAAATCATTTAGGAACCTTGCTTTATCACGCTAAAAATATTCATAAAGTTGATGAACCAAAGCATTTGGACTATGTTGATAAACATAAAGATGAATTTAAGGAGTTGCTAGATGTTGTGTCAAACTTTTCTAAAAAATATACTTTAGCAGAAGGAAATTTAGAAAAAATCAAAGAATTATATGCACAAAATAATGGTGAAGATCTTAAAGAATTAGCAAGTTCATTTATCAACTTATTAACATTTACTGCTATAGGAGCACCGGCTACTTTTAAATTCTTTGATAAAAATATTGATCGAAAACGATATACTTCAACTACTGAAATTCTCAACGCTACCCTCATCCACCAATCCATCACCGGTCTTTATG
AAACGCGGATTGATCTCAATAAGTTAGGAGGAGACTAA
SEQ ID NO:6
Sequence:
MKKPYSIGLDIGTNSVGWAVVTDDYKVPAKKMKVLGNTDKSHIEKNLLGALLFDSGNTAEDRRLKRTARRRYTRRRNRILYLQEIFSEEMGKVDDSFFHRLEDSFLVTEDKRGERHPIFGNLEEEVKYHENFPTIYHLRQYLADNPEKVDLRLVYLALAHIIKFRGHFLIEGKFDTRNNDVQRLFQEFLAVYDNTFENSSLQEQNVQVEEILTDKISKSAKKDRVLKLFPNEKSNGRFAEFLKLIVGNQADFKKHFELEEKAPLQFSKDTYEEELEVLLAQIGDNYAELFLSAKKLYDSILLSGILTVTDVGTKAPLSASMIQRYNEHQMDLAQLKQFIRQKLSDKYNEVFSDVSKDGYAGYIDGKTNQEAFYKYLKGLLNKIEGSGYFLDKIEREDFLRKQRTFDNGSIPHQIHLQEMRAIIRRQAEFYPFLADNQDRIEKLLTFRIPYYVGPLARGKSDFAWLSRKSADKITPWNFDEIVDKESSAEAFINRMTNYDLYLPNQKVLPKHSLLYEKFTVYNELTKVKYKTEQGKTAFFDANMKQEIFDGVFKVYRKVTKDKLMDFLEKEFDEFRIVDLTGLDKENKVFNASYGTYHDLCKILDKDFLDNSKNEKILEDIVLTLTLFEDREMIRKRLENYSDLLTKEQVKKLERRHYTGWGRLSAELIHGIRNKESRKTILDYLIDDGNSNRNFMQLINDDALSFKEEIAKAQVIGETDNLNQVVSDIAGSPAIKKGILQSLKIVDELVKIMGHQPENIVVEMARENQFTNQGRRNSQQRLKGLTDSIKEFGSQILKEHPVENSQLQNDRLFLYYLQNGRDMYTGEELDIDYLSQYDIDHIIPQAFIKDNSIDNRVLTSSKENRGKSDDVPSKDVVRKMKSYWSKLLSAKLITQRKFDNLTKAERGGLTDDDKAGFIKRQLVETRQITKHVARILDERFNTETDENNKKIRQVKIVTLKSNLVSNFRKEFELYKVREINDYHHAHDAYLNAVIGKALLGVYPQLEPEFVYGDYPHFHGHKENKATAKKFFYSNIMNFFKKDDVRTDKNGEIIWKKDEHISNIKKVLSYPQVNIVKKVEEQTGGFSKESILPKGNSDKLIPRKTKKFYWDTKKYGGFDSPIVAYSILVIADIEKGKSKKLKTVKALVGVTIMEKMTFERDPVAFLERKGYRNVQEENIIKLPKYSLFKLENGRKRLLASARELQKGNEIVLPNHLGTLLYHAKNIHKVDEPKHLDYVDKHKDEFKELLDVVSNFSKKYTLAEGNLEKIKELYAQNNGEDLKELASSFINLLTFTAIGAPATFKFFDKNIDRKRYTSTTEILNATLIHQSITGLYETRIDLNKLGGD
SEQ ID NO:7
Name: Neisseria meningitidis Cas9
Sequence:
ATGGCTGCCTTCAAACCTAATTCAATCAACTACATCCTCGGCCTCGATATCGGCATCGCATCCGTCGGCTGGGCGATGGTAGAAATTGACGAAGAAGAAAACCCCATCCGCCTGATTGATTTGGGCGTGCGCGTATTTGAGCGTGCCGAAGTACCGAAAACAGGCGACTCCCTTGCCATGGCAAGGCGTTTGGCGCGCAGTGTTCGCCGCCTGACCCGCCGTCGCGCCCACCGCCTGCTTCGGACCCGCCGCCTATTGAAACGCGAAGGCGTATTACAAGCCGCCAATTTTGACGAAAACGGCTTGATTAAATCCTTACCGAATACACCATGGCAACTTCGCGCAGCCGCATTAGACCGCAAACTGACGCCTTTAGAGTGGTCGGCAGTCTTGTTGCATTTAATCAAACATCGCGGCTATTTATCGCAACGGAAAAACGAGGGCGAAACTGCCGATAAGGAGCTTGGCGCTTTGCTTAAAGGCGTAGCCGGCAATGCCCATGCCTTACAGACAGGCGATTTCCGCACACCGGCCGAATTGGCTTTAAATAAATTTGAGAAAGAAAGCGGCCATATCCGCAATCAGCGCAGCGATTATTCGCATACGTTCAGCCGCAAAGATTTACAGGCGGAGCTGATTTTGCTGTTTGAAAAACAAAAAGAATTTGGCAATCCGCATGTTTCAGGCGGCCTTAAAGAAGGTATTGAAACCCTACTGATGACGCAACGCCCTGCCCTGTCCGGCGATGCCGTTCAAAAAATGTTGGGGCATTGCACCTTCGAACCGGCAGAGCCGAAAGCCGCTAAAAACACCTACACAGCCGAACGTTTCATCTGGCTGACCAAGCTGAACAACCTGCGTATTTTAGAGCAAGGCAGCGAGCGGCCATTGACCGATACCGAACGCGCCACGCTTATGGACGAGCCATACAGAAAATCCAAACTGACTTACGCACAAGCCCGTAAGCTGCTGGGTTTAGAAGATACCGCCTTTTTCAAAGGCTTGCGCTATGGTAAAGACAATGCCGAAGCCTCAACATTGATGGAAATGAAGGCCTACCATGCCATCAGCCGTGCACTGGAAAAAGAAGGATTGAAAGACAAAAAATCCCCATTAAACCTTTCTCCCGAATTACAAGACGAAATCGGCACGGCATTCTCCCTGTTCAAAACCGATGAAGACATTACAGGCCGTCTGAAAGACCGTATACAGCCCGAAATCTTAGAAGCGCTGTTGAAACACATCAGCTTCGATAAGTTCGTCCAAATTTCCTTGAAAGCATTGCGCCGAATTGTGCCTCTAATGGAACAAGGCAAACGTTACGATGAAGCCTGCGCCGAAATCTACGGAGACCATTACGGCAAGAAGAATACGGAAGAAAAGATTTATCTGCCGCCGATTCCCGCCGACGAAATCCGCAACCCCGTCGTCTTGCGCGCCTTATCTCAAGCACGTAAGGTCATTAACGGCGTGGTACGCCGTTACGGCTCCCCAGCTCGTATCCATATTGAAACTGCAAGGGAAGTAGGTAAATCGTTTAAAGACCGCAAAGAAATTGAGAAACGCCAAGAAGAAAACCGCAAAGACCGGGAAAAAGCCGCCGCCAAATTCCGAGAGTATTTCCCCAATTTTGTCGGAGAACCCAAATCCAAAGATATTCTGAAACTGCGCCTGTACGAGCAACAACACGGCAAATGCCTGTATTCGGGCAAAGAAATCAACTTAGGCCGTCTGAACGAAAAAGGCTATGTCGAAATCGACCATGCCCTGCCGTTCTCGCGCACATGGGACGACAGTTTCAACAATAAAGTACTGGTATTGGGCAGCGAAAACCAAAACAAAGGCAATCAAACCCCTTACGAATACTTCAACGGCAAAGACAACAGCCGCGAATGGCAGGAATTTAAAGCGCGTGTCGAAACCAGCCGTTTCCCGCGCAGTAAAAAACAACGGATTCTGCTGCAAAAATTCGATGAAGACGGCTTTAAAGAACGCAATCT
GAACGACACGCGCTACGTCAACCGTTTCCTGTGTCAATTTGTTGCCGACCGTATGCGGCTGACAGGTAAAGGCAAGAAACGTGTCTTTGCATCCAACGGACAAATTACCAATCTGTTGCGCGGCTTTTGGGGATTGCGCAAAGTGCGTGCGGAAAACGACCGCCATCACGCCTTGGACGCCGTCGTCGTTGCCTGCTCGACCGTTGCCATGCAGCAGAAAATTACCCGTTTTGTACGCTATAAAGAGATGAACGCGTTTGACGGTAAAACCATAGACAAAGAAACAGGAGAAGTGCTGCATCAAAAAACACACTTCCCACAACCTTGGGAATTTTTCGCACAAGAAGTCATGATTCGCGTCTTCGGCAAACCGGACGGCAAACCCGAATTCGAAGAAGCCGATACCCTAGAAAAACTGCGCACGTTGCTTGCCGAAAAATTATCATCTCGCCCCGAAGCCGTACACGAATACGTTACGCCACTGTTTGTTTCACGCGCGCCCAATCGGAAGATGAGCGGGCAAGGGCATATGGAGACCGTCAAATCCGCCAAACGACTGGACGAAGGCGTCAGCGTGTTGCGCGTACCGCTGACACAGTTAAAACTGAAAGACTTGGAAAAAATGGTCAATCGGGAGCGCGAACCTAAGCTATACGAAGCACTGAAAGCACGGCTGGAAGCACATAAAGACGATCCTGCCAAAGCCTTTGCCGAGCCGTTTTACAAATACGATAAAGCAGGCAACCGCACCCAACAGGTAAAAGCCGTACGCGTAGAGCAAGTACAGAAAACCGGCGTATGGGTGCGCAACCATAACGGTATTGCCGACAACGCAACCATGGTGCGCGTAGATGTGTTTGAGAAAGGCGACAAGTATTATCTGGTACCGATTTACAGTTGGCAGGTAGCGAAAGGGATTTTGCCGGATAGGGCTGTTGTACAAGGAAAAGATGAAGAAGATTGGCAACTTATTGATGATAGTTTCAACTTTAAATTCTCATTACACCCTAATGATTTAGTCGAGGTTATAACAAAAAAAGCTAGAATGTTTGGTTACTTTGCCAGCTGCCATCGAGGCACAGGTAATATCAATATACGCATTCATGATCTTGATCATAAAATTGGCAAAAATGGAATACTGGAAGGTATCGGCGTCAAAACCGCCCTTTCATTCCAAAAATACCAAATTGACGAACTGGGCAAAGAAATCAGACCATGCCGTCTGAAAAAACGCCCGCCTGTCCGTTAA
SEQ ID NO:8
Sequence:
MAAFKPNSINYILGLDIGIASVGWAMVEIDEEENPIRLIDLGVRVFERAEVPKTGDSLAMARRLARSVRRLTRRRAHRLLRTRRLLKREGVLQAANFDENGLIKSLPNTPWQLRAAALDRKLTPLEWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVAGNAHALQTGDFRTPAELALNKFEKESGHIRNQRSDYSHTFSRKDLQAELILLFEKQKEFGNPHVSGGLKEGIETLLMTQRPALSGDAVQKMLGHCTFEPAEPKAAKNTYTAERFIWLTKLNNLRILEQGSERPLTDTERATLMDEPYRKSKLTYAQARKLLGLEDTAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLKDKKSPLNLSPELQDEIGTAFSLFKTDEDITGRLKDRIQPEILEALLKHISFDKFVQISLKALRRIVPLMEQGKRYDEACAEIYGDHYGKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRYGSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKAAAKFREYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLGRLNEKGYVEIDHALPFSRTWDDSFNNKVLVLGSENQNKGNQTPYEYFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDEDGFKERNLNDTRYVNRFLCQFVADRMRLTGKGKKRVFASNGQITNLLRGFWGLRKVRAENDRHHALDAVVVACSTVAMQQKITRFVRYKEMNAFDGKTIDKETGEVLHQKTHFPQPWEFFAQEVMIRVFGKPDGKPEFEEADTLEKLRTLLAEKLSSRPEAVHEYVTPLFVSRAPNRKMSGQGHMETVKSAKRLDEGVSVLRVPLTQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPAKAFAEPFYKYDKAGNRTQQVKAVRVEQVQKTGVWVRNHNGIADNATMVRVDVFEKGDKYYLVPIYSWQVAKGILPDRAVVQGKDEEDWQLIDDSFNFKFSLHPNDLVEVITKKARMFGYFASCHRGTGNINIRIHDLDHKIGKNGILEGIGVKTALSFQKYQIDELGKEIRPCRLKKRPPVR
SEQ ID NO:9
Sequence:
ATGAACAATAACAATTACTCTATCGGACTCGATATCGGAACAAACAGCGTCGGATGGGCCGTCATTACGGATGACTATAAGGTGCCATCGAAAAAGATGAAAGTTCTAGGCAATACAGATAAACACTTTATCAAGAAAAATCTAATTGGAGCTTTATTATTTGATGAAGGAGCTACTGCTGAAGATAGACGTTTCAAACGAACAGCACGCCGTCGCTATACTCGTCGAAAAAATCGTCTTCGCTATCTTCAAGAAATCTTTTCTGAGGAAATGAGCAAAGTGGATAGTAGTTTCTTTCATCGATTAGATGACTCATTCTTAGTTCCTGAGGATAAAAGAGGAAGTAAATATCCTATTTTTGCTACCTTGGCAGAAGAAAAAGAATATCACAAGAAATTTCCAACTATCTATCATTTGAGAAAACACCTTGCGGACTCAAAAGAAAAAACTGACTTGCGCTTGATCTATCTAGCATTAGCGCATATGATTAAATACCGCGGACATTTTTTGTATGAAGAATCTTTCGATATTAAAAACAATGATATCCAAAAAATCTTTAGCGAGTTTATAAGCATTTACGACAACACCTTTGAAGGAAGTTCACTTAGTGGACAAAATGCACAAGTAGAAGCAATTTTTACTGATAAAATTAGTAAATCTGCTAAGAGAGAACGCATTCTAAAACTCTTTGCTTATGAAAAATCCACTGATCTATTTTCAGAATTTCTCAAGCTGATTGTAGGAAATCAAGCTGATTTTAAGAAACACTTTGACTTGGAAGAAAAAGCTCCACTACAATTCTCTAAAGATACCTATGATGAGGATTTGGAAAACTTACTCGGACAAATTGGAGATGACTTTGCAGACCTTTTCCTAGTTGCTAAAAAACTCTATGATGCCATTCTTTTATCAGGAATCTTAACTGTTACAGATTCTTCAACTAAGGCCCCACTATCAGCATCTATGATTGAGCGCTATGAAAACCACCAAAAAGACTTAGCGGCTTTAAAACAATTCATCCAAAACAATCTTCAAGAAAAATATGATGAAGTTTTCTCTGACCAATCTAAAGATGGGTATGCTAGGTATATCAATGGCAAAACCACTCAAGAAGCATTTTACAAGTACATCAAAAATCTTCTCTCTAAATTCGAAGGATCAGATTATTTCCTTGATAAAATTGAACGTGAAGATTTCTTGAGAAAACAACGCACCTTTGATAATGGTTCTATCCCTCATCAAATTCATCTTCAAGAAATGAATGCCATTATCCGTCGGCAAGGAGAACATTATCCATTTCTGAAGGAATATAAAGAAAAGATAGAGACAATCTTGACTTTCCGTATTCCTTATTATGTTGGCCCATTGGCTCGTGGAAATCGTAATTTTGCTTGGCTTACTCGAAACTCTGACCAAGCAATCCGACCTTGGAATTTTGAAGAAATTGTTGATCAAGCAAGCTCTGCGGAAGAATTCATCAATAAGATGACTAACTATGACTTGTATCTGCCAGAGGAAAAAGTTTTGCCCAAGCATAGTCTCTTGTATGAAACATTTGCTGTCTACAATGAATTAACAAAAGTAAAATTTATTTCAGAGGGATTGAGAGACTATCAATTCCTTGATAGTGGGCAAAAGAAGCAAATTGTCAATCAATTATTCAAAGAGAAAAGAAAAGTAACTGAAAAAGACATCATTCAGTATCTACACAATGTTGATGGCTACGATGGAATCGAACTAAAAGGAATTGAAAAACAATTTAACGCTAGTCTTTCTACTTATCATGATTTACTCAAAATAATCAAGGATAAAGAGTTTATGGATGATCCTAAAAATGAAGAGATTCTTGAAAATATCGTCCACACACTAACTATCTTTGAAGATCGTGAGATGATCAAGCAACGCCTTGCTCAATATGCCTCTATCTTTGATAAAAAAGTGATCAAGGCACTGACTCGTCGACATTATACTGGTTGGGGAAAACTCTCTGCTAAGCTAAT
CAACGGTATCTGTGATAAAAAAACTGGTAAAACAATTCTTGACTACTTGATTGATGACGGCTACAGCAATCGTAACTTTATGCAGTTAATCAATGATGACGGGCTTTCCTTCAAAGATATTATTCAAAAAGCACAAGTGGTTGGTAAGACAAACGATGTGAAGCAAGTTGTCCAAGAACTCCCAGGTAGTCCTGCTATTAAAAAGGGAATTTTACAAAGTATCAAGCTTGTCGATGAGCTTGTCAAAGTTATGGGCCATGCTCCCGAGTCCATTGTGATTGAAATTGCACGAGAAAATCAGACAACTGCCAGAGGGAAAAAGAATTCTCAACAAAGATATAAGCGCATTGAAGATGCACTAAAAAATTTAGCACCTGGGCTTGATTCAAATATATTAAAAGAACATCCAACAGATAATATTCAACTTCAAAATGACCGTCTCTTCCTTTACTATCTCCAAAATGGGAAGGATATGTACACTGGAGAAGCTCTTGATATCAACCAACTGAGCAGCTATGACATTGACCACATCGTCCCACAGGCCTTTATCAAGGATGATTCTCTTGATAACCGTGTCTTGACTAGTTCAAAGGATAATCGTGGGAAATCCGATAATGTTCCAAGTTTAGAAGTCGTTCAAAAAAGAAAAGCTTTTTGGCAACAATTACTAGATTCCAAATTGATTTCAGAACATAAATTTAATAATTTAACCAAGGCTGAACGTGGTGGGCTAGATGAGCGAGATAAAGTTGGCTTTATCAGACGCCAACTAGTTGAAACACGGCAAATCACAAAACATGTTGCTCAGATTTTGGATGCCCGTTTTAATACAGAAGTGAATGAGAAAGATAAGAAGAACCGTACCGTCAAAATTATCACTTTGAAATCCAATCTAGTTTCCAACTTCCGTAAAGAATTTAAGTTATATAAGGTACGCGAAATCAATGACTACCACCATGCACATGATGCCTATTTAAATGCAGTGGTGGCTAAGGCTATCCTTAAGAAATATCCTAAACTAGAGCCTGAATTCGTCTATGGTGACTATCAAAAGTACGATATTAAGAGATATATTTCCAGATCCAAAGATCCTAAAGAAGTTGAAAAAGCAACTGAAAAGTATTTCTTCTACTCAAACTTGTTGAACTTCTTTAAAGAAGAGGTGCATTACGCAGACGGAACCATCGTAAAACGAGAGAATATCGAATACTCTAAGGACACTGGAGAAATCGCTTGGAATAAAGAAAAAGATTTCGCTACAATTAAAAAAGTTCTTTCACTTCCGCAGGTGAATATTGTGAAGAAAACAGAGATTCAAACACATGGTCTAGATAGAGGTAAACCTAGAGGATTGTTCAATTCCAATCCATCTCCTAAACCTTCAGAAGATCGTAAAGAAAACCTTGTCCCAATTAAACAAGGGCTTGACCCACGAAAATACGGTGGTTACGCTGGTATTTCTAACTCATACGCGGTCTTAGTTAAAGCTATTATTGAAAAAGGAGCGAAAAAACAACAAAAGACCGTTCTTGAATTTCAAGGTATCTCTATTTTAGATAAAATAAATTTTGAAAAGAACAAAGAAAACTATCTTCTTGAAAAAGGATACATAAAAATTCTATCAACTATTACTTTACCTAAATATAGTTTGTTTGAGTTTCCTGATGGTACAAGAAGAAGACTAGCAAGTATTCTATCGACAAACAATAAACGAGGAGAAATTCATAAAGGTAATGAATTGGTCATCCCTGAAAAGTATACGACTCTTTTGTATCATGCTAAGAATATTAATAAAACACTTGAACCAGAACACTTAGAGTATGTTGAGAAACATCGAAATGATTTTGCTAAACTTTTAGAATATGTACTTAACTTTAACGATAAGTATGTAGGCGCATTAAAAAATGGAGAAAGAATCAGACAAGCATTTATTGATTGGGAAACAGTTGATATTGAAAAGTTATGTTTCAGTTTCATTGGTCCAAGAAATAGTAAAAATG
CTGGTTTATTCGAGTTAACTTCACAAGGAAGTGCTTCTGACTTCGAGTTCTTGGGAGTAAAAATTCCACGATACAGAGACTATACACCTTCGTCACTCCTCAACGCCACCCTCATCCACCAATCCATCACTGGTCTTTACGAGACTCGGATTGACTTAAGCAAACTGGGAGAAGACTGA
SEQ ID NO:10
Name: gi|777888062|gb|KJQ69483.1| CRISPR-associated endonuclease Cas9 [Streptococcus mitis]
Sequence:
MNNNNYSIGLDIGTNSVGWAVITDDYKVPSKKMKVLGNTDKHFIKKNLIGALLFDEGATAEDRRFKRTARRRYTRRKNRLRYLQEIFSEEMSKVDSSFFHRLDDSFLVPEDKRGSKYPIFATLAEEKEYHKKFPTIYHLRKHLADSKEKTDLRLIYLALAHMIKYRGHFLYEESFDIKNNDIQKIFSEFISIYDNTFEGSSLSGQNAQVEAIFTDKISKSAKRERILKLFAYEKSTDLFSEFLKLIVGNQADFKKHFDLEEKAPLQFSKDTYDEDLENLLGQIGDDFADLFLVAKKLYDAILLSGILTVTDSSTKAPLSASMIERYENHQKDLAALKQFIQNNLQEKYDEVFSDQSKDGYARYINGKTTQEAFYKYIKNLLSKFEGSDYFLDKIEREDFLRKQRTFDNGSIPHQIHLQEMNAIIRRQGEHYPFLKEYKEKIETILTFRIPYYVGPLARGNRNFAWLTRNSDQAIRPWNFEEIVDQASSAEEFINKMTNYDLYLPEEKVLPKHSLLYETFAVYNELTKVKFISEGLRDYQFLDSGQKKQIVNQLFKEKRKVTEKDIIQYLHNVDGYDGIELKGIEKQFNASLSTYHDLLKIIKDKEFMDDPKNEEILENIVHTLTIFEDREMIKQRLAQYASIFDKKVIKALTRRHYTGWGKLSAKLINGICDKKTGKTILDYLIDDGYSNRNFMQLINDDGLSFKDIIQKAQVVGKTNDVKQVVQELPGSPAIKKGILQSIKLVDELVKVMGHAPESIVIEIARENQTTARGKKNSQQRYKRIEDALKNLAPGLDSNILKEHPTDNIQLQNDRLFLYYLQNGKDMYTGEALDINQLSSYDIDHIVPQAFIKDDSLDNRVLTSSKDNRGKSDNVPSLEVVQKRKAFWQQLLDSKLISEHKFNNLTKAERGGLDERDKVGFIRRQLVETRQITKHVAQILDARFNTEVNEKDKKNRTVKIITLKSNLVSNFRKEFKLYKVREINDYHHAHDAYLNAVVAKAILKKYPKLEPEFVYGDYQKYDIKRYISRSKDPKEVEKATEKYFFYSNLLNFFKEEVHYADGTIVKRENIEYSKDTGEIAWNKEKDFATIKKVLSLPQVNIVKKTEIQTHGLDRGKPRGLFNSNPSPKPSEDRKENLVPIKQGLDPRKYGGYAGISNSYAVLVKAIIEKGAKKQQKTVLEFQGISILDKINFEKNKENYLLEKGYIKILSTITLPKYSLFEFPDGTRRRLASILSTNNKRGEIHKGNELVIPEKYTTLLYHAKNINKTLEPEHLEYVEKHRNDFAKLLEYVLNFNDKYVGALKNGERIRQAFIDWETVDIEKLCFSFIGPRNSKNAGLFELTSQGSASDFEFLGVKIPRYRDYTPSSLLNATLIHQSITGLYETRIDLSKLGED
SEQ ID NO:11
Sequence:
ATGACAAAACCTTATTCTATTGGACTTGATATTGGGACTAACTCTGTTGGTTGGGCTGTTGTGACAGATGGCTACAAAGTTCCTGCTAAGAAGATGAAGGTTCTGGGAAATACAGATAAAAGCCATATCAAGAAAAATTTACTTGGAGCTTTATTGTTTGATAGCGGTAATACTGCAAAAGACAGACGTTTGAAGCGGACAGCTAGGCGTCGATATACACGTCGTAGAAACCGTATTTTATATTTGCAGGAAATTTTTGCTGAAGAAATGGCTAAAGCAGACGAAAGTTTCTTCCAGCGCTTAAACGAATCGTTTTTAACAAATGATGACAAAGAATTTGATTCTCATCCAATCTTTGGGAATAAAGCTGAAGAGGAGGCTCATCACCATAAATTTCCAACAATTTTTCATTTGCGAAAGCATTTAGCAGACTCAACCGAGAAATCTGATTTGCGCTTAATTTATCTAGCTTTAGCGCATATGATTAAATTCCGGGGACATTTCTTAATTGAAGGTCAGCTAAAAGCTGAAAATACAAATGTTCAAACATTATTTGACGATTTTGTAGAAGTATATGATAAGACAGTTGAAGAAAGTCATTTATCAGAAATTAGTGTCTCCAGTATTCTGACAGAAAAAATTAGTAAATCGCGTCGCTTAGAAAATCTTATAAAATACTATCCCACTGAGAAGAAAAACACTCTCTTCGGAAATCTTATCGCCTTGTCTTTAGGATTACAGCCAAACTTTAAAACAAATTTTAAATTATCCGAAGATGCTAAACTACAGTTTTCTAAGGATACTTATGAAGAAGATTTAGGAGAATTACTTGGAAAAATCGGAGATAATTATGCAGATTTATTTATATCAGCTAAAAATCTTTATGATGCTATTTTGCTATCAGGAATTTTAACAATAGATGACAACACGACAAAGGCTCCGTTGTCTGCTTCAATGATTAAACGTTATGAGGAACATCAGGAAGATTTAGCACAACTTAAGAAATTTATCCGTCAGAATTTACCAGATCAATATAGTGAGGTTTTTTCTGATAAAACAAAGGATGGCTATGCTGGTTATATTGATGGAAAAACGAATCAGGAGGCCTTTTATAAATACATCAAAAATATGCTGTCAAAAACAGAAGGTGCAGATTATTTTCTTGACAAAATTGATCGTGAAGACTTTTTGAGAAAACAGAGAACGTTTGATAATGGTTCCGTTCCGCATCAGATTCATCTGCAAGAGATGCATGCTATTTTACGACGTCAGGGTGAATACTATCCATTCTTGAAAGAAAATCAGGATAAAATTGAAAAAATCTTAACGTTTAGAATTCCTTACTACGTTGGTCCTTTGGCGCGAAAAGGTAGCCGCTTTGCCTGGGCAGAATACAAGGCGGATAAAAAAGTTACGCCATGGAATTTTGATGATATTCTTGATAAAGAAAAATCAGCAGAAGAATTCATCACACGCATGACTTTAAATGATTTGTATTTACCTGAAGAAAAAGTCTTACCAAAGCATAGTCTTGTTTATGAAACGTTTAATGTTTACAATGAGTTAACTAAAGTTAAGTATGTCAATGAGCAAGGGAAAGCCATTTTCTTTGATGCCAATATGAAGCAAGAGATTTTTGATCATGTTTTTAAAGAAAATCGGAAAGTTACTAAAGATAAACTTTTAAATTATTTGAATAAAGAGTTTGAAGAATTTAGAATTGTTAACTTAACTGGACTGGATAAGGAAAATAAAGCCTTTAATTCCAGTCTTGGAACCTATCATGATTTGCGTAAAATTTTAGATAAATCATTCTTAGATGATAAAGTAAATGAAAAGATAATTGAGGATATCATTCAAACACTAACTCTGTTTGAAGACAGAGAAATGATTCGTCAGCGTCTTCAAAAGTATAGTGATATTTTTACAACACAGCAATTGAAAAAACTTGAACGCCGTCATTATACAGGTTGGGGAAGATTATCAGCGAAGTTAATCAA
TGGTATTCGAGATAAACAGAGTAATAAGACTATTCTGGGTTATTTGATTGATGATGGTTATAGCAATCGTAACTTTATGCAGTTGATTAATGACGATTCTCTTCCTTTTAAAGAAGAAATTGCTAGGGCACAAGTCATTGGAGAAACAGATGACTTAAATCAACTTGTTAGTGATATTGCTGGCAGTCCTGCTATTAAAAAGGGAATTTTACAAAGTCTGAAAATTGTAGATGAGCTTGTTAAAGTCATGGGGCATAATCCTGCTAACATTGTTATCGAAATGGCGCGTGAAAATCAGACTACAGCCAAAGGGCGTCGCAGTTCACAGCAACGTTATAAACGACTTGAGGAGGCAATAAAAAATCTTGACCATGATTTAAATCATAAGATTTTAAAAGAACACCCAACAGATAATCAAGCTTTACAGAATGACCGTCTTTTCTTATATTATCTCCAAAATGGCCGAGATATGTATACTGAAGATCCACTTGATATTAATCGTTTAAGTGATTATGATATCGACCATATTATTCCACAATCTTTTATAAAAGATGACTCTATTGACAATAAGGTTCTGGTTTCATCAGCTAAAAACCGTGGGAAATCGGATAATGTACCGAGTGAAGATGTTGTCAATAGGATGAGACCGTTTTGGAATAAATTATTGAGCTGTGGATTGATTTCTCAACGGAAATACAGCAATCTAACCAAAAAAGAATTAAAACCAGATGATAAGGCTGGTTTCATCAAACGTCAATTGGTTGAGACAAGACAAATTACAAAGCATGTTGCACAAATTTTAGACGCTCGTTTTAATACAAAACGTGATGAAAATAAAAAAGTAATTCGTGATGTCAAAATTATCACTTTAAAATCTAATTTAGTTTCACAATTTCGTAAAGACTTTAAATTTTACAAAGTACGTGAGATTAATGATTACCATCATGCGCATGACGCTTATCTTAATGCAGTTATAGGAAAAGCTTTATTAGATGTTTATCCGCAGTTAGAGCCCGAATTTGTTTATGGTGAGTACCCTCATTTTCATGGATATAAAGAAAATAAAGCAACTGCTAAGAAATTTTTCTATTCAAATATTATGAATTTTTTTAAGAAAGATGATATCCGTACCGATGAAAATGGTGAGATTGTTTGGAAAAAAGATGAGCATATTTCTAATATTAAAAGGGTGCTTTCCTATCCCCAAGTTAATATTGTTAAGAAAGTAGAAATACAGACTGTTGGACAAAATGGGGGACTTTTTGACGATAATCCTAAATCACCATTAGAGGTTACACCTAGTAAACTTGTTCCACTAAAAAAAGAATTAAACCCTAAAAAATATGGAGGATATCAAAAACCGACGACAGCTTATCCTGTTTTACTGATAACAGATACTAAACAGCTAATTCCAATCTCAGTAATGAATAAGAAGCAATTTGAACAAAATCCGGTTAAATTTTTAAGAGATAGAGGCTATCAACAGGTAGGAAAGAATGACTTTATTAAATTACCCAAATATACCCTAGTTGATATCGGTGATGGGATTAAACGCCTATGGGCTAGTTCGAAAGAAATACATAAAGGAAATCAATTAGTTGTATCTAAAAAATCTCAAATTTTGCTTTATCATGCACATCACTTAGATAGTGATTTGAGTAATGATTATCTTCAAAATCATAATCAACAATTCGATGTTTTATTTAATGAAATTATTTCTTTTTCTAAAAAATGTAAATTGGGAAAAGAACATATTCAGAAAATTGAAAATGTTTACTCCAATAAGAAGAATAGTGCATCAATAGAAGAATTAGCAGAGAGTTTTATTAAATTATTAGGATTTACACAATTAGGTGCAACTTCCCCATTTAATTTTTTAGGGGTAAAACTAAATCAAAAACAATATAAAGGTAAAAAAGATTATATTTTACCGTGTACAGAGGGGACCCTTATCCGCCAATCTATCACTGGTCTTTACGAAACACGAGTTGATCTTAGTA
AAATAGGAGAAGACTAA
SEQ ID NO:12
Name: gi|357584860|gb|EHJ52063.1| CRISPR-associated protein Cas9/Csn1, subtype II/NMEMI [Streptococcus macacae NCTC 11558]
Sequence:
MTKPYSIGLDIGTNSVGWAVVTDGYKVPAKKMKVLGNTDKSHIKKNLLGALLFDSGNTAKDRRLKRTARRRYTRRRNRILYLQEIFAEEMAKADESFFQRLNESFLTNDDKEFDSHPIFGNKAEEEAHHHKFPTIFHLRKHLADSTEKSDLRLIYLALAHMIKFRGHFLIEGQLKAENTNVQTLFDDFVEVYDKTVEESHLSEISVSSILTEKISKSRRLENLIKYYPTEKKNTLFGNLIALSLGLQPNFKTNFKLSEDAKLQFSKDTYEEDLGELLGKIGDNYADLFISAKNLYDAILLSGILTIDDNTTKAPLSASMIKRYEEHQEDLAQLKKFIRQNLPDQYSEVFSDKTKDGYAGYIDGKTNQEAFYKYIKNMLSKTEGADYFLDKIDREDFLRKQRTFDNGSVPHQIHLQEMHAILRRQGEYYPFLKENQDKIEKILTFRIPYYVGPLARKGSRFAWAEYKADKKVTPWNFDDILDKEKSAEEFITRMTLNDLYLPEEKVLPKHSLVYETFNVYNELTKVKYVNEQGKAIFFDANMKQEIFDHVFKENRKVTKDKLLNYLNKEFEEFRIVNLTGLDKENKAFNSSLGTYHDLRKILDKSFLDDKVNEKIIEDIIQTLTLFEDREMIRQRLQKYSDIFTTQQLKKLERRHYTGWGRLSAKLINGIRDKQSNKTILGYLIDDGYSNRNFMQLINDDSLPFKEEIARAQVIGETDDLNQLVSDIAGSPAIKKGILQSLKIVDELVKVMGHNPANIVIEMARENQTTAKGRRSSQQRYKRLEEAIKNLDHDLNHKILKEHPTDNQALQNDRLFLYYLQNGRDMYTEDPLDINRLSDYDIDHIIPQSFIKDDSIDNKVLVSSAKNRGKSDNVPSEDVVNRMRPFWNKLLSCGLISQRKYSNLTKKELKPDDKAGFIKRQLVETRQITKHVAQILDARFNTKRDENKKVIRDVKIITLKSNLVSQFRKDFKFYKVREINDYHHAHDAYLNAVIGKALLDVYPQLEPEFVYGEYPHFHGYKENKATAKKFFYSNIMNFFKKDDIRTDENGEIVWKKDEHISNIKRVLSYPQVNIVKKVEIQTVGQNGGLFDDNPKSPLEVTPSKLVPLKKELNPKKYGGYQKPTTAYPVLLITDTKQLIPISVMNKKQFEQNPVKFLRDRGYQQVGKNDFIKLPKYTLVDIGDGIKRLWASSKEIHKGNQLVVSKKSQILLYHAHHLDSDLSNDYLQNHNQQFDVLFNEIISFSKKCKLGKEHIQKIENVYSNKKNSASIEELAESFIKLLGFTQLGATSPFNFLGVKLNQKQYKGKKDYILPCTEGTLIRQSITGLYETRVDLSKIGED
SEQ ID NO:13
Sequence:
ATGGATAAGAAATACTCAATAGGCTTAGATATCGGCACAAATAGCGTCGGATGGGCGGTGATCACTGATGATTATAAGGTTCCGTCTAAAAAGTTCAAGGTTCTGGGAAATACAGACCGCCACAGTATCAAAAAAAATCTTATAGGGGCTCTTTTATTTGACAGTGGAGAGACAGCGGAAGCGACTCGTCTCAAACGGACAGCTCGTAGAAGGTATACACGTCGGAAGAATCGTATTTGTTATCTACAGGAGATTTTTTCAAATGAGATGGCGAAAGTAGATGATAGTTTCTTTCATCGACTTGAAGAGTCTTTTTTGGTGGAAGAAGACAAGAAGCATGAACGTCATCCTATTTTTGGAAATATAGTAGATGAAGTTGCTTATCATGAGAAATATCCAACTATCTATCATCTGCGAAAAAAATTGGTAGATTCTACTGATAAAGCGGATTTGCGCTTAATCTATTTGGCCTTAGCGCATATGATTAAGTTTCGTGGTCATTTTTTGATTGAGGGAGATTTAAATCCTGATAATAGTGATGTGGACAAACTATTTATCCAGTTGGTACAAACCTACAATCAATTATTTGAAGAAAACCCTATTAACGCAAGTGGAGTAGATGCTAAAGCGATTCTTTCTGCACGATTGAGTAAATCAAGACGATTAGAAAATCTCATTGCTCAGCTCCCCGGTGAGAAGAAAAATGGCTTATTTGGGAATCTCATTGCTTTGTCATTGGGTTTGACCCCTAATTTTAAATCAAATTTTGATTTGGCAGAAGATGCTAAATTACAGCTTTCAAAAGATACTTACGATGATGATTTAGATAATTTATTGGCGCAAATTGGAGATCAATATGCTGATTTGTTTTTGGCAGCTAAGAATTTATCAGATGCTATTTTACTTTCAGATATCCTAAGAGTAAATACTGAAATAACTAAGGCTCCCCTATCAGCTTCAATGATTAAACGCTACGATGAACATCATCAAGACTTGACTCTTTTAAAAGCTTTAGTTCGACAACAACTTCCAGAAAAGTATAAAGAAATCTTTTTTGATCAATCAAAAAACGGATATGCAGGTTATATTGATGGGGGAGCTAGCCAAGAAGAATTTTATAAATTTATCAAACCAATTTTAGAAAAAATGGATGGTACTGAGGAATTATTGGTGAAACTAAATCGTGAAGATTTGCTGCGCAAGCAACGGACCTTTGACAACGGCTCTATTCCCCATCAAATTCACTTGGGTGAGCTGCATGCTATTTTGAGAAGACAAGAAGACTTTTATCCATTTTTAAAAGACAATCGTGAGAAGATTGAAAAAATCTTGACTTTTCGAATTCCTTATTATGTTGGTCCATTGGCGCGTGGCAATAGTCGTTTTGCATGGATGACTCGGAAGTCTGAAGAAACAATTACCCCATGGAATTTTGAAGAAGTTGTCGATAAAGGTGCTTCAGCTCAATCATTTATTGAACGCATGACAAACTTTGATAAAAATCTTCCAAATGAAAAAGTACTACCAAAACATAGTTTGCTTTATGAGTATTTTACGGTTTATAACGAATTGACAAAGGTCAAATATGTTACTGAAGGAATGCGAAAACCAGCATTTCTTTCAGGTGAACAGAAGAAAGCCATTGTTGATTTACTCTTCAAAACAAATCGAAAAGTAACCGTTAAGCAATTAAAAGAAGATTATTTCAAAAAAATAGAATGTTTTGATAGTGTTGAAATTTCAGGAGTTGAAGATAGATTTAATGCTTCATTAGGTACCTACCATGATTTGCTAAAAATTATTAAAGATAAAGATTTTTTGGATAATGAAGAAAATGAAGATATCTTAGAGGATATTGTTTTAACATTGACCTTATTTGAAGATAGGGAGATGATTGAGGAAAGACTTAAAACATATGCTCACCTCTTTGATGATAAGGTGATGAAACAGCTTAAACGTCGCCGTTATACTGGTTGGGGACGTTTGTCTCGAAAATTGAT
TAATGGTATTAGGGATAAGCAATCTGGCAAAACAATATTAGATTTTTTGAAATCAGATGGTTTTGCCAATCGCAATTTTATGCAGCTGATCCATGATGATAGTTTGACATTTAAAGAAGACATTCAAAAAGCACAAGTGTCTGGACAAGGCGATAGTTTACATGAACATATTGCAAATTTAGCTGGTAGCCCTGCTATTAAAAAAGGTATTTTACAGACTGTAAAAGTTGTTGATGAATTGGTCAAAGTAATGGGGCGGCATAAGCCAGAAAATATCGTTATTGAAATGGCACGTGAAAATCAGACAACTCAAAAGGGCCAGAAAAATTCGCGAGAGCGTATGAAACGAATCGAAGAAGGTATCAAAGAATTAGGAAGTCAGATTCTTAAAGAGCATCCTGTTGAAAATACTCAATTGCAAAATGAAAAGCTCTATCTCTATTATCTCCAAAATGGAAGAGACATGTATGTGGACCAAGAATTAGATATTAATCGTTTAAGTGATTATGATGTCGATCACATTGTTCCACAAAGTTTCCTTAAAGACGATTCAATAGACAATAAGGTCTTAACGCGTTCTGATAAAAATCGTGGTAAATCGGATAACGTTCCAAGTGAAGAAGTAGTCAAAAAGATGAAAAACTATTGGAGACAACTTCTAAACGCCAAGTTAATCACTCAACGTAAGTTTGATAATTTAACGAAAGCTGAACGTGGAGGTTTGAGTGAACTTGATAAAGCTGGTTTTATCAAACGCCAATTGGTTGAAACTCGCCAAATCACTAAGCATGTGGCACAAATTTTGGATAGTCGCATGAATACTAAATACGATGAAAATGATAAACTTATTCGAGAGGTTAAAGTGATTACCTTAAAATCTAAATTAGTTTCTGACTTCCGAAAAGATTTCCAATTCTATAAAGTACGTGAGATTAACAATTACCATCATGCCCATGATGCGTATCTAAATGCCGTCGTTGGAACTGCTTTGATTAAGAAATATCCAAAACTTGAATCGGAGTTTGTCTATGGTGATTATAAAGTTTATGATGTTCGTAAAATGATTGCTAAGTCTGAGCAAGAAATAGGCAAAGCAACCGCAAAATATTTCTTTTACTCTAATATCATGAACTTCTTCAAAACAGAAATTACACTTGCAAATGGAGAGATTCGCAAACGCCCTCTAATCGAAACTAATGGGGAAACTGGAGAAATTGTCTGGGATAAAGGGCGAGATTTTGCCACAGTGCGCAAAGTATTGTCCATGCCCCAAGTCAATATTGTCAAGAAAACAGAAGTACAGACAGGCGGATTCTCCAAGGAGTCAATTTTACCAAAAAGAAATTCGGACAAGCTTATTGCTCGTAAAAAAGACTGGGATCCAAAAAAATATGGTGGTTTTGATAGTCCAACGGTAGCTTATTCAGTCCTAGTGGTTGCTAAGGTGGAAAAAGGGAAATCGAAGAAGTTAAAATCCGTTAAAGAGTTACTAGGGATCACAATTATGGAAAGAAGTTCCTTTGAAAAAAATCCGATTGACTTTTTAGAAGCTAAAGGATATAAGGAAGTTAAAAAAGACTTAATCATTAAACTACCTAAATATAGTCTTTTTGAGTTAGAAAACGGTCGTAAACGGATGCTGGCTAGTGCCGGAGAATTACAAAAAGGAAATGAGCTGGCTCTGCCAAGCAAATATGTGAATTTTTTATATTTAGCTAGTCATTATGAAAAGTTGAAGGGTAGTCCAGAAGATAACGAACAAAAACAATTGTTTGTGGAGCAGCATAAGCATTATTTAGATGAGATTATTGAGCAAATCAGTGAATTTTCTAAGCGTGTTATTTTAGCAGATGCCAATTTAGATAAAGTTCTTAGTGCATATAACAAACATAGAGACAAACCAATACGTGAACAAGCAGAAAATATTATTCATTTATTTACGTTGACGAATCTTGGAGCTCCCGCTGCTTTTAAATATTTTGATACAACAATTGATCGTA
AACGATATACGTCTACAAAAGAAGTTTTAGATGCCACTCTTATCCATCAATCCATCACTGGTCTTTATGAAACACGCATTGATTTGAGTCAGCTAGGAGGTGACTGA
SEQ ID NO:14
Name: gi|409693032|gb|AFV37892.1| CRISPR-associated protein, Csn1 family [Streptococcus pyogenes A20]
Sequence:
MDKKYSIGLDIGTNSVGWAVITDDYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD
SEQ ID NO:15
Name: gi|150381361|gb|EF472760.1| HIV-1 clone 39B from USA integrase (pol) gene, partial cds
Sequence:
TTTTTGGATGGAATAGATAGGGCCCAAGAAGAGCATGAGAAATATCACAATAATTGGAGAGCAATGGCTAGTGATTTTAACCTGCCACCTNTAGTAGCAAAGGAGATAGTAGCCAGCTGTGATAAATGTCAGCTAAAAGGAGAAGCCATGCATGGACAAGTAGACTGTAGTCCAGGAATATGGCAACTAGATTGTACACATNTAGAAGGAAAAGTTATCCTGGTAGCAGTNCATGTAGCCAGTGGTTATATAGAAGCAGAAGTTATTCCAGCAGAGACAGGGCAGGAAACAGCATACTTCCTCTTAAAATTAGCAGGAAGATGGCCAGTAAAAACAGTACATACAGACAATGGCAGCAACTTCACCAGTGCTGCGNTGAAGGCCGCCTGTTGGTGGGCAGGGATCAAGCAGGAATTTGGCATTCCCTACAATCCCCAAAGTCAAGGAGTAGTAGAGTCTATGAATAATGAATTAAAGAAAATTGTAGGACAAGTAAGAGATCAGGCTGAGCATCTCAAGACAGCAGTACAAATGGCAGTATTCATCCACAATTTTAAAAGAAAAGGGGGGATTGGGGGGTACAGTGCAGGAGAAAGAATAGTAGACATAATAGCCACAGACATACAAACTAAAGAACTACAAAAAAATATTACAAAAATGCAAAATTTTCGGGTCTATTTCAGAGACAGCAGAGATCCACTTTGGAAAGGACCAGCAAAGCTTCTCTGGAAAGGTGAAGGGGCAGTAGTAATACAAGATACCAATGACATAAARGTAGTGCCARGAAGAAAAGCAAAGATCATTAGAGATTATGGAAAACAGATGGCAGGTGATGATTGTGTGGCAAGTAGACAGGNTGAGGATTAG
SEQ ID NO:16
Name: gi|150381362|gb|ABR68182.1| integrase, partial [Human immunodeficiency virus 1]
Sequence:
FLDGIDRAQEEHEKYHNNWRAMASDFNLPPXVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHXEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTVHTDNGSNFTSAAXKAACWWAGIKQEFGIPYNPQSQGVVESMNNELKKIVGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKNITKMQNFRVYFRDSRDPLWKGPAKLLWKGEGAVVIQDTNDIKVVPXRKAKIIRDYGKQMAGDDCVASRQXED
SEQ ID NO:17
Name: gi|459980|gb|L20651.1|STLKIAPOL Simian T-cell lymphotropic virus type I integrase (pol) gene, partial cds
Sequence:
GACTTGTAGAACGCTCTAATGGCATTCTTAAAACCCTATTATATAAGTACTTTACTGACAAACCCGACCT
ACCTATGGATAATGCTCTATCCATAGCCCTATGGACGATCAACCACCTGAATGTGTTAACCCACTGCCAC
SEQ ID NO:18
Name: gi|459981|gb|AAA47841.1| integrase, partial [Simian T-lymphotropic virus 1]
Sequence:
LVERSNGILKTLLYKYFTDKPDLPMDNALSIALWTINHLNVLTHCH
SEQ ID NO:19
Name: gi|321156784:1-1509 Streptococcus pneumoniae integrative and conjugative element ICESpn11930, strain 11930
Sequence:
GAGTTTTTTTCCTTTCGTAGCAAGGGTTTAGAGCCCCTATTTTATTTTACTATTGTCTAAACACCAAGCG
AACACCAAAACTACCATGCAATGGAAAAACCTCTGATTTGATTCTCACTTGATTTCACAATCTTTATATC
AAACTGTGGGTGGTATTTGACAATATCTTTTTTGATTTTTAATAGTAAATTCGAAATAATATTTTTAGGT
GAGTAACGTGGACTAAGATGTAACAAGTCTTTGAACTCATCGACACTTAATTCTACTTTATTGCTATTAT
CACTAGTTTCAATGAATTTTTCAATTATTCTGGAATATTTACAGGTATAACTTTTCAATTCTTCAAAATG
GAAATTGTGATTTTCTACAAATTGATTTAAGGCTTTTACAGTATTTTCTTGTGAACGATTTATATTATGT
GTATAGCCCATTGTTGTCTCAAAGTTAGCGTGTCCTACTCTAGTCATAATATCTTTCACTGCTATGTGCA
TCTCATTACTTTGAAGGTAACTAATATGCATATGCCTAAACGAATGGGGAGTAACATGTTTTACCCACTT
AAAACCATAGTCACTTAAACAATTTGTCAATAATTTTCCTTCTATTCGTTTCAAAATTTGACGAAAAGTG
CTTGATGTTATTGGAGAGCCGTATTCTGTTCTAAATACACTTTCAGAATGTGTAAAAGCAGGACAGGGAT
GTTTCTCCATATAAGCATCAAACTCTTTATTTCTCTGTATTGTCCTTTTAATAGCTTCGCTTGCAGCTTC
AGGCAAAGCTACTTCTCTAATTGAATTGAGTGTTTTAGTTGTATCAAAATGAAATTGTTTAACTTTTAAA
CAATGATATTGAAGTGCTTTATCAATATGCAAGATTCCTTTTTCAAAATCAATATCTGATGGTAAAAATG
CTGCTTCACTAATTCGAATACCTGTAAGCAACAATACTATAGCAAGATCATAATAGTTTGCATTTCTGCA
TTGGCGTAACACATCAAAAAATGCATGTAATTCATGGATTTCTAGAAATTTAGAATCATGTCTTTCTTTT
GCTTTACGCCTTTTCTCTAGTGAAATATCTAGTTTTACCGCAGTCATTGGAGAAAACTTAATGACATTAT
ATAACACACCATGATTAAAAATCTTATTACAAGTACTTTTTATATGAGTCATTGTTGAAGGCGATGCATC
ATACATTTCTAAATATTTATTGAGACTATTTTTCATCAGAAGTGGAGTAATCCTGTCTAACAAAAAATCA
TCTCCTATAATTTTCCCAAGACGCTTCATAACCAGTAGTTCTCTCTGAATTGTTTGTGGTTTAACAGAGA
CACACCAAGTCTGAAACCAATTTTCTTTTAACTCTCCAAATGTTGTAATCAGTTCAGGACTATACTGACT
TTCAAATGAAGTAGTTAGTCTATCTATTTTATCAAGAACCTCTCTTTCAGCTTGTTTCCTCGCCCTACTA
GTATTCTTAGTATAACTTACAGTTACTGATTTCCACTTT
SEQ ID NO:20
Name: gi|321156785|emb|CBW38769.1| integrase [Streptococcus pneumoniae]
Sequence:
MYYVTKTNSKGQPLYQVVEKYKDPLTGKWKSVTVSYTKNTSRARKQAEREVLDKIDRLTTSFESQYSPEL
ITTFGELKENWFQTWCVSVKPQTIQRELLVMKRLGKIIGDDFLLDRITPLLMKNSLNKYLEMYDASPSTM
THIKSTCNKIFNHGVLYNVIKFSPMTAVKLDISLEKRRKAKERHDSKFLEIHELHAFFDVLRQCRNANYY
DLAIVLLLTGIRISEAAFLPSDIDFEKGILHIDKALQYHCLKVKQFHFDTTKTLNSIREVALPEAASEAI
KRTIQRNKEFDAYMEKHPCPAFTHSESVFRTEYGSPITSSTFRQILKRIEGKLLTNCLSDYGFKWVKHVT
PHSFRHMHISYLQSNEMHIAVKDIMTRVGHANFETTMGYTHNINRSQENTVKALNQFVENHNFHFEELKS
YTCKYSRIIEKFIETSDNSNKVELSVDEFKDLLHLSPRYSPKNIISNLLLKIKKDIVKYHPQFDIKIVKS
SENQIRGFSIAW
SEQ ID NO:21
Name: gi|43090:1-436 Escherichia coli (Tn5086) dhfrVII gene for dihydrofolate reductase type VII, and sulI gene, 5' end (integrase)
Sequence:
GCATGCCCGTTCCATACAGAAGCTGGGCGAACAAACGATGCTCGCCTTCCAGAAAACCGAGGATGCGAAC
CACTTCATCCGGGGTCAGCACCACCGGCAAGCGCCGCGACGGCCGAGGTCTTCCGATCTCCTGAAGCCAG
GGCAGATCCGTGCACAGCACCTTGCCGTAGAAGAACAGCAAGGCCGCCAATGCCTGACGATGCGTGGAGA
CCGAAACCTTGCGCTCGTTCGCCAGCCAGGACAGAAATGCCTCGACTTCGCTGCTGCCCAAGGTTGCCGG
GTGACGCACACCGTGGAAACGGATGAAGGCACGAACCCAGTGGACATAAGCCTGTTCGGTTCGTAAGCTG
TAATGCAAGTAGCGTATGCGCTCACGCAACTGGTCCAGAACCTTGACCGAACGCAGCGGTGGTAACGGCG
CAGTGGCGGTTTTCAT
SEQ ID NO:22
Name: gi|43091|emb|CAA41325.1| integrase, partial (plasmid) [Escherichia coli]
Sequence:
MKTATAPLPPLRSVKVLDQLRERIRYLHYSLRTEQAYVHWVRAFIRFHGVRHPATLGSSEVEAFLSWLAN
ERKVSVSTHRQALAALLFFYGKVLCTDLPWLQEIGRPRPSRRLPVVLTPDEVVRILGFLEGEHRLFAQLL
YGTGM
SEQ ID NO:23
>gi|397912605:40372-41898 Thermoanaerobacterium phage THSA-485A, complete genome - recombinase
ATGAATCGTGTATGTATTTATCTTAGGAAGTCCCGAGCAGACGAAGAAATAGAAAAAGAGCTTGGACAAG
GAGAAACACTCGCAAAACATCGTAAGGCCCTTCTTAAATTTGCAAAAGAGAAAAATTTGAACATAGTAAA
AATCAGAGAGGAAATAGTATCAGGCGAAAGCCTTATCCATAGACCTGAAATGTTGGAATTACTAAAAGAA
GTCGAACAAGGCATGTACGATGCTGTATTATGTATGGATCTACAGCGTTTAGGGCGTGGCAACATGCAGG
AACAAGGTCTCATTTTAGAAGCCTTTAAAAAGTCAAACACTAAAATTATAACGCTTCAAAAAACTTATGA
TTTGAACAATGATTTTGACGAAGAATATAGCGAATTTGAAGCATTTATGAGCCGAAAGGAACTTAAAATG
ATAAATAGAAGGCTACAAGGTGGCAGAGTACGCTCTATTCAGGAAGGTAATTATTTATCACCATTGCCAC
CTTATGGTTACTTAATACACGAAGAAAAATTTTCGCGCACTCTTGTGCCTAATCCTGAGCAAGCTGATGT
AGTTAAAATGATTTTTGATATGTATGTCAATAAACAGATGGGGTCTAGTGCTATAGCGAACGAACTAAAC
AAAATGGGTTATAAGACGTATACTGGCAGGAATTGGGCTTCAAGCTCTGTAATAAACATACTCAAGAATC
CAGTTTACATCGGTAAAATAACGTGGAAGAAGAAGGATATAAAGAAGTCTGCTGACCCAAATAAAAGCAA
AGATACACGTCAAAGACCACGCTCTGAATGGATTGTATCAGATGGCAAACATGAACCAATAGTGGGCAAA
GAGCTCTTTGCCAAGGCTCAAGAAATCATTAAAAACAAGTATCACATACCGTATCAGATCGTTAATGGTC
CACGTAACCCATTGGCAGGGCTTATTATATGCAAAATATGTGGCTCTAAAATGGTGTATAGACCCTACAA
AGATAAAGAAGCGCATATAATATGTCCAAACAAGTGCGGCAATAAAAGCAGCAAATTTATCTATGTAGAA
AAAAGATTATTACAGGCTTTGGAGGAATGGATGCAAGGCTACGAGCTGGATCTGCAAATAGAAGAAGATG
ACAGCTCTTTTGCAGAAGCACAAGAGAAACAAAAAGAAGCTCTTGAAAGAGAATTGCACGAGCTGCAAAA
GCAAAAGAACAATTTACACGATTTGCTCGAGCGTGGCATATACGATATAGATACATTTGTGGAAAGATCT
ACAATTGTAGCACAGAGAATAGAAGAAACACAGAAAAGTATAGATGTGCTTGTGCAAAAAATAGAAGAAG
AAAAGAATAAAAGAGACAAAGAAAAAATACTTCCGGAAATTCGGCATGTGTTGGATCTATATTGGAAAAC
AGACGACATTGCACAAAAAAATATGTTGTTAAAGAGCGTACTTGAAAAAGCAGAATATCTAAAAGAAAAG
AAGCAGAGAGAAGACAACTTCGAACTTTGGATTTATCCAAAGCTGCCTGAAAAATAG
SEQ ID NO:24
>gi|397912662|ref|YP_006546326.1| recombinase [Thermoanaerobacterium phage THSA-485A]
MNRVCIYLRKSRADEEIEKELGQGETLAKHRKALLKFAKEKNLNIVKIREEIVSGESLIHRPEMLELLKE
VEQGMYDAVLCMDLQRLGRGNMQEQGLILEAFKKSNTKIITLQKTYDLNNDFDEEYSEFEAFMSRKELKM
INRRLQGGRVRSIQEGNYLSPLPPYGYLIHEEKFSRTLVPNPEQADVVKMIFDMYVNKQMGSSAIANELN
KMGYKTYTGRNWASSSVINILKNPVYIGKITWKKKDIKKSADPNKSKDTRQRPRSEWIVSDGKHEPIVGK
ELFAKAQEIIKNKYHIPYQIVNGPRNPLAGLIICKICGSKMVYRPYKDKEAHIICPNKCGNKSSKFIYVE
KRLLQALEEWMQGYELDLQIEEDDSSFAEAQEKQKEALERELHELQKQKNNLHDLLERGIYDIDTFVERS
TIVAQRIEETQKSIDVLVQKIEEEKNKRDKEKILPEIRHVLDLYWKTDDIAQKNMLLKSVLEKAEYLKEK
KQREDNFELWIYPKLPEK
SEQ ID NO:25
Gin recombinase
>gi|657193240|sp|Q38199.2|GIN_BPD10 RecName: Full=Serine recombinase gin; AltName: Full=G-segment invertase; Short=Gin
MLIGYVRVSTNDQNTDLQRNALVCAGCEQIFEDKLSGTRTDRPGLKRALKRLQKGDTLVVWKLDRLGRSM
KHLISLVGELRERGINFRSLTDSIDTSSPMGRFFFHVMGALAEMERELIIERTMAGLAAARNKGRIGGRP
PKLTKAEWEQAGRLLAQGIPRKQVALIYDVALSTLYKKHPAKRTHIENDDRINQIDR
SEQ ID NO:26
Cre recombinase
>gi|375331813|dbj|BAL61207.1| Cre recombinase [Cre-expression vector pHVX2-cre]
MVQTSLLTVHQNLPALPVDATSDEVRKNLMDMFRDRQAFSEHTWKMLLSVCRSWAAWCKLNNRKWFPAEP
EDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDAGERAKQALAFE
RTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISRTDGGRMLIHIGRTKTLVSTAG
VEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGVAAPSATSQLSTRALEGIFEATHRLIYGAKDD
SGQRYLAWSGHSARVGAARDMARAGVSIPEIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGD
SEQ ID NO:27-46
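These are exemplary sequences of polynucleotides encoding TALE repeat modules for linking to the integrases or recombinases described in the present invention.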
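The module names in this group (NI, NG, HD, NN) can be read back out of the single-module sequences listed as SEQ ID NO:27-30. The following is a minimal sketch, not part of the original listing: it translates each module with the standard genetic code and reports residues 12 and 13 of the 34-residue repeat, which is where the repeat-variable di-residue (RVD) is conventionally located. The helper names `translate` and `rvd` are hypothetical, and the RVD-to-base preferences in `RVD_TO_BASE` are commonly reported background values rather than anything stated in this listing.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def rvd(module_dna: str) -> str:
    """Return residues 12-13 (1-based) of the translated repeat module."""
    return translate(module_dna)[11:13]


# Assumption: commonly reported RVD base preferences (NN is also reported to bind A).
RVD_TO_BASE = {"NI": "A", "NG": "T", "HD": "C", "NN": "G"}

# Single-module sequences copied from SEQ ID NO:27-30.
MODULES = {
    "NI": "CTGACCCCAGAGCAGGTCGTGGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTGTGCCAAGCGCACGGA",
    "NG": "CTGACCCCAGAGCAGGTCGTGGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTGTGCCAAGCGCACGGC",
    "HD": "TTGACCCCAGAGCAGGTCGTGGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTGTGCCAAGCGCACGGG",
    "NN": "CTTACCCCAGAGCAGGTCGTGGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTGTGCCAAGCGCACGGG",
}

for name, dna in MODULES.items():
    found = rvd(dna)
    print(name, found, RVD_TO_BASE[found])
```

Each module translates to a 34-residue repeat, and the di-residue at positions 12-13 matches the module name. The two-module entries (SEQ ID NO:31-46) would need the same check applied to each 34-residue repeat in turn.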
SEQ ID NO:27
Name: NI
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTGTGCCAAGCGCACGGA
SEQ ID NO:28
Name: NG
Sequence:
CTGACCCCAGAGCAGGTCGTGGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTGTGCCAAGCGCACGGC
SEQ ID NO:29
Name: HD
Sequence:
TTGACCCCAGAGCAGGTCGTGGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTGTGCCAAGCGCACGGG
SEQ ID NO:30
Name: NN
Sequence:
CTTACCCCAGAGCAGGTCGTGGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTGTGCCAAGCGCACGGG
SEQ ID NO:31
Name: NI-NI
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:32
Name: NI-NG
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:33
Name: NI-HD
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:34
Name: NI-NN
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:35
Name: NG-NI
Sequence:
CTGACCCCAGAGCAGGTCGTGGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:36
Name: NG-NG
Sequence:
CTGACCCCAGAGCAGGTCGTGGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:37
Name: NG-HD
Sequence:
CAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:38
Name: NG-NN
Sequence:
CTGACCCCAGAGCAGGTCGTGGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:39
Name: HD-NI
Sequence:
CTGACCCCAGAGCAGGTCGTGGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:40
Name: HD-NG
Sequence:
GAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:41
Name: HD-HD
Sequence:
CTGACCCCAGAGCAGGTCGTGGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:42
Name: HD-NN
Sequence:
CTCACCCCAGAGCAGGTCGTGGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTGTGCCAAGCGCACGGA
SEQ ID NO:43
Name: NN-NI
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCCTCCAACATTGGCGGGAAACAGGCACTCGAGACTGTCCAGCGCCTGCTTCCCGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:44
Name: NN-NG
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCCATTGCCTCGAATGGAGGGGGCAAACAGGCGTTGGAAACCGTACAACGATTGCTGCCGGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:45
Name: NN-HD
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCGATCGCAAGCCACGACGGAGGAAAGCAAGCCTTGGAAACAGTACAGAGGCTGTTGCCTGTGCTGTGCCAAGCGCACGGT
SEQ ID NO:46
Name: NN-NN
Sequence:
CTGACCCCAGAGCAGGTCGTGGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCTCCTTCCAGTGCTTTGTCAGGCACACGGCCTCACTCCGGAACAAGTGGTCGCAATCGCGAGCAATAACGGCGGAAAACAGGCTTTGGAAACGGTGCAGAGGCT
SEQ ID NO:47
Name: gi|71796612|gb|DQ084353.1| Ovine lentivirus isolate Ov10 integrase (pol) gene, partial cds
Sequence:
CATAGTAAATGGCATCAAGATGCTATGTCATTGCAGTTAGATTTTGGGATACCGAAAGGTGCGGCAGAAG
ATATAGTACAACAATGTGAAGTATGTCAGGAAAATAAAATGCCTAGCACCATCAGAGGAAGTAACAAAAG
AGGGATAGATCATTGGCAGGTGGATTATACTCATTATAAAGACAAAATAATATTGGTATGGGTAGAAACA
AATTCGGGA
SEQ ID NO:48
Name: gi|71796613|gb|AAZ41325.1| integrase, partial [Ovine lentivirus]
Sequence:
HSKWHQDAMSLQLDFGIPKGAAEDIVQQCEVCQENKMPSTIRGSNKRGIDHWQVDYTHYKDKIILVWVET
NSG
SEQ ID NO:49
>gb|AYLT01000127.1|:11804-12046 Staphylococcus aureus subsp. aureus SK1585 contig000127, whole genome shotgun sequence
TTATAGATAGGTTAGTGACAAAATACATTTTTCGTCTAGATTAACCGTGCCTCTTAGATTATTAATATTT
TCGTTTAGATGTTTTTCAGAAACTTTAGCAACTTCATAATCGTTCATGTAAAGTGTTTGGTTTTTTATTG
TATAATTAAGTAATTCATAATCTTTGTATACTTCTTTTACTTTATCTATATCAACATTTTCAAGAACAAG
TTTTTTTATGTTATTATAATTAAAGTTTTCCAT
SEQ ID NO:50
>gi|669035130|gb|KFD30483.1| hypothetical protein D484_02234 [Staphylococcus aureus subsp. aureus SK1585] - Staphylococcus aureus cas9
MENFNYNNIKKLVLENVDIDKVKEVYKDYELLNYTIKNQTLYMNDYEVAKVSEKHLNENINNLRGTVNLD
EKCILSLTYL
SEQ ID NO:51
Name: DNA of linker 2
Sequence:
agcggcagcgaaaccccgggcaccagcgaaagcgcgaccccggaaagc
SEQ ID NO:52
Name: dCas9 protein
Sequence:
MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATR
LKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQ
TYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSN
FDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLS
ASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEK
MDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFR
IPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPK
HSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKI
ECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLK
TYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIE
MARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMY
VDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQ
LLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDK
LIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYG
DYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWD
KGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSP
TVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKY
SLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHK
HYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFD
TTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD
SEQ ID NO:53
Name: NLS nucleotide sequence with ATG
Sequence:
ATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGATTACAAGGATGAC
GATGACAAGATGGCCCCCAAGAAGAAGAGGAAGGTGGGCATTCACCGCGGGGTACCT
SEQ ID NO:54
Name: GGS linker nucleotide sequence
Sequence:
GGGGGAAGT
SEQ ID NO:55
Name: synthetic integrase
Sequence:
ATGTTCCTGGACGGTATCGACAAAGCTCAGGACGAGCACGAAAAGTACCATTCTAACTGGCGCGCCATGG
CCTCTGACTTCAATCTCCCGCCGGTTGTTGCCAAGGAGATCGTGGCTTCTTGCGACAAGTGCCAATTGAA
GGGTGAGGCTATGCATGGTCAGGTCGATTGCTCTCCCGGTATCTGGCAGCTGGACTGCACTCACCTCGAG
GGTAAGGTGATTCTCGTTGCTGTGCACGTGGCTTCCGGCTACATCGAGGCTGAGGTCATCCCGGCTGAGA
CCGGTCAAGAGACTGCTTACTTCCTGCTCAAGCTGGCCGGCCGTTGGCCAGTTAAGACTATTCACACTGA
TAACGGTTCTAACTTTACTTCCGCAACTGTGAAAGCTGCATGCTGGTGGGCCGGCATTAAACAAGAGTTC
GGAATTCCGTATAACCCGCAGTCTCAGGGCGTTGTCGAGTCTATGAACAAGGAGCTCAAAAAGATCATTG
GTCAAGTCCGTGACCAAGCTGAGCACCTTAAGACCGCTGTGCAGATGGCTGTTTTTATTCATAACTTCAA
GCGTAAGGGTGGTATCGGTGGTTATAGCGCTGGTGAGCGTATCGTAGACATCATCGCTACTGATATCCAG
ACAAAGGAGCTGCAGAAGCAGATCACTAAGATCCAGAACTTCCGTGTGTACTATCGGGACTCTAGGAACC
CGCTCTGGAAGGGTCCTGCTAAACTGCTGTGGAAGGGAGAGGGTGCTGTTGTTATCCAGGACAACTCTGA
TATCAAGGTGGTTCCGCGTCGTAAGGCTAAAATTATCCGCGACTACGGCAAGCAAATGGCTGGAGACGAC
TGCGTTGCTAGCCGTCAAGACGAAGACTAA
SEQ ID NO:56
Name: dCas9 nucleotide sequence with ATG
Sequence:
ATGGATAAAAAGTATTCTATTGGTTTAGCTATCGGCACTAATTCCGTTGGATGGGCTGTCA
TAACCGATGAATACAAAGTACCTTCAAAGAAATTTAAGGTGTTGGGGAACACAGACCGTC
ATTCGATTAAAAAGAATCTTATCGGTGCCCTCCTATTCGATAGTGGCGAAACGGCAGAGG
CGACTCGCCTGAAACGAACCGCTCGGAGAAGGTATACACGTCGCAAGAACCGAATATGTT
ACTTACAAGAAATTTTTAGCAATGAGATGGCCAAAGTTGACGATTCTTTCTTTCACCGTTT
GGAAGAGTCCTTCCTTGTCGAAGAGGACAAGAAACATGAACGGCACCCCATCTTTGGAAA
CATAGTAGATGAGGTGGCATATCATGAAAAGTACCCAACGATTTATCACCTCAGAAAAAA
GCTAGTTGACTCAACTGATAAAGCGGACCTGAGGTTAATCTACTTGGCTCTTGCCCATATG
ATAAAGTTCCGTGGGCACTTTCTCATTGAGGGTGATCTAAATCCGGACAACTCGGATGTC
GACAAACTGTTCATCCAGTTAGTACAAACCTATAATCAGTTGTTTGAAGAGAACCCTATA
AATGCAAGTGGCGTGGATGCGAAGGCTATTCTTAGCGCCCGCCTCTCTAAATCCCGACGG
CTAGAAAACCTGATCGCACAATTACCCGGAGAGAAGAAAAATGGGTTGTTCGGTAACCTT
ATAGCGCTCTCACTAGGCCTGACACCAAATTTTAAGTCGAACTTCGACTTAGCTGAAGAT
GCCAAATTGCAGCTTAGTAAGGACACGTACGATGACGATCTCGACAATCTACTGGCACAA
ATTGGAGATCAGTATGCGGACTTATTTTTGGCTGCCAAAAACCTTAGCGATGCAATCCTCC
TATCTGACATACTGAGAGTTAATACTGAGATTACCAAGGCGCCGTTATCCGCTTCAATGAT
CAAAAGGTACGATGAACATCACCAAGACTTGACACTTCTCAAGGCCCTAGTCCGTCAGCA
ACTGCCTGAGAAATATAAGGAAATATTCTTTGATCAGTCGAAAAACGGGTACGCAGGTTA
TATTGACGGCGGAGCGAGTCAAGAGGAATTCTACAAGTTTATCAAACCCATATTAGAGAA
GATGGATGGGACGGAAGAGTTGCTTGTAAAACTCAATCGCGAAGATCTACTGCGAAAGC
AGCGGACTTTCGACAACGGTAGCATTCCACATCAAATCCACTTAGGCGAATTGCATGCTA
TACTTAGAAGGCAGGAGGATTTTTATCCGTTCCTCAAAGACAATCGTGAAAAGATTGAGA
AAATCCTAACCTTTCGCATACCTTACTATGTGGGACCCCTGGCCCGAGGGAACTCTCGGTT
CGCATGGATGACAAGAAAGTCCGAAGAAACGATTACTCCATGGAATTTTGAGGAAGTTGT
CGATAAAGGTGCGTCAGCTCAATCGTTCATCGAGAGGATGACCAACTTTGACAAGAATTT
ACCGAACGAAAAAGTATTGCCTAAGCACAGTTTACTTTACGAGTATTTCACAGTGTACAA
TGAACTCACGAAAGTTAAGTATGTCACTGAGGGCATGCGTAAACCCGCCTTTCTAAGCGG
AGAACAGAAGAAAGCAATAGTAGATCTGTTATTCAAGACCAACCGCAAAGTGACAGTTA
AGCAATTGAAAGAGGACTACTTTAAGAAAATTGAATGCTTCGATTCTGTCGAGATCTCCG
GGGTAGAAGATCGATTTAATGCGTCACTTGGTACGTATCATGACCTCCTAAAGATAATTA
AAGATAAGGACTTCCTGGATAACGAAGAGAATGAAGATATCTTAGAAGATATAGTGTTGA
CTCTTACCCTCTTTGAAGATCGGGAAATGATTGAGGAAAGACTAAAAACATACGCTCACC
TGTTCGACGATAAGGTTATGAAACAGTTAAAGAGGCGTCGCTATACGGGCTGGGGACGAT
TGTCGCGGAAACTTATCAACGGGATAAGAGACAAGCAAAGTGGTAAAACTATTCTCGATT
TTCTAAAGAGCGACGGCTTCGCCAATAGGAACTTTATGCAGCTGATCCATGATGACTCTTT
AACCTTCAAAGAGGATATACAAAAGGCACAGGTTTCCGGACAAGGGGACTCATTGCACG
AACATATTGCGAATCTTGCTGGTTCGCCAGCCATCAAAAAGGGCATACTCCAGACAGTCA
AAGTAGTGGATGAGCTAGTTAAGGTCATGGGACGTCACAAACCGGAAAACATTGTAATCG
AGATGGCACGCGAAAATCAAACGACTCAGAAGGGGCAAAAAAACAGTCGAGAGCGGAT
GAAGAGAATAGAAGAGGGTATTAAAGAACTGGGCAGCCAGATCTTAAAGGAGCATCCTG
TGGAAAATACCCAATTGCAGAACGAGAAACTTTACCTCTATTACCTACAAAATGGAAGGG
ACATGTATGTTGATCAGGAACTGGACATAAACCGTTTATCTGATTACGACGTCGATGCCAT
TGTACCCCAATCCTTTTTGAAGGACGATTCAATCGACAATAAAGTGCTTACACGCTCGGAT
AAGAACCGAGGGAAAAGTGACAATGTTCCAAGCGAGGAAGTCGTAAAGAAAATGAAGA
ACTATTGGCGGCAGCTCCTAAATGCGAAACTGATAACGCAAAGAAAGTTCGATAACTTAA
CTAAAGCTGAGAGGGGTGGCTTGTCTGAACTTGACAAGGCCGGATTTATTAAACGTCAGC
TCGTGGAAACCCGCCAAATCACAAAGCATGTTGCACAGATACTAGATTCCCGAATGAATA
CGAAATACGACGAGAACGATAAGCTGATTCGGGAAGTCAAAGTAATCACTTTAAAGTCA
AAATTGGTGTCGGACTTCAGAAAGGATTTTCAATTCTATAAAGTTAGGGAGATAAATAAC
TACCACCATGCGCACGACGCTTATCTTAATGCCGTCGTAGGGACCGCACTCATTAAGAAA
TACCCGAAGCTAGAAAGTGAGTTTGTGTATGGTGATTACAAAGTTTATGACGTCCGTAAG
ATGATCGCGAAAAGCGAACAGGAGATAGGCAAGGCTACAGCCAAATACTTCTTTTATTCT
AACATTATGAATTTCTTTAAGACGGAAATCACTCTGGCAAACGGAGAGATACGCAAACGA
CCTTTAATTGAAACCAATGGGGAGACAGGTGAAATCGTATGGGATAAGGGCCGGGACTTC
GCGACGGTGAGAAAAGTTTTGTCCATGCCCCAAGTCAACATAGTAAAGAAAACTGAGGTG
CAGACCGGAGGGTTTTCAAAGGAATCGATTCTTCCAAAAAGGAATAGTGATAAGCTCATC
GCTCGTAAAAAGGACTGGGACCCGAAAAAGTACGGTGGCTTCGATAGCCCTACAGTTGCC
TATTCTGTCCTAGTAGTGGCAAAAGTTGAGAAGGGAAAATCCAAGAAACTGAAGTCAGTC
AAAGAATTATTGGGGATAACGATTATGGAGCGCTCGTCTTTTGAAAAGAACCCCATCGAC
TTCCTTGAGGCGAAAGGTTACAAGGAAGTAAAAAAGGATCTCATAATTAAACTACCAAAG
TATAGTCTGTTTGAGTTAGAAAATGGCCGAAAACGGATGTTGGCTAGCGCCGGAGAGCTT
CAAAAGGGGAACGAACTCGCACTACCGTCTAAATACGTGAATTTCCTGTATTTAGCGTCC
CATTACGAGAAGTTGAAAGGTTCACCTGAAGATAACGAACAGAAGCAACTTTTTGTTGAG
CAGCACAAACATTATCTCGACGAAATCATAGAGCAAATTTCGGAATTCAGTAAGAGAGTC
ATCCTAGCTGATGCCAATCTGGACAAAGTATTAAGCGCATACAACAAGCACAGGGATAAA
CCCATACGTGAGCAGGCGGAAAATATTATCCATTTGTTTACTCTTACCAACCTCGGCGCTC
CAGCCGCATTCAAGTATTTTGACACAACGATAGATCGCAAACGATACACTTCTACCAAGG
AGGTGCTAGACGCGACACTGATTCACCAATCCATCACGGGATTATATGAAACTCGGATAGATTTGTCACAGCTTGGGGGTGACTAA
SEQ ID NO:57
Name: ABBIE1 (NLS-linker 1-integrase-linker 2-dCas9) - DNA sequence
Sequence:
ATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGATTACAAGGATGAC
GATGACAAGATGGCCCCCAAGAAGAAGAGGAAGGTGGGCATTCACCGCGGGGTACCT
GGGGGAAGTATGTTCCTGGACGGTATCGACAAAGCTCAGGACGAGCACGAAAAGTACCATTCTAACTGGCGCGCCATGGCCTCTGACTTCAATCTCCCGCCGGTTGTTGCCAAGGAGATCGTGGCTTCTTGCGACAAGTGCCAATTGAA
GGGTGAGGCTATGCATGGTCAGGTCGATTGCTCTCCCGGTATCTGGCAGCTGGACTGCACTCACCTCGAG
GGTAAGGTGATTCTCGTTGCTGTGCACGTGGCTTCCGGCTACATCGAGGCTGAGGTCATCCCGGCTGAGA
CCGGTCAAGAGACTGCTTACTTCCTGCTCAAGCTGGCCGGCCGTTGGCCAGTTAAGACTATTCACACTGA
TAACGGTTCTAACTTTACTTCCGCAACTGTGAAAGCTGCATGCTGGTGGGCCGGCATTAAACAAGAGTTC
GGAATTCCGTATAACCCGCAGTCTCAGGGCGTTGTCGAGTCTATGAACAAGGAGCTCAAAAAGATCATTG
GTCAAGTCCGTGACCAAGCTGAGCACCTTAAGACCGCTGTGCAGATGGCTGTTTTTATTCATAACTTCAA
GCGTAAGGGTGGTATCGGTGGTTATAGCGCTGGTGAGCGTATCGTAGACATCATCGCTACTGATATCCAG
ACAAAGGAGCTGCAGAAGCAGATCACTAAGATCCAGAACTTCCGTGTGTACTATCGGGACTCTAGGAACC
CGCTCTGGAAGGGTCCTGCTAAACTGCTGTGGAAGGGAGAGGGTGCTGTTGTTATCCAGGACAACTCTGA
TATCAAGGTGGTTCCGCGTCGTAAGGCTAAAATTATCCGCGACTACGGCAAGCAAATGGCTGGAGACGAC
TGCGTTGCTAGCCGTCAAGACGAAGACagcggcagcgaaaccccgggcaccagcgaaagcgcgaccccggaaagc
ATGGATAAAAAGTATTCTATTGGTTTAGCTATCGGCACTAATTCCGTTGGATGGGCTGTCA
TAACCGATGAATACAAAGTACCTTCAAAGAAATTTAAGGTGTTGGGGAACACAGACCGTC
ATTCGATTAAAAAGAATCTTATCGGTGCCCTCCTATTCGATAGTGGCGAAACGGCAGAGG
CGACTCGCCTGAAACGAACCGCTCGGAGAAGGTATACACGTCGCAAGAACCGAATATGTT
ACTTACAAGAAATTTTTAGCAATGAGATGGCCAAAGTTGACGATTCTTTCTTTCACCGTTT
GGAAGAGTCCTTCCTTGTCGAAGAGGACAAGAAACATGAACGGCACCCCATCTTTGGAAA
CATAGTAGATGAGGTGGCATATCATGAAAAGTACCCAACGATTTATCACCTCAGAAAAAA
GCTAGTTGACTCAACTGATAAAGCGGACCTGAGGTTAATCTACTTGGCTCTTGCCCATATG
ATAAAGTTCCGTGGGCACTTTCTCATTGAGGGTGATCTAAATCCGGACAACTCGGATGTC
GACAAACTGTTCATCCAGTTAGTACAAACCTATAATCAGTTGTTTGAAGAGAACCCTATA
AATGCAAGTGGCGTGGATGCGAAGGCTATTCTTAGCGCCCGCCTCTCTAAATCCCGACGG
CTAGAAAACCTGATCGCACAATTACCCGGAGAGAAGAAAAATGGGTTGTTCGGTAACCTT
ATAGCGCTCTCACTAGGCCTGACACCAAATTTTAAGTCGAACTTCGACTTAGCTGAAGAT
GCCAAATTGCAGCTTAGTAAGGACACGTACGATGACGATCTCGACAATCTACTGGCACAA
ATTGGAGATCAGTATGCGGACTTATTTTTGGCTGCCAAAAACCTTAGCGATGCAATCCTCC
TATCTGACATACTGAGAGTTAATACTGAGATTACCAAGGCGCCGTTATCCGCTTCAATGAT
CAAAAGGTACGATGAACATCACCAAGACTTGACACTTCTCAAGGCCCTAGTCCGTCAGCA
ACTGCCTGAGAAATATAAGGAAATATTCTTTGATCAGTCGAAAAACGGGTACGCAGGTTA
TATTGACGGCGGAGCGAGTCAAGAGGAATTCTACAAGTTTATCAAACCCATATTAGAGAA
GATGGATGGGACGGAAGAGTTGCTTGTAAAACTCAATCGCGAAGATCTACTGCGAAAGC
AGCGGACTTTCGACAACGGTAGCATTCCACATCAAATCCACTTAGGCGAATTGCATGCTA
TACTTAGAAGGCAGGAGGATTTTTATCCGTTCCTCAAAGACAATCGTGAAAAGATTGAGA
AAATCCTAACCTTTCGCATACCTTACTATGTGGGACCCCTGGCCCGAGGGAACTCTCGGTT
CGCATGGATGACAAGAAAGTCCGAAGAAACGATTACTCCATGGAATTTTGAGGAAGTTGT
CGATAAAGGTGCGTCAGCTCAATCGTTCATCGAGAGGATGACCAACTTTGACAAGAATTT
ACCGAACGAAAAAGTATTGCCTAAGCACAGTTTACTTTACGAGTATTTCACAGTGTACAA
TGAACTCACGAAAGTTAAGTATGTCACTGAGGGCATGCGTAAACCCGCCTTTCTAAGCGG
AGAACAGAAGAAAGCAATAGTAGATCTGTTATTCAAGACCAACCGCAAAGTGACAGTTA
AGCAATTGAAAGAGGACTACTTTAAGAAAATTGAATGCTTCGATTCTGTCGAGATCTCCG
GGGTAGAAGATCGATTTAATGCGTCACTTGGTACGTATCATGACCTCCTAAAGATAATTA
AAGATAAGGACTTCCTGGATAACGAAGAGAATGAAGATATCTTAGAAGATATAGTGTTGA
CTCTTACCCTCTTTGAAGATCGGGAAATGATTGAGGAAAGACTAAAAACATACGCTCACC
TGTTCGACGATAAGGTTATGAAACAGTTAAAGAGGCGTCGCTATACGGGCTGGGGACGAT
TGTCGCGGAAACTTATCAACGGGATAAGAGACAAGCAAAGTGGTAAAACTATTCTCGATT
TTCTAAAGAGCGACGGCTTCGCCAATAGGAACTTTATGCAGCTGATCCATGATGACTCTTT
AACCTTCAAAGAGGATATACAAAAGGCACAGGTTTCCGGACAAGGGGACTCATTGCACG
AACATATTGCGAATCTTGCTGGTTCGCCAGCCATCAAAAAGGGCATACTCCAGACAGTCA
AAGTAGTGGATGAGCTAGTTAAGGTCATGGGACGTCACAAACCGGAAAACATTGTAATCG
AGATGGCACGCGAAAATCAAACGACTCAGAAGGGGCAAAAAAACAGTCGAGAGCGGAT
GAAGAGAATAGAAGAGGGTATTAAAGAACTGGGCAGCCAGATCTTAAAGGAGCATCCTG
TGGAAAATACCCAATTGCAGAACGAGAAACTTTACCTCTATTACCTACAAAATGGAAGGG
ACATGTATGTTGATCAGGAACTGGACATAAACCGTTTATCTGATTACGACGTCGATGCCAT
TGTACCCCAATCCTTTTTGAAGGACGATTCAATCGACAATAAAGTGCTTACACGCTCGGAT
AAGAACCGAGGGAAAAGTGACAATGTTCCAAGCGAGGAAGTCGTAAAGAAAATGAAGA
ACTATTGGCGGCAGCTCCTAAATGCGAAACTGATAACGCAAAGAAAGTTCGATAACTTAA
CTAAAGCTGAGAGGGGTGGCTTGTCTGAACTTGACAAGGCCGGATTTATTAAACGTCAGC
TCGTGGAAACCCGCCAAATCACAAAGCATGTTGCACAGATACTAGATTCCCGAATGAATA
CGAAATACGACGAGAACGATAAGCTGATTCGGGAAGTCAAAGTAATCACTTTAAAGTCA
AAATTGGTGTCGGACTTCAGAAAGGATTTTCAATTCTATAAAGTTAGGGAGATAAATAAC
TACCACCATGCGCACGACGCTTATCTTAATGCCGTCGTAGGGACCGCACTCATTAAGAAA
TACCCGAAGCTAGAAAGTGAGTTTGTGTATGGTGATTACAAAGTTTATGACGTCCGTAAG
ATGATCGCGAAAAGCGAACAGGAGATAGGCAAGGCTACAGCCAAATACTTCTTTTATTCT
AACATTATGAATTTCTTTAAGACGGAAATCACTCTGGCAAACGGAGAGATACGCAAACGA
CCTTTAATTGAAACCAATGGGGAGACAGGTGAAATCGTATGGGATAAGGGCCGGGACTTC
GCGACGGTGAGAAAAGTTTTGTCCATGCCCCAAGTCAACATAGTAAAGAAAACTGAGGTG
CAGACCGGAGGGTTTTCAAAGGAATCGATTCTTCCAAAAAGGAATAGTGATAAGCTCATC
GCTCGTAAAAAGGACTGGGACCCGAAAAAGTACGGTGGCTTCGATAGCCCTACAGTTGCC
TATTCTGTCCTAGTAGTGGCAAAAGTTGAGAAGGGAAAATCCAAGAAACTGAAGTCAGTC
AAAGAATTATTGGGGATAACGATTATGGAGCGCTCGTCTTTTGAAAAGAACCCCATCGAC
TTCCTTGAGGCGAAAGGTTACAAGGAAGTAAAAAAGGATCTCATAATTAAACTACCAAAG
TATAGTCTGTTTGAGTTAGAAAATGGCCGAAAACGGATGTTGGCTAGCGCCGGAGAGCTT
CAAAAGGGGAACGAACTCGCACTACCGTCTAAATACGTGAATTTCCTGTATTTAGCGTCC
CATTACGAGAAGTTGAAAGGTTCACCTGAAGATAACGAACAGAAGCAACTTTTTGTTGAG
CAGCACAAACATTATCTCGACGAAATCATAGAGCAAATTTCGGAATTCAGTAAGAGAGTC
ATCCTAGCTGATGCCAATCTGGACAAAGTATTAAGCGCATACAACAAGCACAGGGATAAA
CCCATACGTGAGCAGGCGGAAAATATTATCCATTTGTTTACTCTTACCAACCTCGGCGCTC
CAGCCGCATTCAAGTATTTTGACACAACGATAGATCGCAAACGATACACTTCTACCAAGG
AGGTGCTAGACGCGACACTGATTCACCAATCCATCACGGGATTATATGAAACTCGGATAGATTTGTCACAGCTTGGGGGTGACTAA
SEQ ID NO:58
Name: Translation of ABBIE1 (binding-based integrase editor)
Sequence:
Met D Y K D H D G D Y K D H D I D Y K D D D D K Met A P K K K R K V GI H R G V P G G S Met F L D G I D K A Q D E H E K Y H S N W R A Met A S D F NL P P V V A K E I V A S C D K C Q L K G E A Met H G Q V D C S P G I W Q L D CT H L E G K V I L V A V H V A S G Y I E A E V I P A E T G Q E T A Y F L L K LA G R W P V K T I H T D N G S N F T S A T V K A A C W W A G I K Q E F G I P YN P Q S Q G V V E S Met N K E L K K I I G Q V R D Q A E H L K T A V Q Met A VF I H N F K R K G G I G G Y S A G E R I V D I I A T D I Q T K E L Q K Q I T KI Q N F R V Y Y R D S R N P L W K G P A K L L W K G E G A V V I Q D N S D I KV V P R R K A K I I R D Y G K Q Met A G D D C V A S R Q D E D S G S E T P G TS E S A T P E S Met D K K Y S I G L A I G T N S V G W A V I T D E Y K V P S KK F K V L G N T D R H S I K K N L I G A L L F D S G E T A E A T R L K R T A RR R Y T R R K N R I C Y L Q E I F S N E Met A K V D D S F F H R L E E S F L VE E D K K H E R H P I F G N I V D E V A Y H E K Y P T I Y H L R K K L V D S TD K A D L R L I Y L A L A H Met I K F R G H F L I E G D L N P D N S D V D K LF I Q L V Q T Y N Q L F E E N P I N A S G V D A K A I L S A R L S K S R R L EN L I A Q L P G E K K N G L F G N L I A L S L G L T P N F K S N F D L A E D AK L Q L S K D T Y D D D L D N L L A Q I G D Q Y A D L F L A A K N L S D A I LL S D I L R V N T E I T K A P L S A S Met I K R Y D E H H Q D L T L L K A L VR Q Q L P E K Y K E I F F D Q S K N G Y A G Y I D G G A S Q E E F Y K F I K PI L E K Met D G T E E L L V K L N R E D L L R K Q R T F D N G S I P H Q I H LG E L H A I L R R Q E D F Y P F L K D N R E K I E K I L T F R I P Y Y V G P LA R G N S R F A W Met T R K S E E T I T P W N F E E V V D K G A S A Q S F I ERMet T N F D K N L P N E K V L P K H S L L Y E Y F T V Y N E L T K V K Y V TE G Met R K P A F L S G E Q K K A I V D L L F K T N R K V T V K Q L K E D Y FK K I E C F D S V E I S G V E D R F N A S L G T Y H D L L K I I K D K D F L DN E E N E D I L E D I V L T L T L F E D R E Met I E E R L K T Y A H L F D D KV Met 
K Q L K R R R Y T G W G R L S R K L I N G I R D K Q S G K T I L D F L KS D G F A N R N F Met Q L I H D D S L T F K E D I Q K A Q V S G Q G D S L H EH I A N L A G S P A I K K G I L Q T V K V V D E L V K V Met G R H K P E N I VI E Met A R E N Q T T Q K G Q K N S R E R Met K R I E E G I K E L G S Q I L KE H P V E N T Q L Q N E K L Y L Y Y L Q N G R D Met Y V D Q E L D I N R L S DY D V D A I V P Q S F L K D D S I D N K V L T R S D K N R G K S D N V P S E EV V K K Met K N Y W R Q L L N A K L I T Q R K F D N L T K A E R G G L S E L DK A G F I K R Q L V E T R Q I T K H V A Q I L D S R Met N T K Y D E N D K L IR E V K V I T L K S K L V S D F R K D F Q F Y K V R E I N N Y H H A H D A Y LN A V V G T A L IK K Y P K L E S E F V Y G D Y K V Y D V R K Met I A K S E QE I G K A T A K Y F F Y S N I Met N F F K T E I T L A N G E I R K R P L I E TN G E T G E I V W D K G R D F A T V R K V L S Met P Q V N I V K K T E V Q T GG F S K E S I L P K R N S D K L I A R K K D W D P K K Y G G F D S P T V A Y SV L V V A K V E K G K S K K L K S V K E L L G I T I Met E R S S F E K N P I DF L E A K G Y K E V K K D L I I K L P K Y S L F E L E N G R K R Met L A S A GE L Q K G N E L A L P S K Y V N F L Y L A S H Y E K L K G S P E D N E Q K Q LF V E Q H K H Y L D E I I E Q I S E F S K R V I L A D A N L D K V L S A Y N KH R D K P I R E Q A E N I I H L F T L T N L G A P A A F K Y F D T T I D R K RY T S T K E V L D A T L I H Q S I T G L Y E T R I D L S Q L G G D Stop
For the donor DNA (att sites of the LTR regions, for integrase recognition):
SEQ ID NO:59
Name: U3att
Sequence:
ACTGGAAGGGCTAATTCACTCCCAAAGAA
SEQ ID NO:60
Name: U5att
Sequence:
GACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGT
NLS-linker 1-integrase-linker 2-dCas9, or integrase-linker 1-NLS-linker 2-dCas9, or integrase-linker 2-dCas9-linker 1-NLS, or integrase-linker 2-dCas9-NLS
Linker 1 = GGS
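The four fusion arrangements above can be written as ordered part lists and joined. The sketch below is illustrative only and not part of the original listing. The NLS peptide is the translation of SEQ ID NO:53, linker 1 is GGS, and linker 2 is SEQ ID NO:61, while the `[integrase]` and `[dCas9]` strings are placeholders standing in for the full protein sequences, not real sequence data. The names `PARTS`, `LAYOUTS`, and `assemble` are hypothetical.

```python
# Protein-level parts. NLS and linkers come from the listing; the integrase
# and dCas9 entries are placeholders, NOT real sequences.
PARTS = {
    "NLS": "MDYKDHDGDYKDHDIDYKDDDDKMAPKKKRKVGIHRGVP",  # translation of SEQ ID NO:53
    "linker1": "GGS",
    "linker2": "SGSETPGTSESATPES",  # SEQ ID NO:61
    "integrase": "[integrase]",
    "dCas9": "[dCas9]",
}

# The four arrangements named in the text, N-terminus to C-terminus.
LAYOUTS = [
    ["NLS", "linker1", "integrase", "linker2", "dCas9"],
    ["integrase", "linker1", "NLS", "linker2", "dCas9"],
    ["integrase", "linker2", "dCas9", "linker1", "NLS"],
    ["integrase", "linker2", "dCas9", "NLS"],
]


def assemble(layout, parts=PARTS):
    """Concatenate the named parts in the given order."""
    return "".join(parts[name] for name in layout)


for layout in LAYOUTS:
    print("-".join(layout))
    print("  ", assemble(layout))
```

The first arrangement corresponds to the ABBIE1 construct (SEQ ID NO:57 and 58). Swapping the placeholders for the full integrase and dCas9 sequences would give the complete fusion protein for any of the four orders.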
SEQ ID NO:61
Name: linker 2
Sequence:
SGSETPGTSESATPES
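The peptide linkers can be cross-checked against their nucleotide entries elsewhere in the listing. The sketch below is not part of the original document. It translates the linker 2 DNA (SEQ ID NO:51), the GGS linker DNA (SEQ ID NO:54), and the NLS DNA (SEQ ID NO:53) with the standard genetic code. The `translate` helper is a hypothetical name introduced for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Sequences copied from the listing.
LINKER2_DNA = "agcggcagcgaaaccccgggcaccagcgaaagcgcgaccccggaaagc"  # SEQ ID NO:51
LINKER1_DNA = "GGGGGAAGT"  # SEQ ID NO:54
NLS_DNA = (
    "ATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGATTACAAGGATGAC"
    "GATGACAAGATGGCCCCCAAGAAGAAGAGGAAGGTGGGCATTCACCGCGGGGTACCT"
)  # SEQ ID NO:53

print(translate(LINKER2_DNA))            # SGSETPGTSESATPES
print(translate(LINKER1_DNA))            # GGS
print(translate(NLS_DNA + LINKER1_DNA))  # N-terminal residues of the ABBIE1 translation
```

The linker 2 translation agrees with SEQ ID NO:61, and the NLS plus GGS translation agrees with the first residues of the ABBIE1 protein in SEQ ID NO:58 once its three-letter `Met` entries are read as `M`.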
SEQ ID NO:62
Name: MMTV integrase cDNA, gb|AF071010.1|:16-1113 Mouse mammary tumor virus putative integrase, env polyprotein, and superantigen mRNA, complete cds
Sequence:
ATGACAGGAAAGTGGCCTTGTATTTACTCCACTAACTGCAGAGATGTGTTGCATGGGACGGGGGGCACTG
CACCAGCCCTCGTGCTGAATTCGGCACGAGGAAATGCCTATGCAGATTCTTTAACAAGAATTCTGACCGC
TTTAGAGTCAGCTCAAGAAAGCCACGCACTGCACCATCAAAATGCCGCGGCGCTTAGGTTTCAGTTTCAC
ATCACTCGTGAACAAGCACGAGAAATAGTAAAATTATGTCCAAATTGCCCCGACTGGGGACATGCACCAC
AACTAGGAGTAAACCCTAGGGGCCTTAAGCCCGGGGTTCTATGGCAAATGGATGTTACTCATGTCTCAGA
ATTTGGAAAATTAAAGTATGTACATGTGACAGTGGATACTTACTCTCATTTTACTTTCGCTACCGCCCGG
ACGGGCGAAGCAGCCAAAGATGTGTTACAACACTTGGCTCAAAGCTTTGCATACATGGGCATTCCTCAAA
AAATAAAAACAGATAATGCCCCTGCCTATGTGTCTCGTTCAATACAAGAATTTCTGGCCAGATGGAAAAT
ATCTCACGTCACGGGGATCCCTTACAATCCCCAAGGACAGGCCATTGTTGAACGAACGCACCAAAATATA
AAGGCACAGATTAATAAACTTCAAAAGGCTGGAAAATACTATACACCCCACCATCTATTGGCACATGCTC
TTTTTGTGCTGAATCATGTAAATATGGACAATCAAGGCCATACAGCGGCCGAAAGACATTGGGGTCCAAT
CTCAGCCGATCCAAAACCTATGGTCATGTGGAAAGACCTTCTCACAGGGTCCTGGAAAGGACCCGATGTC
CTAATAACAGCCGGACGAGGCTATGCTTGTGTTTTTCCACAGGATGCCGAATCACCAATCTGGGTCCCCG
ACCGGTTCATCCGACCTTTTACTGAGCGGAAAGAAGCAACGCCCACACCTGGCACTGCGGAGAAAACGCC
GCCGCGAGATGAGAAAGATCAACAGGAAAGTCCGGAGGATGAATCTTGCCCCCATCAAAGAGAAGACGGC
TTGGCAACATCTGCAGGCGTTAATCTCCGAAGCGGAGGAGGTTCTTAA
SEQ ID NO:63
Name: gi|3273866|gb|AAC24859.1| putative integrase [Mouse mammary tumor virus]
Sequence:
MTGKWPCIYSTNCRDVLHGTGGTAPALVLNSARGNAYADSLTRILTALESAQESHALHHQNAAALRFQFH
ITREQAREIVKLCPNCPDWGHAPQLGVNPRGLKPGVLWQMDVTHVSEFGKLKYVHVTVDTYSHFTFATAR
TGEAAKDVLQHLAQSFAYMGIPQKIKTDNAPAYVSRSIQEFLARWKISHVTGIPYNPQGQAIVERTHQNI
KAQINKLQKAGKYYTPHHLLAHALFVLNHVNMDNQGHTAAERHWGPISADPKPMVMWKDLLTGSWKGPDV
LITAGRGYACVFPQDAESPIWVPDRFIRPFTERKEATPTPGTAEKTPPRDEKDQQESPEDESCPHQREDG
LATSAGVNLRSGGGS
SEQ ID NO:64
Name: gb|AXUN02000059.1|:5116-8850 Youngiibacter fragilis 232.1 contig_151, whole genome shotgun sequence - recombinase
Sequence:
TTGAAAGATAACGATAAAAGGATGTGGGTTCAGACTTTATGGAATCCCATCAATGAAAGACATAAAAGTC
CACTGGATAGCCCAGAACCAGGGATTAAAGTAGCGGCCTACTGCAGAGTAAGCATGAAAGAGGAGGAACA
ACTCCGGTCATTGGAAAACCAGGTGCATCACTATACTCATTTTATCAAAAGTAAGCCGAATTGGAGATTT
GTAGGGGTTTATTACGATGATGGCATAAGTGCAGCCATGGCAAGTGGGAGAAGAGGGTTCCAGCGGATTA
TCCGTCATGCTGAAGAAGGTAAGGTTGATCTGATTCTAACAAAGAATATTTCACGGTTTTCCAGAAATTC
CAAGGAGTTACTGGATATAATCAATCAACTGAAAGCTATCGGTGTGGGCATCTATTTTGAGAAAGAGAAT
ATTGATACTTCAAGAGAGTACAATAAATTCCTCTTAAGCACTTATGCTGCGCTGGCACAGGAAGAGATAG
AAACTATTTCAAACTCTACGATGTGGGGTTATGAGAAAAGGTTTCTAAAGGGTATCCCAAAGTTCAACCG
CTTATATGGATACAAAGTCATCCATGCAGGGGATGATTCCCAATTGATTGTTCTTGAAGATGAAGCAAAA
ATCGTAAGAATGATGTATGAACAGTACCTTCAAGGGAAGACGTTCACTGATATTGCAAGGGCGCTAACAG
AAGCTGGAGTGAAAACAGCCAAAGGGAAGGATGTCTGGATAGGCGGCATGATAAAGCATATTTTATCCAA
CGTCACCTACACCGGTAACAAGCTTACACGAGAACTGAAAAGAGATTTATTTACGAACAAAGTTAATAGC
GGTGAACGGGATCAGGTTTTTATAGGAAACACTCACGAACCGATCATCAGCAATGATATTTTCAATCTTG
TTCAAAAGAAGCTTGAGGCCAATACGAAGGAAAGAAAGCCCAGTGAGAAGCGAGAGAAGAACCACATGTC
TGGTCGGCTACTTTGCGGAAGATGTGGATACAGTTTTACCATAATTCACAATAGAGCTTCTCATCACTTT
AAGTGTAGCCCTAAAATCATGGGGGTCTGTGATTCTGAACTTTATCGGGATGCGGATATTCGAGAAATGA
TGATGAGGGCAATGTATATAAAATATGACTTCACCGATGAAGACATAGTACTAAAACTGCTGAAGGAACT
CCAGGTCATCAATCAAAATGATCACTTTGAGTTTCATAGGCTAAAGTTTATCACTGAAATTGAAATCGTA
AAAAGGCAGCAGGCCATTTCAGATAGATATTCAGCTATTAGCATAGAAAAAATGGAAGAAGAATACCGCA
CTTTTGAAAGCAAGATTGCGAAAATTGAGGATGACAGGTACATCAGAATCGATGCAGTGGAGTGGTTAAA
GAAAAACAAGACGCTGGATTCTTTTATCGCTCAGGTCACCACTAAAATATTGCGAGCTTGGGTTTCCGAG
ATGACTGTTTATACACGAGATGACTTTTTAGTGCAGTGGATTGACGGAACTCAAACTGAGATAGGAAGCT
GCGAGCATCATCTTGTGAAGGATAGAAATAGTAAGAGTTACGAGTCCGGTGAAGAAACGAGCAGGAGGGC
CAAATTTGAAGTCAACCACATTAGTGAAACCACCGAAGGACAAGGAGAACTTGATCTCTTAAGCAAGAGT
GCAAGTTCAAACAATGAAGATAGTAATCAACCAGAAAATAATTCTACGGGAAAGGAGGAGCTTGAATTGA
ACTTAAACAGTAATGCAGAAATTATCAAAATTGAGCCCGGGCAAAGGGACTATATTATGAAGAATTTGCA
CAAGAGCCTGAGTGCAAATATGATGATGCAAAATGCTTCAGTACACACGGCAAGTATTAACAAACCTAGA
CTTAAGACTGCTGCTTACTGCAGAATCTCAACAGATTCAGAAGAACAAAAGGTAAGCTTGAAAACCCAAG
TAGCCTATTACACTTATCTGATTCTAAAGGATCCCCAATATGAATATGCAGGCATCTATGCCGATGAAGG
TATATCAGGGCGTTCTATGAAAAACCGTACAGAATTTCTCAAACTACTCGAAGAATGTAAAGCCGGGAAT
GTGGACTTGATTTTAACCAAGTCAATCTCACGGTTTAGCAGAAACGCATTAGATTGCTTGGAACAGATCA
GGATGCTGAAGTCGCTGCCAAGTCCAGTTTATGTGTATTTTGAGAAAGAGAATATTCATACAAAAGATGA
GAAGAGTGAGCTGATGATTTCTATTTTTGGAAGTATCGCTCAGGAAGAGAGCGTAAACATGGGAGAAGCC
ATGGCTTGGGGAAAACGGAGATATGCTGAGAGAGGGATAGTAAACCCAAGTGTTGCACCTTATGGATATA
GAACGGTCAGAAAAGGTGAATGGGAGGTGGTTGAAGAAGAAGCTACGATCATTAGAAGAATTTATCGGAT
GCTCCTAAGTGGAAAGAGTATTCATGAAATCACAAAGGAGCTCTCCATGGAGAAGATAAAGGGTCCTGGC
GGCAACGAGCAGTGGCATCTTCAAACCATTAGAAATATCTTGAGAAATGAAATCTATAGGGGTAACTACC
TTTATCAAAAGGCTTATATCAAGGACACGATCGAGAAGAAGGTGGTAATGAATCGAGGAGAACTGCCACA
GTATCTCATAGAGAATCATCATAAAGCCATTGTTGACAATGAGACCTGGGAAAAGGTCCAGAAGGTACTA
GAAGCCAGAAGGGAAAAATATGAGAATAAAAAGTCCATAACTTATCCTGAAGACAAAATGAAAAACGCTT
CTCTTGAAGATATTTTTACCTGTGGAGAATGTGGAAGTAAAATAGGCCATAGAAGGAGCATCCAGAGCTC
TAATGAGATTCATTCCTGGATCTGCACAAAAGCCGCTAAGTCTTTCTTGGTGGACTCGTGTAAGTCCACA
AGCGTATATCAGAAGCACCTGGAGCTGCATTTTATGAAGACTCTTCTCGATATTAAAAAGCATCGTTCTT
TCAAAGATGAGGTGCTCACCTATATTCGAACCCAAGAAGTAGATGAAAAGGAAGAGTGGAGAATCAAAGT
CATAGAGAAACGAATCAAAGATCTTAACAGAGAGCTTTATAATGCGGTAGACCAGGAGCTCAATAAAAAA
GGTCAGGACTCCAGGAAAGTTGATGAGCTCACAGAGAAAATTGTGGATCTTCAAGAGGAATTAAAGGTGT
TTAGGGACCGAAAGGCAAAGGTTGAGGATCTTAAAGCTGAGCTTGAATGGTTCCTAAAGAAGCTGGAAAC
CATTGATGACGCTCGAGTAAAAAGAAATGAAGGAATAGGCCACGGTGAAGAGATCTACTTCAGAGAAGAT
ATTTTTGAAAGAATAGTAAGGAGTGCACAGCTTTATAGCGATGGAAGGATCGTCTACGAACTAAGCCTCG
GGATCCAGTGGTTCATTGACTTTAAATACAGCGCATTTCAGAAGCTTCTTATAAAGTGGAAGGATAAACA
AAGGGCAGAAGAAAAAGAGGCTTTTCTTGAGGGGCCGGAAGTTAAAGAGCTGCTGGAATTTTGTAAGGAA
CCGAAGAGCTACTCTGATTTACATGCCTTCATGTGTGAGAGAAAAGAGGTGTCTTATAGCTATTTCAGGA
AATTGGTGATAAGACCTTTGATGAAGAAAGGAAAGCTGAAGTTCACCATACCAGAAGATGTTATGAATAG
GCATCAGAGATACACATCAATCTAA
SEQ ID NO:65
Name: gi|564135645|gb|ETA81829.1| recombinase [Youngiibacter fragilis 232.1]
Sequence:
MKDNDKRMWVQTLWNPINERHKSPLDSPEPGIKVAAYCRVSMKEEEQLRSLENQVHHYTHFIKSKPNWRF
VGVYYDDGISAAMASGRRGFQRIIRHAEEGKVDLILTKNISRFSRNSKELLDIINQLKAIGVGIYFEKEN
IDTSREYNKFLLSTYAALAQEEIETISNSTMWGYEKRFLKGIPKFNRLYGYKVIHAGDDSQLIVLEDEAK
IVRMMYEQYLQGKTFTDIARALTEAGVKTAKGKDVWIGGMIKHILSNVTYTGNKLTRELKRDLFTNKVNS
GERDQVFIGNTHEPIISNDIFNLVQKKLEANTKERKPSEKREKNHMSGRLLCGRCGYSFTIIHNRASHHF
KCSPKIMGVCDSELYRDADIREMMMRAMYIKYDFTDEDIVLKLLKELQVINQNDHFEFHRLKFITEIEIV
KRQQAISDRYSAISIEKMEEEYRTFESKIAKIEDDRYIRIDAVEWLKKNKTLDSFIAQVTTKILRAWVSE
MTVYTRDDFLVQWIDGTQTEIGSCEHHLVKDRNSKSYESGEETSRRAKFEVNHISETTEGQGELDLLSKS
ASSNNEDSNQPENNSTGKEELELNLNSNAEIIKIEPGQRDYIMKNLHKSLSANMMMQNASVHTASINKPR
LKTAAYCRISTDSEEQKVSLKTQVAYYTYLILKDPQYEYAGIYADEGISGRSMKNRTEFLKLLEECKAGN
VDLILTKSISRFSRNALDCLEQIRMLKSLPSPVYVYFEKENIHTKDEKSELMISIFGSIAQEESVNMGEA
MAWGKRRYAERGIVNPSVAPYGYRTVRKGEWEVVEEEATIIRRIYRMLLSGKSIHEITKELSMEKIKGPG
GNEQWHLQTIRNILRNEIYRGNYLYQKAYIKDTIEKKVVMNRGELPQYLIENHHKAIVDNETWEKVQKVL
EARREKYENKKSITYPEDKMKNASLEDIFTCGECGSKIGHRRSIQSSNEIHSWICTKAAKSFLVDSCKST
SVYQKHLELHFMKTLLDIKKHRSFKDEVLTYIRTQEVDEKEEWRIKVIEKRIKDLNRELYNAVDQELNKK
GQDSRKVDELTEKIVDLQEELKVFRDRKAKVEDLKAELEWFLKKLETIDDARVKRNEGIGHGEEIYFRED
IFERIVRSAQLYSDGRIVYELSLGIQWFIDFKYSAFQKLLIKWKDKQRAEEKEAFLEGPEVKELLEFCKE
PKSYSDLHAFMCERKEVSYSYFRKLVIRPLMKKGKLKFTIPEDVMNRHQRYTSI
SEQ ID NO:66
Name: gi|571264543:16423-16770 Clostridium difficile transposon Tn6218, strain Ox42, transposase
Sequence:
TTAGTCTTCAAAAGGTTTTGGACTAAATTTACTCTCGTAGTCAGGTCCAAGTGTTTCTTCAGATTTTTTT
TTCAACCAATCCACCTGCATGGTGAGCTGGCCAACTTTTTTCGCATATTCAGCTTTTTCCTTGCGTTCTA
AAGCGAGTTTTTCTTTCAGATTATCCTCTCGTGTGTCATTAAAAACCACGGATGCTTTATCGAGGAACTC
CTTCTTCCAGTTGCGGAGAAGATTCGGCTGAATATTGTTTTCGGTTGCGATTGTATTTAAGTCTTTTTCT
CCTTTGAGCAGTTCAATCACTAATTCTGATTTGAATTTGGCAGAGAAATTTCTTCTTGTTCGAGACAT
SEQ ID NO:67
Name: gi|571264559|emb|CDF47133.1| transposase [Peptoclostridium difficile]
Sequence:
MSRTRRNFSAKFKSELVIELLKGEKDLNTIATENNIQPNLLRNWKKEFLDKASVVFNDTREDNLKEKLAL
ERKEKAEYAKKVGQLTMQVDWLKKKSEETLGPDYESKFSPKPFED
SEQ ID NO:68
Name: gb|CP009444.1|:1317724-1320543 Francisella philomiragia strain GA01-2801, complete genome, Cpf1
Sequence:
ATGAATCTATATAGTAATCTAACAAATAAATATAGTTTAAGTAAAACTCTAAGATTTGAGTTAATTCCAC
AGGGTGAAACACTTGAAAATATAAAAGCAAGAGGTTTGATTTTAGATGATGAGAAAAGAGCTAAAGACTA
TAAAAAAGCTAAACAAATCATTGATAAATATCATCAGTTTTTTATAGAGGAGATATTAAGTTCGGTATGT
ATTAGCGAAGATTTATTACAAAACTATTCTGATGTTTATTTTAAACTTAAAAAGAGTGATGATGATAATC
TACAAAAAGATTTTAAAAGTGCAAAAGATACGATAAAGAAACACATATCTAGATATATAAATGACTCGGA
GAAATTTAAGAATTTGTTTAATCAAAATCTTATAGATGCTAAAAAAGGGCAAGAGTCAGATTTAATTCTA
TGGCTAAAGCAATCTAAGGATAATGGCATAGAACTATTTAAAGCTAACAGTGATATCACAGACATAGATG
AGGCGTTAGAAATAATCAAATCTTTTAAAGGTTGGACAACTTATTTTAAGGGTTTTCATGAAAATAGAAA
AAATGTCTATAGTAGTGATGATATCCCTACATCTATTATTTATAGAATAGTAGATGATAATTTGCCTAAA
TTTATAGAAAATAAAGCTAAGTATGAGAATTTAAAAGACAAAGCTCCAGAAGCTATAAACTATGAACAAA
TTAAAAAAGATTTGGCAGAAGAGCTAACCTTTGATATTGACTACAAAACATCTGAAGTTAATCAAAGAGT
TTTTTCACTTGATGAAGTTTTTGAGATAGCAAACTTTAATAATTATCTAAATCAAAGTGGTATTACTAAA
TTTAATACTATTATTGGTGGTAAATTTGTTAATGGTGAAAATACAAAGAGAAAAGGTATAAATGAATATA
TAAATCTATACTCACAGCAAATAAATGATAAAACACTTAAAAAATATAAAATGAGTGTTTTATTTAAGCA
AATTTTAAGTGATACAGAATCTAAATCTTTTGTAATTGATAAGTTAGAAGATGATAGTGATGTAGTTACA
ACGATGCAAAGTTTTTATGAGCAAATAGCAGCTTTTAAAACATTAGAAGAAAAGTCTATTAAGGAAACAT
TATCTTTACTATTTGATGATTTAAAAGCTCAAAAACTTGATTTGAGTAAAATTTATTTTAAAAATGATAA
ATCTCTTACTGATCTATCACAACAAGTTTTTGATGATTATAGTGTTATTGGTACAGCGGTACTAGAATAT
ATAACTCAACAAGTAGCACCTAAAAATCTTGATAACCCTAGTAAGAAAGAGCAAGATTTAATAGCCAAAA
AAACTGAAAAAGCAAAATACTTATCTCTAGAAACTATAAAGCTTGCCTTAGAAGAATTTAATAAGTATAG
AGATATAGATAAACAGTGTAGGTTTGAAGAAATATTTGCAAGCTTTGCAGATATTCCGGTGCTATTTGAT
GAAATAGCTCAAAACAAAAACAATTTGGCACAGATATCTATCAAATATCAAAATCAAGGTAAAAAAGACC
TGCTTCAAACTAGTGCAGAAGTAGATGTTAAAGCTATCAAGGATCTTTTGGATCAAACTAATAATCTCTT
GCATAAACTAAAAATATTTCATATTACGCAATCAGAAGATAAGGCAAATATTTTAGACAAGGATGAGCAT
TTTTATTTAGTATTTGATGAGTGCTACTTTGAGCTAGCGAATATAGTGGCTCTTTATAACAAAATTAGAA
ACTATATAACTCAAAAGCCATATAGTGATGAGAAATTTAAGCTCAATTTTGAGAACTCAACTTTAGCCAA
TGGTTGGGATAAAAATAAAGAGCCTGACAATACGGCAATTTTATTTATCAAAGATGATAAATATTATCTG
GGTGTGATGAACAAGAAAAATAACAAAATATTTGATGATAAAGCTATCAAAGAAAATAAAGGTGAAGGAT
ATAAGAAAGTTGTATATAAACTTTTACCCGGTGCAAATAAAATGTTACCTAAGGTTTTCTTTTCTGCTAA
ATCTATAAATTTTTATAATCCTAGTGAAGATATACTTAGAATAAGAAACCACTCAACACATACAAAAAAT
GGTAGTCCTCAAAAAGGATATGAAAAACTTGAGTTTAATATTGAAGATTGCCGAAAATTTATAGATTTTT
ATAAACATTCTATAAGTAGGCATCCAGAGTGGAAAGATTTTGGATTTAGATTTTCTGATACTAAAAAATA
CAACTCTATAGATGAATTTTATAGAGAAGTTGAAAATCAAGGCTACAAACTAACTTTTGAAAATATATCA
GAAAGCTATATTGATAGTTTAGTCGATGAAGGCAAATTATACCTATTCCAAATCTATAATAAAGATTTCT
CAGTATATAGTAAGGGTAAACCAAATTTACATACGCTATATTGGAAGGCGTTGTTTGATGAGAGAAATCT
CCAAGATGTAGTATATAAATTAAATGGTGAAGCAGAACTCTTCTATCGTAAACAATCAATACCTAAGAAA
ATCACTCACCCAGCCAAAGAGGCAATAGCTAATAAAAACAAAGATAATCCTAAAAAAGAGAGTATTTTTG
AATATGATTTAATCAAAGATAAACGCTTTACTGAAGATAAGTTTTTCTTTCACTGTCCTATTACAATCAA
TTTCAAATCTAGTGGAGCTAATAAGTTTAATGATGAAATCAATTTATTGCTAAAAGAAAAAGCAAATGAT
GTTCATATCCTAAGTATAGATAGAGGAGAAAGACATTTAGCTTACTATACTTTGGTAGATGGTAAAGGAA
ACATTATCTGTAAGAATTAA
SEQ ID NO:69
Name: gi|754264888|gb|AJI57252.1| CRISPR-associated protein Cpf1, subtype PREFRAN [Francisella philomiragia]
Sequence:
MKTNYHDKLAAIEKDRESARKDWKKINNIKEMKEGYLSQVVHEIAKLVIGYNAIVVFEDLNFGFKRGRFK
VEKQVYQKLEKMLIEKLNYLVFKDNEFDKAGGVLRAYQLTAPFETFKKMGKQTGIIYYVPADFTSKICPV
TGFVNQLYPKYESVSKSQEFFSKFDKICYNLDKGYFEFSFDYKNFGDKAAKGKWTIASFGSRLINFRNSD
KNHNWDTREVYPTKELEKLLKDYSIEYGHGECIKAAIYAENDKKFFAKLTSILNSILQMRNSKTGTELDY
LISPVADVNGNFFDSRHAPKNMPQDADANGAYHIGLKGLMLLYRIKNNQDGKKLNLVIKNEEYFEFVQNR
NKSSKI
SEQ ID NO:70
Name: gi|438609|gb|L21188.1|HIV1NY5A Human immunodeficiency virus type 1 integrase gene, 3' end
Sequence:
TTCCTGGACGGTATCGATAAAGCTCAGGAAGAACACGAAAAATACCACTCTAACTGGCGCGCCATGGCTT
CTGACTTCAACCTGCCGCCGGTTGTTGCCAAGGAAATCGTGGCTTCTTGCGACAAATGCCAATTGAAAGG
TGAAGCTATGCATGGTCAGGTCGACTGCTCTCCAGGTATCTGGCAGCTGGACTGCACTCATCTCGAGGGT
AAAGTTATCCTGGTTGCTGTTCACGTGGCTTCCGGATACATCGAAGCTGAAGTTATCCCGGCTGAAACCG
GTCAGGAAACTGCTTACTTCCTGCTTAAGCTGGCCGGCCGTTGGCCGGTTAAAACTGTTCACACTGACAA
CGGTTCTAACTTCACTAGTACTACTGTTAAAGCTGCATGCTGGTGGGCCGGCATCAAACAGGAGTTCGGG
ATCCCGTACAACCCGCAGTCTCAGGGCGTTATCGAATCTATGAACAAAGAGCTCAAAAAAATCATTGGCC
AGGTACGTGATCAGGCTGAGCACCTGAAAACCGCGGTGCAGATGGCTGTTTTCATCCACAACTTCAAACG
TAAAGGTGGTATCGGTGGTTACAGCGCTGGTGAACGTATCGTTGACATCATCGCTACTGATATCCAGACT
AAAGAACTGCAGAAACAGATCACTAAAATCCAGAACTTCCGTGTATACTACCGTGACTCTAGAGACCCGG
TTTGGAAAGGTCCTGCTAAACTCCTGTGGAAGGGTGAAGGTGCTGTTGTTATCCAGGACAACTCTGACAT
CAAAGTGGTACCGCGTCGTAAAGCTAAAATCATTCGCGACTACGGCAAACAGATGGCTGGTGACGACTGC
GTTGCTAGCCGTCAGGACGAAGACTAAAAGCTTCAGGC
SEQ ID NO:71
Name: gi|438610|gb|AAC37875.1| integrase, partial [Human immunodeficiency virus 1]
Sequence:
FLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEG
KVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTVHTDNGSNFTSTTVKAACWWAGIKQEFG
IPYNPQSQGVIESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQT
KELQKQITKIQNFRVYYRDSRDPVWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDC
VASRQDED
SEQ ID NO:72
Name: gi|545612232|ref|WP_021736722.1| type V CRISPR-associated protein Cpf1 [Acidaminococcus sp. BV3L6]
Sequence:
MTQFEGFTNLYQVSKTLRFELIPQGKTLKHIQEQGFIEEDKARNDHYKELKPIIDRIYKTYADQCLQLVQ
LDWENLSAAIDSYRKEKTEETRNALIEEQATYRNAIHDYFIGRTDNLTDAINKRHAEIYKGLFKAELFNG
KVLKQLGTVTTTEHENALLRSFDKFTTYFSGFYENRKNVFSAEDISTAIPHRIVQDNFPKFKENCHIFTR
LITAVPSLREHFENVKKAIGIFVSTSIEEVFSFPFYNQLLTQTQIDLYNQLLGGISREAGTEKIKGLNEV
LNLAIQKNDETAHIIASLPHRFIPLFKQILSDRNTLSFILEEFKSDEEVIQSFCKYKTLLRNENVLETAE
ALFNELNSIDLTHIFISHKKLETISSALCDHWDTLRNALYERRISELTGKITKSAKEKVQRSLKHEDINL
QEIISAAGKELSEAFKQKTSEILSHAHAALDQPLPTTLKKQEEKEILKSQLDSLLGLYHLLDWFAVDESN
EVDPEFSARLTGIKLEMEPSLSFYNKARNYATKKPYSVEKFKLNFQMPTLASGWDVNKEKNNGAILFVKN
GLYYLGIMPKQKGRYKALSFEPTEKTSEGFDKMYYDYFPDAAKMIPKCSTQLKAVTAHFQTHTTPILLSN
NFIEPLEITKEIYDLNNPEKEPKKFQTAYAKKTGDQKGYREALCKWIDFTRDFLSKYTKTTSIDLSSLRP
SSQYKDLGEYYAELNPLLYHISFQRIAEKEIMDAVETGKLYLFQIYNKDFAKGHHGKPNLHTLYWTGLFS
PENLAKTSIKLNGQAELFYRPKSRMKRMAHRLGEKMLNKKLKDQKTPIPDTLYQELYDYVNHRLSHDLSD
EARALLPNVITKEVSHEIIKDRRFTSDKFFFHVPITLNYQAANSPSKFNQRVNAYLKEHPETPIIGIDRG
ERNLIYITVIDSTGKILEQRSLNTIQQFDYQKKLDNREKERVAARQAWSVVGTIKDLKQGYLSQVIHEIV
DLMIHYQAVVVLENLNFGFKSKRTGIAEKAVYQQFEKMLIDKLNCLVLKDYPAEKVGGVLNPYQLTDQFT
SFAKMGTQSGFLFYVPAPYTSKIDPLTGFVDPFVWKTIKNHESRKHFLEGFDFLHYDVKTGDFILHFKMN
RNLSFQRGLPGFMPAWDIVFEKNETQFDAKGTPFIAGKRIVPVIENHRFTGRYRDLYPANELIALLEEKG
IVFRDGSNILPKLLENDDSHAIDTMVALIRSVLQMRNSNAATGEDYINSPVRDLNGVCFDSRFQNPEWPM
DADANGAYHIALKGQLLLNHLKESKDLKLQNGISNQDWLAYIQELRN
SEQ ID NO:73
Name: gi|769142322|ref|WP_044919442.1| type V CRISPR-associated protein Cpf1 [Lachnospiraceae bacterium MA2020]
Sequence:
MYYESLTKQYPVSKTIRNELIPIGKTLDNIRQNNILESDVKRKQNYEHVKGILDEYHKQLINEALDNCTL
PSLKIAAEIYLKNQKEVSDREDFNKTQDLLRKEVVEKLKAHENFTKIGKKDILDLLEKLPSISEDDYNAL
ESFRNFYTYFTSYNKVRENLYSDKEKSSTVAYRLINENFPKFLDNVKSYRFVKTAGILADGLGEEEQDSL
FIVETFNKTLTQDGIDTYNSQVGKINSSINLYNQKNQKANGFRKIPKMKMLYKQILSDREESFIDEFQSD
EVLIDNVESYGSVLIESLKSSKVSAFFDALRESKGKNVYVKNDLAKTAMSNIVFENWRTFDDLLNQEYDL
ANENKKKDDKYFEKRQKELKKNKSYSLEHLCNLSEDSCNLIENYIHQISDDIENIIINNETFLRIVINEH
DRSRKLAKNRKAVKAIKDFLDSIKVLERELKLINSSGQELEKDLIVYSAHEELLVELKQVDSLYNMTRNY
LTKKPFSTEKVKLNFNRSTLLNGWDRNKETDNLGVLLLKDGKYYLGIMNTSANKAFVNPPVAKTEKVFKK
VDYKLLPVPNQMLPKVFFAKSNIDFYNPSSEIYSNYKKGTHKKGNMFSLEDCHNLIDFFKESISKHEDWS
KFGFKFSDTASYNDISEFYREVEKQGYKLTYTDIDETYINDLIERNELYLFQIYNKDFSMYSKGKLNLHT
LYFMMLFDQRNIDDVVYKLNGEAEVFYRPASISEDELIIHKAGEEIKNKNPNRARTKETSTFSYDIVKDK
RYSKDKFTLHIPITMNFGVDEVKRFNDAVNSAIRIDENVNVIGIDRGERNLLYVVVIDSKGNILEQISLN
SIINKEYDIETDYHALLDEREGGRDKARKDWNTVENIRDLKAGYLSQVVNVVAKLVLKYNAIICLEDLNF
GFKRGRQKVEKQVYQKFEKMLIDKLNYLVIDKSREQTSPKELGGALNALQLTSKFKSFKELGKQSGVIYY
VPAYLTSKIDPTTGFANLFYMKCENVEKSKRFFDGFDFIRFNALENVFEFGFDYRSFTQRACGINSKWTV
CTNGERIIKYRNPDKNNMFDEKVVVVTDEMKNLFEQYKIPYEDGRNVKDMIISNEEAEFYRRLYRLLQQT
LQMRNSTSDGTRDYIISPVKNKREAYFNSELSDGSVPKDADANGAYNIARKGLWVLEQIRQKSEGEKINL
AMTNAEWLEYAQTHLL
SEQ ID NO:74
Name: gi|489130501|ref|WP_003040289.1| type V CRISPR-associated protein Cpf1 [Francisella tularensis]
Sequence:
MSIYQEFVNKYSLSKTLRFELIPQGKTLENIKARGLILDDEKRAKDYKKAKQIIDKYHQFFIEEILSSVC
ISEDLLQNYSDVYFKLKKSDDDNLQKDFKSAKDTIKKQISEYIKDSEKFKNLFNQNLIDAKKGQESDLIL
WLKQSKDNGIELFKANSDITDIDEALEIIKSFKGWTTYFKGFHENRKNVYSSNDIPTSIIYRIVDDNLPK
FLENKAKYESLKDKAPEAINYEQIKKDLAEELTFDIDYKTSEVNQRVFSLDEVFEIANFNNYLNQSGITK
FNTIIGGKFVNGENTKRKGINEYINLYSQQINDKTLKKYKMSVLFKQILSDTESKSFVIDKLEDDSDVVT
TMQSFYEQIAAFKTVEEKSIKETLSLLFDDLKAQKLDLSKIYFKNDKSLTDLSQQVFDDYSVIGTAVLEY
ITQQIAPKNLDNPSKKEQELIAKKTEKAKYLSLETIKLALEEFNKHRDIDKQCRFEEILANFAAIPMIFD
EIAQNKDNLAQISIKYQNQGKKDLLQASAEDDVKAIKDLLDQTNNLLHKLKIFHISQSEDKANILDKDEH
FYLVFEECYFELANIVPLYNKIRNYITQKPYSDEKFKLNFENSTLANGWDKNKEPDNTAILFIKDDKYYL
GVMNKKNNKIFDDKAIKENKGEGYKKIVYKLLPGANKMLPKVFFSAKSIKFYNPSEDILRIRNHSTHTKN
GSPQKGYEKFEFNIEDCRKFIDFYKQSISKHPEWKDFGFRFSDTQRYNSIDEFYREVENQGYKLTFENIS
ESYIDSVVNQGKLYLFQIYNKDFSAYSKGRPNLHTLYWKALFDERNLQDVVYKLNGEAELFYRKQSIPKK
ITHPAKEAIANKNKDNPKKESVFEYDLIKDKRFTEDKFFFHCPITINFKSSGANKFNDEINLLLKEKAND
VHILSIDRGERHLAYYTLVDGKGNIIKQDTFNIIGNDRMKTNYHDKLAAIEKDRDSARKDWKKINNIKEM
KEGYLSQVVHEIAKLVIEYNAIVVFEDLNFGFKRGRFKVEKQVYQKLEKMLIEKLNYLVFKDNEFDKTGG
VLRAYQLTAPFETFKKMGKQTGIIYYVPAGFTSKICPVTGFVNQLYPKYESVSKSQEFFSKFDKICYNLD
KGYFEFSFDYKNFGDKAAKGKWTIASFGSRLINFRNSDKNHNWDTREVYPTKELEKLLKDYSIEYGHGEC
IKAAICGESDKKFFAKLTSVLNTILQMRNSKTGTELDYLISPVADVNGNFFDSRQAPKNMPQDADANGAY
HIGLKGLMLLGRIKNNQEGKKLNLVIKNEEYFEFVQNRNN
SEQ ID NO:75
Name: gi|502240446|ref|WP_012739647.1| type V CRISPR-associated protein Cpf1 [[Eubacterium] eligens]
Sequence:
MNGNRSIVYREFVGVIPVAKTLRNELRPVGHTQEHIIQNGLIQEDELRQEKSTELKNIMDDYYREYIDKS
LSGVTDLDFTLLFELMNLVQSSPSKDNKKALEKEQSKMREQICTHLQSDSNYKNIFNAKLLKEILPDFIK
NYNQYDVKDKAGKLETLALFNGFSTYFTDFFEKRKNVFTKEAVSTSIAYRIVHENSLIFLANMTSYKKIS
EKALDEIEVIEKNNQDKMGDWELNQIFNPDFYNMVLIQSGIDFYNEICGVVNAHMNLYCQQTKNNYNLFK
MRKLHKQILAYTSTSFEVPKMFEDDMSVYNAVNAFIDETEKGNIIGKLKDIVNKYDELDEKRIYISKDFY
ETLSCFMSGNWNLITGCVENFYDENIHAKGKSKEEKVKKAVKEDKYKSINDVNDLVEKYIDEKERNEFKN
SNAKQYIREISNIITDTETAHLEYDDHISLIESEEKADEMKKRLDMYMNMYHWAKAFIVDEVLDRDEMFY
SDIDDIYNILENIVPLYNRVRNYVTQKPYNSKKIKLNFQSPTLANGWSQSKEFDNNAIILIRDNKYYLAI
FNAKNKPDKKIIQGNSDKKNDNDYKKMVYNLLPGANKMLPKVFLSKKGIETFKPSDYIISGYNAHKHIKT
SENFDISFCRDLIDYFKNSIEKHAEWRKYEFKFSATDSYSDISEFYREVEMQGYRIDWTYISEADINKLD
EEGKIYLFQIYNKDFAENSTGKENLHTMYFKNIFSEENLKDIIIKLNGQAELFYRRASVKNPVKHKKDSV
LVNKTYKNQLDNGDVVRIPIPDDIYNEIYKMYNGYIKESDLSEAAKEYLDKVEVRTAQKDIVKDYRYTVD
KYFIHTPITINYKVTARNNVNDMVVKYIAQNDDIHVIGIDRGERNLIYISVIDSHGNIVKQKSYNILNNY
DYKKKLVEKEKTREYARKNWKSIGNIKELKEGYISGVVHEIAMLIVEYNAIIAMEDLNYGFKRGRFKVER
QVYQKFESMLINKLNYFASKEKSVDEPGGLLKGYQLTYVPDNIKNLGKQCGVIFYVPAAFTSKIDPSTGF
ISAFNFKSISTNASRKQFFMQFDEIRYCAEKDMFSFGFDYNNFDTYNITMGKTQWTVYTNGERLQSEFNN
ARRTGKTKSINLTETIKLLLEDNEINYADGHDIRIDMEKMDEDKKSEFFAQLLSLYKLTVQMRNSYTEAE
EQENGISYDKIISPVINDEGEFFDSDNYKESDDKECKMPKDADANGAYCIALKGLYEVLKIKSEWTEDGF
DRNCLKLPHAEWLDFIQNKRYE
SEQ ID NO:76
Name: gi|537834683|ref|WP_020988726.1| type V CRISPR-associated protein Cpf1 [Leptospira inadai]
Sequence:
MEDYSGFVNIYSIQKTLRFELKPVGKTLEHIEKKGFLKKDKIRAEDYKAVKKIIDKYHRAYIEEVFDSVL
HQKKKKDKTRFSTQFIKEIKEFSELYYKTEKNIPDKERLEALSEKLRKMLVGAFKGEFSEEVAEKYKNLF
SKELIRNEIEKFCETDEERKQVSNFKSFTTYFTGFHSNRQNIYSDEKKSTAIGYRIIHQNLPKFLDNLKI
IESIQRRFKDFPWSDLKKNLKKIDKNIKLTEYFSIDGFVNVLNQKGIDAYNTILGGKSEESGEKIQGLNE
YINLYRQKNNIDRKNLPNVKILFKQILGDRETKSFIPEAFPDDQSVLNSITEFAKYLKLDKKKKSIIAEL
KKFLSSFNRYELDGIYLANDNSLASISTFLFDDWSFIKKSVSFKYDESVGDPKKKIKSPLKYEKEKEKWL
KQKYYTISFLNDAIESYSKSQDEKRVKIRLEAYFAEFKSKDDAKKQFDLLERIEEAYAIVEPLLGAEYPR
DRNLKADKKEVGKIKDFLDSIKSLQFFLKPLLSAEIFDEKDLGFYNQLEGYYEEIDSIGHLYNKVRNYLT
GKIYSKEKFKLNFENSTLLKGWDENREVANLCVIFREDQKYYLGVMDKENNTILSDIPKVKPNELFYEKM
VYKLIPTPHMQLPRIIFSSDNLSIYNPSKSILKIREAKSFKEGKNFKLKDCHKFIDFYKESISKNEDWSR
FDFKFSKTSSYENISEFYREVERQGYNLDFKKVSKFYIDSLVEDGKLYLFQIYNKDFSIFSKGKPNLHTI
YFRSLFSKENLKDVCLKLNGEAEMFFRKKSINYDEKKKREGHHPELFEKLKYPILKDKRYSEDKFQFHLP
ISLNFKSKERLNFNLKVNEFLKRNKDINIIGIDRGERNLLYLVMINQKGEILKQTLLDSMQSGKGRPEIN
YKEKLQEKEIERDKARKSWGTVENIKELKEGYLSIVIHQISKLMVENNAIVVLEDLNIGFKRGRQKVERQ
VYQKFEKMLIDKLNFLVFKENKPTEPGGVLKAYQLTDEFQSFEKLSKQTGFLFYVPSWNTSKIDPRTGFI
DFLHPAYENIEKAKQWINKFDSIRFNSKMDWFEFTADTRKFSENLMLGKNRVWVICTTNVERYFTSKTAN
SSIQYNSIQITEKLKELFVDIPFSNGQDLKPEILRKNDAVFFKSLLFYIKTTLSLRQNNGKKGEEEKDFI
LSPVVDSKGRFFNSLEASDDEPKDADANGAYHIALKGLMNLLVLNETKEENLSRPKWKIKNKDWLEFVWE
RNR
SEQ ID NO:77
Name: gi|739008549|ref|WP_036890108.1| type V CRISPR-associated protein Cpf1 [Porphyromonas crevioricanis]
Sequence:
MDSLKDFTNLYPVSKTLRFELKPVGKTLENIEKAGILKEDEHRAESYRRVKKIIDTYHKVFIDSSLENMA
KMGIENEIKAMLQSFCELYKKDHRTEGEDKALDKIRAVLRGLIVGAFTGVCGRRENTVQNEKYESLFKEK
LIKEILPDFVLSTEAESLPFSVEEATRSLKEFDSFTSYFAGFYENRKNIYSTKPQSTAIAYRLIHENLPK
FIDNILVFQKIKEPIAKELEHIRADFSAGGYIKKDERLEDIFSLNYYIHVLSQAGIEKYNALIGKIVTEG
DGEMKGLNEHINLYNQQRGREDRLPLFRPLYKQILSDREQLSYLPESFEKDEELLRALKEFYDHIAEDIL
GRTQQLMTSISEYDLSRIYVRNDSQLTDISKKMLGDWNAIYMARERAYDHEQAPKRITAKYERDRIKALK
GEESISLANLNSCIAFLDNVRDCRVDTYLSTLGQKEGPHGLSNLVENVFASYHEAEQLLSFPYPEENNLI
QDKDNVVLIKNLLDNISDLQRFLKPLWGMGDEPDKDERFYGEYNYIRGALDQVIPLYNKVRNYLTRKPYS
TRKVKLNFGNSQLLSGWDRNKEKDNSCVILRKGQNFYLAIMNNRHKRSFENKMLPEYKEGEPYFEKMDYK
FLPDPNKMLPKVFLSKKGIEIYKPSPKLLEQYGHGTHKKGDTFSMDDLHELIDFFKHSIEAHEDWKQFGF
KFSDTATYENVSSFYREVEDQGYKLSFRKVSESYVYSLIDQGKLYLFQIYNKDFSPCSKGTPNLHTLYWR
MLFDERNLADVIYKLDGKAEIFFREKSLKNDHPTHPAGKPIKKKSRQKKGEESLFEYDLVKDRRYTMDKF
QFHVPITMNFKCSAGSKVNDMVNAHIREAKDMHVIGIDRGERNLLYICVIDSRGTILDQISLNTINDIDY
HDLLESRDKDRQQEHRNWQTIEGIKELKQGYLSQAVHRIAELMVAYKAVVALEDLNMGFKRGRQKVESSV
YQQFEKQLIDKLNYLVDKKKRPEDIGGLLRAYQFTAPFKSFKEMGKQNGFLFYIPAWNTSNIDPTTGFVN
LFHVQYENVDKAKSFFQKFDSISYNPKKDWFEFAFDYKNFTKKAEGSRSMWILCTHGSRIKNFRNSQKNG
QWDSEEFALTEAFKSLFVRYEIDYTADLKTAIVDEKQKDFFVDLLKLFKLTVQMRNSWKEKDLDYLISPV
AGADGRFFDTREGNKSLPKDADANGAYNIALKGLWALRQIRQTSEGGKLKLAISNKEWLQFVQERSYEKD
SEQ ID NO:78
Name: gi|517171043|ref|WP_018359861.1| type V CRISPR-associated protein Cpf1 [Porphyromonas macacae]
Sequence:
MKTQHFFEDFTSLYSLSKTIRFELKPIGKTLENIKKNGLIRRDEQRLDDYEKLKKVIDEYHEDFIANILS
SFSFSEEILQSYIQNLSESEARAKIEKTMRDTLAKAFSEDERYKSIFKKELVKKDIPVWCPAYKSLCKKF
DNFTTSLVPFHENRKNLYTSNEITASIPYRIVHVNLPKFIQNIEALCELQKKMGADLYLEMMENLRNVWP
SFVKTPDDLCNLKTYNHLMVQSSISEYNRFVGGYSTEDGTKHQGINEWINIYRQRNKEMRLPGLVFLHKQ
ILAKVDSSSFISDTLENDDQVFCVLRQFRKLFWNTVSSKEDDAASLKDLFCGLSGYDPEAIYVSDAHLAT
ISKNIFDRWNYISDAIRRKTEVLMPRKKESVERYAEKISKQIKKRQSYSLAELDDLLAHYSEESLPAGFS
LLSYFTSLGGQKYLVSDGEVILYEEGSNIWDEVLIAFRDLQVILDKDFTEKKLGKDEEAVSVIKKALDSA
LRLRKFFDLLSGTGAEIRRDSSFYALYTDRMDKLKGLLKMYDKVRNYLTKKPYSIEKFKLHFDNPSLLSG
WDKNKELNNLSVIFRQNGYYYLGIMTPKGKNLFKTLPKLGAEEMFYEKMEYKQIAEPMLMLPKVFFPKKT
KPAFAPDQSVVDIYNKKTFKTGQKGFNKKDLYRLIDFYKEALTVHEWKLFNFSFSPTEQYRNIGEFFDEV
REQAYKVSMVNVPASYIDEAVENGKLYLFQIYNKDFSPYSKGIPNLHTLYWKALFSEQNQSRVYKLCGGG
ELFYRKASLHMQDTTVHPKGISIHKKNLNKKGETSLFNYDLVKDKRFTEDKFFFHVPISINYKNKKITNV
NQMVRDYIAQNDDLQIIGIDRGERNLLYISRIDTRGNLLEQFSLNVIESDKGDLRTDYQKILGDREQERL
RRRQEWKSIESIKDLKDGYMSQVVHKICNMVVEHKAIVVLENLNLSFMKGRKKVEKSVYEKFERMLVDKL
NYLVVDKKNLSNEPGGLYAAYQLTNPLFSFEELHRYPQSGILFFVDPWNTSLTDPSTGFVNLLGRINYTN
VGDARKFFDRFNAIRYDGKGNILFDLDLSRFDVRVETQRKLWTLTTFGSRIAKSKKSGKWMVERIENLSL
CFLELFEQFNIGYRVEKDLKKAILSQDRKEFYVRLIYLFNLMMQIRNSDGEEDYILSPALNEKNLQFDSR
LIEAKDLPVDADANGAYNVARKGLMVVQRIKRGDHESIHRIGRAQWLRYVQEGIVE
SEQ ID NO:79
Name: Integrase protein sequence found at the UniProt site. The DNA sequence was obtained from GenBank.
Sequence:
TTTTTAGATGGAATAGATAAGGCCCAAGATGAACATGAGAAATATCACAGTAATTGGAGAGCAATGGCTAGTGATTTTAACCTGCCACCTGTAGTAGCAAAAGAAATAGTAGCCAGCTGTGATAAATGTCAGCTAAAAGGAGAAGCCATGCATGGACAAGTAGACTGTAGTCCAGGAATATGGCAACTAGATTGTACACATTTAGAAGGAAAAGTTATCCTGGTAGCAGTTCATGTAGCCAGTGGATATATAGAAGCAGAAGTTATTCCAGCAGAAACAGGGCAGGAAACAGCATATTTTCTTTTAAAATTAGCAGGAAGATGGCCAGTAAAAACAATACATACTGACAATGGCAGCAATTTCACCGGTGCTACGGTTAGGGCCGCCTGTTGGTGGGCGGGAATCAAGCAGGAATTTGGAATTCCCTACAATCCCCAAAGTCAAGGAGTAGTAGAATCTATGAATAAAGAATTAAAGAAAATTATAGGACAGGTAAGAGATCAGGCTGAACATCTTAAGACAGCAGTACAAATGGCAGTATTCATCCACAATTTTAAAAGAAAAGGGGGGATTGGGGGGTACAGTGCAGGGGAAAGAATAGTAGACATAATAGCAACAGACATACAAACTAAAGAATTACAAAAACAAATTACAAAAATTCAAAATTTTCGGGTTTATTACAGGGACAGCAGAAATCCACTTTGGAAAGGACCAGCAAAGCTCCTCTGGAAAGGTGAAGGGGCAGTAGTAATACAAGATAATAGTGACATAAAAGTAGTGCCAAGAAGAAAAGCAAAGATCATTAGGGATTATGGAAAACAGATGGCAGGTGATGATTGTGTGGCAAGTAGACAGGATGAGGATTAG
SEQ ID NO:80
Name: sp|P04585|1148-1435
Sequence:
FLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTDNGSNFTGATVRAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRNPLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED
SEQ ID NO:81
Protein domain characteristic of zinc finger proteins
CX(2-4)CX(12)HX(3-5)H (X(2-4) means, for example, XX or XXX or XXXX)
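The motif notation above maps directly onto a regular expression. The following is a minimal sketch, assuming that X stands for any amino acid and that X(12) means exactly twelve residues. The function name `find_zinc_fingers` and the demo peptide are hypothetical and do not come from the sequence listing.

```python
import re

# C-X(2-4)-C-X(12)-H-X(3-5)-H, reading X as "any residue" (an assumption).
ZINC_FINGER = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_zinc_fingers(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in ZINC_FINGER.finditer(protein.upper())]


# Hypothetical peptide constructed to satisfy the pattern; not a real protein.
demo = "MK" + "C" + "PE" + "C" + "GKSFSQKSNLQK" + "H" + "QRT" + "H" + "TG"
print(find_zinc_fingers(demo))
```

Because `finditer` reports non-overlapping hits only, tandem fingers that share residues would need a lookahead-based pattern instead.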
SEQ ID NO:82
>gi|1616606|emb|X97044.1| Mouse mammary tumor virus 5' LTR DNA
ATGCCGCGCCTGCAGCAGAAATGGTTGAACTCCCGAGAGTGTCCTACACTTAGGGGAGAAGCAGCCAAGG
GGTTGTTTCCCACCCAGAACGACCCATCTGCGCACACACGGATGAGCCCGTCAAACAAAGACATATTCAT
TCTCTGCTGCAAACTTGGCATAGCTCTGCTTTGCCTGGGGCTATTGGGGGAAGTTGCGGTTCATGCTCGC
AGGGCTCTCACCCTTGACTCTTTTAATAGCTCTTCTGTGCAAGATTACAATCTAAACAATTCGGAGAACT
CGACCTTCCTCCTGAGGCAAGGACCACAGCCAACTTCCTCTTACAAGCCGCATCGATTTAGTCCTTCAGA
AATAGAAATAAGAATGCTTGCTAAAAATTATATTTTTACCAATGAGACCAATCCAATAGGTCGATTATTA
ATTACTATGTTAAGAAATGAATCATTATCTTTTAGTACTATTTTTACTCAAATTCAGAAGTTAGAAATGG
GAATAGAAAATAGAAAGAGACGCTCAGCCTCAGTTGAAGAACAGGTGCAAGGACTAAGGGCCTCAGGCCT
AGAAGTAAAAAGGGGGAAGAGGAGTGCGCTTGTCAAAATAGGAGACAGGTGGTGGCAACCAGGAACTTAT
AGGGGACCTTACATCTACAGACCAACAGACGCCCCCTTACCGTATACAGGAAGATATGACCTAAATTTTG
ATAGGTGGGTCACAGTCAATGGCTATAAAGTGTTATACAGATCCCTCCCCTTTCGTGAAAGGCTCGCCAG
AGCTAGACCTCCTTGGTGCGTGTTGTCTCAGGAAGAAAAAGACGACATGAAACAACAGGTACATGATTAT
ATTTATCTAGGAACAGGAATGAACTTTTGGAGATATTATACCAAGGAGGGGGCAGTGGCTAGACTATTAG
AACACATTTCTGCAGATACTAATAGCATGAGTTATTATGATTAGCCTTTATTGGCCCAATCTTGTGGTTC
CCAGGGTTCAAGTAGGTTCATGGTCACAAACTGTTCTTAAAAACAAGGATGTGAGACAAGTGGTTTCCTG
GCTTGGTTTGGTATCAAATGTTTTGATCTGAGCTCTGAGTGTTCTGTTTTCCTATGTTCTTTTGGAATCT
ATCCAAGTCTTATGTAAATGCTTATGTAAACCAAAGTATAAAAGAGTGCTGATTTTTTGAGTAAACTTGC
AACAGTCCTAACATTCACCTCTCGTGTGTTTGTGTCTGTTCGCCATCCCGTCTCCGCTCGTCACTTATCC
TTCACTTTCCAGAGGGTCCCCCCGCAGACCCCGGTGACCCTCAGGTTGGCCGACTGCGGCA
SEQ ID NO:83
>gi|1403387|emb|X98457.1| Mouse mammary tumor virus 3' LTR
ATGCCGCGCCTGCAGCAGAAATGGTTGAACTCCCGAGAGTGTCCTACACTTAGGAGAGAAGCAGCCAAGG
GGTTGTTTCCCACCAAGGACGACCCGTCTGCGTGCACGCGGATGAGCCCATCAGACAAAGACATACTCAT
TCTCTGCTGCAAACTTGGCATAGCTCTGCTTTGCCTGGGGCTATTGGGGGAAGTTGCGGTTCGTGCTCGC
AGGGCTCTCACCCTTGATTCTTTTAATAACTCTTCTGTGCAAGATTACAATCTAAACGATTCGGAGAACT
CGACCTTCCTCCTGGGGCAAGGACCACAGCCAACTTCCTCTTACAAGCCACACCGACTTTGTCCTTCAGA
AATAGAAATAAGAATGCTTGCTAAAAATTATATTTTTACCAATGAGACCAATCCAATAGGTCGATTATTA
ATCATGATGTTTAGAAATGAATCTTTGTCTTTTAGCACTATATTTACTCAAATTCAAAGGTTAGAAATGG
GAATAGAAAATAGAAAGAGACGCTCAACCTCAGTTGAAGAACAGGTGCAAGGACTAAGGGCCTCAGGCCT
AGAAGTAAAAAGGGGAAAGAGGAGTGCGCTTGTCAAAATAGGAGACAGGTGGTGGCAACCAGGGACTTAT
AGGGGACCTTACATCTACAGACCAACAGACGCCCCGCTACCATATACAGGAAGATACGATTTAAATTTTG
ATAGGTGGGTCACAGTCAACGGCTATAAAGTGTTATACAGATCCCTCCCCCTTCGTGAAAGACTCGCCAG
GGCTAGACCTCCTTGGTGTGTGTTAACTCAGGAAGAAAAAGACGACATGAAACAACAGGTACATGATTAT
ATTTATCTAGGAACAGGAATGAACTTCTGGGGAAAGATATTTGACTACACCGAAGAGGGAGCTATAGCAA
AAATTATATATAATATGAAATATACTCATGGGGGTCGCATTGGCTTCGATCCCTTTTGAAACATTTATAA
ATACAATTAGGTCTACCTTGCGGTTCCCAAGGTTTAAGTAAGTTCAGGGTCACAAACTGTTCTTAAAACA
AGGATGTGAGACAAGTGGTTTCCTGACTTGGT
SEQ ID NO:84
>gi|119662099|emb|AM076881.1| Human immunodeficiency virus 1 proviral 5' LTR, TAR element and U3, U5 and R repeat regions, clone PG232.14
GGCAAGAAATCCTTGATTTGTGGGTCTACTACACACAAGGCTTCTTCCCTGATTGGCAAAACTACACACC
GGGACCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGTGCCAGTTGACCCAAAGGAA
GTAGAAGAGGCTAACCAAAGAGAAGACAACTGTTTGCTACACCCTATGAGCCTGCATGGAATAGAGGACG
AAGACAGAGAAGTATTAAAGTGGCAGTTTGACAGCAGCCTAGCACGCAGACACATGGCCCGCGAGCTACA
TCCAGAGTATTACAAAGACTGCTGACACAGAAAAGACTTTCCGCTAGGACTTTCCACTGAGGCGTTCCAG
GGGGAGTGGTCTAGGCAGGACTAGGAGTGGCCAACCCTCAGATGCTGCATATAAGCAGCTGCTTTTCGCC
TGTACTAGGTCTCTCTAGGTGGACCAGATCTGAGCCTAGGCGCTCTCTGGCTATCTAAGGAACCCACTGC
TTAAGCCTCAATAAAGCTTGCCTTGAGTGCTCTAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTAGTAA
CTAGAGATCCCTCAGACCAACTTTAGTAGTGTAAAAAATCTCTAGCAGTGGCGCCCGAACAGGGACCCGA
AAGTGAAAGCAGGACCAGAGGAGATCTCTCGACGCAGGACTCGGCTTGCTGAAAGTGCACTCGGCAAGAG
GCGAGAGCAGCGGCGACTGGTGAGTACGCCGAATTTTATTTTGACTAGCGGAGGCTAGAAGGAGAGAGAT
A
SEQ ID NO:85
>gi|1072081|gb|U37267.1|HIV1U37267 Human immunodeficiency virus type 1 3' LTR region
ATGGGTGGCAAGTGGTCAGAAAGTAGTGTGGTTAGAAGGCATGTACCTTTAAGACAAGGCAGCTATAGAT
CTTAGCCGCTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTCACTCACAGAGAAGATCAGTTGAAC
CAGAAGAAGATAGAAGAGGCCATGAAGAAGAAAACAACAGATTGTTCCGTTTGTTCCGTTGGGGACTTTC
CAGGAGACGTGGCCTGAGTGATAAGCCGCTGGGGACTTTCCGAAGAGGCGTGACGGGACTTTCCAAGGCG
ACGTGGCCTGGGCGGGACTGGGGAGTGGCGAGCCCTCAGATGCTGCATATAAGCAGCTGCTTTCTGCCTG
TACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT
AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTATCT
AGA
There are no SEQ ID NO:86-99
SEQ ID NO:100
Oligonucleotide for inserting neo into the cell genome (using the full sequences of the 5' and 3' HIV LTRs)
GACAAGACATCCTTGATTTGTGGGTCTATAACACACAAGGCTTCTTCCCTGATTGGCAAAACTACACACC
GGGACCAGGGACCAGATACCCACTGACCTTTGGATGGTGCTTCAAGCTAGTGCCAGTTGACCCAAGGGAA
GTAGAAGAGGCCAATACAGGGGAAAACAACTGTTTGCTCCACCCTATGAGCCAGCATGGAATGGAAGATG
ACCATAGAGAAGTATTAAAGTGGAAGTTTGACAGTATGCTAGCACGCAGACACCTGGCCCGCGAGCTACA
TCCGGAGTACTACAAAAACTGCTGACATGGAGGGACTTTCCGCTGGGACTTTCCATTGGGGCGTTCCAGG
AGGTGTGGTCTGGGCGGGACAAGGGAGTGGTCAACCCTCAGATGCTGCATATAAGCAGCTGCTTTTCGCT
TGTACTGGGTCTCTTTAGGTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTACCTGAGGAACCCACTGC
TTAAGCCTCAATAAAGCTTGCCTTGAGTGCTCTAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAA
CTAGAGATCCCTCAGACCCTTTTGGTAGTGTGGAAAATCTCTAGCAGATGATTGAACAAGATGGATTGCAC
The first 5' LTR is underlined, neo is in plain text, and the 3' LTR is in bold (1179 bp)
SEQ ID NO:101
Abbreviated version of the 5' LTR and 3' LTR with the neo sequence (within 224 bp)
The first 5' LTR is underlined, neo is in plain text, and the 3' LTR is in bold
GACAAGACATCCTTGATTTGTGGGTCTATAACACACAAGGCTTCTTCCCTGATTGGCAAAACTACACAC
CATGATTGAACAAGATGGATTGCAC
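Regarding SEQ ID NO:72
GenBank protein accession: WP_021736722.1
NCBI protein GI from the NR database, or local GI (for proteins derived from the WGS database): 545612232
Contig ID in the WGS database: AWUR01000016.1
Contig description: Acidaminococcus sp. BV3L6 contig00028, whole genome shotgun sequence
Protein completeness: complete
Proteins analyzed experimentally: 8
Non-redundant set: nr
Organism: Acidaminococcus sp. BV3L6
Taxonomy: Bacteria, Firmicutes, Negativicutes, Selenomonadales, Acidaminococcaceae, Acidaminococcus, Acidaminococcus sp. BV3L6
Regarding SEQ ID NO:73
GenBank protein accession: WP_044919442.1
NCBI protein GI from the NR database, or local GI (for proteins derived from the WGS database): 769142322
Contig ID in the WGS database: JQKK01000008.1
Contig description: Lachnospiraceae bacterium MA2020 T348DRAFT_scaffold00007.7_C, whole genome shotgun sequence
Protein completeness: complete
Proteins analyzed experimentally: 9
Non-redundant set: nr
Organism: Lachnospiraceae bacterium MA2020
Taxonomy: Bacteria, Firmicutes, Clostridia, Clostridiales, Lachnospiraceae, unclassified Lachnospiraceae, Lachnospiraceae bacterium MA2020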
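Additional nucleic acid and protein sequences that can be used in the disclosed compositions and methods: Cpf1 alignment. SEQ ID NO:86-92, in order from top to bottom of the chart.
Additional nucleic acid and protein sequences that can be used in the disclosed compositions and methods: Cpf1 human cleavage protein alignment. SEQ ID NO:86 (first row) and SEQ ID NO:90 (second row).
Additional nucleic acid and protein sequences that can be used in the disclosed compositions and methods. Table taken from Haft, D., et al. PLoS Computational Biology, November 2005, Vol. 1, Issue 6, pp. 474-483. SEQ ID NO:200-253, in order from top to bottom of the chart.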
*Makarova et al. (14).
Editing target sequences and PAMs for Nrf2 (exon 2): used for sgRNA designs 1-3.
SEQ ID NO:254
GCGACGGAAAGAGTATGAGC TGG
SEQ ID NO:255
TATTTGACTTCAGTCAGCGA CGG
SEQ ID NO:256
TGGAGGCAAGATATAGATCT TGG
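Each entry above pairs a protospacer with its PAM, separated by a space. The following is a minimal sketch of how such entries could be sanity-checked before guide design. It assumes a 20-nt protospacer followed by an NGG PAM, which matches the three entries listed but is not stated as a rule in the listing. The helper name `split_target` is hypothetical.

```python
import re

# Target/PAM pairs as listed for Nrf2 exon 2 (SEQ ID NO:254-256).
TARGETS = {
    254: "GCGACGGAAAGAGTATGAGC TGG",
    255: "TATTTGACTTCAGTCAGCGA CGG",
    256: "TGGAGGCAAGATATAGATCT TGG",
}


def split_target(entry: str, spacer_len: int = 20) -> tuple[str, str]:
    """Split 'PROTOSPACER PAM' and check the assumed 20-nt + NGG layout."""
    spacer, pam = entry.split()
    if len(spacer) != spacer_len or not re.fullmatch(r"[ACGT]+", spacer):
        raise ValueError(f"unexpected protospacer: {spacer!r}")
    if not re.fullmatch(r"[ACGT]GG", pam):
        raise ValueError(f"PAM is not NGG: {pam!r}")
    return spacer, pam


for seq_id, entry in TARGETS.items():
    spacer, pam = split_target(entry)
    gc = 100 * sum(base in "GC" for base in spacer) / len(spacer)
    print(f"SEQ ID NO:{seq_id}  spacer={spacer}  PAM={pam}  GC={gc:.0f}%")
```

The GC percentage is printed only as a convenience; the listing itself gives no GC criterion.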
Key primers for detecting integration at the Nrf2 target
Primer set 1:
SEQ ID NO:257
Primer 1: 5'-GTGTTAATTTCAAACATCAGCAGC-3',
SEQ ID NO:258
Primer 2: 5'-GACAAGACATCCTTGATTTG-3'
Primer set 2:
SEQ ID NO:259
Primer 1: 5'-GAGGTTGACTGTGTAAATG-3',
SEQ ID NO:260
Primer 2: 5'-GATACCAGAGTCACACAACAG-3'
Primer set 3:
SEQ ID NO:261
Primer 1: 5'-TCTACATTAATTCTCTTGTGC-3',
SEQ ID NO:262
Primer 2: 5'-GATACCAGAGTCACACAACAG-3'
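In each primer set, one primer lies in the inserted LTR sequence, so a product is expected only when integration has occurred. The following is a minimal sketch of how a primer can be placed on a template by exact matching on either strand. It uses the last 35 nt of the HIV-1 3' LTR listing (gb|U37267.1) as the template. The function names are hypothetical, and exact matching without mismatches is an assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_primer(primer: str, template: str):
    """Return (strand, 0-based start on the forward template) or None.

    '+' means the primer reads along the template as written;
    '-' means its reverse complement is found, so it anneals to the forward strand.
    """
    template = template.upper()
    pos = template.find(primer.upper())
    if pos != -1:
        return "+", pos
    pos = template.find(reverse_complement(primer))
    if pos != -1:
        return "-", pos
    return None


# Last 35 nt of the HIV-1 3' LTR listing (gb|U37267.1).
ltr_3prime_end = "GTGTGCCCGTCTGTTGTGTGACTCTGGTATCTAGA"
primer = "GATACCAGAGTCACACAACAG"  # primer 2 of primer sets 2 and 3

print(locate_primer(primer, ltr_3prime_end))
```

The primer is found on the minus strand, which is consistent with it priming outward from the 3' LTR toward flanking genomic sequence.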
Accession numbers for human CXCR4
Uniprot P61073
Ensembl gene ID: ENSG00000121966
Editing target sequence and PAM for CXCR4 (exon 2): used for sgRNA design 1
SEQ ID NO:263
GGGCAATGGATTGGTCATCC TGG
Key primers for detecting integration at the CXCR4 target
Primer set 1:
SEQ ID NO:264
Primer 1: 5'-TCTACATTAATTCTCTTGTGC-3',
SEQ ID NO:265
Primer 2: 5'-GACAAGACATCCTTGATTTG-3'
Primer set 2:
SEQ ID NO:266
Primer 1: 5'-TCTACATTAATTCTCTTGTGC-3',
SEQ ID NO:267
Primer 2: 5'-GATACCAGAGTCACACAACAG-3'
Primer set 3:
SEQ ID NO:268
Primer 1: 5'-GAGGTTGACTGTGTAAATG-3',
SEQ ID NO:269
Primer 2: 5'-GACAAGACATCCTTGATTTG-3'
Primer set 4:
SEQ ID NO:270
Primer 1: 5'-GAGGTTGACTGTGTAAATG-3',
SEQ ID NO:271
Primer 2: 5'-GATACCAGAGTCACACAACAG-3'
Avi-tagged Cas9 for biotinylation
Sequence of the Avi tag used for Cas9 biotinylation
Amino acid sequence:
SEQ ID NO:272
G G D L E G S G L N D I F E A Q K I E W H E *
Nucleic acid sequence:
SEQ ID NO:273
GGCGGCGACCTCGAGGGTAGCGGTCTGAACGATATTTTTGAAGCGCAGAAAATTGAATGGCATGAATAA
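The nucleic acid and amino acid listings for the Avi tag can be cross-checked by translating one into the other. The following is a minimal sketch that assumes the standard genetic code and reading frame 1. The function name `translate` is illustrative and the table is built locally instead of relying on any external library.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO:273 (nucleic acid) and SEQ ID NO:272 (amino acid, spaces removed).
avi_dna = "GGCGGCGACCTCGAGGGTAGCGGTCTGAACGATATTTTTGAAGCGCAGAAAATTGAATGGCATGAATAA"
avi_protein = "GGDLEGSGLNDIFEAQKIEWHE*"

print(translate(avi_dna) == avi_protein)
```

The same function can be applied to other DNA and protein pairs in the listing, provided the DNA is given in the coding orientation and in frame.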
Claims (6)
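1. A recombinant protein whose amino acid sequence is as set forth in SEQ ID NO:58.
2. A composition comprising the recombinant protein of claim 1 and a guide RNA of 12 to 30 bases.
3. A DNA vector or viral vector, the vector comprising a DNA sequence encoding the recombinant protein of claim 1 and a transcriptional promoter located upstream of the DNA sequence encoding the recombinant protein.
4. A non-therapeutic method comprising introducing the recombinant protein of claim 1 together with a guide RNA of 12 to 30 bases, or the DNA vector or viral vector of claim 3 together with a DNA sequence encoding a guide RNA of 12 to 30 bases, into a non-germ cell or an embryo of a non-human animal.
5. The method of claim 4, wherein the introducing is performed by transfection, viral infection, injection, or electroporation.
6. A non-therapeutic method of inserting a DNA sequence into genomic DNA, the method comprising:
a) identifying a target sequence in the genomic DNA;
b) providing the recombinant protein of claim 1 and a guide RNA of 12 to 30 bases that binds the target sequence in the genomic DNA;
c) designing a DNA sequence of interest to be incorporated into the genomic DNA; and
d) providing the recombinant protein, the guide RNA, and the DNA sequence of interest to a non-germ cell or non-human organism by a technique that allows the recombinant protein, the guide RNA, and the DNA sequence of interest to enter the non-germ cell or non-human organism; wherein the DNA sequence of interest is integrated at the target sequence in the genomic DNA.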
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562140454P | 2015-03-31 | 2015-03-31 | |
US62/140,454 | 2015-03-31 | ||
US201562210451P | 2015-08-27 | 2015-08-27 | |
US62/210,451 | 2015-08-27 | ||
US201562240359P | 2015-10-12 | 2015-10-12 | |
US62/240,359 | 2015-10-12 | ||
PCT/US2016/025426 WO2016161207A1 (en) | 2015-03-31 | 2016-03-31 | Cas 9 retroviral integrase and cas 9 recombinase systems for targeted incorporation of a dna sequence into a genome of a cell or organism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108124453A (zh) | 2018-06-05 |
CN108124453B (zh) | 2022-04-05 |
Family
ID=55745849
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201680031466.5A (Active, CN108124453B) | Cas9 retroviral integrase and Cas9 recombinase systems for targeted incorporation of a DNA sequence into a genome of a cell or organism | 2015-03-31 | 2016-03-31 |
Country Status (6)
Country | Link |
---|---|
US (2) | US20180080051A1 (zh) |
EP (1) | EP3277805A1 (zh) |
JP (3) | JP2018513681A (zh) |
KR (1) | KR20180029953A (zh) |
CN (1) | CN108124453B (zh) |
WO (1) | WO2016161207A1 (zh) |
Families Citing this family (58)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2012333134B2 (en) | 2011-07-22 | 2017-05-25 | John Paul Guilinger | Evaluation and improvement of nuclease cleavage specificity |
US20150044192A1 (en) | 2013-08-09 | 2015-02-12 | President And Fellows Of Harvard College | Methods for identifying a target site of a cas9 nuclease |
US9359599B2 (en) | 2013-08-22 | 2016-06-07 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US9388430B2 (en) | 2013-09-06 | 2016-07-12 | President And Fellows Of Harvard College | Cas9-recombinase fusion proteins and uses thereof |
US9340799B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | MRNA-sensing switchable gRNAs |
US9526784B2 (en) | 2013-09-06 | 2016-12-27 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US20150165054A1 (en) | 2013-12-12 | 2015-06-18 | President And Fellows Of Harvard College | Methods for correcting caspase-9 point mutations |
AU2015298571B2 (en) | 2014-07-30 | 2020-09-03 | President And Fellows Of Harvard College | Cas9 proteins including ligand-dependent inteins |
WO2016073990A2 (en) | 2014-11-07 | 2016-05-12 | Editas Medicine, Inc. | Methods for improving crispr/cas-mediated genome-editing |
CA2986310A1 (en) | 2015-05-11 | 2016-11-17 | Editas Medicine, Inc. | Optimized crispr/cas9 systems and methods for gene editing in stem cells |
EP3307887A1 (en) | 2015-06-09 | 2018-04-18 | Editas Medicine, Inc. | Crispr/cas-related methods and compositions for improving transplantation |
US11667911B2 (en) | 2015-09-24 | 2023-06-06 | Editas Medicine, Inc. | Use of exonucleases to improve CRISPR/CAS-mediated genome editing |
IL310721A (en) | 2015-10-23 | 2024-04-01 | Harvard College | Nucleobase editors and their uses |
EP3433363A1 (en) | 2016-03-25 | 2019-01-30 | Editas Medicine, Inc. | Genome editing systems comprising repair-modulating enzyme molecules and methods of their use |
EP4047092A1 (en) | 2016-04-13 | 2022-08-24 | Editas Medicine, Inc. | Cas9 fusion molecules, gene editing systems, and methods of use thereof |
GB201610041D0 (en) * | 2016-06-08 | 2016-07-20 | Oxford Genetics Ltd | Methods |
IL308426A (en) | 2016-08-03 | 2024-01-01 | Harvard College | Adenosine nuclear base editors and their uses |
CA3033327A1 (en) | 2016-08-09 | 2018-02-15 | President And Fellows Of Harvard College | Programmable cas9-recombinase fusion proteins and uses thereof |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
WO2018053053A1 (en) * | 2016-09-13 | 2018-03-22 | The Broad Institute, Inc. | Proximity-dependent biotinylation and uses thereof |
CN118726313A (zh) | 2016-10-07 | 2024-10-01 | 综合Dna技术公司 | 化脓链球菌cas9突变基因和由其编码的多肽 |
US11242542B2 (en) | 2016-10-07 | 2022-02-08 | Integrated Dna Technologies, Inc. | S. pyogenes Cas9 mutant genes and polypeptides encoded by same |
AU2017342543B2 (en) | 2016-10-14 | 2024-06-27 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
WO2018098383A1 (en) * | 2016-11-22 | 2018-05-31 | Integrated Dna Technologies, Inc. | Crispr/cpf1 systems and methods |
WO2018111947A1 (en) | 2016-12-12 | 2018-06-21 | Integrated Dna Technologies, Inc. | Genome editing enhancement |
US11278570B2 (en) | 2016-12-16 | 2022-03-22 | B-Mogen Biotechnologies, Inc. | Enhanced hAT family transposon-mediated gene transfer and associated compositions, systems, and methods |
EP3555273B1 (en) | 2016-12-16 | 2024-05-22 | B-Mogen Biotechnologies, Inc. | ENHANCED hAT FAMILY TRANSPOSON-MEDIATED GENE TRANSFER AND ASSOCIATED COMPOSITIONS, SYSTEMS, AND METHODS |
WO2018119359A1 (en) | 2016-12-23 | 2018-06-28 | President And Fellows Of Harvard College | Editing of ccr5 receptor gene to protect against hiv infection |
JP7229923B2 (ja) | 2017-01-06 | 2023-02-28 | Editas Medicine, Inc. | Methods for assessing nuclease cleavage |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
JP2020510439A (ja) | 2017-03-10 | 2020-04-09 | President and Fellows of Harvard College | Cytosine to guanine base editor |
IL306092A (en) | 2017-03-23 | 2023-11-01 | Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
US11499151B2 (en) | 2017-04-28 | 2022-11-15 | Editas Medicine, Inc. | Methods and systems for analyzing guide RNA molecules |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
CA3065813A1 (en) | 2017-06-09 | 2018-12-13 | Editas Medicine, Inc. | Engineered cas9 nucleases |
US11168322B2 (en) | 2017-06-30 | 2021-11-09 | Arbor Biotechnologies, Inc. | CRISPR RNA targeting enzymes and systems and uses thereof |
JP7562931B2 (ja) * | 2017-07-07 | 2024-10-08 | ToolGen Incorporated | Target-specific CRISPR variants |
US11866726B2 (en) | 2017-07-14 | 2024-01-09 | Editas Medicine, Inc. | Systems and methods for targeted integration and genome editing and detection thereof using integrated priming sites |
CN111801345A (zh) | 2017-07-28 | 2020-10-20 | President and Fellows of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
WO2019139645A2 (en) | 2017-08-30 | 2019-07-18 | President And Fellows Of Harvard College | High efficiency base editors comprising gam |
AU2018352592A1 (en) | 2017-10-16 | 2020-06-04 | Beam Therapeutics, Inc. | Uses of adenosine base editors |
WO2019090174A1 (en) * | 2017-11-02 | 2019-05-09 | Arbor Biotechnologies, Inc. | Novel crispr-associated transposon systems and components |
CN112543808A (zh) | 2018-06-21 | 2021-03-23 | B-Mogen Biotechnologies, Inc. | Enhanced hAT family transposon-mediated gene transfer and associated compositions, systems, and methods |
BR112021007503A2 (pt) * | 2018-10-22 | 2021-11-03 | Univ Rochester | Fusion protein, nucleic acid molecule, method for editing genetic material, and systems for editing genetic material and for delivering genome editing components |
WO2020131862A1 (en) * | 2018-12-17 | 2020-06-25 | The Broad Institute, Inc. | Crispr-associated transposase systems and methods of use thereof |
AU2020242032A1 (en) | 2019-03-19 | 2021-10-07 | Massachusetts Institute Of Technology | Methods and compositions for editing nucleotide sequences |
KR20220030945A (ko) * | 2019-05-23 | 2022-03-11 | Christiana Care Health Services, Inc. | Gene knockout of NRF2 for treatment of cancer |
KR20220018504A (ko) | 2019-05-23 | 2022-02-15 | Christiana Care Health Services, Inc. | Gene knockout of variant NRF2 for treatment of cancer |
WO2020243085A1 (en) * | 2019-05-24 | 2020-12-03 | The Trustees Of Columbia University In The City Of New York | Engineered cas-transposon system for programmable and site-directed dna transpositions |
CA3141422A1 (en) * | 2019-06-11 | 2020-12-17 | Avencia Sanchez-mejias Garcia | Targeted gene editing constructs and methods of using the same |
WO2021097118A1 (en) * | 2019-11-12 | 2021-05-20 | The Broad Institute, Inc. | Small type ii cas proteins and methods of use thereof |
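CN115867650A (zh) * | 2020-04-20 | 2023-03-28 | ChristianaCare Gene Editing Institute, Inc. | AAV delivery system for lung cancer treatment |
IL297761A (en) | 2020-05-08 | 2022-12-01 | Broad Inst Inc | Methods and compositions for simultaneously editing both strands of a target double-stranded nucleotide sequence |
KR102679172B1 (ko) * | 2020-07-06 | 2024-06-28 | Korea Institute of Science and Technology | Complex that regulates the activity of a cell physiological activity-regulating substance by disease cell-specific miRNA, and complex for disease-specific gene manipulation applying the same to the CRISPR/Cas system |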
EP4180460A4 (en) * | 2020-07-10 | 2024-10-30 | Inst Zoology Cas | NUCLEIC ACID EDITING SYSTEM AND METHOD |
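CN112159822A (zh) * | 2020-09-30 | 2021-01-01 | Yangzhou University | PS transposase and CRISPR/dCpf1 fusion protein expression vector and site-directed integration method mediated thereby |
MX2023007030A (es) * | 2020-12-16 | 2023-08-21 | Univ Pompeu Fabra | Programmable transposases and uses thereof |
CN118497204A (zh) * | 2024-07-16 | 2024-08-16 | Shenzhen Research Institute of Northwest A&F University | Plasmid CRISPR-pCas9n, and gene editing method and application thereof |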
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104404036A (zh) * | 2014-11-03 | 2015-03-11 | Cyagen Biosciences (Suzhou) Inc. | Conditional gene knockout method based on CRISPR/Cas9 technology |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6080849A (en) | 1997-09-10 | 2000-06-27 | Vion Pharmaceuticals, Inc. | Genetically modified tumor-targeted bacteria with reduced virulence |
AU746454B2 (en) | 1998-03-02 | 2002-05-02 | Massachusetts Institute Of Technology | Poly zinc finger proteins with improved linkers |
US20040003420A1 (en) | 2000-11-10 | 2004-01-01 | Ralf Kuhn | Modified recombinase |
GB0400814D0 (en) | 2004-01-14 | 2004-02-18 | Ark Therapeutics Ltd | Integrating gene therapy vector |
US20060252140A1 (en) * | 2005-04-29 | 2006-11-09 | Yant Stephen R | Development of a transposon system for site-specific DNA integration in mammalian cells |
EP2765195A1 (en) | 2006-05-25 | 2014-08-13 | Sangamo BioSciences, Inc. | Methods and compositions for gene inactivation |
BRPI0720048A8 (pt) | 2006-12-14 | 2023-02-07 | Dow Agrosciences Llc | Optimized non-canonical zinc finger proteins |
US8816153B2 (en) | 2010-08-27 | 2014-08-26 | Monsanto Technology Llc | Recombinant DNA constructs employing site-specific recombination |
ES2636902T3 (es) | 2012-05-25 | 2017-10-10 | The Regents Of The University Of California | Methods and compositions for RNA-directed target DNA modification and for RNA-directed modulation of transcription |
CN103668470B (zh) | 2012-09-12 | 2015-07-29 | Shanghai Sidansai Biotechnology Co., Ltd. | DNA library and method for constructing transcription activator-like effector nuclease plasmids |
US8697359B1 (en) | 2012-12-12 | 2014-04-15 | The Broad Institute, Inc. | CRISPR-Cas systems and methods for altering expression of gene products |
WO2014093694A1 (en) | 2012-12-12 | 2014-06-19 | The Broad Institute, Inc. | Crispr-cas nickase systems, methods and compositions for sequence manipulation in eukaryotes |
US9708589B2 (en) * | 2012-12-18 | 2017-07-18 | Monsanto Technology Llc | Compositions and methods for custom site-specific DNA recombinases |
WO2014134412A1 (en) | 2013-03-01 | 2014-09-04 | Regents Of The University Of Minnesota | Talen-based gene correction |
AU2014235794A1 (en) | 2013-03-14 | 2015-10-22 | Caribou Biosciences, Inc. | Compositions and methods of nucleic acid-targeting nucleic acids |
EP3004339B1 (en) * | 2013-05-29 | 2021-07-07 | Cellectis | New compact scaffold of cas9 in the type ii crispr system |
AU2014273490B2 (en) * | 2013-05-29 | 2019-05-09 | Cellectis | Methods for engineering T cells for immunotherapy by using RNA-guided Cas nuclease system |
US9388430B2 (en) * | 2013-09-06 | 2016-07-12 | President And Fellows Of Harvard College | Cas9-recombinase fusion proteins and uses thereof |
US9790490B2 (en) * | 2015-06-18 | 2017-10-17 | The Broad Institute Inc. | CRISPR enzymes and systems |
-
2016
- 2016-03-31 CN CN201680031466.5A patent/CN108124453B/zh active Active
- 2016-03-31 JP JP2017552079A patent/JP2018513681A/ja active Pending
- 2016-03-31 WO PCT/US2016/025426 patent/WO2016161207A1/en active Application Filing
- 2016-03-31 EP EP16715977.1A patent/EP3277805A1/en active Pending
- 2016-03-31 KR KR1020177031337A patent/KR20180029953A/ko not_active Application Discontinuation
- 2016-03-31 US US15/563,657 patent/US20180080051A1/en not_active Abandoned
-
2021
- 2021-02-11 US US17/173,494 patent/US20220315952A1/en active Pending
- 2021-07-01 JP JP2021110120A patent/JP2021176301A/ja active Pending
-
2023
- 2023-07-26 JP JP2023121860A patent/JP2023156355A/ja active Pending
Non-Patent Citations (3)
Title |
---|
GenBank ID: ABR68182.1, integrase, partial [Human immunodeficiency virus 1]; integrase, partial; GenBank; 2007-07-08; see page 1 * |
GenBank ID: WP_003079701.1, CRISPR-associated protein Csn1 [Streptococcus macacae]; CRISPR-associated protein Csn1; GenBank; 2013-05-27; see page 1 * |
Tethering human immunodeficiency virus type 1 preintegration complexes to target DNA promotes integration at nearby sites; Frederic D. Bushman et al.; Journal of Virology; 1997-01-31; Vol. 71, No. 1; see entire document * |
Also Published As
Publication number | Publication date |
---|---|
US20220315952A1 (en) | 2022-10-06 |
CN108124453A (zh) | 2018-06-05 |
KR20180029953A (ko) | 2018-03-21 |
JP2023156355A (ja) | 2023-10-24 |
JP2018513681A (ja) | 2018-05-31 |
JP2021176301A (ja) | 2021-11-11 |
US20180080051A1 (en) | 2018-03-22 |
WO2016161207A1 (en) | 2016-10-06 |
EP3277805A1 (en) | 2018-02-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108124453B (zh) | Cas9 retroviral integrase and Cas9 recombinase systems for targeted incorporation of a DNA sequence into the genome of a cell or organism | |
US11555181B2 (en) | Engineered cascade components and cascade complexes | |
US20210403861A1 (en) | Nucleotide-specific recognition sequences for designer tal effectors | |
US20220025347A1 (en) | Variants of CRISPR from Prevotella and Francisella 1 (Cpf1) | |
AU2021245148B2 (en) | Using nucleosome interacting protein domains to enhance targeted genome modification | |
AU2019222568B2 (en) | Engineered Cas9 systems for eukaryotic genome modification | |
KR20190005801A (ko) | Target-specific CRISPR variants | |
JP2020516255A (ja) | Systems and methods for genome editing | |
US11332749B2 (en) | Real-time reporter systems for monitoring base editing | |
WO2017107898A2 (en) | Compositions and methods for gene editing | |
AU2017302657A1 (en) | Mice comprising mutations resulting in expression of c-truncated fibrillin-1 | |
US20210363206A1 (en) | Proteins that inhibit cas12a (cpf1), a cripr-cas nuclease | |
JP2022534560A (ja) | Non-human animals comprising a humanized albumin locus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20240226. Patentee after: SOHM Co. (address after: California, USA; country or region after: U.S.A.). Patentee before: EXELIGEN SCIENTIFIC, Inc. (address before: California, USA; country or region before: U.S.A.) |