WO2022247873A1 - Engineered Cas12i nucleases, effector proteins, and uses thereof - Google Patents
- Publication number
- WO2022247873A1 (application PCT/CN2022/095072)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cas12i
- nuclease
- engineered
- amino acid
- amino acids
- 244000025254 Cannabis sativa Species 0.000 description 1
- 235000012766 Cannabis sativa ssp. sativa var. sativa Nutrition 0.000 description 1
- 235000012765 Cannabis sativa ssp. sativa var. spontanea Nutrition 0.000 description 1
- 206010007509 Cardiac amyloidosis Diseases 0.000 description 1
- 244000020518 Carthamus tinctorius Species 0.000 description 1
- 235000003255 Carthamus tinctorius Nutrition 0.000 description 1
- 108700004991 Cas12a Proteins 0.000 description 1
- 108010035563 Chloramphenicol O-acetyltransferase Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- 206010010144 Completed suicide Diseases 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 201000003883 Cystic fibrosis Diseases 0.000 description 1
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- 108010080611 Cytosine Deaminase Proteins 0.000 description 1
- 102000000311 Cytosine Deaminase Human genes 0.000 description 1
- 108010054814 DNA Gyrase Proteins 0.000 description 1
- 102100040263 DNA dC->dU-editing enzyme APOBEC-3A Human genes 0.000 description 1
- 102100040262 DNA dC->dU-editing enzyme APOBEC-3B Human genes 0.000 description 1
- 102100040261 DNA dC->dU-editing enzyme APOBEC-3C Human genes 0.000 description 1
- 102100040266 DNA dC->dU-editing enzyme APOBEC-3F Human genes 0.000 description 1
- 102100038050 DNA dC->dU-editing enzyme APOBEC-3H Human genes 0.000 description 1
- 101710082737 DNA dC->dU-editing enzyme APOBEC-3H Proteins 0.000 description 1
- 230000005778 DNA damage Effects 0.000 description 1
- 231100000277 DNA damage Toxicity 0.000 description 1
- 238000010442 DNA editing Methods 0.000 description 1
- 101710150423 DNA nickase Proteins 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 1
- 241000252212 Danio rerio Species 0.000 description 1
- 108010046331 Deoxyribodipyrimidine photo-lyase Proteins 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 206010061818 Disease progression Diseases 0.000 description 1
- 241001653748 Doryteuthis pealeii Species 0.000 description 1
- 108700019024 Drosophila Adar Proteins 0.000 description 1
- 241000590568 Dynamine Species 0.000 description 1
- 102100037964 E3 ubiquitin-protein ligase RING2 Human genes 0.000 description 1
- 238000002965 ELISA Methods 0.000 description 1
- 240000003133 Elaeis guineensis Species 0.000 description 1
- 235000001950 Elaeis guineensis Nutrition 0.000 description 1
- 241000283070 Equus zebra Species 0.000 description 1
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108091092566 Extrachromosomal DNA Proteins 0.000 description 1
- 108010042634 F2A4-K-NS peptide Proteins 0.000 description 1
- 241000282324 Felis Species 0.000 description 1
- 208000001914 Fragile X syndrome Diseases 0.000 description 1
- 208000024412 Friedreich ataxia Diseases 0.000 description 1
- 201000011240 Frontotemporal dementia Diseases 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 206010064571 Gene mutation Diseases 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- 229930191978 Gibberellin Natural products 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 208000032007 Glycogen storage disease due to acid maltase deficiency Diseases 0.000 description 1
- 206010053185 Glycogen storage disease type II Diseases 0.000 description 1
- 241000219146 Gossypium Species 0.000 description 1
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 1
- 108060003760 HNH nuclease Proteins 0.000 description 1
- 102000029812 HNH nuclease Human genes 0.000 description 1
- 244000020551 Helianthus annuus Species 0.000 description 1
- 235000003222 Helianthus annuus Nutrition 0.000 description 1
- 102100031573 Hematopoietic progenitor cell antigen CD34 Human genes 0.000 description 1
- 108091005904 Hemoglobin subunit beta Proteins 0.000 description 1
- 206010019860 Hereditary angioedema Diseases 0.000 description 1
- 206010019889 Hereditary neuropathic amyloidosis Diseases 0.000 description 1
- 108091027305 Heteroduplex Proteins 0.000 description 1
- 101710159508 Histone-lysine N-methyltransferase SETD7 Proteins 0.000 description 1
- 102100027704 Histone-lysine N-methyltransferase SETD7 Human genes 0.000 description 1
- 102100023823 Homeobox protein EMX1 Human genes 0.000 description 1
- 101000797006 Homo sapiens Adenosine deaminase domain-containing protein 1 Proteins 0.000 description 1
- 101000796996 Homo sapiens Adenosine deaminase domain-containing protein 2 Proteins 0.000 description 1
- 101000964322 Homo sapiens C->U-editing enzyme APOBEC-2 Proteins 0.000 description 1
- 101000964378 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3A Proteins 0.000 description 1
- 101000964385 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3B Proteins 0.000 description 1
- 101000964383 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3C Proteins 0.000 description 1
- 101000964377 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3F Proteins 0.000 description 1
- 101001095815 Homo sapiens E3 ubiquitin-protein ligase RING2 Proteins 0.000 description 1
- 101000777663 Homo sapiens Hematopoietic progenitor cell antigen CD34 Proteins 0.000 description 1
- 101001048956 Homo sapiens Homeobox protein EMX1 Proteins 0.000 description 1
- 101000878605 Homo sapiens Low affinity immunoglobulin epsilon Fc receptor Proteins 0.000 description 1
- 101000615488 Homo sapiens Methyl-CpG-binding domain protein 2 Proteins 0.000 description 1
- 101000979342 Homo sapiens Nuclear factor NF-kappa-B p105 subunit Proteins 0.000 description 1
- 101001098868 Homo sapiens Proprotein convertase subtilisin/kexin type 9 Proteins 0.000 description 1
- 101000800426 Homo sapiens Putative C->U-editing enzyme APOBEC-4 Proteins 0.000 description 1
- 101100087363 Homo sapiens RBFOX2 gene Proteins 0.000 description 1
- 101000755690 Homo sapiens Single-stranded DNA cytosine deaminase Proteins 0.000 description 1
- 101000799057 Homo sapiens tRNA-specific adenosine deaminase 2 Proteins 0.000 description 1
- 240000005979 Hordeum vulgare Species 0.000 description 1
- 235000007340 Hordeum vulgare Nutrition 0.000 description 1
- 208000023105 Huntington disease Diseases 0.000 description 1
- 208000035150 Hypercholesterolemia Diseases 0.000 description 1
- 206010020751 Hypersensitivity Diseases 0.000 description 1
- 108010002350 Interleukin-2 Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 201000003533 Leber congenital amaurosis Diseases 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 235000004431 Linum usitatissimum Nutrition 0.000 description 1
- 240000006240 Linum usitatissimum Species 0.000 description 1
- 108091007460 Long intergenic noncoding RNA Proteins 0.000 description 1
- 102100038007 Low affinity immunoglobulin epsilon Fc receptor Human genes 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- 102100033448 Lysosomal alpha-glucosidase Human genes 0.000 description 1
- 208000024556 Mendelian disease Diseases 0.000 description 1
- 206010027476 Metastases Diseases 0.000 description 1
- 102100021299 Methyl-CpG-binding domain protein 2 Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 101100434310 Mus musculus Ada gene Proteins 0.000 description 1
- 101000777691 Mus musculus Cytidine and dCMP deaminase domain-containing protein 1 Proteins 0.000 description 1
- 101000912065 Mus musculus Cytidine deaminase Proteins 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 206010068871 Myotonic dystrophy Diseases 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- 238000000636 Northern blotting Methods 0.000 description 1
- 102100023050 Nuclear factor NF-kappa-B p105 subunit Human genes 0.000 description 1
- 108020005497 Nuclear hormone receptor Proteins 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 102000005520 O-GlcNAc transferase Human genes 0.000 description 1
- 108010077991 O-GlcNAc transferase Proteins 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 101100282746 Oryza sativa subsp. japonica GID1 gene Proteins 0.000 description 1
- 230000010718 Oxidation Activity Effects 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 241001631646 Papillomaviridae Species 0.000 description 1
- 244000115721 Pennisetum typhoides Species 0.000 description 1
- 235000007195 Pennisetum typhoides Nutrition 0.000 description 1
- 102000010292 Peptide Elongation Factor 1 Human genes 0.000 description 1
- 108010077524 Peptide Elongation Factor 1 Proteins 0.000 description 1
- 102100027913 Peptidyl-prolyl cis-trans isomerase FKBP1A Human genes 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 229920002873 Polyethylenimine Polymers 0.000 description 1
- 108010039918 Polylysine Proteins 0.000 description 1
- 108010071690 Prealbumin Proteins 0.000 description 1
- 241000283080 Proboscidea <mammal> Species 0.000 description 1
- 102100038955 Proprotein convertase subtilisin/kexin type 9 Human genes 0.000 description 1
- 102220470558 Proteasome subunit beta type-3_D32R_mutation Human genes 0.000 description 1
- 102000001253 Protein Kinase Human genes 0.000 description 1
- 241000125945 Protoparvovirus Species 0.000 description 1
- 102100033091 Putative C->U-editing enzyme APOBEC-4 Human genes 0.000 description 1
- 108091093078 Pyrimidine dimer Proteins 0.000 description 1
- 102100038187 RNA binding protein fox-1 homolog 2 Human genes 0.000 description 1
- 230000026279 RNA modification Effects 0.000 description 1
- 108091030071 RNAI Proteins 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 101100156295 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) VID30 gene Proteins 0.000 description 1
- 244000000231 Sesamum indicum Species 0.000 description 1
- 235000003434 Sesamum indicum Nutrition 0.000 description 1
- 235000008515 Setaria glauca Nutrition 0.000 description 1
- 208000018020 Sickle cell-beta-thalassemia disease syndrome Diseases 0.000 description 1
- 241000700584 Simplexvirus Species 0.000 description 1
- 102100022433 Single-stranded DNA cytosine deaminase Human genes 0.000 description 1
- 101710143275 Single-stranded DNA cytosine deaminase Proteins 0.000 description 1
- 108091027967 Small hairpin RNA Proteins 0.000 description 1
- 108020004459 Small interfering RNA Proteins 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 244000062793 Sorghum vulgare Species 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 101000910035 Streptococcus pyogenes serotype M1 CRISPR-associated endonuclease Cas9/Csn1 Proteins 0.000 description 1
- 101710124574 Synaptotagmin-1 Proteins 0.000 description 1
- 230000006044 T cell activation Effects 0.000 description 1
- 101150091380 TTR gene Proteins 0.000 description 1
- 108010006877 Tacrolimus Binding Protein 1A Proteins 0.000 description 1
- 108091046869 Telomeric non-coding RNA Proteins 0.000 description 1
- 206010043391 Thalassaemia beta Diseases 0.000 description 1
- 108700019146 Transgenes Proteins 0.000 description 1
- 102000009190 Transthyretin Human genes 0.000 description 1
- 206010044625 Trichorrhexis Diseases 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 244000098338 Triticum aestivum Species 0.000 description 1
- 108010056354 Ubiquitin C Proteins 0.000 description 1
- 102000006275 Ubiquitin-Protein Ligases Human genes 0.000 description 1
- 108010083111 Ubiquitin-Protein Ligases Proteins 0.000 description 1
- 206010046865 Vaccinia virus infection Diseases 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 108020002494 acetyltransferase Proteins 0.000 description 1
- 102000005421 acetyltransferase Human genes 0.000 description 1
- 230000004721 adaptive immunity Effects 0.000 description 1
- 230000012136 adenosine to inosine editing Effects 0.000 description 1
- 230000006154 adenylylation Effects 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 150000001298 alcohols Chemical class 0.000 description 1
- 230000029936 alkylation Effects 0.000 description 1
- 238000005804 alkylation reaction Methods 0.000 description 1
- 208000026935 allergic disease Diseases 0.000 description 1
- 230000007815 allergy Effects 0.000 description 1
- 230000000735 allogeneic effect Effects 0.000 description 1
- 208000006682 alpha 1-Antitrypsin Deficiency Diseases 0.000 description 1
- OYTKINVCDFNREN-UHFFFAOYSA-N amifampridine Chemical compound NC1=CC=NC=C1N OYTKINVCDFNREN-UHFFFAOYSA-N 0.000 description 1
- 206010002026 amyotrophic lateral sclerosis Diseases 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 239000000823 artificial membrane Substances 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 229940009098 aspartate Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 208000005980 beta thalassemia Diseases 0.000 description 1
- 108010005774 beta-Galactosidase Proteins 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 238000004166 bioassay Methods 0.000 description 1
- 238000010256 biochemical assay Methods 0.000 description 1
- 230000008436 biogenesis Effects 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 238000010170 biological method Methods 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 238000001574 biopsy Methods 0.000 description 1
- 210000000601 blood cell Anatomy 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 210000004900 c-terminal fragment Anatomy 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 235000009120 camo Nutrition 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 210000004413 cardiac myocyte Anatomy 0.000 description 1
- 108020001778 catalytic domains Proteins 0.000 description 1
- 229920006317 cationic polymer Polymers 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 235000005607 chanvre indien Nutrition 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 235000013330 chicken meat Nutrition 0.000 description 1
- 208000020832 chronic kidney disease Diseases 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 230000008045 co-localization Effects 0.000 description 1
- 238000001246 colloidal dispersion Methods 0.000 description 1
- 239000000084 colloidal system Substances 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 239000000306 component Substances 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 208000029078 coronary artery disease Diseases 0.000 description 1
- 229960000956 coumarin Drugs 0.000 description 1
- 235000001671 coumarin Nutrition 0.000 description 1
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical group O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 1
- 102000003675 cytokine receptors Human genes 0.000 description 1
- 108010057085 cytokine receptors Proteins 0.000 description 1
- 210000001151 cytotoxic T lymphocyte Anatomy 0.000 description 1
- GDPJWJXLKPPEKK-SJAYXVESSA-N dT4 Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)O[C@@H]2[C@H](O[C@H](C2)N2C(NC(=O)C(C)=C2)=O)COP(O)(=O)O[C@@H]2[C@H](O[C@H](C2)N2C(NC(=O)C(C)=C2)=O)COP(O)(=O)O[C@@H]2[C@H](O[C@H](C2)N2C(NC(=O)C(C)=C2)=O)CO)[C@@H](O)C1 GDPJWJXLKPPEKK-SJAYXVESSA-N 0.000 description 1
- 238000000354 decomposition reaction Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 230000022811 deglycosylation Effects 0.000 description 1
- 238000002716 delivery method Methods 0.000 description 1
- 230000017858 demethylation Effects 0.000 description 1
- 238000010520 demethylation reaction Methods 0.000 description 1
- 230000006114 demyristoylation Effects 0.000 description 1
- 239000000412 dendrimer Substances 0.000 description 1
- 229920000736 dendritic polymer Polymers 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000027832 depurination Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000029180 desumoylation Effects 0.000 description 1
- 229910052805 deuterium Inorganic materials 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000005750 disease progression Effects 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 210000001671 embryonic stem cell Anatomy 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 230000005966 endogenous activation Effects 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000005713 exacerbation Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000002376 fluorescence recovery after photobleaching Methods 0.000 description 1
- 239000004459 forage Substances 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 238000001476 gene delivery Methods 0.000 description 1
- 230000009368 gene silencing by RNA Effects 0.000 description 1
- IXORZMNAPKEEDV-UHFFFAOYSA-N gibberellic acid GA3 Natural products OC(=O)C1C2(C3)CC(=C)C3(O)CCC2C2(C=CC3O)C1C3(C)C(=O)O2 IXORZMNAPKEEDV-UHFFFAOYSA-N 0.000 description 1
- 239000003448 gibberellin Substances 0.000 description 1
- 229930195712 glutamate Natural products 0.000 description 1
- 201000004502 glycogen storage disease II Diseases 0.000 description 1
- 210000002443 helper t lymphocyte Anatomy 0.000 description 1
- 210000003958 hematopoietic stem cell Anatomy 0.000 description 1
- 239000011487 hemp Substances 0.000 description 1
- 210000002767 hepatic artery Anatomy 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 239000003688 hormone derivative Substances 0.000 description 1
- 102000043770 human ADAR Human genes 0.000 description 1
- 102000047345 human ADAT2 Human genes 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 238000000530 impalefection Methods 0.000 description 1
- 238000000126 in silico method Methods 0.000 description 1
- 239000012678 infectious agent Substances 0.000 description 1
- 238000001802 infusion Methods 0.000 description 1
- 238000010253 intravenous injection Methods 0.000 description 1
- 230000005865 ionizing radiation Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000011005 laboratory method Methods 0.000 description 1
- 230000029226 lipidation Effects 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 238000001638 lipofection Methods 0.000 description 1
- 210000005229 liver cell Anatomy 0.000 description 1
- 244000144972 livestock Species 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- 230000009401 metastasis Effects 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 108091070501 miRNA Proteins 0.000 description 1
- 239000002679 microRNA Substances 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 210000000663 muscle cell Anatomy 0.000 description 1
- 238000002703 mutagenesis Methods 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- 230000007498 myristoylation Effects 0.000 description 1
- 210000004898 n-terminal fragment Anatomy 0.000 description 1
- 239000002088 nanocapsule Substances 0.000 description 1
- 210000003061 neural cell Anatomy 0.000 description 1
- 210000001178 neural stem cell Anatomy 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 102000006255 nuclear receptors Human genes 0.000 description 1
- 108020004017 nuclear receptors Proteins 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- JMANVNJQNLATNU-UHFFFAOYSA-N oxalonitrile Chemical compound N#CC#N JMANVNJQNLATNU-UHFFFAOYSA-N 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 108010079892 phosphoglycerol kinase Proteins 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 238000000053 physical method Methods 0.000 description 1
- 230000035790 physiological processes and functions Effects 0.000 description 1
- 208000030683 polygenic disease Diseases 0.000 description 1
- 229920000656 polylysine Polymers 0.000 description 1
- 210000003240 portal vein Anatomy 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 239000000047 product Substances 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 108060006633 protein kinase Proteins 0.000 description 1
- 108020001775 protein parts Proteins 0.000 description 1
- 210000001938 protoplast Anatomy 0.000 description 1
- 239000013635 pyrimidine dimer Substances 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 239000002924 silencing RNA Substances 0.000 description 1
- 210000004927 skin cell Anatomy 0.000 description 1
- 239000004055 small Interfering RNA Substances 0.000 description 1
- 230000006641 stabilisation Effects 0.000 description 1
- 238000011105 stabilization Methods 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 230000010741 sumoylation Effects 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 108010066587 tRNA Methyltransferases Proteins 0.000 description 1
- 102000018477 tRNA Methyltransferases Human genes 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 108091008023 transcriptional regulators Proteins 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- 210000004881 tumor cell Anatomy 0.000 description 1
- 210000003171 tumor-infiltrating lymphocyte Anatomy 0.000 description 1
- 238000002604 ultrasonography Methods 0.000 description 1
- 241001529453 unidentified herpesvirus Species 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 208000007089 vaccinia Diseases 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 208000027121 wild type ATTR amyloidosis Diseases 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Definitions
- Genome editing is an important and useful technique in genome research.
- Several systems are available for genome editing, including clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems, transcription activator-like effector nuclease (TALEN) systems, and zinc finger nuclease (ZFN) systems.
- CRISPR: clustered regularly interspaced short palindromic repeat
- TALEN: transcription activator-like effector nuclease
- ZFN: zinc finger nuclease
- The type II Cas9 system and the type V-A/B/E/J Cas12a/Cas12b/Cas12e/Cas12j systems have been exploited for genome editing and offer broad prospects for biomedical research.
- An engineered Cas12i nuclease comprising one or more mutations relative to a reference Cas12i nuclease, the mutations being selected from:
- the one or more amino acids that interact with PAM are one or more of the following amino acids: E176, K238, T447 and E563,
- amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
- the engineered Cas12i nuclease according to any one of items 1-3, wherein the replacement of one or more PAM-interacting amino acids in the reference Cas12i nuclease with a positively charged amino acid refers to one or more of the following substitutions: E176R, K238R, T447R and E563R;
- the one or more amino acids involved in opening the DNA double strand are one or more amino acids at the following positions: 163 and 164;
- amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are one or more amino acids at the following positions: 323, 327, 355, 359, 360, 361, 362, 388, 390, 391, 392, 393, 414, 417, 418, 421, 424, 425, 650, 652, 653, 696, 705, 708, 709, 751, 752, 755, 840, 848, 851, 856, 885, 897, 925, 926, 928, 929, 932, 1022.
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are one or more of the following amino acids: E323, L327, V355, G359, G360, K361, D362, L388, N390, N391, F392, K393, Q414, L417, L418, K421, Q424, Q425, S650, E652, G653, I696, K705, K708, E709, L751, S752, E755, N840, N848, S851, A856, Q885, M897, N925, I926, T928, G929, Y932, A1022;
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are one or more of the following amino acids: E323, D362, Q425, N925, I926, N391, Q424 and G929;
- amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
- the positively charged amino acid includes substitution by R or K, preferably, the positively charged amino acid is R.
- the engineered Cas12i nuclease comprises any of the following mutations or mutation combinations: (1) E323R; (2) D362R; (3) Q425R; (4) N925R; (5) I926R; (6) E323R (7) E323R and Q425R; (8) E323R and I926R; (9) Q425R and I926R; (10) D362R and I926R; (11) N925R and I926R; (12) E323R, D362R and Q425R; (13) E323R, D362R and I926R; (14) E323R, Q425R and I926R; (15) D362R, N925R and I926R; and (16) E323R, D362R, Q425R and I926R;
- amino acid position is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the one or more amino acids that interact with the DNA-RNA double helix are one or more of the following amino acids: G116, E117, A156, T159, S161, T301, I305, K306, T308, N312, F313, D427, K433, V438, N441, Q442, M852, L855, N861, Q865, E160, Q316, E319, Q320, E247, E343, E348, E349, N679, E683, E691, D782, E783, E797, E800, D853, S957, D958, G293, E294 and N297; G116, E117, A156, T159, E160, S161, E247, G293, E294, N297, T301, I305, K306, T308, N312, F313, Q316, E319, Q320, E343, E348, E349, D427, K433, V438, N441, Q442, N679, E683, E
- amino acid position is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the one or more polar or positively charged amino acids interacting with the DNA-RNA double helix are selected from amino acids at one or more of the following positions: 357, 394, 715, 719, 807, 844, 848, 857, 861, that is, one or more of the following amino acids: H357, K394, R715, R719, K807, K844, N848, R857, R861;
- amino acid position is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the flexible region mutation is located at position 439 and/or 926;
- the flexible region mutation is a mutation of L439 and/or I926;
- amino acid position is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- said one or more flexible region mutations comprise I926G, L439(L+G) or L439(L+GG);
- the one or more flexible region mutations comprise L439(L+G) or L439(L+GG);
- amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
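The flexible-region notation above can be made concrete with a small helper. This is a minimal sketch that assumes `L439(L+GG)` means "keep the leucine at position 439 and insert two glycines directly after it"; that reading is inferred from the notation, and the function name and the toy sequence are hypothetical (the toy sequence is not SEQ ID NO: 1).

```python
import re

# Assumed pattern for the "L439(L+GG)" style: original residue, 1-based
# position, then the retained residue, "+", and the inserted residues.
_INSERTION = re.compile(r"^([A-Z])(\d+)\(([A-Z])\+([A-Z]+)\)$")


def apply_insertion(seq: str, mutation: str) -> str:
    """Insert residues directly after a 1-based position, keeping the original residue."""
    m = _INSERTION.match(mutation)
    if m is None:
        raise ValueError(f"not an insertion mutation: {mutation!r}")
    wt, pos, kept, inserted = m.group(1), int(m.group(2)), m.group(3), m.group(4)
    if wt != kept:
        raise ValueError(f"{mutation!r}: retained residue differs from the original")
    if not 1 <= pos <= len(seq) or seq[pos - 1] != wt:
        raise ValueError(f"{mutation!r}: sequence does not have {wt} at position {pos}")
    # Everything up to and including the position, then the insert, then the rest.
    return seq[:pos] + inserted + seq[pos:]


toy = "MKLAV"  # toy sequence for illustration only
print(apply_insertion(toy, "L3(L+GG)"))  # prints MKLGGAV
```

The check against the residue at the stated position guards against applying a mutation to a sequence whose numbering does not match the one the notation was defined on.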
- the engineered Cas12i nuclease comprises any one or more of the following sets of mutations:
- amino acid position is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- An engineered Cas12i nuclease comprising any of the following mutations: (1) E176R, K238R, T447R, E563R and N164Y; (2) E176R, K238R, T447R, E563R and I926R; (3) N164Y, E323R and D362R; (4) E176R, K238R, T447R, E563R, E323R and D362R; (5) N164Y and I926R; (6) E176R, K238R, T447R, E563R, N164Y and I926R; (7) E176R, K238R, T447R, E563R, N164Y, E323R, and D362R; (8) E176R, K238R, T447R, E563R, N164Y, I926R, E323R, and D362R; (9) E176R, K238R, T447R, E56
- An engineered Cas12i effector protein comprising the engineered Cas12i nuclease or a functional derivative thereof according to any one of items 1 to 22;
- the engineered Cas12i nuclease or its functional derivative has enzymatic activity, or the engineered Cas12i nuclease or its functional derivative is an enzymatically inactive mutant.
- the engineered Cas12i effector protein as described in item 23, wherein said engineered Cas12i nuclease or its functional derivative is an enzymatically inactive mutant comprising one or more of the following mutations: D599A, E833A, S883A, H884A, R900A and D1019A; wherein said amino acid position is as defined in the corresponding amino acid position shown in SEQ ID NO: 1.
- the engineered Cas12i effector protein according to any one of items 23 to 25, further comprising a functional domain fused with the engineered Cas12i nuclease or a functional derivative thereof.
- the engineered Cas12i effector protein according to any one of items 23-27, the engineered Cas12i effector protein comprising: a first polypeptide comprising the N-terminal portion of the engineered Cas12i nuclease or a functional derivative thereof, and a second polypeptide comprising the C-terminal portion of the engineered Cas12i nuclease or a functional derivative thereof, wherein the first polypeptide and the second polypeptide are capable of associating with each other in the presence of a guide RNA comprising a guide sequence to form a clustered regularly interspaced short palindromic repeat (CRISPR) complex that specifically binds to a target nucleic acid comprising a target sequence complementary to the guide sequence;
- the first polypeptide comprises amino acid residues 1 to X of the N-terminal portion of the engineered Cas12i nuclease described in any one of items 1-22, and the second polypeptide comprises amino acid residue X+1 to the C-terminus of the engineered Cas12i nuclease described in any one of items 1-22;
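The split described above (residues 1 to X in the first polypeptide, residue X+1 to the C-terminus in the second) maps directly onto a string slice. The following is an illustrative sketch only; the function name and the toy sequence are hypothetical, and the text does not specify a value of X.

```python
from typing import Tuple


def split_nuclease(seq: str, x: int) -> Tuple[str, str]:
    """Split a protein sequence at 1-based residue X.

    Returns (residues 1..X, residues X+1..C-terminus).
    """
    if not 1 <= x < len(seq):
        raise ValueError("X must leave at least one residue in each part")
    return seq[:x], seq[x:]


n_part, c_part = split_nuclease("MKLAVTRE", 3)  # toy sequence, X = 3
print(n_part, c_part)  # prints MKL AVTRE
```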
- said first polypeptide and said second polypeptide each comprise a dimerization domain
- said first dimerization domain and said second dimerization domain associate with each other in the presence of an inducing agent.
- An engineered CRISPR-Cas12i system comprising:
- a guide RNA comprising a guide sequence complementary to a target sequence, or one or more nucleic acids encoding said guide RNA.
- the engineered Cas12i nuclease or the engineered Cas12i effector protein and the guide RNA can form a CRISPR complex, and the CRISPR complex specifically binds to a target nucleic acid comprising the target sequence and induces the target nucleic acid modification;
- the guide RNA is a crRNA comprising the guide sequence
- the engineered CRISPR-Cas12i system includes a precursor guide RNA array (array) encoding multiple crRNAs;
- the engineered Cas12i nuclease or the engineered Cas12i effector protein is a prime editor
- the guide RNA is a prime editing guide RNA (pegRNA).
- said one or more vectors are selected from the group consisting of retroviral vectors, lentiviral vectors, adenoviral vectors, adeno-associated virus vectors and herpes simplex virus vectors;
- the one or more vectors are adeno-associated virus AAV vectors
- the AAV vector also encodes the guide RNA.
- a method for detecting target nucleic acid in a sample comprising:
- a method of modifying a target nucleic acid comprising a target sequence comprising contacting said target nucleic acid with the engineered CRISPR-Cas12i system described in item 29 or 30;
- the method is performed in vitro, ex vivo or in vivo;
- the target nucleic acid is present in a cell
- the cells are bacterial cells, yeast cells, mammalian cells, plant cells or animal cells;
- the target nucleic acid is genomic DNA
- the engineered CRISPR-Cas12i system comprises a precursor guide RNA array encoding a plurality of crRNAs, wherein each crRNA comprises a different guide sequence.
- use of the engineered CRISPR-Cas12i system as described in item 29 or 30 in the preparation of a medicament for treating a disease or disorder associated with a target nucleic acid in an individual's cells; preferably, the disease or disorder is selected from the group consisting of: cancer, cardiovascular disease, genetic disease, autoimmune disease, metabolic disease, neurodegenerative disease, eye disease, bacterial infection and viral infection.
- a method of treating a disease or disorder associated with a target nucleic acid in a cell of an individual, comprising using the method of item 32 to modify the target nucleic acid in the cell of the individual, thereby treating the disease or disorder; preferably, said disease or disorder is selected from the group consisting of cancer, cardiovascular disease, genetic disease, autoimmune disease, metabolic disease, neurodegenerative disease, eye disease, bacterial infection and viral infection.
- a method of modifying a target nucleic acid comprising a target sequence comprising contacting said target nucleic acid with the engineered CRISPR-Cas12i system described in item 29 or 30.
- An engineered non-human animal comprising one or more engineered cells according to item 37.
- the Cas12i nuclease engineered in this application and its effector protein have higher activity, such as the catalytic efficiency of cutting nucleic acid substrates and the gene editing efficiency in cells.
- the engineered Cas12i nuclease in the present application has more excellent gene editing efficiency in mammalian cells (such as human cells) than existing conventional Cas gene editing tools;
- the gene editing efficiency of multiple sites (such as 62 sites) in the cell was tested, and it was found that the gene editing efficiency of 57 sites exceeded about 60%, and the average gene editing efficiency was close to 70%.
- the engineered Cas12i nuclease and its effector protein of the present application also have one or more of the following advantages: the protein is small (1,054aa), the crRNA component is simple, the PAM sequence is simple, and the protein itself can process the precursor crRNA.
- building on a highly active engineered Cas12i nuclease (such as SEQ ID NO.8), the application further provides artificially modified Cas12i nucleases with a lower off-target rate and higher specificity.
- Figure 1a Amino acids that interact with PAM in the reference Cas12i nuclease (wild-type Cas12i nuclease shown in SEQ ID NO.1) were replaced with positively charged amino acids to improve gene editing efficiency. As shown in the figure, the four mutants E176R, K238R, T447R, and E563R can significantly improve the gene editing efficiency in human 293T cells.
- Figure 1b Combining the amino acid mutations (E176R, K238R, T447R, E563R) that can significantly improve gene editing efficiency in Figure 1a, it was found that the combined mutants can exhibit higher gene editing efficiency in human 293T cells.
- Figure 2 The amino acid involved in opening the DNA double strand in the reference Cas12i nuclease is replaced with an amino acid with an aromatic ring to improve the efficiency of gene editing.
- mutants such as Q163F, Q163Y, Q163W, N164F, and N164Y can significantly improve the gene editing efficiency in human 293T cells.
- Figure 3a, 3b, 3c Amino acids in the reference Cas12i nuclease located in the RuvC domain and interacting with single-stranded DNA substrates were replaced with positively charged amino acids, thereby improving the efficiency of gene editing.
- mutants such as E323R, L327R, V355R, G359R, G360R, D362R, N391R, Q424R, Q425R, N925R, I926R, and G929R can significantly improve the gene editing efficiency in human 293T cells.
- Figure 3d Combining the efficiency-enhancing point mutations in Figures 3a and 3b, it was found that the combined mutants can exhibit higher gene editing efficiency in human 293T cells.
- Figure 3e Combining the efficiency-enhancing point mutations in Figures 3a and 3b with the mutations based on the principle of molecular flexibility (L439(L+GG), I926G), it was found that the combined mutants can exhibit higher gene editing efficiency in human 293T cells.
- Figure 4 Replacing amino acids that interact with the DNA-RNA duplex in the reference Cas12i enzyme with positively charged amino acids improves gene editing efficiency.
- the G116R, E117R, T159R, S161R, E319R, E343R, and D958R mutants can significantly improve the gene editing efficiency in human 293T cells.
- the D958R mutation is the best.
- Figure 5 Combining the high-efficiency mutants obtained from the three engineering strategies in Figures 1a to 3e with the mutations based on the principle of molecular flexibility (L439(L+GG), I926G), it was found that the combined mutants exhibit higher gene editing efficiency in 293T cells. The combination can greatly improve the efficiency of gene editing.
- the mutant with the best gene editing effect was selected and named CasXX for subsequent experiments.
- CasXX (SEQ ID NO:8) is a Cas12i engineered enzyme with E176R+K238R+T447R+E563R+N164Y+E323R+D362R mutation combination (reference Cas12i2 based on amino acid sequence SEQ ID NO:1).
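The substitution notation used for CasXX (original residue, position in SEQ ID NO: 1, new residue, joined by `+`) can be parsed and applied mechanically. This is a minimal sketch under that reading of the notation; the function names and the toy sequence are hypothetical, and the real SEQ ID NO: 1 sequence is not reproduced here.

```python
import re
from typing import List, Tuple

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Mutation combination stated in the text for CasXX.
CASXX = "E176R+K238R+T447R+E563R+N164Y+E323R+D362R"


def parse_combination(combo: str) -> List[Tuple[str, int, str]]:
    """Parse 'E176R+K238R' into [('E', 176, 'R'), ('K', 238, 'R')]."""
    parsed = []
    for token in combo.split("+"):
        m = _SUBSTITUTION.match(token.strip())
        if m is None:
            raise ValueError(f"not a substitution: {token!r}")
        parsed.append((m.group(1), int(m.group(2)), m.group(3)))
    return parsed


def apply_substitutions(seq: str, combo: str) -> str:
    """Apply point substitutions, checking the original residue at each 1-based position."""
    residues = list(seq)
    for wt, pos, new in parse_combination(combo):
        if not 1 <= pos <= len(residues) or residues[pos - 1] != wt:
            raise ValueError(f"sequence does not have {wt} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


print(len(parse_combination(CASXX)))           # prints 7
print(apply_substitutions("MEKTN", "E2R+N5Y"))  # toy sequence; prints MRKTY
```

Because substitutions do not change sequence length, they can be applied in any order; insertions such as `L439(L+GG)` would shift later positions and are not handled by this sketch.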
- Figure 6a Summary of gene editing efficiencies of CasXX at 62 human genomic loci.
- PAM NTTN.
- Figure 6b Comparison of gene editing efficiency between CasXX and AsCas12a, BhCas12b v4.
- Figure 6c Comparison of gene editing efficiency between CasXX and SpCas9, SaCas9, SaCas9-KKH.
- Figure 6d Statistics of gene editing efficiency of CasXX in the mouse Hepa1-6 cell line. It can be seen that CasXX exhibited powerful gene editing capabilities at 65 sites in mouse Hepa1-6 cells, with an average gene editing efficiency of more than 60%.
- Figure 7 shows the gene editing efficiency of CasXX at 64 human loci.
- the PAM sequences contained in these 64 sites cover all NNNN combinations: NTTN, NTAN, NTCN, NTGN, NATN, NAAN, NACN, NAGN, NCTN, NCAN, NCCN, NCGN, NGTN, NGAN, NGCN, NGGN.
- CasXX exhibited high gene editing efficiency in PAM sequences such as NTTN, NTAN, NTCN, NTGN, NATN, NAAN, NCTN, NCAN, and NGTN, and the average gene editing efficiency at these sites exceeded 40%.
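The 16 PAM classes listed above are all combinations of the two middle bases, with N (any base) at the first and fourth positions. The sketch below enumerates them in the order given in the text and checks whether a concrete four-nucleotide PAM belongs to a class; the function names are hypothetical and only the wildcard N is handled, not the full IUPAC code.

```python
from itertools import product
from typing import List


def pam_classes() -> List[str]:
    """All 16 classes of the form N-x-y-N, ordered T, A, C, G as in the text."""
    return [f"N{x}{y}N" for x, y in product("TACG", repeat=2)]


def matches_pam(site: str, pattern: str) -> bool:
    """True if a concrete PAM matches a pattern in which N stands for any base."""
    if len(site) != len(pattern):
        return False
    return all(p == "N" or p == s for s, p in zip(site.upper(), pattern.upper()))


print(len(pam_classes()))           # prints 16
print(matches_pam("ATTG", "NTTN"))  # prints True
print(matches_pam("ATAG", "NTTN"))  # prints False
```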
- Figure 8 shows the results of wild-type Cas12i2 protein (SEQ ID NO: 1) and CasXX protein cutting double-stranded DNA in vitro.
- the wild-type Cas12i2 protein only has partial cutting efficiency for double-stranded DNA containing NTTN PAM, but has almost no cutting activity for the rest of the PAM.
- CasXX protein showed high cutting efficiency for double-stranded DNA containing NTTN, NTAN, NTCN, NTGN, NATN, NAAN, NACN, NCTN, NCAN, NGTN, NGAN PAM.
- Figure 9 Homology alignment of the amino acid sequences of Cas12i2 (SEQ ID NO: 1) and Cas12i1 (SEQ ID NO: 13).
- the shaded amino acids represent the same amino acids in the two Cas12i proteins, and the amino acids marked in white boxes represent the amino acids in the two Cas12i proteins with similar properties.
- Figure 10 shows the use of GUIDE-Seq to detect the off-target effect of CasXX at the EMX1-7 target site.
- the numbers after each sequence represent the number of reads detected for that sequence.
- the first line of sequence is the reference target sequence (SEQ ID NO: 77), and the following sequences represent the on-target and off-target sequences, together with the number of reads enriched by double-stranded DNA tags at each of them.
- This figure shows that CasXX has off-target effects at the EMX1-7 site.
- Figure 11 shows the construction of new HF mutants by adding new point mutations or deleting original point mutations on the basis of the original sequence of CasXX.
- the experimental results showed that the single point mutants R857A, N861A, K807A, N848A, R715A, R719A, K394A, H357A and K844A, based on the CasXX sequence, could effectively reduce the indel ratios at EMX1-7-OT-1, EMX1-7-OT-2 and EMX1-7-OT-3.
- Figure 12 shows the construction of new HF mutants by adding new point mutations or deleting original point mutations on the basis of the original sequence of CasXX.
- the experimental results showed that the single point mutants R857A, N861A, K807A, N848A, R715A, R719A, K394A, H357A and K844A, based on the CasXX sequence, could effectively reduce the indel ratio when used with the off-target-mimicking crRNAs crRNA-RNF2-1-Mis-1/2, crRNA-RNF2-1-Mis-5/6, crRNA-RNF2-1-Mis-17/18 and crRNA-RNF2-1-Mis-19/20.
- FIG. 13 is an experimental process for screening mutants (CasXX-based HF mutants) that improve the specificity of gene editing using a fluorescent reporter system. 600ng of a plasmid encoding Cas protein, 300ng of a plasmid encoding crRNA and 100ng of a plasmid encoding mCherry were transfected into cells cultured in a 24-well cell culture dish. Three days after transfection, the proportion of remaining mCherry positivity for each sample was calculated using flow cytometry. The calculation formula is shown in the figure.
- Figure 14 shows CasXX and its mutants editing the mCherry gene located on a plasmid in 293T cells. After plasmid transfection, the cells were cultured for three days and then analyzed by flow cytometry. The higher the gene editing efficiency of the Cas protein, the lower the proportion of mCherry-positive cells.
- the experimental process corresponds to the schematic diagram in Figure 13; the spacer sequences for mCherry-FM, mCherry-Mis-1/2, mCherry-Mis-5/6 and mCherry-Mis-19/20 are shown in SEQ ID NO: 37, 38, 39 and 40, respectively.
- Figure 15 shows the construction of a new combination of HF mutants based on the original sequence of CasXX.
- the experimental results showed that the CasXX sequence-based mutants R857A, R719A, K394A, K844A, R719A/K844A and R857A/K844A could effectively reduce the off-target indel ratios at EMX1-7-OT-1, EMX1-7-OT-2 and EMX1-7-OT-3 without sacrificing efficiency at on-target sites.
- Figure 16 shows the construction of a new combination of HF mutants based on the original sequence of CasXX.
- the experimental results showed that the R857A, R719A, K394A, K844A, R719A/K844A and R857A/K844A mutants, based on the CasXX sequence, could effectively reduce the indel ratio when used with the off-target-mimicking crRNAs crRNA-RNF2-1-Mis-1/2, crRNA-RNF2-1-Mis-5/6, crRNA-RNF2-1-Mis-17/18 and crRNA-RNF2-1-Mis-19/20, without sacrificing efficiency at the target site.
- Figure 17 shows the use of GUIDE-Seq to detect the off-target effects of CasXX and the CasXX+K394A mutant at the CD34-7 target site.
- the number after each sequence represents the number of reads detected for that sequence.
- the first line of sequence is the reference target sequence (SEQ ID NO: 78), and the following sequences represent the on-target and off-target sequences, together with the number of reads enriched by double-stranded DNA tags at each of them.
- This figure shows that CasXX has an off-target effect at the CD34-7 site, but the CasXX+K394A mutant has no off-target effect at this site.
- effector protein refers to a protein having an activity such as site-specific binding activity, single-stranded DNA cleavage activity, double-stranded DNA cleavage activity, single-stranded RNA cleavage activity, DNA or RNA modification (such as cleavage, base substitution, insertion, removal) or transcriptional regulatory activity.
- guide RNA and “gRNA” are used interchangeably herein to refer to an RNA capable of forming a complex with a Cas12i effector protein and a target nucleic acid (eg, double-stranded DNA). Also considered herein are arrays of precursor guide RNAs that can be processed into multiple crRNAs.
- crRNA or “CRISPR RNA” comprises a guide sequence having sufficient complementarity to the target sequence of a target nucleic acid (e.g., double-stranded DNA) that directs the specific binding of the CRISPR complex to the target sequence of the target nucleic acid.
- CRISPR array refers to a nucleic acid (e.g., DNA) segment comprising CRISPR repeats and spacers, starting at the first nucleotide of the first CRISPR repeat and ending at the last nucleotide of the last (terminal) CRISPR repeat. Typically, each spacer in a CRISPR array is located between two repeats.
- CRISPR repeat or “CRISPR direct repeat” or “direct repeat” as used herein refers to a plurality of short direct repeat sequences that exhibit very little or no sequence variation in a CRISPR array. Suitably, direct repeats can form stem-loop structures.
- nucleic acid refers to a polymeric form of nucleotides of any length, including deoxyribonucleotides, ribonucleotides, combinations thereof, and analogs thereof.
- Oligonucleotide are used interchangeably to refer to short polynucleotides having no more than about 50 nucleotides.
- complementarity refers to the ability of a nucleic acid to form hydrogen bonds with another nucleic acid through conventional Watson-Crick base pairing.
- Percent complementarity represents the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (i.e., Watson-Crick base pairing) with a second nucleic acid (e.g., 5, 6, 7, 8, 9 or 10 out of 10 residues pairing corresponds to about 50%, 60%, 70%, 80%, 90% and 100% complementarity, respectively).
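As an illustration of this definition, the sketch below counts Watson-Crick pairs between two strands. It assumes both strands are written 5' to 3', have equal length, and pair in antiparallel orientation with no bulges or gaps; these simplifications and the function name are assumptions for the example, not part of the definition.

```python
# Canonical Watson-Crick pairs for DNA and RNA.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def percent_complementarity(first: str, second: str) -> float:
    """Percentage of residues in `first` that pair with `second` (antiparallel)."""
    first, second = first.upper(), second.upper()
    if not first or len(first) != len(second):
        raise ValueError("strands must be non-empty and of equal length")
    # Residue i of the first strand faces residue (n - 1 - i) of the second.
    paired = sum((a, b) in WATSON_CRICK for a, b in zip(first, reversed(second)))
    return 100.0 * paired / len(first)


print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))  # 10 of 10 pair: prints 100.0
print(percent_complementarity("ACGTACGTAC", "GTACGTACAA"))  # 8 of 10 pair: prints 80.0
```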
- Perfectly complementary means that all contiguous residues of a nucleic acid sequence form hydrogen bonds with the same number of contiguous residues in a second nucleic acid sequence.
- substantially complementary refers to a degree of complementarity of at least about any of 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% over a region of about 40, 50, 60, 70, 80, 100, 150, 200, 250 or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.
- stringent conditions for hybridization refers to conditions under which a nucleic acid having complementarity to a target sequence primarily hybridizes to the target sequence and substantially does not hybridize to non-target sequences. Stringent conditions are generally sequence-dependent and vary depending on many factors. In general, the longer the sequence, the higher the temperature at which the sequence will specifically hybridize to its target sequence. Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology-Hybridization With Nucleic Acid Probes, Part I, Chapter 2 "Overview of principles of hybridization and the strategy of nucleic acid probe assay," Elsevier, N.Y.
- Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized by hydrogen bonding between the bases of the nucleotide residues. Hydrogen bonding can occur through Watson-Crick base pairing, Hoogsteen bonding, or in any other sequence-specific manner. A sequence that is capable of hybridizing to a given sequence is called the "complement" of that given sequence.
- Percent (%) sequence identity for a nucleic acid sequence is defined as the percentage of nucleotides in a candidate sequence that are identical to the nucleotides in a specific nucleic acid sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity.
- Percent (%) sequence identity for a peptide, polypeptide or protein sequence is defined as the percentage of amino acid residues in a candidate sequence that are identical to the amino acid residues in a specific peptide or amino acid sequence.
- Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for example, using publicly available computer software such as BLAST, BLAST-2, ALIGN or MEGALIGN™ (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
- polypeptide and “peptide” are used interchangeably herein to refer to a polymer of amino acids of any length.
- the polymer may be linear or branched, it may contain modified amino acids, and it may be interrupted by non-amino acids.
- a protein can have one or more polypeptides.
- the term also encompasses amino acid polymers that have been modified; e.g., disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation (such as conjugation with a labeling component).
- variant is interpreted as a polynucleotide or polypeptide that differs from a reference polynucleotide or polypeptide, respectively, but retains essential properties.
- a typical variant of a polynucleotide differs from the nucleic acid sequence of another reference polynucleotide. Changes in the nucleic acid sequence of a variant may or may not alter the amino acid sequence of the polypeptide encoded by the reference polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence, as described below.
- a typical variant of a polypeptide differs in amino acid sequence from another reference polypeptide. Usually, the differences are limited such that the sequences of the reference polypeptide and the variant are very similar overall and identical in many regions.
- the amino acid sequence of a variant and reference polypeptide may differ by any combination of one or more substitutions, additions, deletions.
- the substituted or inserted amino acid residues may or may not be those encoded by the genetic code.
- Variants of a polynucleotide or polypeptide may be naturally occurring (such as allelic variants), or may be variants that are not known to occur naturally.
- Non-naturally occurring variants of polynucleotides and polypeptides can be prepared by mutagenesis techniques, by direct synthesis, and by other recombinant methods known to those skilled in the art.
- wild type has the meaning commonly understood by those skilled in the art, meaning a typical form of an organism, strain, gene or trait. It can be isolated from resources in nature and has not been deliberately modified.
- nucleic acid molecule or polypeptide when used to describe a nucleic acid molecule or polypeptide, mean that the nucleic acid molecule or polypeptide is at least substantially free of at least one other component with which it is naturally associated or occurs in nature.
- an "ortholog" of a protein as referred to herein refers to a protein belonging to a different species that performs the same or a similar function as the protein that is its ortholog.
- identity is used to denote a sequence match between two polypeptides or between two nucleic acids.
- when a position in both compared sequences is occupied by the same base or amino acid monomer subunit (for example, a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), the molecules are identical at that position.
- the "percent identity" between these two sequences is a function of the number of matching positions shared by the two sequences divided by the number of positions being compared x 100. For example, two sequences are 60% identical if 6 out of 10 positions of the two sequences match.
- the DNA sequences CTGACT and CAGGTT share 50% identity (3 out of a total of 6 positions match).
- comparisons are made when two sequences are aligned to yield maximum identity.
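The percent identity calculation described above (matching positions divided by positions compared, times 100) can be checked with a few lines of code. This sketch assumes the two sequences are already aligned to the same length; a column containing a gap character `-` is counted as a compared position that does not match, which is a simplifying assumption. The function name is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    # A position matches only if both sequences carry the same non-gap symbol.
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


print(percent_identity("CTGACT", "CAGGTT"))          # 3 of 6 match: prints 50.0
print(percent_identity("AAAAAAAAAA", "AAAAAACCCC"))  # 6 of 10 match: prints 60.0
```

The first call reproduces the CTGACT / CAGGTT example from the text (positions 1, 3 and 6 match).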
- This alignment can be achieved, for example, by the method of Needleman et al. (1970) J.Mol.Biol. Inc.) to carry out.
- the PAM 120 weight residue table can also be used, integrated into the ALIGN program (version 2.0) using the algorithm of E. Meyers and W. Miller (Comput. Appl Biosci., 4: 11-17 (1988)).
- a gap length penalty of 12 and a gap penalty of 4 are used to determine the percent identity between two amino acid sequences.
- the algorithm of Needleman and Wunsch J MoI Biol.
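For readers unfamiliar with global alignment, the dynamic-programming idea behind the Needleman-Wunsch algorithm can be sketched as follows. This is a teaching sketch only: it uses a simple match/mismatch score and a linear gap penalty chosen for the example, not the PAM 120 table or the gap penalties of 12 and 4 mentioned above, and it returns one optimal alignment when several exist.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[int, str, str]:
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # first column: a aligned against gaps
        score[i][0] = i * gap
    for j in range(1, cols):          # first row: b aligned against gaps
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # prints (1, 'ACGT', 'A-GT')
```

The two aligned strings returned here have equal length, so they can be passed to a column-by-column identity count.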
- Cell as used herein is understood not only to refer to a specific single cell, but also to the progeny or potential progeny of that cell. Because certain modifications may have occurred in the progeny, due to mutations or environmental influences, such progeny may in fact differ from the parental cells and still be included within the scope of the term herein.
- transduction and “transfection” include methods known in the art to introduce DNA into cells to express a protein or molecule of interest using an infectious agent such as a virus or otherwise.
- chemical-based transfection methods such as those using calcium phosphate, dendrimers, liposomes, or cationic polymers such as DEAE-dextran or polyethyleneimine.
- non-chemical transfection methods, such as electroporation, cell squeezing, sonoporation, optical transfection, impalefection, protoplast fusion, plasmid delivery, or transposons; particle-based methods, such as the use of gene guns, magnetofection or magnet-assisted transfection, and particle bombardment; and hybrid methods, such as nucleofection.
- transfected refers to the process of transferring or introducing exogenous nucleic acid into a host cell.
- a “transfected”, “transformed” or “transduced” cell is a cell that has been transfected, transformed or transduced with an exogenous nucleic acid.
- "In vivo" refers to within the organism from which cells are obtained. "Ex vivo" or "in vitro" means outside the organism from which cells are obtained.
- treatment/treating is a method used to obtain beneficial or desired results, including clinical results.
- beneficial or desired clinical outcomes include, but are not limited to, one or more of the following: alleviation of one or more symptoms caused by the disease, reduction of the extent of the disease, stabilization of the disease (e.g., preventing or delaying exacerbation of the disease), preventing or delaying the spread of the disease (e.g., metastasis), preventing or delaying the recurrence of the disease, reducing the rate of recurrence of the disease, delaying or slowing the progression of the disease, improving the disease state, providing remission (partial or total) of the disease, reducing the dose of one or more other drugs needed to treat the disease, delaying disease progression, improving quality of life, and/or prolonging survival.
- Treatment also includes reducing the pathological consequences of a disorder, condition or disease. The methods of the invention contemplate any one or more of these aspects of treatment.
- the term "effective amount" refers to an amount of a compound or composition sufficient to treat a particular disorder, condition or disease, e.g., to ameliorate, alleviate, lessen and/or delay one or more symptoms thereof.
- an "effective amount" may be administered in one or more doses, i.e., a single dose or multiple doses may be required to achieve the desired therapeutic endpoint.
- "Subject", "individual" or "patient" are used interchangeably herein for purposes of treatment and refer to any animal classified as a mammal, including humans, livestock and farm animals, and zoo, farm or pet animals such as dogs, horses, cats, cows, etc.
- the individual is a human individual.
- reference to "not" a value or parameter generally means and describes "other than" that value or parameter.
- for example, stating that the method is not used to treat type X cancer means that the method is used to treat cancer other than type X.
- the term “and/or” in words such as “A and/or B” is intended to include both A and B; A or B; A (alone); and B (alone).
- the term “and/or” in words such as "A, B and/or C” is intended to include each of the following embodiments: A, B and C; A, B or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
- an engineered Cas12i nuclease comprises one or more (for example two, three, four or five) of the following mutations relative to a reference Cas12i nuclease, selected from the group consisting of:
- the reference Cas12i nuclease is a natural Cas12i nuclease, such as the wild-type Cas12i2 nuclease whose amino acid sequence is as shown in SEQ ID NO: 1.
- the reference Cas12i nuclease is a variant of the Cas12i nuclease, such as a natural variant.
- the reference Cas12i nuclease is an engineered Cas12i (such as Cas12i2 or Cas12i1) nuclease that does not comprise one or more mutations in (1)-(5) above.
- the amino acid position is defined as the corresponding amino acid position shown in SEQ ID NO: 1. In some embodiments, the amino acid position is as defined by the amino acid position corresponding to SEQ ID NO: 1 of a wild-type Cas12i nuclease having a sequence different from SEQ ID NO: 1.
- the amino acid is at the X position, wherein said amino acid position is defined as the corresponding amino acid position of the wild-type Cas12i nuclease shown in SEQ ID NO: 1
- the amino acid is at the X position, wherein said amino acid position is defined as the amino acid position, corresponding to SEQ ID NO: 1, of a wild-type Cas12i nuclease having a sequence different from SEQ ID NO: 1
- the amino acid residue is located at a position of the reference Cas12i enzyme that corresponds to position X of SEQ ID NO: 1 when the amino acid sequence of the reference Cas12i enzyme is aligned with the amino acid sequence of SEQ ID NO: 1 based on sequence homology.
- Figure 7 shows the homology alignment of the amino acid sequences of CAS12i2 (SEQ ID NO: 1) and CAS12i1 (SEQ ID NO: 13).
- CAS12i2 SEQ ID NO: 1
- CAS12i1 SEQ ID NO: 13
- Those skilled in the art can use software commonly used in the art, such as Clustal Omega, to align the amino acid sequence of any reference Cas12i nuclease with SEQ ID NO: 1, and thereby obtain the amino acid position in the reference Cas12i nuclease that corresponds to an amino acid position defined in the present application based on SEQ ID NO: 1.
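The position mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's procedure: it assumes the two sequences have already been aligned by an external tool (for example Clustal Omega) and are supplied as equal-length strings with `-` for gaps. The function name `map_position` and the toy sequences are hypothetical and are not taken from SEQ ID NO: 1 or SEQ ID NO: 13.

```python
def map_position(aligned_ref: str, aligned_query: str, ref_pos: int):
    """Map a 1-based position in the reference to the aligned query position.

    Both inputs are rows of the same pairwise alignment ('-' marks a gap).
    Returns the 1-based query position, or None if the reference residue
    is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_i = 0    # residues of the reference seen so far
    query_i = 0  # residues of the query seen so far
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            ref_i += 1
        if q != "-":
            query_i += 1
        if r != "-" and ref_i == ref_pos:
            return query_i if q != "-" else None
    raise IndexError("ref_pos exceeds the reference length")


# Toy alignment (hypothetical sequences, for illustration only).
toy_ref = "MKE-LNR"
toy_query = "M-EALNK"
print(map_position(toy_ref, toy_query, 3))  # reference E3 aligns with query E2
```

A return value of `None` signals that the reference residue has no counterpart in the other sequence, in which case the mutation cannot be transferred by position alone.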
- the present application provides methods for engineering enzymes by introducing amino acid mutations based on any one or a combination of the above five engineering principles, which lead to increased in vitro and/or in vivo enzyme activity (such as DNA single-strand or double-strand cleavage activity; for example, gene cleavage efficiency increased by about 100-fold), an increased number of recognizable PAMs (for example, as can be seen from Figure 8, the wild-type enzyme can recognize a PAM, whereas the CasXX enzyme carrying the above amino acid mutations can recognize at least 11 PAMs), and/or a reduced off-target rate and improved specificity (for example, as can be seen from Table 18, the off-target efficiency of HF-Cas can be reduced by about 99% relative to CasXX).
- the engineered Cas12i nuclease contains one or more specific mutations as described in sections 1)-6) below. In some embodiments, any one or more mutations described in the application can be combined with existing Cas12i mutations (such as the mutations described in section 7) below) to provide engineered Cas12i nucleases with higher activity.
- an engineered Cas12i nuclease is provided; the engineered Cas12i nuclease comprises a mutation that replaces one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids. In some embodiments, an engineered Cas12i nuclease is provided; the engineered Cas12i nuclease comprises a mutation that replaces one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring.
- an engineered Cas12i nuclease is provided; the engineered Cas12i nuclease comprises a mutation that replaces one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with a single-stranded DNA substrate with positively charged amino acids.
- an engineered Cas12i nuclease is provided; the engineered Cas12i nuclease comprises a mutation that replaces one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- an engineered Cas12i nuclease is provided; the engineered Cas12i nuclease comprises a mutation that replaces one or more polar or positively charged amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease with hydrophobic amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; and 2) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; and 2) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; and 2) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids.
- the engineered Cas12i nuclease further comprises replacing one or more polar or positively charged amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease with hydrophobic amino acids.
- the engineered Cas12i nuclease also comprises one or more mutations in the flexible region (such as replacing G, and/or inserting one or two Gs thereafter) to increase the flexibility of the flexible region.
- the engineered Cas12i enzyme so modified has its flexibility increased by at least about 10%, such as by at least about 20%, 30%, 50%, 100%, 150%, 200%, 500%, 1000% or more.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; and 2) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; and 2) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids; and 2) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- the engineered Cas12i nuclease further comprises replacing one or more polar or positively charged amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease with hydrophobic amino acids.
- the engineered Cas12i nuclease also comprises one or more mutations in the flexible region to increase the flexibility of the flexible region.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; 2) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; and 3) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; 2) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; and 3) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; 2) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids; and 3) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; 2) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids; and 3) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- the engineered Cas12i nuclease further comprises replacing one or more polar or positively charged amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease with hydrophobic amino acids. In some embodiments, the engineered Cas12i nuclease also comprises one or more mutations in the flexible region to increase the flexibility of the flexible region.
- an engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; 2) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; 3) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids; and 4) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids.
- the engineered Cas12i nuclease further comprises replacing one or more polar or positively charged amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease with hydrophobic amino acids.
- an engineered Cas12i nuclease is provided; the engineered Cas12i nuclease comprises: 1) replacing one or more amino acids interacting with PAM in the reference Cas12i nuclease with positively charged amino acids; 2) replacing one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring; 3) replacing one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate with positively charged amino acids; 4) replacing one or more amino acids that interact with the DNA-RNA double helix in the reference Cas12i nuclease with positively charged amino acids; and 5) replacing one or more polar
- the engineered Cas12i nuclease comprises the following mutations relative to the corresponding amino acid positions shown in SEQ ID NO: 1: N164Y+E176R+K238R+E323R+D362R+T447R+E563R (hereinafter named "CasXX").
- the engineered Cas12i nuclease comprises the sequence of SEQ ID NO: 8.
- the engineered Cas12i nuclease comprises one or more mutations based on a reference Cas12i nuclease (e.g., Cas12i2), wherein the mutation replaces an amino acid that interacts with PAM in the reference Cas12i nuclease with a positively charged amino acid (such as R, H or K).
- the engineered Cas12i nuclease comprises one, two, three, four, five, six, seven, eight or more substitutions of said amino acid residues.
- the amino acid that interacts with PAM is an amino acid within 9 angstroms of PAM in the three-dimensional structure, for example an amino acid within 8 angstroms, 7 angstroms, 6 angstroms, 5 angstroms, 4 angstroms, 3 angstroms or 2 angstroms of PAM in the three-dimensional structure, or closer.
- the distance between PAM and an amino acid in the spatial structure is defined by the distance between atoms in the 3D structure (PDB file) of the analyzed Cas protein-RNA-DNA three-dimensional complex, and the interatomic distances can be displayed by PDB file recognition software. In some embodiments, the spatial structural distance between the PAM and the amino acid is defined by the minimum distance between the atoms of the amino acid residue and the atoms of the nucleotide.
- Programs that can be used to measure the distance between PAM and amino acids in the spatial structure, such as PDB file recognition software, are widely known in the art, including but not limited to PyMOL, ChimeraX, Swiss-PdbViewer, etc.
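The minimum-distance criterion described above can be illustrated with a short Python sketch. It assumes that atomic coordinates (in angstroms) have already been extracted from a PDB file by other means; the residue labels `res1` and `res2`, the coordinates, and the function names are hypothetical and serve only to show the calculation, not to reproduce any measurement from the application.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest atom-to-atom distance between two groups of (x, y, z) atoms."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def residues_within(residue_atoms, target_atoms, cutoff=9.0):
    """Names of residues whose minimum distance to the target is <= cutoff."""
    return sorted(
        name
        for name, atoms in residue_atoms.items()
        if min_distance(atoms, target_atoms) <= cutoff
    )


# Hypothetical coordinates in angstroms (not taken from any real structure).
residue_atoms = {
    "res1": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "res2": [(30.0, 0.0, 0.0)],
}
pam_atoms = [(6.0, 0.0, 0.0), (8.0, 2.0, 0.0)]

print(residues_within(residue_atoms, pam_atoms, cutoff=9.0))
```

Lowering `cutoff` from 9 to a smaller value narrows the candidate set in the same way the text narrows from 9 angstroms to 2 angstroms.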
- the one or more mutations that replace the PAM-interacting amino acid in the reference Cas12i nuclease with a positively charged amino acid are mutations at one or more of the following positions: 176, 178, 226, 227, 229, 237, 238, 264, 447 and 563. In some embodiments, the one or more mutations that replace the PAM-interacting amino acid in the reference Cas12i nuclease with a positively charged amino acid are mutations at one or more of the following amino acids: E176, E178, Y226, A227, N229, E237, K238, K264, T447 and E563.
- the one or more mutations that replace the PAM-interacting amino acid in the reference Cas12i nuclease with a positively charged amino acid are mutations at one or more of the following amino acids: E176, K238, T447 and E563.
- the mutation that replaces the amino acid that interacts with PAM in the reference Cas12i nuclease with a positively charged amino acid is located at amino acid residue 563, such as E563.
- the amino acid position is as defined in the corresponding amino acid position of the wild-type Cas12i nuclease shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises a mutation of one or more of the following amino acids: E176R, E178R, Y226R, A227R, N229R, E237R, K238R, K264R, T447R and E563R, wherein the amino acid positions are defined by the corresponding amino acid positions of the wild-type Cas12i nuclease shown in SEQ ID NO: 1.
- for example, E176 denotes the amino acid E (glutamic acid) at position 176 of the cited amino acid sequence (such as relative to SEQ ID NO: 1). The common amino acids and their three-letter and one-letter abbreviations are listed as follows: Alanine Ala A; Arginine Arg R; Aspartate Asp D; Cysteine Cys C; Glutamine Gln Q; Glutamate Glu E; Histidine His H; Isoleucine Ile I; Glycine Gly G; Asparagine Asn N; Leucine Leu L; Lysine Lys K; Methionine Met M; Phenylalanine Phe F; Proline Pro P; Serine Ser S; Threonine Thr T; Tryptophan Trp W; Tyrosine Tyr Y; Valine Val V.
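The substitution notation used throughout (original residue, position, new residue, with `+` joining combined mutations) lends itself to a small parser. The following Python sketch is illustrative only: the function names are hypothetical, and the toy sequence `MEKT` is not part of any sequence in the application.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec: str):
    """Parse e.g. 'E176R+K238R' into [('E', 176, 'R'), ('K', 238, 'R')]."""
    parsed = []
    for token in spec.split("+"):
        match = MUTATION_RE.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution in X123Y form: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_mutations(sequence: str, spec: str) -> str:
    """Apply substitutions, checking the wild-type residue at each position."""
    residues = list(sequence)
    for wild_type, position, replacement in parse_mutations(spec):
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


print(parse_mutations("E176R+K238R"))
print(apply_mutations("MEKT", "E2R+T4R"))  # toy sequence, hypothetical
```

Checking the wild-type residue before substituting catches numbering errors, for example when a position defined on SEQ ID NO: 1 is applied to a different reference sequence without first mapping it.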
- the mutation of replacing the amino acid interacting with PAM in the reference Cas12i nuclease with a positively charged amino acid is to replace the corresponding amino acid residue in the reference Cas12i nuclease with R, H or K , such as R or K. In some embodiments, the mutation of replacing the amino acid interacting with PAM in the reference Cas12i nuclease with a positively charged amino acid is to replace the corresponding amino acid residue in the reference Cas12i nuclease with R.
- the engineered Cas12i nuclease comprises one or more amino acid mutations at the following sites: 176R, 238R, 447R, and 563R, wherein the amino acid positions are defined by the corresponding amino acid positions of the wild-type Cas12i nuclease shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises one or more of the following mutations based on the reference Cas12i nuclease: E176, K238, T447 and E563; wherein the amino acid position numbering is as shown in SEQ ID NO: 1 defined by the corresponding amino acid positions.
- the engineered Cas12i nuclease comprises one or more of the following mutations based on the reference Cas12i nuclease: E176R, K238R, T447R and E563R; wherein the amino acid position numbering is as shown in SEQ ID NO: 1 defined by the corresponding amino acid positions.
- the engineered Cas12i nuclease comprises an E563R mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above-mentioned engineered Cas12i nuclease (in which one or more amino acids interacting with PAM in the reference Cas12i nuclease are replaced with positively charged amino acids) can also be used.
- the engineered Cas12i nuclease comprises any mutation or combination of mutations at the following amino acid residue positions: 176, 238, 264, 447, 563, 176+238, 176+447, 176+563, 238+447, 238+563, 447+563, 176+238+447, 176+238+563, 176+447+563, 238+447+563, 176+238+447+563; wherein the amino acid position numbers are as defined for the corresponding amino acid positions shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises any one of the following mutations/mutation combinations: E176R, K238R, E264R, T447R, E563R, E176R+K238R, E176R+T447R, E176R+E563R, K238R+T447R, K238R+E563R, T447R+E563R, E176R+K238R+T447R, E176R+K238R+E563R, E176R+T447R+E563R, K238R+T447R+E563R, and E176R+K238R+T447R+E563R; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises any one of the following mutations/mutation combinations: E563R, E176R+T447R, E176R+E563R, K238R+E563R, E176R+K238R+T447R, E176R+K238R+E563R, E176R+T447R+E563R, and E176R+K238R+T447R+E563R; wherein, the amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises a combination of E176R+K238R+T447R+E563R mutations; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
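The single, double, triple and quadruple combinations of E176R, K238R, T447R and E563R listed above are exactly the non-empty subsets of those four substitutions. A minimal Python sketch that enumerates them follows; the function name `all_combinations` is hypothetical, and the sketch says nothing about which combinations were actually tested.

```python
from itertools import combinations

# The four PAM-interacting substitutions named in the text.
PAM_MUTATIONS = ["E176R", "K238R", "T447R", "E563R"]


def all_combinations(mutations):
    """Every non-empty subset, written in the 'A+B+C' notation of the text."""
    return [
        "+".join(combo)
        for size in range(1, len(mutations) + 1)
        for combo in combinations(mutations, size)
    ]


combos = all_combinations(PAM_MUTATIONS)
print(len(combos))
for combo in combos:
    print(combo)
```

For four substitutions this gives 4 singles, 6 pairs, 4 triples and 1 quadruple.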
- an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to an engineered Cas12i nuclease in which one or more amino acids interacting with PAM in the reference Cas12i nuclease are replaced with positively charged amino acids can also be used.
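A sequence-identity threshold such as "at least about 85%" can be checked with a simple calculation once two sequences are aligned. The sketch below is an assumption-laden illustration: it counts identical columns over the full alignment length, which is only one of several conventions for percent identity, and the application does not state which convention it uses. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Assumption: the denominator is the full alignment length and gap
    columns count as mismatches. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("need two non-empty aligned rows of equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the percent identity reaches the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy rows (hypothetical): 19 of 20 columns identical.
row_a = "MKELNRAAGTWYVLDEQHSP"
row_b = "MKELNKAAGTWYVLDEQHSP"
print(percent_identity(row_a, row_b), meets_identity(row_a, row_b))
```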
- the engineered Cas12i nuclease comprises one or more mutations based on a reference Cas12i nuclease (e.g., Cas12i2), wherein the mutation replaces one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease with amino acids having an aromatic ring (such as F, Y or W). In some embodiments, the engineered Cas12i nuclease comprises one, two, three, four, five, six, or more substitutions of said amino acid residues.
- the one or more amino acids involved in opening the DNA double strand are the amino acids that interact with the last base pair at the 3' end of the PAM relative to the target strand.
- the PAM sequence recognized by Cas12i2 is 5'-NTTN-3', wherein the base pair formed by the N base at the 3' end of the PAM sequence and the target strand is the "last base pair at the 3' end of the PAM relative to the target strand" described herein, and is followed by the sequence of the targeting site.
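The 5'-NTTN-3' pattern can be expressed as a simple match, where N stands for any of A, C, G or T. The following Python sketch is a hypothetical helper for checking a four-nucleotide candidate against that pattern; it does not model the expanded PAM range of the engineered enzymes, and the function name is an assumption.

```python
import re

# N = any DNA base; the two central positions must be T.
NTTN_RE = re.compile(r"^[ACGT]TT[ACGT]$")


def is_nttn_pam(candidate: str) -> bool:
    """True if a 4-nt sequence (written 5' to 3') matches 5'-NTTN-3'."""
    return NTTN_RE.match(candidate.upper()) is not None


print(is_nttn_pam("ATTG"))  # central TT present
print(is_nttn_pam("ATAG"))  # third base is not T
```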
- the one or more amino acids involved in opening the DNA double strand are located at the following positions: 163 and/or 164; wherein the amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
- the one or more amino acids involved in opening the DNA double strand are one or more of the following amino acids: Q163, N164; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO: 1 .
- the amino acid involved in opening the double-strand of DNA is N164; wherein the numbering of the amino acid position is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- the amino acid involved in opening the double strand of DNA is replaced by F, Y or W. In some embodiments, the amino acid involved in opening the double strand of DNA is replaced by F. In some embodiments, the amino acid involved in opening the double strand of DNA is replaced by Y.
- the engineered Cas12i nuclease comprises a mutation of any one or more of the following amino acid residues: 163F, 163Y, 163W, 164W, 164F or 164Y; wherein the amino acid position numbering is as shown in SEQ ID NO: 1 defined by the corresponding amino acid positions indicated.
- the engineered Cas12i nuclease comprises a mutation of any one or more of the following amino acid residues: 163F, 163Y, 163W, 164F or 164Y; wherein the amino acid position numbering is as shown in SEQ ID NO: 1 The corresponding amino acid positions are defined.
- the engineered Cas12i nuclease comprises any of the following mutations: Q163 and/or N164; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1. In some embodiments, the engineered Cas12i nuclease comprises any one of the following mutations: Q163F, Q163Y, Q163W, N164W, N164F or N164Y; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO: 1 .
- the engineered Cas12i nuclease comprises any one of the following mutations: Q163F, Q163Y, Q163W, N164F or N164Y; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises N164Y or N164F mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises an N164Y mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above-mentioned engineered Cas12i nuclease (in which one or more amino acids involved in opening the DNA double strand in the reference Cas12i nuclease are replaced with amino acids having an aromatic ring) can also be used.
- the engineered Cas12i nuclease comprises one or more mutations based on a reference Cas12i nuclease (e.g., Cas12i2), wherein the mutation replaces one or more amino acids that are located in the RuvC domain of the reference Cas12i enzyme and interact with a single-stranded DNA substrate with positively charged amino acids (such as R, H, or K).
- the engineered Cas12i enzyme comprises one, two, three, four, five, six or more substitutions of said amino acid residues.
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are amino acids within 9 angstroms of the single-stranded DNA substrate in the three-dimensional structure, for example amino acids within 8 angstroms, 7 angstroms, 6 angstroms, 5 angstroms, 4 angstroms, 3 angstroms or 2 angstroms of the single-stranded DNA substrate in the three-dimensional structure, or closer.
- the RuvC domain is the enzymatic domain responsible for cutting single-stranded DNA or double-stranded DNA in the Cas12i protein.
- the RuvC domain of Cas12i is divided into three parts: RuvC-1, RuvC-2 and RuvC-3. These three parts are adjacent in the three-dimensional structure and together form a catalytic pocket with enzymatic cleavage activity.
- the three-dimensional crystal structure of Cas12i2, its domain composition, and the description of its interaction with DNA substrates are described in Huang X. et al., Nature Communications, 11, Article number: 5241 (2020).
- the three-dimensional crystal structure of Cas12i1, its domain composition, and the description of its interaction with DNA substrates are described in Zhang H.
- the three-dimensional structural model of the interaction between the reference Cas12i and the substrate can be obtained through homology structure comparison and homology modeling through the known three-dimensional crystal structure of Cas12i.
- a modeling approach is described in Example 3 to obtain amino acids in Cas12i2 that are located in the RuvC domain and within 9 Angstroms of the single-stranded DNA substrate.
- the distance in the spatial structure between the amino acids in the RuvC domain and the single-stranded DNA substrate can be defined by the distance between atoms in the 3D structure (PDB file) of the analyzed Cas protein-RNA-DNA three-dimensional complex; the interatomic distances can be displayed by PDB file recognition software.
- the spatial structural distance of amino acids in the RuvC domain from the single-stranded DNA substrate is defined by the minimum distance between amino acid residues and atoms comprised by nucleotides.
- Programs that can be used to measure the spatial structure distance between the amino acids in the RuvC domain and the single-stranded DNA substrate, such as PDB file recognition software, are widely known in the art, including but not limited to PyMOL, ChimeraX, Swiss-PdbViewer, etc.
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are one or more amino acids at the following positions: 323, 327, 355, 359, 360, 361, 362, 388, 390, 391, 392, 393, 414, 417, 418, 421, 424, 425, 650, 652, 653, 696, 705, 708, 709, 751, 752, 755, 840, 848, 851, 856, 885, 897, 925, 926, 928, 929, 932, and 1022.
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are one or more of the following amino acids: E323, L327, V355, G359, G360, K361, D362, L388, N390, N391, F392, K393, Q414, L417, L418, K421, Q424, Q425, S650, E652, G653, I696, K705, K708, E709, L751, S752, E755, N840, N848, S851, A856, Q885, M897, N925, I926, T928, G929, Y932, A1022.
- the one or more amino acids located in the RuvC domain and interacting with the single-stranded DNA substrate are one or more of the following amino acids: E323, D362, L388, N391, L417, Q424, Q425, N925, I926, and G929.
- the amino acid position is as defined in the corresponding amino acid position of the wild-type Cas12i nuclease shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises replacing one or more amino acids in the reference Cas12i nuclease that are located in the RuvC domain and interact with single-stranded DNA substrates with R, H or K (e.g. R or K) mutations.
- the engineered Cas12i nuclease comprises a mutation in which one or more amino acids located in the RuvC domain and interacting with a single-stranded DNA substrate in the reference Cas12i nuclease are replaced with R.
- the engineered Cas12i nuclease comprises one or more of the following amino acid mutations or mutation combinations: N390R, N391R, F392R, L751R, E755R, N840R, N848R, S851R, A856R, Q885R, M897R, I926R, G929R, Y932R, E323R, L327R, V355R, G359R, G360R, K361R, D362R, Q414R, K421R, Q425R, S650R, E652R, K705R, K708R, E709R, S752R, N925R, T928R, E323R+D362R, E323R+Q425R, E323R+I926R, Q425R+I926R, E323R+D362R+Q425R, E323R+I926R, Q425R+
- the engineered Cas12i nuclease comprises a mutation or a combination of mutations at any one of the following amino acid residue positions: E323, D362, Q425, N925, I926 and G929; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises a mutation or combination of mutations at any one of the following amino acid residue positions: E323, D362, Q425, N925, I926, E323+D362, E323+Q425, E323+I926, D362+Q425, D362+N925, D362+I926, Q425+I926, N925+I926, E323+D362+Q425, E323+D362+I926, E323+Q425+I926, D362+N925+I926, D362+Q425+I926, E323+D362+Q425+I926; wherein, the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- the mutation is a mutation that replaces the amino acid residue at the position with R, H, or K (such as R).
- the engineered Cas12i nuclease comprises any one of the following amino acids or amino acid combinations: 323R, 362R, 425R, 925R, 926R, 323R+362R, 323R+425R, 323R+926R, 362R+425R, 362R+926R, 425R+926R, 925R+926R, 323R+362R+425R, 323R+362R+926R, 323R+425R+926R, 362R+925R+926R, 323R+362R+425R+926R, and 362R+425R+926R; wherein the amino acid position number is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises any one of the following mutations or mutation combinations: E323R, D362R, Q424R, Q425R, N925R, I926R, and G929R; wherein, the amino acid position numbering is as shown in SEQ ID NO: 1 defined by the corresponding amino acid positions indicated.
- the engineered Cas12i nuclease comprises any one of the following mutations or combinations of mutations: E323R, D362R, Q425R, N925R, I926R, E323R+D362R, E323R+Q425R, E323R+I926R, D362R+Q425R, Q425R +I926R, D362R+I926R, N925R+I926R, E323R+D362R+Q425R, E323R+D362R+I926R, E323R+Q425R+I926R, D362R+N925R+I926R, D362R+Q425R+I926R, and E9323R+
- the amino acid position number is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises an I926R mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1. In some embodiments, the engineered Cas12i nuclease comprises an E323R+D362R mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1. In some embodiments, in order to improve gene editing efficiency, an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above-mentioned engineered Cas12i nuclease (in which one or more amino acids located in the RuvC domain of the reference Cas12i nuclease and interacting with the single-stranded DNA substrate are replaced with positively charged amino acids) can also be used.
- the engineered Cas12i nuclease comprises one or more mutations relative to a reference Cas12i nuclease (e.g., Cas12i2), in which one or more amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease are replaced with positively charged amino acids (such as R, H, or K).
- the engineered Cas12i enzyme comprises one, two, three, four, five, six, or more substitutions of said amino acid residues.
- the one or more amino acids interacting with the DNA-RNA duplex are amino acids located within 9 angstroms of the DNA-RNA duplex in the three-dimensional structure, for example within 8, 7, 6, 5, 4, 3, or 2 angstroms of the DNA-RNA duplex, or closer.
- Cas nucleases work as follows: the Cas nuclease forms a complex with a guide RNA (such as a crRNA); the crRNA pairs with the target DNA to form a DNA-RNA duplex, which interacts with the Cas nuclease to unwind the double-stranded target DNA and form an R-loop, so that the cleavage active site of the Cas enzyme completes the cleavage of the dsDNA.
- the three-dimensional crystal structure of Cas12i2, its domain composition, and the description of its interaction with the DNA-RNA double helix are described in Huang X. et al., Nature Communications, 11, Article number: 5241 (2020).
- the distance between a Cas amino acid and the DNA-RNA duplex in the spatial structure can be defined as the minimum distance between the atoms of the amino acid residue and the atoms of the nucleotides in the three-dimensional structure (PDB file) of the Cas protein-RNA-DNA ternary complex; the distances between atoms can be displayed by software that reads PDB files.
- the distance between the Cas amino acid and the DNA-RNA double helix in the spatial structure is defined by the distance between the hypothetical positions of the atoms.
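
The minimum-distance criterion described above can be illustrated with a minimal sketch. It assumes atom coordinates have already been extracted from a structure file; the function names and the coordinates are hypothetical and are not taken from any PDB entry or from the application itself.

```python
import math

def min_distance(residue_atoms, nucleic_atoms):
    """Smallest Euclidean distance (in angstroms) between any atom of an
    amino acid residue and any atom of the DNA-RNA duplex."""
    return min(math.dist(a, b) for a in residue_atoms for b in nucleic_atoms)

def residues_within(residues, nucleic_atoms, cutoff=9.0):
    """Return the labels of residues whose minimum distance is <= cutoff."""
    return [label for label, atoms in residues.items()
            if min_distance(atoms, nucleic_atoms) <= cutoff]

# Hypothetical coordinates for illustration only.
duplex = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
residues = {
    "res_A": [(0.0, 4.0, 0.0), (0.0, 12.0, 0.0)],  # closest atom is 4 angstroms away
    "res_B": [(3.0, 0.0, 20.0)],                   # closest atom is 20 angstroms away
}
close = residues_within(residues, duplex, cutoff=9.0)
print(close)
```

Lowering `cutoff` from 9 to 8, 7, and so on narrows the selection in the same way as the progressively tighter distance thresholds listed above.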
- the one or more amino acids that interact with the DNA-RNA duplex are one or more amino acids at the following positions: 116, 117, 156, 159, 160, 161, 247, 293, 294, 297, 301, 305, 306, 308, 312, 313, 316, 319, 320, 343, 348, 349, 427, 433, 438, 441, 442, 679, 683, 691, 782, 783, 797, 800, 852, 853, 855, 861, 865, 957, 958.
- the one or more amino acids that interact with the DNA-RNA duplex are one or more of the following amino acids: G116, E117, A156, T159, E160, S161, E247, G293, E294, N297, T301, I305, K306, T308, N312, F313, Q316, E319, Q320, E343, E348, E349, D427, K433, V438, N441, Q442, N679, E683, E691, D782, E783, E797, E800, M852, D853, L855, N861, Q865, S957, D958.
- the one or more amino acids that interact with the DNA-RNA duplex are one or more of the following: G116, E117, T159, S161, E319, E343, or D958.
- the amino acid that interacts with the DNA-RNA duplex is D958.
- the amino acid position is as defined in the corresponding amino acid position of the wild-type Cas12i nuclease shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises a mutation in which one or more amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease are replaced by R, H, or K (such as R or K). In some embodiments, the engineered Cas12i nuclease comprises a mutation that replaces one or more amino acids involved in the DNA-RNA duplex interaction in the reference Cas12i nuclease with R.
- the engineered Cas12i nuclease comprises one or more of the following amino acid mutations: G116R, E117R, A156R, T159R, S161R, T301R, I305R, K306R, T308R, N312R, F313R, D427R, K433R, V438R, N441R, Q442R, M852R, L855R, N861R, Q865R, E160R, Q316R, E319R, Q320R, E247R, E343R, E348R, E349R, N679R, E683R, E691R, D782R, E783R, E797R, E800R, D853R, S957R, D958R, G293R, E294R, N297R; the amino acid positions are defined as the corresponding amino acid positions of the wild-type Cas12i nuclease shown in
- the engineered Cas12i nuclease comprises a mutation or a combination of mutations at any of the following amino acid residue positions: G116, E117, T159, S161, E319, E343, or D958; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the mutation is a mutation that replaces the amino acid residue at the position with R, H, or K (e.g., R).
- the engineered Cas12i nuclease comprises any one of the following amino acids or amino acid combinations: 116R, 117R, 159R, 161R, 319R, 343R, or 958R; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises any one of the following mutations or mutation combinations: G116R, E117R, T159R, S161R, E319R, E343R, or D958R; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises a D958R mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO:1.
- an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above-mentioned engineered Cas12i nuclease.
- the engineered Cas12i nuclease comprises one or more mutations relative to a reference Cas12i nuclease (e.g., Cas12i2), in which one or more polar or positively charged amino acids that interact with the DNA-RNA duplex in the reference Cas12i nuclease are replaced with hydrophobic amino acids (such as A, V, I, L, M, F, Y, P, C, or W).
- the mutation is also referred to herein as a "high specificity" or "HF" mutation.
- the off-target rate of CasXX-HF can be reduced relative to CasXX by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, or even more.
- the engineered Cas12i enzyme comprises one, two, three, four, five, six or more substitutions of said amino acid residues.
- the one or more amino acids interacting with the DNA-RNA duplex are amino acids located within 9 angstroms of the DNA-RNA duplex in the three-dimensional structure, for example within 8, 7, 6, 5, 4, 3, or 2 angstroms of the DNA-RNA duplex, or closer.
- the one or more polar or positively charged amino acids that interact with the DNA-RNA duplex are one or more of the following positions: 119, 164, 297, 308, 309, 312, 346, 357, 394, 395, 402, 441, 433, 565, 715, 719, 766, 782, 807, 841, 844, 845, 848, 857, 861, and 865.
- the one or more polar or positively charged amino acids that interact with the DNA-RNA duplex are one or more of the following: Y119, Y164, N297, T308, R309, N312, S346, H357, K394, E395, R402, N441, K433, S565, R715, R719, S766, D782, K807, N841, K844, K845, N848, R857, N861 and Q865.
- the one or more polar or positively charged amino acids that interact with the DNA-RNA duplex are one or more of the following: R857, R861, K807, N848, R715, R719, K394, H357 and K844.
- the one or more polar or positively charged amino acids that interact with the DNA-RNA duplex are one or more of the following: R857, R719, K394, and K844.
- the amino acid position as described above is defined as the corresponding amino acid position shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises one or more of the following mutations: S565A, N297A, Q865A, T308A, R309A, N312A, N441A, R857A, N861A, Y119F, K433A, K807A, N841A, N848A, K845A, D782A, R715A, R719A, S766A, K394A, H357A, K844A, E395A, S346A, R402A, Y164N; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises a mutation in which one or more polar or positively charged amino acids located at positions 357, 394, 715, 719, 807, 844, 848, 857, and 861 of the reference Cas12i nuclease are replaced with hydrophobic amino acids.
- the hydrophobic amino acid is selected from A, V, L, I, P and F, such as A, V, L or I.
- the engineered Cas12i nuclease comprises a mutation in which one or more polar or positively charged amino acids are replaced by A.
- the amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises any one of the following mutations or combinations thereof: H357A, K394A, R715A, R719A, K807A, K844A, N848A, R857A, and R861A; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises a mutation or combination of mutations at any of the following amino acid residue positions: 857, 719, 394, or 844; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the mutation is a mutation that replaces the amino acid residue at the position with a hydrophobic amino acid (e.g., A).
- the engineered Cas12i nuclease comprises a mutation at any one of the following amino acids or amino acid combinations: R857, R719, K394, or K844; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises a mutation or a combination of mutations at any of the following amino acid residue positions: R857, R719, K394, K844, R719+K394, K394+K844, R857+K394, R719+K844, R857+R719, R857+K844, R719+K394+K844, R857+R719+K394, R857+K394+K844, R857+R719+K844, R857+R719+K394+K844; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1 or 8; the mutation is a mutation in which the amino acid residue at the position is replaced by a hydrophobic amino acid (such as A).
- the engineered Cas12i nuclease comprises any one of the following mutations or combinations of mutations: R857A, R719A, K394A, K844A, R719A+K394A, K394A+K844A, R857A+K394A, R719A+K844A, R857A+R719A, R857A+K844A, R719A+K394A+K844A, R857A+R719A+K394A, R857A+K394A+K844A, R857A+R719A+K844A, R857A+R719A+K394A+K844A; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1 or 8.
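
The combinations of R857A, R719A, K394A, and K844A listed above correspond to the non-empty subsets of four mutations: 4 single, 6 double, 4 triple, and 1 quadruple, 15 in total. A minimal sketch that enumerates them follows; the helper name `all_combinations` is illustrative and does not appear in the application.

```python
from itertools import combinations

def all_combinations(mutations):
    """Every non-empty subset of the given mutations, joined with '+'."""
    combos = []
    for size in range(1, len(mutations) + 1):
        for subset in combinations(mutations, size):
            combos.append("+".join(subset))
    return combos

combos = all_combinations(["R857A", "R719A", "K394A", "K844A"])
print(len(combos))
for combo in combos:
    print(combo)
```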
- the engineered Cas12i nuclease comprises any one of the following mutations or combinations of mutations: R857A, R719A, K394A, or K844A; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises a K844A mutation; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO: 1 or 8.
- the engineered Cas12i nuclease comprises R719A and K844A mutations; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO: 1 or 8. In some embodiments, the engineered Cas12i nuclease comprises R857A and K844A mutations; wherein the amino acid position numbering is as defined in the corresponding amino acid position shown in SEQ ID NO: 1 or 8.
- an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above-mentioned engineered Cas12i nuclease (in which one or more polar or positively charged amino acids of the reference Cas12i nuclease that interact with the DNA-RNA duplex are replaced by hydrophobic amino acids).
- the mutations described herein may include one or more of: insertions, deletions, substitutions, and may be mutations of a single amino acid or multiple amino acids.
- Any one or more mutations described in sections 1) to 5) can be combined with any one or more known mutations that increase Cas12i activity, such as target binding, double-strand cleavage activity, nickase activity and/or gene editing activity. Exemplary mutations can be found, for example, in the following documents PCT/CN2020/0134249 and CN 112195164 A, which are incorporated herein by reference in their entirety. Any one or more mutations described in sections 1) to 5) can also be combined with any one or more known mutations that reduce Cas12i activity, such as target binding, double-strand cleavage activity, nickase activity and/or gene editing activity .
- the engineered Cas12i nuclease (such as an engineered Cas12i nuclease comprising one or more of the mutations in sections 1) to 5) above) further comprises one or more flexible region mutations that increase (such as by at least about any of 10%, 20%, 50%, 60%, 70%, 80%, 90%, 1-fold, 1.1-fold, 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 50-fold, 100-fold, or more) the flexibility of a flexible region in the reference Cas12i nuclease (or in the engineered Cas12i nuclease comprising any one or more of the mutations in sections (1)-(5)).
- the flexible region in the reference Cas12i nuclease can be determined using any method known in the art. In some embodiments, multiple flexible regions are determined based solely on the amino acid sequence of the reference Cas12i nuclease. In some embodiments, multiple flexible regions are determined based on structural information of the reference Cas12i nuclease, including, for example, secondary structure, crystal structure, NMR structure, and the like.
- the method described herein for engineering a flexible region of a Cas12i nuclease comprises: (a) obtaining a plurality of engineered Cas12i nucleases, each comprising one or more mutations that increase the flexibility of one or more flexible regions of the reference Cas12i nuclease; and (b) selecting, from the plurality of engineered Cas12i nucleases, one or more engineered Cas12i nucleases that have increased activity (e.g., target binding, double-strand cleavage activity, nickase activity, and/or gene editing activity) compared to the reference Cas12i nuclease.
- the method further comprises determining one or more flexible regions in the reference Cas12i nuclease. In some embodiments, the method further comprises measuring the activity of the engineered Cas12i nuclease in a eukaryotic cell, such as a mammalian cell (e.g., a human cell).
- the plurality of regions of flexibility are determined using a program selected from the group consisting of: PredyFlexy, FoldUnfold, PROFbval, Flexserv, FlexPred, DynaMine, and Disomine.
- the one or more flexible regions are located at random coils.
- the one or more flexible regions are located in a domain of the reference Cas12i nuclease (or of the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5)) that interacts with DNA and/or RNA.
- the flexible region is at least about 5 (e.g., 5) amino acids in length.
- the one or more mutations comprise insertion of one or more (e.g., 2) glycine (G) residues in the flexible region.
- the one or more G residues are inserted at the N-terminus of a flexible amino acid residue in the flexible region, wherein the flexible amino acid residue is selected from the group consisting of G, serine (S), asparagine (N), aspartic acid (D), histidine (H), methionine (M), threonine (T), glutamic acid (E), glutamine (Q), lysine (K), arginine (R), alanine (A), and proline (P).
- the flexible amino acid residues are selected according to the following priority: G>S>N>D>H>M>T>E>Q>K>R>A>P.
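
The priority rule above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the application: the function names are hypothetical, ties between equal-priority residues are resolved by taking the first occurrence (an assumption), and the example region is a toy sequence rather than a segment of SEQ ID NO: 1.

```python
PRIORITY = "GSNDHMTEQKRAP"  # G > S > N > D > H > M > T > E > Q > K > R > A > P

def pick_flexible_residue(region):
    """Index of the highest-priority flexible residue in the region, or None.
    Ties are resolved by taking the first occurrence (an assumption)."""
    best = None
    for i, aa in enumerate(region):
        if aa in PRIORITY:
            if best is None or PRIORITY.index(aa) < PRIORITY.index(region[best]):
                best = i
    return best

def insert_glycines(region, n=2):
    """Insert n glycines immediately N-terminal to the chosen residue."""
    i = pick_flexible_residue(region)
    if i is None:
        return region  # no flexible residue present, leave the region unchanged
    return region[:i] + "G" * n + region[i:]

# Hypothetical 5-residue region used only for illustration.
print(insert_glycines("LVNKA", n=2))
```

In the toy region, N outranks K and A, so the two glycines are placed directly before N.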
- the one or more mutations comprise replacing one or more non-G residues with one or more G residues.
- the one or more mutations comprise replacing a hydrophobic amino acid residue in the flexible region with a G residue, wherein the hydrophobic amino acid residue is selected from the group consisting of leucine (L), isoleucine (I), valine (V), cysteine (C), tyrosine (Y), phenylalanine (F), and tryptophan (W).
- the activity is a site-specific nuclease activity.
- the activity is gene editing activity in a eukaryotic cell (e.g., a human cell).
- the gene editing efficiency is measured using a T7 endonuclease 1 (T7E1) assay, sequencing of the target DNA, a Tracking of Indels by Decomposition (TIDE) assay, or an Indel Detection by Amplicon Analysis (IDAA) assay.
- the engineered Cas12i nuclease (for example, the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5)) comprises one or more flexible region mutations that increase the flexibility of a flexible region in the reference Cas12i nuclease (such as Cas12i2 nuclease, or the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5)).
- the flexible region is selected from the group of regions corresponding to amino acid residues 228-232, amino acid residues 439-443, amino acid residues 478-482, amino acid residues 500-504, amino acid residues 775-779, and amino acid residues 925-929, wherein the numbering of the amino acid residues is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the flexible region is selected from amino acid residues 439-443 or amino acid residues 925-929, wherein the numbering of the amino acid residues is as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
- the reference Cas12i enzyme is Cas12i2 (SEQ ID NO: 1).
- the one or more flexible region mutations comprise insertion of one or more (e.g., 2) G residues in the flexible region.
- the one or more G residues are inserted at the N-terminus of flexible amino acid residues in the flexible region, wherein the flexible amino acid residues are selected from the group consisting of G, S, N, D, H, M, T, E, Q, K, R, A, and P.
- the flexible amino acid residues are selected according to the following priority: G>S>N>D>H>M>T>E>Q>K>R>A>P.
- the one or more flexible region mutations comprise replacing a hydrophobic amino acid residue in the flexible region with a G residue, wherein the hydrophobic amino acid residue is selected from the group consisting of: A, V, I, L, M, F, Y, P, C or W; preferably, selected from: L, I, V, C, Y, F and W.
- the flexible region mutation is at position 439 and/or 926. In some embodiments, the mutated amino acids are one or more of the following: L439, I926; wherein the amino acid positions are defined as the corresponding amino acid positions shown in SEQ ID NO: 1.
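
As a small illustration, the six flexible regions listed above can be stored as inclusive ranges and queried by position. The names `FLEXIBLE_REGIONS` and `region_of` are hypothetical helpers introduced here; the ranges themselves are those stated above for SEQ ID NO: 1.

```python
# Inclusive residue ranges of the flexible regions (numbering per SEQ ID NO: 1).
FLEXIBLE_REGIONS = [(228, 232), (439, 443), (478, 482),
                    (500, 504), (775, 779), (925, 929)]

def region_of(position):
    """Return the (start, end) flexible region containing position, or None."""
    for start, end in FLEXIBLE_REGIONS:
        if start <= position <= end:
            return (start, end)
    return None

for pos in (439, 926, 958):
    print(pos, region_of(pos))
```

Positions 439 and 926 fall in the 439-443 and 925-929 regions respectively, while 958 lies outside every listed region.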
- the engineered Cas12i nuclease (for example, the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5)) comprises a 926G and/or 439(L+G) mutation.
- the engineered Cas12i nuclease comprises one or more of the following flexible region mutations: I926G, L439(L+G), and L439(L+GG).
- the engineered Cas12i nuclease comprises an I926G mutation.
- the engineered Cas12i nuclease comprises a L439(L+G) mutation.
- the engineered Cas12i nuclease comprises a L439(L+GG) mutation.
- the amino acid residue numbering is based on SEQ ID NO:1.
- L439(L+G) means that, in the referenced amino acid sequence (such as SEQ ID NO: 1), one glycine (G) is inserted after the amino acid at position 439 while the original L at position 439 remains unchanged; in the context of this application and in the drawings, it is also sometimes indicated as 439G.
- L439(L+GG) means that, in the referenced amino acid sequence, two glycines (GG) are inserted after the amino acid at position 439 while the original L at position 439 remains unchanged; in the context of this application and in the drawings, it is also sometimes indicated as 439GG.
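
The two notations, a substitution such as I926G and an insertion such as L439(L+GG), can be applied to a sequence as in the sketch below. The parser, the function name `apply_mutations`, and the toy sequence are assumptions made for illustration; SEQ ID NO: 1 is not reproduced here.

```python
import re

SUB = re.compile(r"^([A-Z])(\d+)([A-Z])$")        # e.g. I926G
INS = re.compile(r"^([A-Z])(\d+)\(\1\+(G+)\)$")   # e.g. L439(L+GG)

def apply_mutations(seq, mutations):
    """Apply substitutions and glycine insertions to seq (1-based positions).
    Mutations are applied from the highest position down so that an insertion
    never shifts the numbering of a mutation still to be applied."""
    parsed = []
    for m in mutations:
        m = m.replace(" ", "")
        sub, ins = SUB.match(m), INS.match(m)
        if sub:
            parsed.append((int(sub.group(2)), sub.group(1), "sub", sub.group(3)))
        elif ins:
            parsed.append((int(ins.group(2)), ins.group(1), "ins", ins.group(3)))
        else:
            raise ValueError(f"unrecognized mutation: {m}")
    for pos, wt, kind, new in sorted(parsed, reverse=True):
        if seq[pos - 1] != wt:
            raise ValueError(f"expected {wt} at position {pos}, found {seq[pos - 1]}")
        if kind == "sub":
            seq = seq[:pos - 1] + new + seq[pos:]
        else:  # keep the original residue and place the glycines after it
            seq = seq[:pos] + new + seq[pos:]
    return seq

# Toy sequence (hypothetical): L at position 3, I at position 6.
mutant = apply_mutations("MKLTAIDE", ["L3(L+GG)", "I6G"])
print(mutant)
```

Numbering always refers to the unmodified reference sequence, which is why the sketch processes the highest position first.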
- the engineered Cas12i nuclease comprises a mutation or combination of mutations at any of the following amino acid residue positions: 926, 439, 925+926, 362+925+926, 439+926, 323+362+926 (such as I926, L439, N925+I926, D362+N925+I926, L439+I926, E323+D362+I926); wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the mutation at amino acid position 323, 362, 925, or 926 is a mutation that replaces the amino acid residue at the position with R, H, or K (such as R); wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the mutation at amino acid position 439 or 926 is a mutation that replaces the amino acid residue at the position with G or inserts G or GG after the amino acid residue; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease (for example, the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5)) comprises any one of the following amino acid residue mutations or combinations thereof: 926G, 439(L+GG), 925R+926G, 362R+925R+926G, 439(L+GG)+926R, or 323R+362R+926G; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- the engineered Cas12i nuclease comprises any one of the following mutations or combinations of mutations: I926G, L439(L+GG), L439(L+GG)+I926R, N925R+I926G, D362R+N925R+I926G, E323R+D362R+I926G; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
- an engineered Cas12i nuclease having at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above engineered Cas12i nuclease (for example, the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5), and/or the engineered Cas12i nuclease comprising the above flexible region mutations).
- the engineered Cas12i nuclease has at least about 85% (e.g., at least about any of 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the engineered Cas12i nuclease described above.
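
The sequence identity thresholds above can be made concrete with a minimal sketch. It assumes the two sequences have already been aligned to equal length by an external alignment tool; the function name and the denominator convention (all columns that are not gaps in both sequences) are assumptions, since the application does not specify how identity is computed.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue.
    Inputs must be pre-aligned to equal length, with '-' marking gaps.
    Columns that are gaps in both sequences are ignored; this denominator
    convention is an assumption, as several conventions exist."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)

# Toy aligned sequences (hypothetical): 9 of 10 columns match.
identity = percent_identity("MKLTAIDERS", "MKLTAGDERS")
print(identity, identity >= 85.0)
```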
- the engineered Cas12i nuclease comprises a mutation or combination of mutations at any of the following amino acid residue positions: 164, 176, 238, 323, 357, 362, 394, 439, 447, 563, 715, 719, 807, 844, 848, 857, 861, 925, 926, 958, 176+238+447+563, 323+362, 176+238+447+563+164, 176+238+447+563+926, 176+238+447+563+323+362, 164+926, 164+323+362, 176+238+447+563+164+926, 176+238+447+563+164+323+362, 176+238+447+563+164+926+323+362, 176+238+447+563+164+323+362, 176+238+447+563+164+323+362, 176
- the mutation at amino acid position 176, 238, 323, 362, 447, 563, 926, or 958 is a mutation that replaces the amino acid residue at the position with R, H, or K (such as R).
- the mutation at amino acid position 164 is a mutation that replaces the amino acid residue at that position with Y or F (e.g., Y).
- the mutation at amino acid position 439 or 926 is a mutation that replaces the amino acid residue at the position with G or inserts G or GG after the amino acid residue.
- the engineered Cas12i nuclease comprises a mutation of any one of the following amino acid residues or combinations of amino acid residues: N164, E176, K238, E323, D362, T447, E563, I926, I926, D958, L439, R857, N861, K807, N848, R715, R719, K394, H357, K844, E176+K238+T447+E563, E323+D362, E176+K238+T447+E563+N164, E176+K238+T447+E563+I926, E176+K238+T447+E563+E323+D362, N164+I926, N164+E323+D362, E176+K238+T447+E563+N164+I926, E176+K238+T447+E563+N164+I926, E176+K238+T447+E56
- amino acid position number is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises any one of the following mutations or combination of mutations:
- amino acid position number is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the engineered Cas12i nuclease comprises the following combination of mutations:
- E176R+K238R+T447R+E563R+N164Y+E323R+D362R wherein the amino acid position number is defined as the corresponding amino acid position shown in SEQ ID NO:1.
- the mutant is named as CasXX hereinafter, and its sequence number is SEQ ID NO: 8.
- the engineered Cas12i nuclease comprises any one of the following mutation combinations:
- the engineered Cas12i nuclease comprises the following combination of mutations:
- the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, and D362R mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, and I926R mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, E323R, and D362R mutations.
- the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and I926G mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, and L439 (L+GG) mutations.
- the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, and L439 (L+G) mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, and D958R mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, I926R, and D958R mutations.
- the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and D958R mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, I926R, E323R, D362R, and D958R mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, and D958R mutations.
- the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, L439 (L+GG), and D958R mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and K844A mutations.
- the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, R719A, and K844A mutations. In some embodiments, the engineered Cas12i nuclease comprises E176R, K238R, T447R, E563R, N164Y, E323R, D362R, R857A, and K844A mutations.
- the amino acid position numbers are as defined in the corresponding amino acid positions shown in SEQ ID NO:1.
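
Each combination listed above extends the CasXX mutation set (E176R+K238R+T447R+E563R+N164Y+E323R+D362R) or a subset of it. A minimal sketch for comparing a combination against CasXX follows; the helper names are hypothetical, and the split pattern is written so that the "+" inside an insertion such as L439(L+GG) is not treated as a separator.

```python
import re

# Split on '+' only when it is not inside parentheses, so L439(L+GG) stays whole.
SEPARATOR = re.compile(r"\+(?![^()]*\))")

def parse_combo(text):
    """Split a '+'-joined combination into a frozenset of individual mutations."""
    return frozenset(part.strip() for part in SEPARATOR.split(text) if part.strip())

CASXX = parse_combo("E176R+K238R+T447R+E563R+N164Y+E323R+D362R")

def added_to_casxx(combo_text):
    """Mutations in a combination that are not already part of CasXX."""
    return sorted(parse_combo(combo_text) - CASXX)

variant = "E176R+K238R+T447R+E563R+N164Y+E323R+D362R+R719A+K844A"
print(added_to_casxx(variant))
```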
- an engineered Cas12i nuclease having at least about 80% (e.g., at least about any of 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to the above-mentioned engineered Cas12i nuclease.
- an engineered Cas12i nuclease which comprises any of the amino acid sequences shown in SEQ ID NOs: 2-24, or an amino acid sequence having at least about 80% (e.g., at least about any of 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%) sequence identity to any of the amino acid sequences shown in SEQ ID NOs: 2-24.
- the reference Cas12i nuclease is Cas12i1, Cas12i2, or an ortholog thereof. In some embodiments, the reference Cas12i nuclease is native Cas12i1, or a variant thereof (e.g., a naturally occurring variant). In some embodiments, the reference Cas12i nuclease is native Cas12i2 (as shown in SEQ ID NO: 1), or a variant thereof (such as a naturally occurring variant). In some embodiments, the reference Cas12i nuclease is an engineered Cas12i nuclease (such as the engineered Cas12i nuclease comprising any one or more mutations in sections (1)-(5) as described in the present invention). In some embodiments, the reference Cas12i nuclease is CasXX (SEQ ID NO: 8).
- Type V-I CRISPR-Cas12i has been identified as an RNA-guided DNA endonuclease system. Unlike CRISPR-Cas systems such as Cas12b or Cas9, the Cas12i-based CRISPR system does not require a tracrRNA sequence.
- the RNA guide sequence comprises crRNA.
- the crRNAs described herein include direct repeat sequences and spacer sequences.
- the crRNA comprises, consists essentially of, or consists of a direct repeat sequence linked to a guide sequence or a spacer sequence.
- the crRNA includes a direct repeat sequence, a spacer sequence, and a direct repeat sequence (DR-spacer-DR), which is typical of precursor crRNA (pre-crRNA) configurations in other CRISPR systems.
- the crRNA includes truncated direct repeat and spacer sequences, which are typical features of processed or mature crRNA.
- the CRISPR-Cas12i effector protein forms a complex with an RNA guide sequence, and the spacer sequence directs the complex to bind, in a sequence-specific manner, to a target nucleic acid that is complementary to the spacer sequence (e.g., at least 70% complementary).
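
The complementarity threshold mentioned above can be illustrated with a minimal sketch. It assumes an RNA spacer and a DNA target strand of equal length, both written 5' to 3', and counts Watson-Crick pairs only; the function name and the sequences are hypothetical.

```python
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA base -> complementary DNA base

def percent_complementarity(spacer_rna, target_dna):
    """Percent of spacer positions that Watson-Crick pair with the target strand.
    Both sequences are written 5'->3', so the spacer is compared against the
    reversed target strand (antiparallel pairing). Equal lengths are assumed."""
    if len(spacer_rna) != len(target_dna):
        raise ValueError("spacer and target must have the same length")
    paired = sum(1 for r, d in zip(spacer_rna, reversed(target_dna))
                 if PAIR.get(r) == d)
    return 100.0 * paired / len(spacer_rna)

# Hypothetical 10-nt sequences; the target carries a single mismatch.
spacer = "GAUUACAGCU"
target = "AGCTGTAATG"
score = percent_complementarity(spacer, target)
print(score, score >= 70.0)
```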
- the engineered Cas12i of the present application is an endonuclease that binds to a specific site of a target sequence and cuts under the guidance of a guide RNA, and has DNA and RNA endonuclease activity.
- the Cas12i is capable of autonomous crRNA biogenesis by processing a precursor crRNA array. Autonomous pre-crRNA processing facilitates Cas12i delivery, enabling double-nicking applications, as two separate genomic loci can be targeted by a single crRNA transcript.
- the Cas12i protein then processes the CRISPR array into two homologous crRNAs, forming a paired nicking complex.
- the guide RNA comprises a precursor crRNA expressed from a CRISPR array consisting of target sequences interleaved with unprocessed DR sequences; the array is processed by the intrinsic precursor crRNA processing activity of the effector protein, enabling simultaneous targeting of one, two, or more sites.
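
The layout of such an array (DR-spacer-DR-spacer-DR) can be sketched as below. This models only the arrangement of repeats and spacers, not the enzymatic processing itself; the function names and the placeholder sequences are assumptions, and the direct repeat shown is not a real Cas12i direct repeat.

```python
def build_pre_crrna(direct_repeat, spacers):
    """Concatenate DR-spacer units, ending with a trailing DR."""
    return "".join(direct_repeat + s for s in spacers) + direct_repeat

def split_units(pre_crrna, direct_repeat):
    """Recover the spacers by cutting at every direct repeat.
    Assumes the DR does not occur inside any spacer."""
    return [piece for piece in pre_crrna.split(direct_repeat) if piece]

# Placeholder sequences for illustration only.
DR = "AAUUGGCC"
spacers = ["GACUGACUGACUGACUGACU", "CUAGCUAGCUAGCUAGCUAG"]
array = build_pre_crrna(DR, spacers)
recovered = split_units(array, DR)
print(recovered)
```

A single transcript carrying two spacers, as here, corresponds to the two-site targeting case described above.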
- Cas12i nucleases from various organisms can be used as the reference Cas12i nucleases to provide the engineered Cas12i nucleases and effector proteins of the present application.
- Exemplary Cas12i nucleases have been described, e.g., in WO2019/201331A1 and US2020/0063126A1, which are hereby incorporated by reference in their entirety.
- the reference Cas12i nuclease has enzymatic activity.
- the reference Cas12i is a nuclease, i.e., it cleaves both strands of a target duplex nucleic acid (e.g., duplex DNA).
- the reference Cas12i is a nickase, i.e., it cuts a single strand of a target duplex nucleic acid (e.g., duplex DNA).
- the reference Cas12i nuclease is enzymatically inactive.
- the reference Cas12i nuclease is Cas12i1, Cas12i2, or Cas12i-Phi.
- the reference Cas12i nuclease contains the sequence of SEQ ID NO: 1.
- the engineered Cas12i nuclease is based on a functional variant (or functional derivative) of a naturally occurring Cas12i nuclease.
- a functional variant differs in amino acid sequence by at least one amino acid residue (eg, with a deletion, insertion, substitution and/or fusion).
- the functional variant has one or more mutations, such as amino acid substitutions, insertions and deletions.
- the functional variant may comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more amino acid substitutions.
- the one or more substitutions are conservative substitutions.
- the functional variant has all domains of a naturally occurring Cas12i nuclease.
- the functional variant does not have one or more domains of a naturally occurring Cas12i nuclease.
- the functional variant has all domains of the engineered Cas12i nuclease.
- the functional variant does not have one or more domains of the engineered Cas12i nuclease.
- the biological activity of the functional variant of the Cas12i nuclease changes due to a change in its amino acid, for example, from a native nuclease to an enzyme-inactive mutant.
- the Cas12i variant may comprise Domain, identity percentage, etc.) Cas12i protein sequence.
- the functional variant of the engineered Cas12i nuclease has enzymatic activity (such as DNA double-strand or single-strand cleavage activity), or has at least about 60% of the enzymatic activity of a reference Cas12i nuclease (or of its parent engineered Cas12i nuclease), such as at least about any of 65%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100%.
- the enzymatic activity of the functional variant of the engineered Cas12i nuclease is at least 1.1 times (such as at least 1.2 times, 1.5 times, 2 times, 3 times, 4 times, 5-fold, 10-fold or more) of the enzyme activity.
- the functional variant of the engineered Cas12i nuclease has a different catalytic activity than its engineered non-functional variant mutant form of the Cas12i nuclease.
- the functional variant mutations eg, amino acid substitutions, insertions and/or deletions
- the functional variant of the engineered Cas12i nuclease comprises mutations in one or more catalytic domains.
- a Cas12i nuclease that cleaves one strand of a double-stranded target nucleic acid without cutting the other is referred to herein as a "nickase" (eg, "Cas12i nickase").
- a Cas12i nuclease having substantially no nuclease activity is referred to as an inactivated Cas12i protein ("dCas12i") (nuclease activity may nonetheless be provided by a heterologous polypeptide (fusion partner) fused to it; see the "Engineered Cas12i Effector Proteins" section below for details).
- the DNA cleavage activity of the mutated functional variant is less than about 25%, 10%, 5%, 1%, 0.1%, 0.01% or lower relative to the corresponding form lacking the mutation.
- the functional variant of the Cas12i nuclease is considered to substantially lack all DNA cleavage activity.
- mutation of one or more amino acid residues in the Cas12i nuclease active site can yield a Cas12i with reduced or lost enzymatic activity (dCas12i), also referred to in the present invention as "Cas12i with loss of nuclease activity" or "enzymatically inactive mutant".
- the engineered Cas12i nucleases provided herein can be modified to have reduced or absent nuclease activity, e.g., a nuclease-activity-deficient Cas12i whose nuclease activity (such as DNA double-strand or single-strand cleavage activity) is reduced by at least about 50% (such as by at least about any of 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100%) compared with the wild-type Cas12i nuclease (or its parent engineered Cas12i nuclease).
- the Cas12i nuclease activity can be reduced by several methods, for example, introducing mutations into one or more domains of the Cas12i nuclease: a domain interacting with PAM, a domain involved in opening a DNA double strand, RuvC domain, region interacting with nucleic acid (DNA/RNA), etc.
- the catalytic residues of the Cas12i nuclease can be replaced by different amino acid residues (for example, glycine or alanine) to reduce the nuclease activity.
- Examples of such mutations of Cas12i1 include D647A, E894A and/or D948A.
- Examples of such mutations of Cas12i2 include D599A, E833A, S883A, H884A, D886A, R900A and/or D1019A.
- the engineered Cas12i nuclease or a functional derivative thereof comprises one or more of the following mutations that abolish nuclease activity: D599A, E833A, S883A, H884A, R900A and D1019A; wherein the amino acid positions are defined by the corresponding amino acid positions shown in SEQ ID NO:1.
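The substitution notation used above (for example, D599A: aspartate at position 599 replaced by alanine) can be handled programmatically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the seven-residue peptide is a toy sequence, not SEQ ID NO: 1, which is not reproduced in this section.

```python
import re

def parse_substitution(code):
    """Parse a code such as 'D599A' into (wild-type residue, 1-based position, new residue)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return m.group(1), int(m.group(2)), m.group(3)

def apply_substitutions(protein, codes):
    """Return a mutated copy of `protein`, checking each wild-type residue first."""
    residues = list(protein)
    for code in codes:
        wt, pos, new = parse_substitution(code)
        if residues[pos - 1] != wt:
            raise ValueError(f"{code}: position {pos} is {residues[pos - 1]}, not {wt}")
        residues[pos - 1] = new
    return "".join(residues)

toy = "MKDLEHS"  # hypothetical peptide for illustration, NOT SEQ ID NO: 1
mutant = apply_substitutions(toy, ["D3A", "E5A"])
```

Checking the wild-type residue before substituting guards against numbering a mutation against the wrong reference sequence, which matters here because positions are defined relative to SEQ ID NO:1.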
- the engineered Cas12i nuclease (or functional variant thereof) has increased activity compared to the reference Cas12i nuclease. In some embodiments, the engineered Cas12i nuclease (or functional variant thereof) has reduced activity compared to the reference Cas12i nuclease. In some embodiments, the activity is target DNA binding activity. In some embodiments, the activity is a site-specific nuclease activity. In some embodiments, the activity is the activity of opening double strands of DNA. In some embodiments, the activity is double-stranded DNA cleavage activity.
- the activity is single-stranded DNA cleavage activity, including, for example, site-specific DNA cleavage activity or non-specific DNA cleavage activity.
- the activity is single-stranded RNA cleavage activity, eg, site-specific RNA cleavage activity or non-specific RNA cleavage activity.
- the activity is measured in vitro.
- the activity is measured in cells, such as bacterial cells, plant cells, or eukaryotic cells.
- the activity is measured in mammalian cells, such as rodent cells or human cells.
- the activity is measured in human cells, such as 293T cells.
- the activity is measured in mouse cells, eg, Hepa1-6 cells.
- the engineered Cas12i nuclease (or a functional variant thereof) has, compared to a reference Cas12i nuclease, an increase of at least about any one of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 1-fold, 1.1-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold or more in activity (one or more of the activities mentioned above, such as site-specific nuclease activity).
- the engineered Cas12i nuclease (or a functional variant thereof) has, compared to a reference Cas12i nuclease, a reduction of at least about any one of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 1-fold, 1.1-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold or more in activity (one or more of the activities mentioned above, such as site-specific nuclease activity).
- the site-specific nuclease activity of the engineered Cas12i nuclease can be measured using methods known in the art, including, for example, gel shift assays and the agarose gel electrophoresis-based in vitro cleavage assay described in the Examples provided herein.
- the activity is gene editing activity in the cell.
- the cells are bacterial cells, plant cells, or eukaryotic cells.
- the cells are mammalian cells such as rodent cells or human cells.
- the cells are 293T cells.
- the activity is measured in mouse cells, such as Hepa1-6 cells.
- the activity is an indel forming activity at a target genomic site in the cell, for example, site-specific cleavage and DNA repair occurs through the mechanism of non-homologous end joining (NHEJ).
- the activity is the activity of inserting an exogenous nucleic acid sequence at a target genomic site in the cell, for example through site-specific cleavage of the target nucleic acid by the engineered Cas12i nuclease (or a functional variant thereof) followed by DNA repair via the homologous recombination (HR) mechanism (with a repair template additionally introduced).
- the engineered Cas12i nuclease (or its functional variant) is compared with the reference Cas12i nuclease in the target genome of cells (e.g.
- human cells such as 293T cells, or mouse Hepa1-6 cells
- the engineered Cas12i nuclease (or functional variant thereof) has, at a greater number (for example, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more) of target genomic loci in cells (e.g., human cells such as 293T cells, or mouse Hepa1-6 cells), gene editing (e.g., deletion formation, or repair) activity increased by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 1-fold, 1.1-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold or more compared with the reference Cas12i2 nuclease.
- the engineered Cas12i nuclease (or functional variant thereof) is capable of editing a greater number (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70 or more) of genomic loci than the reference Cas12i nuclease, for example by recognizing a greater number (eg, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70 or more) of PAM sequences.
- the consensus PAM sequence of the engineered Cas12i nuclease (or functional variant thereof) is identical to the reference Cas12i nuclease.
- the gene editing efficiency of engineered Cas12i nucleases (or functional variants thereof) in vitro or in cells can be determined using methods known in the art, including, for example, the T7 endonuclease 1 (T7E1) assay, sequencing of target DNA (including, for example, Sanger sequencing and next-generation sequencing), tracking of indels by decomposition (TIDE), or indel detection by amplicon analysis (IDAA).
- targeted next-generation sequencing is used to measure the gene editing efficiency of the engineered Cas12i nuclease in cells.
- exemplary genomic loci that can be used to determine the gene editing efficiency of the engineered Cas12i nuclease (or functional variant thereof) include, but are not limited to, CCR5, AAVS, CD34, RNF2, and EMX1.
- the gene editing efficiency of the engineered Cas12i nuclease is its average gene editing efficiency across at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65 or more loci (such as genomic loci in human cells).
- the gene editing efficiency (such as indel rate) of the engineered Cas12i nuclease (or its functional variant) reaches at least 10%, 20%, 30%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or higher.
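As a worked illustration of the averaged efficiency described above, the sketch below computes a per-locus indel rate from sequencing read counts and then the mean across loci. The read counts are invented for the example and the function names are hypothetical; the locus names are taken from the exemplary loci listed above.

```python
def indel_rate(edited_reads, total_reads):
    """Indel rate at one locus, as a percentage of sequenced reads."""
    return 100.0 * edited_reads / total_reads

def mean_editing_efficiency(counts):
    """Unweighted mean of the per-locus indel rates (percent)."""
    rates = [indel_rate(edited, total) for edited, total in counts.values()]
    return sum(rates) / len(rates)

# Hypothetical (edited reads, total reads) per locus, for illustration only
counts = {
    "CCR5": (4200, 10000),
    "RNF2": (3000, 10000),
    "EMX1": (5400, 12000),
}
average = mean_editing_efficiency(counts)
```

The mean is taken over per-locus rates rather than over pooled reads, so a deeply sequenced locus does not dominate the average; this weighting is a design assumption of the sketch, not something the text specifies.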
- the engineered Cas12i nuclease (or functional variant thereof) has increased on-target specificity compared to the reference Cas12i nuclease, with a reduced off-target rate (e.g., fewer recognized off-target sites, and/or reduced editing efficiency at one or more off-target sites), and/or increased target sequence editing efficiency.
- the engineered Cas12i nuclease (or functional variant thereof) has an off-target rate reduced by at least about 5% (such as by at least about any one of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100%) compared to the reference Cas12i nuclease.
- the engineered Cas12i nuclease (or functional variant thereof) has at least one (such as at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more) fewer off-target sites compared to the reference Cas12i nuclease.
- the engineered Cas12i nuclease (or a functional variant thereof) has the same or similar editing efficiency for the target sequence (eg, within 1.1-fold) as compared to the reference Cas12i nuclease, or the editing efficiency for the target sequence is increased (eg, at least 1.2-fold, 1.5-fold, 2-fold, 3-fold, 5-fold, 10-fold or more than the reference Cas12i nuclease).
- the engineered Cas12i nuclease (or its functional variant) has an off-target rate reduced by at least about 5% compared with the reference Cas12i nuclease, and has an editing efficiency for the target sequence that is the same, similar (for example, within 1.1-fold), or increased.
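The combined criterion above (off-target rate reduced by at least about 5%, with on-target efficiency the same, similar, or increased) can be expressed as simple arithmetic. The sketch below is one possible reading, not a definition from the application: it assumes that "within 1.1-fold" means the engineered-to-reference ratio is no lower than 1/1.1, and all function names and input values are hypothetical.

```python
def percent_reduction(reference, engineered):
    """Relative reduction of `engineered` versus `reference`, in percent."""
    return 100.0 * (reference - engineered) / reference

def fold_change(reference, engineered):
    """Ratio of the engineered value to the reference value."""
    return engineered / reference

def meets_criteria(ref_off, eng_off, ref_on, eng_on,
                   min_reduction=5.0, similar_within=1.1):
    """True if the off-target rate falls by >= min_reduction percent and the
    on-target efficiency is similar (assumed: ratio >= 1/similar_within) or higher."""
    reduced = percent_reduction(ref_off, eng_off) >= min_reduction
    on_target_ok = fold_change(ref_on, eng_on) >= 1.0 / similar_within
    return reduced and on_target_ok

# Hypothetical rates in percent: off-target 8.0 -> 2.0, on-target 40.0 -> 38.0
ok = meets_criteria(8.0, 2.0, 40.0, 38.0)
```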
- the guide RNA or crRNA comprises or consists of the following from 5' to 3': direct repeat sequence, spacer sequence. In some embodiments, the guide RNA comprises or consists of the following from 5' to 3': a direct repeat sequence, a spacer sequence, a nucleotide sequence constructed in tandem of direct repeat sequences. In some embodiments, the RNA guide comprises crRNA. In some embodiments, the guide RNA does not comprise tracrRNA.
- the crRNAs described herein include direct repeat sequences and spacer sequences.
- the crRNA comprises, consists essentially of, or consists of a direct repeat sequence linked to a guide sequence or a spacer sequence.
- the crRNA includes a direct repeat sequence, a spacer sequence, and a direct repeat sequence (DR-spacer-DR), which is a typical precursor crRNA (pre-crRNA) configuration in other CRISPR systems.
- the crRNA includes truncated direct repeat and spacer sequences, which are typical of processed or mature crRNA.
- the CRISPR-Cas effector protein forms a complex with the RNA guide, and the spacer sequence directs the complex for sequence-specific binding to a target nucleic acid that is complementary to the spacer sequence.
- the RNA guide comprises direct repeats. In some embodiments, the RNA guide can form a secondary structure, eg, a stem-loop structure as described herein.
- a CRISPR system described herein comprises a plurality of RNA guides (eg, 2, 3, 4, 5, 10, 15 or more) or a plurality of nucleic acids encoding the plurality of RNA guides.
- the CRISPR systems described herein comprise a single RNA strand or a nucleic acid encoding a single RNA strand, wherein the RNA guides are arranged in tandem.
- the single RNA strand can comprise multiple copies of the same RNA guide, multiple copies of different RNA guides, or a combination thereof.
- each RNA guide is specific for a different target nucleic acid.
- the CRISPR systems described herein include RNA guides or nucleic acids encoding RNA guides.
- the RNA guide comprises or consists of a direct repeat sequence and a spacer sequence capable of hybridizing (eg, hybridizing under appropriate conditions) to a target nucleic acid.
- the RNA guide sequence can be modified in a manner that allows formation of the CRISPR effector complex and successful binding of the target sequence, while not allowing successful nuclease activity (i.e., no nuclease activity/causing indels).
- such modified guide sequences are referred to as "inactivated guides" or "inactivated guide sequences".
- These inactive guides or inactive guide sequences may be catalytically inactive or conformationally inactive with respect to nuclease activity. Inactive guide sequences are generally shorter than the corresponding guide sequences that result in active cleavage.
- an inactive guide is at least about 5%, 10%, 20%, 30%, 40%, or 50% shorter than a corresponding RNA guide having nuclease activity.
- the inactive guide sequence of the RNA guide can be 13 to 15 nucleotides (e.g., 13, 14, or 15 nucleotides), 15 to 19 nucleotides, or 17 to 18 nucleotides in length (eg, 17 nucleotides).
- the inactivated guide RNA is capable of hybridizing to the target sequence such that the CRISPR system is directed to the genomic locus of interest in the cell without detectable cleavage activity.
- RNA guide sequence comprises base modifications.
- the present invention also provides all possible variants of a nucleic acid (eg, cDNA) that can be prepared by selecting combinations based on possible codon usage. These combinations are made according to the standard triplet genetic code applied to polynucleotides encoding naturally occurring variants, and all such variants are considered specifically disclosed.
- the spacer sequence (or spacer, guide sequence) can be complementary to the target sequence of the target nucleic acid (such as DNA), such as at least about 70% (such as at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%) complementary.
- the spacer sequence is complementary to at least 15 (e.g., at least 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50 or more) nucleotides of the target sequence of the target nucleic acid (e.g., DNA).
- cleavage efficiency can be adjusted by introducing mismatches (e.g., one or more mismatches, such as 1 or 2 mismatches between the spacer sequence and the target sequence), including by choosing the position of the mismatch along the spacer/target. The more centrally a mismatch (such as a double mismatch) is located (i.e., not at the 3' or 5' end), the more cleavage efficiency is affected. Thus, by selecting the position of the mismatch along the spacer sequence, the efficiency of cleavage can be tuned. For example, if less than 100% cleavage of the target is desired (eg, in a population of cells), 1 or 2 mismatches between the spacer sequence and the target sequence may be introduced in the spacer sequence.
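Percent complementarity and mismatch position, as discussed above, can be computed directly from the sequences. The sketch below makes several assumptions that are not in the text: the spacer is compared with the protospacer written 5' to 3' on the non-target strand (so full complementarity to the target strand appears as identity after T-to-U conversion), "central" means more than five nucleotides from either end, and both sequences are hypothetical.

```python
def mismatch_positions(spacer_rna, protospacer_dna):
    """1-based positions where the spacer differs from the protospacer.

    Assumption: the protospacer is given 5'->3' on the non-target strand.
    """
    if len(spacer_rna) != len(protospacer_dna):
        raise ValueError("spacer and protospacer must have equal length")
    expected = protospacer_dna.upper().replace("T", "U")
    return [i + 1 for i, (a, b) in enumerate(zip(spacer_rna.upper(), expected)) if a != b]

def percent_complementarity(spacer_rna, protospacer_dna):
    """Share of spacer positions that pair with the target strand, in percent."""
    mismatches = mismatch_positions(spacer_rna, protospacer_dna)
    return 100.0 * (len(spacer_rna) - len(mismatches)) / len(spacer_rna)

def is_central(position, length, flank=5):
    """True if a 1-based position lies more than `flank` nt from both ends (assumed cutoff)."""
    return flank < position <= length - flank

protospacer = "GATTACAGATTACAGATTAC"  # hypothetical 20-nt target
spacer = "GAUUACAGAUCACAGAUUAC"       # one deliberate mismatch at position 11
positions = mismatch_positions(spacer, protospacer)
```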
- the guide sequence can be of suitable length.
- the spacer length of the RNA guide can range from about 11 to 50 (eg, about 15 to 50) nucleotides.
- the guide sequence is between about 18 and about 35 nucleotides, including, for example, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleotides.
- the RNA guide has a spacer length of at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 21 nucleotides or at least 22 nucleotides.
- the spacer is 15 to 17 nucleotides, 15 to 23 nucleotides, 16 to 22 nucleotides, 17 to 20 nucleotides, 20 to 24 nucleotides (for example, 20, 21, 22, 23 or 24 nucleotides), 23 to 25 nucleotides (for example, 23, 24 or 25 nucleotides), 24 to 27 nucleotides, 27 to 30 nucleotides, 30 to 45 nucleotides (e.g., 30, 31, 32, 33, 34, 35, 40 or 45 nucleotides), 30 or 35 to 40 nucleotides, 41 to 45 nucleotides, 45 to 50 nucleotides, or longer in length.
- the spacer of the RNA guide is 31 nucleotides in length. In some embodiments, the RNA guide has a direct repeat length of at least 21 nucleotides, or between 21 and 37 nucleotides (eg, 23, 24, 25, 30, 35, or 36 nucleotides). In some embodiments, the spacer sequence comprises or consists of about 15 to about 34 nucleotides (eg, 16, 17, 18, 19, 20, 21 or 22 nucleotides). In some embodiments, the spacer sequence is between 17 and 31 nucleotides in length. In some embodiments, the spacer sequence is between 15 and 24 nucleotides in length. In some embodiments, the direct repeat length of the RNA guide is 23 or 20 nucleotides.
- the direct repeat sequence can direct the Cas12i protein (such as any engineered Cas12i nuclease of the present invention, its functional variant or effector protein) to bind to the guide RNA (gRNA or crRNA) to form a CRISPR-Cas complex targeting the target sequence.
- Any DR that can direct the engineered Cas12i nuclease or effector protein of the present application to bind to the guide RNA (gRNA or crRNA) to form a CRISPR-Cas complex targeting the target sequence can be used in the present invention, such as the DR sequences described in US11168324 (the entire content of which is incorporated herein by reference).
- a direct repeat can consist of two stretches of nucleotides, which can be complementary to each other, separated by intervening nucleotides, such that the direct repeat can hybridize to form a double-stranded RNA duplex (dsRNA duplex), resulting in a stem-loop structure in which the two stretches of complementary nucleotides form a stem and the intervening nucleotides form a loop or hairpin.
- the intermediate nucleotides forming a "loop" have a length of from about 6 nucleotides to about 8 nucleotides, or about 7 nucleotides.
- the stem may comprise at least 2, at least 3, at least 4 or 5 base pairs.
- a direct repeat may comprise two complementary stretches of nucleotides of about 4 to about 7 (eg, 4, 5, 6, 7) nucleotides in length, separated by about 5 to about 9 (such as 5, 6, 7, 8, 9) nucleotides.
- Those skilled in the art can model known direct repeat structures.
- Direct repeats may comprise or consist of about 13 to about 23 nucleotides, about 22 to about 40 nucleotides, or about 23 to about 38 nucleotides, or about 23 to about 36 nucleotides.
- the direct repeat sequence includes a stem-loop structure near the 3' end (immediately adjacent to the spacer sequence). In some embodiments, the direct repeat sequence includes a stem-loop near the 3' end, wherein the stem is 5 nucleotides in length. In some embodiments, the direct repeat sequence includes a stem-loop near the 3' end, wherein the stem is 5 nucleotides in length and the loop is 7 nucleotides in length. In some embodiments, the direct repeat sequence includes a stem-loop near the 3' end, wherein the stem is 5 nucleotides in length and the loop is 6, 7 or 8 nucleotides in length.
- the direct repeat sequence includes near the 3' end the sequence 5'-CCGUCNNNNNNNUGACGG-3' (SEQ ID NO: 68), where N refers to any nucleobase. In some embodiments, the direct repeat sequence includes near the 3' end the sequence 5'-GUGCCNNNNNNNUGGCAC-3' (SEQ ID NO: 69), where N refers to any nucleobase.
- the direct repeat sequence includes the sequence 5'-GUGUC-N(5-6)-UGACA-X1-3' (SEQ ID NO: 70 or 71) near the 3' end, wherein N(5-6) refers to a contiguous sequence of any 5 or 6 nucleobases, and X1 refers to C or T or U.
- the direct repeat sequence includes the sequence 5'-UC-X3-U-X5-X6-X7-UUGACGG-3' (SEQ ID NO: 72) near the 3' end, wherein X3 refers to C or T or U, X5 refers to A or T or U, X6 refers to A or C or G, and X7 refers to A or G.
- the direct repeat sequence includes the sequence 5'-CC-X3-X4-X5-C-X7-UUGGCAC-3' (SEQ ID NO: 73) near the 3' end, wherein X3 refers to C or T or U, X4 refers to A or T or U, X5 refers to C or T or U, and X7 refers to A or G.
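Degenerate motifs of the kind listed above can be checked against a candidate direct repeat with a regular expression. The sketch below encodes the SEQ ID NO: 72 motif and tests it on the RNA form of SEQ ID NO: 59, which is quoted elsewhere in this description. The helper name and the element-list encoding are assumptions made for illustration, and T is treated as U throughout because the motif is stated for RNA.

```python
import re

IUPAC_RNA = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]"}

def motif_to_regex(elements):
    """Build a regex from literal strings (A/C/G/U/N) and sets of allowed bases."""
    parts = []
    for el in elements:
        if isinstance(el, str):
            parts.append("".join(IUPAC_RNA[b] for b in el))
        else:
            parts.append("[" + "".join(sorted(el)) + "]")
    return re.compile("".join(parts))

# 5'-UC-X3-U-X5-X6-X7-UUGACGG-3' (SEQ ID NO: 72), with T folded into U
SEQ72 = motif_to_regex(
    ["UC", {"C", "U"}, "U", {"A", "U"}, {"A", "C", "G"}, {"A", "G"}, "UUGACGG"]
)

dr_rna = "AGAAATCCGTCTTTCATTGACGG".replace("T", "U")  # SEQ ID NO: 59 as RNA
match = SEQ72.search(dr_rna)
```

For this input the motif is found at the very 3' end of the repeat, consistent with the "near the 3' end" wording.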
- the nucleotide sequence encoding the direct repeat sequence comprises or consists of a nucleotide sequence at least about 80% (such as at least about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%) identical to SEQ ID NO: 59 (AGAAATCCGTCTTTCATTGACGG) or SEQ ID NO: 79 (GTTGCAAAACCCAAGAAATCCGTCTTTCATTGACGG).
- the nucleotides encoding the direct repeat sequence comprise at least 21 (e.g., 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35 or 36) nucleotides.
- "Stem-loop structure" refers to a nucleic acid having a secondary structure that includes a region of nucleotides known or predicted to form a double strand (stem portion), linked on one side by a region of predominantly single-stranded nucleotides (loop portion).
- the terms “hairpin” and “fold-back” structures are also used herein to refer to stem-loop structures. Such structures are well known in the art, and these terms are used consistently with their art-known meanings.
- stem-loop structures do not require precise base pairing.
- the stem may include one or more base mismatches.
- base pairing can be exact, ie not include any mismatches.
- the stem of the direct repeat consists of 5 complementary nucleobases that hybridize to each other and the loop is 6, 7 or 9 nucleotides in length.
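The stem-loop geometry described above (a 5-nucleotide stem near the 3' end enclosing a short loop) can be checked on a candidate repeat as follows. This is a simplified sketch: it requires perfect Watson-Crick pairing, although the text notes that stems may tolerate mismatches; it assumes the 3' arm of the stem ends exactly at the 3' end of the repeat; and the function names are hypothetical. The input is the RNA form of SEQ ID NO: 59.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(PAIR[b] for b in reversed(rna))

def find_3prime_stem_loop(rna, stem_len=5, loop_lens=(6, 7, 8)):
    """Return (5' arm, loop, 3' arm) for a perfectly paired stem whose 3' arm
    ends at the 3' end of `rna`, or None if no such stem-loop exists."""
    arm3 = rna[-stem_len:]
    for loop_len in loop_lens:
        start = len(rna) - 2 * stem_len - loop_len
        if start < 0:
            continue
        arm5 = rna[start:start + stem_len]
        if arm5 == reverse_complement(arm3):
            loop = rna[start + stem_len:len(rna) - stem_len]
            return arm5, loop, arm3
    return None

dr_rna = "AGAAAUCCGUCUUUCAUUGACGG"  # RNA form of SEQ ID NO: 59
stem_loop = find_3prime_stem_loop(dr_rna)
```

For SEQ ID NO: 59 this yields a 5-nucleotide stem with a 7-nucleotide loop.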
- the sequence encoding the direct repeat comprises or consists of the sequence shown in SEQ ID NO:79. In some embodiments, the sequence encoding the direct repeat comprises or consists of the SEQ ID NO:79 nucleic acid sequence with a truncation of the initial three 5' nucleotides. In some embodiments, the sequence encoding the direct repeat comprises or consists of the SEQ ID NO:79 nucleic acid sequence with a truncation of the initial four 5' nucleotides.
- the sequence encoding the direct repeat comprises or consists of the SEQ ID NO:79 nucleic acid sequence with a truncation of the initial five 5' nucleotides. In some embodiments, the sequence encoding the direct repeat comprises or consists of the SEQ ID NO:79 nucleic acid sequence with a truncation of the initial six 5' nucleotides. In some embodiments, the sequence encoding the direct repeat comprises or consists of the SEQ ID NO:79 nucleic acid sequence with a truncation of the initial seven 5' nucleotides.
- the sequence encoding the direct repeat comprises or consists of the SEQ ID NO:79 nucleic acid sequence with a truncation of the initial eight 5' nucleotides. In some embodiments, the sequence encoding the direct repeat comprises or consists of the sequence shown in SEQ ID NO:59.
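The truncated forms listed above can be generated mechanically from SEQ ID NO: 79. The sketch below assumes that "truncation" means removing the stated number of nucleotides from the 5' end; the function name is hypothetical. Under that reading, removing thirteen 5' nucleotides from SEQ ID NO: 79 gives exactly SEQ ID NO: 59.

```python
SEQ_ID_NO_79 = "GTTGCAAAACCCAAGAAATCCGTCTTTCATTGACGG"
SEQ_ID_NO_59 = "AGAAATCCGTCTTTCATTGACGG"

def truncate_5prime(seq, n):
    """Remove the first n nucleotides from the 5' end (assumed meaning of 'truncation')."""
    if not 0 <= n < len(seq):
        raise ValueError("n must be between 0 and len(seq) - 1")
    return seq[n:]

# Truncations of three to eight 5' nucleotides, as enumerated in the text
truncations = {n: truncate_5prime(SEQ_ID_NO_79, n) for n in range(3, 9)}
```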
- the direct repeat is a "functional variant" of the RNA sequence encoded by SEQ ID NO: 59 or 79, such as a "functionally truncated version", a "functionally extended version", or a "functionally substituted version"; for example, a part of SEQ ID NO:79 (a truncated version) that still has DR function.
- a DR "functional variant" is a 5' and/or 3' extension (functionally extended version) or truncation (functionally truncated version) of a reference DR (such as a parental DR), and/or a reference DR sequence with one or more nucleotides inserted, deleted, and/or substituted (functionally substituted version), which still has at least 20% (such as at least about any of 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or higher) of the function of the reference DR sequence, that is, the function of mediating the binding of the Cas12i protein to the corresponding crRNA.
- DR functional variants generally retain the stem-loop-like secondary structure or part thereof available for Cas12i protein binding.
- the DR or its functional variant comprises a stem-loop-like secondary structure or part thereof available for Cas12i protein binding. In some embodiments, the DR or its functional variant comprises at least two (such as 2, 3, 4, 5 or more) stem-loop-like secondary structures or parts thereof available for Cas12i protein binding.
- pegRNA Prime editing guide RNA
- pegRNAs are altered (relative to standard guide RNAs) to include an extension that provides a DNA synthesis template sequence encoding a single-stranded DNA flap; the flap is homologous to the strand of the targeted endogenous DNA sequence that is to be edited but contains the desired one or more nucleotide changes, and it is incorporated into the target DNA molecule after being synthesized by a polymerase (eg, reverse transcriptase).
- the extended guide RNA comprises (a) a guide RNA and (b) an RNA extension at the 5' or 3' end of the guide RNA or in an intramolecular position of the guide RNA.
- the intramolecular positioning of the extension does not disrupt the function of the protospacer.
- the RNA extension may comprise (i) a reverse transcription template sequence comprising the desired nucleotide changes, (ii) a reverse transcription primer binding site, and (iii) an optional linker sequence.
- the reverse transcription template sequence may encode a single-stranded DNA flap complementary to an endogenous DNA sequence adjacent to the nicking site, wherein the single-stranded DNA flap contains the desired nucleotide change.
- the ssDNA flap displaces endogenous ssDNA at the nicking site.
- the desired nucleotide change incorporated into the target DNA can be a single nucleotide change (e.g., a transition or transversion), an insertion of one or more nucleotides, or a deletion of one or more nucleotides.
- the desired nucleotide change may be a single nucleotide substitution (eg, transition or transversion), deletion or insertion.
- the desired nucleotide change may be (1) G to T substitution, (2) G to A substitution, (3) G to C substitution, (4) T to G substitution, (5) T to A substitution, (6) T to C substitution, (7) C to G substitution, (8) C to T substitution, (9) C to A substitution, (10) A to T substitution, (11) A to G substitution, or (12) A to C substitution.
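The twelve single-nucleotide substitutions enumerated above are all ordered pairs of distinct bases, and each is either a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion. A short sketch, with hypothetical names, that enumerates and classifies them:

```python
from itertools import permutations

PURINES = {"A", "G"}

def classify(ref, alt):
    """Transition if both bases are purines or both are pyrimidines, else transversion."""
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

# All ordered pairs of distinct bases: the 12 substitutions listed above
substitutions = {
    (ref, alt): classify(ref, alt) for ref, alt in permutations("GTCA", 2)
}
transitions = sorted(pair for pair, kind in substitutions.items() if kind == "transition")
```

Four of the twelve substitutions are transitions (A to G, G to A, C to T, T to C); the remaining eight are transversions.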
- the engineered Cas12i nuclease (or its functional variant) of the present invention can recognize the protospacer adjacent motif (PAM).
- the target nucleic acid comprises a PAM.
- the PAM is located 5' to a sequence in the target nucleic acid that is complementary to the targeting sequence of the guide RNA.
- the PAM comprises or consists of the nucleic acid sequence 5'-TTN-3', 5'-TTH-3', 5'-TTY-3' or 5'-TTC-3'.
- the PAM comprises or consists of the nucleic acid sequence 5'-TTTN-3'.
- the PAM comprises or consists of the nucleic acid sequence 5'-TTN-3'.
- the PAM applicable to the engineered Cas12i nuclease (or its functional variant) of the present invention includes the nucleic acid sequence 5'-TTTA-3', 5'-CTTA-3', 5'-GTTA-3', 5'-ATTA-3', 5'-TTTC-3', 5'-CTTC-3', 5'-GTTC-3', 5'-ATTC-3', 5'-TTTG-3', 5'-CTTG-3', 5'-GTTG-3', 5'-ATTG-3', 5'-TTTT-3', 5'-CTTT-3', 5'-GTTT-3', 5'-ATTT-3'.
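The degenerate PAM notations above use standard IUPAC ambiguity codes (N = any base, H = A, C or T, Y = C or T). The sketch below expands such a pattern into its concrete sequences and scans a DNA strand for matches; note that the sixteen four-nucleotide PAMs listed above are exactly the expansion of 5'-NTTN-3'. The function names and the scanned sequence are hypothetical.

```python
from itertools import product

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "H": "ACT", "Y": "CT"}

def expand(pattern):
    """List every concrete sequence matched by a degenerate IUPAC pattern."""
    return sorted("".join(p) for p in product(*(IUPAC[c] for c in pattern)))

def find_pam_sites(dna, pattern):
    """Return (0-based index, PAM) for every match of `pattern` on the given strand."""
    pams = set(expand(pattern))
    k = len(pattern)
    return [(i, dna[i:i + k]) for i in range(len(dna) - k + 1) if dna[i:i + k] in pams]

listed = ["TTTA", "CTTA", "GTTA", "ATTA", "TTTC", "CTTC", "GTTC", "ATTC",
          "TTTG", "CTTG", "GTTG", "ATTG", "TTTT", "CTTT", "GTTT", "ATTT"]
sites = find_pam_sites("GCTTCAGG", "TTN")  # hypothetical 8-nt sequence
```

Because the PAM lies 5' of the sequence complementary to the guide, a candidate protospacer would begin immediately after each reported site on the same strand.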
- the application provides an engineered Cas12i (such as Cas12i2) effector protein, which comprises any engineered Cas12i nuclease of the present invention or its functional variant (such as CasXX shown in SEQ ID NO: 8), and which has improved activity, such as target binding, double-strand cleavage activity, nickase activity, and/or gene editing activity.
- an engineered Cas12i effector protein (e.g., a Cas12i nuclease, a Cas12i nickase, a Cas12i fusion effector protein, or a split Cas12i effector protein) is provided, comprising any of the engineered Cas12i nucleases described herein or a functional derivative thereof (such as dCas12i).
- the engineered Cas12i effector protein comprises, consists essentially of, or consists of any of the engineered Cas12i nucleases described herein or a functional variant thereof.
- an engineered Cas12i effector protein based on any of the engineered Cas12i2 nucleases described herein or functional variants thereof (e.g., CasXX shown in SEQ ID NO: 8).
- the engineered Cas12i effector protein has enzymatic activity (eg, DNA double-strand cleavage activity).
- the engineered Cas12i effector protein has nuclease activity that cleaves both strands of a target duplex nucleic acid (eg, duplex DNA).
- the engineered Cas12i effector protein has nickase activity, ie, cleaves a single strand of a target duplex nucleic acid (eg, duplex DNA).
- the engineered Cas12i effector protein comprises an enzyme-inactive mutant of the engineered Cas12i nuclease.
- the application also provides any split-type Cas12i effector protein based on the engineered Cas12i nuclease described herein or a functional variant thereof (such as CasXX shown in SEQ ID NO: 8).
- Split Cas12i effectors may be advantageous for delivery.
- the engineered Cas12i effector protein is split into two parts of the enzyme that can be reconstituted together to provide a substantially functioning Cas12i effector protein.
- Split Cas effector proteins can be provided using known methods; for example, split forms of the Cas12 and Cas9 proteins have been described in, for example, WO2016/112242, WO2016/205749 and PCT/CN 2020/111057, which are incorporated herein by reference in their entirety.
- a split-type Cas12i effector protein comprising a first polypeptide comprising any of the engineered Cas12i nucleases described herein or a functional derivative thereof and a second polypeptide
- the N-terminal portion of the species, the second polypeptide comprises the C-terminal portion of any one of the engineered Cas12i nuclease or its functional derivatives, wherein the first polypeptide and the second polypeptide are capable of comprising
- the guide RNAs of the guide sequence associate with each other in the presence of each other to form a CRISPR complex that specifically binds to a target nucleic acid comprising a target sequence complementary to the guide sequence.
- the first polypeptide and the second polypeptide each comprise a dimerization domain. In some embodiments, the first dimerization domain and the second dimerization domain associate with each other in the presence of an inducing agent (eg, rapamycin). In some embodiments, the first and second polypeptides do not comprise a dimerization domain. In some embodiments, the segmented Cas12i effector protein is autoinducible.
- the engineered Cas12i effector protein of the present invention can be split in a manner that does not affect the catalytic domain.
- Cas12i effector proteins can act as nucleases (including nickases) or can be inactivated enzymes, which are essentially RNA-guided DNA-binding proteins with little or no catalytic activity (e.g., due to mutations in their catalytic domain).
- the nuclease lobe and the α-helical lobe of the engineered Cas12i effector protein are expressed as separate polypeptides.
- the RNA guide sequence recruits them into a complex that recapitulates the activity of the full-length Cas12i nuclease and catalyzes site-specific DNA cutting.
- modified RNA guide sequences can be used to eliminate the activity of split enzymes by preventing dimerization, thereby allowing the development of an inducible dimerization system.
- split-type enzymes are described, for example, in Wright, Addison V., et al. "Rational design of a split-Cas9 enzyme complex," Proc. Nat'l. Acad. Sci., 112.10 (2015): 2984-2989, which is incorporated herein by reference in its entirety.
- the split-type Cas12i effector protein portions described herein can be designed by dividing (i.e., splitting) a reference engineered Cas12i effector protein (e.g., a full-length engineered Cas12i nuclease) into two halves at a split position, which is the point where the N-terminal portion of the reference Cas12i effector protein is separated from the C-terminal portion.
- the N-terminal portion comprises amino acid residues 1 to X of the reference Cas12i effector protein
- the C-terminal portion comprises amino acid residues X+1 to the C-terminus of the reference Cas12i effector protein.
- the numbering is consecutive, but this is not required: it is also contemplated that amino acids (or the nucleotides encoding them) can be trimmed from either of the split ends, and/or that mutations (e.g., insertions, deletions, and substitutions) can be made in the internal region of the polypeptide chain, provided that the reconstituted engineered Cas12i effector retains sufficient DNA binding activity and, if desired, DNA nickase or cleavage activity, e.g., at least about 40% (e.g., at least about 50%, 60%, 70%, 80%, 90%, 95% or more) activity compared to the reference Cas12i effector protein.
- Cutpoints can be designed in silico and cloned into constructs. During this process, mutations can be introduced into segmented Cas12i effectors and non-functional domains can be removed.
- the two parts or fragments (i.e., N-terminal and C-terminal fragments) of the split-type Cas12i effector protein can form a complete Cas12i effector protein comprising, for example, at least about 70% of the complete Cas12i effector protein sequence (eg, at least about 80%, 90%, 95%, 96%, 97%, 98%, 99% or more).
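- As an illustration of the split-position scheme described above, the following minimal Python sketch divides a reference sequence into an N-terminal fragment (residues 1 to X) and a C-terminal fragment (residues X+1 to the C-terminus), optionally trims residues from the split ends, and reports the fraction of the reference sequence that the two fragments retain. The function names, the trimming parameters, and the ten-residue toy sequence are assumptions made for illustration only; they are not taken from the application and do not represent any actual Cas12i sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitPair:
    n_fragment: str  # residues 1..X of the reference (1-based, inclusive)
    c_fragment: str  # residues X+1..C-terminus of the reference


def split_at(reference: str, x: int, trim_n: int = 0, trim_c: int = 0) -> SplitPair:
    """Split `reference` after residue X, optionally trimming the two split ends.

    trim_n removes residues from the C-terminal end of the N-terminal fragment;
    trim_c removes residues from the N-terminal end of the C-terminal fragment.
    """
    if not 1 <= x < len(reference):
        raise ValueError("split position must leave two non-empty fragments")
    if trim_n < 0 or trim_c < 0:
        raise ValueError("trim lengths must be non-negative")
    n_fragment = reference[:x]
    c_fragment = reference[x:]
    if trim_n >= len(n_fragment) or trim_c >= len(c_fragment):
        raise ValueError("trimming would remove an entire fragment")
    return SplitPair(n_fragment[: len(n_fragment) - trim_n], c_fragment[trim_c:])


def retained_fraction(reference: str, pair: SplitPair) -> float:
    """Fraction of the reference sequence covered by the two fragments together."""
    return (len(pair.n_fragment) + len(pair.c_fragment)) / len(reference)


# Hypothetical toy sequence, not a real Cas12i sequence.
TOY_REFERENCE = "MKTAYIAKQR"
untrimmed = split_at(TOY_REFERENCE, x=4)
trimmed = split_at(TOY_REFERENCE, x=4, trim_n=1, trim_c=2)
```

- In this sketch the untrimmed pair retains the whole toy sequence, while the trimmed pair retains seven of its ten residues; a real design would additionally have to confirm experimentally that the reconstituted effector keeps the required activity.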
- the segmented Cas12i effector proteins may each comprise one or more dimerization domains.
- the first polypeptide comprises a first dimerization domain fused to a first split-type Cas12i effector protein portion
- the second polypeptide comprises a second dimerization domain fused to a second split-type Cas12i effector protein portion.
- The dimerization domain can be fused to the segmented Cas12i effector protein portion by a peptide linker (e.g., a flexible peptide linker such as a GS linker) or a chemical bond.
- the dimerization domain is fused to the N-terminus of the segmented Cas12i effector protein portion.
- the dimerization domain is fused to the C-terminus of the segmented Cas12i effector protein portion.
- the segmented Cas12i effector protein does not comprise any dimerization domains.
- the dimerization domain facilitates the association of two segmented Cas12i effector protein moieties.
- the segmented Cas12i effector portion is induced by an inducer to associate or dimerize into a functional Cas12i effector.
- the segmented Cas12i effector protein comprises an inducible dimerization domain.
- the dimerization domain is not an inducible dimerization domain, i.e., the dimerization domain dimerizes in the absence of an inducing agent.
- An inducing agent may be an inducing energy source or an inducing molecule other than a guide RNA (e.g., crRNA).
- the inducer reconstitutes the two segmented Cas12i effector portions into a functional Cas12i effector through induced dimerization of the dimerization domains.
- the inducing agent brings together the two segmented Cas12i effector protein moieties by inducing association of the inducible dimerization domain.
- the two split Cas12i effector protein portions do not associate with each other to reconstitute into a functional Cas12i effector protein in the absence of an inducing agent.
- two separate Cas12i effector portions can associate with each other in the presence of a guide RNA (e.g., crRNA) to reconstitute a functional Cas12i effector.
- the inducer of the present application may be heat, ultrasound, electromagnetic energy or chemical compounds.
- the inducing agent is an antibiotic, small molecule, hormone, hormone derivative, steroid, or steroid derivative.
- the inducer is abscisic acid (ABA), doxycycline (DOX), cumate, rapamycin, 4-hydroxytamoxifen (4OHT), estrogen or ecdysone.
- the segmented Cas12i effector system is an inducer-controlled system selected from the group consisting of antibiotic-based induction systems, electromagnetic energy-based induction systems, small molecule-based induction systems, nuclear receptor-based induction systems and hormone-based induction systems.
- the segmented Cas12i effector system is an inducer-controlled system selected from the group consisting of the tetracycline (Tet)/DOX inducible system, the light inducible system, the ABA inducible system, the cumate repressor/operator system, the 4OHT/estrogen inducible system, the ecdysone-based inducible system and the FKBP12/FRAP (FKBP12-rapamycin complex) inducible system.
- inducers are also discussed herein and in PCT/US2013/051418, which is hereby incorporated by reference in its entirety.
- pairs of split-type Cas12i effector proteins are separated and inactive until dimerization of the dimerization domains (e.g., FRB and FKBP) is induced, which results in reassembly of a functional Cas12i effector protein nuclease.
- the first split-type Cas12i effector protein comprising the first half of the inducible dimer (e.g., FRB) is delivered separately from, and/or is located separately from, the second split-type Cas12i effector protein comprising the second half of the inducible dimer (e.g., FKBP).
- FKBP-based induction systems that can be used in the inducer-controlled split Cas12i effector systems described herein include, but are not limited to: FKBP that dimerizes with calcineurin (CNA) in the presence of FK506; FKBP that dimerizes with CyP-Fas in the presence of FKCsA; FKBP that dimerizes with FRB in the presence of rapamycin; GyrB that dimerizes with GyrB in the presence of coumermycin; GAI that dimerizes with GID1 in the presence of gibberellin; or Snap-tag that dimerizes with HaloTag in the presence of HaXS.
- FKBPs homodimerize (i.e., one FKBP dimerizes with another FKBP) in the presence of FK1012.
- the dimerization domain is FKBP and the inducer is FK1012. In some embodiments, the dimerization domain is GyrB and the inducer is coumermycin. In some embodiments, the dimerization domain is ABA and the inducing agent is gibberellin.
- the segmented Cas12i effector portion can be autoinduced (ie, autoactivated or autoinduced) in the absence of an inducing agent to associate/dimerize into a functional Cas12i effector.
- the auto-induction of the segmented Cas12i effector portion may be mediated by binding to a guide RNA such as crRNA.
- the first and second polypeptides do not comprise a dimerization domain. In some embodiments, the first and second polypeptides comprise a dimerization domain.
- the reconstituted Cas12i effector protein of the split-type Cas12i effector system described herein has an editing efficiency of at least about 60% (such as at least about any of 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or more).
- the reconstituted Cas12i effector protein of the inducer-controlled split-type Cas12i effector system described herein, in the absence of an inducer (i.e., due to auto-induction), has an editing efficiency of no more than about 50% (such as no more than about any of 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less) relative to a reference Cas12i effector protein.
- the application also provides engineered Cas12i effector proteins comprising additional protein domains and/or components, such as linkers, nuclear localization/export sequences, functional domains and/or reporter proteins.
- the engineered Cas12i effector protein is a protein complex comprising one or more heterologous protein domains (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more domains) and the nucleic acid-targeting domain of any engineered Cas12i nuclease of the present invention or a functional derivative thereof.
- the engineered Cas12i effector protein is a fusion protein comprising the engineered Cas12i nuclease or a functional variant thereof (such as CasXX shown in SEQ ID NO: 8) fused to one or more heterologous protein domains (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more domains).
- the engineered Cas12i effector protein of the present application may comprise one or more functional domains (for example, as a fusion protein, such as through one or more peptide linkers, e.g., a GS peptide linker) or be associated with one or more functional domains (for example, by co-expression of multiple proteins).
- the one or more functional domains are enzymatic domains.
- the one or more functional domains have one or more activities selected from: methylase activity (e.g., DNA and/or RNA methylase activity), nucleotide deaminase activity (such as adenosine deaminase activity or cytidine deaminase activity), demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, RNA cleavage activity, DNA cleavage activity (e.g., double-strand endonuclease activity, nickase activity), nucleic acid binding activity, and switch activity (e.g., photoinduced or chemically induced).
- the one or more functional domains are transcriptional activation domains (i.e., transactivation domains) or repressor domains. In some embodiments, the one or more functional domains are histone modification domains. In some embodiments, the one or more functional domains are transposase domains, HR (homologous recombination) machinery domains, recombinase domains, and/or integrase domains. In some embodiments, the functional domain is Krüppel-associated box (KRAB), VP64, VP16, FokI, P65, HSF1, MyoD1, Biotin-APEX, APOBEC1, AID, PmCDA1, Tad1, or M-MLV reverse transcriptase.
- the functional domain is selected from the group consisting of translation initiation domain, transcriptional repression domain, transactivation domain, epigenetic modification domain, nucleobase editing domain (e.g., CBE or ABE domain), reverse transcriptase domain, reporter domain (eg, fluorescent domain) and nuclease domain.
- the functional domain has an activity of modifying target DNA or a target DNA-related protein, selected from nuclease activity (for example, HNH nuclease, RuvC nuclease, Trex1 nuclease, Trex2 nuclease), methylation activity, demethylation activity, DNA repair activity, DNA damage activity, deamination activity, dismutase activity, alkylation activity, depurination activity, oxidation activity, pyrimidine dimer formation activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, photolyase activity, glycosylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, sumoylation activity, desumoylation activity
- the positioning of one or more functional domains in the engineered Cas12i effector protein allows correct spatial orientation of the functional domains to affect a target with the conferred functional effect.
- the functional domain is a transcriptional activator (e.g., VP16, VP64, or p65)
- the transcriptional activator is placed in a spatial orientation that enables it to affect the transcription of the target.
- a transcriptional repressor is positioned to affect the transcription of a target, and a nuclease (e.g., FokI) is positioned to cleave or partially cleave the target.
- the functional domain is located at the N-terminus of the engineered Cas12i effector protein.
- the functional domain is located at the C-terminus of the engineered Cas12i effector protein.
- the engineered Cas12i effector protein comprises a first functional domain at the N-terminus and a second functional domain at the C-terminus.
- the engineered Cas12i effector protein comprises a catalytically inactive mutant (dCas12i) of any of the engineered Cas12i nucleases described herein fused to one or more functional domains.
- the engineered Cas12i effector protein is a transcriptional activator.
- the engineered Cas12i effector protein comprises an enzyme-inactive variant of any of the engineered Cas12i nucleases described herein fused to a transactivation domain.
- the transactivation domain is selected from the group consisting of VP64, p65, HSF1, VP16, MyoD1, RTA, SET7/9, and combinations thereof.
- the transactivation domain comprises VP64, p65, and HSF1.
- the engineered Cas12i effector protein comprises two split Cas12i effector polypeptides, each fused to a transactivation domain.
- the engineered Cas12i effector protein is a transcriptional repressor.
- the engineered Cas12i effector protein comprises an enzyme-inactive variant of any of the engineered Cas12i nucleases described herein fused to a transcriptional repression domain.
- the transcriptional repressor domain is selected from the group consisting of Krüppel-associated box (KRAB), EnR, NuE, NcoR, SID, SID4X, and combinations thereof.
- the engineered Cas12i effector protein comprises two split Cas12i effector polypeptides, each fused to a transcriptional repression domain.
- the engineered Cas12i effector protein is a base editor, such as a cytosine editor or an adenosine editor.
- the engineered Cas12i effector protein comprises an enzyme-inactive variant of any of the engineered Cas12i nucleases described herein fused to a nucleobase editing domain, such as a cytosine base editing (CBE) domain or an adenosine base editing (ABE) domain.
- the nucleobase editing domain is a DNA editing domain.
- the nucleobase editing domain has deaminase activity.
- the nucleobase editing domain is a cytosine deaminase domain. In some embodiments, the nucleobase editing domain is an adenosine deaminase domain.
- Exemplary base editors based on Cas nucleases are described, e.g., in WO2018/165629A1 and WO2019/226953A1, which are incorporated herein by reference in their entirety.
- Exemplary CBE domains include, but are not limited to, activation-induced cytidine deaminase or AID (e.g., hAID), apolipoprotein B mRNA editing complex or APOBEC (e.g., rat APOBEC1, hAPOBEC3A/B/C/D/E/F/G), and PmCDA1.
- Exemplary ABE domains include, but are not limited to: TadA, ABE8, and variants thereof (see, e.g., Gaudelli et al., 2017, Nature 551:464-471; and Richter et al., 2020, Nature Biotechnology 38:883-891).
- the functional domain is an APOBEC1 domain, eg, a rat APOBEC1 domain.
- the functional domain is a TadA domain, such as an E. coli TadA domain.
- the engineered Cas12i effector protein further comprises one or more nuclear localization sequences.
- the term "adenosine deaminase" or "adenosine deaminase protein" refers to a protein, polypeptide, or one or more functional domains of a protein or polypeptide that is capable of catalyzing the conversion of adenine (or the adenine moiety of a molecule) to hypoxanthine (or the hypoxanthine moiety of the molecule) by hydrolytic deamination.
- the adenine-containing molecule is adenosine (A) and the hypoxanthine-containing molecule is inosine (I).
- the adenine-containing molecule can be deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).
- Adenosine deaminases that can be used in conjunction with enzyme-inactive variants of any of the engineered Cas12i nucleases of the invention include, but are not limited to, members of the enzyme family known as adenosine deaminases acting on RNA (ADAR), members of the enzyme family known as adenosine deaminases acting on tRNA (ADAT), and other adenosine deaminase domain-containing (ADAD) family members.
- ADARs can carry out adenosine-to-inosine editing reactions on RNA/DNA and RNA/RNA duplexes.
- the adenosine deaminase can be modified to increase its ability to edit DNA in an RNA/DNA heteroduplex.
- the adenosine deaminase is derived from one or more metazoan species including, but not limited to, mammals, birds, frogs, squid, fish, flies, and worms. In some embodiments, the adenosine deaminase is human, squid, or Drosophila adenosine deaminase.
- the adenosine deaminase is a human ADAR, including hADAR1, hADAR2, hADAR3. In some embodiments, the adenosine deaminase is a Caenorhabditis elegans ADAR protein, including ADR-1 and ADR-2. In some embodiments, the adenosine deaminase is a Drosophila ADAR protein, including dAdar. In some embodiments, the adenosine deaminase is a squid (Loligo pealeii) ADAR protein, including sqADAR2a and sqADAR2b.
- the adenosine deaminase is the human ADAT protein. In some embodiments, the adenosine deaminase is the Drosophila ADAT protein. In some embodiments, the adenosine deaminase is a human ADAD protein, including TENR (hADAD1) and TENRL (hADAD2). In some embodiments, the adenosine deaminase is TadA8e.
- the adenosine deaminase is a TadA protein, such as E. coli TadA. See Kim et al., Biochemistry 45:6407-6416 (2006); Wolf et al., EMBO J. 21:3841-3851 (2002).
- the adenosine deaminase is mouse ADA. See Grunebaum et al., Curr. Opin. Allergy Clin. Immunol. 13:630-638 (2013).
- the adenosine deaminase is human ADAT2. See Fukui et al., J. Nucleic Acids 2010:260512 (2010).
- the deaminase is one or more of those described in: Cox et al., Science. 2017 Nov 24;358(6366):1019-1027; Komor et al., Nature. 2016 May 19;533(7603):420-4; and Gaudelli et al., Nature. 2017 Nov 23;551(7681):464-471.
- the adenosine deaminase protein comprises one or more deaminase domains.
- the deaminase domain serves to recognize and convert one or more target adenosine (A) residues contained in double-stranded nucleic acid substrates to inosine (I) residues.
- the deaminase is cytidine deaminase.
- the term "cytidine deaminase" or "cytidine deaminase protein" refers to a protein, polypeptide, or one or more functional domains of a protein or polypeptide that is capable of catalyzing a hydrolytic deamination reaction that converts cytosine (or the cytosine moiety of a molecule) to uracil (or the uracil moiety of the molecule).
- the cytosine-containing molecule is cytidine (C) and the uracil-containing molecule is uridine (U).
- the cytosine-containing molecule may be deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).
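- The two deamination reactions defined above (adenosine to inosine, and cytidine to uridine) can be pictured as a base substitution at selected positions of a nucleic acid sequence. The following Python sketch is a toy model under stated assumptions: the function name, the notion of a fixed 0-based editing window, and the example sequence are all hypothetical and are not taken from the application, and the model ignores sequence context, strand selection, and editing efficiency.

```python
# Deamination products named in the text: adenosine (A) -> inosine (I),
# cytidine (C) -> uridine (U).
DEAMINATION_PRODUCT = {"A": "I", "C": "U"}


def deaminate(sequence: str, base: str, window: range) -> str:
    """Return `sequence` with every `base` inside `window` deaminated.

    `window` holds 0-based positions and is a hypothetical stand-in for the
    region a deaminase domain can reach; it is not defined in the source text.
    """
    if base not in DEAMINATION_PRODUCT:
        raise ValueError("this sketch only models deamination of A or C")
    product = DEAMINATION_PRODUCT[base]
    return "".join(
        product if (index in window and nucleotide == base) else nucleotide
        for index, nucleotide in enumerate(sequence)
    )


# Hypothetical seven-nucleotide example.
cytidine_edited = deaminate("GACCATA", "C", range(2, 4))
adenosine_edited = deaminate("GACCATA", "A", range(0, 7))
```

- In this toy example only the two cytidines inside the window are converted in the first call, whereas the second call converts every adenosine because the window spans the whole sequence.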
- Cytidine deaminases that can be used in conjunction with enzyme-inactive variants of any of the engineered Cas12i nucleases of the invention include, but are not limited to, members of the deaminase family known as the apolipoprotein B mRNA editing complex (APOBEC) family, activation-induced deaminase (AID), or cytidine deaminase 1 (CDA1).
- the deaminase is an APOBEC1 deaminase, APOBEC2 deaminase, APOBEC3A deaminase, APOBEC3B deaminase, APOBEC3C deaminase, APOBEC3D deaminase, APOBEC3E deaminase, APOBEC3F deaminase, APOBEC3G deaminase, APOBEC3H deaminase, or APOBEC4 deaminase.
- the cytidine deaminase is capable of targeting cytosine in a single strand of DNA. In some embodiments, the cytidine deaminase can edit on a single strand present on the outside of the binding component. In some embodiments, the cytidine deaminase can edit at a localized bubble, e.g., a localized bubble formed by a mismatch between the target editing site and the guide sequence.
- the cytidine deaminase may contain mutations that facilitate focusing activity, such as those described in Kim et al., Nature Biotechnology (2017) 35(4):371-377 (doi: 10.1038/nbt.3803).
- the cytidine deaminase is derived from one or more metazoan species including, but not limited to, mammals, birds, frogs, squid, fish, flies, and worms. In some embodiments, the cytidine deaminase is human, primate, bovine, dog, rat, or mouse cytidine deaminase.
- the cytidine deaminase is human APOBEC, including hAPOBEC1 or hAPOBEC3. In some embodiments, the cytidine deaminase is human AID.
- the cytidine deaminase protein comprises one or more deaminase domains.
- the deaminase domain serves to recognize and convert one or more target cytosine (C) residues contained in the single-stranded bubble of the RNA duplex to uracil (U) residues.
- the engineered Cas12i effector protein is a prime editor. Cas9-based prime editors are described, e.g., in A. Anzalone et al., Nature, 2019, 576(7785):149-157, which is hereby incorporated by reference in its entirety.
- the engineered Cas12i effector protein comprises a nickase variant of any one of the engineered Cas12i nucleases described herein fused to a reverse transcriptase domain.
- the functional domain is a reverse transcriptase domain.
- the reverse transcriptase domain is an M-MLV reverse transcriptase or a variant thereof, e.g., an M-MLV reverse transcriptase having one or more mutations of D200N, T306K, W313F, T330P, and L603W.
- an engineered CRISPR/Cas12i system comprising the prime editor is provided.
- the engineered CRISPR/Cas12i system further comprises a second Cas12i nickase, for example based on the same engineered Cas12i nuclease as the prime editor.
- the engineered CRISPR/Cas12i system comprises a prime editing guide RNA (pegRNA) comprising a primer binding site and a reverse transcriptase (RT) template sequence.
- the application provides a segmented Cas12i effector system comprising one or more (for example, 1, 2, 3, 4, 5, 6 or more) functional domains.
- the functional domain may be provided as part of the first and/or second segmented Cas12i effector protein, as a fusion within the construct.
- the functional domain is usually fused to other parts of the split-type Cas12i effector protein (e.g., a split-type Cas12i effector protein portion) through a peptide linker (such as a GS linker).
- the engineered Cas12i effector protein comprises one or more nuclear localization sequences (NLS) and/or one or more nuclear export sequences (NES).
- NLS sequences include, for example, PKKKRKVPG (SEQ ID NO: 66) and ASPKKKRKV (SEQ ID NO: 67).
- NLS and/or NES can be operably linked to the N-terminal and/or C-terminal of the engineered Cas12i effector protein or a polypeptide chain in the engineered Cas12i effector protein.
- NLS and/or NES can be connected to the N-terminal and/or C-terminal of any engineered Cas12i nuclease or functional variant thereof of the present invention.
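- The N-terminal and/or C-terminal attachment of localization sequences described above can be sketched as simple sequence concatenation. In the following Python sketch the two NLS sequences are the ones given in the text (SEQ ID NO: 66 and SEQ ID NO: 67); the function name, the optional linker argument, and the ten-residue toy protein are assumptions made for illustration and do not come from the application.

```python
# NLS sequences quoted in the text.
NLS_SEQ_ID_66 = "PKKKRKVPG"
NLS_SEQ_ID_67 = "ASPKKKRKV"


def add_localization_tags(
    protein: str, n_terminal: str = "", c_terminal: str = "", linker: str = ""
) -> str:
    """Concatenate optional N- and C-terminal tags onto `protein`.

    `linker` (for example a short GS linker) is inserted between each tag
    and the protein; an empty string means a direct fusion.
    """
    parts = []
    if n_terminal:
        parts.extend([n_terminal, linker])
    parts.append(protein)
    if c_terminal:
        parts.extend([linker, c_terminal])
    return "".join(parts)


# Hypothetical toy protein, not a real Cas12i sequence.
TOY_PROTEIN = "MKTAYIAKQR"
tagged = add_localization_tags(
    TOY_PROTEIN, n_terminal=NLS_SEQ_ID_66, c_terminal=NLS_SEQ_ID_67, linker="GS"
)
```

- A real construct would be designed at the nucleotide level and would need a start codon ahead of any N-terminal tag; the sketch only shows the order of the elements in the polypeptide chain.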
- the engineered Cas12i effector protein can comprise additional components, such as a reporter protein.
- the engineered Cas12i effector protein comprises a fluorescent protein, such as GFP.
- the engineered Cas12i effector protein is an inducible split Cas12i effector system that can be used to image genomic loci.
- an engineered Cas12i effector protein is provided, wherein the effector protein is capable of inducing double-strand breaks or single-strand breaks in DNA molecules.
- an engineered Cas12i effector protein wherein the functional derivative of the engineered Cas12i nuclease is an enzyme-inactive mutant, for example a Cas12i2 nuclease-inactivated mutant containing D599A, E833A, S883A, H884A, D886A, R900A and/or D1019A (the amino acid positions are defined as the corresponding amino acid positions shown in SEQ ID NO: 1), or a Cas12i1 nuclease-inactivated mutant containing D647A, E894A and/or D948A (the amino acid positions are defined as the corresponding amino acid positions shown in SEQ ID NO: 13).
- Known enzyme-inactivating mutations of Cas12i2 nuclease, such as any of those described in US10808245B2 and Huang X. et al., Nature Communications, 11, Article number: 5241 (2020), can be combined with the mutations in the present application to provide functional derivatives of engineered Cas12i nucleases and their corresponding effector proteins.
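- Substitution labels of the form used above (wild-type residue, 1-based position, replacement residue, e.g., D599A) can be parsed and applied programmatically. The following Python sketch is illustrative only: the function name and the five-residue toy sequence are assumptions, and the sketch is not applied to SEQ ID NO: 1 or SEQ ID NO: 13, which are not reproduced here.

```python
import re
from typing import Sequence

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, labels: Sequence[str]) -> str:
    """Apply point substitutions such as 'D599A' to a protein sequence.

    Raises ValueError if a label is malformed, is out of range, or names a
    wild-type residue that does not match the sequence at that position.
    """
    residues = list(sequence)
    for label in labels:
        match = SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"unrecognized substitution label: {label}")
        wild_type, position, replacement = (
            match.group(1),
            int(match.group(2)),
            match.group(3),
        )
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range in {label}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: sequence has {residues[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence, not a real Cas12i sequence.
toy_mutant = apply_substitutions("MDKEH", ["D2A", "E4A"])
```

- Checking the wild-type residue before substituting guards against applying a label to the wrong reference numbering, which matters here because the Cas12i2 and Cas12i1 positions are defined against different reference sequences.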
- an engineered CRISPR-Cas12i system comprising: (a) any engineered Cas12i effector protein described in the present application (such as an engineered Cas12i nuclease or a functional variant thereof, for example, CasXX shown in SEQ ID NO: 8); and (b) a guide RNA comprising a guide sequence complementary to the target sequence, or one or more nucleic acids encoding the guide RNA;
- the engineered Cas12i effector protein and the guide RNA can form a CRISPR complex, and the CRISPR complex specifically binds to a target nucleic acid comprising the target sequence and induces modification of the target nucleic acid (such as double-strand or single-strand cleavage, base editing, etc.).
- modification covers cleavage, base editing, replacement, repair, etc. by nucleases at target sites on double-stranded or single-stranded nucleic acids.
- the engineered CRISPR-Cas12i system comprises: (a) any one of the engineered Cas12i effector proteins described herein (for example, any of the engineered Cas12i nucleases or functional variants thereof, or a nickase, split-type Cas12i, transcriptional repressor, transcriptional activator, base editor, or prime editor based on the engineered Cas12i nuclease or functional variant thereof); and (b) a guide RNA comprising a guide sequence complementary to a target sequence, or one or more nucleic acids encoding the guide RNA; wherein the engineered Cas12i effector protein and the guide RNA can form a CRISPR complex that specifically binds to a target nucleic acid comprising the target sequence and induces modification of the target nucleic acid (such as double-strand or single-strand cleavage, base editing, etc.).
- the engineered CRISPR-Cas12i system comprises one or more nucleic acids encoding an engineered Cas12i effector protein (such as an engineered Cas12i nuclease or a functional variant thereof, such as CasXX shown in SEQ ID NO: 8) and/or said guide RNA.
- the engineered CRISPR-Cas12i system comprises an array of precursor guide RNAs that can be processed into multiple crRNAs, e.g., by the engineered Cas12i effector protein.
- the engineered CRISPR-Cas12i system comprises one or more vectors encoding the engineered Cas12i effector protein and/or the guide RNA.
- the engineered CRISPR-Cas12i system comprises a ribonucleoprotein (RNP) complex comprising the engineered Cas12i effector protein bound to the guide RNA.
- the engineered CRISPR-Cas12i system of the present application can comprise any suitable guide RNA.
- a guide RNA may comprise a guide sequence capable of hybridizing to a target sequence in a target nucleic acid of interest, such as a genomic site of interest in a cell.
- the gRNA comprises a CRISPR RNA (crRNA) sequence comprising the guide sequence.
- the crRNAs described herein include direct repeat sequences and spacer sequences.
- the crRNA comprises, consists essentially of, or consists of a direct repeat sequence linked to a guide sequence or a spacer sequence.
- the crRNA includes a direct repeat sequence, a spacer sequence, and a direct repeat sequence (DR-spacer sequence-DR), which is typical of the configuration of a precursor crRNA (pre-crRNA).
- the crRNA includes truncated direct repeat and spacer sequences, which are typical features of processed or mature crRNA.
- the CRISPR-Cas12i effector protein forms a complex with an RNA guide sequence, and the spacer sequence directs the complex to bind sequence-specifically to a target nucleic acid that is complementary to the spacer sequence (e.g., at least 70% complementary).
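- The relationship between a crRNA (direct repeat linked to a spacer) and its target can be illustrated with a short calculation of percent complementarity. The following Python sketch makes several assumptions that are not stated in the text: the direct repeat is placed 5' of the spacer, complementarity is counted as Watson-Crick pairing between an RNA spacer and an equally long DNA target strand aligned antiparallel, and the sequences and function names are hypothetical.

```python
# RNA base -> DNA base it pairs with (Watson-Crick).
RNA_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def build_crrna(direct_repeat: str, spacer: str) -> str:
    """Join a direct repeat and a spacer (repeat placed 5' of the spacer, by assumption)."""
    return direct_repeat + spacer


def percent_complementarity(spacer_rna: str, target_dna: str) -> float:
    """Percent of spacer positions that pair with the target strand.

    Both sequences are written 5'->3'; the target is read in reverse so that
    the two strands are aligned antiparallel.
    """
    if len(spacer_rna) != len(target_dna) or not spacer_rna:
        raise ValueError("spacer and target must be non-empty and of equal length")
    paired = sum(
        RNA_DNA_PAIR.get(rna_base) == dna_base
        for rna_base, dna_base in zip(spacer_rna, reversed(target_dna))
    )
    return 100.0 * paired / len(spacer_rna)


# Hypothetical four-nucleotide sequences chosen only to keep the arithmetic visible.
perfect = percent_complementarity("AUGC", "GCAT")
one_mismatch = percent_complementarity("AUGC", "GCAA")
crrna = build_crrna("GGAC", "AUGC")
```

- With these toy inputs a single mismatch in four positions gives 75% complementarity, which would still satisfy the "at least 70% complementary" example given above; real spacers are longer and mismatch tolerance depends on position.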
- the guide RNA is a crRNA comprising a guide sequence.
- the engineered CRISPR-Cas12i system comprises an array of precursor guide RNAs encoding multiple crRNAs.
- the Cas12i effector protein cleaves the array of precursor guide RNAs to generate a plurality of crRNAs.
- the engineered CRISPR-Cas12i system comprises an array of precursor guide RNAs encoding multiple crRNAs, wherein each crRNA comprises a different guide sequence.
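- The precursor array layout described above (DR-spacer-DR, with each crRNA carrying a different guide sequence) can be sketched as follows. This Python sketch is a simplification under stated assumptions: processing is modeled as cutting at every exact copy of the direct repeat, each processed crRNA keeps a full-length repeat rather than a truncated one, no spacer contains the repeat sequence, and the repeat and spacer sequences are hypothetical.

```python
from typing import List, Sequence


def build_pre_crrna_array(direct_repeat: str, spacers: Sequence[str]) -> str:
    """Assemble DR-spacer-DR-spacer-...-DR from one repeat and several spacers."""
    if not direct_repeat or not spacers:
        raise ValueError("a direct repeat and at least one spacer are required")
    return "".join(direct_repeat + spacer for spacer in spacers) + direct_repeat


def process_array(array: str, direct_repeat: str) -> List[str]:
    """Split an array at each direct repeat and return one crRNA per spacer.

    Simplified model: each crRNA is returned as repeat + spacer.
    """
    spacers = [piece for piece in array.split(direct_repeat) if piece]
    return [direct_repeat + spacer for spacer in spacers]


# Hypothetical repeat and spacers, far shorter than real sequences.
TOY_REPEAT = "GGAC"
array = build_pre_crrna_array(TOY_REPEAT, ["AUAU", "CCUU"])
crrnas = process_array(array, TOY_REPEAT)
```

- Encoding several spacers in one array in this way is what allows a single transcript to yield multiple crRNAs with different guide sequences once it is processed.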
- constructs, vectors, and expression systems encoding any of the engineered Cas12i effector proteins described herein (eg, engineered Cas12i nucleases or functional variants thereof).
- the construct, vector or expression system further comprises one or more gRNA or crRNA arrays.
- a “vector” is a composition of matter comprising an isolated nucleic acid and which can be used to deliver the isolated nucleic acid to the interior of a cell.
- Many vectors are known in the art including, but not limited to, linear polynucleotides, polynucleotides associated with ionic or amphiphilic compounds, plasmids, and viruses.
- suitable vectors contain an origin of replication functional in at least one organism, a promoter sequence, convenient restriction endonuclease sites and one or more selectable markers.
- the term “vector” should also be construed to include non-plasmid and non-viral compounds that facilitate the transfer of nucleic acids into cells, such as, for example, polylysine compounds, liposomes, and the like.
- the vector is a viral vector.
- viral vectors include, but are not limited to, adenoviral vectors, adeno-associated viral vectors, lentiviral vectors, retroviral vectors, vaccinia vectors, herpes simplex virus vectors, and derivatives thereof.
- the vector is a phage vector. Viral vector technology is well known in the art and is described, for example, in Sambrook et al. (2001, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York), among other handbooks of virology and molecular biology.
- retroviruses provide a convenient platform for gene delivery systems.
- the heterologous nucleic acid can be inserted into a vector and packaged in retroviral particles using techniques known in the art.
- Recombinant virus can then be isolated and delivered to the engineered mammalian cells in vitro or ex vivo.
- Many retroviral systems are known in the art.
- adenoviral vectors are used.
- Many adenoviral vectors are known in the art.
- lentiviral vectors are used.
- self-inactivating lentiviral vectors are used.
- the vector is an adeno-associated virus (AAV) vector, such as AAV2, AAV8, or AAV9, which can be administered in a single dose containing at least 1 × 10^5 particles (also known as particle units, pu) of adenovirus or adeno-associated virus.
- the administered amount is at least about 1 × 10^6 particles, at least about 1 × 10^7 particles, at least about 1 × 10^8 particles, or at least about 1 × 10^9 particles of adeno-associated virus. Delivery methods and dosages are described, for example, in WO 2016205764 and US Patent No. 8,454,972, which are incorporated herein by reference in their entirety.
- the vector is a recombinant adeno-associated virus (rAAV) vector.
- modified AAV vectors can be used for delivery.
- Modified AAV vectors can be based on one or more of several capsid types, including AAV1, AAV2, AAV5, AAV6, AAV8, AAV8.2, AAV9, AAVrh10, modified AAV vectors (e.g., modified AAV2, modified AAV3, modified AAV6) and pseudotyped AAV (such as AAV2/8, AAV2/5 and AAV2/6).
- Exemplary AAV vectors and techniques that can be used to generate rAAV particles are known in the art (see, e.g., Aponte-Ubillus et al. (2016) Appl. Microbiol. Biotechnol. 102(3): 1045-54; Zhong et al. (2012) J. Genet. Syndr. Gene Ther. S1: 008; West et al. (1987) Virology 160: 38-47; Tratschin et al. (1985) Mol. Cell. Biol. 5: 3251-60; US Patent Nos. 4,797,368 and 5,173,414; International Publication Nos. WO2015/054653 and WO93/24641, each of which is incorporated herein by reference).
- AAV vectors used to deliver Cas9 and other Cas proteins can be used to deliver the engineered Cas12i system of the present application.
- the rAAV construct can be administered enterally to a subject. In some embodiments, the rAAV construct can be administered to a subject parenterally. In some embodiments, rAAV particles can be administered subcutaneously, intraocularly, intravitreally, subretinally, intravenously (IV), intracerebroventricularly, intramuscularly, intrathecally (IT), intracisternally, intraperitoneally, via inhalation, topically, or by direct injection into one or more cells, tissues or organs. In some embodiments, rAAV particles can be administered to a subject by injection into the hepatic artery or portal vein.
- Vectors can be transferred into host cells by physical, chemical or biological means.
- Physical methods for introducing vectors into host cells include: calcium phosphate precipitation, lipofection, particle bombardment, microinjection, electroporation, and the like. Methods for producing cells comprising vectors and/or exogenous nucleic acids are well known in the art. See, e.g., Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York. In some embodiments, the vector is introduced into the cells by electroporation.
- Biological methods for introducing heterologous nucleic acids into host cells include the use of DNA and RNA vectors.
- Viral vectors have become the most widely used method for inserting genes into mammalian, eg human, cells.
- Chemical methods for introducing vectors into host cells include colloidal dispersion systems such as macromolecular complexes, nanocapsules, microspheres, beads, and lipid-based systems including oil-in-water emulsions, micelles, mixed micelles and liposomes.
- An exemplary colloidal system for use as an in vitro delivery vehicle is a liposome (eg, an artificial membrane vesicle).
- the engineered CRISPR-Cas12i system is delivered in nanoparticles as RNPs.
- the vector or expression system encoding the CRISPR-Cas12i system or components thereof comprises one or more selectable or detectable markers that provide a means for the isolation or efficient selection of cells containing and/or modified by the CRISPR-Cas12i system (e.g., at an early stage and at a large scale).
- Reporter genes can be used to identify potentially transfected cells and assess the function of regulatory sequences.
- a reporter gene is a gene that is absent or not expressed in the recipient organism or tissue and whose expression of the encoded polypeptide is evidenced by some readily detectable property, such as enzymatic activity. Expression of the reporter gene is measured at an appropriate time after introduction of the DNA into the recipient cells.
- Suitable reporter genes may include genes encoding luciferase, β-galactosidase, chloramphenicol acetyltransferase, secreted alkaline phosphatase, or the green fluorescent protein gene (e.g., Ui-Tei et al. FEBS Letters 479: 79-82 (2000)).
- Methods for confirming the presence of the heterologous nucleic acid in host cells include, for example, molecular biological assays well known to those skilled in the art, such as Southern and Northern blots, RT-PCR, and PCR; and biochemical assays, such as detecting the presence or absence of specific peptides by immunological methods (e.g., ELISA and Western blot).
- the nucleic acid sequence encoding the engineered Cas12i effector protein and/or the guide RNA is operably linked to a promoter.
- the promoter is an endogenous promoter relative to the cell engineered using the engineered CRISPR-Cas12i system.
- the nucleic acid encoding the engineered Cas12i effector protein can be knocked into the genome of an engineered mammalian cell downstream of an endogenous promoter using any method known in the art.
- the endogenous promoter is the promoter of an abundant protein such as β-actin.
- the endogenous promoter is an inducible promoter, eg, inducible by an endogenous activation signal of the engineered mammalian cell.
- the promoter is a T cell activation-dependent promoter (such as the IL-2 promoter, NFAT promoter or NFκB promoter).
- the promoter is a heterologous promoter relative to the cell engineered using the engineered CRISPR-Cas12i system.
- a variety of promoters have been explored for expression of genes in mammalian cells, and any promoter known in the art may be used in this application. Promoters can be broadly classified as constitutive promoters or regulated promoters, such as inducible promoters.
- the nucleic acid sequence encoding the engineered Cas12i effector protein and/or the guide RNA is operably linked to a constitutive promoter.
- Constitutive promoters allow the constitutive expression of heterologous genes (also known as transgenes) in host cells.
- Exemplary constitutive promoters contemplated herein include, but are not limited to: cytomegalovirus (CMV) promoter, human elongation factor-1α (hEF1α) promoter, ubiquitin C promoter (UbiC), phosphoglycerate kinase promoter (PGK), simian virus 40 early promoter (SV40), and the chicken β-actin promoter coupled to the CMV early enhancer (CAG).
- the promoter is a CAG promoter comprising a cytomegalovirus (CMV) early enhancer element; the promoter, first exon and first intron of the chicken β-actin gene; and the splice acceptor of the rabbit β-globin gene.
- the nucleic acid sequence encoding the engineered CRISPR-Cas12i effector protein and/or the guide RNA is operably linked to an inducible promoter.
- Inducible promoters are a type of regulated promoter.
- the inducible promoter can be induced by one or more conditions, such as physical conditions, the microenvironment or the physiological state of the host cell, an inducing agent (i.e., an inducer), or a combination thereof.
- the induction conditions are selected from the group consisting of inducer, irradiation (e.g., ionizing radiation, light), temperature (e.g., heat), redox state, tumor environment, and the activation state of the cell to be engineered by the engineered CRISPR-Cas12i system.
- the promoter is inducible by a small molecule inducing agent such as a compound.
- the small molecule is selected from the group consisting of doxycycline, tetracycline, alcohol, metal, or steroid. Chemically inducible promoters have been most extensively studied.
- Such promoters include promoters whose transcriptional activity is regulated by the presence or absence of small molecule chemicals such as doxycycline, tetracycline, alcohols, steroids, metals and other compounds.
- the doxycycline-inducible system with reverse tetracycline-controlled transactivator (rtTA) and tetracycline-responsive element promoter (TRE) is currently the most mature system.
- WO9429442 describes the tight control of gene expression in eukaryotic cells by tetracycline-responsive promoters.
- WO9601313 discloses tetracycline regulated transcriptional regulators.
- Tet technologies such as the Tet-on system have been described at, for example, the TetSystems.com website.
- any known chemically regulated promoter can be used to drive the expression of the engineered CRISPR-Cas12i protein
- the nucleic acid sequence encoding the engineered Cas12i effector protein is codon optimized.
- an expression construct comprising a codon-optimized sequence encoding the engineered Cas12i effector protein linked to a BPK2104-ccdB vector.
- the expression construct encodes a tag (e.g., 10×His tag) operably linked to the C-terminus of the engineered Cas12i effector protein.
- each engineered split Cas12i construct encodes a fluorescent protein such as GFP or RFP.
- the reporter protein can be used to assess the co-localization and/or dimerization of the engineered Cas12i protein, for example using microscopy.
- Nucleic acid sequences encoding engineered Cas12i effector proteins can be fused to nucleic acid sequences encoding additional components using sequences encoding self-cleaving peptides such as T2A, P2A, E2A or F2A peptides.
- an expression construct for use in mammalian cells comprising a nucleic acid sequence encoding said engineered Cas12i effector protein.
- the expression construct comprises a codon-optimized sequence encoding the engineered Cas12i effector protein inserted into the pCAG-2A-eGFP vector such that the Cas12i protein is operably linked to eGFP.
- a second vector is provided for expression of a guide RNA (eg, crRNA or pre-crRNA array) in a mammalian cell (eg, a human cell).
- the sequence encoding the guide RNA is expressed in the pUC19-U6-i2-crRNA vector backbone.
- one or more vectors expressing one or more elements of the CRISPR-Cas12i system are introduced into the host cell such that expression of the elements of the CRISPR-Cas12i system directs formation of the nucleic acid targeting complex at one or more target sites.
- a Cas12i nucleic acid-targeting effector enzyme and a nucleic acid-targeting guide RNA can each be operably linked to separate regulatory elements on separate vectors.
- the RNA of the nucleic acid targeting system can be delivered to a transgenic animal or mammal expressing the Cas12i nucleic acid-targeting effector protein, e.g., an animal or mammal that constitutively, inducibly, or conditionally expresses the nucleic acid-targeting effector protein; or to an animal or mammal that otherwise expresses the nucleic acid-targeting effector protein or has cells containing it, for example by prior administration thereto of one or more vectors that encode and express the nucleic acid-targeting effector protein in vivo.
- two or more elements expressed by the same or different regulatory elements may be combined in a single vector, with one or more additional vectors providing any components of the nucleic acid targeting system not contained in the first vector.
- the elements of the nucleic acid targeting system combined in a single vector may be arranged in any suitable orientation, eg, one element is positioned 5' ("upstream") or 3' ("downstream") relative to a second element.
- the coding sequence for one element may be located on the same or opposite strand and oriented in the same or opposite orientation as the coding sequence for a second element.
- a single promoter drives the expression of transcripts encoding a Cas12i nucleic acid-targeting effector protein and a nucleic acid-targeting guide RNA that are embedded within one or more intronic sequences (e.g., each in a different intron, two or more in at least one intron, or all in a single intron).
- the nucleic acid-targeting effector protein and the nucleic acid-targeting guide RNA can be operably linked to and expressed from the same promoter.
- Delivery vehicles, vectors, particles, nanoparticles, formulations and components thereof for expressing one or more elements of the nucleic acid targeting system are as used in aforementioned documents such as WO 2014/093622 (PCT/US2013/074667).
- a vector comprises one or more insertion sites, such as restriction endonuclease recognition sequences (also referred to as "cloning sites").
- one or more insertion sites are located upstream and/or downstream of one or more sequence elements of one or more vectors.
- a single expression construct can be used to target nucleic acid targeting activity to multiple different corresponding target sequences within a cell.
- a single vector may comprise about or greater than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 or more guide sequences.
- about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more such guide sequence-containing vectors may be provided and optionally delivered to a cell.
- the vector comprises a regulatory element operably linked to an enzyme coding sequence encoding a nucleic acid targeting effector protein.
- the Cas12i nucleic acid-targeting effector protein or one or more nucleic acid-targeting guide RNAs may be delivered separately; and advantageously, at least one of these is delivered via a particle complex.
- the nucleic acid-targeting effector mRNA can be delivered before the nucleic acid-targeting guide RNA to allow time for expression of the Cas12i nucleic acid-targeting effector.
- the nucleic acid-targeting effector protein mRNA can be administered 1-12 hours (preferably about 2-6 hours) prior to administration of the nucleic acid-targeting guide RNA.
- nucleic acid-targeting effector protein mRNA and nucleic acid-targeting guide RNA can be administered together.
- the second booster dose of guide RNA may be administered 1-12 hours (preferably about 2-6 hours) after the initial administration of nucleic acid-targeting effector protein mRNA+guide RNA. Additional administration of nucleic acid-targeting effector protein mRNA and/or guide RNA may be useful to achieve the most efficient level of genome modification.
- a CRISPR-Cas12i system comprising: (1) any of the engineered Cas12i effector proteins (such as any of the engineered Cas12i nucleases or functional variants thereof) or a polynucleotide encoding any of the engineered Cas12i effector proteins; and (2) a crRNA or a polynucleotide encoding the crRNA comprising: (i) a spacer capable of hybridizing to a target sequence of a target DNA sequence, and (ii) a direct repeat sequence linked to the spacer sequence capable of directing the engineered Cas12i effector protein to bind to the crRNA to form a CRISPR-Cas12i complex targeting the target sequence.
- a CRISPR-Cas12i system comprising one or more vectors, said one or more vectors comprising: (1) a first regulatory element operably linked to a nucleotide sequence encoding any one of the engineered Cas12i effector proteins (such as any one of the engineered Cas12i nucleases or functional variants thereof); and (2) a second regulatory element operably linked to a polynucleotide encoding a crRNA comprising: (i) a spacer sequence capable of hybridizing to a target sequence of a target DNA, and (ii) a direct repeat sequence linked to the spacer sequence and capable of directing the engineered Cas12i effector protein to bind to the crRNA to form a CRISPR-Cas12i complex targeting the target sequence; wherein the first regulatory element and the second regulatory element are located on the same or different vectors of the CRISPR-Cas12i system.
- the first regulatory element and the second regulatory element are located on different vectors of the CRISPR-Cas12i system. In certain embodiments, the first regulatory element and the second regulatory element are located on the same vector of the CRISPR-Cas12i system. In certain embodiments, the first regulatory element and the nucleotide sequence encoding the engineered Cas12i effector protein are upstream of the second regulatory element and the polynucleotide encoding crRNA. In certain embodiments, the first regulatory element and the nucleotide sequence encoding the engineered Cas12i effector protein are downstream of the second regulatory element and the polynucleotide encoding crRNA. In certain embodiments, the first regulatory element and the second regulatory element are the same. In certain embodiments, said first regulatory element and said second regulatory element are different.
- a CRISPR-Cas12i system comprising a vector, the vector comprising: (1) a polynucleotide encoding any of the engineered Cas12i effector proteins (such as any of the engineered Cas12i nucleases or functional variants thereof); (2) a polynucleotide encoding a crRNA comprising: (i) a spacer sequence capable of hybridizing to a target sequence of a target DNA, and (ii) a direct repeat sequence linked to the spacer sequence and capable of directing the engineered Cas12i effector protein to bind to the crRNA to form a CRISPR-Cas12i complex targeting the target sequence; and (3) a regulatory element operably linked to the polynucleotide encoding the engineered Cas12i effector protein and the polynucleotide encoding the crRNA.
- the vector comprises the regulatory element, the polynucleotide encoding the engineered Cas12i effector protein and the polynucleotide encoding the crRNA from 5' to 3'. In certain embodiments, the vector comprises the regulatory element, the polynucleotide encoding the crRNA and the polynucleotide encoding the engineered Cas12i effector protein from 5' to 3'.
- the polynucleotide encoding the crRNA and the polynucleotide encoding the engineered Cas12i effector protein are connected by a linker sequence, such as a polynucleotide sequence encoding any of P2A, T2A, E2A, F2A, BmCPV 2A, BmIFV 2A, (GS)n (SEQ ID NO:74), (GGGS)n (SEQ ID NO:75) and (GGGGS)n (SEQ ID NO:76) (wherein n is at least 1), or a polynucleotide sequence of any of IRES, SV40, CMV, UBC, EF1α, PGK and CAGG, or any combination thereof.
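The repeat linkers named above, (GS)n, (GGGS)n and (GGGGS)n with n at least 1, are simply a unit peptide repeated n times. The following minimal sketch expands such a linker and back-translates it; the function names and the single fixed codon per residue (GGC for glycine, AGC for serine) are assumptions for illustration, and an actual construct would normally use host-appropriate codon optimization.

```python
# Illustrative sketch: expand a repeat linker and back-translate it to DNA.
# One fixed codon per residue is an assumption made for simplicity.
CODONS = {"G": "GGC", "S": "AGC"}  # glycine, serine
LINKER_UNITS = ("GS", "GGGS", "GGGGS")


def flexible_linker(unit: str, n: int) -> str:
    """Return the peptide (unit)n, e.g. (GGGGS)3 -> GGGGSGGGGSGGGGS."""
    if unit not in LINKER_UNITS:
        raise ValueError(f"unsupported linker unit: {unit}")
    if n < 1:
        raise ValueError("n must be at least 1")
    return unit * n


def linker_dna(unit: str, n: int) -> str:
    """Return a DNA sequence encoding (unit)n using the fixed codon table."""
    return "".join(CODONS[residue] for residue in flexible_linker(unit, n))
```

For instance, `linker_dna("GGGGS", 3)` yields a 45-nt coding sequence for a 15-residue linker.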
- the components of the CRISPR-Cas12i system of the present invention can be delivered in various forms, such as a combination of DNA/RNA, RNA/RNA, or protein/RNA.
- an engineered Cas12i effector protein such as any one of the engineered Cas12i nucleases or a functional variant thereof can be delivered as a DNA-encoding polynucleotide or an RNA-encoding polynucleotide or as a protein.
- the guides can be delivered as DNA-encoding polynucleotides or as RNA. Mixed delivery forms can be used.
- the invention provides methods comprising delivering one or more polynucleotides, e.g., one or more vectors as described herein, one or more transcripts thereof, and/or one or more proteins transcribed therefrom, to a host cell.
- the application provides methods of using any of the engineered Cas12i effector proteins described herein (such as any of the engineered Cas12i nucleases or functional variants thereof) or the CRISPR-Cas12i system to detect target nucleic acids or modify nucleic acids in vitro, ex vivo or in vivo, and methods of using said engineered Cas12i effector protein or CRISPR-Cas12i system for therapy (such as gene editing) or diagnosis.
- also provided are the engineered Cas12i effector proteins or CRISPR-Cas12i systems described herein for use in detecting or modifying nucleic acids in cells, and in treating or diagnosing a disease or condition in a subject; and the use of compositions comprising any one of the engineered Cas12i effector proteins or one or more components of the engineered CRISPR-Cas12i system in the preparation of a medicament for detecting or modifying nucleic acids in cells and for treating or diagnosing a disease or condition in a subject.
- the present application also provides methods for detecting target nucleic acids using any of the engineered Cas12i effector proteins or CRISPR-Cas12i systems with improved activity.
- the use of Cas12i effector proteins as detection reagents can take advantage of the discovery that, once activated by the detection of target DNA, type V CRISPR/Cas proteins (e.g., Cas12i) can promiscuously cleave non-targeted single-stranded nucleic acid (ssDNA or RNA, i.e., a single-stranded nucleic acid to which the guide sequence of the guide RNA does not hybridize).
- when target DNA (double-stranded or single-stranded) is present in a sample (e.g., in some cases above a threshold amount), the result is cleavage of the single-stranded nucleic acid in the sample, which can be detected using any convenient detection method (for example, using tagged single-stranded detection nucleic acids such as DNA or RNA).
- Cas12i can cut ssDNA and ssRNA.
- methods using Cas proteins as detection reagents are described in US10253365 and WO2020/056924, which are hereby incorporated by reference in their entirety.
- a method of detecting target DNA in a sample, comprising: (a) contacting the sample with: (i) an engineered Cas12i effector protein described herein (such as any one of said engineered Cas12i nucleases or functional variants thereof); (ii) a guide RNA, which comprises a guide sequence that hybridizes with said target DNA; and (iii) a detection nucleic acid, which is single-stranded (i.e., a "single-stranded detection nucleic acid") and does not hybridize to the guide sequence of the guide RNA; and (b) measuring a detectable signal generated by cleavage of the single-stranded detection nucleic acid by the engineered Cas12i effector protein.
- the single-stranded detection nucleic acid includes a fluorescence emitting dye pair (eg, the fluorescence emitting dye pair is a fluorescence resonance energy transfer (FRET) pair, a quencher/fluorescent pair).
- the target DNA is viral DNA (eg, papillomavirus, hepadnavirus, herpesvirus, adenovirus, poxvirus, parvovirus, etc.).
- the single-stranded detection nucleic acid is DNA.
- the single-stranded detection nucleic acid is RNA.
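Step (b) of the detection method, measuring a signal from the cleaved single-stranded detection nucleic acid, is often read out as fluorescence compared against no-target controls. The sketch below shows one common way such a call could be made; the mean plus three standard deviations cutoff, the function names, and all numeric readings are assumptions for illustration, not values or criteria from the application.

```python
# Illustrative sketch: call a sample positive when its reporter fluorescence
# exceeds the no-target control mean by a chosen number of standard deviations.
from statistics import mean, stdev


def detection_threshold(control_signals: list[float], n_sd: float = 3.0) -> float:
    """Cutoff = mean(control) + n_sd * sample standard deviation(control)."""
    if len(control_signals) < 2:
        raise ValueError("at least two control readings are required")
    return mean(control_signals) + n_sd * stdev(control_signals)


def target_detected(sample_signal: float, control_signals: list[float],
                    n_sd: float = 3.0) -> bool:
    """True when the sample signal is above the control-derived cutoff."""
    return sample_signal > detection_threshold(control_signals, n_sd)


def fold_over_background(sample_signal: float, control_signals: list[float]) -> float:
    """Signal-to-background ratio relative to the mean control reading."""
    return sample_signal / mean(control_signals)
```

With hypothetical control readings of `[100, 105, 95, 102, 98]` arbitrary fluorescence units, a sample reading of 450 would be called positive and a reading of 104 would not.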
- the method for detecting target DNA (single-stranded or double-stranded) in a sample of the present disclosure can detect target DNA with high sensitivity.
- the methods of the present disclosure can be used to detect target DNA present in a sample comprising a plurality of DNAs, including the target DNA and a plurality of non-target DNAs, wherein the target DNA is present at one or more copies per 10^7 non-target DNAs (e.g., one or more copies per 10^6 non-target DNAs, one or more copies per 10^5 non-target DNAs, one or more copies per 10^4 non-target DNAs).
- the engineered Cas12i effector proteins described herein can detect target DNA with greater sensitivity than the reference Cas12i nuclease. In some embodiments, compared to the reference Cas12i nuclease, the engineered Cas12i effector protein can detect target DNA with a sensitivity that is higher by 10%, 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or more.
- the present application provides methods of modifying a target nucleic acid comprising a target sequence comprising contacting the target nucleic acid with any of the engineered CRISPR-Cas12i systems described herein.
- the methods are performed in vitro.
- the target nucleic acid is present in a cell.
- the cells are bacterial cells, yeast cells, mammalian cells, plant cells, or animal cells.
- the methods are performed ex vivo.
- the methods are performed in vivo.
- Target nucleic acid modification includes, but is not limited to, target nucleic acid single-strand cleavage, double-strand cleavage, base substitution, base insertion, base deletion, mutation (such as pathogenic mutation) sequence repair, etc.
- the target nucleic acid is cleaved or a target sequence in the target nucleic acid is altered by the engineered CRISPR-Cas12i system. In some embodiments, expression of the target nucleic acid is altered by the engineered CRISPR-Cas12i system. In some embodiments, the target nucleic acid is genomic DNA. In some embodiments, the target sequence is associated with a disease or condition, eg, based on misexpression (eg, overexpression, nonexpression, or expression of a pathogenic RNA or protein) of the target sequence. In some embodiments, the engineered CRISPR-Cas12i system comprises an array of precursor guide RNAs encoding multiple crRNAs, wherein each crRNA comprises a different guide sequence.
- the present application provides a method of treating a disease or condition associated with a target nucleic acid in a cell of an individual, comprising modifying the target nucleic acid in the cell of the individual using any of the methods described herein, thereby treating the disease or condition.
- the disease or condition is selected from the group consisting of cancer, cardiovascular disease, genetic disease (e.g., a gene defect disease such as sickle cell anemia (SCD) or beta-thalassemia (TDT)), autoimmune diseases, metabolic diseases, neurodegenerative diseases, eye diseases, bacterial infections and viral infections.
- the engineered CRISPR-Cas12i system described herein can modify target nucleic acids in cells in a variety of ways, depending on the type of engineered Cas12i effector protein in the CRISPR-Cas12i system.
- the methods induce site-specific cleavage in the target nucleic acid.
- the methods cleave genomic DNA in cells such as bacterial cells, plant cells, or animal cells (eg, mammalian cells).
- the method kills the cell by cleaving the genomic DNA in the cell.
- the methods cleave viral nucleic acid in the cell.
- the methods alter (eg, increase or decrease) the expression level of the target nucleic acid in a cell.
- the method uses an engineered Cas12i effector protein to increase the expression level of the target nucleic acid in a cell, eg, based on an enzymatically inactive Cas12i protein fused to a transactivation domain.
- the method uses an engineered Cas12i effector protein to reduce the expression level of the target nucleic acid in a cell, eg, based on an enzymatically inactive Cas12i protein fused to a transcriptional repression domain.
- the method uses an engineered Cas12i effector protein, e.g., based on an enzymatically inactive Cas12i protein fused to an epigenetic modification domain, to introduce epigenetic modifications into the target nucleic acid or into proteins associated with the target nucleic acid in the cell (e.g., proteins that bind to the target nucleic acid or are in its vicinity, such as transcription factors or histones).
- the engineered Cas12i system described herein can be used to introduce other modifications into the target nucleic acid, depending on the functional domains comprised by the engineered Cas12i effector protein.
- the methods alter a target sequence in the target nucleic acid in a cell.
- the method introduces a mutation into the target nucleic acid in the cell, such as mutating a pathogenic mutation into a non-pathogenic sequence (e.g., using any dCas12i described in the present invention).
- the methods use one or more endogenous DNA repair pathways, such as non-homologous end joining (NHEJ) or homology-directed recombination (HDR), to repair the DNA double-strand breaks induced in the target DNA in the cell as a result of sequence-specific cleavage by the CRISPR complex.
- Exemplary mutations include, but are not limited to, insertions, deletions, substitutions, and frameshifts.
- pegRNA is used to simultaneously perform target cleavage and provide a template for DNA repair.
- the methods insert donor DNA at the target site.
- insertion of donor DNA results in the introduction of a selectable marker or reporter protein into the cell.
- insertion of donor DNA results in knock-in of the gene.
- insertion of donor DNA results in a knockout mutation.
- insertion of donor DNA results in substitutional mutations such as single nucleotide substitutions.
- the method induces a phenotypic change in the cell.
- the engineered CRISPR-Cas12i system is used as part of a genetic circuit, or to insert a genetic circuit into the genomic DNA of a cell.
- the inducer-controlled engineered split-type Cas12i effector proteins described herein are particularly useful as components of genetic circuits.
- Genetic circuits can be used in gene therapy. Methods and techniques for designing and using genetic circuits are known in the art. Further reference can be made to eg Brophy, Jennifer AN, and Christopher A. Voigt. "Principles of genetic circuit design.” Nature methods 11.5 (2014): 508.
- the target nucleic acid is in a cell.
- the target nucleic acid is genomic DNA.
- the target nucleic acid is extrachromosomal DNA.
- the target nucleic acid is foreign to the cell.
- the target nucleic acid is viral nucleic acid such as viral DNA.
- the target nucleic acid is a plasmid in a cell.
- the target nucleic acid is a horizontally transferred plasmid.
- the target nucleic acid is RNA.
- the target nucleic acid is an isolated nucleic acid such as isolated DNA. In some embodiments, the target nucleic acid is present in a cell-free environment. In some embodiments, the target nucleic acid is an isolated vector such as a plasmid. In some embodiments, the target nucleic acid is an isolated linear DNA fragment.
- the cells are bacteria, yeast cells, fungal cells, algal cells, plant cells, or animal cells (eg, mammalian cells, such as human cells).
- the cells are of natural origin such as cells isolated from a tissue biopsy.
- the cells are cells isolated from cell lines cultured in vitro.
- the cells are from a primary cell line.
- the cells are from an immortalized cell line.
- the cells are genetically engineered cells.
- the cell is an animal cell of an organism selected from the group consisting of cattle, sheep, goats, horses, pigs, deer, chickens, ducks, geese, rabbits, and fish.
- the cell is a plant cell of an organism selected from the group consisting of corn, wheat, barley, oat, rice, soybean, oil palm, safflower, sesame, tobacco, flax, cotton, sunflower, pearl millet, corn, sorghum, canola, hemp, vegetable crops, forage crops, industrial crops, tree crops and biomass crops.
- the cells are mammalian cells. In some embodiments, the cells are human cells. In some embodiments, the human cells are human embryonic kidney 293T (HEK293T or 293T) cells or HeLa cells. In some embodiments, the cells are human embryonic kidney (HEK293T) cells. In some embodiments, the cells are mouse Hepa1-6 cells. In some embodiments, the mammalian cells are selected from the group consisting of immune cells, liver cells, tumor cells, stem cells, blood cells, nerve cells, zygotes, muscle cells (e.g., cardiomyocytes), and skin cells.
- the cells are immune cells selected from the group consisting of cytotoxic T cells, helper T cells, natural killer (NK) T cells, iNK-T cells, NK-T-like cells, γδ T cells, tumor-infiltrating T cells and dendritic cell (DC)-activated T cells.
- the methods generate modified immune cells, such as CAR-T cells or TCR-T cells.
- the cells are embryonic stem (ES) cells, induced pluripotent stem (iPS) cells, progenitor cells of gametes, cells in gametes, zygotes, or embryos.
- the methods described herein can be used to modify target cells in vivo, ex vivo, or in vitro, and can be carried out in such a way that, once the cells are modified, progeny or cell lines of the modified cells retain the altered phenotype.
- the modified cells and progeny may be part of a multicellular organism such as a plant or animal with ex vivo or in vivo applications such as genome editing and gene therapy.
- the methods are performed ex vivo.
- the modified cells eg, mammalian cells
- the modified cells are propagated ex vivo.
- the modified cells are cultured to propagate for at least about any of 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 10 days, 12 days, or 14 days.
- the modified cells are cultured for no more than about any of 1 day, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 10 days, 12 days, or 14 days.
- the modified cells are further evaluated or screened to select cells with one or more desired phenotypes or properties.
- the target sequence is a sequence associated with a disease or condition.
- diseases or conditions include, but are not limited to, cancer, cardiovascular disease, genetic disease, autoimmune disease, metabolic disease, neurodegenerative disease, eye disease, bacterial infection, and viral infection.
- the disease or condition is a genetic disease.
- the disease or condition is a monogenic disease or condition.
- the disease or condition is a polygenic disease or condition.
- the target sequence has a mutation compared to the wild-type sequence. In some embodiments, the target sequence has a single nucleotide polymorphism (SNP) associated with a disease or condition.
- the donor DNA inserted into the target nucleic acid encodes a biological product selected from the group consisting of reporter proteins, antigen-specific receptors, therapeutic proteins, antibiotic resistance proteins, RNAi molecules, cytokines, kinases, antigens, cytokine receptors, and suicide polypeptides.
- the donor DNA encodes a therapeutic protein.
- the donor DNA encodes a therapeutic protein useful in gene therapy.
- the donor DNA encodes a therapeutic antibody.
- the donor DNA encodes an engineered receptor, such as a chimeric antigen receptor (CAR) or an engineered TCR.
- the donor DNA encodes a therapeutic RNA, such as a small RNA (eg, siRNA, shRNA, or miRNA) or a long non-coding RNA (lincRNA).
- the methods described herein can be used for multiplex gene editing or modulation at two or more (e.g., 2, 3, 4, 5, 6, 8, 10 or more) different target sites.
- the methods detect or modify multiple target nucleic acids or target nucleic acid sequences.
- the method comprises contacting the target nucleic acid with a guide RNA comprising a plurality (e.g., 2, 3, 4, 5, 6, 8, 10, or more) of crRNA sequences, wherein each crRNA comprises different target sequences.
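The multiplex arrangement described above, a single guide RNA carrying several crRNA units with different target sequences, can be sketched as a simple concatenation of repeat-spacer units. This is a minimal illustration, not the applicant's construct: the function name, the placeholder sequences, and the repeat-then-spacer ordering within each unit are assumptions made for the example.

```python
def build_crrna_array(direct_repeat: str, spacers: list[str]) -> str:
    """Concatenate repeat-spacer units into one array sequence.

    Assumption: each unit is written as direct repeat followed by spacer.
    """
    if not spacers:
        raise ValueError("at least one spacer is required")
    if len(set(spacers)) != len(spacers):
        raise ValueError("each crRNA unit should carry a different spacer")
    return "".join(direct_repeat + spacer for spacer in spacers)


# Hypothetical placeholder sequences, for illustration only.
DR = "AAAGGG"
SPACERS = ["ACGTACGT", "TTGGCCAA", "GATTACAG"]

array = build_crrna_array(DR, SPACERS)
print(array)
```

Rejecting duplicate spacers mirrors the requirement that each crRNA comprises a different target sequence.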
- engineered cells comprising a modified target nucleic acid produced using any of the methods described herein.
- the engineered cells can be used in cell therapy.
- Autologous or allogeneic cells can be used to prepare engineered cells using the methods described herein for cell therapy.
- the methods described herein can also be used to generate isogenic lines of cells (eg, mammalian cells) to study genetic variants.
- engineered non-human animals comprising engineered cells described herein.
- the engineered non-human animal is a genome-edited non-human animal.
- the engineered non-human animals can be used as disease models.
- Methods for producing non-human genome-edited or transgenic animals include, but are not limited to: pronuclear microinjection, viral infection, and transformation of embryonic stem cells and induced pluripotent stem (iPS) cells.
- Detailed methods that can be used include, but are not limited to, those described by Sundberg and Ichiki (2006, Genetically Engineered Mice Handbook, CRC Press) and those described by Gibson (2004, A Primer Of Genome Science 2nd ed. Sunderland, Mass.: Sinauer).
- the engineered animal can be of any suitable species including, but not limited to: cattle, horses, sheep, dogs, deer, felines, goats, pigs, primates, and lesser-known mammals such as elephants, zebra or camel.
- a use of the aforementioned engineered CRISPR-Cas12i system in preparing a medicament for treating a disease or disorder associated with a target nucleic acid in cells of an individual is provided.
- methods of using the aforementioned engineered CRISPR-Cas12i system to treat a disease or disorder associated with a target nucleic acid in cells of an individual are provided.
- the present invention provides a method of treating a disease in a subject (such as a human) in need thereof, the method comprising administering (such as intravenous injection or infusion) to the subject a drug of the present invention.
- the CRISPR-Cas12i system comprises: (1) any of the engineered Cas12i effector proteins described herein (such as any of the engineered Cas12i nucleases or functional variants thereof described herein), or a polynucleotide encoding the engineered Cas12i effector protein; and (2) a crRNA, or a polynucleotide encoding the crRNA, the crRNA comprising: (i) a spacer sequence capable of hybridizing to a target sequence on a target nucleic acid associated with the disease, and (ii) a direct repeat sequence linked to the spacer sequence and capable of guiding the engineered Cas12i effector protein to bind to the crRNA to form a CRISPR-Cas12i complex targeting the target sequence; wherein hybridization of the spacer sequence to the target sequence media
- the present application provides methods of treating a disease or condition associated with a target nucleic acid in a cell of an individual comprising contacting the target nucleic acid with any of the engineered CRISPR-Cas12i systems described herein, wherein the guide sequence of the guide RNA is complementary to the target sequence of the target nucleic acid, wherein the engineered Cas12i effector protein and the guide RNA associate with each other to bind to the target nucleic acid to modify (such as cutting or base replacing) said target nucleic acid, thereby allowing said disease or condition to be treated.
- mutations are introduced into the target nucleic acid.
- expression of the target nucleic acid is enhanced.
- expression of the target nucleic acid is inhibited.
- the present application provides methods of treating a disease or condition in an individual comprising administering to the individual an effective amount of any of the engineered CRISPR-Cas12i systems described herein and a donor DNA encoding a therapeutic agent (such as a non-pathogenic native sequence used as a repair template), wherein the guide sequence of the guide RNA is complementary to the target sequence of the target nucleic acid of the individual, wherein the engineered Cas12i effector protein and the guide RNA associate with each other to bind to the target nucleic acid and insert the donor DNA into the target sequence, thereby allowing the disease or condition to be treated.
- the present application provides a method of treating a disease or condition in an individual comprising administering to the individual an effective amount of an engineered cell comprising a modified target nucleic acid, wherein the engineered cell is prepared by contacting the cell with any of the engineered CRISPR-Cas12i systems described herein, wherein the guide sequence of the guide RNA is complementary to the target sequence of the target nucleic acid, wherein the engineered Cas12i effector protein and the guide RNA associate with each other to bind to the target nucleic acid to modify the target nucleic acid.
- the engineered cells are immune cells.
- the engineered cells are stem cells (eg, hematopoietic stem cells, neural stem cells).
- the engineered cells are neural cells.
- the individual is a human.
- the individual is an animal, eg, a model animal such as a rodent, pet, or farm animal.
- the individual is a mammal, such as a cat, dog, rabbit, hamster, and the like.
- the disease or condition is selected from the group consisting of cancer, cardiovascular disease, genetic disease, autoimmune disease, metabolic disease, neurodegenerative disease, eye disease, bacterial infection, and viral infection.
- the target nucleic acid is PCSK9.
- the disease or condition is cardiovascular disease.
- the disease or condition is coronary artery disease.
- the method reduces cholesterol levels in the individual.
- the method treats diabetes in the individual.
- diseases that the CRISPR-Cas12i system of the present invention can be used to treat include, but are not limited to, cystic fibrosis, hereditary angioedema, diabetes, progressive pseudohypertrophic muscular dystrophy, Becker muscular dystrophy, alpha-1-antitrypsin deficiency, Pompe disease, myotonic dystrophy, Huntington's disease, fragile X syndrome, Friedreich's ataxia, amyotrophic lateral sclerosis, frontotemporal dementia, hereditary chronic kidney disease, hyperlipidemia, hypercholesterolemia, Leber's congenital amaurosis, sickle cell disease, and beta thalassemia.
- the condition or disease is transthyretin amyloidosis, such as wild-type transthyretin amyloidosis (ATTRwt), hereditary transthyretin amyloidosis (ATTRm), familial amyloid polyneuropathy (FAP, ATTR-PN), or familial amyloid cardiomyopathy (FAC, ATTR-CM).
- the disorder or disease is transthyretin instability caused by TTR gene mutation or abnormal expression (such as high expression).
- the disorder or disease is other disorder or disease caused by TTR gene mutation or abnormal expression (such as high expression), or a derivative disorder or disease.
- the engineered CRISPR-Cas12i system described herein, or components thereof, nucleic acid molecules thereof, or nucleic acid molecules encoding or providing components thereof can be delivered to host cells by various delivery systems such as plasmids or viruses (for example, any of the vectors described in the "Constructs and Vectors" section above).
- the engineered CRISPR-Cas12i system can be delivered by other means, such as nucleofection or electroporation of a ribonucleoprotein complex consisting of the engineered Cas12i effector protein and its one or more cognate RNA guide sequences.
- delivery is via nanoparticles or exosomes.
- paired Cas12i nickase complexes can be delivered directly using nanoparticles or other direct protein delivery methods such that complexes comprising two paired crRNA elements are co-delivered.
- proteins can be delivered to cells via viral vectors or directly, followed by delivery of CRISPR arrays containing two paired spacers for double nicking.
- the RNA can be conjugated to at least one sugar moiety such as N-acetylgalactosamine (GalNAc) (particularly a triantennary GalNAc).
- the engineered CRISPR-Cas12i system can be delivered using any delivery method suitable for the disease being treated, such as delivery by intravenous injection or drip, or local delivery at the diseased site (such as intratumoral delivery).
- Suitable routes of administration of the engineered CRISPR-Cas12i systems described herein include, but are not limited to: topical, subcutaneous, transdermal, intradermal, intralesional, intra-articular, intraperitoneal, intravesical, transmucosal, gingival, intradental, intracochlear, transtympanic, intraorgan, epidural, intrathecal, intramuscular, intravenous, intravascular, intraosseous, periocular, intratumoral, intracerebral and intraventricular administration.
- the engineered CRISPR-Cas12i system described herein is administered to a subject by injection, by catheter, by suppository, or by implant, the implant being a porous, non-porous, or gel-like material, including membranes, such as sialastic membranes, or fibers.
- compositions, kits, unit doses, and articles of manufacture comprising one or more components of any of the engineered Cas12i nucleases or functional variants thereof, engineered Cas12i effector proteins, or engineered CRISPR-Cas12i systems described herein are also provided.
- kits comprising one or more AAV vectors encoding any of the engineered Cas12i nucleases or functional variants thereof, engineered Cas12i effector proteins, or engineered CRISPR-Cas12i systems described herein.
- the kit further comprises one or more guide RNAs (or DNAs or vectors encoding them).
- the kit further comprises donor DNA.
- the kit further comprises cells such as human cells.
- the kit may comprise one or more additional components, such as containers, reagents, media, cytokines, buffers, antibodies, etc., to allow propagation of the engineered cells.
- the kit may also comprise a device for administering the composition.
- the kit may also comprise instructions for using the engineered CRISPR-Cas12i system described herein, such as methods for detecting or modifying target nucleic acids.
- the kit comprises instructions for treating or diagnosing the disease or condition.
- the instructions pertaining to the use of the kit components will generally include information on the amount to be administered, the schedule of administration and the route of administration for the intended treatment.
- the container can be a unit dose, bulk package (eg, a multi-dose package), or a subunit dose.
- kits comprising sufficient doses of a composition disclosed herein can be provided to provide effective treatment of an individual over an extended period of time.
- the kit can also include a plurality of unit doses of the composition and instructions for use packaged in quantities sufficient for storage and use in a pharmacy (eg, hospital pharmacy and compounding pharmacy).
- kits of the invention are in suitable packaging.
- suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (eg, sealed mylar or plastic bags), and the like. Kits may optionally provide additional components such as buffers and explanatory information. Accordingly, the present application also provides an article of manufacture including vials (eg, sealed vials), bottles, jars, flexible packaging, and the like.
- the article of manufacture may comprise a container and a label or package insert on or adhered to the container.
- Suitable containers include, for example, bottles, vials, syringes, and the like.
- the container can be formed from a variety of materials such as glass or plastic.
- the container contains a composition effective to treat a disease or condition described herein, and may have a sterile access port (e.g., the container may be a bag of intravenous solution or a vial with a stopper pierceable by a hypodermic needle).
- the label or package insert indicates that the composition is used to treat a particular condition in an individual.
- the label or package insert will further include instructions for administering the composition to the individual.
- Package Insert means the instructions commonly included in commercial packages of therapeutic products that contain information on the indications, usage, dosage, administration, contraindications and/or warnings regarding the use of such therapeutic products.
- the article of manufacture may also include a second container comprising a pharmaceutically acceptable buffer, such as bacteriostatic water for injection (BWFI), phosphate buffered saline, Ringer's solution, and dextrose solution.
- the ingredients are presented individually or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampoule or sachet showing the active dose.
- When the drug is administered by infusion, it can be dispensed with an infusion bottle filled with sterile pharmaceutical grade water or saline.
- an ampoule of sterile water for injection or saline can be provided so that the ingredients can be mixed prior to administration.
- Example 1 Replace the amino acid that interacts with PAM in the reference Cas12i2 enzyme with a positively charged amino acid, and verify its gene editing efficiency.
- the coding sequence of Cas12i2 was codon optimized (human) and synthesized. Variants of the Cas protein were generated by PCR-based site-directed mutagenesis. Specifically, the DNA sequence of the Cas12i2 protein was divided into two parts centered on the mutation site, two pairs of primers were designed to amplify the two parts, and the sequence to be mutated was introduced into the primers; finally, both fragments were assembled into the pCAG-2A-eGFP vector (with penicillin resistance) by Gibson cloning. Combinations of mutations were constructed by splitting the DNA of the Cas12i2 protein into multiple segments and assembling them using PCR and Gibson cloning.
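The two-fragment strategy described above can be sketched in code: substitute the codon at the mutation site, then cut the mutated coding sequence into an upstream and a downstream fragment that share a short overlap spanning the new codon, so the two PCR products can be joined by overlap-based assembly. This is a minimal sketch under stated assumptions; the function name, the toy coding sequence, the chosen codon, and the `flank` length are hypothetical and are not taken from the application.

```python
def split_for_mutagenesis(cds: str, residue: int, new_codon: str, flank: int = 9):
    """Return (mutated_cds, upstream_fragment, downstream_fragment).

    residue is 1-based. Both fragments contain the mutated codon plus up to
    `flank` bases on each side, which forms their shared overlap.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    start = (residue - 1) * 3
    end = start + 3
    if residue < 1 or end > len(cds):
        raise ValueError("residue is outside the coding sequence")
    mutated = cds[:start] + new_codon + cds[end:]
    left = max(0, start - flank)
    right = min(len(mutated), end + flank)
    return mutated, mutated[:right], mutated[left:]


# Toy coding sequence (M-E-T-K-G-L-stop), hypothetical and for illustration only.
toy_cds = "ATGGAAACCAAAGGTCTGTAA"
mutated, upstream, downstream = split_for_mutagenesis(toy_cds, 4, "CGT", flank=3)
print(mutated)
print(upstream)
print(downstream)
```

In this toy case the fourth codon (AAA, lysine) becomes CGT (arginine), and the nine bases shared by the two fragments carry the mutation, which corresponds to introducing the mutated sequence through the primers.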
- the position of the mutant is determined by analyzing the structural information of Cas12i2 using protein structure visualization software commonly used in the art (for example, PyMol, Chimera, etc. software can be selected).
- the structural information of Cas12i2 refers to PDB: 6LTU, 6LTR, 6LU0, 6LTP.
- Cas12i2 effector protein was expressed in human 293T cells by pCAG-2A-eGFP vector.
- the DNA encoding the Cas12i2 protein was inserted between XmaI and NheI.
- a vector expressing the crRNA of Cas12i2 in 293T was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-i2-crRNA backbone.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well-cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding the Cas12i2 protein and 300 ng of a plasmid encoding the crRNA were transfected into each well of the 24-well cell culture dishes.
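As a small worked example of the per-well amounts stated above (600 ng of Cas12i2 plasmid and 300 ng of crRNA plasmid), the totals for a batch of wells can be computed as follows. The function name and the `extra_wells` allowance for pipetting loss are assumptions made for illustration; the application specifies only the per-well masses.

```python
def plasmid_totals(n_wells: int, extra_wells: int = 1,
                   cas_ng_per_well: int = 600, crrna_ng_per_well: int = 300) -> dict:
    """Total plasmid mass (ng) for n_wells plus a hypothetical pipetting allowance."""
    if n_wells < 1 or extra_wells < 0:
        raise ValueError("n_wells must be >= 1 and extra_wells >= 0")
    wells = n_wells + extra_wells
    return {
        "cas12i2_plasmid_ng": wells * cas_ng_per_well,
        "crrna_plasmid_ng": wells * crrna_ng_per_well,
    }


totals = plasmid_totals(n_wells=6)
print(totals)
```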
- HEK293T cells to be subjected to fluorescence-activated cell sorting were digested with trypsin-EDTA (0.05%) (Gibco).
- Cell sorting was performed using MoFlo XDP (Beckman Coulter) with GFP channel.
- FACS-sorted GFP-positive HEK293T cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- target loci are directly amplified by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing.
- the frequency (%) of indels was analyzed using CRISPResso2 software by calculating the proportion of reads containing insertions or deletions. In this application, indel frequency (%) is used uniformly as the index for comparing and analyzing gene editing efficiency. Reads whose count was less than 0.05% of the total reads were discarded.
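The indel-frequency calculation described above can be illustrated with a short sketch. It reflects one reading of the text, namely that reads are grouped by allele, alleles whose read count is below 0.05% of all reads are dropped, and the indel frequency is the share of the remaining reads that carry an insertion or deletion. It is not CRISPResso2's implementation; the `Allele` record, the function name, and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Allele:
    sequence: str     # allele label or sequence
    n_reads: int      # number of reads supporting this allele
    has_indel: bool   # True if the allele carries an insertion or deletion


def indel_frequency(alleles: list[Allele], min_fraction: float = 0.0005) -> float:
    """Percent of retained reads carrying an indel (0.0005 == 0.05%)."""
    total = sum(a.n_reads for a in alleles)
    if total == 0:
        raise ValueError("no reads supplied")
    kept = [a for a in alleles if a.n_reads / total >= min_fraction]
    kept_total = sum(a.n_reads for a in kept)
    if kept_total == 0:
        raise ValueError("all alleles fell below the read-count threshold")
    indel_reads = sum(a.n_reads for a in kept if a.has_indel)
    return 100.0 * indel_reads / kept_total


# Hypothetical counts, for illustration only.
example = [
    Allele("unedited", 7000, False),
    Allele("deletion_allele", 2996, True),
    Allele("rare_insertion", 4, True),  # 0.04% of reads, discarded
]
freq = indel_frequency(example)
print(round(freq, 2))
```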
- Example 1-A selected four engineered Cas12i2 with a single amino acid substitution
- the engineered Cas12i2 enzymes with a single mutation in the amino acid sequence were respectively expressed, and the preferred amino acid replacement method and its corresponding gene editing efficiency are shown in Figure 1a, Figure 1b and Table 1.
- Example 1-B compares the engineered Cas12i2 with multiple preferred amino acid substitutions
- we combined the point mutations of the four efficiency-enhancing mutants screened in Example 1-A: E176R, K238R, T447R and E563R.
- as shown in Figure 1b and Table 1, by comparing the gene editing efficiency of these mutants and wild-type Cas12i2 at 3 genomic loci (CCR5-3, CCR5-5, and RNF2-7) in 293T cells, we found that combining the point mutations can yield mutants with further improved efficiency.
- the coding sequences of the crRNA spacers for CCR5-3, RNF2-7 and CCR5-5 used here are shown in SEQ ID NO: 60, 61 and 62, respectively.
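To make the combination step concrete, the sketch below enumerates every multi-mutation combination of the four single substitutions named in Example 1-B. The text does not state which combinations were actually constructed and tested, so the full enumeration, the function name, and the slash-separated naming are assumptions made for illustration.

```python
from itertools import combinations

SINGLE_MUTATIONS = ["E176R", "K238R", "T447R", "E563R"]


def combination_variants(mutations: list[str], min_size: int = 2) -> list[str]:
    """List all combinations containing at least `min_size` mutations."""
    return [
        "/".join(combo)
        for size in range(min_size, len(mutations) + 1)
        for combo in combinations(mutations, size)
    ]


variants = combination_variants(SINGLE_MUTATIONS)
for name in variants:
    print(name)
```

Four single mutations give six double, four triple, and one quadruple mutant, so the candidate space is small enough to screen exhaustively.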
- Example 2 Replace the amino acids involved in opening the double-strand of DNA in the reference Cas12i2 enzyme with amino acids with aromatic rings, and verify their gene editing efficiencies
- the variant of Cas12i2 protein was produced by site-directed mutagenesis based on PCR.
- the specific method was to divide the DNA sequence design of Cas12i2 protein into two parts centered on the mutation site, and design two pairs of primers to amplify the two parts of DNA sequence respectively.
- the sequence to be mutated was introduced into the primer, and finally the two fragments were loaded into the pCAG-2A-eGFP vector by means of Gibson clone.
- the determination of the amino acid substitution position can be obtained by analyzing the structural information of Cas12i2 using common protein structure visualization software (for example, software such as PyMol and Chimera can be used).
- the structural information of Cas12i2 refers to PDB: 6LTU, 6LTR, 6LU0, 6LTP.
- Cas12i2 effector protein was expressed in human 293T cells by pCAG-2A-eGFP vector.
- the DNA encoding the Cas12i2 protein was inserted between XmaI and NheI.
- a vector expressing the Cas12i2 crRNA in 293T was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-i2-crRNA backbone.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well-cell culture dishes (Corning) for 16 hours until the cell density reached 70%. Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding the Cas12i2 protein and 300 ng of a plasmid encoding the crRNA were transfected into each well of the 24-well cell culture dishes. After transfection68, HEK293T cells to be subjected to fluorescence-activated cell sorting (FACS) were digested with trypsin-EDTA (0.05%) (Gibco). Cell sorting was performed using MoFlo XDP (Beckman Coulter) with GFP channel.
- FACS-sorted GFP-positive HEK293T cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- target loci are directly amplified by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing. The frequency (%) of indels was analyzed using CRISPResso2 software by calculating the proportion of reads containing insertions or deletions. Reads whose count was less than 0.05% of the total reads were discarded.
- Example 3 Replace the amino acid located in the RuvC domain and interacting with the single-stranded DNA substrate in the reference Cas12i2 enzyme with a positively charged amino acid, and verify its gene editing efficiency
- the variant of Cas12i2 protein was produced by site-directed mutagenesis based on PCR.
- the specific method was to divide the DNA sequence design of Cas12i2 protein into two parts centered on the mutation site, and design two pairs of primers to amplify the two parts of DNA sequence respectively.
- the sequence to be mutated was introduced into the primer, and finally the two fragments were loaded into the pCAG-2A-eGFP vector by means of Gibson clone.
- the combination of mutants is constructed by splitting the DNA of the Cas12i2 protein into multiple segments and using PCR and Gibson clone.
- the location of the mutant is determined by analyzing the structural information of Cas12i2 with commonly used protein structure visualization software (for example, PyMol, Chimera and other software can be used).
- the structural information of Cas12i2 refers to PDB ID: 6LTU, 6LTR, 6LU0, 6LTP.
- the ssDNA substrate shown in these Cas12i2 structures is only 5nt.
- Cas12i2 effector protein was expressed in human 293T cells by pCAG-2A-eGFP vector.
- the DNA encoding the Cas12i2 protein was inserted between XmaI and NheI.
- a vector expressing the Cas12i2 crRNA in 293T was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-i2-crRNA backbone.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- the coding sequences of the crRNA spacers for CCR5-3 and RNF2-7 are shown in SEQ ID NO: 60 and 61, respectively.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well-cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding the Cas12i2 protein and 300 ng of a plasmid encoding the crRNA were transfected into each well of the 24-well cell culture dishes.
- HEK293T cells to be subjected to fluorescence-activated cell sorting were digested with trypsin-EDTA (0.05%) (Gibco).
- Cell sorting was performed using MoFlo XDP (Beckman Coulter) with GFP channel.
- FACS-sorted GFP-positive HEK293T cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- target loci are directly amplified by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing. The frequency (%) of indels was analyzed using CRISPResso2 software by calculating the proportion of reads containing insertions or deletions. Reads whose count was less than 0.05% of the total reads were discarded.
- *L439(L+GG) means that two glycines are inserted after the 439th amino acid.
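The mutation notation used in these examples can be made explicit with a small parser. The sketch handles the two forms that appear in the text: a substitution such as E176R, and an insertion such as L439(L+GG), which places two glycines after residue 439. The function names and the toy protein are hypothetical; the real Cas12i2 sequence is not reproduced here.

```python
import re

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
INSERTION = re.compile(r"^([A-Z])(\d+)\(\1\+([A-Z]+)\)$")


def _check_reference(protein: str, position: int, expected: str) -> None:
    """Ensure the 1-based position exists and holds the expected residue."""
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the protein")
    if protein[position - 1] != expected:
        raise ValueError(
            f"expected {expected} at {position}, found {protein[position - 1]}"
        )


def apply_mutation(protein: str, mutation: str) -> str:
    """Apply one substitution (E176R) or insertion (L439(L+GG)) to a sequence."""
    match = INSERTION.match(mutation)
    if match:
        ref, pos, inserted = match.group(1), int(match.group(2)), match.group(3)
        _check_reference(protein, pos, ref)
        return protein[:pos] + inserted + protein[pos:]
    match = SUBSTITUTION.match(mutation)
    if match:
        ref, pos, new = match.group(1), int(match.group(2)), match.group(3)
        _check_reference(protein, pos, ref)
        return protein[:pos - 1] + new + protein[pos:]
    raise ValueError(f"unrecognized mutation notation: {mutation}")


# Toy five-residue protein, hypothetical and for illustration only.
toy = "MKLEA"
substituted = apply_mutation(toy, "E4R")
inserted = apply_mutation(toy, "L3(L+GG)")
print(substituted, inserted)
```

Checking the reference residue before editing catches off-by-one numbering mistakes, which matter when several mutations are combined in one variant.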
- Example 4 Replace the amino acid in the reference Cas12i2 enzyme that interacts with the DNA-RNA double helix with a positively charged amino acid, and verify its gene editing efficiency
- the variant of Cas12i2 protein was produced by site-directed mutagenesis based on PCR.
- the specific method was to divide the DNA sequence design of Cas12i2 protein into two parts centered on the mutation site, and design two pairs of primers to amplify the two parts of DNA sequence respectively.
- the sequence to be mutated was introduced into the primer, and finally the two fragments were loaded into the pCAG-2A-eGFP vector by means of Gibson clone.
- the combination of mutants is constructed by splitting the DNA of the Cas12i2 protein into multiple segments and using PCR and Gibson clone.
- the location of the mutant is determined by analyzing the structural information of Cas12i2 with commonly used protein structure visualization software (for example, PyMol, Chimera and other software can be used).
- the structural information of Cas12i2 refers to PDB: 6LTU, 6LTR, 6LU0, 6LTP.
- Cas12i2 effector protein was expressed in human 293T cells by pCAG-2A-eGFP vector.
- the DNA encoding the Cas12i2 protein was inserted between XmaI and NheI.
- a vector expressing the Cas12i2 crRNA in 293T was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-i2-crRNA backbone.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- the coding sequences of the crRNA spacers for CCR5-3 and RNF2-7 are shown in SEQ ID NO: 60 and 61, respectively.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well-cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding the Cas12i2 protein and 300 ng of a plasmid encoding the crRNA were transfected into each well of the 24-well cell culture dishes.
- HEK293T cells to be subjected to fluorescence-activated cell sorting were digested with trypsin-EDTA (0.05%) (Gibco).
- Cell sorting was performed using MoFlo XDP (Beckman Coulter) with GFP channel.
- FACS-sorted GFP-positive HEK293T cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- target loci are directly amplified by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing. The frequency (%) of indels was analyzed using CRISPResso2 software by calculating the proportion of reads containing insertions or deletions. Reads whose count was less than 0.05% of the total reads were discarded.
- Figure 4 and Table 4 summarize the gene editing efficiencies of Cas12i2 mutants in this example and wild-type Cas12i2 at two genomic loci: CCR5-3 and RNF2-7 in 293T cells.
- seven mutants G116R, E117R, T159R, S161R, E319R, E343R, and D958R were able to effectively improve gene editing efficiency (at least at one genomic locus).
- D958R showed excellent gene editing efficiency at both genomic loci.
- Example 5 Combining some of the engineered Cas12i2 amino acid mutations obtained in Examples 1-4 to further improve gene editing efficiency, and verifying their gene editing efficiency.
- The combination mutants were constructed by splitting the DNA encoding the Cas12i2 protein into multiple segments and assembling them using PCR and Gibson cloning.
- The location of each mutation was determined by analyzing the structural information of Cas12i2 with commonly used protein structure visualization software (for example, PyMol or Chimera).
- The structural information of Cas12i2 refers to PDB entries 6LTU, 6LTR, 6LU0 and 6LTP.
- The Cas12i2 effector protein was expressed in human 293T cells using the pCAG-2A-eGFP vector.
- The DNA encoding the Cas12i2 protein was inserted between the XmaI and NheI sites.
- A vector expressing the Cas12i2 crRNA in 293T cells was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-i2-crRNA backbone.
- The nucleotide sequence encoding the DR is SEQ ID NO:59.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well-cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding Cas12i2 protein and 300 ng of a plasmid encoding crRNA were transfected into the cells in each well of a 24-well cell culture dish.
- HEK293T cells to be subjected to fluorescence-activated cell sorting were digested with trypsin-EDTA (0.05%) (Gibco).
- Cell sorting was performed using MoFlo XDP (Beckman Coulter) with GFP channel.
- FACS-sorted GFP-positive HEK293T cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- Target loci were amplified directly by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing. The frequency (%) of indels was analyzed using CRISPResso2 software by calculating the ratio of reads containing insertions or deletions to total reads. Reads accounting for less than 0.05% of total reads were discarded.
- As shown in Figure 5 and Table 5, comparing these combination mutants with wild-type Cas12i2 in 293T cells at five genomic sites (CCR5-3, CCR5-5, CD34-8, CD34-9 and RNF2-14) shows that mutants with further improved efficiency can be obtained.
- the mutant E176R+K238R+T447R+E563R+N164Y+E323R+D362R
- The coding sequences of the crRNA spacers used against CD34-8, CD34-9 and RNF2-14 are shown in SEQ ID NO: 63, 64 and 65, respectively.
- Table 5 Summary of the results (gene editing efficiency) of Example 5 (only the gene editing efficiency data at RNF2-14 are shown as an example)
- *L439(L+G) means that a glycine is inserted after amino acid No. 439; in the context of this specification and the drawings, it is sometimes expressed as 439G.
- Example 6 Comparison of CasXX with conventional gene editing tools to verify its gene editing efficiency.
- The Cas effector protein was expressed in human 293T cells using the pCAG-2A-eGFP vector.
- The DNA encoding the Cas protein was inserted between the XmaI and NheI sites.
- Vectors expressing the sgRNAs or crRNAs of AsCas12a, BhCas12b v4, SpCas9, SaCas9, SaCas9-KKH and Cas12i2 in 293T cells were constructed by ligating annealed oligonucleotides containing the target sequences into the BsaI-digested pUC19-U6-i2-crRNA backbone vector.
- The nucleotide sequence encoding the DR of CasXX is SEQ ID NO:59.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well-cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding Cas12i2 protein and 300 ng of a plasmid encoding crRNA were transfected into the cells in each well of a 24-well cell culture dish.
- HEK293T cells to be subjected to fluorescence-activated cell sorting were digested with trypsin-EDTA (0.05%) (Gibco).
- Cell sorting was performed using MoFlo XDP (Beckman Coulter) with GFP channel.
- FACS-sorted GFP-positive HEK293T cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- Target loci were amplified directly by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing. The frequency (%) of indels was analyzed using CRISPResso2 software by calculating the ratio of reads containing insertions or deletions to total reads. Reads accounting for less than 0.05% of total reads were discarded.
- The pCAG-2A-eGFP vector encoding CasXX and the pUC19-U6-i2-crRNA vector encoding the crRNA were transfected into the mouse Hepa1-6 liver cancer cell line by lipofection. Corresponding crRNAs were designed for 65 endogenous gene loci.
- The designed spacer sequence is 20 nucleotides long.
- The indel frequency was obtained by PCR amplification and sequencing, in a manner similar to the analysis method used for exogenous gene editing.
- CasXX exhibited powerful gene editing ability on 65 endogenous gene loci of the mouse Hepa1-6 cell line, with an average gene editing efficiency of more than 60%.
- Example 7 Gene editing at genomic loci containing different PAMs using CasXX.
- The DNA sequence encoding CasXX was inserted into the pCAG-2A-EGFP plasmid, thereby constructing a plasmid expressing CasXX.
- A vector for expressing crRNA in HEK293T cells was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-crRNA backbone. In total, 64 different crRNAs were designed to target 64 human endogenous sites.
- The PAM sequences at these 64 sites cover all 16 combinations of the two central bases of a 5'-NNNN-3' PAM: NTTN, NTAN, NTCN, NTGN, NATN, NAAN, NACN, NAGN, NCTN, NCAN, NCCN, NCGN, NGTN, NGAN, NGCN, NGGN.
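The 16 PAM classes listed above are simply every pairing of the two central bases with the outer positions left as N. A short sketch of that enumeration follows; the helper name `pam_classes` is hypothetical, and the base order "TACG" is chosen only so the output follows the order used in the text.

```python
from itertools import product


def pam_classes(bases="TACG"):
    """Enumerate the 5'-NXYN-3' PAM classes for every pair of central bases."""
    return [f"N{x}{y}N" for x, y in product(bases, repeat=2)]


classes = pam_classes()
print(len(classes), classes)
```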
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- the designed spacer sequence is 20 nucleotides.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding a Cas protein and 300 ng of a plasmid encoding a crRNA were transfected into cells cultured in a 24-well cell culture dish. 72 h after transfection, the cells were digested with trypsin-EDTA (0.05%) (Gibco) and then subjected to GFP fluorescence-activated cell sorting (FACS).
- FACS-sorted GFP-positive 293FT cells were lysed with buffer L and incubated at 55 °C for 3 h, then at 95 °C for 10 min.
- dsDNA fragments containing target sites in different genomic loci were PCR amplified using the corresponding primers.
- Target loci were amplified directly by barcoded PCR using cell lysates as templates.
- PCR products were purified and pooled into several libraries for high-throughput sequencing.
- The frequency (%) of indels was analyzed using CRISPResso2 software by calculating the ratio of reads containing insertions or deletions to total reads. Reads accounting for less than 0.05% of total reads were discarded.
- The experimental results are shown in Figure 7.
- CasXX exhibited high gene editing efficiency at target sites carrying the 5' PAMs NTTN, NTAN, NTCN, NTGN, NATN, NAAN, NCTN, NCAN and NGTN.
- The average gene editing efficiency exceeded 40%.
- Example 8 In vitro cleavage of double-stranded DNA containing different PAMs using CasXX
- The BPK2014 prokaryotic expression plasmid was transformed into Escherichia coli strain BL21(λDE3) (TransGen Biotech), and the transformed bacterial liquid was spread on solid LB containing chloramphenicol. Three clones were picked into 5 ml liquid LB and grown overnight. The bacteria were then transferred to 3 L liquid LB and cultured until the OD600 reached 0.6-0.8, then induced with IPTG (0.5 mM) for 20 hours at 16 °C. Bacteria expressing Cas12i were harvested by ultracentrifugation, resuspended in lysis buffer (50 mM Tris-HCl, pH 7.5, 300 mM NaCl) and disrupted by sonication.
- the Cas12i protein in the supernatant was first purified with a Ni column. Briefly, after incubation with the supernatant, the Ni column was sequentially washed with lysis buffer supplemented with 0 mM, 20 mM and 50 mM imidazole. Then, Cas12i protein was eluted by lysis buffer supplemented with 500 mM imidazole. The collected samples were then loaded into an ion exchange column (CM Sepharose Fast Flow, GE). Wild-type Cas12i2 and CasXX proteins were eluted with storage buffer (20 mM Tris-HCl, 300 mM NaCl, 1 mM TCEP, 10% glycerol, pH7.5). Proteins were filter sterilized and stored at -80 °C.
- The nucleotide sequence encoding the DR is SEQ ID NO:59. An oligonucleotide containing the T7 promoter sequence (named T7-F) and an oligonucleotide containing the complementary sequence of the crRNA and the T7 promoter (named T7-12i-crRNA-R) were synthesized and annealed in 1× NEBuffer™ 2 (NEB). The sequences of these oligonucleotides are listed in Table 6. The annealed product was used as a template to produce crRNA with the HiScribe™ T7 Quick High Yield RNA Synthesis Kit (NEB). The transcribed crRNA was purified using the RNA Cleanup Kit (NEB).
- Targets containing the same protospacer but different PAMs were first cloned into EcoRI- and HindIII-digested pUC19 (with penicillin resistance). Target sequences with 5' PAMs are listed in Table 7.
- The pUC19 plasmid carrying the target was then linearized with SacI and purified using DNA Clean & Concentrator (Zymo Research).
- 400 nM Cas12i protein was first incubated with 2 μM crRNA at 37 °C for 15 min.
- The wild-type Cas12i2 protein only partially cleaved double-stranded DNA containing a 5'-NTTN-3' PAM and had almost no cleavage activity on DNA containing the other PAMs.
- The CasXX protein showed high cleavage efficiency for double-stranded DNA containing NTTN, NTAN, NTCN, NTGN, NATN, NAAN, NACN, NCTN, NCAN, NGTN and NGAN PAMs.
- CasXX almost completely cleaved double-stranded DNA containing NTTN, NTAN, NTCN, NATN, NAAN, NACN, NCTN, NCAN and NGTN PAMs. Therefore, the engineered Cas12i nuclease provided by the present invention can be used for a broader range of gene editing or therapeutic applications.
- A vector expressing the crRNA of the Cas protein in 293T cells was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-crRNA backbone.
- the crRNA comprises a spacer sequence (spacer; SEQ ID NO: 77) capable of targeting the endogenous site of EMX1-7.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding a Cas protein, 300 ng of a plasmid encoding a crRNA, and 10 pmol of an annealed double-stranded DNA tag (see Table 8 for the sequence) were transfected into the cells.
- The cells were digested with trypsin-EDTA (0.05%) (Gibco) and then subjected to GFP fluorescence-activated cell sorting (FACS). The cells that successfully expressed the Cas enzyme were selected.
- The MicroElute Genomic DNA Kit (Omega) was used to extract genomic DNA from FACS-sorted GFP-positive 293T cells. The purified genomic DNA was quantified using Qubit. Using the Covaris S220 instrument, the genomic DNA was fragmented to about 500 bp according to the program recommended for the instrument. The VAHTS Universal Pro DNA Library Prep Kit for Illumina (Vazyme) was then used for DNA library construction.
- The library construction procedure followed Tsai, S. Q. et al., "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases," Nat Biotechnol. 2015; 33(2): 187-197, the content of which is incorporated herein by reference in its entirety.
- The library products were subjected to high-throughput sequencing, and the sequencing results were analyzed to identify potential off-target sites. We systematically analyzed the off-target effect of CasXX at the EMX1-7 site; the specific analysis results are shown in Figure 10.
- The first line is the reference target sequence; the lines below show the on-target and off-target sequences together with the number of reads enriched by the double-stranded DNA tag at each of them.
- Example 10 Introducing new amino acid mutations on the basis of the CasXX sequence (SEQ ID NO: 8) to screen for mutants with improved specificity (two target sites were selected for the mutant screening experiments: the EMX1-7 locus and the RNF2-1 locus)
- CasXX carries the N164Y+E176R+K238R+E323R+D362R+T447R+E563R mutations relative to SEQ ID NO: 1.
- Based on the sequence of CasXX (SEQ ID NO: 8), we used primers carrying mutated bases to perform PCR on the DNA sequence of CasXX, and used the HiFi DNA Assembly Master Mix (NEB) to insert the purified PCR product into the pCAG-2A-EGFP plasmid, thereby constructing plasmids expressing CasXX with the introduced point mutations.
- A total of 26 mutants based on the CasXX sequence, each containing a single amino acid mutation, were constructed and named CasXX-HF-1 to CasXX-HF-26.
- the specific mutation methods are shown in Table 9.
- CasXX-HF-26 restores the N164Y point mutation in CasXX to the original amino acid N of wild-type Cas12i2, that is, Y164N (that is, the mutation N164Y is deleted).
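The substitution notation used above (wild-type residue, 1-based position, new residue, joined with "+") can be parsed and applied mechanically. The sketch below is illustrative only: `parse_mutations` and `apply_mutations` are hypothetical helper names, the 5-residue peptide is a toy sequence rather than any Cas12i2 sequence, and insertion notation such as L439(L+G) is deliberately not handled.

```python
import re

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec):
    """Parse 'N164Y+E176R' into [(wild_type, position, mutant), ...]."""
    parsed = []
    for token in spec.split("+"):
        match = SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        wt, pos, mut = match.groups()
        parsed.append((wt, int(pos), mut))
    return parsed


def apply_mutations(sequence, spec):
    """Apply substitutions (1-based positions) after checking the expected residue."""
    residues = list(sequence)
    for wt, pos, mut in parse_mutations(spec):
        if residues[pos - 1] != wt:
            raise ValueError(f"expected {wt} at position {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = mut
    return "".join(residues)


# Mutation set of CasXX relative to SEQ ID NO: 1, as stated in the text
casxx = "N164Y+E176R+K238R+E323R+D362R+T447R+E563R"
print(parse_mutations(casxx))

# Toy peptide used only to exercise the helpers
toy_mutant = apply_mutations("MNEKT", "N2Y+E3R")
# Reverting one substitution, analogous to Y164N in CasXX-HF-26
toy_reverted = apply_mutations(toy_mutant, "Y2N")
print(toy_mutant, toy_reverted)
```

Checking the expected residue before substituting guards against applying a mutation list to the wrong reference sequence.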
- A vector expressing the crRNA of the Cas protein in 293T cells was constructed by using T4 ligase (NEB) to ligate the annealed oligonucleotide containing the target sequence into the pUC19-U6-crRNA backbone digested with BsaI (NEB).
- The crRNA was designed to target the EMX1-7 site.
- the specific sequence is shown in Table 10.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
Sequence number | crRNA target site name | Spacer coding sequence |
SEQ ID NO: 27 | EMX1-7 | TGGTTGCCCACCCTAGTCAT |
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding a Cas protein and 300 ng of a plasmid encoding a crRNA were transfected into cells cultured in a 24-well cell culture dish. 72 h after transfection, the cells were digested with trypsin-EDTA (0.05%) (Gibco) and then subjected to GFP fluorescence-activated cell sorting (FACS).
- FACS-sorted GFP-positive HEK293T cells were lysed with 40 μL of buffer L (Bimake), incubated at 55 °C for 3 hours and then at 95 °C for 10 minutes.
- Using primer pairs for the EMX1-7 target site, EMX1-7 off-target site 1, EMX1-7 off-target site 2 and EMX1-7 off-target site 3 (Figure 10), the dsDNA fragments containing the target or off-target sites were amplified by PCR; the sequences of the target sites and off-target sites are shown in Table 11. Afterwards, 10 μL of the PCR product was subjected to a re-annealing procedure to form heteroduplex dsDNA.
- A vector expressing the crRNA of the Cas protein in HEK293T cells was constructed by using T4 ligase (NEB) to ligate the annealed oligonucleotide containing the target sequence into the pUC19-U6-crRNA backbone digested with BsaI (NEB).
- The resulting crRNAs all target the same RNF2-1 site, but their spacer sequences differ.
- The specific sequences encoding these spacers are shown in Table 13.
- RNF2-1-FM means that the spacer sequence in the crRNA fully matches the RNF2-1 site.
- RNF2-1-Mis-1/2 means that the spacer sequence in the crRNA is mismatched with the RNF2-1 site at the 1st and 2nd base positions and matches at the remaining positions.
- RNF2-1-Mis-5/6 means that the spacer sequence in the crRNA is mismatched with the RNF2-1 site at the 5th and 6th base positions and matches at the remaining positions.
- RNF2-1-Mis-17/18 means that the spacer sequence in the crRNA is mismatched with the RNF2-1 site at the 17th and 18th base positions and matches at the remaining positions.
- RNF2-1-Mis-19/20 means that the spacer sequence in the crRNA is mismatched with the RNF2-1 site at the 19th and 20th base positions and matches at the remaining positions.
- The purpose of introducing base mismatches is to simulate off-target effects.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
SEQ ID NO: | crRNA name | Spacer coding sequence |
32 | crRNA-RNF2-1-FM | TACAGGAGGCAATAACAGAT |
33 | crRNA-RNF2-1-Mis-1/2 | GCCAGGAGGCAATAACAGAT |
34 | crRNA-RNF2-1-Mis-5/6 | TACATTAGGCAATAACAGAT |
35 | crRNA-RNF2-1-Mis-17/18 | TACAGGAGGCAATAACCTAT |
36 | crRNA-RNF2-1-Mis-19/20 | TACAGGAGGCAATAACAGCG |
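The naming of the mismatch crRNAs can be checked directly against the spacer coding sequences listed above. The following sketch compares each variant with the fully matched spacer and reports the 1-based positions, counted from the 5' end of the spacer coding sequence, at which they differ; `mismatch_positions` is a hypothetical helper name.

```python
def mismatch_positions(reference, variant):
    """Return the 1-based positions at which two equal-length spacers differ."""
    if len(reference) != len(variant):
        raise ValueError("spacers must have the same length")
    return [i for i, (a, b) in enumerate(zip(reference, variant), start=1) if a != b]


fm = "TACAGGAGGCAATAACAGAT"  # crRNA-RNF2-1-FM
variants = {
    "crRNA-RNF2-1-Mis-1/2": "GCCAGGAGGCAATAACAGAT",
    "crRNA-RNF2-1-Mis-5/6": "TACATTAGGCAATAACAGAT",
    "crRNA-RNF2-1-Mis-17/18": "TACAGGAGGCAATAACCTAT",
    "crRNA-RNF2-1-Mis-19/20": "TACAGGAGGCAATAACAGCG",
}

for name, spacer in variants.items():
    print(name, mismatch_positions(fm, spacer))
```

Each variant differs from the fully matched spacer at exactly the two positions given in its name.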
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Using Lipofectamine 3000 (Invitrogen), 600 ng of a plasmid encoding a Cas protein and 300 ng of a plasmid encoding a crRNA were transfected into cells cultured in a 24-well cell culture dish. 72 h after transfection, the cells were digested with trypsin-EDTA (0.05%) (Gibco) and then subjected to GFP fluorescence-activated cell sorting (FACS). The cells that successfully expressed the Cas enzyme were selected.
- FACS-sorted GFP-positive HEK293T cells were lysed with 40 μL of buffer L (Bimake), incubated at 55 °C for 3 hours and then at 95 °C for 10 minutes.
- The dsDNA fragment at the RNF2-1 site was amplified by PCR using the RNF2-1 primer pair. Afterwards, 10 μL of the PCR product was subjected to a re-annealing procedure to form heteroduplex dsDNA. The mixture was then treated with 1/10 volume of NEBuffer™ 2.1 and 0.2 μL of T7 endonuclease I (NEB) at 37 °C for 50 minutes. Digestion products were analyzed by electrophoresis on an approximately 2.5% agarose gel.
- The indel ratio (Indel, %) was calculated from the gray values of the bands.
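The text does not state which formula converts band gray values into an indel percentage. A commonly used estimate for T7 endonuclease I assays is indel % = 100 × (1 − √(1 − f_cut)), where f_cut is the cleaved fraction of the total band intensity. The sketch below assumes that formula; the function name `t7e1_indel_percent` and the intensity values are hypothetical.

```python
import math


def t7e1_indel_percent(uncut, cleaved_bands):
    """Estimate indel % from band intensities of a T7E1 digest.

    Assumes the commonly used relation
    indel % = 100 * (1 - sqrt(1 - f_cut)),
    where f_cut = cleaved / (uncut + cleaved).
    """
    cleaved = sum(cleaved_bands)
    total = uncut + cleaved
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cleaved = cleaved / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical gray values: one uncut band and two cleavage products
estimate = t7e1_indel_percent(640.0, [200.0, 160.0])
print(round(estimate, 1))
```

The square root reflects that re-annealing pairs strands at random, so only duplexes with at least one mutant strand in a mismatched pairing are cut.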
- The indel ratio at the target site for each Cas12i mutant with the different crRNAs is shown in Table 14 and the corresponding Figure 12.
- The experimental results show that the single-point mutants R857A, N861A, K807A, N848A, R715A, R719A, K394A, H357A and K844A, based on the CasXX sequence (corresponding to the engineered Cas12i nucleases of SEQ ID NOs: 14-22, see Table 9), effectively reduced the proportion of indels when the simulated off-target crRNAs crRNA-RNF2-1-Mis-1/2, crRNA-RNF2-1-Mis-5/6, crRNA-RNF2-1-Mis-17/18 and crRNA-RNF2-1-Mis-19/20 were used.
- we defined the decrease of the indel rate at the off-target site and the substantially unchanged (or increased) editing efficiency at the target site as
- Table 14 Editing efficiency of different CasXX-HF mutants at the RNF2-1 site using crRNA targeting the target sequence and simulating the off-target spacer sequence
- Example 11 Introducing new amino acid mutations on the basis of the CasXX sequence and screening for mutants with improved specificity in a fluorescent reporter system
- Among the mutants with new amino acid mutations introduced on the basis of the CasXX sequence, those with improved specificity were screened in the fluorescent reporter system.
- The difference between this example and Example 10 is that the readout reflecting editing efficiency was changed from agarose gel electrophoresis to fluorescence.
- A vector expressing the crRNA of the Cas protein in HEK293T cells was constructed by using T4 ligase (NEB) to ligate the annealed oligonucleotide containing the target sequence into the pUC19-U6-crRNA backbone digested with BsaI (NEB).
- The resulting crRNAs all target the same mCherry site, but their spacer sequences differ.
- The specific sequences encoding the spacers are shown in Table 15. mCherry-FM means that the spacer sequence in the crRNA fully matches the mCherry site.
- crRNA-mCherry-Mis-1/2 means that the spacer sequence in the crRNA is mismatched with the mCherry site at the 1st and 2nd base positions and matches at the remaining positions.
- crRNA-mCherry-Mis-5/6 means that the spacer sequence in the crRNA is mismatched with the mCherry site at the 5th and 6th base positions and matches at the remaining positions.
- crRNA-mCherry-Mis-19/20 means that the spacer sequence in the crRNA is mismatched with the mCherry site at the 19th and 20th base positions and matches at the remaining positions.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- Table 15 Spacer coding sequence targeting mCherry and simulating off-target
SEQ ID NO: | crRNA target site name | Spacer coding sequence |
37 | crRNA-mCherry-FM | ACCTTGTAGATGAACTCGCC |
38 | crRNA-mCherry-Mis-1/2 | CACTTGTAGATGAACTCGCC |
39 | crRNA-mCherry-Mis-5/6 | ACCTGTTAGATGAACTCGCC |
40 | crRNA-mCherry-Mis-19/20 | ACCTTGTAGATGAACTCGAA |
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- 600 ng of a plasmid encoding a Cas protein, 300 ng of a plasmid encoding a crRNA, and 100 ng of a plasmid encoding mCherry were transfected into cells cultured in a 24-well cell culture dish.
- Example 12 Introducing amino acid (combination) mutations on the basis of the CasXX sequence and screening for mutants with improved specificity at endogenous gene sites
- A vector expressing the crRNA of the Cas protein in 293T cells was constructed by using T4 ligase (NEB) to ligate the annealed oligonucleotide containing the target sequence into the pUC19-U6-crRNA backbone digested with BsaI (NEB).
- crRNAs were designed as described in Example 10 to target EMX1-7 or RNF2-1, or to mimic RNF2-1 off-targets. See Tables 10 and 13 for the Spacer coding sequence.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%. Using Lipofectamine 3000 (Invitrogen), 600 ng of the plasmid encoding the Cas protein and 300 ng of the plasmid encoding the crRNA were transfected into cells cultured in a 24-well cell culture dish. 72 h after transfection, cells were digested with trypsin-EDTA (0.05%) (Gibco) and then subjected to fluorescence-activated cell sorting (FACS). The cells that successfully expressed the Cas enzyme were selected.
- FACS-sorted GFP-positive HEK293T cells were lysed with 40 μL of buffer L (Bimake), incubated at 55 °C for 3 hours and then at 95 °C for 10 minutes.
- the corresponding primers described in Example 10 were used to perform PCR amplification on dsDNA fragments containing target sites or off-target sites in different genomic loci.
- The sequences of the target sites and off-target sites are shown in Tables 11 and 13. Afterwards, 10 μL of the PCR product was subjected to a re-annealing procedure to form heteroduplex dsDNA.
- The mixture was treated with 1/10 volume of NEBuffer™ 2.1 and 0.2 μL of T7 endonuclease I (NEB) at 37 °C for 50 minutes. Digestion products were analyzed by electrophoresis on an approximately 2.5% agarose gel. The indel ratio (Indel, %) was calculated from the gray values of the bands.
- When the simulated off-target crRNAs crRNA-RNF2-1-Mis-1/2, crRNA-RNF2-1-Mis-5/6, crRNA-RNF2-1-Mis-17/18 and crRNA-RNF2-1-Mis-19/20 were used, the R857A, R719A, K394A, K844A, R719A/K844A and R857A/K844A mutants based on the CasXX sequence effectively reduced the indel ratio without sacrificing efficiency at the target site; see Table 18 and the corresponding Figure 16.
- The mutants shown in SEQ ID NO: 23 and SEQ ID NO: 24 of this description (corresponding to the mutants marked with sequence numbers in Table 16) have significantly higher specificity.
- Example 13 Detection of off-target effects of CasXX and CasXX+K394A mutant using GUIDE-Seq.
- The DNA sequences encoding CasXX (SEQ ID NO: 8) and the CasXX+K394A mutant (SEQ ID NO: 20) were each inserted into the pCAG-2A-EGFP plasmid, thereby constructing plasmids expressing CasXX or CasXX+K394A.
- A vector expressing the crRNA of the Cas12i protein in HEK293T cells was constructed by ligating annealed oligonucleotides containing the target sequence into the BsaI-digested pUC19-U6-crRNA backbone.
- The crRNA was designed to contain a spacer sequence that targets the CD34-7 endogenous site.
- the nucleotide sequence encoding DR is SEQ ID NO:59.
- the nucleotide sequence encoding spacer comprises SEQ ID NO:78.
- HEK293T cells were cultured in DMEM (Gibco) containing 1% penicillin-streptomycin (Gibco) and 10% fetal bovine serum (Gibco). Cells were seeded in 24-well cell culture dishes (Corning) for 16 hours until the cell density reached 70%.
- Transfection was performed using Lipofectamine 3000 (Invitrogen).
- The MicroElute Genomic DNA Kit (Omega) was used to extract genomic DNA from FACS-sorted GFP-positive HEK293T cells. The purified genomic DNA was quantified using Qubit. Using the Covaris S220 instrument, the genomic DNA was fragmented to about 500 bp according to the program recommended for the instrument. The VAHTS Universal Pro DNA Library Prep Kit for Illumina (Vazyme) was then used for DNA library construction.
- The library construction procedure followed Tsai, S. Q. et al., "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases," Nat Biotechnol. 2015; 33(2): 187-197.
- The library products were subjected to high-throughput sequencing, and the sequencing results were analyzed to identify potential off-target sites.
- The first line is the reference target sequence; the lines below show the on-target and off-target sequences together with the number of reads enriched by the double-stranded DNA tag at each of them.
- Figure 17 shows that CasXX has an off-target effect at the CD34-7 site, but the CasXX+K394A mutant has no off-target effect at this site. Therefore, the additional introduction of the K394A point mutation on the basis of the CasXX sequence can greatly improve the fidelity of CasXX, making CasXX highly specific.
Description
SEQ ID NO:43 | NTTN-target | CTTCTTCAACATATCCAAACAAAT |
SEQ ID NO:44 | NTAN-target | CTACTTCAACATATCCAAACAAAT |
SEQ ID NO:45 | NTCN-target | CTCCTTCAACATATCCAAACAAAT |
SEQ ID NO:46 | NTGN-target | CTGCTTCAACATATCCAAACAAAT |
SEQ ID NO:47 | NATN-target | CATCTTCAACATATCCAAACAAAT |
SEQ ID NO:48 | NAAN-target | CAACTTCAACATATCCAAACAAAT |
SEQ ID NO:49 | NACN-target | CACCTTCAACATATCCAAACAAAT |
SEQ ID NO:50 | NAGN-target | CAGCTTCAACATATCCAAACAAAT |
SEQ ID NO:51 | NCTN-target | CCTCTTCAACATATCCAAACAAAT |
SEQ ID NO:52 | NCAN-target | CCACTTCAACATATCCAAACAAAT |
SEQ ID NO:53 | NCCN-target | CCCCTTCAACATATCCAAACAAAT |
SEQ ID NO:54 | NCGN-target | CCGCTTCAACATATCCAAACAAAT |
SEQ ID NO:55 | NGTN-target | CGTCTTCAACATATCCAAACAAAT |
SEQ ID NO:56 | NGAN-target | CGACTTCAACATATCCAAACAAAT |
SEQ ID NO:57 | NGCN-target | CGCCTTCAACATATCCAAACAAAT |
SEQ ID NO:58 | NGGN-target | CGGCTTCAACATATCCAAACAAAT |
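Each 24-nt target above reads as a 4-nt 5' PAM followed by a shared 20-nt protospacer. The sketch below splits a few of the listed targets on that assumption and confirms that the protospacer is identical and that the PAM matches its label; `split_target` and `pam_class` are hypothetical helper names, and the 4 + 20 split is inferred from the sequences rather than stated in the table.

```python
def split_target(target, pam_length=4):
    """Split a 5'-PAM + protospacer target into (pam, protospacer)."""
    return target[:pam_length], target[pam_length:]


def pam_class(pam):
    """Mask the outer bases of a 4-nt PAM with N, e.g. CTTC -> NTTN."""
    return f"N{pam[1:3]}N"


# A subset of the targets listed above, keyed by their PAM label
targets = {
    "NTTN": "CTTCTTCAACATATCCAAACAAAT",
    "NAGN": "CAGCTTCAACATATCCAAACAAAT",
    "NCCN": "CCCCTTCAACATATCCAAACAAAT",
    "NGAN": "CGACTTCAACATATCCAAACAAAT",
}

protospacers = set()
for label, target in targets.items():
    pam, protospacer = split_target(target)
    protospacers.add(protospacer)
    print(label, pam, pam_class(pam) == label)

print(protospacers)
```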
SEQ ID NO:25 | GUIDE-F1 | 5′-P-G*T*TTAATTGAGTTGTCATATGTTAATAACGGT*A*T-3′ |
SEQ ID NO:26 | GUIDE-R1 | 5′-P-A*T*ACCGTTATTAACATATGACAACTCAATTAA*A*C-3′ |
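The two tag oligonucleotides above should anneal into a double-stranded tag, which means GUIDE-R1 should be the reverse complement of GUIDE-F1. The sketch below checks this after stripping the end labels and the "*" marks; treating "P" as a 5' phosphate and "*" as a linkage modification is an assumption, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bare_sequence(oligo):
    """Keep only the bases, dropping end labels and '*' modification marks."""
    return "".join(ch for ch in oligo if ch in "ACGT")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


guide_f1 = "5′-P-G*T*TTAATTGAGTTGTCATATGTTAATAACGGT*A*T-3′"
guide_r1 = "5′-P-A*T*ACCGTTATTAACATATGACAACTCAATTAA*A*C-3′"

forward = bare_sequence(guide_f1)
reverse = bare_sequence(guide_r1)
print(len(forward), reverse_complement(forward) == reverse)
```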
Sequence number | crRNA target site name | Spacer coding sequence |
SEQ ID NO:27 | EMX1-7 | TGGTTGCCCACCCTAGTCAT |
SEQ ID NO: | crRNA name | Spacer coding sequence |
32 | crRNA-RNF2-1-FM | TACAGGAGGCAATAACAGAT |
33 | crRNA-RNF2-1-Mis-1/2 | GCCAGGAGGCAATAACAGAT |
34 | crRNA-RNF2-1-Mis-5/6 | TACATTAGGCAATAACAGAT |
35 | crRNA-RNF2-1-Mis-17/18 | TACAGGAGGCAATAACCTAT |
36 | crRNA-RNF2-1-Mis-19/20 | TACAGGAGGCAATAACAGCG |
SEQ ID NO: | crRNA target site name | Spacer coding sequence |
37 | crRNA-mCherry-FM | ACCTTGTAGATGAACTCGCC |
38 | crRNA-mCherry-Mis-1/2 | CACTTGTAGATGAACTCGCC |
39 | crRNA-mCherry-Mis-5/6 | ACCTGTTAGATGAACTCGCC |
40 | crRNA-mCherry-Mis-19/20 | ACCTTGTAGATGAACTCGAA |
Claims (38)
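- An engineered Cas12i nuclease, comprising one or more mutations relative to a reference Cas12i nuclease, the mutations being selected from: (1) replacing one or more amino acids in the reference Cas12i nuclease that interact with the PAM with positively charged amino acids; (2) replacing one or more amino acids in the reference Cas12i nuclease that are involved in opening the DNA double strand with amino acids having an aromatic ring; (3) replacing one or more amino acids in the reference Cas12i nuclease that are located in the RuvC domain and interact with the single-stranded DNA substrate with positively charged amino acids; (4) replacing one or more amino acids in the reference Cas12i nuclease that interact with the DNA-RNA duplex with positively charged amino acids; and (5) replacing one or more polar or positively charged amino acids in the reference Cas12i nuclease that interact with the DNA-RNA duplex with hydrophobic amino acids; preferably, the reference Cas12i nuclease is the wild-type Cas12i2 nuclease having the amino acid sequence of SEQ ID NO:1.
- The engineered Cas12i nuclease according to claim 1, wherein the one or more amino acids that interact with the PAM are amino acids within 9 Å of the PAM in the three-dimensional structure; preferably, the one or more amino acids that interact with the PAM are one or more amino acids at the following positions: 176, 178, 226, 227, 229, 237, 238, 264, 447 and 563; further preferably, the one or more amino acids that interact with the PAM are one or more of the following amino acids: E176, E178, Y226, A227, N229, E237, K238, K264, T447 and E563; further preferably, the one or more amino acids that interact with the PAM are one or more of the following amino acids: E176, K238, T447 and E563; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO:1.
- The engineered Cas12i nuclease according to claim 1 or 2, wherein the positively charged amino acid is R, K or H.
- The engineered Cas12i nuclease according to any one of claims 1-3, wherein replacing one or more amino acids in the reference Cas12i nuclease that interact with the PAM with positively charged amino acids refers to one or more of the following substitutions: E176R, K238R, T447R and E563R; preferably, the engineered Cas12i nuclease comprises any one of the following mutations or mutation combinations: (1) E563R; (2) E176R, T447R, E176R and E563R; (3) K238R and E563R; (4) E176R, K238R and T447R; (5) E176R, K238R and E563R; (6) E176R, T447R and E563R; and (7) E176R, K238R, T447R and E563R; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO:1.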
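- The engineered Cas12i nuclease according to any one of claims 1-4, wherein the one or more amino acids involved in opening the DNA double strand are amino acids that interact with the last base pair at the 3' end of the PAM relative to the target strand; preferably, the one or more amino acids involved in opening the DNA double strand are one or more amino acids at the following positions: 163 and 164; further preferably, the one or more amino acids involved in opening the DNA double strand are one or more of the following amino acids: Q163 and N164; further preferably, the amino acid involved in opening the DNA double strand is N164; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO:1.
- The engineered Cas12i nuclease according to any one of claims 1-5, wherein the one or more amino acids involved in opening the DNA double strand are replaced with amino acids having an aromatic ring, the amino acid having an aromatic ring being F, Y or W; preferably, the amino acid having an aromatic ring is F or Y.
- The engineered Cas12i nuclease according to any one of claims 1-6, wherein replacing one or more amino acids in the reference Cas12i nuclease that are involved in opening the DNA double strand with amino acids having an aromatic ring comprises one or more of the following substitutions: Q163F, Q163Y, Q163W and N164F; preferably, the Cas12i nuclease comprises an N164Y or N164F mutation; further preferably, the engineered Cas12i nuclease comprises N164Y.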
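- The engineered Cas12i nuclease according to any one of claims 1-7, wherein the one or more amino acids that are located in the RuvC domain and interact with the single-stranded DNA substrate are amino acids within 9 Å of the single-stranded DNA substrate in the three-dimensional structure; preferably, the one or more amino acids that are located in the RuvC domain and interact with the single-stranded DNA substrate are one or more amino acids at the following positions: 323, 327, 355, 359, 360, 361, 362, 388, 390, 391, 392, 393, 414, 417, 418, 421, 424, 425, 650, 652, 653, 696, 705, 708, 709, 751, 752, 755, 840, 848, 851, 856, 885, 897, 925, 926, 928, 929, 932 and 1022; further preferably, the one or more amino acids that are located in the RuvC domain and interact with the single-stranded DNA substrate are one or more of the following amino acids: E323, L327, V355, G359, G360, K361, D362, L388, N390, N391, F392, K393, Q414, L417, L418, K421, Q424, Q425, S650, E652, G653, I696, K705, K708, E709, L751, S752, E755, N840, N848, S851, A856, Q885, M897, N925, I926, T928, G929, Y932 and A1022; further preferably, the one or more amino acids that are located in the RuvC domain and interact with the single-stranded DNA substrate are one or more of the following amino acids: E323, D362, Q425, N925, I926, N391, Q424 and G929; wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO:1.
- The engineered Cas12i nuclease according to any one of claims 1-8, wherein replacing one or more amino acids in the reference Cas12i nuclease that are located in the RuvC domain and interact with the single-stranded DNA substrate with positively charged amino acids comprises replacement with R or K; preferably, the positively charged amino acid is R.
- The engineered Cas12i nuclease according to any one of claims 1-9, wherein replacing one or more amino acids in the reference Cas12i nuclease that are located in the RuvC domain and interact with the single-stranded DNA substrate with positively charged amino acids comprises one or more of the following substitutions: E323R, D362R, N391R, Q424R, Q425R, N925R, I926R and G929R; preferably, the engineered Cas12i nuclease comprises any one of the following mutations or mutation combinations: (1) E323R; (2) D362R; (3) Q425R; (4) N925R; (5) I926R; (6) E323R and D362R; (7) E323R and Q425R; (8) E323R and I926R; (9) Q425R and I926R; (10) D362R and I926R; (11) N925R and I926R; (12) E323R, D362R and Q425R; (13) E323R, D362R and I926R; (14) E323R, Q425R and I926R; (15) D362R, N925R and I926R; and (16) E323R, D362R, Q425R and I926R; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO:1.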
- 如权利要求1~10中任一项所述的工程化的Cas12i核酸酶,其中,所述与DNA-RNA双螺旋相互作用的一个或多个氨基酸是与DNA-RNA双螺旋在三维结构上距离在9埃以内的氨基酸;优选,所述与DNA-RNA双螺旋相互作用的一个或多个氨基酸是下述位置的一个或多个氨基酸:116、117、156、159、160、161、247、293、294、297、301、305、306、308、312、313、316、319、320、343、348、349、427、433、438、441、442、679、683、691、782、783、797、800、852、853、855、861、865、957、958;进一步优选,所述与DNA-RNA双螺旋相互作用的一个或多个氨基酸是下述一个或多个氨基酸:G116、E117、A156、T159、E160、S161、E247、G293、E294、N297、T301、I305、K306、T308、N312、F313、Q316、E319、Q320、E343、E348、E349、D427、K433、V438、N441、Q442、N679、E683、E691、D782、E783、E797、E800、M852、D853、L855、N861、Q865、S957、D958;进一步优选,所述与DNA-RNA双螺旋相互作用的一个或多个氨基酸是下述一个或多个氨基酸:G116、E117、T159、S161、E319、E343和D958;其中,所述氨基酸位置如SEQ ID NO:1所示的相应氨基酸位置所定义。
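- The engineered Cas12i nuclease of any one of claims 1-11, wherein replacing one or more amino acids of the reference Cas12i nuclease that interact with the DNA-RNA duplex with a positively charged amino acid comprises replacement with R or K; preferably, the positively charged amino acid is R.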
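- The engineered Cas12i nuclease of any one of claims 1-12, wherein replacing one or more amino acids of the reference Cas12i nuclease that interact with the DNA-RNA duplex with a positively charged amino acid comprises one or more of the following substitutions: G116R, E117R, T159R, S161R, E319R, E343R, and D958R; preferably, the engineered Cas12i nuclease comprises the D958R substitution; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.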
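- The engineered Cas12i nuclease of any one of claims 1-13, wherein the one or more polar or positively charged amino acids interacting with the DNA-RNA duplex are selected from amino acids at one or more of the following positions: 357, 394, 715, 719, 807, 844, 848, 857, and 861, that is, one or more of the following amino acids: H357, K394, R715, R719, K807, K844, N848, R857, and R861; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.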
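- The engineered Cas12i nuclease of any one of claims 1-14, wherein replacing one or more polar or positively charged amino acids of the reference Cas12i nuclease that interact with the DNA-RNA duplex with a hydrophobic amino acid comprises replacement with alanine (A).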
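- The engineered Cas12i nuclease of any one of claims 1-15, comprising one or more mutations selected from the group consisting of: H357A, K394A, R715A, R719A, K807A, K844A, N848A, R857A, and R861A; preferably, comprising one or more mutations selected from the group consisting of: K394A, R719A, K844A, and R857A; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.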
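- The engineered Cas12i nuclease of any one of claims 1-16, comprising the amino acid substitutions R719A and K844A, or comprising the amino acid substitutions R857A and K844A; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.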
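- The engineered Cas12i nuclease of any one of claims 1-17, further comprising one or more flexible-region mutations that increase the flexibility of a flexible region in the reference Cas12i nuclease, the flexible region being selected from amino acid residues 439-443 or amino acid residues 925-929; preferably, the flexible-region mutation is located at position 439 and/or position 926; more preferably, the flexible-region mutation is a mutation of L439 and/or I926; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.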
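- The engineered Cas12i nuclease of claim 18, wherein the one or more flexible-region mutations are: replacing the flexible-region amino acid with G, and/or inserting one or two G residues after it; preferably, the one or more flexible-region mutations comprise I926G, L439(L+G), or L439(L+GG); more preferably, the one or more flexible-region mutations comprise L439(L+G) or L439(L+GG); wherein the amino acid position numbering is as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.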
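- An engineered Cas12i nuclease, comprising any one or more of the following groups of mutations: (1) E563R; (2) E176R and T447R; (3) E176R and E563R; (4) K238R and E563R; (5) E176R, K238R, and T447R; (6) E176R, T447R, and E563R; (7) E176R, K238R, and E563R; (8) E176R, K238R, T447R, and E563R; (9) N164Y; (10) N164F; (11) E323R; (12) D362R; (13) Q425R; (14) N925R; (15) I926R; (16) D958R; (17) E323R and D362R; (18) E323R and Q425R; (19) E323R and I926R; (20) Q425R and I926R; (21) D362R and I926R; (22) N925R and I926R; (23) E323R, D362R, and Q425R; (24) E323R, D362R, and I926R; (25) E323R, Q425R, and I926R; (26) D362R, N925R, and I926R; (27) E323R, D362R, Q425R, and I926R; (28) D362R and I926G; (29) N925R and I926G; (30) D362R, N925R, and I926G; (31) I926R and L439(L+G); (32) I926R and L439(L+GG); (33) E323R, D362R, and I926G; (34) R719A and K844A; and (35) R857A and K844A; preferably, the engineered Cas12i nuclease comprises any one or more of the following groups of mutations: (1) E176R, K238R, T447R, and E563R; (2) N164Y; (3) I926R; (4) E323R and D362R; (4) I926G; (5) I926R and L439(L+G); (6) I926R and L439(L+GG); and (7) D958R; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.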
- An engineered Cas12i nuclease, comprising any one of the following groups of mutations: (1) E176R, K238R, T447R, E563R, and N164Y; (2) E176R, K238R, T447R, E563R, and I926R; (3) N164Y, E323R, and D362R; (4) E176R, K238R, T447R, E563R, E323R, and D362R; (5) N164Y and I926R; (6) E176R, K238R, T447R, E563R, N164Y, and I926R; (7) E176R, K238R, T447R, E563R, N164Y, E323R, and D362R; (8) E176R, K238R, T447R, E563R, N164Y, I926R, E323R, and D362R; (9) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and I926G; (10) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, and L439(L+GG); (11) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, and L439(L+G); (12) E176R, K238R, T447R, E563R, N164Y, and D958R; (13) E176R, K238R, T447R, E563R, I926R, and D958R; (14) E176R, K238R, T447R, E563R, E323R, D362R, and D958R; (15) N164Y, I926R, and D958R; (16) N164Y, E323R, D362R, and D958R; (17) E176R, K238R, T447R, E563R, N164Y, I926R, and D958R; (18) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and D958R; (19) E176R, K238R, T447R, E563R, N164Y, I926R, E323R, D362R, and D958R; (20) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, and D958R; (21) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, L439(L+GG), and D958R; (22) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, I926G, L439(L+G), and D958R; (23) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and R857A; (24) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and N861A; (25) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and K807A; (26) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and N848A; (27) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and R715A; (28) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and R719A; (29) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and K394A; (30) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and H357A; (31) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, and K844A; (32) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, R719A, and K844A; or (33) E176R, K238R, T447R, E563R, N164Y, E323R, D362R, R857A, and K844A; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.
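- The engineered Cas12i nuclease of any one of claims 1-21, comprising an engineered Cas12i nuclease having the amino acid sequence shown in any one of SEQ ID NOs: 2-24, or an amino acid sequence having at least 80% identity to the amino acid sequence shown in any one of SEQ ID NOs: 2-24.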
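- An engineered Cas12i effector protein, comprising the engineered Cas12i nuclease of any one of claims 1-22 or a functional derivative thereof; optionally, the engineered Cas12i nuclease or functional derivative thereof is enzymatically active, or the engineered Cas12i nuclease or functional derivative thereof is an enzymatically inactive mutant.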
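- The engineered Cas12i effector protein of claim 23, wherein the Cas12i effector protein is capable of inducing a double-strand break or a single-strand break in a DNA molecule.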
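- The engineered Cas12i effector protein of claim 23, wherein the engineered Cas12i nuclease or functional derivative thereof is an enzymatically inactive mutant comprising one or more of the following mutations: D599A, E833A, S883A, H884A, R900A, and D1019A; wherein the amino acid positions are as defined by the corresponding amino acid positions shown in SEQ ID NO: 1.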
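- The engineered Cas12i effector protein of any one of claims 23-25, further comprising a functional domain fused to the engineered Cas12i nuclease or functional derivative thereof.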
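- The engineered Cas12i effector protein of claim 26, wherein the functional domain is one or more selected from the group consisting of: a translation initiation domain, a transcriptional repression domain, a transactivation domain, an epigenetic modification domain, a nucleobase editing domain, a reverse transcriptase domain, a reporter domain, and a nuclease domain.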
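- The engineered Cas12i effector protein of any one of claims 23-27, comprising: a first polypeptide containing an N-terminal portion of the engineered Cas12i nuclease or functional derivative thereof, and a second polypeptide containing a C-terminal portion of the engineered Cas12i nuclease or functional derivative thereof, wherein the first polypeptide and the second polypeptide are capable of associating with each other in the presence of a guide RNA comprising a guide sequence to form a clustered regularly interspaced short palindromic repeats (CRISPR) complex that specifically binds a target nucleic acid, the target nucleic acid comprising a target sequence complementary to the guide sequence; preferably, the first polypeptide comprises the N-terminal portion, amino acid residues 1 to X, of the engineered Cas12i nuclease of any one of claims 1-22, and the second polypeptide comprises amino acid residues X+1 to the C-terminus of the engineered Cas12i nuclease of any one of claims 1-22; optionally, the first polypeptide and the second polypeptide each comprise a dimerization domain; optionally, the dimerization domain of the first polypeptide and the dimerization domain of the second polypeptide associate with each other in the presence of an inducer.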
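- An engineered CRISPR-Cas12i system, comprising: (a) the engineered Cas12i nuclease of any one of claims 1-28, or the engineered Cas12i effector protein of any one of claims 23-28; and (b) a guide RNA comprising a guide sequence complementary to a target sequence, or one or more nucleic acids encoding the guide RNA, wherein the engineered Cas12i nuclease or engineered Cas12i effector protein and the guide RNA are capable of forming a CRISPR complex that specifically binds a target nucleic acid comprising the target sequence and induces modification of the target nucleic acid; preferably, the guide RNA is a crRNA comprising the guide sequence; more preferably, the engineered CRISPR-Cas12i system comprises a precursor guide RNA array encoding a plurality of crRNAs; or, preferably, the engineered Cas12i nuclease or engineered Cas12i effector protein is a prime editor and the guide RNA is a prime editing guide RNA (pegRNA).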
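- The engineered CRISPR-Cas12i system of claim 29, comprising one or more vectors encoding the engineered Cas12i nuclease or engineered Cas12i effector protein; preferably, the one or more vectors are selected from the group consisting of: retroviral vectors, lentiviral vectors, adenoviral vectors, adeno-associated vectors, and herpes simplex vectors; more preferably, the one or more vectors are adeno-associated virus (AAV) vectors; more preferably, the AAV vector also encodes the guide RNA.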
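- A method of detecting a target nucleic acid in a sample, comprising: (a) contacting the sample with the engineered CRISPR-Cas12i system of claim 29 and a labeled detector nucleic acid, the detector nucleic acid being single-stranded and not hybridizing to the guide sequence of the guide RNA; and (b) measuring a detectable signal produced by cleavage of the labeled detector nucleic acid by the engineered Cas12i nuclease or engineered Cas12i effector protein, thereby detecting the target nucleic acid.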
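- A method of modifying a target nucleic acid comprising a target sequence, comprising contacting the target nucleic acid with the engineered CRISPR-Cas12i system of claim 29 or 30; preferably, the method is carried out in vitro, ex vivo, or in vivo; more preferably, the target nucleic acid is present in a cell; more preferably, the cell is a bacterial cell, a yeast cell, a mammalian cell, a plant cell, or an animal cell; more preferably, the target nucleic acid is genomic DNA; more preferably, the target sequence is associated with a disease or disorder; more preferably, the engineered CRISPR-Cas12i system comprises a precursor guide RNA array encoding a plurality of crRNAs, wherein each crRNA comprises a different guide sequence.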
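- Use of the engineered CRISPR-Cas12i system of claim 29 or 30 in the manufacture of a medicament for treating a disease or disorder associated with a target nucleic acid in a cell of an individual; preferably, the disease or disorder is selected from the group consisting of: cancer, cardiovascular disease, hereditary disease, autoimmune disease, metabolic disease, neurodegenerative disease, eye disease, bacterial infection, and viral infection.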
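- A method of treating a disease or disorder associated with a target nucleic acid in a cell of an individual, the method comprising modifying the target nucleic acid in the cell of the individual using the method of claim 32, thereby treating the disease or disorder; preferably, the disease or disorder is selected from the group consisting of: cancer, cardiovascular disease, hereditary disease, autoimmune disease, metabolic disease, neurodegenerative disease, eye disease, bacterial infection, and viral infection.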
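- A method of modifying a target nucleic acid comprising a target sequence, comprising contacting the target nucleic acid with the engineered CRISPR-Cas12i system of claim 29 or 30.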
- A composition or kit comprising the engineered Cas12i nuclease of any one of claims 1-22 or the engineered Cas12i effector protein of any one of claims 23-28.
- An engineered cell comprising a modified target nucleic acid, wherein the target nucleic acid is modified by the method of claim 32 or 35.
- An engineered non-human animal comprising one or more engineered cells of claim 37.
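The claims above name mutations in a compact notation: a substitution such as E323R (wild-type residue, position, new residue) and an insertion such as L439(L+GG) (the residue at that position is kept and followed by the inserted residues). The following is a minimal illustrative sketch, not part of the claimed subject matter, of how such labels could be parsed and applied to a reference sequence with 1-based numbering. The function names `parse_mutation` and `apply_mutations` and the toy sequence `MKLEQ` are hypothetical; the toy sequence is not SEQ ID NO: 1.

```python
import re

# Substitution, e.g. "E323R": wild-type residue, 1-based position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
# Insertion, e.g. "L439(L+GG)": the residue is kept and extra residues follow it.
INSERTION = re.compile(r"^([A-Z])(\d+)\(([A-Z])\+([A-Z]+)\)$")


def parse_mutation(label):
    """Return (wild_type, position, replacement) for one mutation label."""
    match = SUBSTITUTION.match(label)
    if match:
        wild_type, position, new = match.groups()
        return wild_type, int(position), new
    match = INSERTION.match(label)
    if match and match.group(1) == match.group(3):
        wild_type, position, _, inserted = match.groups()
        # The original residue stays and the inserted residues are appended.
        return wild_type, int(position), wild_type + inserted
    raise ValueError(f"unrecognized mutation label: {label!r}")


def apply_mutations(reference, labels):
    """Apply mutation labels to a reference sequence (positions are 1-based)."""
    # Each list element holds the text that replaces one reference position,
    # so insertions never shift the numbering of later positions.
    residues = list(reference)
    seen = set()
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if not 1 <= position <= len(reference):
            raise ValueError(f"{label}: position outside the reference")
        if position in seen:
            raise ValueError(f"{label}: position {position} mutated twice")
        if reference[position - 1] != wild_type:
            raise ValueError(f"{label}: reference has {reference[position - 1]}")
        seen.add(position)
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Toy example only: E4R substitution plus an L3(L+GG) insertion.
    print(apply_mutations("MKLEQ", ["E4R", "L3(L+GG)"]))  # MKLGGRQ
```

Checking the wild-type residue against the reference before applying a label is a deliberate choice in this sketch: it catches numbering mismatches, which matter here because every claimed position is defined relative to SEQ ID NO: 1.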
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280035920.XA CN117337326A (zh) | 2021-05-27 | 2022-05-25 | Engineered Cas12i nucleases, effector proteins and uses thereof |
JP2023573318A JP2024522112A (ja) | 2021-05-27 | 2022-05-25 | Engineered Cas12i nucleases, effector proteins and uses thereof |
EP22810597.9A EP4349979A4 (en) | 2021-05-27 | 2022-05-25 | MANIPULATED CAS12I NUCLEASE, EFFECTOR PROTEIN AND USE THEREOF |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110581290.3 | 2021-05-27 | ||
CN2021096477 | 2021-05-27 | ||
CN202110581290.3A CN113151215B (zh) | 2021-05-27 | 2021-05-27 | Engineered Cas12i nuclease, effector protein thereof, and uses |
CNPCT/CN2021/096477 | 2021-05-27 | ||
CN202111347952.7 | 2021-11-15 | ||
CN202111347952 | 2021-11-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022247873A1 (zh) | 2022-12-01 |
Family
ID=84228435
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/095072 WO2022247873A1 (zh) | Engineered Cas12i nucleases, effector proteins and uses thereof | 2021-05-27 | 2022-05-25 |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP4349979A4 (zh) |
JP (1) | JP2024522112A (zh) |
CN (1) | CN117337326A (zh) |
WO (1) | WO2022247873A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023104185A1 (en) * | 2021-12-09 | 2023-06-15 | Beijing Institute For Stem Cell And Regenerative Medicine | Engineered cas12b effector proteins and methods of use thereof |
WO2023231456A1 (zh) * | 2022-05-31 | 2023-12-07 | Shandong Shunfeng Biotechnology Co., Ltd. | Optimized Cas protein and use thereof |
Citations (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4797368A (en) | 1985-03-15 | 1989-01-10 | The United States Of America As Represented By The Department Of Health And Human Services | Adeno-associated virus as eukaryotic expression vector |
US5173414A (en) | 1990-10-30 | 1992-12-22 | Applied Immune Sciences, Inc. | Production of recombinant adeno-associated virus vectors |
WO1993024641A2 (en) | 1992-06-02 | 1993-12-09 | The United States Of America, As Represented By The Secretary, Department Of Health & Human Services | Adeno-associated virus with inverted terminal repeat sequences as promoter |
WO1994029442A2 (en) | 1993-06-14 | 1994-12-22 | Basf Aktiengesellschaft | Tight control of gene expression in eucaryotic cells by tetracycline-responsive promoters |
WO1996001313A1 (en) | 1994-07-01 | 1996-01-18 | Hermann Bujard | Tetracycline-regulated transcriptional modulators |
US8454972B2 (en) | 2004-07-16 | 2013-06-04 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Method for inducing a multiclade immune response against HIV utilizing a multigene and multiclade immunogen |
WO2014093622A2 (en) | 2012-12-12 | 2014-06-19 | The Broad Institute, Inc. | Delivery, engineering and optimization of systems, methods and compositions for sequence manipulation and therapeutic applications |
WO2015054653A2 (en) | 2013-10-11 | 2015-04-16 | Massachusetts Eye & Ear Infirmary | Methods of predicting ancestral virus sequences and uses thereof |
WO2016112242A1 (en) | 2015-01-08 | 2016-07-14 | President And Fellows Of Harvard College | Split cas9 proteins |
WO2016205764A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
WO2016205749A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
WO2018165629A1 (en) | 2017-03-10 | 2018-09-13 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US10253365B1 (en) | 2017-11-22 | 2019-04-09 | The Regents Of The University Of California | Type V CRISPR/Cas effector proteins for cleaving ssDNAs and detecting target DNAs |
WO2019201331A1 (zh) | 2018-04-20 | 2019-10-24 | China Agricultural University | CRISPR/Cas effector protein and system |
WO2019226953A1 (en) | 2018-05-23 | 2019-11-28 | The Broad Institute, Inc. | Base editors and uses thereof |
US20200063126A1 (en) | 2018-03-14 | 2020-02-27 | Arbor Biotechnologies, Inc. | Novel crispr dna targeting enzymes and systems |
WO2020056924A1 (zh) | 2018-09-20 | 2020-03-26 | Institute of Zoology, Chinese Academy of Sciences | Nucleic acid detection method |
WO2020191246A1 (en) | 2019-03-19 | 2020-09-24 | The Broad Institute, Inc. | Methods and compositions for editing nucleotide sequences |
CN112195164A (zh) | 2020-12-07 | 2021-01-08 | Institute of Zoology, Chinese Academy of Sciences | Engineered Cas effector proteins and methods of use thereof |
WO2021087246A1 (en) * | 2019-10-31 | 2021-05-06 | Inari Agriculture, Inc. | Base-editing systems |
CN113151215A (zh) * | 2021-05-27 | 2021-07-23 | Institute of Zoology, Chinese Academy of Sciences | Engineered Cas12i nuclease, effector protein thereof, and uses |
WO2021226558A1 (en) | 2020-05-08 | 2021-11-11 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
2022
- 2022-05-25: CN application CN202280035920.XA, published as CN117337326A (zh); status: active, pending
- 2022-05-25: EP application EP22810597.9A, published as EP4349979A4 (en); status: active, pending
- 2022-05-25: JP application JP2023573318A, published as JP2024522112A (ja); status: active, pending
- 2022-05-25: WO application PCT/CN2022/095072, published as WO2022247873A1 (zh); status: active, application filing
Patent Citations (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4797368A (en) | 1985-03-15 | 1989-01-10 | The United States Of America As Represented By The Department Of Health And Human Services | Adeno-associated virus as eukaryotic expression vector |
US5173414A (en) | 1990-10-30 | 1992-12-22 | Applied Immune Sciences, Inc. | Production of recombinant adeno-associated virus vectors |
WO1993024641A2 (en) | 1992-06-02 | 1993-12-09 | The United States Of America, As Represented By The Secretary, Department Of Health & Human Services | Adeno-associated virus with inverted terminal repeat sequences as promoter |
WO1994029442A2 (en) | 1993-06-14 | 1994-12-22 | Basf Aktiengesellschaft | Tight control of gene expression in eucaryotic cells by tetracycline-responsive promoters |
WO1996001313A1 (en) | 1994-07-01 | 1996-01-18 | Hermann Bujard | Tetracycline-regulated transcriptional modulators |
US8454972B2 (en) | 2004-07-16 | 2013-06-04 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Method for inducing a multiclade immune response against HIV utilizing a multigene and multiclade immunogen |
WO2014093622A2 (en) | 2012-12-12 | 2014-06-19 | The Broad Institute, Inc. | Delivery, engineering and optimization of systems, methods and compositions for sequence manipulation and therapeutic applications |
WO2015054653A2 (en) | 2013-10-11 | 2015-04-16 | Massachusetts Eye & Ear Infirmary | Methods of predicting ancestral virus sequences and uses thereof |
WO2016112242A1 (en) | 2015-01-08 | 2016-07-14 | President And Fellows Of Harvard College | Split cas9 proteins |
WO2016205764A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
WO2016205749A1 (en) | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Novel crispr enzymes and systems |
WO2018165629A1 (en) | 2017-03-10 | 2018-09-13 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US10253365B1 (en) | 2017-11-22 | 2019-04-09 | The Regents Of The University Of California | Type V CRISPR/Cas effector proteins for cleaving ssDNAs and detecting target DNAs |
CN112041444A (zh) * | 2018-03-14 | 2020-12-04 | Arbor Biotechnologies, Inc. | Novel CRISPR DNA targeting enzymes and systems |
US20200063126A1 (en) | 2018-03-14 | 2020-02-27 | Arbor Biotechnologies, Inc. | Novel crispr dna targeting enzymes and systems |
US11168324B2 (en) | 2018-03-14 | 2021-11-09 | Arbor Biotechnologies, Inc. | Crispr DNA targeting enzymes and systems |
US10808245B2 (en) | 2018-03-14 | 2020-10-20 | Arbor Biotechnologies, Inc. | CRISPR DNA targeting enzymes and systems |
WO2019201331A1 (zh) | 2018-04-20 | 2019-10-24 | China Agricultural University | CRISPR/Cas effector protein and system |
CN112004932A (zh) * | 2018-04-20 | 2020-11-27 | China Agricultural University | CRISPR/Cas effector protein and system |
WO2019226953A1 (en) | 2018-05-23 | 2019-11-28 | The Broad Institute, Inc. | Base editors and uses thereof |
WO2020056924A1 (zh) | 2018-09-20 | 2020-03-26 | Institute of Zoology, Chinese Academy of Sciences | Nucleic acid detection method |
WO2020191246A1 (en) | 2019-03-19 | 2020-09-24 | The Broad Institute, Inc. | Methods and compositions for editing nucleotide sequences |
WO2021087246A1 (en) * | 2019-10-31 | 2021-05-06 | Inari Agriculture, Inc. | Base-editing systems |
WO2021226558A1 (en) | 2020-05-08 | 2021-11-11 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
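CN112195164A (zh) | 2020-12-07 | 2021-01-08 | Institute of Zoology, Chinese Academy of Sciences | Engineered Cas effector proteins and methods of use thereof |
CN113151215A (zh) * | 2021-05-27 | 2021-07-23 | Institute of Zoology, Chinese Academy of Sciences | Engineered Cas12i nuclease, effector protein thereof, and uses |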
Non-Patent Citations (37)
Title |
---|
"Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes", 1993, ELSEVIER, article "Overview of principles of hybridization and the strategy of nucleic acid probe assay" |
A. ANZALONE ET AL., NATURE, vol. 576, no. 7785, 2019, pages 149 - 157 |
APONTE-UBILLUS ET AL., APPL. MICROBIOL. BIOTECHNOL, vol. 102, no. 3, 2018, pages 1045 - 54 |
BROPHY, JENNIFER A. N. AND VOIGT, CHRISTOPHER A.: "Principles of genetic circuit design.", NATURE METHODS, vol. 11, no. 5, 2014, pages 508, XP055256504, DOI: 10.1038/nmeth.2926 |
CHARPENTIER AND DOUDNA, NATURE, vol. 495, 2013, pages 50 - 51 |
CRABTREE ET AL., CHEMISTRY & BIOLOGY, vol. 13, January 2006 (2006-01-01), pages 99 - 107 |
E. MEYERS AND W. MILLER, COMPUT. APPL. BIOSCI., vol. 4, 1988, pages 11 - 17 |
FUKUI ET AL., J. NUCLEIC ACIDS, 2010, pages 260512 |
GAUDELLI ET AL., NATURE, vol. 551, 2017, pages 464 - 471 |
GAUDELLI ET AL., NATURE, vol. 551, no. 7681, 23 November 2017 (2017-11-23), pages 464 - 471 |
GIBSON: "A Primer of Genome Science", 2004, SUNDERLAND, MASS.: SINAUER |
GRUNEBAUM ET AL., CURR. OPIN. ALLERGY CLIN. IMMUNOL., vol. 13, 2013, pages 630 - 638 |
HUANG X ET AL., NATURE COMMUNICATIONS, vol. 11, no. 5241, 2020 |
KIM ET AL., BIOCHEMISTRY, vol. 45, 2006, pages 6407 - 6416 |
KIM ET AL., NATURE BIOTECHNOLOGY, vol. 35, no. 4, 2017, pages 371 - 377 |
KOMOR ET AL., NATURE, vol. 533, no. 7603, 19 May 2016 (2016-05-19), pages 420 - 424 |
NEEDLEMAN ET AL., J. MOL. BIOL., vol. 48, 1970, pages 443 - 453 |
NEEDLEMAN AND WUNSCH, J MOL BIOL., vol. 48, 1970, pages 444 - 453 |
PAULMURUGAN AND GAMBHIR, CANCER RES, vol. 65, 15 August 2005 (2005-08-15), pages 7413 |
RICHTER ET AL., NATURE BIOTECHNOLOGY, vol. 38, 2020, pages 883 - 891 |
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY |
See also references of EP4349979A4 |
SENTMANAT MF ET AL.: "A survey of validation strategies for CRISPR-Cas9 editing", SCIENTIFIC REPORTS, vol. 8, no. 888, 2018 |
SUNDBERG AND ICHIKI: "Genetically Engineered Mice Handbook", 2006, CRC PRESS |
TAO WEI: "Research progress of CRISPR/Cas gene editing system", CHEMISTRY OF LIFE, vol. 39, no. 1, 15 February 2019 (2019-02-15), pages 39 - 45, XP055776583, DOI: 10.13488/j.smhx.20181350 * |
TRATSCHIN ET AL., MOL. CELL. BIOL., vol. 5, 1985, pages 3251 - 60 |
TSAI, SQ ET AL.: "GUIDE-seq enables genome-wide profiling of offtarget cleavage by CRISPR-Cas nucleases", NAT BIOTECHNOL., vol. 33, no. 2, 2015, pages 187 - 197 |
TSAI, SQ ET AL.: "GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases", NAT BIOTECHNOL., vol. 33, no. 2, 2015, pages 187 - 197, XP055555627, DOI: 10.1038/nbt.3117 |
UI-TEI ET AL., FEBS LETTERS, vol. 479, 2000, pages 79 - 82 |
VAN DER OOST, SCIENCE, vol. 358, no. 6366, 24 November 2017 (2017-11-24), pages 1019 - 1027 |
WEST ET AL., VIROLOGY, vol. 160, 1987, pages 38 - 47 |
WOLF ET AL., EMBO J., vol. 21, 2002, pages 3841 - 3851 |
WRIGHT, ADDISON V ET AL.: "Rational design of a split-Cas9 enzyme complex", PROC. NAT'L. ACAD. SCI., vol. 112, no. 10, 2015, pages 2984 - 2989, XP055283739, DOI: 10.1073/pnas.1501698112 |
ZHANG H ET AL., NATURE STRUCTURAL & MOLECULAR BIOLOGY, vol. 27, 2020, pages 1069 - 1076 |
ZHANG, HENG ET AL.: "Mechanisms for Target Recognition and Cleavage by the Cas12i RNA-guided Endonuclease", NATURE STRUCTURAL & MOLECULAR BIOLOGY, vol. 27, no. 11, 7 September 2020 (2020-09-07), XP037295140, DOI: 10.1038/s41594-020-0499-0 * |
ZHENG ET AL., NUCLEIC ACIDS RES., vol. 45, no. 6, 2017, pages 3369 - 3377 |
ZHONG ET AL., J. GENET. SYNDR. GENE THER, 2012 |
Also Published As
Publication number | Publication date |
---|---|
CN117337326A (zh) | 2024-01-02 |
EP4349979A4 (en) | 2024-08-21 |
EP4349979A1 (en) | 2024-04-10 |
JP2024522112A (ja) | 2024-06-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22810597; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 202280035920.X; Country of ref document: CN |
WWE | WIPO information: entry into national phase | Ref document number: 2023573318; Country of ref document: JP |
NENP | Non-entry into the national phase | Ref country code: DE |
WWE | WIPO information: entry into national phase | Ref document number: 2022810597; Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: 2022810597; Country of ref document: EP; Effective date: 20240102 |