CN118451179A

CN118451179A - Gene therapy for the treatment of HT1

Info

Publication number: CN118451179A
Application number: CN202280083049.0A
Authority: CN
Inventors: 熊强; 志威·高
Original assignee: Logic Biotherapy Co
Current assignee: Logic Biotherapy Co
Priority date: 2021-10-18
Filing date: 2022-10-18
Publication date: 2024-08-06

Abstract

The present disclosure provides compositions and methods for gene therapy. In addition, the present disclosure provides compositions and methods for treating HT1 by novel gene therapy mechanisms. One of the compositions comprises a circular cDNA integration gene therapy construct comprising, from 5' to 3', a polynucleotide sequence encoding (a) a 5' homology arm of between 1 kb and 1.6 kb in length; (b) a P2A coding sequence encoding a P2A peptide; (c) a therapeutic payload; and (d) a 3' homology arm of between 1 kb and 1.6 kb in length.

Description

Gene therapy for the treatment of HT1

Cross Reference to Related Applications

The present application claims priority from U.S. provisional application No. 63/339,783, filed on May 9, 2022, and U.S. provisional application No. 63/257,028, filed on October 18, 2021, each of which is incorporated herein by reference in its entirety.

Background

A subset of human diseases can be traced back to genetic or acquired DNA changes in early embryonic development. Of particular interest to gene therapy developers are diseases caused by mutations in a single gene, known as monogenic diseases. It is believed that there are over 6,000 monogenic diseases. Generally, any particular genetic disease caused by inherited mutations is relatively rare, but overall, the mortality rate of genetically related diseases is high. Well-known genetic diseases include cystic fibrosis, Duchenne muscular dystrophy, Huntington's disease, and sickle cell disease. Other classes of genetic diseases include metabolic disorders such as organic acidemias and lysosomal storage disorders, in which dysfunctional genes lead to defects in metabolic processes and accumulation of toxic byproducts that can cause serious morbidity and mortality in both the short term and the long term.

Summary

Monogenic diseases are of particular interest to biomedical innovators due to the simplicity of their disease pathology. However, most of these diseases and conditions remain virtually untreated. Thus, there remains a long felt need in the art for the treatment of such diseases.

In some embodiments, the present disclosure provides methods of integrating a transgene into the genome of at least one cell population in a tissue of a subject. In some embodiments, the methods can include the step of administering to a subject a composition that delivers a transgene encoding a functional protein, wherein cells in the subject's tissue fail to express the functional protein encoded by the gene product, and wherein the composition comprises: a polynucleotide cassette comprising an expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration into the genome of the cell at the target integration site; a third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and a fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell, wherein upon administration of the composition, the transgene is integrated into the genome of the cell population.

In some embodiments, the present disclosure provides a method of increasing the expression level of a transgene in a tissue over a period of time, the method comprising the step of administering to a subject in need thereof a composition that delivers the transgene integrated into the genome of at least one cell population in the subject tissue, wherein the composition comprises: a polynucleotide cassette comprising an expression cassette, the expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a transgene; and the second nucleic acid sequence is located 5 'or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration into the genome of the cell at the target integration site; a third nucleic acid sequence located 5 'of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and a fourth nucleic acid sequence located 3 'of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell, wherein upon administration of the composition, the transgene is integrated into the genome of the cell population and the expression level of the transgene in the tissue increases over a period of time. In some embodiments, the increased expression level comprises an increased percentage of cells expressing the transgene in the tissue.

In some embodiments, the present disclosure provides methods comprising the step of administering to a subject a dose of a composition that delivers a transgene to cells in a subject's tissue, wherein the transgene (i) encodes a fumarylacetoacetate hydrolase (FAH); (ii) integrates at a target integration site in the genomes of a plurality of cells; (iii) upon integration, functionally expresses the FAH; and (iv) confers a selective advantage on the plurality of cells over other cells in the tissue such that over time the tissue achieves a functional expression level of FAH, wherein the composition comprises: a polynucleotide cassette comprising an expression cassette, the expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products when the transgene is integrated at the target integration site; a third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site; and a fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site. In some embodiments, the selective advantage comprises an increase in the percentage of cells expressing the transgene in the tissue.

In some embodiments, the present disclosure provides methods of treating monogenic disorders. In some embodiments, the present disclosure provides methods of treating hereditary tyrosinemia type 1 (HT1). In some embodiments, the method of treating HT1 comprises administering to a subject a dose of a composition comprising: a polynucleotide cassette comprising an expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a FAH transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products when the transgene is integrated at the target integration site; a third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site; and a fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site. In some embodiments, the third nucleic acid sequence is selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, and SEQ ID NO: 4. In some embodiments, the fourth nucleic acid sequence is selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 5.

In some embodiments, the composition comprises a delivery vehicle. In some embodiments, the delivery vehicle is a particle, such as a nanoparticle, such as a lipid nanoparticle. In some embodiments, the delivery vehicle is a recombinant viral vector. In some embodiments, the recombinant viral vector is a recombinant AAV vector. In some embodiments, the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65, or AAV-NP59. In some embodiments, the composition further comprises AAV2 ITR sequences. In some embodiments, the composition comprises a portion of an AAV2 ITR sequence. In some embodiments, the composition comprises an ITR having at least 80%, 85%, 90%, 95%, 99%, or 100% sequence identity to an AAV2 ITR. In some embodiments, the composition comprises an ITR sequence selected from the group consisting of SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, and SEQ ID NO: 30.

According to various embodiments, any of a variety of transgenes may be expressed according to the methods and compositions described herein. For example, in some embodiments, the transgene is or comprises a FAH transgene. In some embodiments, the FAH transgene is wt human FAH, codon optimized FAH, synthetic FAH, a FAH variant, a FAH mutant, or a FAH fragment. In some embodiments, the transgene is or comprises a sequence having 80% identity to any one of SEQ ID NO. 18, SEQ ID NO. 19, SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22.
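The percent-identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned and have equal length; the function name `percent_identity` and the toy sequences are hypothetical, and the disclosure does not specify which alignment method is used to compute identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of positions that match between two pre-aligned, equal-length sequences.

    Assumption: alignment has been done elsewhere; gap handling is out of scope.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper()) if x == y)
    return 100.0 * matches / len(aligned_a)


# Toy example: 8 of 10 positions match.
identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
meets_threshold = identity >= 80.0
```

In practice, identity is typically computed from a pairwise alignment produced by a dedicated tool; this sketch only shows the final counting step.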

In some embodiments, the invention provides a recombinant viral vector for integrating a transgene into a target integration site in the genome of a cell, comprising: a polynucleotide cassette comprising an expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence comprises a FAH transgene; and the second nucleic acid sequence is located 5 'or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products when integrated into a target integration site in the genome of the cell; a third nucleic acid sequence located 5 'of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and a fourth nucleic acid sequence located 3 'of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell. In some embodiments, the second nucleic acid sequence is a sequence encoding a P2A peptide. In some embodiments, the second nucleic acid sequence has at least 80% identity to SEQ ID NO. 6. In some embodiments, the second nucleic acid sequence encodes a P2A peptide having at least 90% sequence identity to SEQ ID NO. 7. In some embodiments, recombinant viral vectors are provided comprising the sequences SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25 or SEQ ID NO. 26.

As described herein, the present disclosure encompasses several advantageous insights regarding the integration of one or more transgenes into the genome of a cell. For example, in some embodiments, the integration does not comprise nuclease activity.

Although any suitable tissue may be targeted, in some embodiments, the tissue is liver.

As described herein, provided methods and compositions include a polynucleotide cassette having at least four nucleic acid sequences. In some embodiments, the second nucleic acid sequence comprises: a) a nucleic acid sequence encoding a 2A peptide; b) a nucleic acid sequence encoding an internal ribosome entry site (IRES); c) a nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; or d) a nucleic acid sequence encoding a splice donor and a splice acceptor. In some embodiments, the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the transgene and the second nucleic acid sequence into the target integration site. In some embodiments, the target integration site comprises an endogenous promoter and an endogenous gene. In some embodiments, the target integration site is an endogenous albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene. In some embodiments, the homology arms direct integration of the expression cassette 3' of the start codon of the endogenous albumin gene or 5' of the stop codon of the endogenous albumin gene.
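The four-sequence layout described above can be sketched as a small Python data structure. This is an illustrative sketch only: the class name `PolynucleotideCassette`, its field names, and the toy sequences are assumptions introduced here, not constructs or sequences from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolynucleotideCassette:
    """Hypothetical container mirroring the four nucleic acid sequences."""

    transgene: str        # first nucleic acid sequence
    linker: str           # second nucleic acid sequence (e.g., 2A peptide coding sequence)
    arm_5p: str           # third nucleic acid sequence (5' homology arm)
    arm_3p: str           # fourth nucleic acid sequence (3' homology arm)
    linker_is_5p: bool = True  # second sequence located 5' (True) or 3' (False) of the first

    def expression_cassette(self) -> str:
        """Join the first and second sequences in the chosen orientation."""
        if self.linker_is_5p:
            return self.linker + self.transgene
        return self.transgene + self.linker

    def assemble(self) -> str:
        """Return the full 5'-to-3' layout: 5' arm, expression cassette, 3' arm."""
        return self.arm_5p + self.expression_cassette() + self.arm_3p


# Toy placeholder sequences, for illustration only.
cassette = PolynucleotideCassette(transgene="ATGGCC", linker="GGATCC", arm_5p="AAAA", arm_3p="TTTT")
layout = cassette.assemble()
```

Note that the sketch contains no promoter field, which is consistent with embodiments in which expression relies on an endogenous promoter at the target integration site.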

According to various aspects, the third nucleic acid and/or the fourth nucleic acid may have a significant length (e.g., at least 800 nucleotides in length). In some embodiments, the third nucleic acid is between 200 and 3,000 nucleotides in length. In some embodiments, the fourth nucleic acid is between 200 and 3,000 nucleotides in length.
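As a minimal illustration of the length ranges above, the following Python sketch checks whether both homology arms fall within a given nucleotide range. The function name `arms_within_range` and its default bounds (taken from the 200 to 3,000 nucleotide range) are assumptions made for illustration.

```python
def arms_within_range(arm_5p: str, arm_3p: str, min_len: int = 200, max_len: int = 3000) -> bool:
    """Return True if both homology arm sequences have lengths in [min_len, max_len]."""
    return all(min_len <= len(arm) <= max_len for arm in (arm_5p, arm_3p))


# Toy arms of 1,000 and 1,600 nucleotides (placeholder bases, not real sequences).
ok = arms_within_range("A" * 1000, "C" * 1600)
# The narrower 1 kb to 1.6 kb range can be checked by overriding the bounds.
ok_narrow = arms_within_range("A" * 1000, "C" * 1600, min_len=1000, max_len=1600)
```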

In some embodiments, the polynucleotide cassette does not comprise a promoter sequence. In some embodiments, after integration of the expression cassette into a target integration site in the genome of the cell, the transgene is expressed under the control of an endogenous promoter at the target integration site. In some embodiments, the target integration site is an albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene. In some embodiments, the transgene is expressed under the control of an endogenous albumin promoter after integration of the expression cassette into a target integration site in the genome of the cell, without disrupting expression of the endogenous albumin gene.

In some embodiments, the provided compositions may be administered to a subject at a dose of between 1E12 vg/kg and 1E14 vg/kg. In some embodiments, the provided compositions may be administered to a subject at a dose of between 3E12 vg/kg and 1E13 vg/kg. In some embodiments, the provided compositions may be administered to a subject at a dose of between 3E12 vg/kg and 3E13 vg/kg. In some embodiments, the provided compositions may be administered to a subject at a dose of no more than 3E13 vg/kg. In some embodiments, the provided compositions may be administered to a subject at a dose of no more than 3E12 vg/kg. In some embodiments, the provided compositions may be administered to a subject only once. In some embodiments, the provided compositions may be administered to a subject more than once.
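Because doses are stated in vector genomes per kilogram (vg/kg), the total number of vector genomes scales with body weight. The following Python sketch shows that arithmetic under stated assumptions: the function names and the example body weight are hypothetical, and the sketch is not dosing guidance.

```python
def total_vector_genomes(dose_vg_per_kg: float, body_weight_kg: float) -> float:
    """Total vector genomes for a weight-based dose: (vg/kg) x (kg)."""
    if dose_vg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_vg_per_kg * body_weight_kg


def dose_in_range(dose_vg_per_kg: float, low: float = 1e12, high: float = 1e14) -> bool:
    """Check a dose against an inclusive range (defaults: 1E12 to 1E14 vg/kg)."""
    return low <= dose_vg_per_kg <= high


# Hypothetical example: 3E12 vg/kg for a 4 kg subject.
total = total_vector_genomes(3e12, 4.0)
in_range = dose_in_range(3e12)
```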

In some embodiments, the present disclosure provides the insight that the provided compositions can be administered to neonatal subjects. Furthermore, in some embodiments, the provided compositions may be administered to subjects between 0 days and 1 month of age, between 3 months and 1 year of age, between 1 year and 5 years of age, or 5 years of age and older. In some embodiments, the provided compositions can be administered to a subject, wherein the subject is an animal. In some embodiments, the provided compositions may be administered to a subject, wherein the subject is a human.

As used in this disclosure, the terms "about" and "approximately" are used as equivalents. Any reference to a publication, patent, or patent application is incorporated by reference herein in its entirety. Any numerical values, with or without about/approximately, used in the present application are intended to encompass any normal fluctuations as would be understood by one of ordinary skill in the art.

Other features, objects, and advantages of the invention will be apparent from the following description. It should be understood, however, that the description of the embodiments, while indicating embodiments of the invention, is given by way of illustration only, not limitation. Various changes and modifications within the scope of the invention will become apparent to those skilled in the art from the detailed description.

Drawings

Fig. 1A-1I show the results of treating an HT1 mouse model with a therapy as described herein. Fig. 1A shows an exemplary polynucleotide cassette. Fig. 1B shows the time course of an exemplary therapy. Fig. 1C shows an assessment of circulating biomarkers. Fig. 1D shows the body weight change of the treated mice. Fig. 1E and 1F show markers of liver function in treated mice. Fig. 1G shows immunohistochemical staining of liver samples with FAH antibodies. Fig. 1H shows an assessment of integration into the host genomic DNA (gDNA INT). Fig. 1I shows an assessment of the presence of alpha-fetoprotein (AFP) as a clinically validated prenatal biomarker for hepatocellular carcinoma (HCC).

Fig. 2A-2E show the results and selective advantage of treating an HT1 mouse model with a therapy as described herein. Fig. 2A illustrates an exemplary vector and treatment regimen. Fig. 2B shows an evaluation of GENERIDE™ biomarkers. Fig. 2C shows the survival rate of treated mice. Fig. 2D and 2E show markers of liver function in treated mice.

Fig. 3A-3D show the results of treating a pediatric HT1 mouse model with a therapy as described herein. Fig. 3A shows the time course of treatment. Fig. 3B shows an evaluation of GENERIDE™ biomarkers. Fig. 3C shows the body weight change of the treated mice. Fig. 3D shows AFP levels in treated mice.

Fig. 4A-4C show the results of treating an HT1 mouse model with various treatment regimens of a therapy as described herein. Fig. 4A shows treatment of mice with various treatment regimens. Fig. 4B shows an assessment of circulating biomarkers. Fig. 4C shows markers of liver function in treated mice.

Figure 5 shows a schematic diagram of the tyrosine metabolic pathway.

Fig. 6A-6F show the results of NTBC cycling. Fig. 6A shows an exemplary study design. Fig. 6B shows the level of a toxic metabolite (SUAC) that accumulates in HT1. Fig. 6C shows markers of liver injury (e.g., ALT). Fig. 6D shows markers of liver synthetic function (e.g., activated partial thromboplastin time (aPTT), which reflects clotting time). Fig. 6E shows blood glucose levels. Fig. 6F shows the body weight change of mice.

Fig. 7A-7F show the results of GENERIDE™ treatment during the 28 days post-treatment in the absence of NTBC. Fig. 7A shows an assessment of circulating biomarkers (e.g., ALB-2A). Fig. 7B shows accumulation of HT1-associated toxic metabolites (e.g., SUAC). Fig. 7C shows markers of liver function (e.g., ALT). Fig. 7D shows clotting time. Fig. 7E shows blood glucose levels. Fig. 7F shows the body weight change of the treated mice.

Fig. 8A-8D show the results of GENERIDE™ treatment within 4 months after treatment. Fig. 8A shows an exemplary study design. Fig. 8B shows an assessment of circulating biomarkers (e.g., ALB-2A) (left panel) and immunohistochemical staining of liver samples with anti-FAH antibodies (right panel). Fig. 8C shows an assessment of liver function as measured by various biomarkers, such as aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT), total bilirubin (TBIL), and albumin (ALB). Fig. 8D shows an assessment of kidney function as measured by various biomarkers, e.g., urea nitrogen (UREAN) and creatinine (CREAT).

Fig. 9A-9E show the results of GENERIDE™ treatment within months after treatment. Fig. 9A shows an exemplary study design. Fig. 9B shows toxic metabolites that accumulate in HT1 (e.g., SUAC). Fig. 9C shows an assessment of circulating biomarkers (e.g., ALB-2A). Fig. 9D shows an evaluation of alpha-fetoprotein (AFP) as a clinically validated prenatal biomarker for hepatocellular carcinoma (HCC). Fig. 9E shows an assessment of tyrosine levels.

Fig. 10A-10B show the results of administering GENERIDE™ treatment to a pediatric HT1 mouse model. Fig. 10A shows a schematic of the time course of treatment. Fig. 10B shows alpha-fetoprotein (AFP) levels in mice treated on postnatal day 14 (PND 14).

Fig. 11 shows exemplary immunohistochemical staining of liver function markers in mice treated with GENERIDE™.

Definitions

In order that the invention may be more readily understood, certain terms are first defined below. Other definitions of the following terms and other terms are set forth throughout the specification.

About: the term "about", when used herein in reference to a value, refers to a value that is similar, in context, to the referenced value. In general, those skilled in the art who are familiar with the context will recognize the relative degree of variation that is covered by "about" in this context. For example, in some embodiments, the term "about" may encompass a range of values within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1% or less of the referenced value.
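The percentage-based reading of "about" can be expressed as a simple relative-tolerance check. The following Python sketch is illustrative only; the function name `is_about` and the default tolerance of 10% are assumptions, since the applicable tolerance depends on context.

```python
def is_about(value: float, reference: float, tolerance: float = 0.10) -> bool:
    """Return True if value lies within +/- tolerance (as a fraction) of reference."""
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(value - reference) <= tolerance * abs(reference)


# 105 is within 10% of 100; 130 is not within 25% of 100.
within_10 = is_about(105, 100, tolerance=0.10)
within_25 = is_about(130, 100, tolerance=0.25)
```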

Adult: as used herein, the term "adult" refers to a human aged eighteen years or older. In some embodiments, the adult human weighs in the range of about 90 pounds to about 250 pounds.

Associated: as the term is used herein, two events or entities are "associated" with one another if the presence, level, and/or form of one is correlated with that of the other. For example, a particular entity (e.g., polypeptide, genetic feature, metabolite, microorganism, etc.) is considered to be associated with a particular disease, disorder, or condition if its presence, level, and/or form correlates with the incidence of and/or susceptibility to the disease, disorder, or condition (e.g., across a relevant population). In some embodiments, two or more entities are physically "associated" with one another if they interact, directly or indirectly, such that they are and/or remain in physical proximity to one another. In some embodiments, two or more entities that are physically associated with one another are covalently linked to one another; in some embodiments, two or more entities that are physically associated with one another are not covalently linked to one another but are non-covalently associated, for example, by means of hydrogen bonds, van der Waals interactions, hydrophobic interactions, magnetism, and combinations thereof.

Biological sample: as used herein, the term "biological sample" generally refers to a sample obtained or derived from a biological source of interest (e.g., a tissue, organism, or cell culture) as described herein. In some embodiments, the source of interest comprises an organism, such as an animal or a human. In some embodiments, the biological sample is or comprises biological tissue or fluid. In some embodiments, the biological sample may be or comprise bone marrow; blood; blood cells; ascites; tissue or fine needle biopsy samples; cell-containing body fluids; free-floating nucleic acids; sputum; saliva; urine; cerebrospinal fluid; peritoneal fluid; pleural fluid; feces; lymph; gynecological fluids; skin swabs; vaginal swabs; oral swabs; nasal swabs; washings or lavages, such as ductal lavages or bronchoalveolar lavages; aspirates; scrapings; bone marrow specimens; tissue biopsy specimens; surgical specimens; other body fluids, secretions, and/or excretions; and/or cells therefrom, and the like. In some embodiments, the biological sample is or comprises cells obtained from a subject. In some embodiments, the obtained cells are or include cells from the subject from whom the sample was obtained. In some embodiments, the sample is a "primary sample" obtained directly from a source of interest by any suitable means. For example, in some embodiments, the primary biological sample is obtained by a method selected from the group consisting of: biopsy (e.g., fine needle aspiration or tissue biopsy), surgery, collection of body fluid (e.g., blood, lymph, feces, etc.), and the like. In some embodiments, as will be clear from context, the term "sample" refers to a preparation obtained by processing a primary sample (e.g., by removing one or more components from it and/or by adding one or more agents to it), for example, by filtering using a semipermeable membrane. Such a "processed sample" may comprise, for example, nucleic acids or proteins extracted from a sample or obtained by subjecting a primary sample to techniques such as amplification or reverse transcription of mRNA, isolation and/or purification of certain components, and the like.

Biomarkers: the term "biomarker" as used herein is consistent with its use in the art and refers to an entity whose presence, level, or form is associated with a particular biological event or state of interest such that it is considered a "marker" of that event or state. The present disclosure provides, among other things, biomarkers for gene therapy (e.g., useful in assessing one or more characteristics or properties of gene therapy treatment, such as the extent, level, and/or persistence of payload (payload) expression). In some embodiments, the biomarker is a cell surface marker. In some embodiments, the biomarker is intracellular. In some embodiments, the biomarker is present outside the cell (e.g., secreted or otherwise produced or present outside the cell, e.g., in a bodily fluid such as blood, urine, tears, saliva, cerebrospinal fluid, etc.). In certain embodiments, the present disclosure demonstrates the effectiveness of biomarkers detectable in a sample obtained from a subject who has received gene therapy for assessing one or more features or characteristics of the gene therapy; in some such embodiments, the sample is a cell, tissue, and/or fluid other than the cell, tissue, and/or fluid to which the gene therapy is delivered and/or other than the cell, tissue, and/or fluid for which the payload is effective.

Codon optimization: as used herein, the term "codon optimization" refers to the process of altering the codons of a given gene in a manner such that the polypeptide sequence encoded by the gene remains the same, while the altered codons improve the expression process of the polypeptide sequence. For example, if a polypeptide has a human protein sequence and is expressed in E.coli, expression will typically be increased when the DNA sequence is codon optimized to change the human codon to one that is more efficiently expressed in E.coli.
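The idea that codon optimization changes codons while leaving the encoded polypeptide unchanged can be sketched in Python as follows. The preferred-codon map `PREFERRED` is a hypothetical placeholder, not a real codon usage table for any organism, and the function names are assumptions; the standard genetic code is used for translation.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preferred codons for a few amino acids (illustrative only).
PREFERRED = {"L": "CTG", "A": "GCC", "K": "AAG"}


def translate(dna: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    protein = translate(dna)
    optimized = "".join(
        preferred.get(aa, dna[i * 3:i * 3 + 3]) for i, aa in enumerate(protein)
    )
    # Guard: the encoded polypeptide must remain the same after optimization.
    if translate(optimized) != protein:
        raise ValueError("preferred codon map changes the encoded polypeptide")
    return optimized


original = "ATGTTAGCTAAATAA"  # toy sequence: M L A K stop
optimized = codon_optimize(original, PREFERRED)
```

The final check inside `codon_optimize` reflects the defining constraint of the process: only synonymous substitutions are allowed, so a preference map that altered the protein would be rejected.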

Detectable moiety: as used herein, the term "detectable moiety" refers to any entity (e.g., a molecule, a complex, or a portion or component thereof) that is detectable. In some embodiments, a detectable moiety is provided and/or utilized as a discrete molecular entity; in some embodiments, the detectable moiety is part of and/or is associated with another molecular entity. Examples of detectable moieties include, but are not limited to: various ligands, radionuclides (e.g., ³H, ¹⁴C, ¹⁸F, ¹⁹F, ³²P, ³⁵S, ¹³⁵I, ¹²⁵I, ¹²³I, ⁶⁴Cu, ¹⁸⁷Re, ¹¹¹In, ⁹⁰Y, ⁹⁹ᵐTc, ¹⁷⁷Lu, ⁸⁹Zr, etc.), fluorescent dyes (see below for specific exemplary fluorescent dyes), chemiluminescent agents (e.g., acridinium esters, stabilized dioxetanes, and the like), bioluminescent agents, spectrally resolvable inorganic fluorescent semiconductor nanocrystals (i.e., quantum dots), metal nanoparticles (e.g., gold, silver, copper, platinum, etc.), nanoclusters, paramagnetic metal ions, enzymes (see below for specific examples of enzymes), colorimetric labels (e.g., dyes, colloidal gold, and the like), biotin, digoxigenin, haptens, antibodies, and/or proteins for which antisera or monoclonal antibodies are available.

Child: as used herein, the term "child" refers to a human between two and 18 years of age. Body weight can vary with age and from child to child, and typically ranges from 30 pounds to 150 pounds.

Combination therapy: as used herein, the term "combination therapy" refers to those instances in which a subject is simultaneously exposed to two or more treatment regimens (e.g., two or more therapeutic agents, such as gene therapy and non-gene therapy treatment modalities). In some embodiments, two or more regimens may be administered simultaneously; in some embodiments, such regimens may be administered sequentially (e.g., all "doses" of the first regimen are administered prior to administration of any dose of the second regimen); in some embodiments, such agents are administered in overlapping dosing regimens. In some embodiments, "administering" a combination therapy may involve administering one or more agents or modalities to a subject who is receiving the other agents or modalities of the combination. For clarity, combination therapy does not require that the individual agents be administered together in a single composition (or even necessarily at the same time).

Composition: those skilled in the art will appreciate that the term "composition" as used herein may be used to refer to a discrete physical entity comprising one or more specified components. Generally, unless otherwise specified, the composition may have any form, such as a gas, gel, liquid, solid, and the like.

Determining: many of the methods described herein include a "determining" step. Those of ordinary skill in the art, with the benefit of the present description, will recognize that such "determining" may be accomplished using any of a variety of techniques available to those of skill in the art, including, for example, the specific techniques explicitly mentioned herein. In some embodiments, the determining involves manipulation of a physical sample. In some embodiments, the determining involves consideration and/or manipulation of data or information, for example using a computer or other processing unit adapted to perform a relevant analysis. In some embodiments, the determining involves receiving relevant information and/or materials from a source. In some embodiments, the determining involves comparing one or more characteristics of a sample or entity to a comparable reference.

Gene: as used herein, the term "gene" refers to a DNA sequence encoding a gene product (e.g., an RNA product and/or a polypeptide product). In some embodiments, the gene comprises a coding sequence (e.g., a sequence encoding a particular product); in some embodiments, the gene comprises a non-coding sequence. In some particular embodiments, a gene may include both coding (e.g., exons) and non-coding (e.g., introns) sequences. In some embodiments, a gene may include one or more regulatory elements (e.g., promoters, enhancers, silencers, termination signals) that, for example, can control or affect one or more aspects of gene expression (e.g., cell-type specific expression, inducible expression). In some embodiments, the gene is located or present in the genome (e.g., in or on a chromosome or other replicable nucleic acid) (or has the same nucleotide sequence as a gene located or present in the genome).

Gene product or expression product: as used herein, the term "gene product" or "expression product" generally refers to RNA transcribed from a gene (before and/or after processing) or a polypeptide encoded by RNA transcribed from a gene (before and/or after modification).

"Improve", "increase", "inhibit" or "decrease": as used herein, the terms "improve," "increase," "inhibit," "decrease," or grammatical equivalents thereof indicate values relative to a baseline or other reference measurement. In some embodiments, suitable reference measurements may be or include measurements in a particular system (e.g., in a single subject) in the absence (e.g., before and/or after) of a particular agent or treatment, or in the presence of a suitable comparable reference agent, under otherwise comparable conditions. In some embodiments, suitable reference measurements may be or include measurements in similar systems known or expected to react in a particular manner in the presence of a relevant agent or treatment.

Infant: as used herein, the term "infant" refers to a human being less than two years of age. The typical weight of an infant is in the range of 3 pounds to 20 pounds.

Neonate: as used herein, the term "neonate" refers to a newborn human.

Nucleic acid: as used herein, the term "nucleic acid," in its broadest sense, refers to any compound and/or substance that is or can be incorporated into an oligonucleotide chain. In some embodiments, the nucleic acid is a compound and/or substance that is incorporated or incorporable into the oligonucleotide chain via a phosphodiester linkage. As will be apparent from the context, in some embodiments, "nucleic acid" refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides); in some embodiments, "nucleic acid" refers to an oligonucleotide strand comprising individual nucleic acid residues. In some embodiments, a "nucleic acid" is or comprises RNA; in some embodiments, a "nucleic acid" is or comprises DNA. In some embodiments, the nucleic acid is, comprises, or consists of one or more nucleic acid residues. In some embodiments, the nucleic acid is, comprises, or consists of one or more nucleic acid analogs. In some embodiments, the nucleic acid analog differs from the nucleic acid in that it does not utilize a phosphodiester backbone. In some embodiments, the nucleic acid is, comprises, or consists of one or more natural nucleosides (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine). In some embodiments, the nucleic acid is, comprises, or consists of one or more nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolopyrimidine, 3-methyladenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoguanosine, O(6)-methylguanine, 2-thiocytidine, methylated bases, intercalating bases, and combinations thereof). In some embodiments, the nucleic acid has a nucleotide sequence encoding a functional gene product, such as RNA or a protein.
In some embodiments, the nucleic acid comprises one or more introns. In some embodiments, the nucleic acid is prepared by one or more of isolation from a natural source, enzymatic synthesis (in vivo or in vitro) by complementary template-based polymerization, amplification in a recombinant cell or system, and chemical synthesis. In some embodiments, the nucleic acid is at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000 or more residues long. In some embodiments, the nucleic acid is partially or wholly single stranded; in some embodiments, the nucleic acid is partially or wholly double stranded. In some embodiments, the nucleic acid has a nucleotide sequence comprising at least one component that encodes a polypeptide, or is a complement of the sequence encoding the polypeptide. In some embodiments, the nucleic acid has enzymatic activity.

Peptide: as used herein, the term "peptide" or "polypeptide" refers to any polymer chain of amino acids. In some embodiments, the peptide has an amino acid sequence that is found in nature. In some embodiments, the peptide has an amino acid sequence that is not found in nature. In some embodiments, the peptide has an engineered amino acid sequence, in that it is designed and/or produced through human intervention. In some embodiments, the peptide may comprise or consist of a natural amino acid, an unnatural amino acid, or both. In some embodiments, the peptide may comprise or consist of only natural amino acids or only unnatural amino acids. In some embodiments, the peptide may comprise a D-amino acid, an L-amino acid, or both. In some embodiments, the peptide may comprise only D-amino acids. In some embodiments, the peptide may comprise only L-amino acids. In some embodiments, the peptide is linear. In some embodiments, the term "peptide" may be appended to the name of a reference peptide, activity, or structure; in such cases, it is used herein to refer to peptides that share the relevant activity or structure and thus can be considered members of the same class or family of peptides. For each such class, the present description provides and/or those skilled in the art are aware of exemplary peptides in the class whose amino acid sequences and/or functions are known; in some embodiments, such an exemplary peptide is a reference peptide of the peptide class or family. In some embodiments, a member of a peptide class or family shows significant sequence homology or identity with a reference peptide of the class (in some embodiments, with all peptides in the class), shares a common sequence motif (e.g., a characteristic sequence element) with it, and/or shares a common activity with it (in some embodiments, at a comparable level or within a specified range).
For example, in some embodiments, a member peptide exhibits at least about 30-40%, and typically greater than about 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more overall sequence homology or identity to a reference peptide and/or includes at least one region (e.g., a conserved region, which in some embodiments may be or comprise a characteristic sequence element) that exhibits very high sequence identity, typically greater than 90% or even 95%, 96%, 97%, 98%, or 99%. This conserved region typically covers at least 3-4 and typically up to 20 or more amino acids; in some embodiments, the conserved region encompasses at least one segment having at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more contiguous amino acids.

Subject: as used herein, the term "subject" refers to an organism, typically a mammal (e.g., a human, including in some embodiments prenatal human forms). In some embodiments, the subject has a related disease, disorder, or condition. In some embodiments, the subject is susceptible to a disease, disorder, or condition. In some embodiments, the subject exhibits one or more symptoms or features of a disease, disorder, or condition. In some embodiments, the subject does not exhibit any symptoms or features of the disease, disorder, or condition. In some embodiments, the subject is a human having one or more features characteristic of susceptibility to or risk of a disease, disorder, or condition. In some embodiments, the subject is a patient. In some embodiments, the subject is an individual to whom a diagnostic and/or therapy is being and/or has been administered.

Substantially: as used herein, the term "substantially" refers to the qualitative condition of exhibiting the total or near-total extent or degree of a feature or property of interest. Those of ordinary skill in the art will understand that biological and chemical phenomena rarely, if ever, go to completion or achieve or avoid an absolute result. Thus, the term "substantially" is used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.

Variant: as used herein in the context of a molecule, such as a nucleic acid, protein, or small molecule, the term "variant" refers to a molecule that exhibits significant structural identity with a reference molecule but is structurally different from the reference molecule, e.g., in the presence or absence or in the level of one or more chemical moieties, as compared to the reference entity. In some embodiments, the variant is also functionally different from its reference molecule. In general, whether a particular molecule is properly considered a "variant" of a reference molecule is based on its degree of structural identity with the reference molecule. As will be appreciated by those skilled in the art, any biological or chemical reference molecule has certain characteristic structural components. By definition, a variant is a distinct molecule that shares one or more such characteristic structural components but differs from the reference molecule in at least one aspect. To give just a few examples, a polypeptide may have a characteristic sequence component consisting of a plurality of amino acids that have a specified position relative to each other in linear or three-dimensional space and/or contribute to a particular structural motif and/or biological function; a nucleic acid may have a characteristic sequence component consisting of a plurality of nucleotide residues having a specified position relative to each other in linear or three-dimensional space. In some embodiments, a variant polypeptide or nucleic acid may differ from a reference polypeptide or nucleic acid by one or more differences in amino acid or nucleotide sequence and/or one or more differences in chemical moieties (e.g., carbohydrate, lipid, phosphate groups) that are covalent components of the polypeptide or nucleic acid (e.g., linked to the polypeptide or nucleic acid backbone).
In some embodiments, the variant polypeptide or nucleic acid exhibits at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, or 99% overall sequence identity to a reference polypeptide or nucleic acid. In some embodiments, the variant polypeptide or nucleic acid does not share at least one characteristic sequence component with a reference polypeptide or nucleic acid. In some embodiments, the reference polypeptide or nucleic acid has one or more biological activities. In some embodiments, the variant polypeptide or nucleic acid shares one or more biological activities of a reference polypeptide or nucleic acid. In some embodiments, the variant polypeptide or nucleic acid lacks one or more biological activities of a reference polypeptide or nucleic acid. In some embodiments, the variant polypeptide or nucleic acid exhibits a reduced level of one or more biological activities as compared to a reference polypeptide or nucleic acid. In some embodiments, a polypeptide or nucleic acid of interest is considered to be a "variant" of a reference polypeptide or nucleic acid if its amino acid or nucleotide sequence is identical to that of the reference polypeptide or nucleic acid except for a small number of sequence changes at particular positions. Typically, fewer than about 20%, about 15%, about 10%, about 9%, about 8%, about 7%, about 6%, about 5%, about 4%, about 3%, or about 2% of the residues in the variant are substituted, inserted, or deleted as compared to the reference. In some embodiments, a variant polypeptide or nucleic acid comprises about 10, about 9, about 8, about 7, about 6, about 5, about 4, about 3, about 2, or about 1 substituted residue as compared to a reference.
Typically, a variant polypeptide or nucleic acid comprises a very small number (e.g., less than about 5, about 4, about 3, about 2, or about 1) of substituted, inserted, or deleted functional residues (i.e., residues involved in a particular biological activity) relative to a reference. In some embodiments, the variant polypeptide or nucleic acid comprises no more than about 5, about 4, about 3, about 2, or about 1 additions or deletions as compared to the reference, and in some embodiments, no additions or deletions. In some embodiments, a variant polypeptide or nucleic acid comprises less than about 25, about 20, about 19, about 18, about 17, about 16, about 15, about 14, about 13, about 10, about 9, about 8, about 7, about 6, and typically less than about 5, about 4, about 3, or about 2 additions or deletions as compared to a reference. In some embodiments, the reference polypeptide or nucleic acid is a polypeptide or nucleic acid that is found in nature. In some embodiments, the reference polypeptide or nucleic acid is a human polypeptide or nucleic acid.
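By way of illustration only, the overall sequence identity thresholds recited above can be checked with a short calculation. The following is a minimal sketch, not a method prescribed by this disclosure. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted over all aligned columns; the function names and the example sequences are hypothetical.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of aligned columns with identical residues.

    Assumption: inputs are pre-aligned to the same length, with '-' for gaps.
    Gap columns count toward the denominator but never as matches.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for r, c in zip(reference, candidate) if r == c and r != "-"
    )
    return 100.0 * matches / len(reference)


def meets_identity_threshold(reference: str, candidate: str,
                             threshold: float = 85.0) -> bool:
    """True if the candidate reaches the stated overall identity threshold."""
    return percent_identity(reference, candidate) >= threshold


# Hypothetical 10-residue example with a single substitution.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

Other conventions for the denominator (for example, excluding gap columns) give different values, so the convention should be stated whenever a threshold is applied.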

Detailed description of certain embodiments

Gene therapy

Gene therapy alters the gene expression profile of a patient's cells by gene transfer, a process of delivering a therapeutic gene (referred to as a transgene). Various delivery vehicles, known as vectors, are used to transport transgenes into the nucleus of cells to alter or enhance cellular capacity (e.g., proteome, functionality, etc.). Developers have made great progress in introducing genes into cells of tissues such as the liver, the retina of the eye, and the blood-forming cells of the bone marrow using various vectors. These methods have in some cases become approved therapies and, in other cases, have demonstrated highly desirable results in clinical trials.

There are various approaches to gene therapy. In known AAV gene therapies, the transgene is introduced into the nucleus of the host cell but is not intended to integrate into chromosomal DNA. The transgene is expressed from a non-integrated genetic element, called an episome, that resides in the nucleus. A second type of gene therapy employs a different type of virus, such as a lentivirus, which inserts itself into chromosomal DNA along with the transgene, but at arbitrary sites.

The episomal expression of a gene must be driven by an exogenous promoter to produce a protein that corrects or ameliorates the disease condition.

Limitations of gene therapy

Dilution effect during cell division and tissue growth. In the case of gene therapies based on episomal gene expression, the benefits of the therapy generally decrease when the cells divide during the growth or tissue regeneration process, as the transgene is not intended to integrate into the host chromosome and therefore does not replicate during cell division. Thus, each new generation of cells further reduces the proportion of cells expressing the transgene in the target tissue, resulting in a reduction or elimination of therapeutic benefit over time.
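The dilution effect described above can be illustrated with a deliberately simplified model. The sketch below is not taken from this disclosure. It assumes that every cell in the tissue divides once per generation, that each expressing cell carries a single non-replicating episome inherited by only one daughter cell, and that an integrated transgene is copied to both daughters. The function names and starting values are hypothetical.

```python
def episomal_fraction(initial_fraction: float, divisions: int) -> float:
    """Fraction of cells still carrying the episome after `divisions` rounds.

    Simplifying assumption: the episome is not replicated, so the number of
    episome-bearing cells stays constant while the population doubles.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return initial_fraction / (2 ** divisions)


def integrated_fraction(initial_fraction: float, divisions: int) -> float:
    """Fraction of cells carrying an integrated transgene.

    Simplifying assumption: the integrated copy replicates with the
    chromosome, so the fraction is unchanged by cell division.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return initial_fraction


# Hypothetical starting point: 80% of cells transduced, three doublings.
after_three = episomal_fraction(0.8, 3)
```

Real tissues divide asynchronously and cells may carry several episomes, so this sketch only conveys the direction of the effect, not its magnitude.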

The insertion site cannot be controlled. Although some gene therapies using virus-mediated insertion may provide long-term benefit because the gene is inserted into the host chromosome, the location at which the gene is inserted cannot be controlled. This creates a risk of disrupting an essential gene or of inserting at a location that promotes undesirable effects (e.g., tumor formation). For this reason, these integrating gene therapy methods are mainly limited to ex vivo approaches, in which cells are treated outside the body and then reintroduced.

The use of exogenous promoters increases the risk of tumor formation. A common feature of both gene therapy approaches is the introduction of the transgene into the cell along with an exogenous promoter. The promoter is required to initiate transcription and amplification of DNA into messenger RNA or mRNA that will ultimately be translated into protein. High levels of therapeutic protein expression from gene therapy transgenes require powerful engineered promoters. Although these promoters are essential for protein expression, previous studies by others in animal models have shown that non-specific integration of gene therapy vectors can lead to a significant increase in tumor progression. The strength of the promoter plays a critical role in increasing the progression of these tumors. Thus, attempts to drive high levels of expression with strong promoters may have long-term deleterious consequences.

A. Gene editing

Gene editing is the deletion, alteration, or augmentation of abnormal genes by introducing breaks in cellular DNA using exogenously delivered gene editing machinery. Most current gene editing methods are limited in their efficacy, in part because of a high rate of undesired on-target and off-target modifications and in part because of low gene correction efficiency, which results from the cell's rapid attempts to repair the introduced DNA breaks. The emphasis of gene editing is currently on disabling dysfunctional genes or correcting or skipping individual deleterious mutations in genes. Because of the number of possible mutations, none of these methods addresses the entire population of mutations within a particular genetic disease, as would be addressed by inserting a complete correction gene.

Unlike gene therapy methods, gene editing allows repaired genetic regions to be transmitted to new generation cells through normal cell division. In addition, the regulation mechanism of the cells themselves can be used to express the desired protein. Traditional gene editing methods are nuclease-based and use nucleases derived from bacteria to cleave DNA at specific locations in order to cause deletions, make changes, or apply correction sequences to DNA in vivo.

Once the nuclease cleaves the DNA, traditional gene editing techniques modify the DNA using two pathways: homology directed repair, or HDR, and non-homologous end joining, or NHEJ. HDR involves the highly precise incorporation of a correct DNA sequence complementary to the site of DNA damage. HDR has key advantages because it can repair DNA with high fidelity and avoids the introduction of unwanted mutations at correction sites. NHEJ is a less selective, more error-prone process that rapidly joins the ends of the broken DNA, resulting in a higher frequency of insertions or deletions at the site of the break.

1. Nuclease-based gene editing

Nuclease-based gene editing uses nucleases, enzymes that cleave DNA and that were either engineered or originally identified in bacteria. Nuclease-based gene editing is a two-step process. First, an exogenous nuclease capable of cleaving one or both strands of double-stranded DNA is directed to the desired site by a synthetic guide RNA and makes a specific cut. After the nuclease makes the desired cut or cuts, the cell's DNA repair machinery is activated and completes the editing process through NHEJ or, less commonly, HDR.

NHEJ can occur in the absence of a DNA template for the cell to copy when repairing a DNA cut. This is the primary or default pathway by which cells repair double-stranded breaks. The NHEJ mechanism can be used to introduce small insertions or deletions, called indels, resulting in a knockout of gene function. Because of its mode of repair, NHEJ produces insertions and deletions in DNA and can also lead to the introduction of off-target, undesirable mutations, including chromosomal aberrations.

Nuclease-mediated HDR occurs when a nuclease, a guide RNA, and a DNA template similar to the cleaved DNA are co-delivered. The cell can then use the template to construct repair DNA, replacing a defective gene sequence with a corrected gene sequence. We believe that the HDR mechanism is the preferred repair pathway when inserting correction sequences using nuclease-based methods because of its high fidelity. However, most repair of the genome after cleavage with nucleases continues to use the NHEJ mechanism. The more frequent NHEJ repair pathway is likely to cause unwanted mutations at the cleavage site, thereby limiting the range of diseases that any nuclease-based gene editing method can target at this time.

Traditional gene editing uses one of three nuclease-based methods: transcription activator-like effector nucleases, or TALENs; clustered regularly interspaced short palindromic repeats-associated protein 9, or CRISPR/Cas9; and zinc finger nucleases, or ZFNs. While these methods have contributed to significant advances in research and product development, they have inherent limitations.

2. Limitations of nuclease-based gene editing

Nuclease-based gene editing methods are limited by their use of bacterial nucleases to cleave DNA and by their reliance on exogenous promoters for transgene expression. These limitations include:

Nucleases cause on-target and off-target mutations. Because of the error-prone NHEJ process and potential off-target nuclease activity, gene editing techniques are known to cause genotoxicity, including chromosomal alterations.

Delivery of gene editing components to cells is complex. Gene editing requires the simultaneous delivery of multiple components into the same cell. This is technically challenging and currently requires the use of multiple vectors.

Nucleases of bacterial origin are immunogenic. Since the nucleases used in known gene editing methods are mostly derived from bacteria, they have a higher immunogenic potential, which in turn limits their usefulness.

Due to these limitations, gene editing is mainly limited to ex vivo applications of cells, such as hematopoietic cells.

GENERIDE™ technology platform

GENERIDE™ is a novel AAV-based, nuclease-free genome editing technology that allows precise insertion of therapeutic transgenes into the genome by homologous recombination. GENERIDE™ provides durable transgene expression regardless of cell proliferation and tissue growth, and GENERIDE™-corrected hepatocytes exhibit selective expansion in the presence of intrinsic liver injury due to a genetic defect (e.g., HT1 due to FAH deficiency). Without wishing to be bound by any particular theory, GENERIDE™ is contemplated as a genome editing technology that utilizes homologous recombination, or HR (i.e., a naturally occurring DNA repair process that maintains genome fidelity). In some embodiments, by using HR, GENERIDE™ allows for insertion of transgenes into specific targeted genomic locations without the use of exogenous nucleases (enzymes engineered to cleave DNA). GENERIDE™ targeted transgene integration is designed to drive high levels of tissue-specific gene expression using endogenous promoters at these targeted locations, without the detrimental issues associated with the use of exogenous promoters.

The GENERIDE™ technology is designed to precisely integrate the correction gene into the patient's genome to provide a stable therapeutic effect. Because GENERIDE™ is designed to have this durable therapeutic effect, it can be used for rare liver conditions in pediatric patients, where it is critical to provide treatment early in the life of the patient in order to avoid irreversible disease pathology. In some embodiments described herein, compositions comprising GENERIDE™ constructs are useful for treating hereditary tyrosinemia type 1, or HT1, a life-threatening disease that presents immediately at birth.

The GENERIDE™ platform technology has the potential to overcome some of the key limitations of traditional gene therapies and known gene editing methods, and is therefore well positioned for the treatment of genetic diseases, particularly in pediatric patients. In some embodiments, GENERIDE™ uses AAV vectors to deliver genes into the nucleus. It then uses HR to stably integrate the correction gene into the recipient's genome at a location regulated by an endogenous promoter, thereby creating the potential for lifelong protein production even as the body grows and changes over time, which is not feasible with known AAV gene therapies.

GENERIDE™ provides several key advantages over gene therapy and gene editing techniques that rely on exogenous promoters and nucleases. By utilizing the naturally occurring process of HR, GENERIDE™ does not face the same challenges associated with gene editing methods that rely on engineered bacterial nucleases. The use of these enzymes is associated with a significantly increased risk of undesired and potentially dangerous modifications in the host cell DNA, which may lead to an increased risk of tumor formation. Furthermore, in contrast to known gene therapies, GENERIDE™ aims to integrate the correction gene precisely, site-specifically, stably, and permanently into the chromosome of the host cell. In preclinical animal studies using GENERIDE™ constructs, integration of the correction gene at specific locations in the genome was observed. Thus, in some embodiments, the methods and compositions of the present disclosure (e.g., those comprising GENERIDE™ constructs) provide a more durable approach than gene therapy techniques that do not integrate into the genome and lose their effect as cells divide. These benefits make GENERIDE™ well suited to the treatment of genetic diseases, particularly in pediatric patients.

The modular methods disclosed herein can be used to allow GENERIDE™ to deliver robust tissue-specific gene expression that will be reproducible between different therapeutic agents delivered to the same tissue. In some embodiments, this approach allows common manufacturing processes and analytics to be used across different GENERIDE™ product candidates, and may shorten the development of therapeutic programs.

Previous work on non-disruptive gene targeting is described in WO 2013/158309, which is incorporated herein by reference. Previous work on genome editing without nucleases is described in WO 2015/143177, which is incorporated herein by reference.

B. Genome editing using GENERIDE™: mechanism and attributes

In some embodiments, genome editing using the GENERIDE™ platform differs from gene editing in that it uses HR to deliver a correction gene to one specific location in the genome. In some embodiments, GENERIDE™ inserts the correction gene in a precise manner, resulting in site-specific integration in the genome. In some embodiments, GENERIDE™ does not require the use of exogenous nucleases or promoters; instead, it utilizes the existing machinery of the cell to integrate the therapeutic transgene and initiate its transcription.

In some embodiments, GENERIDE™ comprises at least three components, each contributing to the potential benefits of the GENERIDE™ method. In some embodiments, the compositions and methods of the present disclosure comprise: homology arms, a transgene, and a nucleic acid that facilitates the production of two independent gene products. In some embodiments, the compositions and methods of the present disclosure comprise a first nucleic acid sequence encoding a transgene. In some embodiments, the compositions and methods of the present disclosure comprise a second nucleic acid sequence that facilitates the production of two independent gene products (e.g., a sequence encoding a 2A peptide). In some embodiments, the disclosure provides an expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence as described herein.

In some embodiments, the second nucleic acid sequence comprises: a nucleic acid sequence encoding a 2A peptide; a nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES); a nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; and/or a nucleic acid sequence encoding a splice donor and a splice acceptor. In some embodiments, the compositions and methods of the present disclosure comprise a polynucleotide cassette comprising an expression cassette comprising the first nucleic acid and the second nucleic acid. In some embodiments, the compositions and methods of the present disclosure comprise a third nucleic acid sequence comprising a sequence substantially homologous to the genomic sequence. In some embodiments, the compositions and methods of the present disclosure comprise a fourth nucleic acid sequence comprising a sequence substantially homologous to the genomic sequence. In some embodiments, the third nucleic acid sequence is located 5 'of the expression cassette and comprises a sequence that is substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell. In some embodiments, the fourth nucleic acid sequence is located 3 'of the expression cassette and comprises a sequence that is substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell.
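The arrangement of the four nucleic acid sequences described above can be sketched as a simple data structure. This is an illustrative sketch only. The 5'-to-3' order used here (5' homology arm, 2A coding sequence, transgene, 3' homology arm) follows the construct recited in the abstract; the class name, field names, and placeholder sequences are hypothetical and do not correspond to any actual sequence of this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationConstruct:
    """Hypothetical container for the four sequence components."""
    five_prime_arm: str    # third nucleic acid sequence (5' homology arm)
    two_a_coding: str      # second nucleic acid sequence (e.g., 2A coding)
    transgene: str         # first nucleic acid sequence (transgene)
    three_prime_arm: str   # fourth nucleic acid sequence (3' homology arm)

    def expression_cassette(self) -> str:
        """Expression cassette: second sequence followed by first sequence."""
        return self.two_a_coding + self.transgene

    def assemble(self) -> str:
        """Full 5'-to-3' polynucleotide: arm, cassette, arm."""
        return (self.five_prime_arm
                + self.expression_cassette()
                + self.three_prime_arm)


# Placeholder strings stand in for real sequences (assumed, not from source).
construct = IntegrationConstruct(
    five_prime_arm="AAAA",
    two_a_coding="CC",
    transgene="GGG",
    three_prime_arm="TTTT",
)
full_sequence = construct.assemble()
```

Keeping the expression cassette separate from the arms mirrors the modular description in the text, in which the homology arms determine the integration site and the cassette carries the payload.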

The homology arm is made up of hundreds of nucleotides.

In some embodiments, the methods and compositions of the present disclosure comprise flanking sequences, referred to as homology arms. In some embodiments, the homology arms direct site-specific integration (also referred to herein as facilitating integration) and limit off-target insertion of the construct. In some embodiments, the third and fourth nucleic acid sequences comprise homology arms. In some embodiments, each homology arm is hundreds of nucleotides long, as opposed to the guide sequences used in CRISPR/Cas9, which are only tens of base pairs long. In some embodiments, this increased length may facilitate improved precision and site-specific integration. In some embodiments, the homology arms of GENERIDE™ direct transgene integration immediately after a highly expressed gene. In some embodiments, integration of the transgene immediately after a highly expressed gene results in high levels of expression without the need to introduce an exogenous promoter.

In some embodiments, the third or fourth nucleic acid is between 200-3000, 200-350, 250-400, 300-450, 350-500, 500-750, 600-850, 700-950, 800-1050, 900-1150, 1000-1250, 1100-1350, 1200-1450, 1300-1550, 1400-1650, 1500-1750, 1600-1850, 1700-1950, 1800-2050, 1900-2150, 2000-2250, 2100-2350, 2200-2450, 2300-2550, 2400-2650, 2500-2750, 2600-2850, 2700-2950 or 2800-3000 nucleotides in length. In some embodiments, the third or fourth nucleic acid is about 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, or 1700 nucleotides in length. In some embodiments, the fourth nucleic acid is 1000 nucleotides in length. In some embodiments, the third nucleic acid is 1600 nucleotides in length.

In some embodiments, the homology arm contains at least 70% homology to the target locus. In some embodiments, the homology arm contains at least 80% homology to the target locus. In some embodiments, the homology arm contains at least 90% homology to the target locus. In some embodiments, the homology arm contains at least 95% homology to the target locus. In some embodiments, the homology arm contains at least 99% homology to the target locus. In some embodiments, the homology arm contains 100% homology to the target locus.

In some embodiments, the homology arms have the same length (also referred to as balanced homology arms or uniform homology arms). In some embodiments, the homology arms have different lengths (also referred to as unbalanced homology arms or non-uniform homology arms). In some embodiments, compositions comprising unbalanced homology arms of different lengths provide improved effects (e.g., increased target site integration rate) compared to a reference sequence or balanced homology arms. In some embodiments, compositions comprising homology arms of different lengths (wherein each homology arm has at least a certain length) provide improved effects (e.g., increased target site integration rate) compared to a reference sequence (e.g., compositions comprising homology arms of the same length).

In some embodiments, each homology arm is greater than 50nt in length. In some embodiments, each homology arm is greater than 100nt in length. In some embodiments, each homology arm is greater than 400nt in length. In some embodiments, each homology arm is at least 750nt in length. In some embodiments, each homology arm is at least 1000nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1000nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1100nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1200nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1300nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1400nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1500nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1600nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1700nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1800nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 1900nt in length. In some embodiments, one homology arm is at least 750nt in length and the other homology arm is at least 2000nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1100nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1200nt in length. 
In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1300nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1400nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1500nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1600nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1700nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1800nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 1900nt in length. In some embodiments, one homology arm is at least 1000nt in length and the other homology arm is at least 2000nt in length. In some embodiments, one homology arm is at least 1300nt in length and the other homology arm is at least 1400nt in length. In some embodiments, one homology arm is at least 1600nt in length and the other homology arm is at least 1000nt in length. In some embodiments, one homology arm is at least 1250nt in length and the other homology arm is at least 1250nt in length. In some embodiments, one homology arm is at least 400nt in length and the other homology arm is at least 800nt in length. In some embodiments, one homology arm is at least 600nt in length and the other homology arm is at least 600nt in length.
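The balanced/unbalanced distinction and the per-arm minimum lengths recited above can be illustrated with a short Python sketch. The function name `classify_arms` and its `min_each` parameter are hypothetical names chosen for illustration; they are not part of the disclosure.

```python
def classify_arms(len_5p: int, len_3p: int, min_each: int = 750) -> dict:
    """Classify a pair of homology arm lengths given in nucleotides.

    min_each is an illustrative per-arm minimum; 750 nt is one of the
    thresholds recited above, not a required value.
    """
    if len_5p > len_3p:
        longer = "5'"
    elif len_3p > len_5p:
        longer = "3'"
    else:
        longer = None  # balanced arms: neither arm is longer
    return {
        "balanced": len_5p == len_3p,
        "longer_arm": longer,
        "meets_minimum": min(len_5p, len_3p) >= min_each,
    }


# A 1600 nt arm paired with a 1000 nt arm is an unbalanced pair.
print(classify_arms(1600, 1000))
# A 1250 nt / 1250 nt pair is balanced.
print(classify_arms(1250, 1250))
```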

In some embodiments, the 5' homology arm is longer than the 3' homology arm. In some embodiments, the 3' homology arm is longer than the 5' homology arm. For example, in some embodiments, the 5' homology arm is about 1600nt in length and the 3' homology arm is about 1000nt in length. In some embodiments, the 5' homology arm is about 1000nt in length and the 3' homology arm is about 1600nt in length. In some embodiments, viral vectors comprising homology arms provide improved effects (e.g., increased target site integration rate) compared to a suitable reference sequence (e.g., viral vectors lacking homology arms). In some embodiments, a viral vector comprising homology arms provides a target site integration rate of 0.01% or more (e.g., 0.05% or more, 0.1% or more, 0.2% or more, 0.3% or more, 0.4% or more, 0.5% or more, 0.6% or more, 0.7% or more, 0.8% or more, 0.9% or more, 1% or more, 1.5% or more, 2% or more, 5% or more, 10% or more, 20% or more, 30% or more). In some embodiments, a viral vector comprising a homology arm provides for increased target site integration rate over time. In some embodiments, the target site integration rate increases over time relative to an initial measurement of target site integration. In some embodiments, the target site integration rate over time is at least 1.5 times (e.g., 1.5 times, 2 times, 3 times, 4 times, 5 times, 10 times, 20 times, 30 times, 40 times, 50 times, 60 times, 70 times, 80 times, 90 times, 100 times, 200 times) the initial measurement of target site integration. In some embodiments, the target site integration rate is measured after one or more days. In some embodiments, the target site integration rate is measured after one or more weeks. In some embodiments, the target site integration rate is measured after one month or more. In some embodiments, the target site integration rate is measured after one or more years.

In some embodiments, viral vectors comprising homology arms of different lengths provide improved effects (e.g., increased target site integration rate) relative to a reference sequence (e.g., viral vectors having homology arms of the same length, viral vectors having at least one homology arm of less than 500 nt). In some embodiments, a viral vector comprising homology arms of different lengths provides at least 1.1-fold, at least 1.2-fold, at least 1.3-fold, at least 1.4-fold, at least 1.5-fold, at least 1.6-fold, at least 1.7-fold, at least 1.8-fold, at least 1.9-fold, at least 2.0-fold, at least 2.5-fold, at least 3.0-fold, at least 3.5-fold, or at least 4.0-fold improved editing activity relative to a reference composition (e.g., a viral vector having homology arms of the same length, a viral vector having at least one homology arm of less than 500 nt).

In some embodiments, viral vectors comprising homology arms of different lengths provide a target site integration of 0.01% or more (e.g., 0.05% or more, 0.1% or more, 0.2% or more, 0.3% or more, 0.4% or more, 0.5% or more, 0.6% or more, 0.7% or more, 0.8% or more, 0.9% or more, 1% or more, 1.5% or more, 2% or more, 5% or more, 10% or more, 20% or more, 30% or more). In some embodiments, viral vectors comprising homology arms of different lengths provide increased target site integration rates over time. In some embodiments, the target site integration rate increases over time relative to an initial measurement of target site integration. In some embodiments, the target site integration rate over time is at least 1.5 times (e.g., 1.5 times, 2 times, 3 times, 4 times, 5 times, 10 times, 20 times, 30 times, 40 times, 50 times, 60 times, 70 times, 80 times, 90 times, 100 times, 200 times) the initial measurement of target site integration.
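The comparison of a later integration-rate measurement against an initial one reduces to a simple ratio. The following Python sketch shows that arithmetic; the function name `fold_change` and the numeric values are hypothetical placeholders, not data from the disclosure.

```python
def fold_change(initial_rate: float, later_rate: float) -> float:
    """Return later_rate / initial_rate for two integration-rate measurements."""
    if initial_rate <= 0:
        raise ValueError("initial_rate must be positive")
    return later_rate / initial_rate


# Hypothetical measurements, expressed as percent target site integration.
initial, later = 0.5, 2.0
fc = fold_change(initial, later)
print(f"{fc:.1f}-fold relative to the initial measurement")
print("meets the 1.5-fold threshold:", fc >= 1.5)
```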

In some embodiments, viral vectors comprising homology arms of different lengths may provide improved gene editing in a species or model system of a species (e.g., mouse, human, or model thereof). In some embodiments, the viral vectors may comprise different combinations of homology arm lengths when optimized for expression in a particular species or model system of a particular species (e.g., mouse, human, or model thereof). In some embodiments, a viral vector comprising a particular combination of homology arm lengths may provide improved gene editing in a species or model system of a species (e.g., human, humanized mouse model) as compared to a second species or model system of a second species (e.g., mouse, pure mouse model). In some embodiments, a viral vector comprising a particular combination of homology arm lengths may be optimized for high level gene editing in a species or model of a species (e.g., human, humanized mouse model) as compared to a model system of a second species or second species (e.g., mouse, pure mouse model).

In some embodiments, the homology arms direct integration of the transgene immediately downstream of a highly expressed endogenous gene. In some embodiments, the homology arms direct integration of the transgene without disrupting endogenous gene expression (non-destructive integration).

In some embodiments, one or more homology arm sequences may have at least 80%, 85%, 90%, 95%, 99% or 100% identity with a corresponding wild-type reference nucleotide sequence (e.g., a wild-type genomic sequence). In some embodiments, the one or more homology arm sequences may be or comprise sequences having at least 80%, 85%, 90%, 95%, 99% or 100% identity to a portion of a corresponding wild-type reference nucleotide sequence (e.g., wild-type genomic sequence).
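As a rough illustration of what a percent-identity threshold means, the Python sketch below counts matching positions between two sequences that are assumed to be already aligned and of equal length. This is a simplification: identity figures in practice are normally computed from a gapped alignment produced by a dedicated tool, and the function name `percent_identity` and the variant sequence are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.lower(), seq_b.lower()))
    return 100.0 * matches / len(seq_a)


reference = "tttctagatgtaaataatta"  # first 20 nt of SEQ ID NO: 1
variant = "tttctagatgtaaataattg"    # hypothetical variant, last position changed
identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity; at least 90%: {identity >= 90}")
```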

In some embodiments, the viral vectors provided herein may comprise a 5' homology arm and a 3' homology arm designed to target an albumin locus. In some embodiments, viral vectors provided herein may comprise a 5' homology arm sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 1 and a 3' homology arm sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 2. In some embodiments, the viral vector comprises a 5' homology arm comprising SEQ ID NO: 1 and a 3' homology arm comprising SEQ ID NO: 2. In some embodiments, the viral vector comprises a 5' homology arm consisting of SEQ ID NO: 1 and a 3' homology arm consisting of SEQ ID NO: 2.

In some embodiments, the viral vectors provided herein may comprise a 5' homology arm and a 3' homology arm designed to target an albumin locus. In some embodiments, viral vectors provided herein may comprise a 5' homology arm sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 3 and a 3' homology arm sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 2. In some embodiments, the viral vector comprises a 5' homology arm comprising SEQ ID NO: 3 and a 3' homology arm comprising SEQ ID NO: 2. In some embodiments, the viral vector comprises a 5' homology arm consisting of SEQ ID NO: 3 and a 3' homology arm consisting of SEQ ID NO: 2.

In some embodiments, the viral vectors provided herein may comprise a 5' homology arm and a 3' homology arm designed to target an albumin locus. In some embodiments, viral vectors provided herein may comprise a 5' homology arm sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 4 and a 3' homology arm sequence having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99% or 100% sequence identity to SEQ ID NO: 5. In some embodiments, the viral vector comprises a 5' homology arm comprising SEQ ID NO: 4 and a 3' homology arm comprising SEQ ID NO: 5. In some embodiments, the viral vector comprises a 5' homology arm consisting of SEQ ID NO: 4 and a 3' homology arm consisting of SEQ ID NO: 5.

Exemplary homology arm sequences are provided below:

Human albumin 1.6kb 5' homology arm (SEQ ID NO: 1)

tttctagatgtaaataattattttaagtttgccctatggtggccccacacatgagacaaacccccaagatgtgacttttgagaatgagacttggataaaaaacatgtagaaatgcaagccctgaagctcaactccctattgctatcacaggggttataattgcataaaatttagctatagaaagttgctgtcatctcttgtgggctgtaatcatcgtctaggcttaagagtaatattgcaaaacctgtcatgcccacacaaatctctccctggcattgttgtctttgcagatgtcagtgaaagagaaccagcagctcccatgagtttggatagccttattttctatagcctccccactattagctttgaagggagcaaagtttaagaaccaaatataaagtttctcatctttatagatgagaaaaattttaaataaagtccaagataattaaatttttaaggatcatttttagctctttaatagcaataaaactcaatatgacataatatggcacttccaaaatctgaataatatataattgcaatgacatacttcttttcagagatttactgaaaagaaatttgttgacactacataacgtgatgagtggtttatactgattgtttcagttggtcttcccaccaactccatgaaagtggattttattatcctcatcatgcagatgagaatattgagacttatagcggtatgcctgagccccaaagtactcagagttgcctggctccaagatttataatcttaaatgatgggactaccatccttactctctccatttttctatacgtgagtaatgttttttctgttttttttttttctttttccattcaaactcagtgcacttgttgagctcgtgaaacacaagcccaaggcaacaaaagagcaactgaaagctgttatggatgatttcgcagcttttgtagagaagtgctgcaaggctgacgataaggagacctgctttgccgaggaggtactacagttctcttcattttaatatgtccagtattcatttttgcatgtttggttaggctagggcttagggatttatatatcaaaggaggctttgtacatgtgggacagggatcttattttacaaacaattgtcttacaaaatgaataaaacagcactttgtttttatctcctgctctattgtgccatactgttaaatgtttataatgcctgttctgtttccaaatttgtgatgcttatgaatattaataggaatatttgtaaggcctgaaatattttgatcatgaaatcaaaacattaatttatttaaacatttacttgaaatgtggtggtttgtgatttagttgattttataggctagtgggagaatttacattcaaatgtctaaatcacttaaaattgccctttatggcctgacagtaacttttttttattcatttggggacaactatgtccgtgagcttccgtccagagattatagtagtaaattgtaattaaaggatatgatgcacgtgaaatcactttgcaatcatcaatagcttcataaatgttaattttgtatcctaatagtaatgctaatattttcctaacatctgtcatgtctttgtgttcagggtaaaaaacttgttgctgcaagtcaagctgccttaggctta

Human albumin 1kb 3' homology arm (SEQ ID NO: 2)

Taacatcacatttaaaagcatctcaggtaactatattttgaattttttaaaaaagtaactataatagttattattaaaatagcaaagattgaccatttccaagagccatatagaccagcaccgaccactattctaaactatttatgtatgtaaatattagcttttaaaattctcaaaatagttgctgagttgggaaccactattatttctattttgtagatgagaaaatgaagataaacatcaaagcatagattaagtaattttccaaagggtcaaaattcaaaattgaaaccaaagtttcagtgttgcccattgtcctgttctgacttatatgatgcggtacacagagccatccaagtaagtgatggctcagcagtggaatactctgggaattaggctgaaccacatgaaagagtgctttatagggcaaaaacagttgaatatcagtgatttcacatggttcaacctaatagttcaactcatcctttccattggagaatatgatggatctaccttctgtgaactttatagtgaagaatctgctattacatttccaatttgtcaacatgctgagctttaataggacttatcttcttatgacaacatttattggtgtgtccccttgcctagcccaacagaagaattcagcagccgtaagtctaggacaggcttaaattgttttcactggtgtaaattgcagaaagatgatctaagtaatttggcatttattttaataggtttgaaaaacacatgccattttacaaataagacttatatttgtccttttgtttttcagcctaccatgagaataagagaaagaaaatgaagatcaaaagcttattcatctgtttttctttttcgttggtgtaaagccaacaccctgtctaaaaaacataaatttctttaatcattttgcctcttttctctgtgcttcaattaataaaaaatggaaagaatctaatagagtggtacagcactgttatttttcaaagatgtgttg

Human albumin 1kb 5' homology arm (SEQ ID NO: 3)

Actccatgaaagtggattttattatcctcatcatgcagatgagaatattgagacttatagcggtatgcctgagccccaaagtactcagagttgcctggctccaagatttataatcttaaatgatgggactaccatccttactctctccatttttctatacgtgagtaatgttttttctgttttttttttttctttttccattcaaactcagtgcacttgttgagctcgtgaaacacaagcccaaggcaacaaaagagcaactgaaagctgttatggatgatttcgcagcttttgtagagaagtgctgcaaggctgacgataaggagacctgctttgccgaggaggtactacagttctcttcattttaatatgtccagtattcatttttgcatgtttggttaggctagggcttagggatttatatatcaaaggaggctttgtacatgtgggacagggatcttattttacaaacaattgtcttacaaaatgaataaaacagcactttgtttttatctcctgctctattgtgccatactgttaaatgtttataatgcctgttctgtttccaaatttgtgatgcttatgaatattaataggaatatttgtaaggcctgaaatattttgatcatgaaatcaaaacattaatttatttaaacatttacttgaaatgtggtggtttgtgatttagttgattttataggctagtgggagaatttacattcaaatgtctaaatcacttaaaattgccctttatggcctgacagtaacttttttttattcatttggggacaactatgtccgtgagcttccgtccagagattatagtagtaaattgtaattaaaggatatgatgcacgtgaaatcactttgcaatcatcaatagcttcataaatgttaattttgtatcctaatagtaatgctaatattttcctaacatctgtcatgtctttgtgttcagggtaaaaaacttgttgctgcaagtcaagctgccttaggctta

Mouse albumin 1kb 5' homology arm (SEQ ID NO: 4)

Ctgaaactagacaaaacccgtgtgactggcatcgattattctatttgatctagctagtcctagcaaagtgacaactgctactcccctcctacacagccaagattcctaagttggcagtggcatgcttaatcctcaaagccaaagttacttggctccaagatttatagccttaaactgtggcctcacattccttcctatcttactttcctgcactggggtaaatgtctccttgctcttcttgctttctgtcctactgcagggctcttgctgagctggtgaagcacaagcccaaggctacagcggagcaactgaagactgtcatggatgactttgcacagttcctggatacatgttgcaaggctgctgacaaggacacctgcttctcgactgaggtcagaaacgtttttgcattttgacgatgttcagtttccattttctgtgcacgtggtcaggtgtagctctctggaactcacacactgaataactccaccaatctagatgttgttctctacgtaactgtaatagaaactgacttacgtagcttttaatttttattttctgccacactgctgcctattaaatacctattatcactatttggtttcaaatttgtgacacagaagagcatagttagaaatacttgcaaagcctagaatcatgaactcatttaaaccttgccctgaaatgtttctttttgaattgagttattttacacatgaatggacagttaccattatatatctgaatcatttcacattccctcccatggcctaacaacagtttatcttcttattttgggcacaacagatgtcagagagcctgctttaggaattctaagtagaactgtaattaagcaatgcaaggcacgtacgtttactatgtcattgcctatggctatgaagtgcaaatcctaacagtcctgctaatacttttctaacatccatcatttctttgttttcagggtccaaaccttgtcactagatgcaaagacgccttagcc

Mouse albumin 1.6kb 3' homology arm (SEQ ID NO: 5)

taaacacatcacaaccacaaccttctcaggtaactatacttgggacttaaaaaacataatcataatcatttttcctaaaacgatcaagactgataaccatttgacaagagccatacagacaagcaccagctggcactcttaggtcttcacgtatggtcatcagtttgggttccatttgtagataagaaactgaacatataaaggtctaggttaatgcaatttacacaaaaggagaccaaaccagggagagaaggaaccaaaattaaaaattcaaaccagagcaaaggagttagccctggttttgctctgacttacatgaaccactatgtggagtcctccatgttagcctagtcaagcttatcctctggatgaagttgaaaccatatgaaggaatatttggggggtgggtcaaaacagttgtgtatcaatgattccatgtggtttgacccaatcattctgtgaatccatttcaacagaagatacaacgggttctgtttcataataagtgatccacttccaaatttctgatgtgccccatgctaagctttaacagaatttatcttcttatgacaaagcagcctcctttgaaaatatagccaactgcacacagctatgttgatcaattttgtttataatcttgcagaagagaattttttaaaatagggcaataatggaaggctttggcaaaaaaattgtttctccatatgaaaacaaaaaacttatttttttattcaagcaaagaacctatagacataaggctatttcaaaattatttcagttttagaaagaattgaaagttttgtagcattctgagaagacagctttcatttgtaatcataggtaatatgtaggtcctcagaaatggtgagacccctgactttgacacttggggactctgagggaccagtgatgaagagggcacaacttatatcacacatgcacgagttggggtgagagggtgtcacaacatctatcagtgtgtcatctgcccaccaagtaacagatgtcagctaagactaggtcatgtgtaggctgtctacaccagtgaaaatcgcaaaaagaatctaagaaattccacatttctagaaaataggtttggaaaccgtattccattttacaaaggacacttacatttctctttttgttttccaggctaccctgagaaaaaaagacatgaagactcaggactcatcttttctgttggtgtaaaatcaacaccctaaggaacacaaatttctttaaacatttgacttcttgtctctgtgctgcaattaataaaaaatggaaagaatctactctgtggttcagaactctatcttccaaaggcgcgcttcaccctagcagcctctttggctcagaggaatccctgcctttcctcccttcatctcagcagagaatgtagttccacatgggcaacacaatgaaaataaacgttaatactctcccatcttatgggtggtgaccctagaaaccaatacttcaacattacgagaattctgaatgagagactaaaagcttatgaactgtggctttcctttgtcagtgggactctaagaatgagttggggacaaaagagataggaatggctttaaaggtgactagttgaactgataaagtaaatgaactgaggaaaaaaaatatcactcaa

Measurement of target site integration

As described in the present application, one of the problems with the traditional use of nucleases to introduce nucleic acid material into cells is the significant opportunity for off-target integration (e.g., of transgenes). It is therefore important to verify proper integration by one or more specific targeting assays, as described below.

According to various embodiments, the rate of integration may be measured at any of a plurality of time points. In some embodiments, the target site integration rate is measured after one or more days. In some embodiments, the target site integration rate is measured after one or more weeks. In some embodiments, the target site integration rate is measured after one month or more. In some embodiments, the target site integration rate is measured after one or more years. In some embodiments, the target site integration rate is measured by assessing one or more biomarkers (e.g., biomarkers comprising 2A peptide). In some embodiments, the target site integration rate is measured by evaluating one or more isolated nucleic acids (e.g., mRNA, gDNA). In some embodiments, the target site integration rate is measured by assessing gene expression (e.g., by immunohistochemical staining).

Table 1: exemplary methods for assessing target site integration

Transgenes

In some embodiments, the methods and compositions of the present disclosure provide one or more transgenes (e.g., FAH). In some embodiments, the transgene is selected for integration into the genome. In some embodiments, the transgene is a functional version of a disease-associated gene present in the subject cell. In some embodiments, the combined size of the transgenes and homology arms can be optimized to increase the likelihood that these transgenes have the appropriate sequence length for efficient packaging in a delivery vehicle, which can increase the likelihood that the transgenes will ultimately be properly delivered in the patient.
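The packaging consideration can be pictured as a length budget. The Python sketch below sums the elements of a construct arranged as 5' homology arm, 2A coding sequence, transgene, 3' homology arm, and compares the total with a capacity limit. The capacity value (about 4.7 kb, a figure commonly cited for AAV vectors) and the transgene length are assumptions introduced here for illustration only; the sketch also ignores other vector elements such as ITRs.

```python
ASSUMED_CAPACITY_NT = 4700  # assumption: commonly cited approximate AAV packaging limit


def construct_length(arm_5p: int, linker_2a: int, transgene: int, arm_3p: int) -> int:
    """Total length in nt of 5' arm + 2A coding sequence + transgene + 3' arm."""
    return arm_5p + linker_2a + transgene + arm_3p


# 66 nt is the length of the P2A coding sequence in SEQ ID NO: 6.
# 1260 nt is a hypothetical placeholder transgene length, not a value from the disclosure.
total = construct_length(arm_5p=1000, linker_2a=66, transgene=1260, arm_3p=1600)
print(f"construct length: {total} nt")
print("within assumed capacity:", total <= ASSUMED_CAPACITY_NT)
```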

In some embodiments, the nucleotide sequence encoding the transgene is codon optimized. In some embodiments, the nucleotide sequence encoding the transgene is codon optimized for a cell type (e.g., mammalian, insect, bacterial, fungal, etc.). In some embodiments, the nucleotide sequence encoding the transgene is codon optimized for human cells. In some embodiments, the nucleotide sequence encoding the transgene is codon optimized for human cells of a particular tissue type (e.g., liver, muscle, CNS, lung).

In certain embodiments, the nucleotide sequence encoding the transgene may be codon optimized to have less than 100% nucleotide homology to a reference nucleotide sequence (e.g., wild-type gene sequence). In certain embodiments, the nucleotide homology between the codon optimized nucleotide sequence encoding the transgene and the reference nucleotide sequence is less than 100%, less than 99%, less than 98%, less than 97%, less than 96%, less than 95%, less than 94%, less than 93%, less than 92%, less than 91%, less than 90%, less than 89%, less than 88%, less than 87%, less than 86%, less than 85%, less than 84%, less than 83%, less than 82%, less than 81%, less than 80%, less than 78%, less than 76%, less than 74%, less than 72%, less than 70%, less than 68%, less than 66%, less than 64%, less than 62%, less than 60%, less than 55%, less than 50%, or less than 40%.

In some embodiments, transgene expression in the subject results substantially from integration at the target integration site. In some embodiments, 75% or more (e.g., 80% or more, 85% or more, 90% or more, 95% or more, 99% or more, 99.5% or more) of the total transgene expression in the subject is from transgene integration at the target integration site. In some embodiments, 25% or less (e.g., 20% or less, 15% or less, 10% or less, 5% or less, 1% or less, 0.5% or less, 0.1% or less) of the total transgene expression in the subject is from a source other than transgene integration at the target integration site (e.g., episomal gene expression, integration at a non-target integration site).

In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.). In some embodiments, 75% or more (e.g., 80% or more, 85% or more, 90% or more, 95% or more, 99% or more, 99.5% or more) of the total transgene expression in the subject is from transient expression. In some embodiments, 25% or less (e.g., 20% or less, 15% or less, 10% or less, 5% or less, 1% or less, 0.5% or less, 0.1% or less) of the total transgene expression in the subject is from a source other than transient expression (e.g., integration at a non-target integration site). In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for one or more weeks after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for one month or more following treatment.

In some embodiments, the transgene is transiently expressed (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) in the subject for one or more weeks following treatment at a level comparable to that observed during one or more days following treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for one or more months following treatment at a level comparable to that observed during one or more days following treatment.

In some embodiments, the transgene is transiently expressed (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) in the subject for one or more weeks after treatment at a level that is reduced relative to the level observed during one or more days after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for one or more months after treatment at a level that is reduced relative to the level observed during one or more days after treatment.

In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for no more than one month after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for no more than two months after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for no more than three months after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for no more than four months after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for no more than five months after treatment. In some embodiments, the transgene is transiently expressed in the subject (e.g., episomal gene expression from a plasmid, minicircle DNA, virus, etc.) for no more than six months after treatment.

In some embodiments, the transgene may be or comprise a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity to a corresponding wild-type reference nucleotide sequence (e.g., wild-type gene sequence). In some embodiments, a transgene may be or comprise a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity to a portion of a corresponding wild-type reference nucleotide sequence (e.g., wild-type gene sequence).

Nucleic acids that facilitate the production of two independent gene products

2A peptide for polycistronic expression. In some embodiments, the methods and compositions of the present disclosure comprise a nucleic acid encoding a 2A peptide. Without wishing to be bound by any particular theory, the nucleic acid sequence encoding the 2A peptide may play a number of important roles. In some embodiments, the 2A peptide promotes polycistronic expression, i.e., the production of two different proteins from the same mRNA. This in turn allows the transgene to be integrated in a non-destructive manner and driven by a strong endogenous promoter, by coupling transcription of the transgene to that of a highly expressed target gene in the tissue of interest. In some embodiments, the albumin locus may be used as an integration site for liver-directed therapeutic programs. In some embodiments, the 2A peptide facilitates production of the therapeutic protein in each modified cell at the same level as the endogenous target gene (e.g., albumin) by a process known as ribosome skipping. In some embodiments, the product of the endogenous target gene (e.g., albumin) of the subject is produced normally, except for the addition of a C-terminal tag that serves as a circulating biomarker indicating successful integration and expression of the transgene. In some embodiments, modification of an endogenous target gene (e.g., albumin) has minimal impact on its function. The 2A peptide has been incorporated into other potential therapeutic agents, such as T cell receptor chimeric antigen receptors, or CAR-T (Qasim et al., Sci Transl Med 2017).

Exemplary sequences encoding one or more 2A peptides are provided below:

P2A nucleotide sequence (SEQ ID NO: 6)

GGAAGCGGCGCCACCAATTTCAGCCTGCTGAAACAGGCCGGCGACGTGGAAGAGAACCCTGGCCCT

P2A peptide sequence (SEQ ID NO: 7)

GSGATNFSLLKQAGDVEENPGP
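The relationship between the two P2A sequences above can be checked by translating the nucleotide sequence with the standard genetic code. The Python sketch below does this; the helper name `translate` and the constant names are illustrative, and the codon table is the standard one written in the conventional T, C, A, G order.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


P2A_NT = "GGAAGCGGCGCCACCAATTTCAGCCTGCTGAAACAGGCCGGCGACGTGGAAGAGAACCCTGGCCCT"  # SEQ ID NO: 6
P2A_PEPTIDE = "GSGATNFSLLKQAGDVEENPGP"  # SEQ ID NO: 7

print(translate(P2A_NT))
print("matches SEQ ID NO: 7:", translate(P2A_NT) == P2A_PEPTIDE)
```

The 66 nt coding sequence corresponds to the 22 amino acid peptide, with no stop codon in frame.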

In some embodiments, targeting a particular locus allows for the use of strong endogenous promoters that drive high levels of production to maximize expression of the transgene. In some embodiments, linking the expression of the transgene to a highly expressed endogenous protein (e.g., albumin) may allow the transgene to be expressed at therapeutic levels without the need to add an exogenous promoter or integrate the transgene into most target cells.

This is supported by animal models of MMA, hemophilia B and Crigler-Najjar syndrome. In these models, integration of the transgene into approximately 1% of the cells yields therapeutic benefits. In some embodiments, the strength of the albumin promoter overcomes moderate levels of integration, resulting in potentially therapeutic levels of transgene expression.

Without wishing to be bound by any particular theory, potential advantages of the GENERIDE™ method include the following:

Targeted integration of the transgene into the genome.

Known methods of gene therapy deliver therapeutic transgenes to target cells. One major disadvantage of most of these methods is that once the genes are inside the cell, they do not integrate into the host cell chromosomes and do not benefit from the natural processes that result in DNA replication and segregation during cell division. This is particularly problematic when known gene therapies are introduced early in the life of the patient, as rapid growth of tissue during normal development of the child will result in dilution and eventual loss of the therapeutic benefits associated with the transgene. Non-integrated genes expressed outside the genome on separate DNA strands are called episomes. This episomal expression may be effective in the initially transduced cells, some of which may last a long time or for the lifetime of the patient. However, episomal gene expression is generally transient in target tissues such as the liver, where cell turnover is high and which tends to grow considerably in size during the life of a pediatric patient. With GENERIDE™ technology, the transgene is integrated into the genome, which is likely to provide stable and durable transgene expression as the cells divide and patient tissue grows, and may produce durable therapeutic benefits.

Transgene expression in the absence of exogenous promoters.

In some embodiments, the GENERIDE™ technology expresses the transgene at a location where it is regulated by a potent endogenous promoter. In some embodiments, homology arms can be used to insert a transgene at a precise site in the genome, where it is expressed under the control of a potent endogenous promoter (e.g., an albumin promoter). Because no exogenous promoter is used to drive expression of the transgene, this technique avoids the possibility of off-target integration of a promoter, which is associated with an increased risk of cancer. In some embodiments, the selection of a strong endogenous promoter allows therapeutic levels of protein expression to be achieved from the transgene at the moderate integration rates typical of the highly accurate and reliable HR process.

Nuclease-free genome editing.

By utilizing the naturally occurring process of HR, GENERIDE™ is designed to avoid the undesirable side effects associated with exogenous nucleases used in known gene editing techniques. The use of these engineered enzymes is associated with genotoxicity, including chromosomal alterations, due to error-prone repair of double-stranded DNA breaks. Avoiding the use of nucleases also reduces the number of exogenous components that need to be delivered to the cell.

Payload

In some embodiments, one or more vectors or constructs described herein may comprise a polynucleotide sequence encoding one or more payloads. Any of a variety of payloads (e.g., those payloads having diagnostic and/or therapeutic purposes) may be used alone or in combination according to various aspects. In some embodiments, the payload may be or comprise a polynucleotide sequence encoding a peptide or polypeptide. In some embodiments, the payload is a peptide having an intrinsic or extrinsic activity that facilitates a biological process for treating a medical condition. In some embodiments, the payload may be or comprise a transgene (also referred to herein as a gene of interest (GOI)). In some embodiments, the payload may be or comprise one or more Inverted Terminal Repeat (ITR) sequences (e.g., one or more AAV ITRs). In some embodiments, the payload may be or comprise one or more transgenes having flanking ITR sequences. In some embodiments, the payload may be or comprise one or more heterologous nucleic acid sequences encoding a reporter gene (e.g., a fluorescent or luminescent reporter gene). In some embodiments, the payload may be or comprise one or more biomarkers (e.g., agents expressed by the payload). In some embodiments, the payload may comprise sequences for polycistronic expression (including, for example, 2A peptide or intron sequences, internal ribosome entry sites). In some embodiments, the 2A peptide is a small (e.g., about 18-22 amino acids) peptide sequence that enables co-expression of two or more discrete protein products from a single coding sequence. In some embodiments, the 2A peptide allows for co-expression of two or more discrete protein products regardless of the arrangement of the protein coding sequences. In some embodiments, the 2A peptide is or comprises a consensus motif (e.g., DVEXNPGP). In some embodiments, the 2A peptide facilitates protein cleavage. In some embodiments, the 2A peptide is or comprises a viral sequence (e.g., foot-and-mouth disease virus (F2A), equine rhinitis A virus (E2A), porcine teschovirus-1 (P2A), or Thosea asigna virus (T2A)).
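The DVEXNPGP motif mentioned above, where X stands for any amino acid, can be located in a peptide with a regular expression. The Python sketch below searches the P2A peptide of SEQ ID NO: 7 for that motif; the variable names are illustrative, and treating X as "any single residue" is an assumption about how the motif is meant to be read.

```python
import re

# X in DVEXNPGP is read here as any single amino acid (assumption).
MOTIF_2A = re.compile(r"DVE.NPGP")

p2a_peptide = "GSGATNFSLLKQAGDVEENPGP"  # SEQ ID NO: 7

match = MOTIF_2A.search(p2a_peptide)
if match:
    # start() is a 0-based index into the peptide string.
    print(f"motif {match.group(0)} found at index {match.start()}")
else:
    print("motif not found")
```

In this peptide the motif occupies the final eight residues, ending in the C-terminal proline.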

In some embodiments, the payload may be or comprise a polynucleotide sequence comprising an expression cassette. In some embodiments, the expression cassette comprises a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a transgene and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products (e.g., a sequence encoding a 2A peptide).

Monitoring method

In some embodiments, the present disclosure provides methods of monitoring and/or otherwise evaluating gene therapy. In some embodiments, the present disclosure provides for the detection of a product (e.g., a polypeptide or nucleic acid) and/or a biomarker produced or encoded by a composition described herein. In some embodiments, the presence of a product or biomarker is assessed in a biological sample taken from a subject who has received an integrating gene therapy treatment as described herein. In some embodiments, the biological sample is or comprises hair, skin, stool, blood, plasma, serum, cerebrospinal fluid, urine, saliva, tears, vitreous humor, liver biopsy, or mucus.

In some embodiments, the product or biomarker is expressed in a cell. In some embodiments, the product or biomarker is secreted extracellularly. In some embodiments, the product or biomarker comprises a 2A peptide. In some embodiments, the product or biomarker comprises albumin (e.g., albumin modified with, for example, a C-terminal tag). Methods for detecting various products or biomarkers are known in the art. In some embodiments, the product or biomarker is detected by an immunological assay or a nucleic acid amplification assay. In some embodiments, methods of detecting a product or biomarker are described in WO/2020/214582, the entire contents of which are incorporated herein by reference. In some embodiments, detecting the product or biomarker is performed 1, 2, 3, 4, 5, 6, 7, 8 weeks or more after the subject has received the gene therapy treatment or gene integration composition.

Delivery vehicle

There are a variety of gene therapy approaches in the art. Thus, there are a variety of delivery mechanisms in the art. In some embodiments, the transgene is provided using a delivery vector. In some embodiments, the compositions of the present disclosure comprise a delivery vehicle. In some embodiments, the delivery vehicle is or comprises a non-viral particle. In some embodiments, the delivery vehicle is a lipid particle (e.g., a lipid nanoparticle). Various lipid nanoparticles for delivery of nucleic acids are known in the art, such as those described in WO2015184256, WO2013149140, WO2014089486A1, WO2009127060, WO2011071860, WO2020219941, the contents of each of which are incorporated herein by reference.

In some embodiments, the delivery vehicle is or comprises an exosome. Those skilled in the art recognize various methods and uses for exosome production. Examples of such methods and uses are described in Luan et al., Acta Pharmacologica Sinica, volume 38, pages 754-763 (2017).

In some embodiments, the delivery vehicle is or comprises a closed circular cDNA integration gene therapy construct. In some embodiments, the delivery vehicle is or comprises a recombinant viral vector. In some embodiments, the recombinant viral vector is an adeno-associated virus (AAV) vector. In some embodiments, the recombinant AAV vector comprises a capsid of AAV8, AAV-DJ, AAV-LK03, sL65, or AAVNP59. In some embodiments, the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 90%, 95%, 99%, or 100% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65, or AAVNP59. In some embodiments, the recombinant viral vector is or comprises a variant (e.g., a codon-optimized variant) of AAV8, AAV-DJ, AAV-LK03, sL65, or AAVNP59.

In some embodiments, the recombinant AAV vector comprises at least one ITR. In some embodiments, the recombinant AAV vector comprises two ITRs. In some embodiments, the recombinant AAV vector comprises a 5' ITR. In some embodiments, the recombinant AAV vector comprises a 3' ITR. In some embodiments, the recombinant AAV vector comprises AAV2 ITRs. In some embodiments, the recombinant AAV vector comprises a portion of an AAV2 ITR. In some embodiments, the recombinant AAV vector comprises an ITR having at least 80%, 85%, 90%, 95%, 99%, or 100% sequence identity to an AAV2 ITR. In some embodiments, the recombinant AAV vector comprises an ITR having 90%, 95%, 99%, or 100% sequence identity to one of SEQ ID NOs. 27-30.

145bp ITR:

aggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagagagggagtggccaa(SEQ ID NO.27)

130bp ITR:

aggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcag(SEQ ID NO.28)

B loop deletion ITR:

aggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagagagggagtggccaa(SEQ ID NO.29)

C loop deletion ITR:

aggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaa(SEQ ID NO.30)
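The percent sequence identity thresholds recited above depend on how two sequences are aligned and on which denominator is used, and this disclosure does not specify either. The following minimal Python sketch shows one possible convention: a simple global (Needleman-Wunsch) alignment with identity computed as matched positions divided by alignment columns. The scoring parameters and the function name are assumptions chosen for illustration, and dedicated alignment tools may give different values. The two sequences are copied from SEQ ID NO. 27 and SEQ ID NO. 28.

```python
# Sequences copied from SEQ ID NO. 27 (145bp ITR) and SEQ ID NO. 28 (130bp ITR).
ITR_145 = (
    "aggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgagg"
    "ccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagc"
    "gagcgcgcagagagggagtggccaa"
)
ITR_130 = (
    "aggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgagg"
    "ccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagc"
    "gagcgcgcag"
)


def percent_identity(a: str, b: str, match=1, mismatch=-1, gap=-1) -> float:
    """Globally align a and b; return 100 * matches / alignment columns.

    The scoring values are illustrative assumptions, not taken from the disclosure.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and total columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            if score[i][j] == diag:
                matches += a[i - 1] == b[j - 1]
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * matches / columns


if __name__ == "__main__":
    print(len(ITR_145), len(ITR_130))
    print(round(percent_identity(ITR_145, ITR_130), 2))
```

Other conventions (for example, dividing by the length of the shorter sequence, or using a local alignment) would yield different percentages for the same pair of sequences.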

Therapeutic method

The compositions and constructs disclosed herein can be used for any in vitro or in vivo application where it is desirable to express a payload (e.g., transgene) from a particular target locus in a cell while maintaining expression of an endogenous gene at and around the target locus. For example, the compositions and constructs disclosed herein can be used (e.g., by gene therapy) to treat a disorder, disease, or medical condition in a subject.

In some embodiments, the treatment comprises achieving or maintaining a desired pharmacological and/or physiological effect. In some embodiments, the desired pharmacological and/or physiological effect may comprise complete or partial prevention of the disease (e.g., prevention of symptoms of the disease). In some embodiments, the desired pharmacological and/or physiological effect may include complete or partial cure of the disease (e.g., cure of side effects associated with the disease). In some embodiments, the desired pharmacological and/or physiological effect may include preventing recurrence of the disease. In some embodiments, the desired pharmacological and/or physiological effect may include slowing the progression of the disease. In some embodiments, the desired pharmacological and/or physiological effect may include alleviation of symptoms of the disease. In some embodiments, the desired pharmacological and/or physiological effect may include preventing regression of the disease. In some embodiments, the desired pharmacological and/or physiological effect may include stabilization and/or reduction of symptoms associated with the disease.

In some embodiments, the treatment comprises administering the composition before, during, or after the onset of the disease (e.g., before, during, or after the occurrence of symptoms associated with the disease). In some embodiments, the treatment comprises a combination therapy (e.g., with one or more therapies, including different types of therapies).

Targeted integration

In some embodiments, the compositions and constructs provided herein direct the integration of a payload (e.g., a transgene and/or a functional nucleic acid) at a target locus (also referred to herein as a target integration site) (e.g., an endogenous gene). In some embodiments, the compositions and constructs provided herein direct the integration of a payload at a target locus (e.g., a tissue-specific locus) in a particular cell type. In some embodiments, the integration of the payload occurs in a specific tissue (e.g., liver, central nervous system (CNS), muscle, kidney, blood vessel, lung). In some embodiments, the integration of the payload occurs in multiple tissues (e.g., liver, central nervous system (CNS), muscle, kidney, blood vessel, lung).

In some embodiments, the compositions and constructs provided herein direct the integration of the payload at a target locus that is considered a safe harbor site (e.g., albumin, apolipoprotein A2 (ApoA2), haptoglobin). In some embodiments, the target locus may be selected from any genomic locus suitable for use with the methods and compositions provided herein. In some embodiments, the target locus encodes a polypeptide. In some embodiments, the target locus encodes a polypeptide that is highly expressed in a subject (e.g., a subject that does not have a disease, disorder, or condition, or a subject that has a disease, disorder, or condition). In some embodiments, the integration of the payload occurs at the 5' or 3' end of one or more endogenous genes (e.g., genes encoding polypeptides). In some embodiments, the integration of the payload occurs between the 5' and 3' ends of one or more endogenous genes (e.g., genes encoding polypeptides).

In some embodiments, the compositions and constructs provided herein direct the integration of a payload at a target locus with little or no off-target integration (e.g., integration at a non-target locus). In some embodiments, the compositions and constructs provided herein direct the integration of a payload at a target locus with reduced off-target integration compared to a reference composition or construct (e.g., relative to a composition or construct that does not have flanking homologous sequences).

In some embodiments, integration of the transgene at the target locus allows for expression of the payload without disrupting endogenous gene expression. In some embodiments, integration of the transgene at the target locus allows expression of the payload from the endogenous promoter. In some embodiments, integration of the transgene at the target locus disrupts endogenous gene expression. In some embodiments, integration of the transgene at the target locus disrupts endogenous gene expression without adversely affecting the target cell and/or subject (e.g., by targeting a safe harbor site). In some embodiments, integration of the transgene at the target locus does not require the use of nucleases (e.g., Cas protein, endonuclease, TALEN, ZFN). In some embodiments, integration of the transgene at the target locus is aided by the use of nucleases (e.g., Cas protein, endonuclease, TALEN, ZFN).

In some embodiments, integration of the transgene at the target locus confers a selective advantage (e.g., increased survival of a plurality of cells relative to other cells in the tissue). In some embodiments, the selective advantage may result in an increased percentage of cells in one or more tissues expressing the transgene.

Composition and method for producing the same

In some embodiments, the methods and constructs (e.g., viral vectors) provided herein can be used to produce compositions. In some embodiments, the compositions include liquid, solid, and gaseous compositions. In some embodiments, the composition comprises additional ingredients (e.g., diluents, stabilizers, excipients, adjuvants). In some embodiments, the additional ingredients may include buffers (e.g., phosphate, citrate, organic acid buffers), antioxidants (e.g., ascorbic acid), low molecular weight polypeptides (e.g., less than 10 residues), various proteins (e.g., serum albumin, gelatin, immunoglobulins), hydrophilic polymers (e.g., polyvinylpyrrolidone), amino acids (e.g., glycine, glutamine, asparagine, arginine, lysine), carbohydrates (e.g., monosaccharides, disaccharides, glucose, mannose, dextrins), chelating agents (e.g., EDTA), sugar alcohols (e.g., mannitol, sorbitol), salt-forming counterions (e.g., sodium, potassium), and/or non-ionic surfactants (e.g., Tween™, Pluronics™, polyethylene glycol (PEG)), and the like. In some embodiments, the aqueous carrier is an aqueous pH-buffered solution.

In some embodiments, the compositions provided herein may be provided in a range of dosages. In some embodiments, the compositions provided herein may be provided in a single dose. In some embodiments, the compositions provided herein may be provided in multiple doses. In some embodiments, the composition is provided over a period of time. In some embodiments, the composition is provided at specific time intervals (e.g., varying time intervals, set time intervals). In some embodiments, the dosage may vary depending on the dosage form and route of administration. In some embodiments, the compositions provided herein may be provided at a dose of between 1e12 vg/kg and 1e14 vg/kg. In some embodiments, the compositions provided herein may be provided at a dose of between 3e12 vg/kg and 1e13 vg/kg. In some embodiments, the compositions provided herein may be provided at a dose of between 1e13 vg/kg and 3e13 vg/kg. In some embodiments, the compositions provided herein may be provided at a dose between 3e12 vg/kg and 3e13 vg/kg. In some embodiments, the compositions provided herein may be provided at a dose of no more than 3e13 vg/kg. In some embodiments, the compositions provided herein may be provided at a dose of no more than 1e13 vg/kg. In some embodiments, the compositions provided herein may be provided at a dose of no more than 3e12 vg/kg.
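The weight-based doses above are expressed in vector genomes per kilogram (vg/kg), so the total number of vector genomes for a subject is the dose multiplied by body weight. The following minimal Python sketch illustrates that arithmetic and a check against the broadest range recited above (1e12 to 1e14 vg/kg). The function names and the example body weight are hypothetical and chosen only for illustration.

```python
def total_vector_genomes(dose_vg_per_kg: float, body_weight_kg: float) -> float:
    """Total vector genomes for a weight-based dose (dose x body weight)."""
    if dose_vg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_vg_per_kg * body_weight_kg


def dose_in_range(dose_vg_per_kg: float, low: float = 1e12, high: float = 1e14) -> bool:
    """Check a dose against a range; defaults mirror the 1e12 to 1e14 vg/kg range in the text."""
    return low <= dose_vg_per_kg <= high


if __name__ == "__main__":
    dose = 3e12       # vg/kg, one of the dose levels named in the text
    weight_kg = 3.5   # hypothetical body weight, for illustration only
    print(dose_in_range(dose), total_vector_genomes(dose, weight_kg))
```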

In some embodiments, the compositions provided herein can be administered to a subject at a particular point in time (e.g., the age of the subject). In some embodiments, the compositions provided herein may be administered to a neonatal subject. In some embodiments, the compositions provided herein may be administered to a neonate subject. In some embodiments, the neonatal mouse subject is between 0 and 14 days old. In some embodiments, the neonatal mouse subject is between 0 days and 1 month old. In some embodiments, the compositions provided herein may be administered to subjects between 7 days and 30 days of age. In some embodiments, the compositions provided herein may be administered to subjects between 3 months and 1 year of age. In some embodiments, the compositions provided herein may be administered to subjects between 1 and 5 years of age. In some embodiments, the compositions provided herein may be administered to subjects between the ages of 4 and 7 years. In some embodiments, the compositions provided herein may be administered to subjects aged 5 years or older.

In some embodiments, the compositions provided herein may be administered to a subject at a particular point in time based on the growth phase of a particular tissue or organ (e.g., percentage of estimated/average adult size or weight). In some embodiments, the compositions provided herein may be administered to a subject, wherein a tissue or organ (e.g., liver, muscle, CNS, lung, etc.) is at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 99% of estimated/average adult size or weight. In some embodiments, the compositions provided herein may be administered to a subject, wherein the tissue or organ is about 20% (+/-5%) of estimated/average adult size or weight. In some embodiments, the compositions provided herein may be administered to a subject, wherein the tissue or organ is about 50% (+/-5%) of estimated/average adult size or weight. In some embodiments, the compositions provided herein may be administered to a subject, wherein the tissue or organ is about 60% (+/-5%) of estimated/average adult size or weight. In some embodiments, the estimated/average adult size or weight of a particular tissue or organ may be determined as described in the art (see Noda et al., Pediatric Radiology, 1997; Johnson et al., Liver Transplantation, 2005; and Szpinda et al., BioMed Research International, 2015, each of which is incorporated herein by reference in its entirety).

Route of administration

In some embodiments, the compositions provided herein can be administered to a subject by any one (or more) of a variety of routes known in the art (e.g., parenteral, subcutaneous, intravenous, intracranial, intraspinal, intraocular, intramuscular, intravaginal, intraperitoneal, intradermal, rectal, pulmonary, intraosseous, oral, buccal, portal intravenous, intraarterial, intratracheal, or nasal). In some embodiments, the compositions provided herein can be introduced into cells (e.g., liver, muscle, central nervous system (CNS), lung, or blood cells), which are subsequently introduced into a subject. In some embodiments, the compositions provided herein may be introduced by delivery methods known in the art (e.g., injection, catheter).

In some embodiments, genome editing using the GENERIDE™ platform differs from known gene therapies in that it uses homologous recombination to deliver a corrective gene to one specific location in the genome. In some embodiments, GENERIDE™ inserts the corrective gene in a precise manner, resulting in site-specific integration in the genome. In some embodiments, GENERIDE™ does not require the use of exogenous nucleases or promoters. In some embodiments, GENERIDE™ may be combined with one or more exogenous nucleases and/or promoters.

In some embodiments, provided compositions comprise one or more homology arms, transgenes, and nucleic acids that facilitate the production of two independent gene products. In some embodiments, the compositions and methods of the present disclosure comprise a first nucleic acid sequence encoding a transgene. In some embodiments, the compositions and methods of the present disclosure comprise a second nucleic acid that facilitates the production of two independent gene products (e.g., 2A peptides). In some embodiments, the disclosure provides an expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence as described herein.

In some embodiments, one or more compositions described herein are administered without any additional treatment. In some embodiments, one or more of the compositions described herein are administered in combination. In some embodiments, the first composition may be administered simultaneously with the second composition. In some embodiments, the first composition and the second composition may be administered sequentially (e.g., within minutes, hours, days, weeks, or months of each other). In some embodiments, one or more compositions may be administered by the same route (e.g., parenterally, subcutaneously, intravenously, intracranially, intraspinal, intraocular, intramuscular, intravaginally, intraperitoneal, intradermal, intrarectal, pulmonary, intraosseous, oral, buccal, portal intravenous, intraarterial, intratracheal, or nasal). In some embodiments, one or more compositions may be administered by different routes (e.g., parenteral, subcutaneous, intravenous, intracranial, intraspinal, intraocular, intramuscular, intravaginal, intraperitoneal, intradermal, intrarectal, pulmonary, intraosseous, oral, buccal, portal intravenous, intraarterial, intratracheal, or nasal).

In some embodiments, the first and/or second composition is administered only once in a particular dose (e.g., a fixed dose or a weight-based dose). In some embodiments, the first and/or second composition is administered more than once in a particular dose (e.g., a fixed dose or a weight-based dose). In some embodiments, where more than one dose (e.g., a fixed dose or a weight-based dose) is administered, the first and/or second compositions may be administered simultaneously, substantially simultaneously, or consecutively. In some embodiments, multiple doses (e.g., fixed doses or weight-based doses) are administered over a specified period of time (e.g., over a period of minutes, hours, days, weeks, or months).

In some embodiments, the first and/or second compositions are administered in response to a biomarker (e.g., a circulating biomarker as described in WO2020214582 A1). For example, the first and/or second compositions are administered at a particular dose (e.g., a fixed dose or a weight-based dose) and the level of the biomarker (e.g., as described in WO2020214582 A1) is monitored over a particular period of time (e.g., over a few minutes, hours, days, weeks, or months). If the level of the biomarker (e.g., as described in WO2020214582 A1) is low (e.g., as compared to an appropriate reference, such as the level of the biomarker prior to administration), the first and/or second composition is administered at a particular dose (e.g., a fixed dose or a weight-based dose). If the level of the biomarker (e.g., as described in WO2020214582 A1) is high (e.g., as compared to a suitable reference, such as the level of the biomarker after initial administration), then subsequent administration of the first and/or second compositions (e.g., at a fixed dose or weight-based dose) can be re-evaluated (e.g., treatment suspended, or the fixed dose or weight-based dose reduced).
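The biomarker-responsive logic described above can be sketched as a simple decision rule. The following minimal Python example is an illustration only: the disclosure does not define numeric cutoffs, so the two reference values, the inclusive comparisons, and all names here are assumptions.

```python
from enum import Enum


class Action(Enum):
    ADMINISTER_DOSE = "administer dose"
    REEVALUATE = "re-evaluate subsequent dosing"
    CONTINUE_MONITORING = "continue monitoring"


def dosing_action(level: float, low_reference: float, high_reference: float) -> Action:
    """Map a measured biomarker level to an action.

    low_reference: hypothetical cutoff (e.g., derived from the pre-administration level).
    high_reference: hypothetical cutoff (e.g., derived from the level after initial dosing).
    """
    if low_reference > high_reference:
        raise ValueError("low_reference must not exceed high_reference")
    if level <= low_reference:
        # Biomarker is low relative to the reference: administer a dose.
        return Action.ADMINISTER_DOSE
    if level >= high_reference:
        # Biomarker is high: pause or reduce subsequent dosing after review.
        return Action.REEVALUATE
    return Action.CONTINUE_MONITORING


if __name__ == "__main__":
    # Arbitrary units; values are illustrative, not data from the disclosure.
    for measured in (0.8, 5.0, 12.0):
        print(measured, dosing_action(measured, low_reference=1.0, high_reference=10.0).value)
```

In practice the references, the sampling schedule, and the resulting clinical decision would be set by the treating protocol; the sketch only makes the branching structure of the paragraph explicit.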

Method for producing viral vectors

Production of viral vectors

In some embodiments, generating a viral vector (e.g., an AAV viral vector) can include a step of generating the viral vector upstream (e.g., cell-based culture) and a step of processing the viral vector downstream (e.g., purification, formulation, etc.). In some embodiments, the upstream step may include one or more of cell expansion, cell culture, cell transfection, cell lysis, viral vector generation, and/or viral vector harvesting.

In some embodiments, the downstream steps may include one or more of separation, filtration, concentration, clarification, purification, chromatography (e.g., affinity, ion exchange, hydrophobic, mixed mode), centrifugation (e.g., ultracentrifugation), and/or formulation.

In some embodiments, each of the constructs and methods described herein is designed to increase viral vector yield (e.g., AAV vector yield), decrease the level of replication-competent viral vectors (e.g., replication-competent AAV (rcAAV)), increase viral vector packaging efficiency (e.g., AAV vector capsid packaging), and/or any combination thereof, relative to reference constructs or methods, such as those described in Xiao et al., 1998 and Grieger et al., 2015.

Cell lines and transfection reagents

In some embodiments, the production of the viral vector includes the use of cells (e.g., cell cultures). In some embodiments, the production of the viral vector comprises the use of a cell culture comprising one or more cell lines (e.g., mammalian cell lines). In some embodiments, the generation of the viral vector comprises the use of a HEK293 cell line or variant thereof (e.g., a HEK293T, HEK293F cell line). In some embodiments, the cells are capable of growing in suspension. In some embodiments, the cells are comprised of adherent cells. In some embodiments, the cells are capable of growing in a medium that does not contain animal components (e.g., animal serum). In some embodiments, the cells are capable of growing in serum-free medium (e.g., F17 medium, expi293 medium). In some embodiments, the generation of the viral vector comprises transfecting the cell with an expression construct (e.g., a plasmid). In some embodiments, the cells are selected to achieve high expression of a viral vector (e.g., an AAV vector). In some embodiments, the cells are selected to achieve high packaging efficiency of the viral vector (e.g., capsid packaging of the AAV vector). In some embodiments, the cells are selected to increase transfection efficiency (e.g., using chemical transfection reagents, including cationic molecules). In some embodiments, the cells are engineered to achieve high expression of a viral vector (e.g., an AAV vector). In some embodiments, the cells are engineered to achieve high packaging efficiency of the viral vector (e.g., capsid packaging of AAV vectors). In some embodiments, the cells are engineered to increase transfection efficiency (e.g., using chemical transfection reagents, including cationic molecules). In some embodiments, cells may be engineered or selected for two or more of the above attributes. In some embodiments, the cells are contacted with one or more expression constructs (e.g., plasmids). 
In some embodiments, the cells are contacted with one or more transfection reagents (e.g., chemical transfection reagents, including lipids, polymers, and cationic molecules) and one or more expression constructs. In some embodiments, the cells are contacted with one or more cationic molecules (e.g., cationic lipids, PEI reagents) and one or more expression constructs. In some embodiments, the cells are contacted with PEIMAX reagent and one or more expression constructs. In some embodiments, the cells are contacted with FectoVIR-AAV reagent and one or more expression constructs. In some embodiments, the cells are contacted with one or more transfection reagents and one or more expression constructs at a specific ratio. In some embodiments, the ratio of transfection reagent to expression construct increases the production of viral vectors (e.g., increased vector yield, increased packaging efficiency, and/or increased transfection efficiency).

Expression constructs

In some embodiments, the expression construct is or comprises one or more polynucleotide sequences (e.g., a plasmid). In some embodiments, the expression construct comprises a specific polynucleotide sequence component (e.g., a payload, a promoter, a viral gene, etc.). In some embodiments, the expression construct comprises a polynucleotide sequence encoding a viral gene (e.g., a rep or cap gene or gene variant, one or more helper viral genes or gene variants). In some embodiments, a particular type of expression construct comprises a particular combination of polynucleotide sequence components. In some embodiments, a particular type of expression construct does not comprise a particular combination of polynucleotide sequence components. In some embodiments, a particular expression construct does not comprise polynucleotide sequence components encoding the rep and cap genes and/or gene variants.

In some embodiments, the expression construct comprises a polynucleotide sequence encoding a wild-type viral gene (e.g., a wild-type rep gene, cap gene, viral helper gene, or a combination thereof). In some embodiments, the expression construct comprises a polynucleotide sequence encoding a viral helper gene or gene variant (e.g., a herpes virus gene or gene variant, an adenovirus gene or gene variant). In some embodiments, the expression construct comprises a polynucleotide sequence encoding one or more gene copies (e.g., 1 copy, 2 copies, 3 copies, 4 copies, 5 copies, etc.) that express one or more wild-type Rep proteins. In some embodiments, the expression construct comprises a polynucleotide sequence encoding a single gene copy that expresses one or more wild-type Rep proteins (e.g., Rep68, Rep40, Rep52, Rep78, or a combination thereof). In some embodiments, the expression construct comprises a polynucleotide sequence encoding one or more wild-type Rep proteins (e.g., Rep68, Rep40, Rep52, Rep78, or a combination thereof). In some embodiments, the expression construct comprises a polynucleotide sequence encoding at least four wild-type Rep proteins (e.g., Rep68, Rep40, Rep52, Rep78). In some embodiments, the expression construct comprises a polynucleotide sequence encoding each of Rep68, Rep40, Rep52, and Rep78. In some embodiments, the expression construct comprises polynucleotide sequences encoding one or more wild-type adenovirus helper proteins (e.g., E2 and E4).

In some embodiments, the expression construct comprises a wild-type polynucleotide sequence encoding a wild-type viral gene (e.g., rep gene, cap gene, helper gene). In some embodiments, the expression construct comprises a modified polynucleotide sequence (e.g., codon optimized) encoding a wild-type viral gene (e.g., rep gene, cap gene, helper gene). In some embodiments, the expression construct comprises a modified polynucleotide sequence encoding a modified viral gene (e.g., rep gene, cap gene, helper gene). In some embodiments, the modified viral genes are designed and/or engineered to achieve certain improvements (e.g., increased transduction, tissue specificity, reduced size, reduced immune response, increased packaging, reduced rcAAV levels, etc.).

According to various embodiments, the expression constructs disclosed herein may provide increased flexibility and modularity as compared to the prior art. In some embodiments, the expression constructs disclosed herein may allow for the exchange of various polynucleotide sequences (e.g., different rep genes, cap genes, payloads, helper genes, promoters, etc.) while providing certain improvements (e.g., increased viral vector yield, increased packaging, reduced rcAAV levels, etc.). In some embodiments, the expression constructs disclosed herein are compatible with various upstream production processes (e.g., different cell culture conditions, different transfection reagents, etc.), while providing certain improvements (e.g., increased viral vector yield, increased packaging, reduced rcAAV levels, etc.).

In some embodiments, different types of expression constructs comprise different combinations of polynucleotide sequences. In some embodiments, one type of expression construct comprises one or more polynucleotide sequence components (e.g., payload, promoter, viral gene, etc.) that are not present in a different type of expression construct. In some embodiments, one type of expression construct comprises a polynucleotide sequence component encoding a viral gene (e.g., rep or cap gene or gene variant) and a polynucleotide sequence component encoding a payload (e.g., transgene and/or functional nucleic acid). In some embodiments, one type of expression construct comprises a polynucleotide sequence component encoding one or more viral genes (e.g., rep or cap genes or gene variants, and/or one or more helper viral genes). In some embodiments, one type of expression construct comprises a polynucleotide sequence component encoding one or more viral genes, wherein the viral genes are from one or more viral types (e.g., genes or gene variants from AAV and adenovirus). In some embodiments, the viral gene from adenovirus is a gene and/or a gene variant. In some embodiments, the viral gene from adenovirus is one or more of E2A (e.g., E2A DNA binding protein (DBP)), E4 (e.g., E4 open reading frame (ORF) 2, ORF3, ORF4, ORF6/7), VA, and/or variants thereof. In some embodiments, the expression construct is used to produce a viral vector (e.g., by cell culture). In some embodiments, the expression construct is contacted with the cell in combination with one or more transfection reagents (e.g., chemical transfection reagents). In some embodiments, the expression construct is contacted with the cell in a specific ratio in combination with one or more transfection reagents. In some embodiments, different types of expression constructs are contacted with cells in specific ratios (e.g., weight ratios) in combination with one or more transfection reagents. 
In some embodiments, different types of expression constructs are contacted with the cell in a ratio (e.g., weight ratio) of about 10:1, 9:1, 8:1, 7:1, 6:1, 5:1, 4:1, 3:1, 2:1, 1.5:1, 1:1, 1:1.5, 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, or 1:10. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio (e.g., weight ratio) of the first expression construct to the second expression construct of about 10:1, 9:1, 8:1, 7:1, 6:1, 5:1, 4:1, 3:1, 2:1, 1.5:1, 1:1.5, 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, or 1:10. In some embodiments, a first expression construct comprising one or more payloads and a second expression construct comprising one or more viral accessory genes are contacted with the cell at a ratio (e.g., weight ratio) of the first expression construct to the second expression construct of about 10:1, 9:1, 8:1, 7:1, 6:1, 5:1, 4:1, 3:1, 2:1, 1.5:1, 1:1.5, 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, or 1:10. In some embodiments, a particular ratio of expression constructs increases AAV production (e.g., increased viral vector production, increased packaging efficiency, and/or increased transfection efficiency). In some embodiments, the cell is contacted with two or more expression constructs (e.g., sequentially or substantially simultaneously). In some embodiments, three or more expression constructs are contacted with the cell. In some embodiments, the expression construct comprises one or more promoters (e.g., one or more exogenous promoters). In some embodiments, the promoter is or comprises CMV, RSV, CAG, EF1 alpha, PGK, A1AT, C5-12, MCK, desmin, p5, p40, or a combination thereof. In some embodiments, the expression construct comprises one or more promoters upstream of a particular polynucleotide sequence component (e.g., a rep or cap gene or gene variant). In some embodiments, the expression construct comprises one or more promoters downstream of a particular polynucleotide sequence component (e.g., a rep or cap gene or gene variant).
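
The weight ratios listed above can be turned into per-construct DNA masses with simple arithmetic. The following is a minimal sketch only: the function name `split_dna_mass` and the idea of a fixed total DNA mass per transfection are illustrative assumptions and are not taken from the disclosure.

```python
def split_dna_mass(total_ug: float, ratio_first: float, ratio_second: float) -> tuple[float, float]:
    """Split a total DNA mass (micrograms) between two expression constructs.

    ratio_first:ratio_second is the weight ratio of the first construct to
    the second construct, e.g. 1.5 and 1 for a 1.5:1 ratio.
    Hypothetical helper for illustration only.
    """
    if total_ug <= 0 or ratio_first <= 0 or ratio_second <= 0:
        raise ValueError("total mass and both ratio terms must be positive")
    parts = ratio_first + ratio_second
    first_ug = total_ug * ratio_first / parts
    # Assign the remainder to the second construct so the two masses sum to the total.
    second_ug = total_ug - first_ug
    return first_ug, second_ug


if __name__ == "__main__":
    # Example: 10 ug total DNA at a 1.5:1 weight ratio (first:second).
    first, second = split_dna_mass(10.0, 1.5, 1.0)
    print(f"first construct: {first:.2f} ug, second construct: {second:.2f} ug")
```

Because the ratio is by weight, constructs of different lengths will be present at different molar ratios; a molar calculation would additionally need each construct's size.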

In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio of at least 1:1 and up to 3:1, wherein the viral titer yield is at least 1.5 times that obtained with a reference system (e.g., a three-plasmid system in which separate plasmids each encode one of: 1) AAV rep and AAV cap sequences, 2) the relevant sequences from the helper virus, and 3) the payload). In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio of at least 1:1 and up to 5:1, wherein the viral titer yield is at least 1.5 times that obtained with a reference system (e.g., a three-plasmid system in which separate plasmids each encode one of: 1) AAV rep and AAV cap sequences, 2) the relevant sequences from the helper virus, and 3) the payload). In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio of at least 1:1 and up to 6:1, wherein the viral titer yield is at least 1.5 times that obtained with a reference system (e.g., a three-plasmid system in which separate plasmids each encode one of: 1) AAV rep and AAV cap sequences, 2) the relevant sequences from the helper virus, and 3) the payload). In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio of at least 1:1 and up to 8:1, wherein the viral titer yield is at least 1.5 times that obtained with a reference system (e.g., a three-plasmid system in which separate plasmids each encode one of: 1) AAV rep and AAV cap sequences, 2) the relevant sequences from the helper virus, and 3) the payload). In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio of at least 1:1 and up to 10:1, wherein the viral titer yield is at least 1.5 times that obtained with a reference system (e.g., a three-plasmid system in which separate plasmids each encode one of: 1) AAV rep and AAV cap sequences, 2) the relevant sequences from the helper virus, and 3) the payload).
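
The 1.5-fold yield comparison against a reference system amounts to a ratio of two titers. The sketch below assumes both titers are measured in the same units (e.g., vg/mL) under comparable conditions; the function names and example values are hypothetical and are not data from the disclosure.

```python
def titer_fold_change(test_titer: float, reference_titer: float) -> float:
    """Return the test titer divided by the reference titer (same units, e.g. vg/mL)."""
    if test_titer < 0 or reference_titer <= 0:
        raise ValueError("test titer must be non-negative and reference titer positive")
    return test_titer / reference_titer


def meets_yield_threshold(test_titer: float, reference_titer: float, min_fold: float = 1.5) -> bool:
    """True when the test system yields at least `min_fold` times the reference titer."""
    return titer_fold_change(test_titer, reference_titer) >= min_fold


if __name__ == "__main__":
    # Illustrative numbers only: a two-construct system versus a three-plasmid reference.
    two_construct = 3e11   # vg/mL, hypothetical
    three_plasmid = 2e11   # vg/mL, hypothetical
    print(titer_fold_change(two_construct, three_plasmid))
    print(meets_yield_threshold(two_construct, three_plasmid))
```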

In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 10:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell at a ratio between 9:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 8:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 7:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 6:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 5:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 4:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 3:1 and 1:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 2:1 and 1:1.

In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 2:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 1:1 and 3:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with the cell in a ratio between 1:1 and 4:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 5:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 6:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 7:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 8:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 9:1. In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio between 1:1 and 10:1. 
In some embodiments, a first expression construct comprising one or more viral accessory genes and a second expression construct comprising one or more payloads are contacted with a cell at a ratio of 1.5:1.

In some embodiments, the expression construct comprises one or more polynucleotide sequences encoding components (e.g., selectable markers, origins of replication) necessary for cell culture (e.g., bacterial cell culture, mammalian cell culture). In some embodiments, the expression construct comprises one or more polynucleotide sequences encoding an antibiotic resistance gene, e.g., a kanamycin resistance gene or an ampicillin resistance gene. In some embodiments, the expression construct comprises one or more polynucleotide sequences encoding a bacterial origin of replication (e.g., ColE1 origin of replication).

In some embodiments, the expression construct comprises one or more transcription termination sequences (e.g., polyA sequences). In some embodiments, the expression construct comprises one or more of BGH polyA, FIX polyA, SV40 polyA, synthetic polyA, or a combination thereof. In some embodiments, the expression construct comprises one or more transcription termination sequences downstream of a particular sequence component (e.g., a rep or cap gene or gene variant). In some embodiments, the expression construct comprises one or more transcription termination sequences upstream of a particular sequence component (e.g., a rep or cap gene or gene variant).

In some embodiments, the expression construct comprises one or more intron sequences. In some embodiments, the expression construct comprises one or more introns of different origin (e.g., known genes), including but not limited to FIX introns, albumin introns, or combinations thereof. In some embodiments, the expression construct comprises one or more introns of different lengths (e.g., 133bp to 4 kb). In some embodiments, the expression construct comprises one or more intron sequences upstream of a specific sequence component (e.g., a rep or cap gene or gene variant). In some embodiments, the expression construct comprises one or more intron sequences within a specific sequence component (e.g., a rep or cap gene or gene variant). In some embodiments, the expression construct comprises one or more intron sequences downstream of a specific sequence component (e.g., a rep or cap gene or gene variant). In some embodiments, the expression construct comprises one or more intron sequences following the promoter (e.g., p5 promoter). In some embodiments, the expression construct comprises one or more intron sequences prior to the rep gene or gene variant. In some embodiments, the expression construct comprises one or more intron sequences between the promoter and the rep gene or gene variant. In some embodiments, the compositions provided herein comprise an expression construct. In some embodiments, the composition comprises: (i) A first expression construct comprising a polynucleotide sequence encoding one or more rep genes and a polynucleotide sequence encoding one or more wild-type adenovirus helper proteins; and (ii) a second expression construct comprising a polynucleotide sequence encoding one or more cap genes and one or more payloads.

In some embodiments, the expression constructs make up a three-plasmid (e.g., triple transfection) system for the production of the viral vector. In some embodiments, the three-plasmid system comprises: 1) a first plasmid comprising one or more sequences encoding rep and cap genes or variants thereof; 2) a second plasmid comprising a sequence encoding one or more payloads; and 3) a third plasmid comprising a sequence encoding one or more accessory proteins. In some embodiments, a three-plasmid system can be used to produce one or more of the viral vectors disclosed herein.

Methods of characterizing AAV viral vectors

According to various embodiments, the viral vectors may be characterized by assessing various features and/or characteristics. In some embodiments, the evaluation of the viral vector may be performed at different points in the production process. In some embodiments, the assessment of the viral vector may be performed after the upstream production step is completed. In some embodiments, the evaluation of the viral vector may be performed after the downstream production step is completed.

Viral yield

In some embodiments, characterization of the viral vector includes assessing viral yield (e.g., viral titer). In some embodiments, characterization of the viral vector includes assessing viral yield prior to purification and/or filtration. In some embodiments, characterization of the viral vector includes assessing viral yield after purification and/or filtration. In some embodiments, characterization of the viral vector includes assessing whether the viral yield is greater than or equal to 1e10 vg/mL.

In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude cell lysate is greater than or equal to 1e11 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude cell lysate is greater than or equal to 5e11 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude cell lysate is greater than or equal to 1e12 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude lysate is between 5e9 vg/mL and 5e11 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude lysate is between 5e9 vg/mL and 1e10 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude lysate is between 1e10 vg/mL and 1e11 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude lysate is between 1e11 vg/mL and 1e12 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the crude lysate is between 1e12 vg/mL and 1e13 vg/mL.

In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the purified drug is greater than or equal to 1e11 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the purified drug is greater than or equal to 1e12 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the purified drug is between 1e10 vg/mL and 1e15 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the purified drug is between 1e11 vg/mL and 1e15 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the purified drug is between 1e12 vg/mL and 1e14 vg/mL. In some embodiments, characterization of the viral vector includes assessing whether the viral yield in the purified drug is between 1e13 vg/mL and 1e14 vg/mL.

In some embodiments, the methods and compositions provided herein can provide comparable or increased viral vector production compared to previous methods known in the art. For example, in some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising the use of a dual plasmid transfection system provide comparable or increased viral vector yields as compared to a three plasmid system. In some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising a two-plasmid transfection system using a specific combination of sequence components provide comparable or increased viral vector yields as compared to a two-plasmid system having a different combination of sequence components. In some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising using a dual plasmid transfection system with a specific plasmid ratio provide comparable or increased viral vector yield as compared to a dual plasmid system with a different plasmid ratio. In some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising the use of a dual plasmid transfection system with a specific plasmid ratio provide comparable or increased viral vector yield as compared to a reference under specific culture conditions (e.g., dual plasmid system with different plasmid ratios, three plasmid system). In some embodiments, the provided methods for producing and/or manufacturing viral vectors comprising using a two plasmid transfection system with a specific plasmid ratio provide comparable or increased viral vector yield as compared to a reference (e.g., a two plasmid system, a three plasmid system with different plasmid ratios) under large scale culture conditions (e.g., greater than 100mL, greater than 250mL, greater than 1L, greater than 10L, greater than 20L, greater than 30L, greater than 40L, greater than 50L, etc.).

Viral packaging

In some embodiments, characterization of the viral vector comprises assessing viral packaging efficiency (e.g., percentage of full capsids relative to empty capsids). In some embodiments, characterization of the viral vector comprises assessing viral packaging efficiency prior to purification and/or full capsid enrichment (e.g., cesium chloride-based density gradient, iodixanol-based density gradient, or ion exchange chromatography). In some embodiments, characterization of the viral vector comprises assessing whether the viral packaging efficiency is greater than or equal to 20% (e.g., 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, 100%) prior to purification and/or filtration. In some embodiments, characterization of the viral vector comprises assessing viral packaging efficiency after purification and/or full capsid enrichment. In some embodiments, characterization of the viral vector comprises assessing whether the viral packaging efficiency is greater than or equal to 50% (e.g., 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, 100%) after purification and/or filtration.
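
As a rough illustration of the packaging-efficiency thresholds above, the sketch below assumes that efficiency is expressed as genome-containing capsids as a percentage of all capsids (genome-containing plus empty). That definition, the function names, and the example counts are assumptions for illustration; the disclosure does not specify a formula or an assay.

```python
def packaging_efficiency(full_capsids: float, empty_capsids: float) -> float:
    """Percentage of capsids that contain a genome, out of all capsids counted."""
    if full_capsids < 0 or empty_capsids < 0:
        raise ValueError("capsid counts must be non-negative")
    total = full_capsids + empty_capsids
    if total == 0:
        raise ValueError("at least one capsid must be counted")
    return 100.0 * full_capsids / total


def passes_threshold(full_capsids: float, empty_capsids: float, min_percent: float) -> bool:
    """True when packaging efficiency is greater than or equal to `min_percent`."""
    return packaging_efficiency(full_capsids, empty_capsids) >= min_percent


if __name__ == "__main__":
    # Hypothetical counts before enrichment, checked against the 20% figure.
    print(packaging_efficiency(2e12, 8e12))
    print(passes_threshold(2e12, 8e12, min_percent=20.0))
    # Hypothetical counts after enrichment, checked against the 50% figure.
    print(passes_threshold(6e12, 4e12, min_percent=50.0))
```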

In some embodiments, the methods and compositions provided herein can provide comparable or increased packaging efficiency as compared to previous methods known in the art. For example, in some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising the use of a dual plasmid transfection system provide comparable or increased packaging efficiency as compared to a three plasmid system. In some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising a two-plasmid transfection system using a specific combination of sequence components provide comparable or increased packaging efficiency as compared to a two-plasmid system having a different combination of sequence components. In some embodiments, the provided methods for generating and/or manufacturing viral vectors comprising using a two-plasmid transfection system with a specific plasmid ratio provide comparable or increased packaging efficiency as compared to a two-plasmid system with a different plasmid ratio.

Replicable (replication competent) vector levels

In some embodiments, the characterization of the viral vector comprises assessing the level of replicable vectors. In some embodiments, the characterization of the viral vector comprises assessing the level of replicable vector prior to purification and/or filtration. In some embodiments, the characterization of the viral vector comprises assessing the level of replicable vector after purification and/or filtration. In some embodiments, characterization of the viral vector comprises assessing whether the level of replicable vector is less than or equal to 1 rcAAV per 1e10 vg.

In some embodiments, the methods and compositions provided herein can provide comparable or reduced levels of replicable vectors as compared to previous methods known in the art. For example, in some embodiments, the provided methods for generating viral vectors comprising the use of a dual plasmid transfection system provide comparable or reduced levels of replicable vectors as compared to a three plasmid system. In some embodiments, the provided methods for generating viral vectors comprising a two-plasmid transfection system using a specific combination of sequence components provide comparable or reduced levels of replicable vectors as compared to a two-plasmid system having a different combination of sequence components. In some embodiments, the provided methods for producing a viral vector comprising using a dual plasmid transfection system with one or more intron sequences inserted into the rep gene provide comparable or reduced levels of replicable vectors as compared to a dual plasmid system without the one or more intron sequences.

Hereditary tyrosinemia type 1 (HT1)

Hereditary tyrosinemia type 1 (HT1) is an extremely rare metabolic disorder of neonatal onset caused by loss-of-function mutations in fumarylacetoacetate hydrolase (FAH). Without treatment, HT1 patients exhibit acute liver failure and kidney damage, and often develop hepatocellular carcinoma in early childhood. The standard of care for HT1 consists of life-long treatment with the drug nitisinone (NTBC) and dietary restrictions. Although this treatment is effective in preventing organ failure, there is still a strong unmet medical need because of poor compliance with the drug and the diet.

HT1 is caused by a deficiency in fumarylacetoacetate hydrolase (FAH), the last enzyme in the tyrosine catabolic pathway (see FIG. 5). FAH deficiency therefore leads to abnormal accumulation of toxic metabolites such as fumarylacetoacetate (FAA) and succinylacetone (SUAC), which give rise to the signs and symptoms of HT1 (see Lindblad et al., PNAS, 1977 and Grompe, Semin Liver Dis, 2001).

HT1 is a severe autosomal recessive metabolic disorder that presents in early infancy (within 2 years after birth). It is estimated that 1 in every 100,000-120,000 newborns worldwide is affected by HT-1, but the disease is more common in certain regions, such as Norway or Quebec, Canada (see Russo et al., Pediatric and Developmental Pathology, 2001). Patients with HT-1 experience liver dysfunction (hepatomegaly, cirrhosis and hepatocellular carcinoma) and may also have associated complications involving the kidneys and nervous system, and exhibit growth retardation. HT-1 is fatal if left untreated; liver failure is the leading cause of early death, and all HT-1 patients are at high risk of developing hepatocellular carcinoma (HCC) (see Russo et al., Pediatric and Developmental Pathology, 2001; Morrow et al., Hereditary Tyrosinemia, pages 9-21, 2017; Chinsky et al., Genetics in Medicine, 2017; and Ginkel et al., Pediatric Drugs, 2019).

To date, the most curative treatment for HT1 is orthotopic liver transplantation, which is performed only in severe HT-1 cases (see Morrow et al., Hereditary Tyrosinemia, pages 9-21, 2017). Following transplantation, patients typically show reduced, rather than fully suppressed, levels of toxic metabolites in urine and plasma (see Paradis et al., American Journal of Human Genetics, 1990; Forget et al., Pediatric Radiology, 1999), presumably because of continued production by the kidneys. Most patients are instead treated with an alternative therapy consisting of dietary restriction (low phenylalanine and tyrosine intake) and NTBC (2-(2-nitro-4-trifluoromethylbenzoyl)-1,3-cyclohexanedione, nitisinone), administered orally at 0.5-2.0 mg/kg/day (see Chinsky et al., Genetics in Medicine, 2017). NTBC is a reversible inhibitor of 4-hydroxyphenylpyruvate dioxygenase; it blocks the second step of tyrosine metabolism and thereby prevents the formation of toxic metabolites (see Holme et al., Journal of Inherited Metabolic Disease, 1998). Common treatment for HT1 consists of NTBC and dietary restriction beginning in the first month after birth and continuing uninterrupted to prevent liver failure, renal failure and development of HCC (see Chinsky et al., Genetics in Medicine, 2017). Thus, the urgent need for liver transplantation is eliminated. However, early diagnosis and treatment are critical, as the effectiveness of such treatment depends on how early the disease is identified. Patients who start NTBC treatment after the neonatal period have been reported to have a 2-12 fold higher risk of developing HCC than patients treated during the neonatal period (see Mayorandan et al., Orphanet Journal of Rare Diseases, 2014).

In some embodiments, the subject of the present disclosure is a newborn infant, child, or adult. In some embodiments, the subject of the present disclosure is one week old, two weeks old, three weeks old, four weeks old, five weeks old, six weeks old, seven weeks old, eight weeks old, nine weeks old, ten weeks old, or 12 weeks old. In some embodiments, the subject of the present disclosure is between one to three weeks, two to four weeks, three to five weeks, four to six weeks, five to seven weeks, six to eight weeks, six to nine weeks, eight to ten weeks, nine to eleven weeks, or ten to twelve weeks in age. In some embodiments, the subject of the present disclosure is less than one month old. In some embodiments, the subject of the present disclosure is aged one month, two months, three months, four months, five months, or six months. In some embodiments, the subject of the present disclosure is between one to three months, two to four months, three to five months, or four to six months in age.

In some embodiments, the subject of the present disclosure is aged between 1 and 5 years old, 3 and 7 years old, 5 and 9 years old, 7 and 11 years old, 9 and 13 years old, 11 and 15 years old, 13 and 17 years old, 15 and 19 years old, 17 and 21 years old, 19 and 23 years old, 21 and 25 years old, 23 and 27 years old, 25 and 29 years old, 27 and 31 years old, 29 and 33 years old, 31 and 35 years old. In some embodiments, the subject of the present disclosure is 30 to 40 years old, 40 to 50 years old, 50 to 60 years old, 60 to 70 years old, 70 to 80 years old, or 80 to 90 years old.

In some embodiments, the subject has received or is receiving treatment for HT1. In some embodiments, the method of treatment for HT1 comprises standard of care treatment (i.e., NTBC and dietary restrictions). In some embodiments, the treatment for HT1 comprises NTBC.

In some embodiments, the methods of the present disclosure comprise administering a composition comprising a polynucleotide cassette to a subject that has received or is receiving treatment for HT1. In some embodiments, the methods of the present disclosure comprise administering a composition comprising a polynucleotide cassette to a subject that has received or is receiving NTBC. In some embodiments, a composition comprising a polynucleotide cassette and a treatment for HT1 (e.g., NTBC) are administered to a subject simultaneously or sequentially.

In some embodiments, administration of the compositions of the present disclosure may result in a change in the standard of care or a prior or concurrent treatment. In some embodiments, the subject receives a lower or reduced dose of the treatment that the subject is receiving. In some embodiments, the subject ceases or no longer receives treatment that the subject has received or is receiving.

As with most chronic diseases, patients and healthcare providers must consider the risks associated with non-compliance with long-term care. Because the disease limits daily life and social functioning, the quality of life of patients, their families and caregivers is significantly affected. Reinforcing medical advice on continuous, life-long drug and dietary therapy can be challenging because periods of dietary or drug non-compliance can be asymptomatic (see Chinsky et al., Genetics in Medicine, 2017). Importantly, non-compliance or poor compliance with medical and dietary treatments directly or indirectly affects patient outcome, as elevated levels of toxic metabolites may promote HCC development. NTBC treatment may result in higher levels of tyrosine in the blood because tyrosine is not catabolized, and without strict dietary compliance patients may develop corneal disease (corneal crystals) (see Chinsky et al., Genetics in Medicine, 2017).

NTBC (trade name Orfadin) is an expensive drug and the long-term therapeutic effect of NTBC is not yet clear. Introducing a functional copy of the FAH gene into the genome of an HT-1 patient would represent a better approach, potentially providing lifetime therapeutic benefits from a single administration.

In some embodiments, the transgene of the present disclosure comprises a sequence encoding FAH. In some embodiments, the sequence encoding FAH has 80%, 85%, 90%, 95%, or 99% sequence identity to one of SEQ ID NOs: 18-22.
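
Percent sequence identity can be illustrated with a simple position-by-position comparison. The sketch below assumes the two sequences have already been aligned to equal length (with `-` marking gaps) and counts gap columns as mismatches; the disclosure does not state which alignment method or identity definition is intended, so the function name and these conventions are illustrative assumptions. Whitespace is stripped first, since sequence listings are often wrapped or broken by spaces.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Whitespace is removed and case is ignored. A column counts as identical
    only when both sequences carry the same non-gap character.
    """
    a = "".join(aligned_a.split()).upper()
    b = "".join(aligned_b.split()).upper()
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Toy example, not taken from SEQ ID NOs: 18-22.
    reference = "tccttcatcc cggtggccga"
    candidate = "tccttcattc ctgtggccga"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identity")
    print(identity >= 80.0)
```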

Mouse FAH (mFAH, SEQ ID NO: 18)

tcctttattccagtggccgaggactccgactttcccatccaaaacctgccctatggtgttttctccactcaaagca acccaaagccacggattggtgtagccatcggtgaccagatcttggacctgagtgtcattaaacacctctttaccggacctgccctttccaaacatcaacatgtcttcgatgagacaactctcaataacttcatgggtctgggtcaagctgcatggaaggaggcaagagcatccttacagaacttactgtctgccagccaagcccggctcagagatgacaaggagcttcggcagcgtgcattcacctcccaggcttctgcgacaatgcaccttcctgctaccataggagactacacggacttctactcttctcggcagcatgccaccaatgttggcattatgttcagaggcaaggagaatgcgctgttgccaaattggctccacttacctgtgggataccatggccgagcttcctccattgtggtatctggaaccccgattcgaagacccatggggcagatgagacctgataactcaaagcctcctgtgtatggtgcctgcagactcttagacatggagttggaaatggctttcttcgtaggccctgggaacagattcggagagccaatccccatttccaaagcccatgaacacattttcgggatggtcctcatgaacgactggagcgcacgagacatccagcaatgggagtacgtcccacttgggccattcctggggaaaagctttggaaccacaatctccccgtgggtggtgcctatggatgccctcatgccctttgtggtgccaaacccaaagcaggaccccaagcccttgccatatctctgccacagccagccctacacatttgatatcaacctgtctgtctctttgaaaggagaaggaatgagccaggcggctaccatctgcaggtctaactttaagcacatgtactggaccatgctgcagcaactcacacaccactctgttaatggatgcaacctgagacctggggacctcttggcttctggaaccatcagtggatcagaccctgaaagctttggctccatgctggaactgtcctggaagggaacaaaggccatcgatgtggggcaggggcagaccaggaccttcctgctggacggcgatgaagtcatcataacaggtcactgccagggggacggctaccgtgttggctttggccagtgtgctgggaaagtgctgcctgccctttcaccagcc

Human FAH (hFAH, SEQ ID NO: 19)

tccttcatcccggtggccgaggattccgacttccccatccacaacctgccctacggcgt cttctcgaccagaggcgacccaagaccgaggataggtgtggccattggcgaccagatcctggacctcagcatcatcaagcacctctttactggtcctgtcctctccaaacaccaggatgtcttcaatcagcctacactcaacagcttcatgggcctgggtcaggctgcctggaaggaggcgagagtgttcttgcagaacttgctgtctgtgagccaagccaggctcagagatgacaccgaacttcggaagtgtgcattcatctcccaggcttctgccacgatgcaccttccagccaccataggagactacacagacttctattcctctcggcagcatgctaccaacgtcggaatcatgttcagggacaaggagaatgcgttgatgccaaattggctgcacttaccagtgggctaccatggccgtgcctcctctgtcgtggtgtctggcaccccaatccgaaggcccatgggacagatgaaacctgatgactctaagcctcccgtatatggtgcctgcaagctcttggacatggagctggaaatggctttttttgtaggccctggaaacagattgggagagccgatccccatttccaaggcccatgagcacatttttggaatggtccttatgaacgactggagtgcacgagacattcagaagtgggagtatgtccctctcgggccattccttgggaagagttttgggaccactgtctctccgtgggtggtgcccatggatgctctcatgccctttgctgtgcccaacccgaagcaggaccccaggcccctgccgtatctgtgccatgacgagccctacacatttgacatcaacctctctgttaacctgaaaggagaaggaatgagccaggcggctaccatatgcaagtccaattttaagtacatgtactggacgatgctgcagcagctcactcaccactctgtcaacggctgcaacctgcggccgggggacctcctggcttctgggaccatcagcgggccggagccagaaaacttcggctccatgttggaactgtcgtggaagggaacgaagcccatagacctggggaatggtcagaccaggaagtttctgctggacggggatgaagtcatcataacagggtactgccagggggatggttaccgcatcggctttggccagtgtgctggaaaagtgctgcctgctctcctgccatca

Human FAH-codon-optimized variant 3 (hFAH-co 3, SEQ ID NO: 20)

tccttcattcctgtggccgaggattctgactttcccattcacaacctgccctatggcgtctt ttcaaccaggggagatcccaggcccagaatcggggtggccattggagatcagatcctggacctgtcaatcatcaagcacctgtttacaggccctgtgctgagcaagcaccaggatgtgttcaaccagccaactctgaacagcttcatggggctggggcaggctgcctggaaggaggcaagggtgtttctgcagaatctgctgagcgtgtcacaggctaggctgagggatgataccgagctgaggaagtgtgcatttatctcacaggcatcagccacaatgcatctgccagctacaatcggcgactataccgacttttactcaagcaggcagcatgccaccaatgtgggcatcatgttcagggacaaagagaatgccctgatgccaaattggctgcacctgcctgtgggctatcatggcagagccagctcagtggtggtgtcagggacaccaattaggaggcctatgggccagatgaagcctgatgacagcaaaccacctgtctacggcgcctgcaagctcctggatatggaactggagatggctttttttgtgggcccaggcaatagactgggagagcccattccaatctcaaaggcccatgaacatatctttggcatggtcctgatgaatgactggtcagccagagacattcagaagtgggagtacgtgccactgggaccatttctgggaaagagctttggcaccacagtgtcaccatgggtggtgcccatggatgccctgatgccctttgctgtgccaaatccaaaacaagaccccaggcccctcccctatctctgtcatgatgaaccatatacttttgacattaacctgagcgtgaacctgaaaggagaaggcatgagccaagccgccactatctgtaagagcaacttcaaatatatgtactggacaatgctgcagcagctcacccaccatagcgtgaatgggtgtaacctgaggccaggcgacctgctggcatcaggcactatttcagggcctgagcctgaaaattttggatcaatgctggagctgtcatggaagggaactaaacccatcgacctggggaatggccagaccagaaagtttctcctggatggcgatgaggtgatcattacaggctactgtcagggggatggctatagaattggatttggccaatgcgccggcaaagtcctccctgccctgctgcccagc

Human FAH-codon-optimized variant 2 (hFAH-co 2, SEQ ID NO: 21)

tccttcatcccagtggctgaagattctgacttccccattcacaacctcccttatggagtctt tagcacaagaggagacccaagacctagaattggcgtggctattggggatcagatcctggacctgagcattatcaaacatctctttacaggcccagtcctcagcaagcaccaagatgtgttcaatcaacccacactgaactcatttatggggctgggccaagctgcctggaaggaggcaagagtgtttctccagaacctcctgtcagtgagccaggccaggctgagagatgacacagagctgaggaagtgtgcctttatctcacaagccagcgccacaatgcacctgccagcaaccatcggggactacacagacttttacagcagcaggcagcacgccactaatgtgggcatcatgtttagagacaaagagaatgccctcatgcctaactggctgcatctgccagtgggctaccatggcagagccagcagcgtggtggtgtctggcacccctatcaggaggcccatgggccagatgaagcctgatgacagcaagccacctgtgtatggggcatgtaagctgctggacatggagctggaaatggctttctttgtggggcctgggaacaggctgggcgaaccaattcccatcagcaaagctcatgaacacatttttgggatggtgctcatgaatgactggtctgccagagacatccagaagtgggagtatgtccccctgggaccattcctgggcaagagctttgggaccactgtgtcaccatgggtggtgcccatggatgccctgatgccatttgctgtgcctaatcccaaacaagatccaaggccactgccctacctctgccacgatgagccctacacatttgatatcaacctctctgtgaatctgaagggagaaggaatgagccaagctgctaccatttgcaaaagcaactttaagtatatgtattggaccatgctgcagcaactcacccaccacagcgtgaatggctgtaacctgaggcctggcgacctgctggcctctggcaccatcagcggccctgaacctgagaactttggcagcatgctggagctgagctggaagggaaccaagcccattgacctgggcaatggacagaccaggaaattcctcctggacggggacgaggtgatcatcacaggctattgccagggggatgggtataggattggctttggccaatgtgccggcaaagtgctgcctgccctgctccctagc

Human FAH-codon-optimized variant 1 (hFAH-co 1, SEQ ID NO: 22)

tcctttattcctgtggccgaagatagcgacttccctatccataacctcccatatggcgtctt ctcaaccaggggcgaccccaggcccaggattggggtggcaattggagaccagatcctggacctcagcatcattaagcacctctttacaggacctgtgctgagcaagcaccaggatgtgttcaaccagcctaccctgaatagctttatgggactgggccaggcagcatggaaagaagccagagtgttcctccagaatctgctgagcgtgagccaggccaggctcagggatgacacagaactgaggaaatgtgccttcatctcacaggcatcagccacaatgcacctccctgccacaattggggactacactgacttctacagcagcagacagcatgccactaatgtggggattatgtttagagacaaggaaaatgccctcatgccaaattggctgcacctgcctgtgggctaccacggcagagcctcatcagtggtggtgtcaggcacaccaattaggaggccaatgggacagatgaagcctgacgactcaaagccaccagtctatggcgcctgcaaactgctggacatggagctggagatggctttctttgtggggcctggcaacaggctgggagaaccaatccctatttcaaaggcccacgagcacatttttggcatggtgctcatgaatgactggtctgccagagatatccagaagtgggaatacgtgcccctgggcccttttctgggcaagagctttggcaccacagtgtcaccttgggtggtcccaatggatgccctgatgccctttgccgtgcccaaccccaagcaggatcccagaccactgccctacctgtgccacgatgagccctatacctttgacatcaacctgtcagtcaacctgaagggggagggcatgagccaggccgccactatttgcaagagcaactttaaatatatgtactggactatgctccaacagctcactcaccactcagtgaatggctgcaatctgaggcctggcgacctcctggctagcggcactatctctggccctgaacctgagaactttgggagcatgctggagctgtcatggaaagggacaaagcctattgacctggggaatggccagacaaggaagtttctcctggatggcgatgaggtcatcatcactgggtactgccagggcgatggctacaggattggatttggacagtgtgctggcaaagtgctcccagctctgctgccctca
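The codon-optimized variants listed above differ at the nucleotide level, but codon optimization is normally expected to leave the encoded amino acid sequence unchanged. The following is a minimal Python sketch of how that expectation might be checked. It assumes the standard genetic code; the names `CODON_TABLE`, `translate` and `VARIANT_PREFIXES` are illustrative and are not part of the disclosure, and only the first 30 nucleotides of each listed sequence are used for brevity.

```python
# Illustrative sketch only: all helper names here are hypothetical.
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring whitespace and letter case.

    Trailing nucleotides that do not form a complete codon are dropped.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of each codon-optimized sequence listed above.
VARIANT_PREFIXES = {
    "co3 (first sequence listed)": "tccttcattcctgtggccgaggattctgac",
    "hFAH-co 2 (SEQ ID NO: 21)": "tccttcatcccagtggctgaagattctgac",
    "hFAH-co 1 (SEQ ID NO: 22)": "tcctttattcctgtggccgaagatagcgac",
}

peptides = {name: translate(seq) for name, seq in VARIANT_PREFIXES.items()}

if __name__ == "__main__":
    for name, peptide in peptides.items():
        print(f"{name}: {peptide}")
    print("identical translations:", len(set(peptides.values())) == 1)
```

Applying `translate` to the full-length sequences, rather than the prefixes, would extend the same comparison to the whole coding region.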

Because GENERIDE™ is designed to provide durable treatment, it can provide life-long benefits to HT1 patients by intervening early in life with a treatment that restores the function of the aberrant gene before a decline in function can occur. In some embodiments, the therapeutic transgene is delivered using GENERIDE™ constructs designed to integrate immediately after the gene encoding albumin, which is the most highly expressed gene in the liver. In some embodiments, expression of the transgene "piggybacking" on albumin may, given the high level of albumin expression in the liver, provide adequate therapeutic levels of the desired protein.

In some embodiments, the compositions of the present disclosure comprise a viral vector capsid and a polynucleotide cassette as described herein. In some embodiments, the compositions of the present disclosure may comprise a sequence having 85%, 90%, 95%, 99% or 100% sequence identity to any of the sequences provided below:

GR-FAH-co3_1.6/1.0_kb (SEQ ID NO: 23)

cgattcattaatgcagctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttccttgtagttaatgattaacccgccatgctacttatctacgtaTTTCTAGATGTAAATAATTATTTTAAGTTTGCCCTATGGTGGCCCCACACATGAGACAAACCCCCAAGATGTGACTTTTGAGAATGAGACTTGGATAAAAAACATGTAGAAATGCAAGCCCTGAAGCTCAACTCCCTATTGCTATCACAGGGGTTATAATTGCATAAAATTTAGCTATAGAAAGTTGCTGTCATCTCTTGTGGGCTGTAATCATCGTCTAGGCTTAAGAGTAATATTGCAAAACCTGTCATGCCCACACAAATCTCTCCCTGGCATTGTTGTCTTTGCAGATGTCAGTGAAAGAGAACCAGCAGCTCCCATGAGTTTGGATAGCCTTATTTTCTATAGCCTCCCCACTATTAGCTTTGAAGGGAGCAAAGTTTAAGAACCAAATATAAAGTTTCTCATCTTTATAGATGAGAAAAATTTTAAATAAAGTCCAAGATAATTAAATTTTTAAGGATCATTTTTAGCTCTTTAATAGCAATAAAACTCAATATGACATAATATGGCACTTCCAAAATCTGAATAATATATAATTGCAATGACATACTTCTTTTCAGAGATTTACTGAAAAGAAATTTGTTGACACTACATAACGTGATGAGTGGTTTATACTGATTGTTTCAGTTGGTCTTCCCACCAACTCCATGAAAGTGGATTTTATTATCCTCATCATGCAGATGAGAATATTGAGACTTATAGCGGTATGCCTGAGCCCCAAAGTACTCAGAGTTGCCTGGCTCCAAGATTTATAATCTTAAATGATGGGACTACCATCCTTACTCTCTCCATTTTTCTATACGTGAGTAATGTTTTTTCTGTTTTTTTTTTTTCTTTTTCCATTCAAACTCAGTGCACTTGTTGAGCTCGTGAAACACAAGCCCAAGGCAACAAAAGAGCAACTGAAAGCTGTTATGGATGATTTCGCAGCTTTTGTAGAGAAGTGCTGCAAGGCTGACGATAAGGAGACCTGCTTTGCCGAGGAGGTACTACAGTTCTCTTCATTTTAATATGTCCAGTATTCATTTTTGCATGTTTGGTTAGGCTAGGGCTTAGGGATTTATATATCAAAGGAGGCTTTGTACATGTGGGACAGGGATCTTATTTTACAAACAATTGTCTTACAAAATGAATAAAACAGCACTTTGTTTTTATCTCCTGCTCTATTGTGCCATACTGTTAAATGTTTATAATGCCTGTTCTGTTTCCAAATTTGTGATGCTTATGAATATTAATAGGAATATTTGTAAGGCCTGAAATATTTTGATCATGAAATCAAAACATTAATTTATTTAAACATTTACTTGAAATGTGGTGGTTTGTGATTTAGTTGATTTTATAGGCTAGTGGGAGAATTTACATTCAAATGTCTAAATCACTTAAAATTGCCCTTTATGGCCTGACAGTAACTTTTTTTTATTCATTTGGGGACAACTATGTCCGTGAGCTTCCGTCCAGAGATTATAGTAGTAAATTGTAATTAAAGGATATGATGCACGTGAAATCACTTTGCAATCATCAATAGCTTCATAAATGTTAATTTTGTATCCTAATAGTAATGCTAATATTTTCCTAACATCTGTCATGTCTTTGTGTTCAGGGTAAAAAACTTGTTGCTGCAAGTCAAGCTGCCTTAGGCTTAGGAAGCGGCGCCACCAATTTCAGCCTGCTGAAACAGGCCGGCGACGTGGAAGAGAACCCTGGCCCTTCCTTCATTCCTGTGGCCGAGGATTCTGACTTTCCCATTCACAACCTGCCCTATGGCGTCTTTTCAACCAGGGGAGATCCCAGGCCCAGAATCGGGGTGGCCA
TTGGAGATCAGATCCTGGACCTGTCAATCATCAAGCACCTGTTTACAGGCCCTGTGCTGAGCAAGCACCAGGATGTGTTCAACCAGCCAACTCTGAACAGCTTCATGGGGCTGGGGCAGGCTGCCTGGAAGGAGGCAAGGGTGTTTCTGCAGAATCTGCTGAGCGTGTCACAGGCTAGGCTGAGGGATGATACCGAGCTGAGGAAGTGTGCATTTATCTCACAGGCATCAGCCACAATGCATCTGCCAGCTACAATCGGCGACTATACCGACTTTTACTCAAGCAGGCAGCATGCCACCAATGTGGGCATCATGTTCAGGGACAAAGAGAATGCCCTGATGCCAAATTGGCTGCACCTGCCTGTGGGCTATCATGGCAGAGCCAGCTCAGTGGTGGTGTCAGGGACACCAATTAGGAGGCCTATGGGCCAGATGAAGCCTGATGACAGCAAACCACCTGTCTACGGCGCCTGCAAGCTCCTGGATATGGAACTGGAGATGGCTTTTTTTGTGGGCCCAGGCAATAGACTGGGAGAGCCCATTCCAATCTCAAAGGCCCATGAACATATCTTTGGCATGGTCCTGATGAATGACTGGTCAGCCAGAGACATTCAGAAGTGGGAGTACGTGCCACTGGGACCATTTCTGGGAAAgAGCTTTGGCACCACAGTGTCACCATGGGTGGTGCCCATGGATGCCCTGATGCCCTTTGCTGTGCCAAATCCAAAACAAGACCCCAGGCCCCTCCCCTATCTCTGTCATGATGAACCATATACTTTTGACATTAACCTGAGCGTGAACCTGAAAGGAGAAGGCATGAGCCAAGCCGCCACTATCTGTAAGAGCAACTTCAAATATATGTACTGGACAATGCTGCAGCAGCTCACCCACCATAGCGTGAATGGGTGTAACCTGAGGCCAGGCGACCTGCTGGCATCAGGCACTATTTCAGGGCCTGAGCCTGAAAATTTTGGATCAATGCTGGAGCTGTCATGGAAGGGAACTAAACCCATCGACCTGGGGAATGGCCAGACCAGAAAGTTTCTCCTGGATGGCGATGAGGTGATCATTACAGGCTACTGTCAGGGGGATGGCTATAGAATTGGATTTGGCCAATGCGCCGGCAAAGTCCTCCCTGCCCTGCTGCCCAGCTAACATCACATTTAAAAGCATCTCAGGTAACTATATTTTGAATTTTTTAAAAAAGTAACTATAATAGTTATTATTAAAATAGCAAAGATTGACCATTTCCAAGAGCCATATAGACCAGCACCGACCACTATTCTAAACTATTTATGTATGTAAATATTAGCTTTTAAAATTCTCAAAATAGTTGCTGAGTTGGGAACCACTATTATTTCTATTTTGTAGATGAGAAAATGAAGATAAACATCAAAGCATAGATTAAGTAATTTTCCAAAGGGTCAAAATTCAAAATTGAAACCAAAGTTTCAGTGTTGCCCATTGTCCTGTTCTGACTTATATGATGCGGTACACAGAGCCATCCAAGTAAGTGATGGCTCAGCAGTGGAATACTCTGGGAATTAGGCTGAACCACATGAAAGAGTGCTTTATAGGGCAAAAACAGTTGAATATCAGTGATTTCACATGGTTCAACCTAATAGTTCAACTCATCCTTTCCATTGGAGAATATGATGGATCTACCTTCTGTGAACTTTATAGTGAAGAATCTGCTATTACATTTCCAATTTGTCAACATGCTGAGCTTTAATAGGACTTATCTTCTTATGACAACATTTATTGGTGTGTCCCCTTGCCTAGCCCAACAGAAGAATTCAGCAGCCGTAAGTCTAGGACAGGCTTAAATTGTTTTCACTGGTGTAAATTGCAGAAAGATGATCTAAGTAATTTGGCATTTATTTTAATAGGTTTGAAAAACACATGCCATTTTACAAATAAGACTTATATTTGTCCTTTTGTTTTTCAGCCTACCATGAGAATAAGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGT
TTTTCTTTTTCGTTGGTGTAAAGCCAACACCCTGTCTAAAAAACATAAATTTCTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAATCTAATAGAGTGGTACAGCACTGTTATTTTTCAAAGATGTGTTGtacgtagataagtagcatggcgggttaatcattaactacaaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagctggcgtaatagcgaagaggcccgcaccgatcgggtcaccaagcaggaagtcaaagactttttccggtgggcaaaggatcacgtggttgaggtggagcatgaattctacgtcaaaaagggtggagccaagaaaagacccgcccccagtgacgcagatataagtgagcccaaacgggtgcgcgagtcagttgcgcagccatcgacgtcagacgcggaagcttcgatcaactacgcagacaggtaccaaaacaaatgttctcgtcacgtgggcatgaatctgatgctgtttccctgcagacaatgcgagagaatgaatcagaattcaaatatctgcttcactcacggacagaaagactgtttagagtgctttcccgtgtcagaatctcaacccgtttctgtcgtcaaaaaggcgtatcagaaactgtgctacattcatcatatcatgggaaaggtgccagacgcttgcactgcctgcgatctggtcaatgtggatttggatgactgcatctttgaacaataaatgatttaaatcaggtatggctgctgacggttatcttccagattggctcgaggacaacctttctgaaggcattcgagagtggtgggcgctgcaacctggagcccctaaacccaaggcaaatcaacaacatcaggacaacgctcggggtcttgtgcttccgggttacaaatacctcggacccggcaacggactcgacaagggggaacccgtcaacgcagcggacgcggcagccctcgagcacgacaaggcctacgaccagcagctcaaggccggtgacaacccctacctcaagtacaaccacgccgacgccgagttccaggagcggctcaaagaagatacgtcttttgggggcaacctcgggcgagcagtcttccaggccaaaaagaggcttcttgaacctcttggtctggttgaggaagcggctaagacggctcctggaaagaagaggcctgtagatcagtctcctcaggaaccggactcatcatctggtgttggcaaatcgggcaaacagcctgccagaaaaagactaaatttcggtcagactggcgactcagagtcagtcccagaccctcaacctctcggagaaccaccagcagcccccacaagtttgggatctaatacaatggcttcaggcggtggcgcaccaatggcagacaataacgagggtgccgatggagtgggtaattcctcaggaaattggcattgcgattcccaatggctgggcgacagagtcatcaccaccagcaccagaacctgggccctgcccacttacaacaaccatctctacaagcaaatctccagccaatcaggagcttcaaacgacaaccactactttggctacagcaccccttgggggtattttgactttaacagattccactgccacttctcaccacgtgactggcagcgactcattaacaacaactggggattccggcccaagaaactcagcttcaagctcttcaacatccaagttaaagaggtcacgcagaacgatggcacgacgactattgccaataaccttaccagcacggttcaagtgtttacggactcggagtatcagctcccgtacgtgctcgggtcggcgcaccaaggctgtctcccgccgtttccagcggacgtcttcatggtccctcagtatggatacctcaccctgaac
aacggaagtcaagcggtgggacgctcatccttttactgcctggagtacttcccttcgcagatgctaaggactggaaataacttccaattcagctataccttcgaggatgtaccttttcacagcagctacgctcacagccagagtttggatcgcttgatgaatcctcttattgatcagtatctgtactacctgaacagaacgcaaggaacaacctctggaacaaccaaccaatcacggctgctttttagccaggctgggcctcagtctatgtctttgcaggccagaaattggctacctgggccctgctaccggcaacagagactttcaaagactgctaacgacaacaacaacagtaactttccttggacagcggccagcaaatatcatctcaatggccgcgactcgctggtgaatccaggaccagctatggccagtcacaaggacgatgaagaaaaatttttccctatgcacggcaatctaatatttggcaaagaagggacaacggcaagtaacgcagaattagataatgtaatgattacggatgaagaagagattcgtaccaccaatcctgtggcaacagagcagtatggaactgtggcaaataacttgcagagctcaaatacagctcccacgactagaactgtcaatgatcagggggccttacctggcatggtgtggcaagatcgtgacgtgtaccttcaaggacctatctgggcaaagattcctcacacggatggacactttcatccttctcctctgatgggaggctttggactgaaacatccgcctcctcaaatcatgatcaaaaatactccggtaccggcaaatcctccgacgactttcagcccggccaagtttgcttcatttatcactcagtactccactggacaggtcagcgtggaaattgagtgggagctacagaaagaaaacagcaaacgttggaatccagagattcagtacacttccaactacaacaagtctgttaatgtggactttactgtagacactaatggtgtttatagtgaacctcgccccattggcacccgttaccttacccgtcccctgtaattgcttgttaatcaataaaccgtttaattcgtttcagttgaactttggtctctgcgtatttctttcttatctagtttccatatgcatgtagataagtagcatggcgggttaatcattaactaaccggtacctctagaactatagctagcatgcgcaaatttaaagcgctgatatcgatcgcgcgcagatctgtcatgatgatcattgcaattggatccatatatagggcccgggttataattacctcaggtcgacgtcccatggccattcgaattcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagc
tgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttagaaaaactcatcgagcatcaaatgaaactgcaatttattcatatcaggattatcaataccatatttttgaaaaagccgtttctgtaatgaaggagaaaactcaccgaggcagttccataggatggcaagatcctggtatcggtctgcgattccgactcgtccaacatcaatacaacctattaatttcccctcgtcaaaaataaggttatcaagtgagaaatcaccatgagtgacgactgaatccggtgagaatggcaaaagtttatgcatttctttccagacttgttcaacaggccagccattacgctcgtcatcaaaatcactcgcatcaaccaaaccgttattcattcgtgattgcgcctgagcgagacgaaatacgcgatcgctgttaaaaggacaattacaaacaggaatcgaatgcaaccggcgcaggaacactgccagcgcatcaacaatattttcacctgaatcaggatattcttctaatacctggaatgctgtttttccggggatcgcagtggtgagtaaccatgcatcatcaggagtacggataaaatgcttgatggtcggaagaggcataaattccgtcagccagtttagtctgaccatctcatctgtaacatcattggcaacgctacctttgccatgtttcagaaacaactctggcgcatcgggcttcccatacaagcgatagattgtcgcacctgattgcccgacattatcgcgagcccatttatacccatataaatcagcatccatgttggaatttaatcgcggcctcgacgtttcccgttgaatatggctcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgcggcatcagagcagattgtactgagagtgcaccataaaattgtaaacgttaatattttgttaaaattcgcgttaaatttttgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatcccttataaatcaaaagaatagcccgagatagggttgagtgttgttccagtttggaacaagagtccactattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatcacccaaatcaagttttttggggtcgaggtgccgtaaagcactaaatcggaaccct
aaagggagcccccgatttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtactatggttgctttgacgtatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctattacgccagctggcgaaagggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacgttgtaaaacgacggccaacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggc
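The sequence-identity percentages recited before this listing (85%, 90%, 95%, 99% or 100%) can be illustrated with a simple calculation. The sketch below is an assumption-laden simplification: it compares two sequences that are already aligned and of equal length, position by position, whereas practical identity calculations usually align the sequences first, and the disclosure may define identity differently. The function name `percent_identity` is hypothetical. The inputs are the 30-nucleotide prefixes of two of the codon-optimized FAH coding sequences listed earlier.

```python
# Illustrative sketch only: a position-by-position identity calculation
# for pre-aligned, equal-length sequences (no gaps, no alignment step).


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two sequences match.

    Whitespace is ignored and the comparison is case-insensitive.
    """
    a = "".join(seq_a.split()).upper()
    b = "".join(seq_b.split()).upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# 30-nucleotide prefixes of two codon-optimized FAH coding sequences.
CO3_PREFIX = "tccttcattcctgtggccgaggattctgac"
CO2_PREFIX = "tccttcatcccagtggctgaagattctgac"

if __name__ == "__main__":
    identity = percent_identity(CO3_PREFIX, CO2_PREFIX)
    print(f"identity over {len(CO3_PREFIX)} positions: {identity:.1f}%")
    for threshold in (85, 90, 95, 99, 100):
        print(f"  at least {threshold}%: {identity >= threshold}")
```

Because the comparison is case-insensitive, the mixed upper- and lower-case lettering used in the construct listings does not affect the result.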

GR-FAH-co2_1.6/1.0_kb (SEQ ID NO: 24)

cgattcattaatgcagctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttccttgtagttaatgattaacccgccatgctacttatctacgtaTTTCTAGATGTAAATAATTATTTTAAGTTTGCCCTATGGTGGCCCCACACATGAGACAAACCCCCAAGATGTGACTTTTGAGAATGAGACTTGGATAAAAAACATGTAGAAATGCAAGCCCTGAAGCTCAACTCCCTATTGCTATCACAGGGGTTATAATTGCATAAAATTTAGCTATAGAAAGTTGCTGTCATCTCTTGTGGGCTGTAATCATCGTCTAGGCTTAAGAGTAATATTGCAAAACCTGTCATGCCCACACAAATCTCTCCCTGGCATTGTTGTCTTTGCAGATGTCAGTGAAAGAGAACCAGCAGCTCCCATGAGTTTGGATAGCCTTATTTTCTATAGCCTCCCCACTATTAGCTTTGAAGGGAGCAAAGTTTAAGAACCAAATATAAAGTTTCTCATCTTTATAGATGAGAAAAATTTTAAATAAAGTCCAAGATAATTAAATTTTTAAGGATCATTTTTAGCTCTTTAATAGCAATAAAACTCAATATGACATAATATGGCACTTCCAAAATCTGAATAATATATAATTGCAATGACATACTTCTTTTCAGAGATTTACTGAAAAGAAATTTGTTGACACTACATAACGTGATGAGTGGTTTATACTGATTGTTTCAGTTGGTCTTCCCACCAACTCCATGAAAGTGGATTTTATTATCCTCATCATGCAGATGAGAATATTGAGACTTATAGCGGTATGCCTGAGCCCCAAAGTACTCAGAGTTGCCTGGCTCCAAGATTTATAATCTTAAATGATGGGACTACCATCCTTACTCTCTCCATTTTTCTATACGTGAGTAATGTTTTTTCTGTTTTTTTTTTTTCTTTTTCCATTCAAACTCAGTGCACTTGTTGAGCTCGTGAAACACAAGCCCAAGGCAACAAAAGAGCAACTGAAAGCTGTTATGGATGATTTCGCAGCTTTTGTAGAGAAGTGCTGCAAGGCTGACGATAAGGAGACCTGCTTTGCCGAGGAGGTACTACAGTTCTCTTCATTTTAATATGTCCAGTATTCATTTTTGCATGTTTGGTTAGGCTAGGGCTTAGGGATTTATATATCAAAGGAGGCTTTGTACATGTGGGACAGGGATCTTATTTTACAAACAATTGTCTTACAAAATGAATAAAACAGCACTTTGTTTTTATCTCCTGCTCTATTGTGCCATACTGTTAAATGTTTATAATGCCTGTTCTGTTTCCAAATTTGTGATGCTTATGAATATTAATAGGAATATTTGTAAGGCCTGAAATATTTTGATCATGAAATCAAAACATTAATTTATTTAAACATTTACTTGAAATGTGGTGGTTTGTGATTTAGTTGATTTTATAGGCTAGTGGGAGAATTTACATTCAAATGTCTAAATCACTTAAAATTGCCCTTTATGGCCTGACAGTAACTTTTTTTTATTCATTTGGGGACAACTATGTCCGTGAGCTTCCGTCCAGAGATTATAGTAGTAAATTGTAATTAAAGGATATGATGCACGTGAAATCACTTTGCAATCATCAATAGCTTCATAAATGTTAATTTTGTATCCTAATAGTAATGCTAATATTTTCCTAACATCTGTCATGTCTTTGTGTTCAGGGTAAAAAACTTGTTGCTGCAAGTCAAGCTGCCTTAGGCTTAGGAAGCGGCGCCACCAATTTCAGCCTGCTGAAACAGGCCGGCGACGTGGAAGAGAACCCTGGCCCTTCCTTCATCCCAGTGGCTGAAGATTCTGACTTCCCCATTCACAACCTCCCTTATGGAGTCTTTAGCACAAGAGGAGACCCAAGACCTAGAATTGGCGTGGCTA
TTGGGGATCAGATCCTGGACCTGAGCATTATCAAACATCTCTTTACAGGCCCAGTCCTCAGCAAGCACCAAGATGTGTTCAATCAACCCACACTGAACTCATTTATGGGGCTGGGCCAAGCTGCCTGGAAGGAGGCAAGAGTGTTTCTCCAGAACCTCCTGTCAGTGAGCCAGGCCAGGCTGAGAGATGACACAGAGCTGAGGAAGTGTGCCTTTATCTCACAAGCCAGCGCCACAATGCACCTGCCAGCAACCATCGGGGACTACACAGACTTTTACAGCAGCAGGCAGCACGCCACTAATGTGGGCATCATGTTTAGAGACAAAGAGAATGCCCTCATGCCTAACTGGCTGCATCTGCCAGTGGGCTACCATGGCAGAGCCAGCAGCGTGGTGGTGTCTGGCACCCCTATCAGGAGGCCCATGGGCCAGATGAAGCCTGATGACAGCAAGCCACCTGTGTATGGGGCATGTAAGCTGCTGGACATGGAGCTGGAAATGGCTTTCTTTGTGGGGCCTGGGAACAGGCTGGGCGAACCAATTCCCATCAGCAAAGCTCATGAACACATTTTTGGGATGGTGCTCATGAATGACTGGTCTGCCAGAGACATCCAGAAGTGGGAGTATGTCCCCCTGGGACCATTCCTGGGCAAGAGCTTTGGGACCACTGTGTCACCATGGGTGGTGCCCATGGATGCCCTGATGCCATTTGCTGTGCCTAATCCCAAACAAGATCCAAGGCCACTGCCCTACCTCTGCCACGATGAGCCCTACACATTTGATATCAACCTCTCTGTGAATCTGAAGGGAGAAGGAATGAGCCAAGCTGCTACCATTTGCAAAAGCAACTTTAAGTATATGTATTGGACCATGCTGCAGCAACTCACCCACCACAGCGTGAATGGCTGTAACCTGAGGCCTGGCGACCTGCTGGCCTCTGGCACCATCAGCGGCCCTGAACCTGAGAACTTTGGCAGCATGCTGGAGCTGAGCTGGAAGGGAACCAAGCCCATTGACCTGGGCAATGGACAGACCAGGAAATTCCTCCTGGACGGGGACGAGGTGATCATCACAGGCTATTGCCAGGGGGATGGGTATAGGATTGGCTTTGGCCAATGTGCCGGCAAAGTGCTGCCTGCCCTGCTCCCTAGCTAACATCACATTTAAAAGCATCTCAGGTAACTATATTTTGAATTTTTTAAAAAAGTAACTATAATAGTTATTATTAAAATAGCAAAGATTGACCATTTCCAAGAGCCATATAGACCAGCACCGACCACTATTCTAAACTATTTATGTATGTAAATATTAGCTTTTAAAATTCTCAAAATAGTTGCTGAGTTGGGAACCACTATTATTTCTATTTTGTAGATGAGAAAATGAAGATAAACATCAAAGCATAGATTAAGTAATTTTCCAAAGGGTCAAAATTCAAAATTGAAACCAAAGTTTCAGTGTTGCCCATTGTCCTGTTCTGACTTATATGATGCGGTACACAGAGCCATCCAAGTAAGTGATGGCTCAGCAGTGGAATACTCTGGGAATTAGGCTGAACCACATGAAAGAGTGCTTTATAGGGCAAAAACAGTTGAATATCAGTGATTTCACATGGTTCAACCTAATAGTTCAACTCATCCTTTCCATTGGAGAATATGATGGATCTACCTTCTGTGAACTTTATAGTGAAGAATCTGCTATTACATTTCCAATTTGTCAACATGCTGAGCTTTAATAGGACTTATCTTCTTATGACAACATTTATTGGTGTGTCCCCTTGCCTAGCCCAACAGAAGAATTCAGCAGCCGTAAGTCTAGGACAGGCTTAAATTGTTTTCACTGGTGTAAATTGCAGAAAGATGATCTAAGTAATTTGGCATTTATTTTAATAGGTTTGAAAAACACATGCCATTTTACAAATAAGACTTATATTTGTCCTTTTGTTTTTCAGCCTACCATGAGAATAAGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGT
TTTTCTTTTTCGTTGGTGTAAAGCCAACACCCTGTCTAAAAAACATAAATTTCTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAATCTAATAGAGTGGTACAGCACTGTTATTTTTCAAAGATGTGTTGtacgtagataagtagcatggcgggttaatcattaactacaaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagctggcgtaatagcgaagaggcccgcaccgatcgggtcaccaagcaggaagtcaaagactttttccggtgggcaaaggatcacgtggttgaggtggagcatgaattctacgtcaaaaagggtggagccaagaaaagacccgcccccagtgacgcagatataagtgagcccaaacgggtgcgcgagtcagttgcgcagccatcgacgtcagacgcggaagcttcgatcaactacgcagacaggtaccaaaacaaatgttctcgtcacgtgggcatgaatctgatgctgtttccctgcagacaatgcgagagaatgaatcagaattcaaatatctgcttcactcacggacagaaagactgtttagagtgctttcccgtgtcagaatctcaacccgtttctgtcgtcaaaaaggcgtatcagaaactgtgctacattcatcatatcatgggaaaggtgccagacgcttgcactgcctgcgatctggtcaatgtggatttggatgactgcatctttgaacaataaatgatttaaatcaggtatggctgctgacggttatcttccagattggctcgaggacaacctttctgaaggcattcgagagtggtgggcgctgcaacctggagcccctaaacccaaggcaaatcaacaacatcaggacaacgctcggggtcttgtgcttccgggttacaaatacctcggacccggcaacggactcgacaagggggaacccgtcaacgcagcggacgcggcagccctcgagcacgacaaggcctacgaccagcagctcaaggccggtgacaacccctacctcaagtacaaccacgccgacgccgagttccaggagcggctcaaagaagatacgtcttttgggggcaacctcgggcgagcagtcttccaggccaaaaagaggcttcttgaacctcttggtctggttgaggaagcggctaagacggctcctggaaagaagaggcctgtagatcagtctcctcaggaaccggactcatcatctggtgttggcaaatcgggcaaacagcctgccagaaaaagactaaatttcggtcagactggcgactcagagtcagtcccagaccctcaacctctcggagaaccaccagcagcccccacaagtttgggatctaatacaatggcttcaggcggtggcgcaccaatggcagacaataacgagggtgccgatggagtgggtaattcctcaggaaattggcattgcgattcccaatggctgggcgacagagtcatcaccaccagcaccagaacctgggccctgcccacttacaacaaccatctctacaagcaaatctccagccaatcaggagcttcaaacgacaaccactactttggctacagcaccccttgggggtattttgactttaacagattccactgccacttctcaccacgtgactggcagcgactcattaacaacaactggggattccggcccaagaaactcagcttcaagctcttcaacatccaagttaaagaggtcacgcagaacgatggcacgacgactattgccaataaccttaccagcacggttcaagtgtttacggactcggagtatcagctcccgtacgtgctcgggtcggcgcaccaaggctgtctcccgccgtttccagcggacgtcttcatggtccctcagtatggatacctcaccctgaac
aacggaagtcaagcggtgggacgctcatccttttactgcctggagtacttcccttcgcagatgctaaggactggaaataacttccaattcagctataccttcgaggatgtaccttttcacagcagctacgctcacagccagagtttggatcgcttgatgaatcctcttattgatcagtatctgtactacctgaacagaacgcaaggaacaacctctggaacaaccaaccaatcacggctgctttttagccaggctgggcctcagtctatgtctttgcaggccagaaattggctacctgggccctgctaccggcaacagagactttcaaagactgctaacgacaacaacaacagtaactttccttggacagcggccagcaaatatcatctcaatggccgcgactcgctggtgaatccaggaccagctatggccagtcacaaggacgatgaagaaaaatttttccctatgcacggcaatctaatatttggcaaagaagggacaacggcaagtaacgcagaattagataatgtaatgattacggatgaagaagagattcgtaccaccaatcctgtggcaacagagcagtatggaactgtggcaaataacttgcagagctcaaatacagctcccacgactagaactgtcaatgatcagggggccttacctggcatggtgtggcaagatcgtgacgtgtaccttcaaggacctatctgggcaaagattcctcacacggatggacactttcatccttctcctctgatgggaggctttggactgaaacatccgcctcctcaaatcatgatcaaaaatactccggtaccggcaaatcctccgacgactttcagcccggccaagtttgcttcatttatcactcagtactccactggacaggtcagcgtggaaattgagtgggagctacagaaagaaaacagcaaacgttggaatccagagattcagtacacttccaactacaacaagtctgttaatgtggactttactgtagacactaatggtgtttatagtgaacctcgccccattggcacccgttaccttacccgtcccctgtaattgcttgttaatcaataaaccgtttaattcgtttcagttgaactttggtctctgcgtatttctttcttatctagtttccatatgcatgtagataagtagcatggcgggttaatcattaactaaccggtacctctagaactatagctagcatgcgcaaatttaaagcgctgatatcgatcgcgcgcagatctgtcatgatgatcattgcaattggatccatatatagggcccgggttataattacctcaggtcgacgtcccatggccattcgaattcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagc
tgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttagaaaaactcatcgagcatcaaatgaaactgcaatttattcatatcaggattatcaataccatatttttgaaaaagccgtttctgtaatgaaggagaaaactcaccgaggcagttccataggatggcaagatcctggtatcggtctgcgattccgactcgtccaacatcaatacaacctattaatttcccctcgtcaaaaataaggttatcaagtgagaaatcaccatgagtgacgactgaatccggtgagaatggcaaaagtttatgcatttctttccagacttgttcaacaggccagccattacgctcgtcatcaaaatcactcgcatcaaccaaaccgttattcattcgtgattgcgcctgagcgagacgaaatacgcgatcgctgttaaaaggacaattacaaacaggaatcgaatgcaaccggcgcaggaacactgccagcgcatcaacaatattttcacctgaatcaggatattcttctaatacctggaatgctgtttttccggggatcgcagtggtgagtaaccatgcatcatcaggagtacggataaaatgcttgatggtcggaagaggcataaattccgtcagccagtttagtctgaccatctcatctgtaacatcattggcaacgctacctttgccatgtttcagaaacaactctggcgcatcgggcttcccatacaagcgatagattgtcgcacctgattgcccgacattatcgcgagcccatttatacccatataaatcagcatccatgttggaatttaatcgcggcctcgacgtttcccgttgaatatggctcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgcggcatcagagcagattgtactgagagtgcaccataaaattgtaaacgttaatattttgttaaaattcgcgttaaatttttgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatcccttataaatcaaaagaatagcccgagatagggttgagtgttgttccagtttggaacaagagtccactattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatcacccaaatcaagttttttggggtcgaggtgccgtaaagcactaaatcggaaccct
aaagggagcccccgatttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtactatggttgctttgacgtatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctattacgccagctggcgaaagggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacgttgtaaaacgacggccaacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggc

GR-FAH-co1_1.6/1.0_kb (SEQ ID NO: 25)

cgattcattaatgcagctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttccttgtagttaatgattaacccgccatgctacttatctacgtaTTTCTAGATGTAAATAATTATTTTAAGTTTGCCCTATGGTGGCCCCACACATGAGACAAACCCCCAAGATGTGACTTTTGAGAATGAGACTTGGATAAAAAACATGTAGAAATGCAAGCCCTGAAGCTCAACTCCCTATTGCTATCACAGGGGTTATAATTGCATAAAATTTAGCTATAGAAAGTTGCTGTCATCTCTTGTGGGCTGTAATCATCGTCTAGGCTTAAGAGTAATATTGCAAAACCTGTCATGCCCACACAAATCTCTCCCTGGCATTGTTGTCTTTGCAGATGTCAGTGAAAGAGAACCAGCAGCTCCCATGAGTTTGGATAGCCTTATTTTCTATAGCCTCCCCACTATTAGCTTTGAAGGGAGCAAAGTTTAAGAACCAAATATAAAGTTTCTCATCTTTATAGATGAGAAAAATTTTAAATAAAGTCCAAGATAATTAAATTTTTAAGGATCATTTTTAGCTCTTTAATAGCAATAAAACTCAATATGACATAATATGGCACTTCCAAAATCTGAATAATATATAATTGCAATGACATACTTCTTTTCAGAGATTTACTGAAAAGAAATTTGTTGACACTACATAACGTGATGAGTGGTTTATACTGATTGTTTCAGTTGGTCTTCCCACCAACTCCATGAAAGTGGATTTTATTATCCTCATCATGCAGATGAGAATATTGAGACTTATAGCGGTATGCCTGAGCCCCAAAGTACTCAGAGTTGCCTGGCTCCAAGATTTATAATCTTAAATGATGGGACTACCATCCTTACTCTCTCCATTTTTCTATACGTGAGTAATGTTTTTTCTGTTTTTTTTTTTTCTTTTTCCATTCAAACTCAGTGCACTTGTTGAGCTCGTGAAACACAAGCCCAAGGCAACAAAAGAGCAACTGAAAGCTGTTATGGATGATTTCGCAGCTTTTGTAGAGAAGTGCTGCAAGGCTGACGATAAGGAGACCTGCTTTGCCGAGGAGGTACTACAGTTCTCTTCATTTTAATATGTCCAGTATTCATTTTTGCATGTTTGGTTAGGCTAGGGCTTAGGGATTTATATATCAAAGGAGGCTTTGTACATGTGGGACAGGGATCTTATTTTACAAACAATTGTCTTACAAAATGAATAAAACAGCACTTTGTTTTTATCTCCTGCTCTATTGTGCCATACTGTTAAATGTTTATAATGCCTGTTCTGTTTCCAAATTTGTGATGCTTATGAATATTAATAGGAATATTTGTAAGGCCTGAAATATTTTGATCATGAAATCAAAACATTAATTTATTTAAACATTTACTTGAAATGTGGTGGTTTGTGATTTAGTTGATTTTATAGGCTAGTGGGAGAATTTACATTCAAATGTCTAAATCACTTAAAATTGCCCTTTATGGCCTGACAGTAACTTTTTTTTATTCATTTGGGGACAACTATGTCCGTGAGCTTCCGTCCAGAGATTATAGTAGTAAATTGTAATTAAAGGATATGATGCACGTGAAATCACTTTGCAATCATCAATAGCTTCATAAATGTTAATTTTGTATCCTAATAGTAATGCTAATATTTTCCTAACATCTGTCATGTCTTTGTGTTCAGGGTAAAAAACTTGTTGCTGCAAGTCAAGCTGCCTTAGGCTTAGGAAGCGGCGCCACCAATTTCAGCCTGCTGAAACAGGCCGGCGACGTGGAAGAGAACCCTGGCCCTTCCTTTATTCCTGTGGCCGAAGATAGCGACTTCCCTATCCATAACCTCCCATATGGCGTCTTCTCAACCAGGGGCGACCCCAGGCCCAGGATTGGGGTGGCAA
TTGGAGACCAGATCCTGGACCTCAGCATCATTAAGCACCTCTTTACAGGACCTGTGCTGAGCAAGCACCAGGATGTGTTCAACCAGCCTACCCTGAATAGCTTTATGGGACTGGGCCAGGCAGCATGGAAAGAAGCCAGAGTGTTCCTCCAGAATCTGCTGAGCGTGAGCCAGGCCAGGCTCAGGGATGACACAGAACTGAGGAAATGTGCCTTCATCTCACAGGCATCAGCCACAATGCACCTCCCTGCCACAATTGGGGACTACACTGACTTCTACAGCAGCAGACAGCATGCCACTAATGTGGGGATTATGTTTAGAGACAAGGAAAATGCCCTCATGCCAAATTGGCTGCACCTGCCTGTGGGCTACCACGGCAGAGCCTCATCAGTGGTGGTGTCAGGCACACCAATTAGGAGGCCAATGGGACAGATGAAGCCTGACGACTCAAAGCCACCAGTCTATGGCGCCTGCAAACTGCTGGACATGGAGCTGGAGATGGCTTTCTTTGTGGGGCCTGGCAACAGGCTGGGAGAACCAATCCCTATTTCAAAGGCCCACGAGCACATTTTTGGCATGGTGCTCATGAATGACTGGTCTGCCAGAGATATCCAGAAGTGGGAATACGTGCCCCTGGGCCCTTTTCTGGGCAAGAGCTTTGGCACCACAGTGTCACCTTGGGTGGTCCCAATGGATGCCCTGATGCCCTTTGCCGTGCCCAACCCCAAGCAGGATCCCAGACCACTGCCCTACCTGTGCCACGATGAGCCCTATACCTTTGACATCAACCTGTCAGTCAACCTGAAGGGGGAGGGCATGAGCCAGGCCGCCACTATTTGCAAGAGCAACTTTAAATATATGTACTGGACTATGCTCCAACAGCTCACTCACCACTCAGTGAATGGCTGCAATCTGAGGCCTGGCGACCTCCTGGCTAGCGGCACTATCTCTGGCCCTGAACCTGAGAACTTTGGGAGCATGCTGGAGCTGTCATGGAAAGGGACAAAGCCTATTGACCTGGGGAATGGCCAGACAAGGAAGTTTCTCCTGGATGGCGATGAGGTCATCATCACTGGGTACTGCCAGGGCGATGGCTACAGGATTGGATTTGGACAGTGTGCTGGCAAAGTGCTCCCAGCTCTGCTGCCCTCATAACATCACATTTAAAAGCATCTCAGGTAACTATATTTTGAATTTTTTAAAAAAGTAACTATAATAGTTATTATTAAAATAGCAAAGATTGACCATTTCCAAGAGCCATATAGACCAGCACCGACCACTATTCTAAACTATTTATGTATGTAAATATTAGCTTTTAAAATTCTCAAAATAGTTGCTGAGTTGGGAACCACTATTATTTCTATTTTGTAGATGAGAAAATGAAGATAAACATCAAAGCATAGATTAAGTAATTTTCCAAAGGGTCAAAATTCAAAATTGAAACCAAAGTTTCAGTGTTGCCCATTGTCCTGTTCTGACTTATATGATGCGGTACACAGAGCCATCCAAGTAAGTGATGGCTCAGCAGTGGAATACTCTGGGAATTAGGCTGAACCACATGAAAGAGTGCTTTATAGGGCAAAAACAGTTGAATATCAGTGATTTCACATGGTTCAACCTAATAGTTCAACTCATCCTTTCCATTGGAGAATATGATGGATCTACCTTCTGTGAACTTTATAGTGAAGAATCTGCTATTACATTTCCAATTTGTCAACATGCTGAGCTTTAATAGGACTTATCTTCTTATGACAACATTTATTGGTGTGTCCCCTTGCCTAGCCCAACAGAAGAATTCAGCAGCCGTAAGTCTAGGACAGGCTTAAATTGTTTTCACTGGTGTAAATTGCAGAAAGATGATCTAAGTAATTTGGCATTTATTTTAATAGGTTTGAAAAACACATGCCATTTTACAAATAAGACTTATATTTGTCCTTTTGTTTTTCAGCCTACCATGAGAATAAGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGT
TTTTCTTTTTCGTTGGTGTAAAGCCAACACCCTGTCTAAAAAACATAAATTTCTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAATCTAATAGAGTGGTACAGCACTGTTATTTTTCAAAGATGTGTTGtacgtagataagtagcatggcgggttaatcattaactacaaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagctggcgtaatagcgaagaggcccgcaccgatcgggtcaccaagcaggaagtcaaagactttttccggtgggcaaaggatcacgtggttgaggtggagcatgaattctacgtcaaaaagggtggagccaagaaaagacccgcccccagtgacgcagatataagtgagcccaaacgggtgcgcgagtcagttgcgcagccatcgacgtcagacgcggaagcttcgatcaactacgcagacaggtaccaaaacaaatgttctcgtcacgtgggcatgaatctgatgctgtttccctgcagacaatgcgagagaatgaatcagaattcaaatatctgcttcactcacggacagaaagactgtttagagtgctttcccgtgtcagaatctcaacccgtttctgtcgtcaaaaaggcgtatcagaaactgtgctacattcatcatatcatgggaaaggtgccagacgcttgcactgcctgcgatctggtcaatgtggatttggatgactgcatctttgaacaataaatgatttaaatcaggtatggctgctgacggttatcttccagattggctcgaggacaacctttctgaaggcattcgagagtggtgggcgctgcaacctggagcccctaaacccaaggcaaatcaacaacatcaggacaacgctcggggtcttgtgcttccgggttacaaatacctcggacccggcaacggactcgacaagggggaacccgtcaacgcagcggacgcggcagccctcgagcacgacaaggcctacgaccagcagctcaaggccggtgacaacccctacctcaagtacaaccacgccgacgccgagttccaggagcggctcaaagaagatacgtcttttgggggcaacctcgggcgagcagtcttccaggccaaaaagaggcttcttgaacctcttggtctggttgaggaagcggctaagacggctcctggaaagaagaggcctgtagatcagtctcctcaggaaccggactcatcatctggtgttggcaaatcgggcaaacagcctgccagaaaaagactaaatttcggtcagactggcgactcagagtcagtcccagaccctcaacctctcggagaaccaccagcagcccccacaagtttgggatctaatacaatggcttcaggcggtggcgcaccaatggcagacaataacgagggtgccgatggagtgggtaattcctcaggaaattggcattgcgattcccaatggctgggcgacagagtcatcaccaccagcaccagaacctgggccctgcccacttacaacaaccatctctacaagcaaatctccagccaatcaggagcttcaaacgacaaccactactttggctacagcaccccttgggggtattttgactttaacagattccactgccacttctcaccacgtgactggcagcgactcattaacaacaactggggattccggcccaagaaactcagcttcaagctcttcaacatccaagttaaagaggtcacgcagaacgatggcacgacgactattgccaataaccttaccagcacggttcaagtgtttacggactcggagtatcagctcccgtacgtgctcgggtcggcgcaccaaggctgtctcccgccgtttccagcggacgtcttcatggtccctcagtatggatacctcaccctgaac
aacggaagtcaagcggtgggacgctcatccttttactgcctggagtacttcccttcgcagatgctaaggactggaaataacttccaattcagctataccttcgaggatgtaccttttcacagcagctacgctcacagccagagtttggatcgcttgatgaatcctcttattgatcagtatctgtactacctgaacagaacgcaaggaacaacctctggaacaaccaaccaatcacggctgctttttagccaggctgggcctcagtctatgtctttgcaggccagaaattggctacctgggccctgctaccggcaacagagactttcaaagactgctaacgacaacaacaacagtaactttccttggacagcggccagcaaatatcatctcaatggccgcgactcgctggtgaatccaggaccagctatggccagtcacaaggacgatgaagaaaaatttttccctatgcacggcaatctaatatttggcaaagaagggacaacggcaagtaacgcagaattagataatgtaatgattacggatgaagaagagattcgtaccaccaatcctgtggcaacagagcagtatggaactgtggcaaataacttgcagagctcaaatacagctcccacgactagaactgtcaatgatcagggggccttacctggcatggtgtggcaagatcgtgacgtgtaccttcaaggacctatctgggcaaagattcctcacacggatggacactttcatccttctcctctgatgggaggctttggactgaaacatccgcctcctcaaatcatgatcaaaaatactccggtaccggcaaatcctccgacgactttcagcccggccaagtttgcttcatttatcactcagtactccactggacaggtcagcgtggaaattgagtgggagctacagaaagaaaacagcaaacgttggaatccagagattcagtacacttccaactacaacaagtctgttaatgtggactttactgtagacactaatggtgtttatagtgaacctcgccccattggcacccgttaccttacccgtcccctgtaattgcttgttaatcaataaaccgtttaattcgtttcagttgaactttggtctctgcgtatttctttcttatctagtttccatatgcatgtagataagtagcatggcgggttaatcattaactaaccggtacctctagaactatagctagcatgcgcaaatttaaagcgctgatatcgatcgcgcgcagatctgtcatgatgatcattgcaattggatccatatatagggcccgggttataattacctcaggtcgacgtcccatggccattcgaattcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagc
tgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttagaaaaactcatcgagcatcaaatgaaactgcaatttattcatatcaggattatcaataccatatttttgaaaaagccgtttctgtaatgaaggagaaaactcaccgaggcagttccataggatggcaagatcctggtatcggtctgcgattccgactcgtccaacatcaatacaacctattaatttcccctcgtcaaaaataaggttatcaagtgagaaatcaccatgagtgacgactgaatccggtgagaatggcaaaagtttatgcatttctttccagacttgttcaacaggccagccattacgctcgtcatcaaaatcactcgcatcaaccaaaccgttattcattcgtgattgcgcctgagcgagacgaaatacgcgatcgctgttaaaaggacaattacaaacaggaatcgaatgcaaccggcgcaggaacactgccagcgcatcaacaatattttcacctgaatcaggatattcttctaatacctggaatgctgtttttccggggatcgcagtggtgagtaaccatgcatcatcaggagtacggataaaatgcttgatggtcggaagaggcataaattccgtcagccagtttagtctgaccatctcatctgtaacatcattggcaacgctacctttgccatgtttcagaaacaactctggcgcatcgggcttcccatacaagcgatagattgtcgcacctgattgcccgacattatcgcgagcccatttatacccatataaatcagcatccatgttggaatttaatcgcggcctcgacgtttcccgttgaatatggctcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgcggcatcagagcagattgtactgagagtgcaccataaaattgtaaacgttaatattttgttaaaattcgcgttaaatttttgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatcccttataaatcaaaagaatagcccgagatagggttgagtgttgttccagtttggaacaagagtccactattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatcacccaaatcaagttttttggggtcgaggtgccgtaaagcactaaatcggaaccct
aaagggagcccccgatttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtactatggttgctttgacgtatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctattacgccagctggcgaaagggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacgttgtaaaacgacggccaacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggc

GR-FAH-co1_1kb (SEQ ID NO. 26)

cgattcattaatgcagctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttccttgtagttaatgattaacccgccatgctacttatctacgtaACTCCATGAAAGTGGATTTTATTATCCTCATCATGCAGATGAGAATATTGAGACTTATAGCGGTATGCCTGAGCCCCAAAGTACTCAGAGTTGCCTGGCTCCAAGATTTATAATCTTAAATGATGGGACTACCATCCTTACTCTCTCCATTTTTCTATACGTGAGTAATGTTTTTTCTGTTTTTTTTTTTTCTTTTTCCATTCAAACTCAGTGCACTTGTTGAGCTCGTGAAACACAAGCCCAAGGCAACAAAAGAGCAACTGAAAGCTGTTATGGATGATTTCGCAGCTTTTGTAGAGAAGTGCTGCAAGGCTGACGATAAGGAGACCTGCTTTGCCGAGGAGGTACTACAGTTCTCTTCATTTTAATATGTCCAGTATTCATTTTTGCATGTTTGGTTAGGCTAGGGCTTAGGGATTTATATATCAAAGGAGGCTTTGTACATGTGGGACAGGGATCTTATTTTACAAACAATTGTCTTACAAAATGAATAAAACAGCACTTTGTTTTTATCTCCTGCTCTATTGTGCCATACTGTTAAATGTTTATAATGCCTGTTCTGTTTCCAAATTTGTGATGCTTATGAATATTAATAGGAATATTTGTAAGGCCTGAAATATTTTGATCATGAAATCAAAACATTAATTTATTTAAACATTTACTTGAAATGTGGTGGTTTGTGATTTAGTTGATTTTATAGGCTAGTGGGAGAATTTACATTCAAATGTCTAAATCACTTAAAATTGCCCTTTATGGCCTGACAGTAACTTTTTTTTATTCATTTGGGGACAACTATGTCCGTGAGCTTCCGTCCAGAGATTATAGTAGTAAATTGTAATTAAAGGATATGATGCACGTGAAATCACTTTGCAATCATCAATAGCTTCATAAATGTTAATTTTGTATCCTAATAGTAATGCTAATATTTTCCTAACATCTGTCATGTCTTTGTGTTCAGGGTAAAAAACTTGTTGCTGCAAGTCAAGCTGCCTTAGGCTTAGGAAGCGGCGCCACCAATTTCAGCCTGCTGAAACAGGCCGGCGACGTGGAAGAGAACCCTGGCCCTTCCTTTATTCCTGTGGCCGAAGATAGCGACTTCCCTATCCATAACCTCCCATATGGCGTCTTCTCAACCAGGGGCGACCCCAGGCCCAGGATTGGGGTGGCAATTGGAGACCAGATCCTGGACCTCAGCATCATTAAGCACCTCTTTACAGGACCTGTGCTGAGCAAGCACCAGGATGTGTTCAACCAGCCTACCCTGAATAGCTTTATGGGACTGGGCCAGGCAGCATGGAAAGAAGCCAGAGTGTTCCTCCAGAATCTGCTGAGCGTGAGCCAGGCCAGGCTCAGGGATGACACAGAACTGAGGAAATGTGCCTTCATCTCACAGGCATCAGCCACAATGCACCTCCCTGCCACAATTGGGGACTACACTGACTTCTACAGCAGCAGACAGCATGCCACTAATGTGGGGATTATGTTTAGAGACAAGGAAAATGCCCTCATGCCAAATTGGCTGCACCTGCCTGTGGGCTACCACGGCAGAGCCTCATCAGTGGTGGTGTCAGGCACACCAATTAGGAGGCCAATGGGACAGATGAAGCCTGACGACTCAAAGCCACCAGTCTATGGCGCCTGCAAACTGCTGGACATGGAGCTGGAGATGGCTTTCTTTGTGGGGCCTGGCAACAGGCTGGGAGAACCAATCCCTATTTCAAAGGCCCACGAGCACATTTTTGGCATGGTGCTCATGAATGACTGGTCTGCCAGAGATATCCAGAAGTGGGAATACGTGCCCCTGGGCCCTTTTC
TGGGCAAGAGCTTTGGCACCACAGTGTCACCTTGGGTGGTCCCAATGGATGCCCTGATGCCCTTTGCCGTGCCCAACCCCAAGCAGGATCCCAGACCACTGCCCTACCTGTGCCACGATGAGCCCTATACCTTTGACATCAACCTGTCAGTCAACCTGAAGGGGGAGGGCATGAGCCAGGCCGCCACTATTTGCAAGAGCAACTTTAAATATATGTACTGGACTATGCTCCAACAGCTCACTCACCACTCAGTGAATGGCTGCAATCTGAGGCCTGGCGACCTCCTGGCTAGCGGCACTATCTCTGGCCCTGAACCTGAGAACTTTGGGAGCATGCTGGAGCTGTCATGGAAAGGGACAAAGCCTATTGACCTGGGGAATGGCCAGACAAGGAAGTTTCTCCTGGATGGCGATGAGGTCATCATCACTGGGTACTGCCAGGGCGATGGCTACAGGATTGGATTTGGACAGTGTGCTGGCAAAGTGCTCCCAGCTCTGCTGCCCTCATAACATCACATTTAAAAGCATCTCAGGTAACTATATTTTGAATTTTTTAAAAAAGTAACTATAATAGTTATTATTAAAATAGCAAAGATTGACCATTTCCAAGAGCCATATAGACCAGCACCGACCACTATTCTAAACTATTTATGTATGTAAATATTAGCTTTTAAAATTCTCAAAATAGTTGCTGAGTTGGGAACCACTATTATTTCTATTTTGTAGATGAGAAAATGAAGATAAACATCAAAGCATAGATTAAGTAATTTTCCAAAGGGTCAAAATTCAAAATTGAAACCAAAGTTTCAGTGTTGCCCATTGTCCTGTTCTGACTTATATGATGCGGTACACAGAGCCATCCAAGTAAGTGATGGCTCAGCAGTGGAATACTCTGGGAATTAGGCTGAACCACATGAAAGAGTGCTTTATAGGGCAAAAACAGTTGAATATCAGTGATTTCACATGGTTCAACCTAATAGTTCAACTCATCCTTTCCATTGGAGAATATGATGGATCTACCTTCTGTGAACTTTATAGTGAAGAATCTGCTATTACATTTCCAATTTGTCAACATGCTGAGCTTTAATAGGACTTATCTTCTTATGACAACATTTATTGGTGTGTCCCCTTGCCTAGCCCAACAGAAGAATTCAGCAGCCGTAAGTCTAGGACAGGCTTAAATTGTTTTCACTGGTGTAAATTGCAGAAAGATGATCTAAGTAATTTGGCATTTATTTTAATAGGTTTGAAAAACACATGCCATTTTACAAATAAGACTTATATTTGTCCTTTTGTTTTTCAGCCTACCATGAGAATAAGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGTTTTTCTTTTTCGTTGGTGTAAAGCCAACACCCTGTCTAAAAAACATAAATTTCTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAATCTAATAGAGTGGTACAGCACTGTTATTTTTCAAAGATGTGTTGtacgtagataagtagcatggcgggttaatcattaactacaaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagctggcgtaatagcgaagaggcccgcaccgatcgggtcaccaagcaggaagtcaaagactttttccggtgggcaaaggatcacgtggttgaggtggagcatgaattctacgtcaaaaagggtggagccaagaaaagacccgcccccagtgacgcagatataagtgagcccaaacgggtgcgcgagtcagttgcgcagccatcgacgtcagacgcggaagcttcgatcaactacgcagacaggtaccaaaacaaatgttctcgtcacgtgggcatgaatctgatgctgtttccctgcagacaatgcgagagaatgaatcagaattc
aaatatctgcttcactcacggacagaaagactgtttagagtgctttcccgtgtcagaatctcaacccgtttctgtcgtcaaaaaggcgtatcagaaactgtgctacattcatcatatcatgggaaaggtgccagacgcttgcactgcctgcgatctggtcaatgtggatttggatgactgcatctttgaacaataaatgatttaaatcaggtatggctgctgacggttatcttccagattggctcgaggacaacctttctgaaggcattcgagagtggtgggcgctgcaacctggagcccctaaacccaaggcaaatcaacaacatcaggacaacgctcggggtcttgtgcttccgggttacaaatacctcggacccggcaacggactcgacaagggggaacccgtcaacgcagcggacgcggcagccctcgagcacgacaaggcctacgaccagcagctcaaggccggtgacaacccctacctcaagtacaaccacgccgacgccgagttccaggagcggctcaaagaagatacgtcttttgggggcaacctcgggcgagcagtcttccaggccaaaaagaggcttcttgaacctcttggtctggttgaggaagcggctaagacggctcctggaaagaagaggcctgtagatcagtctcctcaggaaccggactcatcatctggtgttggcaaatcgggcaaacagcctgccagaaaaagactaaatttcggtcagactggcgactcagagtcagtcccagaccctcaacctctcggagaaccaccagcagcccccacaagtttgggatctaatacaatggcttcaggcggtggcgcaccaatggcagacaataacgagggtgccgatggagtgggtaattcctcaggaaattggcattgcgattcccaatggctgggcgacagagtcatcaccaccagcaccagaacctgggccctgcccacttacaacaaccatctctacaagcaaatctccagccaatcaggagcttcaaacgacaaccactactttggctacagcaccccttgggggtattttgactttaacagattccactgccacttctcaccacgtgactggcagcgactcattaacaacaactggggattccggcccaagaaactcagcttcaagctcttcaacatccaagttaaagaggtcacgcagaacgatggcacgacgactattgccaataaccttaccagcacggttcaagtgtttacggactcggagtatcagctcccgtacgtgctcgggtcggcgcaccaaggctgtctcccgccgtttccagcggacgtcttcatggtccctcagtatggatacctcaccctgaacaacggaagtcaagcggtgggacgctcatccttttactgcctggagtacttcccttcgcagatgctaaggactggaaataacttccaattcagctataccttcgaggatgtaccttttcacagcagctacgctcacagccagagtttggatcgcttgatgaatcctcttattgatcagtatctgtactacctgaacagaacgcaaggaacaacctctggaacaaccaaccaatcacggctgctttttagccaggctgggcctcagtctatgtctttgcaggccagaaattggctacctgggccctgctaccggcaacagagactttcaaagactgctaacgacaacaacaacagtaactttccttggacagcggccagcaaatatcatctcaatggccgcgactcgctggtgaatccaggaccagctatggccagtcacaaggacgatgaagaaaaatttttccctatgcacggcaatctaatatttggcaaagaagggacaacggcaagtaacgcagaattagataatgtaatgattacggatgaagaagagattcgtaccaccaatcctgtggcaacagagcagtatggaactgtggcaaataacttgcagagctcaaatacagctcccacgactagaactgtc
aatgatcagggggccttacctggcatggtgtggcaagatcgtgacgtgtaccttcaaggacctatctgggcaaagattcctcacacggatggacactttcatccttctcctctgatgggaggctttggactgaaacatccgcctcctcaaatcatgatcaaaaatactccggtaccggcaaatcctccgacgactttcagcccggccaagtttgcttcatttatcactcagtactccactggacaggtcagcgtggaaattgagtgggagctacagaaagaaaacagcaaacgttggaatccagagattcagtacacttccaactacaacaagtctgttaatgtggactttactgtagacactaatggtgtttatagtgaacctcgccccattggcacccgttaccttacccgtcccctgtaattgcttgttaatcaataaaccgtttaattcgtttcagttgaactttggtctctgcgtatttctttcttatctagtttccatatgcatgtagataagtagcatggcgggttaatcattaactaaccggtacctctagaactatagctagcatgcgcaaatttaaagcgctgatatcgatcgcgcgcagatctgtcatgatgatcattgcaattggatccatatatagggcccgggttataattacctcaggtcgacgtcccatggccattcgaattcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttagaaaaactcatcgagcatcaaatgaaactgcaatttattcatatcaggattatcaataccatatttttgaaaaagccgtttctgtaatgaaggagaaaactcaccgaggcagttccataggatggcaa
gatcctggtatcggtctgcgattccgactcgtccaacatcaatacaacctattaatttcccctcgtcaaaaataaggttatcaagtgagaaatcaccatgagtgacgactgaatccggtgagaatggcaaaagtttatgcatttctttccagacttgttcaacaggccagccattacgctcgtcatcaaaatcactcgcatcaaccaaaccgttattcattcgtgattgcgcctgagcgagacgaaatacgcgatcgctgttaaaaggacaattacaaacaggaatcgaatgcaaccggcgcaggaacactgccagcgcatcaacaatattttcacctgaatcaggatattcttctaatacctggaatgctgtttttccggggatcgcagtggtgagtaaccatgcatcatcaggagtacggataaaatgcttgatggtcggaagaggcataaattccgtcagccagtttagtctgaccatctcatctgtaacatcattggcaacgctacctttgccatgtttcagaaacaactctggcgcatcgggcttcccatacaagcgatagattgtcgcacctgattgcccgacattatcgcgagcccatttatacccatataaatcagcatccatgttggaatttaatcgcggcctcgacgtttcccgttgaatatggctcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgcggcatcagagcagattgtactgagagtgcaccataaaattgtaaacgttaatattttgttaaaattcgcgttaaatttttgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatcccttataaatcaaaagaatagcccgagatagggttgagtgttgttccagtttggaacaagagtccactattaaagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcgatggcccactacgtgaaccatcacccaaatcaagttttttggggtcgaggtgccgtaaagcactaaatcggaaccctaaagggagcccccgatttagagcttgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtactatggttgctttgacgtatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgcatcaggcgccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgctattacgccagctggcgaaagggggatgtgctgcaaggcgattaagttgggtaacgccagggttttcccagtcacgacgttgtaaaacgacggccaacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggc

In some embodiments, the present disclosure provides a composition comprising a recombinant AAV construct comprising: a polynucleotide cassette comprising: an expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence has 80% sequence identity to SEQ ID NO. 18, 19, 20, 21 or 22; and the second nucleic acid sequence (i) is located 5' or 3' to the first nucleic acid sequence; and (ii) facilitates the production of two independent gene products upon integration into a target integration site in the genome of the cell; a third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and a fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell. In some embodiments, the AAV construct comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to an amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65, or AAVNP59. In some embodiments, the composition further comprises AAV2 ITR sequences. In some embodiments, the second nucleic acid sequence has 80% identity to SEQ ID NO. 6. In some embodiments, the second nucleic acid sequence encodes a P2A peptide having 90% sequence identity to SEQ ID NO. 7. In some embodiments, the third nucleic acid sequence has 80% sequence identity to SEQ ID NO. 1, 3 or 4. In some embodiments, the fourth nucleic acid sequence has 80% sequence identity to SEQ ID NO. 2 or 5.
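As an illustration of the percent-identity thresholds recited above, the following is a minimal sketch of one common way to score identity between two sequences that have already been aligned. It is an assumption for illustration only: the function names are hypothetical, the inputs are assumed to be pre-aligned strings of equal length with `-` marking gaps, and the alignment method and denominator used to define sequence identity in this disclosure may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns of two pre-aligned sequences.

    Columns in which both sequences carry a gap are skipped; a gap
    opposite a residue counts as a mismatch. This convention is an
    assumption, not a definition taken from the disclosure.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences (hypothetical, not taken from any SEQ ID NO).
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(meets_identity_threshold("ACGTACGTAC", "ACGTACGTAA", 80.0))
```

In practice the two sequences would first be aligned with an established alignment tool; the scoring step above only counts matching columns.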

Examples

Example 1: Materials and Methods

Animal study

Fah knockout (Fah-/-, KO) animals and heterozygous Fah+/- littermates (HET) were purchased from The Jackson Laboratory. FRG mice were purchased from Yecuris.

GeneRide™ PoC in tyrosinemia FRG mice

Four-week-old FRG males were treated under anesthesia, via the retro-orbital sinus, with either vehicle or rAAV.DJ-GR-hFAH at 1e14 vg/kg. All mice were maintained on 8 mg/L nitisinone (NTBC) before study initiation and for one week post-dose; NTBC was then cycled on and off based on weight loss until 5 weeks post-dose, after which NTBC was discontinued. During the study, animals were periodically sampled by submandibular bleed, and plasma was collected and stored at -80 °C until further analysis. Final harvests were performed at week 9 and week 16 post-dose. At the time of sacrifice, blood was collected by cardiac puncture to obtain plasma. Animals either had whole livers dissected or underwent liver perfusion to collect hepatocytes. For animals with whole-liver dissection, one lobe was fixed in 10% formalin and the remaining lobes were snap-frozen and stored at -80 °C. The next day, formalin-fixed livers were transferred to 70% ethanol for paraffin embedding. After liver perfusion, isolated hepatocytes were centrifuged at 300 × g for 5 min at 4 °C and stored at -80 °C.

GeneRide™ dose-response PoC in Fah mice

Pediatric (14-day-old) Fah-/- (KO) and Fah+/- (HET) animals were treated under anesthesia with rAAV.DJ-GR-mFAH at a dose of 1e13, 3e13 or 1e14 vg/kg via the retro-orbital sinus. All mice were maintained on 8 mg/L NTBC prior to study initiation and for one week post-dose; NTBC was then cycled on and off based on weight loss until 2 weeks post-dose. During the study, animals were periodically sampled by submandibular bleed, and plasma was collected and stored at -80 °C until further analysis. Final harvest was performed 16 weeks after dosing. At the time of sacrifice, blood was collected by cardiac puncture to obtain plasma. Animals underwent liver perfusion to collect hepatocytes. Prior to the start of perfusion, one lobe was sutured and dissected for formalin fixation. After liver perfusion, isolated hepatocytes were centrifuged at 300 × g for 5 min at 4 °C and stored at -80 °C. The next day, formalin-fixed livers were transferred to 70% ethanol for paraffin embedding.
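The weight-based doses above (vg/kg) translate into a total number of vector genomes per animal by simple multiplication with body weight. The sketch below illustrates that arithmetic only; the function name and the example body weight are hypothetical and are not taken from the study.

```python
def total_vector_genomes(dose_vg_per_kg: float, body_weight_g: float) -> float:
    """Total vector genomes (vg) for one animal dosed on a vg/kg basis."""
    if dose_vg_per_kg <= 0 or body_weight_g <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_vg_per_kg * (body_weight_g / 1000.0)  # convert g to kg


# Dose levels named in the text (vg/kg).
DOSE_LEVELS_VG_PER_KG = (1e13, 3e13, 1e14)

# Hypothetical body weight, chosen for illustration only.
EXAMPLE_BODY_WEIGHT_G = 8.0

if __name__ == "__main__":
    for dose in DOSE_LEVELS_VG_PER_KG:
        vg = total_vector_genomes(dose, EXAMPLE_BODY_WEIGHT_G)
        print(f"{dose:.0e} vg/kg at {EXAMPLE_BODY_WEIGHT_G} g -> {vg:.2e} vg")
```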

Hepatocellular carcinoma (HCC) risk assessment between NTBC and GeneRide

Fah-/- animals were maintained on 8 mg/L NTBC from birth. At four weeks of age, a group of Fah-/- animals was randomly selected and treated with rAAV.DJ-GR-mFAH under anesthesia at a dose of 1e14 vg/kg via the retro-orbital sinus, and then withdrawn from NTBC (GeneRide™-treated group). Another group of Fah-/- animals was maintained on 8 mg/L NTBC (standard-of-care group). A third group, of Fah+/- littermates, was included in the study but did not receive any treatment (no NTBC or GeneRide™). All animals were followed until one year of age, and an HCC biomarker (AFP level) was assessed periodically.

Evaluation of compatibility between NTBC (standard of care) and GeneRide™

Four-week-old Fah-/- animals were treated with rAAV.DJ-GR-mFAH under anesthesia at a dose of 1e14 vg/kg via the retro-orbital sinus. All mice were maintained on 8 mg/L NTBC prior to study initiation and until 4 weeks post-dose, after which NTBC was either maintained at 8 mg/L (control) or titrated down to 3 mg/L, 0.8 mg/L or 0.3 mg/L for 8 weeks. During the study, animals were periodically sampled by submandibular bleed, and plasma was collected and stored at -80 °C until further analysis. At the time of sacrifice, blood was collected by cardiac puncture to obtain plasma. For liver dissection, one lobe was fixed in 10% formalin and the remaining lobes were snap-frozen and stored at -80 °C. The next day, formalin-fixed livers were transferred to 70% ethanol for paraffin embedding.

Targeted genomic DNA integration in the liver

Genomic DNA was extracted from frozen liver tissue, and targeted genomic DNA integration was analyzed by long-range polymerase chain reaction (PCR) amplification followed by quantitative PCR (qPCR), using the method described below. Long-range PCR was performed using the forward primer (F1) and the reverse primer (R1). The PCR product was purified with solid-phase reversible immobilization beads (ABM, G950) and used as the template for qPCR with the forward primer (F1), the reverse primer (R2) and the probe (P1). The primers and probe were: (F1) 5'-ATGTTCCACGAAGAAGCCA-3'; (R1) 5'-TCAGCAGGCTGAAATTGGT-3'; (R2) 5'-AGCTGTTTCTTACTCCATTCTCA-3'; (P1) 5'-AGGCAACGTCATGGGTGTGACTTT-3'. Mouse transferrin receptor (Tfrc) was used as an internal control for qPCR.
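The text does not state how the qPCR signal was normalized to the Tfrc internal control. One widely used convention is the delta-Ct calculation, sketched below as an assumption rather than as the method actually used; the function name is hypothetical, and the default amplification factor of 2 per cycle assumes ideal PCR efficiency.

```python
def relative_quantity(ct_target: float, ct_reference: float, efficiency: float = 2.0) -> float:
    """Target signal relative to a reference gene by the delta-Ct convention.

    ct_target:    threshold cycle of the integration-specific amplicon
    ct_reference: threshold cycle of the internal control (e.g. Tfrc)
    efficiency:   fold amplification per cycle (2.0 = ideal doubling, an assumption)
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values for illustration only.
    print(relative_quantity(ct_target=27.0, ct_reference=25.0))
```

A target that crosses threshold two cycles later than the reference is, under the ideal-doubling assumption, present at one quarter of the reference level.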

Plasma albumin-2A fusion protein quantification

Mouse albumin-2A in plasma was measured by chemiluminescent ELISA, using a proprietary rabbit polyclonal anti-2A antibody for capture and an HRP-labeled polyclonal goat anti-mouse albumin antibody (Abcam, ab19195) for detection. Recombinant mouse albumin-2A, expressed in mammalian cells and affinity purified, was used to construct a standard curve in 1% control mouse plasma to account for matrix effects. 1% milk in PBS (Cell Signaling, 9999S) was used for blocking, and 1% BSA in PBST was used for sample dilution.
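Reading a sample concentration off a standard curve can be sketched as follows. The curve model used in the study is not stated; this example substitutes simple piecewise log-log interpolation between neighboring standards (many laboratories fit a four-parameter logistic curve instead). The function name and all standard values are hypothetical.

```python
import math


def interpolate_concentration(signal: float, standards: list) -> float:
    """Estimate concentration from signal by log-log interpolation.

    standards: list of (concentration, signal) pairs with positive values
               and signals that increase with concentration.
    """
    if signal <= 0:
        raise ValueError("signal must be positive")
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi:
            # Fractional position of the signal between the two standards, in log space.
            frac = (math.log10(signal) - math.log10(s_lo)) / (
                math.log10(s_hi) - math.log10(s_lo)
            )
            log_conc = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_conc
    raise ValueError("signal outside the range of the standard curve")


if __name__ == "__main__":
    # Hypothetical standards: (concentration in ng/mL, luminescence units).
    curve = [(1.0, 100.0), (10.0, 1000.0), (100.0, 10000.0)]
    print(interpolate_concentration(500.0, curve))
```

Signals outside the range of the standards raise an error instead of being extrapolated, since extrapolation beyond the curve is generally unreliable.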

Plasma liver injury biomarker quantification

Alanine aminotransferase (ALT) activity and total bilirubin levels in mouse plasma were quantified as biomarkers of liver injury. Plasma ALT activity was quantified using an alanine aminotransferase activity colorimetric assay kit (BioVision) according to the supplier's instructions. Total bilirubin in plasma was measured with a certified clinical analyzer, the BR2 Bilirubin Stat-Analyzer™ (Advanced Instruments, LLC), according to the manufacturer's protocol.

Plasma alpha-fetoprotein quantification

Plasma alpha-fetoprotein (AFP) was quantified using a chemiluminescent ELISA kit (R&D Systems) according to the manufacturer's protocol.

Immunohistochemistry

Immunohistochemistry was performed on a robotic platform (Ventana Discovery Ultra Staining Module, Ventana Co., Tucson, AZ). Tissue sections (4 μm) were deparaffinized, and heat-induced antigen retrieval was performed for 64 min. Endogenous peroxidase was blocked with peroxidase inhibitor (CM1) for 8 min, followed by incubation of the sections with anti-FAH antibody (Yecuris, Portland, OR) at 1:400 dilution for 60 min at room temperature. Antigen-antibody complexes were then detected using the DISCOVERY OmniMap anti-rabbit multimer RUO detection system and the DISCOVERY ChromoMap DAB kit (Ventana Co., Tucson, AZ). All slides were then counterstained with hematoxylin (Fisher Scientific, Waltham, MA), dehydrated, washed and mounted for image scanning using a digital slide scanner (Hamamatsu, Bridgewater, NJ). The scanned images were evaluated in a blinded manner using ImageJ software to quantify positively stained areas.

Exemplary sequences used in embodiments of the present invention are provided below:

AAV-DJ-mha-mFAH (PM-0550; SEQ ID NO. 27)

cagatcctctacgccggacgcatcgtggccggcatcaccggcgccacaggtgcggttgctggcgcctatatcgccgacatcaccgatggggaagatcgggctcgccacttcgggctcatgagcgcttgtttcggcgtgggtatggtggcaggccccgtggccgggggactgttgggcgccatctccttgcatgcaccattccttgcggcggcggtgctcaacggcctcaacctactactgggctgcttcctaatgcaggagtcgcataagggagagcgtcgaatggtgcactctcagtacaatctgctctgatgccgcatagttaagccagccccgacacccgccaacacccgctgacgcgccctgacgggcttgtctgctcccggcatccgcttacagacaagctgtgaccgtctccgggagctgcatgtgtcagaggttttcaccgtcatcaccgaaacgcgcgagacgaaagggcctcgtgatacgcctatttttataggttaatgtcatgataataatggtttcttagacgtcaggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagccatattcaacgggaaacgtcgaggccgcgattaaattccaacatggatgctgatttatatgggtataaatgggctcgcgataatgtcgggcaatcaggtgcgacaatctatcgcttgtatgggaagcccgatgcgccagagttgtttctgaaacatggcaaaggtagcgttgccaatgatgttacagatgagatggtcagactaaactggctgacggaatttatgcctcttccgaccatcaagcattttatccgtactcctgatgatgcatggttactcaccactgcgatccccggaaaaacagcattccaggtattagaagaatatcctgattcaggtgaaaatattgttgatgcgctggcagtgttcctgcgccggttgcattcgattcctgtttgtaattgtccttttaacagcgatcgcgtatttcgtctcgctcaggcgcaatcacgaatgaataacggtttggttgatgcgagtgattttgatgacgagcgtaatggctggcctgttgaacaagtctggaaagaaatgcataaacttttgccattctcaccggattcagtcgtcactcatggtgatttctcacttgataaccttatttttgacgaggggaaattaataggttgtattgatgttggacgagtcggaatcgcagaccgataccaggatcttgccatcctatggaactgcctcggtgagttttctccttcattacagaaacggctttttcaaaaatatggtattgataatcctgatatgaataaattgcagtttcatttgatgctcgatgagtttttctaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgcagataccaaatactgttcttctagtgtagccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttggagc
gaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggccttttgctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctgtggaatgtgtgtcagttagggtgtggaaagtccccaggctccccagcaggcagaagtatgcaaagcatgcatctcaattagtcagcaaccaggtgtggaaagtccccaggctccccagcaggcagaagtatgcaaagcatgcatctcaattagtcagcaaccatagtcccgcccctaactccgcccatcccgcccctaactccgcccagttccgcccattctccgccccatggctgactaattttttttatttatgcagaggccgaggccgcctcggcctctgagctattccagaagtagtgaggaggcttttttggaggcctaggcttttgcaaaaagttggccactccctctctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttcctTACGTAATGCATGGATCCCCTAGGgcggccgcCTGAAACTAGACAAAACCCGTGTGACTGGCATCGATTATTCTATTTGATCTAGCTAGTCCTAGCAAAGTGACAACTGCTACTCCCCTCCTACACAGCCAAGATTCCTAAGTTGGCAGTGGCATGCTTAATCCTCAAAGCCAAAGTTACTTGGCTCCAAGATTTATAGCCTTAAACTGTGGCCTCACATTCCTTCCTATCTTACTTTCCTGCACTGGGGTAAATGTCTCCTTGCTCTTCTTGCTTTCTGTCCTACTGCAGGGCTCTTGCTGAGCTGGTGAAGCACAAGCCCAAGGCTACAGCGGAGCAACTGAAGACTGTCATGGATGACTTTGCACAGTTCCTGGATACATGTTGCAAGGCTGCTGACAAGGACACCTGCTTCTCGACTGAGGTCAGAAACGTTTTTGCATTTTGACGATGTTCAGTTTCCATTTTCTGTGCACGTGGTCAGGTGTAGCTCTCTGGAACTCACACACTGAATAACTCCACCAATCTAGATGTTGTTCTCTACGTAACTGTAATAGAAACTGACTTACGTAGCTTTTAATTTTTATTTTCTGCCACACTGCTGCCTATTAAATACCTATTATCACTATTTGGTTTCAAATTTGTGACACAGAAGAGCATAGTTAGAAATACTTGCAAAGCCTAGAATCATGAACTCATTTAAACCTTGCCCTGAAATGTTTCTTTTTGAATTGAGTTATTTTACACATGAATGGACAGTTACCATTATATATCTGAATCATTTCACATTCCCTCCCATGGCCTAACAACAGTTTATCTTCTTATTTTGGGCACAACAGATGTCAGAGAGCCTGCTTTAGGAATTCTAAGTAGAACTGTAATTAAGCAATGCAAGGCACGTACGTTTACTATGTCATTGCCTATGGCTATGAAGTGCAAATCCTAACAGTCCTGCTAATACTTTTCTAACATCCATCATTTCTTTGTTTTCAGGGTCCAAACCTTGTCACTAGATGCAAAGACGCCTTAGCCggaagcggc
gccaccaatttcagcctgctgaaacaggccggcgacgtggaagagaaccctggccctTCCTTTATTCCAGTGGCCGAGGACTCCGACTTTCCCATCCAAAACCTGCCCTATGGTGTTTTCTCCACTCAAAGCAACCCAAAGCCACGGATTGGTGTAGCCATCGGTGACCAGATCTTGGACCTGAGTGTCATTAAACACCTCTTTACCGGACCTGCCCTTTCCAAACATCAACATGTCTTCGATGAGACAACTCTCAATAACTTCATGGGTCTGGGTCAAGCTGCATGGAAGGAGGCAAGAGCATCCTTACAGAACTTACTGTCTGCCAGCCAAGCCCGGCTCAGAGATGACAAGGAGCTTCGGCAGCGTGCATTCACCTCCCAGGCTTCTGCGACAATGCACCTTCCTGCTACCATAGGAGACTACACGGACTTCTACTCTTCTCGGCAGCATGCCACCAATGTTGGCATTATGTTCAGAGGCAAGGAGAATGCGCTGTTGCCAAATTGGCTCCACTTACCTGTGGGATACCATGGCCGAGCTTCCTCCATTGTGGTATCTGGAACCCCGATTCGAAGACCCATGGGGCAGATGAGACCTGATAACTCAAAGCCTCCTGTGTATGGTGCCTGCAGACTCTTAGACATGGAGTTGGAAATGGCTTTCTTCGTAGGCCCTGGGAACAGATTCGGAGAGCCAATCCCCATTTCCAAAGCCCATGAACACATTTTCGGGATGGTCCTCATGAACGACTGGAGCGCACGAGACATCCAGCAATGGGAGTACGTCCCACTTGGGCCATTCCTGGGGAAAAGCTTTGGAACCACAATCTCCCCGTGGGTGGTGCCTATGGATGCCCTCATGCCCTTTGTGGTGCCAAACCCAAAGCAGGACCCCAAGCCCTTGCCATATCTCTGCCACAGCCAGCCCTACACATTTGATATCAACCTGTCTGTCTCTTTGAAAGGAGAAGGAATGAGCCAGGCGGCTACCATCTGCAGGTCTAACTTTAAGCACATGTACTGGACCATGCTGCAGCAACTCACACACCACTCTGTTAATGGATGCAACCTGAGACCTGGGGACCTCTTGGCTTCTGGAACCATCAGTGGATCAGACCCTGAAAGCTTTGGCTCCATGCTGGAACTGTCCTGGAAGGGAACAAAGGCCATCGATGTGGGGCAGGGGCAGACCAGGACCTTCCTGCTGGACGGCGATGAAGTCATCATAACAGGTCACTGCCAGGGGGACGGCTACCGTGTTGGCTTTGGCCAGTGTGCTGGGAAAGTGCTGCCTGCCCTTTCACCAGCCTAAACACATCACAACCACAACCTTCTCAGGTAACTATACTTGGGACTTAAAAAACATAATCATAATCATTTTTCCTAAAACGATCAAGACTGATAACCATTTGACAAGAGCCATACAGACAAGCACCAGCTGGCACTCTTAGGTCTTCACGTATGGTCATCAGTTTGGGTTCCATTTGTAGATAAGAAACTGAACATATAAAGGTCTAGGTTAATGCAATTTACACAAAAGGAGACCAAACCAGGGAGAGAAGGAACCAAAATTAAAAATTCAAACCAGAGCAAAGGAGTTAGCCCTGGTTTTGCTCTGACTTACATGAACCACTATGTGGAGTCCTCCATGTTAGCCTAGTCAAGCTTATCCTCTGGATGAAGTTGAAACCATATGAAGGAATATTTGGGGGGTGGGTCAAAACAGTTGTGTATCAATGATTCCATGTGGTTTGACCCAATCATTCTGTGAATCCATTTCAACAGAAGATACAACGGGTTCTGTTTCATAATAAGTGATCCACTTCCAAATTTCTGATGTGCCCCATGCTAAGCTTTAACAGAATTTATCTTCTTATGACAAAGCAGCCTCCTTTGAAAATATAGCCAACTGCACACAGCTATGTTGATCAATTTTGTTTATAATCTTGCAGAAGAGAATTTTTTAAAATAGGGCAATAATGGAAGGCTTTGGCAAAA
AAATTGTTTCTCCATATGAAAACAAAAAACTTATTTTTTTATTCAAGCAAAGAACCTATAGACATAAGGCTATTTCAAAATTATTTCAGTTTTAGAAAGAATTGAAAGTTTTGTAGCATTCTGAGAAGACAGCTTTCATTTGTAATCATAGGTAATATGTAGGTCCTCAGAAATGGTGAGACCCCTGACTTTGACACTTGGGGACTCTGAGGGACCAGTGATGAAGAGGGCACAACTTATATCACACATGCACGAGTTGGGGTGAGAGGGTGTCACAACATCTATCAGTGTGTCATCTGCCCACCAAGTAACAGATGTCAGCTAAGACTAGGTCATGTGTAGGCTGTCTACACCAGTGAAAATCGCAAAAAGAATCTAAGAAATTCCACATTTCTAGAAAATAGGTTTGGAAACCGTATTCCATTTTACAAAGGACACTTACATTTCTCTTTTTGTTTTCCAGGCTACCCTGAGAAAAAAAGACATGAAGACTCAGGACTCATCTTTTCTGTTGGTGTAAAATCAACACCCTAAGGAACACAAATTTCTTTAAACATTTGACTTCTTGTCTCTGTGCTGCAATTAATAAAAAATGGAAAGAATCTACTCTGTGGTTCAGAACTCTATCTTCCAAAGGCGCGCTTCACCCTAGCAGCCTCTTTGGCTCAGAGGAATCCCTGCCTTTCCTCCCTTCATCTCAGCAGAGAATGTAGTTCCACATGGGCAACACAATGAAAATAAACGTTAATACTCTCCCATCTTATGGGTGGTGACCCTAGAAACCAATACTTCAACATTACGAGAATTCTGAATGAGAGACTAAAAGCTTATGAACTGTGGCTTTCCTTTGTCAGTGGGACTCTAAGAATGAGTTGGGGACAAAAGAGATAGGAATGGCTTTAAAGGTGACTAGTTGAACTGATAAAGTAAATGAACTGAGGAAAAAAAATATCACTCAAcctgcaggGGACGTCCTACGTAATGCATaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagagagggagtggccaagaattaattccgtgtattctatagtgtcacctaaatcgtatgtgtatgatacataaggttatgtattaattgtagccgcgttctaacgacaatatgtacaagcctaattgtgtagcatctggcttactgaagcagaccctatcatctctctcgtaaactgccgtcagagtcggtttggttggacgaaccttctgagtttctggtaacgccgtcccgcacccggaaatggtcagcgaaccaatcagcagggtcatcgctagc

AAV-DJ-mha-hFAH (PM-0549; SEQ ID NO. 28)

cagatcctctacgccggacgcatcgtggccggcatcaccggcgccacaggtgcggttgctggcgcctatatcgccgacatcaccgatggggaagatcgggctcgccacttcgggctcatgagcgcttgtttcggcgtgggtatggtggcaggccccgtggccgggggactgttgggcgccatctccttgcatgcaccattccttgcggcggcggtgctcaacggcctcaacctactactgggctgcttcctaatgcaggagtcgcataagggagagcgtcgaatggtgcactctcagtacaatctgctctgatgccgcatagttaagccagccccgacacccgccaacacccgctgacgcgccctgacgggcttgtctgctcccggcatccgcttacagacaagctgtgaccgtctccgggagctgcatgtgtcagaggttttcaccgtcatcaccgaaacgcgcgagacgaaagggcctcgtgatacgcctatttttataggttaatgtcatgataataatggtttcttagacgtcaggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagtatgagccatattcaacgggaaacgtcgaggccgcgattaaattccaacatggatgctgatttatatgggtataaatgggctcgcgataatgtcgggcaatcaggtgcgacaatctatcgcttgtatgggaagcccgatgcgccagagttgtttctgaaacatggcaaaggtagcgttgccaatgatgttacagatgagatggtcagactaaactggctgacggaatttatgcctcttccgaccatcaagcattttatccgtactcctgatgatgcatggttactcaccactgcgatccccggaaaaacagcattccaggtattagaagaatatcctgattcaggtgaaaatattgttgatgcgctggcagtgttcctgcgccggttgcattcgattcctgtttgtaattgtccttttaacagcgatcgcgtatttcgtctcgctcaggcgcaatcacgaatgaataacggtttggttgatgcgagtgattttgatgacgagcgtaatggctggcctgttgaacaagtctggaaagaaatgcataaacttttgccattctcaccggattcagtcgtcactcatggtgatttctcacttgataaccttatttttgacgaggggaaattaataggttgtattgatgttggacgagtcggaatcgcagaccgataccaggatcttgccatcctatggaactgcctcggtgagttttctccttcattacagaaacggctttttcaaaaatatggtattgataatcctgatatgaataaattgcagtttcatttgatgctcgatgagtttttctaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgcagataccaaatactgttcttctagtgtagccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttggagc
gaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcctttttacggttcctggccttttgctggccttttgctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctgtggaatgtgtgtcagttagggtgtggaaagtccccaggctccccagcaggcagaagtatgcaaagcatgcatctcaattagtcagcaaccaggtgtggaaagtccccaggctccccagcaggcagaagtatgcaaagcatgcatctcaattagtcagcaaccatagtcccgcccctaactccgcccatcccgcccctaactccgcccagttccgcccattctccgccccatggctgactaattttttttatttatgcagaggccgaggccgcctcggcctctgagctattccagaagtagtgaggaggcttttttggaggcctaggcttttgcaaaaagttggccactccctctctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttcctTACGTAATGCATGGATCCCCTAGGgcggccgcCTGAAACTAGACAAAACCCGTGTGACTGGCATCGATTATTCTATTTGATCTAGCTAGTCCTAGCAAAGTGACAACTGCTACTCCCCTCCTACACAGCCAAGATTCCTAAGTTGGCAGTGGCATGCTTAATCCTCAAAGCCAAAGTTACTTGGCTCCAAGATTTATAGCCTTAAACTGTGGCCTCACATTCCTTCCTATCTTACTTTCCTGCACTGGGGTAAATGTCTCCTTGCTCTTCTTGCTTTCTGTCCTACTGCAGGGCTCTTGCTGAGCTGGTGAAGCACAAGCCCAAGGCTACAGCGGAGCAACTGAAGACTGTCATGGATGACTTTGCACAGTTCCTGGATACATGTTGCAAGGCTGCTGACAAGGACACCTGCTTCTCGACTGAGGTCAGAAACGTTTTTGCATTTTGACGATGTTCAGTTTCCATTTTCTGTGCACGTGGTCAGGTGTAGCTCTCTGGAACTCACACACTGAATAACTCCACCAATCTAGATGTTGTTCTCTACGTAACTGTAATAGAAACTGACTTACGTAGCTTTTAATTTTTATTTTCTGCCACACTGCTGCCTATTAAATACCTATTATCACTATTTGGTTTCAAATTTGTGACACAGAAGAGCATAGTTAGAAATACTTGCAAAGCCTAGAATCATGAACTCATTTAAACCTTGCCCTGAAATGTTTCTTTTTGAATTGAGTTATTTTACACATGAATGGACAGTTACCATTATATATCTGAATCATTTCACATTCCCTCCCATGGCCTAACAACAGTTTATCTTCTTATTTTGGGCACAACAGATGTCAGAGAGCCTGCTTTAGGAATTCTAAGTAGAACTGTAATTAAGCAATGCAAGGCACGTACGTTTACTATGTCATTGCCTATGGCTATGAAGTGCAAATCCTAACAGTCCTGCTAATACTTTTCTAACATCCATCATTTCTTTGTTTTCAGGGTCCAAACCTTGTCACTAGATGCAAAGACGCCTTAGCCggaagcggc
gccaccaatttcagcctgctgaaacaggccggcgacgtggaagagaaccctggcccttccttcatcccggtggccgaggattccgacttccccatccacaacctgccctacggcgtcttctcgaccagaggcgacccaagaccgaggataggtgtggccattggcgaccagatcctggacctcagcatcatcaagcacctctttactggtcctgtcctctccaaacaccaggatgtcttcaatcagcctacactcaacagcttcatgggcctgggtcaggctgcctggaaggaggcgagagtgttcttgcagaacttgctgtctgtgagccaagccaggctcagagatgacaccgaacttcggaagtgtgcattcatctcccaggcttctgccacgatgcaccttccagccaccataggagactacacagacttctattcctctcggcagcatgctaccaacgtcggaatcatgttcagggacaaggagaatgcgttgatgccaaattggctgcacttaccagtgggctaccatggccgtgcctcctctgtcgtggtgtctggcaccccaatccgaaggcccatgggacagatgaaacctgatgactctaagcctcccgtatatggtgcctgcaagctcttggacatggagctggaaatggctttttttgtaggccctggaaacagattgggagagccgatccccatttccaaggcccatgagcacatttttggaatggtccttatgaacgactggagtgcacgagacattcagaagtgggagtatgtccctctcgggccattccttgggaagagttttgggaccactgtctctccgtgggtggtgcccatggatgctctcatgccctttgctgtgcccaacccgaagcaggaccccaggcccctgccgtatctgtgccatgacgagccctacacatttgacatcaacctctctgttaacctgaaaggagaaggaatgagccaggcggctaccatatgcaagtccaattttaagtacatgtactggacgatgctgcagcagctcactcaccactctgtcaacggctgcaacctgcggccgggggacctcctggcttctgggaccatcagcgggccggagccagaaaacttcggctccatgttggaactgtcgtggaagggaacgaagcccatagacctggggaatggtcagaccaggaagtttctgctggacggggatgaagtcatcataacagggtactgccagggggatggttaccgcatcggctttggccagtgtgctggaaaagtgctgcctgctctcctgccatcaTAAACACATCACAACCACAACCTTCTCAGGTAACTATACTTGGGACTTAAAAAACATAATCATAATCATTTTTCCTAAAACGATCAAGACTGATAACCATTTGACAAGAGCCATACAGACAAGCACCAGCTGGCACTCTTAGGTCTTCACGTATGGTCATCAGTTTGGGTTCCATTTGTAGATAAGAAACTGAACATATAAAGGTCTAGGTTAATGCAATTTACACAAAAGGAGACCAAACCAGGGAGAGAAGGAACCAAAATTAAAAATTCAAACCAGAGCAAAGGAGTTAGCCCTGGTTTTGCTCTGACTTACATGAACCACTATGTGGAGTCCTCCATGTTAGCCTAGTCAAGCTTATCCTCTGGATGAAGTTGAAACCATATGAAGGAATATTTGGGGGGTGGGTCAAAACAGTTGTGTATCAATGATTCCATGTGGTTTGACCCAATCATTCTGTGAATCCATTTCAACAGAAGATACAACGGGTTCTGTTTCATAATAAGTGATCCACTTCCAAATTTCTGATGTGCCCCATGCTAAGCTTTAACAGAATTTATCTTCTTATGACAAAGCAGCCTCCTTTGAAAATATAGCCAACTGCACACAGCTATGTTGATCAATTTTGTTTATAATCTTGCAGAAGAGAATTTTTTAAAATAGGGCAATAATGGAAGGCTTTGGCAAAA
AAATTGTTTCTCCATATGAAAACAAAAAACTTATTTTTTTATTCAAGCAAAGAACCTATAGACATAAGGCTATTTCAAAATTATTTCAGTTTTAGAAAGAATTGAAAGTTTTGTAGCATTCTGAGAAGACAGCTTTCATTTGTAATCATAGGTAATATGTAGGTCCTCAGAAATGGTGAGACCCCTGACTTTGACACTTGGGGACTCTGAGGGACCAGTGATGAAGAGGGCACAACTTATATCACACATGCACGAGTTGGGGTGAGAGGGTGTCACAACATCTATCAGTGTGTCATCTGCCCACCAAGTAACAGATGTCAGCTAAGACTAGGTCATGTGTAGGCTGTCTACACCAGTGAAAATCGCAAAAAGAATCTAAGAAATTCCACATTTCTAGAAAATAGGTTTGGAAACCGTATTCCATTTTACAAAGGACACTTACATTTCTCTTTTTGTTTTCCAGGCTACCCTGAGAAAAAAAGACATGAAGACTCAGGACTCATCTTTTCTGTTGGTGTAAAATCAACACCCTAAGGAACACAAATTTCTTTAAACATTTGACTTCTTGTCTCTGTGCTGCAATTAATAAAAAATGGAAAGAATCTACTCTGTGGTTCAGAACTCTATCTTCCAAAGGCGCGCTTCACCCTAGCAGCCTCTTTGGCTCAGAGGAATCCCTGCCTTTCCTCCCTTCATCTCAGCAGAGAATGTAGTTCCACATGGGCAACACAATGAAAATAAACGTTAATACTCTCCCATCTTATGGGTGGTGACCCTAGAAACCAATACTTCAACATTACGAGAATTCTGAATGAGAGACTAAAAGCTTATGAACTGTGGCTTTCCTTTGTCAGTGGGACTCTAAGAATGAGTTGGGGACAAAAGAGATAGGAATGGCTTTAAAGGTGACTAGTTGAACTGATAAAGTAAATGAACTGAGGAAAAAAAATATCACTCAAcctgcaggGGACGTCCTACGTAATGCATaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagagagggagtggccaagaattaattccgtgtattctatagtgtcacctaaatcgtatgtgtatgatacataaggttatgtattaattgtagccgcgttctaacgacaatatgtacaagcctaattgtgtagcatctggcttactgaagcagaccctatcatctctctcgtaaactgccgtcagagtcggtttggttggacgaaccttctgagtttctggtaacgccgtcccgcacccggaaatggtcagcgaaccaatcagcagggtcatcgctagc

AAV-DJ-mha-mFAH (PM-0549; SEQ ID NO. 29)

gtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattattatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatgcggcatcagagcagattgtactgagagtgcaccattcgacgctctcccttatgcgactcctgcattaggaagcagcccagtagtaggttgaggccgttgagcaccgccgccgcaaggaatggtgcatgcaaggagatggcgcccaacagtcccccggccacggggcctgccaccatacccacgccgaaacaagcgctcatgagcccgaagtggcgagcccgatcttccccatcggtgatgtcggcgatataggcgccagcaaccgcacctgtggcgccggtgggtcaccaagcaggaagtcaaagactttttccggtgggcaaaggatcacgtggttgaggtggagcatgaattctacgtcaaaaagggtggagccaagaaaagacccgcccccagtgacgcagatataagtgagcccaaacgggtgcgcgagtcagttgcgcagccatcgacgtcagacgcggaagcttcgatcaactacgcagacaggtaccaaaacaaatgttctcgtcacgtgggcatgaatctgatgctgtttccctgcagacaatgcgagagaatgaatcagaattcaaatatctgcttcactcacggacagaaagactgtttagagtgctttcccgtgtcagaatctcaacccgtttctgtcgtcaaaaaggcgtatcagaaactgtgctacattcatcatatcatgggaaaggtgccagacgcttgcactgcctgcgatctggtcaatgtggatttggatgactgcatctttg
aacaataaatgATTTAAATCAGGTatggctgccgatggttatcttccagattggctcgaggacactctctctgaaggaataagacagtggtggaagctcaaacctggcccaccaccaccaaagcccgcagagcggcataaggacgacagcaggggtcttgtgcttcctgggtacaagtacctcggacccttcaacggactcgacaagggagagccggtcaacgaggcagacgccgcggccctcgagcacgacaaagcctacgaccggcagctcgacagcggagacaacccgtacctcaagtacaaccacgccgacgccgagttccaggagcggctcaaagaagatacgtcttttgggggcaacctcgggcgagcagtcttccaggccaaaaagaggcttcttgaacctcttggtctggttgaggaagcggctaagacggctcctggaaagaagaggcctgtagagcactctcctgtggagccagactcctcctcgggaaccggaaaggcgggccagcagcctgcaagaaaaagattgaattttggtcagactggagacgcagactcagtcccagaccctcaaccaatcggagaacctcccgcagccccctcaggtgtgggatctcttacaatggctgcaggcggtggcgcaccaatggcagacaataacgagggcgccgacggagtgggtaattcctcgggaaattggcattgcgattccacatggatgggcgacagagtcatcaccaccagcacccgaacctgggccctgcccacctacaacaaccacctctacaagcaaatctccaacagcacatctggaggatcttcaaatgacaacgcctacttcggctacagcaccccctgggggtattttgactttaacagattccactgccacttttcaccacgtgactggcagcgactcatcaacaacaactggggattccggcccaagagactcagcttcaagctcttcaacatccaggtcaaggaggtcacgcagaatgaaggcaccaagaccatcgccaataacctcaccagcaccatccaggtgtttacggactcggagtaccagctgccgtacgttctcggctctgcccaccagggctgcctgcctccgttcccggcggacgtgttcatgattccccagtacggctacctaacactcaacaacggtagtcaggccgtgggacgctcctccttctactgcctggaatactttccttcgcagatgctgagaaccggcaacaacttccagtttacttacaccttcgaggacgtgcctttccacagcagctacgcccacagccagagcttggaccggctgatgaatcctctgattgaccagtacctgtactacttgtctcggactcaaacaacaggaggcacgacaaatacgcagactctgggcttcagccaaggtgggcctaatacaatggccaatcaggcaaagaactggctgccaggaccctgttaccgccagcagcgagtatcaaagacatctgcggataacaacaacagtgaatactcgtggactggagctaccaagtaccacctcaatggcagagactctctggtgaatccgggcccggccatggcaagccacaaggacgatgaagaaaagttttttcctcagagcggggttctcatctttgggaagcaaggctcagagaaaacaaatgtggacattgaaaaggtcatgattacagacgaagaggaaatcaggacaaccaatcccgtggctacggagcagtatggttctgtatctaccaacctccagagaggcaacagacaagcagctaccgcagatgtcaacacacaaggcgttcttccaggcatggtctggcaggacagagatgtgtaccttcaggggcccatctgggcaaagattccacacacggacggacattttcacccctctcccctcatgggtggattcggacttaaacaccctccgcctcagatcctgatcaagaacacgcctgtacctgcggatcc
tccgaccaccttcaaccagtcaaagctgaactctttcatcacccagtattctactggccaagtcagcgtggagatcgagtgggagctgcagaaggaaaacagcaagcgctggaaccccgagatccagtacacctccaactactacaaatctacaagtgtggactttgctgttaatacagaaggcgtgtactctgaaccccgccccattggcacccgttacctcacccgtaatctgtaaTTGCTTGTTAATCAATAAACCGTTTAATTCGTTTCAGTTGAACTTTGGTCTCTGCGTATTTCTTTCTTATCTAGTTTCCATATGCATGTAGATAAGTAGCATGGCGGGTTAATCATTAACTAAccggtacctctagaactatagctagcgatgaccctgctgattggttcgctgaccatttccgggtgcgggacggcgttaccagaaactcagaaggttcgtccaaccaaaccgactctgacggcagtttacgagagagatgatagggtctgcttcagtaagccagatgctacacaattaggcttgtacatattgtcgttagaacgcggctacaattaatacataaccttatgtatcatacacatacgatttaggtgacactatagaatacacggaattaattcttggccactccctctctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatcactaggggttccttaccgaCTGAAACTAGACAAAACCCGTGTGACTGGCATCGATTATTCTATTTGATCTAGCTAGTCCTAGCAAAGTGACAACTGCTACTCCCCTCCTACACAGCCAAGATTCCTAAGTTGGCAGTGGCATGCTTAATCCTCAAAGCCAAAGTTACTTGGCTCCAAGATTTATAGCCTTAAACTGTGGCCTCACATTCCTTCCTATCTTACTTTCCTGCACTGGGGTAAATGTCTCCTTGCTCTTCTTGCTTTCTGTCCTACTGCAGGGCTCTTGCTGAGCTGGTGAAGCACAAGCCCAAGGCTACAGCGGAGCAACTGAAGACTGTCATGGATGACTTTGCACAGTTCCTGGATACATGTTGCAAGGCTGCTGACAAGGACACCTGCTTCTCGACTGAGGTCAGAAACGTTTTTGCATTTTGACGATGTTCAGTTTCCATTTTCTGTGCACGTGGTCAGGTGTAGCTCTCTGGAACTCACACACTGAATAACTCCACCAATCTAGATGTTGTTCTCTACGTAACTGTAATAGAAACTGACTTACGTAGCTTTTAATTTTTATTTTCTGCCACACTGCTGCCTATTAAATACCTATTATCACTATTTGGTTTCAAATTTGTGACACAGAAGAGCATAGTTAGAAATACTTGCAAAGCCTAGAATCATGAACTCATTTAAACCTTGCCCTGAAATGTTTCTTTTTGAATTGAGTTATTTTACACATGAATGGACAGTTACCATTATATATCTGAATCATTTCACATTCCCTCCCATGGCCTAACAACAGTTTATCTTCTTATTTTGGGCACAACAGATGTCAGAGAGCCTGCTTTAGGAATTCTAAGTAGAACTGTAATTAAGCAATGCAAGGCACGTACGTTTACTATGTCATTGCCTATGGCTATGAAGTGCAAATCCTAACAGTCCTGCTAATACTTTTCTAACATCCATCATTTCTTTGTTTTCAGGGTCCAAACCTTGTCACTAGATGCAAAGACGCCTTAGCCggaagcggcgccaccaatttcagcctgctgaaacaggccggcgacgtggaagagaaccctggccctTCCTTTATTCCAGTGGCCGAGGACTCCGACTTTCCCATCCAAAACCTGCCCTATGGTGTTTTCTCCACTCAAAGCAACCCAAAGCCACGGATTGGTGTAGCCATCGGTGACCAGATCTTGGACCTGAGTGTCATTAAACAC
CTCTTTACCGGACCTGCCCTTTCCAAACATCAACATGTCTTCGATGAGACAACTCTCAATAACTTCATGGGTCTGGGTCAAGCTGCATGGAAGGAGGCAAGAGCATCCTTACAGAACTTACTGTCTGCCAGCCAAGCCCGGCTCAGAGATGACAAGGAGCTTCGGCAGCGTGCATTCACCTCCCAGGCTTCTGCGACAATGCACCTTCCTGCTACCATAGGAGACTACACGGACTTCTACTCTTCTCGGCAGCATGCCACCAATGTTGGCATTATGTTCAGAGGCAAGGAGAATGCGCTGTTGCCAAATTGGCTCCACTTACCTGTGGGATACCATGGCCGAGCTTCCTCCATTGTGGTATCTGGAACCCCGATTCGAAGACCCATGGGGCAGATGAGACCTGATAACTCAAAGCCTCCTGTGTATGGTGCCTGCAGACTCTTAGACATGGAGTTGGAAATGGCTTTCTTCGTAGGCCCTGGGAACAGATTCGGAGAGCCAATCCCCATTTCCAAAGCCCATGAACACATTTTCGGGATGGTCCTCATGAACGACTGGAGCGCACGAGACATCCAGCAATGGGAGTACGTCCCACTTGGGCCATTCCTGGGGAAAAGCTTTGGAACCACAATCTCCCCGTGGGTGGTGCCTATGGATGCCCTCATGCCCTTTGTGGTGCCAAACCCAAAGCAGGACCCCAAGCCCTTGCCATATCTCTGCCACAGCCAGCCCTACACATTTGATATCAACCTGTCTGTCTCTTTGAAAGGAGAAGGAATGAGCCAGGCGGCTACCATCTGCAGGTCTAACTTTAAGCACATGTACTGGACCATGCTGCAGCAACTCACACACCACTCTGTTAATGGATGCAACCTGAGACCTGGGGACCTCTTGGCTTCTGGAACCATCAGTGGATCAGACCCTGAAAGCTTTGGCTCCATGCTGGAACTGTCCTGGAAGGGAACAAAGGCCATCGATGTGGGGCAGGGGCAGACCAGGACCTTCCTGCTGGACGGCGATGAAGTCATCATAACAGGTCACTGCCAGGGGGACGGCTACCGTGTTGGCTTTGGCCAGTGTGCTGGGAAAGTGCTGCCTGCCCTTTCACCAGCCTAAACACATCACAACCACAACCTTCTCAGGTAACTATACTTGGGACTTAAAAAACATAATCATAATCATTTTTCCTAAAACGATCAAGACTGATAACCATTTGACAAGAGCCATACAGACAAGCACCAGCTGGCACTCTTAGGTCTTCACGTATGGTCATCAGTTTGGGTTCCATTTGTAGATAAGAAACTGAACATATAAAGGTCTAGGTTAATGCAATTTACACAAAAGGAGACCAAACCAGGGAGAGAAGGAACCAAAATTAAAAATTCAAACCAGAGCAAAGGAGTTAGCCCTGGTTTTGCTCTGACTTACATGAACCACTATGTGGAGTCCTCCATGTTAGCCTAGTCAAGCTTATCCTCTGGATGAAGTTGAAACCATATGAAGGAATATTTGGGGGGTGGGTCAAAACAGTTGTGTATCAATGATTCCATGTGGTTTGACCCAATCATTCTGTGAATCCATTTCAACAGAAGATACAACGGGTTCTGTTTCATAATAAGTGATCCACTTCCAAATTTCTGATGTGCCCCATGCTAAGCTTTAACAGAATTTATCTTCTTATGACAAAGCAGCCTCCTTTGAAAATATAGCCAACTGCACACAGCTATGTTGATCAATTTTGTTTATAATCTTGCAGAAGAGAATTTTTTAAAATAGGGCAATAATGGAAGGCTTTGGCAAAAAAATTGTTTCTCCATATGAAAACAAAAAACTTATTTTTTTATTCAAGCAAAGAACCTATAGACATAAGGCTATTTCAAAATTATTTCAGTTTTAGAAAGAATTGAAAGTTTTGTAGCATTCTGAGAAGACAGCTTTCATTTGTAATCATAGGTAATATGTAGGTCCTCAGAAATGGTGAGACCCCTGACTTTGACACT
TGGGGACTCTGAGGGACCAGTGATGAAGAGGGCACAACTTATATCACACATGCACGAGTTGGGGTGAGAGGGTGTCACAACATCTATCAGTGTGTCATCTGCCCACCAAGTAACAGATGTCAGCTAAGACTAGGTCATGTGTAGGCTGTCTACACCAGTGAAAATCGCAAAAAGAATCTAAGAAATTCCACATTTCTAGAAAATAGGTTTGGAAACCGTATTCCATTTTACAAAGGACACTTACATTTCTCTTTTTGTTTTCCAGGCTACCCTGAGAAAAAAAGACATGAAGACTCAGGACTCATCTTTTCTGTTGGTGTAAAATCAACACCCTAAGGAACACAAATTTCTTTAAACATTTGACTTCTTGTCTCTGTGCTGCAATTAATAAAAAATGGAAAGAATCTACTCTGTGGTTCAGAACTCTATCTTCCAAAGGCGCGCTTCACCCTAGCAGCCTCTTTGGCTCAGAGGAATCCCTGCCTTTCCTCCCTTCATCTCAGCAGAGAATGTAGTTCCACATGGGCAACACAATGAAAATAAACGTTAATACTCTCCCATCTTATGGGTGGTGACCCTAGAAACCAATACTTCAACATTACGAGAATTCTGAATGAGAGACTAAAAGCTTATGAACTGTGGCTTTCCTTTGTCAGTGGGACTCTAAGAATGAGTTGGGGACAAAAGAGATAGGAATGGCTTTAAAGGTGACTAGTTGAACTGATAAAGTAAATGAACTGAGGAAAAAAAATATCACTCAAtcggtaaggaacccctagtgatggagttggccactccctctctgcgcgctcgctcgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccgggcggcctcagtgagcgagcgagcgcgcagagagggagtggccaactttttgcaaaagcctaggcctccaaaaaagcctcctcactacttctggaatagctcagaggccgaggcggcctcggcctctgcataaataaaaaaaattagtcagccatggggcggagaatgggcggaactgggcggagttaggggcgggatgggcggagttaggggcgggactatggttgctgactaattgagatgcatgctttgcatacttctgcctgctggggagcctggggactttccacacctggttgctgactaattgagatgcatgctttgcatacttctgcctgctggggagcctggggactttccacaccctaactgacacacattccacagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatc
cggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaa

Characterization of disease progression and regression in Fah-/- mice with or without NTBC treatment

Fah-/- mice aged 7 weeks were maintained on drinking water containing 8 mg/L NTBC before the study began. The animals were then supplied with regular water (NTBC stopped) for 7 days, followed by 7 days of water containing 8 mg/L NTBC, and this cycle was repeated for a further 14 days. Body weight was monitored daily during the study period. Submandibular blood and urine were sampled periodically, and serum and plasma were collected and stored at -80 ℃ until further analysis. At the time of sacrifice, blood was collected by cardiac puncture to obtain plasma. For liver dissection, one lobe was fixed in 10% formalin and the remaining lobes were snap frozen and stored at -80 ℃. The next day, formalin-fixed livers were transferred to 70% ethanol for paraffin embedding.
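The on/off NTBC cycling described above can be laid out as a day-by-day schedule. The following is only a minimal sketch, assuming that "the cycle repeated for a further 14 days" means two 14-day cycles (28 study days in total); the function name and its parameters are hypothetical and are not part of the disclosed protocol.

```python
from typing import List


def ntbc_cycle_schedule(days_off: int = 7,
                        days_on: int = 7,
                        cycles: int = 2,
                        on_dose_mg_per_l: float = 8.0) -> List[float]:
    """Return the NTBC concentration (mg/L) in drinking water for each study day.

    Assumption: each cycle starts with regular water (NTBC stopped) and is
    followed by water containing NTBC at the full dose.
    """
    schedule: List[float] = []
    for _ in range(cycles):
        schedule.extend([0.0] * days_off)              # regular water, NTBC stopped
        schedule.extend([on_dose_mg_per_l] * days_on)  # NTBC restored at 8 mg/L
    return schedule


schedule = ntbc_cycle_schedule()
for day, dose in enumerate(schedule, start=1):
    print(f"day {day:2d}: {dose} mg/L NTBC")
```

Under these assumptions, study days 1 to 7 and 15 to 21 are off NTBC, and days 8 to 14 and 22 to 28 are on NTBC.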

During the selective expansion of edited hepatocytes, the severity of the disease phenotype of GeneRide-treated Fah-/- mice was measured.

Four-week-old Fah-/- animals were treated with rAAV.DJ-GR-mFah under anesthesia at a dose of 1e14 vg/kg through the retro-orbital sinus. All mice were maintained on 8 mg/L NTBC drinking water before the study began and for 3 weeks after dosing. Thereafter, animals were supplied with regular water (NTBC stopped) until the end of the study. Body weight was monitored daily during the study period. Submandibular blood and urine were sampled periodically, and serum and plasma were collected and stored at -80 ℃ until further analysis. At the time of sacrifice, blood was collected by cardiac puncture to obtain plasma. For liver dissection, one lobe was fixed in 10% formalin and the remaining lobes were snap frozen and stored at -80 ℃. The next day, formalin-fixed livers were transferred to 70% ethanol for paraffin embedding.
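Because the dose is stated per kilogram of body weight, the number of vector genomes per animal scales with the animal's weight. The following minimal sketch shows that arithmetic; the 15 g body weight is a hypothetical example value and is not taken from the study, and the function name is likewise illustrative.

```python
def total_vector_genomes(dose_vg_per_kg: float, body_weight_g: float) -> float:
    """Convert a weight-normalized dose (vg/kg) into vector genomes per animal."""
    if dose_vg_per_kg <= 0 or body_weight_g <= 0:
        raise ValueError("dose and body weight must be positive")
    # Body weight is given in grams, so divide by 1000 to convert to kilograms.
    return dose_vg_per_kg * body_weight_g / 1000.0


# Hypothetical 15 g mouse at the 1e14 vg/kg dose used in this study:
# 1e14 vg/kg x 0.015 kg = 1.5e12 vg per animal.
vg_per_animal = total_vector_genomes(1e14, 15.0)
print(f"{vg_per_animal:.2e} vg per animal")
```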

GENERIDE™ and nitisinone (NTBC) therapy were evaluated for their prophylactic effect on hepatocellular carcinoma development.

Four-week-old Fah-/- animals were treated with rAAV.DJ-GR-mFah under anesthesia at a dose of 1e14 vg/kg through the retro-orbital sinus. GeneRide-treated mice were maintained on 8 mg/L NTBC drinking water before the study started and for 1 week after dosing. Thereafter, animals were supplied with regular water (NTBC stopped) until the end of the study. Fah-/- animals and Fah+/- animals receiving 8 mg/L NTBC treatment served as study controls. Body weight was monitored throughout the 1-year study. For animals receiving a suboptimal NTBC dose (0.8 mg/L), body weight was monitored twice per week from the start of the 0.8 mg/L NTBC treatment. When any animal showed a 10% weight loss relative to the previous week, that animal was given 8 mg/L NTBC for up to 7 days. Thereafter, the animal returned to 0.8 mg/L NTBC until the next weight loss event, if any. For all animals, submandibular blood and urine were sampled periodically, and serum and plasma were collected and stored at -80 ℃ until further analysis. At the time of sacrifice, blood was collected by cardiac puncture to obtain plasma. For liver dissection, one lobe was fixed in 10% formalin and the remaining lobes were snap frozen and stored at -80 ℃. The next day, formalin-fixed livers were transferred to 70% ethanol for paraffin embedding.
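The weight-based rescue rule for animals on the suboptimal NTBC dose can be expressed as a simple decision function. This is a minimal sketch under stated assumptions: the text does not say whether a loss of exactly 10% triggers rescue, so the sketch treats a loss of 10% or more as a trigger, and the function names and example weights are hypothetical.

```python
def weight_loss_fraction(weight_now_g: float, weight_week_ago_g: float) -> float:
    """Fractional weight loss relative to the previous week (negative means gain)."""
    if weight_week_ago_g <= 0:
        raise ValueError("previous weight must be positive")
    return (weight_week_ago_g - weight_now_g) / weight_week_ago_g


def select_ntbc_dose(weight_now_g: float,
                     weight_week_ago_g: float,
                     maintenance_mg_per_l: float = 0.8,
                     rescue_mg_per_l: float = 8.0,
                     loss_threshold: float = 0.10) -> float:
    """Pick the NTBC drinking-water concentration (mg/L) for the coming period.

    Assumption: a loss at or above the threshold switches the animal to the
    rescue dose (given for up to 7 days); otherwise it stays on the
    maintenance dose.
    """
    loss = weight_loss_fraction(weight_now_g, weight_week_ago_g)
    return rescue_mg_per_l if loss >= loss_threshold else maintenance_mg_per_l


# Hypothetical weights: 20.0 g last week, 17.5 g now (12.5% loss).
print(select_ntbc_dose(17.5, 20.0))
# Hypothetical weights: 20.0 g last week, 19.0 g now (5% loss).
print(select_ntbc_dose(19.0, 20.0))
```

Keeping the threshold and both doses as parameters lets the same rule be reused for the 10% trigger described in the other examples.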

Serum activated partial thromboplastin time (aPTT) assay

Serum aPTT was quantified on a Stago blood coagulation analyzer (#00595) using the provided kit according to the manufacturer's protocol (Diagnostica Stago, Parsippany, NJ).

Serum clinical chemistry evaluation

Serum samples were analyzed using liquid chromatography-tandem mass spectrometry (LC-MS/MS) at Charles River Laboratories (Ashland, OH). Clinical chemistry evaluations included aspartate aminotransferase (AST), alanine aminotransferase (ALT), alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT), total bilirubin (TBIL), albumin (ALB), total protein (TPROT), globulin (GLOB), creatine kinase (CK), urea nitrogen (UREAN), creatinine (CREAT), electrolytes (PHOS, NA, K, CL), glucose (GLUC), cholesterol (CHOL), and triglycerides (TRIG).

Example 2 GENERIDE™ treatment improves liver function in vivo

This example demonstrates, among other things, that viral vectors comprising sequences encoding fumarylacetoacetate hydrolase (FAH) can be used to treat or prevent tyrosinemia in vivo (e.g., in one or more mouse models).

A viral vector comprising an AAV-DJ viral capsid, a human FAH (hFAH) transgene, a P2A sequence, a flanking 5' homology arm 1000 nucleotides (nt) in length, and a 3' homology arm 1600 nt in length was constructed (FIG. 1A). The homology arms were designed to be complementary to the target integration site at the mouse genomic albumin locus. FRG mice (a mouse model of tyrosinemia) aged 4 weeks were administered the viral vector by intravenous injection, and NTBC drinking water (8 mg/L) was maintained from birth until 1 week after administration (FIG. 1B). Between 1 and 4 weeks post-dose, the NTBC dose was adjusted as needed (e.g., animals were cycled back to 8 mg/L NTBC drinking water for one week if body weight dropped by 10% from the previous week's measurement). NTBC administration was stopped from 4 weeks after dosing until sacrifice (9 or 16 weeks after dosing).

Mice were assessed for the circulating GENERIDE™ biomarker (e.g., ALB-2A levels) up to 9 weeks post-treatment (fig. 1C) and for body growth (by percent body weight change) up to 16 weeks post-treatment (fig. 1D). Markers of liver function (e.g., alanine aminotransferase (ALT), bilirubin) were also assessed (fig. 1E and 1F). Mice were sacrificed and their livers isolated at 9 weeks (9 WK; 4 animals harvested) or 16 weeks (16 WK; 4 animals harvested) after treatment. Liver samples from a subset of the sacrificed mice were analyzed by immunohistochemical staining with anti-FAH antibodies (fig. 1G). Hepatocytes were isolated from a subset of individual treated mice whose livers had been perfused (n=2), and gDNA integration in the isolated hepatocytes was assessed (FIG. 1H). The presence of alpha-fetoprotein (AFP), a marker of hepatocellular carcinoma (HCC), in plasma samples from treated mice was also assessed and compared with that in healthy Fah+/- heterozygous (HET) mice and vehicle-only treated mice (fig. 1I).

The present disclosure demonstrates, among other things, that treatment with a viral vector comprising a sequence encoding human FAH (hFAH) can provide improvement in liver function in subjects with type 1 hereditary tyrosinemia (HT1), such as in the FRG mouse model system, compared to a reference (e.g., untreated or vector treated). In some embodiments, treating a subject with a viral vector of the present disclosure may provide a reduction in the level of a biomarker (e.g., ALT, bilirubin) associated with reduced liver function as compared to a reference (e.g., untreated or vector treated). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure may allow or restore normal growth (e.g., as measured by a percentage change in body weight over time) relative to a reference (e.g., untreated or vector treated). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure may provide for reduced levels of biomarkers (e.g., AFP) associated with a disease (e.g., cancer, including HCC). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure may provide one or more of an improvement in liver function (e.g., as measured by assessing markers of liver function), normal growth (e.g., as measured by a percentage change in body weight over time) relative to a reference (e.g., untreated or vector treated), and a decrease in the level of a biomarker (e.g., AFP) associated with a disease (e.g., cancer, including HCC).

As shown in fig. 1H, among other things, the present disclosure demonstrates that viral vectors of the present disclosure are capable of integrating a transgene sequence (e.g., FAH) into a genomic target site in a subject. In some embodiments, integration of a transgene sequence (e.g., FAH) can provide a selective advantage to cells (e.g., liver cells) in a subject (e.g., a subject with hereditary tyrosinemia). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure can result in integration of the delivered transgene (e.g., FAH) into more than 10% (e.g., 20%, 30%, 40%, 50%) of the genomic DNA of cells (e.g., liver cells) within a particular tissue type (e.g., liver). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure can cause more than 50% (e.g., 60%, 70%, 80%, 90%, 95%, 99%, 100%, etc.) of the cells (e.g., liver cells) within a particular tissue type (e.g., liver) to be cells that have successfully integrated the delivered transgene (e.g., FAH).

Example 3 GENERIDE™ treatment allows rapid selective expansion of edited cells in vivo

This example demonstrates that, among other things, a viral vector comprising a sequence encoding fumarylacetoacetate hydrolase (FAH) can be administered to a subject (e.g., a subject suffering from type 1 hereditary tyrosinemia) at certain doses and provides selective advantages for cells that have successfully integrated the FAH coding sequence.

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm 1000 nucleotides (nt) in length, and a 3' homology arm 1600 nt in length was constructed (FIG. 2A). The homology arms were designed to be complementary to the target integration site at the mouse genomic albumin locus. The vector described herein was administered by intravenous injection to two-week-old Fah-/- mice (FIG. 2A). Mice were supplied with 8 mg/L NTBC from birth until 1 week after GENERIDE™ administration (3 weeks of age). Between 3 and 4 weeks of age, 8 mg/L NTBC drinking water was provided only to animals in which a 10% decrease in body weight from the previous week was observed. NTBC administration was stopped from 4 weeks of age until sacrifice (at least 18 weeks of age). Another group of Fah-/- mice was maintained continuously on 8 mg/L NTBC drinking water and served as a standard-of-care control group. Mice were assessed for circulating biomarkers (e.g., ALB-2A levels) up to 8 weeks post-treatment (fig. 2B) and for survival after NTBC removal (fig. 2C). Liver function markers (e.g., ALT) and toxic metabolites (e.g., succinylacetone (SUAC)) were also assessed (fig. 2D and 2E).

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm 1000 nucleotides (nt) in length, and a 3' homology arm 1600 nt in length was constructed (FIG. 2A). The homology arms were designed to be complementary to the target integration site at the mouse genomic albumin locus. The vector described herein was administered by intravenous injection to Fah-/- mice aged 1 month (FIG. 3A). Mice were treated with 8 mg/L NTBC from birth until the GENERIDE™ vector was administered. NTBC administration was stopped from 1 month of age until sacrifice (at least 12 months of age). Circulating biomarkers (e.g., ALB-2A levels) were assessed in mice aged at least 5 months (fig. 3B), and body weight was monitored for at least 6 months after treatment (fig. 3C). HCC risk (AFP level) was also assessed in these mice (at least 6 months of age) (fig. 3D).

The present disclosure demonstrates, among other things, that treating a subject (e.g., a subject with type 1 hereditary tyrosinemia) with the viral vectors of the present disclosure can provide a rapid selective advantage to cells (e.g., liver cells), resulting in complete repopulation of the diseased liver within 4 weeks after treatment. In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure can cause more than 50% (e.g., 60%, 70%, 80%, 90%, 95%, 99%, 100%, etc.) of the cells (e.g., liver cells) within a particular tissue type (e.g., liver) to be cells that have successfully integrated the delivered transgene (e.g., FAH) within 4 weeks after treatment. In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure at certain doses (e.g., 3e13 vg/kg, 1e14 vg/kg) can provide a selective advantage to cells (e.g., liver cells) within 4 weeks after treatment.

Among other things, the present disclosure demonstrates that treatment with a viral vector comprising a sequence encoding mouse FAH (mFAH) can provide improvement in liver function in subjects with type 1 hereditary tyrosinemia (HT1), such as in the Fah-/- mouse model system, compared to a reference (e.g., untreated). In some embodiments, treating a subject with a viral vector of the present disclosure may provide a reduction in the level of a biomarker (e.g., ALT, bilirubin) associated with reduced liver function as compared to a reference (e.g., untreated). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure can provide a reduction in the level of a deleterious metabolite (e.g., SUAC) relative to a reference (e.g., untreated or NTBC treated).

Among other things, the present disclosure demonstrates that treating a subject (e.g., a subject with hereditary tyrosinemia) with the viral vectors of the present disclosure can reduce HCC risk compared to a reference (e.g., an untreated or NTBC-treated subject). In some embodiments, treatment of a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure may provide a reduction in the level of a biomarker (e.g., AFP) associated with a disease (e.g., cancer, including HCC) at least 5 months (6 months of age) after administration, as compared to an age-matched reference (e.g., an untreated or NTBC-treated subject).

The present disclosure demonstrates, among other things, that treating a subject (e.g., a subject with hereditary tyrosinemia) with the viral vectors of the present disclosure can exhibit sustained transgene expression after adulthood. In some embodiments, administration of a viral vector of the present disclosure to a subject at least one month of age demonstrates sustained expression of a circulating biomarker (e.g., ALB-2A) at least 5 months after administration. In some embodiments, subjects treated with the viral vectors of the present disclosure exhibit similar body weight compared to a reference (e.g., subjects treated with NTBC). In some embodiments, treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure may provide a reduction in the level of a biomarker (e.g., AFP) associated with a disease (e.g., cancer, including HCC) compared to a reference (e.g., untreated or NTBC treated subject) at an age of at least 6 months.

Example 4 Optimization of GENERIDE™ combination therapy

This example demonstrates, among other things, that a viral vector comprising a sequence encoding fumarylacetoacetate hydrolase (FAH) can be administered to a subject (e.g., a subject suffering from hereditary tyrosinemia) in combination with certain doses of one or more alternative therapies (e.g., NTBC therapy) in order to optimize the selective advantage of cells that have successfully integrated the FAH coding sequence.

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm 1000 nucleotides (nt) in length, and a 3' homology arm 1600 nt in length was constructed. The homology arms were designed to be complementary to the target integration site at the mouse genomic albumin locus. The viral vectors described herein were administered by intravenous injection at a dose of 1e14 vg/kg to four groups (groups 1, 2, 3, and 4) of 4-week-old Fah-/- mice. All groups of mice were maintained on NTBC drinking water (8 mg/L) for 4 weeks. The titrated groups then received reduced doses of NTBC (3 mg/L, 0.8 mg/L, and 0.3 mg/L, respectively) for 8 weeks, while mice in group 1 were maintained on the standard dose of NTBC for 8 weeks. Mice were assessed for circulating biomarkers (e.g., ALB-2A levels) up to 6 weeks after NTBC titration (fig. 4B).

The present disclosure demonstrates, among other things, that treating a subject (e.g., a subject with hereditary tyrosinemia) with a viral vector of the present disclosure can include administering the viral vector in combination with one or more alternative therapies (e.g., alternative HT1 therapies) in order to provide selective advantages to cells (e.g., liver cells). In some embodiments, the combined administration of the viral vectors of the present disclosure with one or more alternative therapies (e.g., NTBC) may be optimized (e.g., by optimizing dose levels and/or timing) to provide selective advantages to cells (e.g., liver cells) while maintaining or improving liver function (e.g., reduced levels of biomarkers (e.g., ALT, bilirubin) associated with reduced liver function (fig. 4C), reduced levels of detrimental metabolites (SUAC), etc.). In some embodiments, the dosage level of one or more replacement therapies (e.g., NTBC therapies) may be titrated to provide selective advantages to cells (e.g., liver cells) while controlling the severity of the disease (e.g., alleviating symptoms and/or side effects of the disease).

Among other things, the present disclosure demonstrates that treatment with a viral vector comprising a sequence encoding FAH can provide improved integration with titrated lower doses of supplemental NTBC in subjects with type 1 hereditary tyrosinemia (e.g., in the Fah-/- mouse model system). In some embodiments, as shown in fig. 4B, treatment of a subject with a viral vector of the present disclosure may provide increased levels of a circulating biomarker (e.g., ALB-2A) at suboptimal NTBC doses, while high doses of NTBC may prevent selective expansion of GENERIDE™-edited cells (e.g., hepatocytes).

Example 5 GENERIDE™ treatment may exhibit improved liver function prior to complete selective expansion of edited cells in vivo

This example demonstrates, among other things, that administration of a viral vector comprising a specific component and a sequence encoding fumarylacetoacetate hydrolase (FAH) to a subject (e.g., a subject suffering from type 1 hereditary tyrosinemia) can improve liver function before completion of selective expansion of cells that have successfully integrated the FAH coding sequence.

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm 1000 nucleotides (nt) in length, and a 3' homology arm 1600 nt in length was constructed. The homology arms were designed to be complementary to the target integration site at the mouse genomic albumin locus. The viral vectors described herein were administered to Fah-/- mice at a dose of 1e14 vg/kg. All mice were maintained on 8 mg/L NTBC drinking water before the study began and for 3 weeks after dosing. Thereafter, animals were supplied with regular water (NTBC stopped) until the end of the study. Mice were assessed for the circulating GENERIDE™ biomarker (e.g., ALB-2A levels) (fig. 7A), the presence of toxic metabolites (e.g., SUAC) (fig. 7B), markers of liver injury and synthetic function (e.g., ALT) (fig. 7C), blood clotting (e.g., activated partial thromboplastin time (aPTT)) (fig. 7D), blood glucose (fig. 7E), and/or body weight (fig. 7F).

As shown in fig. 6A-6F, NTBC treatment may quickly (e.g., in less than 7 days) normalize the hereditary tyrosinemia biomarker (e.g., SUAC) and liver synthetic function (aPTT, ALT, blood glucose), and removal of NTBC treatment is quickly (e.g., in less than 3 days) followed by a return to abnormal levels of the hereditary tyrosinemia biomarker (e.g., SUAC) and liver synthetic function (aPTT, ALT, blood glucose).

As shown in fig. 7A-7F, in Fah-/- mice in which GENERIDE™ treatment replaced NTBC, expansion of corrected cells within 7 days may be sufficient to control SUAC in the absence of NTBC treatment. On day 7, only mild liver injury was observed, and no severe metabolic dysfunction was detected. From day 17 to day 28, SUAC levels tended to drop to the normal range, followed by weight gain. Importantly, by day 28 (i.e., 4 weeks), GENERIDE™-edited cells that had successfully integrated the FAH transgene had completely repopulated the liver of GENERIDE™-treated mice, as shown in FIG. 8B. As shown herein, GENERIDE™ treatment produced liver and kidney biomarker profiles comparable to those of full-dose NTBC treatment, and both GENERIDE™ and NTBC treatments resulted in levels similar to those of healthy Fah+/- mice (fig. 8C and 8D).

Example 6. Long-term GENERIDE™ treatment allows for rapid selective expansion of edited cells in the liver at low risk of hepatocellular carcinoma (HCC)

This example demonstrates, among other things, that administration of a viral vector comprising a specific component and a sequence encoding fumarylacetoacetate hydrolase (FAH) to a subject (e.g., a subject suffering from type 1 hereditary tyrosinemia) can provide selective advantages to cells that have successfully integrated the FAH coding sequence, and that these edited hepatocytes can confer a lower risk of producing HCC than non-edited hepatocytes.

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm of 1000 nucleotides (nt) in length, and a flanking 3' homology arm of 1600 nt in length was constructed. The homology arms are designed to be complementary to the target integration site in the mouse genomic albumin locus. The viral vectors described herein were administered to Fah-/- mice at a dose of 1e14 vg/kg. Mice treated with GENERIDE™ were maintained on drinking water containing 8 mg/L NTBC before the study started and for 1 week after dosing. Thereafter, animals were supplied with regular water (NTBC stopped) until the end of the study (fig. 9A).

During the study, the body weight of the animals was monitored. For animals receiving suboptimal NTBC doses, body weight was monitored twice per week from the start of 0.8 mg/L NTBC treatment. If a 10% weight loss from the previous week was observed, the animals were given 8 mg/L NTBC for 7 days. Mice were assessed for the circulating GENERIDE™ biomarker (e.g., ALB-2A levels) (fig. 9B), the presence of toxic metabolites (e.g., SUAC) (fig. 9B), and tyrosine levels (fig. 9E). HCC risk (AFP level) was also assessed in these mice (at least 7 months of age) (fig. 9D).
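The weight-based rescue rule described above can be sketched in code. The following Python is a minimal illustration only, assuming that weight loss is computed relative to the previous week's weight; the function name `needs_ntbc_rescue` and the example weights are hypothetical and are not taken from the study.

```python
def needs_ntbc_rescue(previous_week_g: float, current_g: float,
                      threshold: float = 0.10) -> bool:
    """Return True when body weight has dropped by at least `threshold`
    (10% by default) relative to the previous week's weight."""
    if previous_week_g <= 0:
        raise ValueError("previous weight must be positive")
    loss_fraction = (previous_week_g - current_g) / previous_week_g
    return loss_fraction >= threshold


# Hypothetical weights in grams, for illustration only.
print(needs_ntbc_rescue(20.0, 17.9))  # 10.5% loss
print(needs_ntbc_rescue(20.0, 18.5))  # 7.5% loss
```

Under this reading, an animal flagged by the rule would be switched to 8 mg/L NTBC for 7 days, as stated in the text.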

As shown in FIG. 9C, GENERIDE™-treated Fah-/- mice may exhibit durable hepatocyte edits for at least 10 months. GENERIDE™ treatment reduced toxic metabolites (e.g., SUAC) (fig. 9B) to undetectable levels and restored tyrosine metabolism (fig. 9E), demonstrating long-term superiority to NTBC treatment. This advantage is clinically relevant, especially when dietary restrictions and NTBC treatment are not strictly followed (e.g., comparable to the suboptimal NTBC treatment groups). The GENERIDE™-treated Fah-/- mice may exhibit reduced HCC biomarkers (e.g., AFP) compared to continuous high-dose NTBC (8 mg/L) (FIG. 9D), indicating that GENERIDE™ treatment may be associated with the same or a lower risk of HCC than the standard of care (e.g., NTBC).

Example 7. GENERIDE™ therapies were optimized in pediatric mice.

This example demonstrates that a viral vector comprising a sequence encoding fumarylacetoacetate hydrolase (FAH) can be administered to a pediatric subject, such as a pediatric subject suffering from hereditary tyrosinemia, in order to optimize the selective advantage of cells that have successfully integrated the FAH coding sequence.

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm of 1000 nucleotides (nt) in length, and a flanking 3' homology arm of 1600 nt in length was constructed. The homology arms are designed to be complementary to the target integration site in the mouse genomic albumin locus. Fah-/- mice were divided into 4 different groups (FIG. 10A). Mice in group 3 were maintained on NTBC drinking water (8 mg/L) for 4 weeks and then on the standard dose of NTBC (8 mg/L) for 8 weeks. Mice in group 4 were maintained on NTBC drinking water (8 mg/L) for 4 weeks, followed by a titration dose of 0.8 mg/L NTBC for 8 weeks. Mice in group 1 were maintained on NTBC drinking water (8 mg/L) for 4 weeks, and the viral vectors described herein were administered at a dose of 1e14 vg/kg at 4 weeks of age. Mice in group 2 were given milk from mothers that had been administered 24 mg/L NTBC for 3 weeks. This ensures that the neonate receives a dose close to the standard dose of NTBC (e.g., 8 mg/L). At 2 weeks of age, mice in group 2 were administered the viral vectors described herein at a dose of 1e14 vg/kg. HCC risk (e.g., AFP levels) was assessed in these mice.

NTBC, the current standard of care, reduces the accumulation of toxic products in patients with hereditary tyrosinemia (see, e.g., Ginkel et al., Adv Exp Med Biol. 2017;959:101-109). However, the risk of developing liver cancer remains, especially if NTBC treatment begins late (as shown by a slow decline in the tumor marker AFP) (see, e.g., Ginkel et al., Adv Exp Med Biol. 2017;959:101-109).

As shown in fig. 10B, GENERIDE™ treatment may be effective in Fah-/- mice (e.g., mice in group 2) as early as postnatal day 14, and the Fah-/- mice may exhibit reduced AFP levels at 3 months of age after receiving treatment. For mice in group 2, AFP levels were at least about 10⁴ to 10⁵ ng/mL at 2 months of age, and were reduced to at least about 10² to 10³ ng/mL at 3 months of age. In addition, AFP levels remained at least about 10² to 10³ ng/mL for at least about 12 months after treatment.

Thus, this example demonstrates that early administration of the viral vectors described herein (e.g., on postnatal day 14) can effectively reduce HCC risk in Fah-/- mice, characterized by reduced AFP levels after treatment.

Example 8. GENERIDE™ treatment of pediatric mice allows for rapid selective expansion of edited cells in vivo

This example further demonstrates that a viral vector comprising a sequence encoding fumarylacetoacetate hydrolase (FAH) can be administered to a pediatric subject, such as a subject suffering from type 1 hereditary tyrosinemia, and can provide a selective advantage to cells that have successfully integrated the FAH coding sequence.

A viral vector comprising an AAV-DJ viral capsid, a mouse FAH (mFAH) transgene, a P2A sequence, a flanking 5' homology arm of 1000 nucleotides (nt) in length, and a flanking 3' homology arm of 1600 nt in length was constructed. The homology arms are designed to be complementary to the target integration site in the mouse genomic albumin locus. The Fah-/- mice were given milk from mothers that had been administered 24 mg/L NTBC for 3 weeks. This ensures that the neonate receives a dose close to the standard dose of NTBC (e.g., 8 mg/L). The vectors described herein were administered to mice at 3E12 vg/kg or 1E13 vg/kg by intravenous injection at two weeks of age (i.e., postnatal day 14). Liver samples were collected 16 weeks after dosing and analyzed by immunohistochemical staining with anti-FAH antibodies.
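Because the doses above are normalized to body weight (vg/kg), the total number of vector genomes per animal depends on the animal's weight. The following Python sketch shows the conversion only; the 8 g body weight is an assumed, illustrative value for a two-week-old mouse and does not come from the study, and the function name is hypothetical.

```python
def vector_genomes_per_animal(dose_vg_per_kg: float, body_weight_g: float) -> float:
    """Convert a weight-normalized dose (vg/kg) into total vector genomes
    for one animal of the given body weight in grams."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return dose_vg_per_kg * (body_weight_g / 1000.0)


# Doses from the text; the 8 g body weight is an assumption for illustration.
for dose in (3e12, 1e13):
    total = vector_genomes_per_animal(dose, body_weight_g=8.0)
    print(f"{dose:.0e} vg/kg -> {total:.1e} vg per animal")
```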

As shown in fig. 11, liver samples from mice treated on postnatal day 14 with GENERIDE™ at a dose of 3E12 vg/kg or 1E13 vg/kg exhibited intense staining for the FAH protein 16 weeks after treatment. Liver samples from GENERIDE™-treated Fah-/- mice administered 3E12 vg/kg exhibited higher-intensity staining for the FAH protein than those from mice administered GENERIDE™ at 1E13 vg/kg.

Thus, this example demonstrates that administration of the vectors described herein at additional doses (e.g., 3E12 vg/kg or 1E13 vg/kg) in pediatric mice (e.g., on postnatal day 14) effectively provides a selective advantage to cells that have successfully integrated the FAH coding sequence, characterized by intense staining for the FAH protein in liver samples. Furthermore, this example demonstrates that administration of lower doses of GENERIDE™ (e.g., 3E12 vg/kg) to Fah-/- mice can improve disease outcome, characterized by robust liver repopulation by cells that have successfully integrated the FAH coding sequence.

Example 9. Dual-plasmid and triple-plasmid systems can be used to generate viral vectors

This example demonstrates, among other things, that a two-plasmid or three-plasmid system can be used to produce an AAV vector.

In some embodiments, HEK293F cells are expanded for vector production. Cells were split to 2e6 cells/mL in 200 mL of Expi293 medium in 500 mL flasks. Plasmid mixtures for the various transfection conditions were prepared and filtered through a 0.22 μm filter unit. Transfection reagent mixtures (e.g., PEI or FectoVIR-AAV) were prepared according to the manufacturer's protocol. The plasmid mixture and the transfection reagent mixture were combined into a single transfection mixture. 20 mL of the transfection mixture was added to 100 mL of HEK293F cells in a 500 mL flask, and the cells were incubated at 37 °C for 72 hours.
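The quantities in this protocol imply a few simple derived values, which can be worked through as follows. This Python sketch is illustrative only: it takes each volume exactly as stated in the text (200 mL at seeding, 100 mL of cells at transfection) without reconciling them, and the function names are hypothetical.

```python
def total_cells(density_cells_per_ml: float, volume_ml: float) -> float:
    """Total cell number for a culture at a given density and volume."""
    return density_cells_per_ml * volume_ml


def mix_fraction(mix_ml: float, culture_ml: float) -> float:
    """Fraction of the final volume contributed by the transfection mixture."""
    return mix_ml / (mix_ml + culture_ml)


seeded = total_cells(2e6, 200)    # cells in the 200 mL seeding culture
fraction = mix_fraction(20, 100)  # 20 mL mixture added to 100 mL of cells

print(f"cells at seeding: {seeded:.1e}")
print(f"transfection mixture share of final volume: {fraction:.1%}")
```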

In some embodiments, plasmids for use in a two-plasmid system comprise an AAV Rep sequence and related sequences from a helper virus ("Rep/helper plasmid") or an AAV Cap sequence and a payload ("payload/Cap plasmid"). In some embodiments, the plasmids used in the three-plasmid system comprise separate plasmids, each encoding one of the following: 1) AAV rep and AAV cap sequences, 2) related sequences from helper viruses, and 3) payloads. A human gene sequence of interest that is compatible with the GENERIDE™ system and has flanking homology arms to mouse albumin (e.g., "mHA-FAH") can be used as the payload for mouse experiments. A human gene sequence of interest that is compatible with the GENERIDE™ system and has flanking homology arms to human albumin (e.g., "hHA-FAH") can be used as the payload for human or humanized mouse experiments. In some embodiments, the payload may comprise SEQ ID NO: 31. In some embodiments, the payload may consist of SEQ ID NO: 31. In some embodiments, the payload can include any of the payloads described herein. Multiple AAV Cap genes encoding different AAV capsids were evaluated in the payload/Cap plasmid. In some embodiments, the AAV cap gene can encode AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAVC11.01, AAVC11.02, AAVC11.03, AAVC11.04, AAVC11.05, AAVC11.06, AAVC11.07, AAVC11.08, AAVC11.09, AAVC11.10, AAVC11.11, AAVC11.12, AAVC11.13, AAVC11.14, AAVC11.15, AAVC11.16, AAVC11.17, AAVC11.18, AAVC11.19, AAV-DJ, AAV-LK03, AAV-LK19, AAVrh.74, AAVrh.10, AAVhu.37, AAVrh.K, AAVrh.39, AAV12, AAV13, AAVrh.8, avian AAV, bovine AAV, canine AAV, equine AAV, primate AAV, non-primate AAV, ovine AAV, or hybrid AAV (e.g., an AAV comprising one or more sequences of one AAV subtype and one or more sequences of a second subtype). In some embodiments, the payload/Cap plasmid may comprise SEQ ID NO: 32. In some embodiments, the payload/Cap plasmid may consist of SEQ ID NO: 32. In some embodiments, the payload/Cap plasmid may comprise SEQ ID NO: 33. In some embodiments, the payload/Cap plasmid may consist of SEQ ID NO: 33. In some embodiments, the payload/Cap plasmid may comprise any of the payload or capsid sequences disclosed herein.
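The way the two systems distribute the same functional components across plasmids can be made explicit with a small data model. The Python below is a sketch under stated assumptions: the component labels ("rep", "cap", "helper", "payload") summarize the text, the plasmid names given for the three-plasmid system are hypothetical, and the check merely confirms that each component is carried exactly once per system.

```python
# Functional components an AAV production system supplies, per the text.
REQUIRED = frozenset({"rep", "cap", "helper", "payload"})

# Two-plasmid system as described in the text.
TWO_PLASMID = {
    "Rep/helper plasmid": {"rep", "helper"},
    "payload/Cap plasmid": {"cap", "payload"},
}

# Three-plasmid system; these plasmid labels are hypothetical.
THREE_PLASMID = {
    "rep/cap plasmid": {"rep", "cap"},
    "helper plasmid": {"helper"},
    "payload plasmid": {"payload"},
}


def supplies_each_component_once(system: dict[str, set[str]]) -> bool:
    """True when every required component appears on exactly one plasmid."""
    carried = [component for parts in system.values() for component in parts]
    return sorted(carried) == sorted(REQUIRED)


for name, system in (("two-plasmid", TWO_PLASMID),
                     ("three-plasmid", THREE_PLASMID)):
    print(name, len(system), supplies_each_component_once(system))
```

The point of the comparison is that the payload travels with the cap gene in the two-plasmid layout, whereas it sits on its own plasmid in the three-plasmid layout.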

Table 1A: exemplary sequences for generating viral vectors.

Exemplary embodiments

1. A composition comprising:

A polynucleotide cassette comprising:

An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration into the genome of the cell at a target integration site;

A third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and

A fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell.

2. The composition of embodiment 1, wherein the composition further comprises a delivery vehicle.

3. The composition according to embodiment 2, wherein the delivery vehicle comprises lipid nanoparticles.

4. The composition according to embodiment 3, wherein the delivery vehicle comprises a recombinant viral vector.

5. The composition of embodiment 4, wherein the recombinant viral vector is a recombinant adeno-associated (AAV) viral vector.

6. The composition according to embodiment 5, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAV-NP59.

7. The composition according to any one of the preceding embodiments, wherein the transgene is or comprises a fumarylacetoacetate hydrolase (FAH) transgene.

8. The composition according to embodiment 7, wherein the FAH transgene is wt human FAH, codon optimized FAH, synthetic FAH, a FAH variant, a FAH mutant or a FAH fragment.

9. The composition of embodiment 7, wherein the FAH transgene has 80% sequence identity to SEQ ID NO: 18, 19, 20, 21 or 22.

10. The composition according to any one of the preceding embodiments, wherein the composition further comprises an AAV2 ITR sequence and/or an ITR sequence selected from SEQ ID NOs: 27-30.

11. The composition according to any one of the preceding embodiments, wherein the polynucleotide cassette does not comprise a promoter sequence.

12. The composition according to any one of the preceding embodiments, wherein the second nucleic acid sequence comprises:

a) A nucleic acid sequence encoding a 2A peptide;

b) A nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES);

c) A nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; or

d) Nucleic acid sequences encoding splice donors and splice acceptors.

13. The composition according to any one of the preceding embodiments, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the expression cassette at a target integration site comprising an endogenous promoter and an endogenous gene.

14. The composition according to embodiment 13, wherein the target integration site is an endogenous albumin locus.

15. A method of integrating a transgene into the genome of at least one cell population in a subject tissue, the method comprising

Administering to a subject a composition comprising:

A polynucleotide cassette comprising:

An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes the transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration into the cell's genome at a target integration site;

A third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and

A fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell;

wherein the transgene is integrated into the genome of the cell population after administration of the composition.

16. The method of embodiment 15, wherein the integration does not comprise nuclease activity.

17. The method according to embodiment 15, wherein the composition further comprises a delivery vehicle.

18. The method according to embodiment 17, wherein the delivery vehicle comprises lipid nanoparticles.

19. The method of embodiment 17, wherein the delivery vehicle comprises a recombinant viral vector.

20. The method of embodiment 19, wherein the recombinant viral vector is a recombinant adeno-associated (AAV) viral vector.

21. The method according to embodiment 19, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAV-NP59.

22. The method according to any one of embodiments 15 to 21, wherein the transgene is or comprises a fumarylacetoacetate hydrolase (FAH) transgene.

23. The method according to embodiment 22, wherein the FAH transgene is wt human FAH, codon-optimized FAH, synthetic FAH, FAH variant, FAH mutant or FAH fragment.

24. The method according to embodiment 22, wherein the FAH transgene has 80% sequence identity to SEQ ID NO: 18, 19, 20, 21 or 22.

25. The method according to any one of embodiments 15 to 24, wherein the composition further comprises an AAV2 ITR sequence and/or an ITR sequence selected from SEQ ID NOs: 27-30.

26. The method according to any one of embodiments 15 to 25, wherein the polynucleotide cassette does not comprise a promoter sequence.

27. The method according to any one of embodiments 15-26, wherein after integrating the expression cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of an endogenous promoter at the target integration site.

28. The method of embodiment 27, wherein the target integration site is an albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene.

29. The method according to embodiment 28, wherein after integrating the expression cassette into the genome of the cell at the target integration site, the transgene is expressed under the control of the endogenous albumin promoter without disrupting expression of the endogenous albumin gene.

30. The method according to any one of the preceding embodiments, wherein the tissue is liver.

31. The method according to any one of the preceding embodiments, wherein the second nucleic acid sequence comprises:

a) A nucleic acid sequence encoding a 2A peptide;

b) A nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES);

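c) A nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; or

d) Nucleic acid sequences encoding splice donors and splice acceptors.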

32. The method according to any one of the preceding embodiments, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the expression cassette at a target integration site comprising an endogenous promoter and an endogenous gene.

33. The method of embodiment 32, wherein the target integration site is an endogenous albumin locus.

34. A method of increasing the expression level of a transgene in a tissue over a period of time, the method comprising administering to a subject in need thereof a composition that delivers a transgene that integrates into the genome of at least one cell population in the subject tissue, wherein the composition comprises:

A polynucleotide cassette comprising:

An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes the transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration into the cell's genome at a target integration site;

A third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and

A fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell;

Wherein, after administration of the composition, the transgene is integrated into the genome of the cell population, and the expression level of the transgene in the tissue increases over a period of time.

35. The method according to embodiment 34, wherein the integration of the transgene does not comprise nuclease activity.

36. The method according to embodiment 34 or 35, wherein the increased expression level comprises an increased percentage of cells expressing the transgene in the tissue.

37. The method of embodiment 34, wherein the composition further comprises a delivery vehicle.

38. The method according to embodiment 37, wherein the delivery vehicle comprises a lipid nanoparticle.

39. The method of embodiment 37, wherein the delivery vehicle comprises a recombinant viral vector.

40. The method according to embodiment 39, wherein the recombinant viral vector is a recombinant AAV vector.

41. The method according to embodiment 40, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAV-NP59.

42. The method according to any one of embodiments 34 to 41, wherein the transgene is or comprises a fumarylacetoacetate hydrolase (FAH) transgene.

43. The method according to embodiment 42, wherein the FAH transgene is wt human FAH, codon-optimized FAH, synthetic FAH, FAH variant, FAH mutant or FAH fragment.

44. The method according to embodiment 42, wherein the FAH transgene has 80% sequence identity to SEQ ID NO: 18, 19, 20, 21 or 22.

45. The method according to any one of embodiments 34 to 44, wherein the composition further comprises an AAV2 ITR sequence and/or ITR sequence selected from SEQ ID NOS: 27-30.

46. The method according to any one of embodiments 34 to 45, wherein the polynucleotide cassette does not comprise a promoter sequence.

47. The method according to any one of embodiments 34-46, wherein after integrating the expression cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of an endogenous promoter at the target integration site.

48. The method of embodiment 47, wherein the target integration site is an albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene.

49. The method of embodiment 48, wherein after integrating the expression cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of the endogenous albumin promoter without disrupting expression of the endogenous albumin gene.

50. The method according to any one of embodiments 34 to 49, wherein the tissue is liver.

51. The method according to any one of embodiments 34 to 50, wherein the second nucleic acid sequence comprises:

a) A nucleic acid sequence encoding a 2A peptide;

b) A nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES);

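c) A nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; or

d) Nucleic acid sequences encoding splice donors and splice acceptors.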

52. The method of any one of embodiments 34-51, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the expression cassette at a target integration site comprising an endogenous promoter and an endogenous gene.

53. The method of embodiment 52, wherein the target integration site is an endogenous albumin locus.

54. A recombinant viral vector for integrating a transgene into a target integration site in the genome of a cell, comprising a polynucleotide cassette comprising:

(i) An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence comprises a FAH transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration into the genome of the cell at the target integration site;

(ii) A third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site in the genome of the cell; and

(iii) A fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site in the genome of the cell.

55. The recombinant viral vector according to embodiment 54, wherein the third nucleic acid is between 900-1150 nucleotides.

56. The recombinant viral vector according to embodiment 54 or embodiment 55, wherein the fourth nucleic acid is between 1500-1750 nucleotides.

57. The recombinant viral vector according to any one of embodiments 54-56, wherein the recombinant viral vector is a recombinant AAV vector.

58. The recombinant viral vector according to embodiment 57, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAV-NP59.

59. The recombinant viral vector according to any one of embodiments 54-58, further comprising an AAV2 ITR sequence and/or ITR sequence selected from SEQ ID NOs 27-30.

60. The recombinant viral vector according to any one of embodiments 54-59, wherein the polynucleotide cassette does not comprise a promoter sequence.

61. The recombinant viral vector according to any one of embodiments 54-60, wherein after integration of the expression cassette into the target integration site in the genome of the cell, the FAH transgene is expressed under the control of an endogenous promoter at the target integration site.

62. The recombinant viral vector according to embodiment 61, wherein the target integration site is an albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene.

63. The recombinant viral vector according to embodiment 61, wherein after integration of the expression cassette into the target integration site in the genome of the cell, the FAH transgene is expressed under the control of the endogenous albumin promoter without disrupting expression of the endogenous albumin gene.

64. The recombinant viral vector according to embodiment 61, wherein the two independent gene products are a FAH protein expressed from the FAH transgene and a peptide comprising an endogenous protein expressed from an endogenous gene at the integration site.

65. The recombinant viral vector according to any one of embodiments 54-64, wherein the cell is a liver cell.

66. The recombinant viral vector according to any one of embodiments 54-65, wherein the second nucleic acid sequence comprises:

a) A nucleic acid sequence encoding a 2A peptide;

b) A nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES);

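c) A nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; or

d) Nucleic acid sequences encoding splice donors and splice acceptors.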

67. The recombinant viral vector according to any one of embodiments 54-66, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the FAH transgene and the second nucleic acid sequence into an endogenous albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene.

68. The recombinant viral vector according to embodiment 67, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the FAH transgene and the second nucleic acid sequence into an endogenous albumin locus having the endogenous albumin promoter and the endogenous albumin gene in frame.

69. The recombinant viral vector according to embodiment 67 or embodiment 68, wherein the homology arms direct the polynucleotide cassette to integrate immediately 3' of the start codon of the endogenous albumin gene or immediately 5' of the stop codon of the endogenous albumin gene.

70. The recombinant viral vector according to any one of embodiments 54-69, wherein the FAH transgene is wt human FAH, codon optimized FAH, synthetic FAH, a FAH variant, a FAH mutant or a FAH fragment.

71. A method comprising the steps of:

Administering to a subject a dose of a composition that delivers a transgene into cells in a tissue of the subject, wherein the transgene (i) encodes a FAH; (ii) integrates at a target integration site in the genomes of a plurality of cells; (iii) upon integration, functionally expresses the FAH; and (iv) confers a selective advantage on the plurality of cells relative to other cells in the tissue, such that over time the tissue achieves a functional expression level of FAH that is higher than that of cells not integrating the transgene, wherein the composition comprises:

A polynucleotide cassette comprising:

An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes the transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration of the transgene at the target integration site;

A third nucleic acid sequence located 5' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site; and

A fourth nucleic acid sequence located 3' of the expression cassette and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site.

72. The method of embodiment 71, wherein the integration of the transgene does not comprise nuclease activity.

73. The method according to embodiment 71 or 72, wherein the selective advantage comprises an increase in the percentage of cells expressing the transgene in the tissue.

74. The method according to any one of embodiments 71 to 73, wherein the composition further comprises a delivery vehicle.

75. The method according to embodiment 74, wherein the delivery vehicle comprises a lipid nanoparticle.

76. The method of embodiment 74, wherein the composition comprises a recombinant viral vector.

77. The method of embodiment 76, wherein the recombinant viral vector is a recombinant AAV vector.

78. The method according to embodiment 77, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAV-NP59.

79. The method according to embodiment 71, wherein the FAH transgene is wt human FAH, codon-optimized FAH, synthetic FAH, FAH variant, FAH mutant or FAH fragment.

80. The method according to any one of embodiments 71 to 79, wherein the composition further comprises an AAV2 ITR sequence and/or an ITR sequence selected from SEQ ID NOs: 27-30.

81. The method according to any one of embodiments 71 to 80, wherein the polynucleotide cassette does not comprise a promoter sequence.

82. The method according to any one of embodiments 71-81, wherein after integrating the expression cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of an endogenous promoter at the target integration site.

83. The method of embodiment 82, wherein the target integration site is an albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene.

84. The method according to embodiment 83, wherein after integrating the polynucleotide cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of the endogenous albumin promoter without disrupting expression of the endogenous albumin gene.

85. The method according to any one of embodiments 71-84, wherein the tissue is liver.

86. The method according to any one of embodiments 71 to 85, wherein the second nucleic acid sequence comprises:

a) A nucleic acid sequence encoding a 2A peptide;

b) A nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES);

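c) A nucleic acid sequence encoding an N-terminal intein splicing region and a C-terminal intein splicing region; or

d) Nucleic acid sequences encoding splice donors and splice acceptors.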

87. The method of any one of embodiments 71 to 86, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate the expression cassette at a target integration site comprising an endogenous promoter and an endogenous gene.

88. The method of embodiment 87, wherein the target integration site is an endogenous albumin locus.

89. A method of treating type 1 hereditary tyrosinemia (HT1), the method comprising administering to a subject a dose of a composition comprising:

A polynucleotide cassette comprising:

An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence encodes a FAH transgene; and the second nucleic acid sequence is located 5' or 3' to the first nucleic acid sequence and facilitates the production of two independent gene products upon integration of the transgene at the target integration site;

A third nucleic acid sequence located 5' to the first nucleic acid sequence and comprising a sequence substantially homologous to the genomic sequence 5' of the target integration site; and

A fourth nucleic acid sequence located 3' to the second nucleic acid sequence and comprising a sequence substantially homologous to the genomic sequence 3' of the target integration site;

90. The method of embodiment 89, wherein the integration does not comprise nuclease activity.

91. The method of embodiment 89, wherein the composition further comprises a delivery vehicle.

92. The method of embodiment 91, wherein the delivery vehicle comprises a lipid nanoparticle.

93. The method of embodiment 91, wherein the delivery vehicle comprises a recombinant viral vector.

94. The method of embodiment 93, wherein the recombinant viral vector is a recombinant adeno-associated virus (AAV) vector.

95. The method according to embodiment 93, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAVNP59.

96. The method according to embodiment 95, wherein the FAH transgene is wild-type human FAH, codon-optimized FAH, synthetic FAH, a FAH variant, a FAH mutant or a FAH fragment.

97. The method according to embodiment 95, wherein the FAH transgene has 80% sequence identity to SEQ ID NO.18, 19, 20, 21 or 22.

98. The method according to any of embodiments 89 to 97, wherein the composition further comprises an AAV2 ITR sequence selected from SEQ ID NOs 27-30 and/or ITR sequence.

99. The method according to any one of embodiments 89-98, wherein the polynucleotide cassette does not comprise a promoter sequence.

100. The method according to any one of embodiments 89-99, wherein after integrating the expression cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of an endogenous promoter at the target integration site.

101. The method of embodiment 100, wherein the target integration site is an albumin locus comprising an endogenous albumin promoter and an endogenous albumin gene.

102. The method according to embodiment 101, wherein after integrating the polynucleotide cassette into the target integration site in the genome of the cell, the transgene is expressed under the control of the endogenous albumin promoter without disrupting expression of the endogenous albumin gene.

103. The method according to any one of embodiments 89-102, wherein the tissue is liver.

104. The method according to any one of embodiments 89 to 103, wherein the second nucleic acid sequence comprises:

a) A nucleic acid sequence encoding a 2A peptide;

b) A nucleic acid sequence encoding an Internal Ribosome Entry Site (IRES);

D) Nucleic acid sequences encoding splice donors and splice acceptors.

105. The method of any one of embodiments 89-104, wherein the third nucleic acid sequence and the fourth nucleic acid sequence are homology arms that integrate an expression cassette at a target integration site comprising an endogenous promoter and an endogenous gene.

106. The method of embodiment 105, wherein the target integration site is an endogenous albumin locus.

107. The method according to embodiment 89, wherein the subject has received or is receiving treatment for HT1.

108. The method according to embodiment 107, wherein the subject has received or is receiving NTBC (2-(2-nitro-4-trifluoromethylbenzoyl)-1,3-cyclohexanedione).

109. The method of embodiment 107 or 108, wherein NTBC is administered in combination with the composition.

110. The method of embodiment 109, wherein NTBC is administered simultaneously or sequentially with the composition.

111. The method of embodiment 89, wherein the subject receives a lower or reduced dose of the treatment being received by the subject after treatment with the composition.

112. The method of embodiment 89, wherein the subject receives the same dose of treatment that the subject has received or is receiving after treatment with the composition.

113. The method of embodiment 89, wherein the subject ceases to receive treatment that the subject has received or is receiving after treatment with the composition.

114. The method of embodiment 89, wherein the subject is a neonate.

115. The method according to embodiment 89, wherein the subject is one week old, two weeks old, three weeks old, four weeks old, or five weeks old.

116. A method of monitoring gene therapy, the method comprising the steps of:

Detecting in a biological sample from a subject that has received a gene therapy treatment comprising a composition according to embodiment 1 the level or activity of a biomarker produced by integration of an integrated gene therapy treatment as a surrogate for one or more characteristics of the state of the gene therapy treatment, wherein the one or more characteristics of the state of the gene therapy treatment are selected from the group consisting of: the level of payload, the activity of the payload, the level of integration of the gene therapy treatment in the cell population, and combinations thereof.

117. The method of embodiment 116, wherein the payload is or comprises an intracellular expressed peptide.

118. The method of embodiment 116, wherein the payload is or comprises an extracellular secreted peptide.

119. The method according to any one of embodiments 116-118, wherein the payload is encoded by a polynucleotide cassette.

120. The method according to any one of embodiments 116-119, wherein the biological sample is or comprises hair, skin, stool, blood, plasma, serum, cerebrospinal fluid, urine, saliva, tears, vitreous, liver biopsy, or mucus.

121. The method according to any one of embodiments 116 to 120, wherein the detecting step comprises an immunological assay or a nucleic acid amplification assay.

122. The method according to any one of embodiments 116-121, wherein the biomarker comprises a detectable moiety that becomes fused to the polypeptide encoded by the target site after translation of the polypeptide encoded by the target site.

123. The method according to any one of embodiments 116-122, wherein the biomarker comprises a detectable moiety that becomes fused to the polypeptide encoded by the payload after translation of the polypeptide encoded by the target site.

124. The method according to any one of embodiments 116-123, wherein the biomarker comprises a detectable moiety that is a 2A peptide.

125. The method of embodiment 124, wherein the 2A peptide is selected from the group consisting of P2A, T2A, E2A and F2A.

126. The method according to any one of embodiments 116-125, wherein the subject receives a single dose of the gene therapy treatment or gene integration composition.

127. The method according to any one of embodiments 116-126, wherein the detecting step is performed 1, 2, 3, 4, 5, 6, 7, 8 weeks or more after the subject receives the gene therapy treatment or gene integration composition.

128. The method according to any one of embodiments 116-127, wherein the detecting step is performed at a plurality of time points after the subject receives the gene therapy treatment or gene integration composition.

129. The method according to any one of embodiments 116-128, wherein the detecting step is performed over a period of at least 3 months after the subject receives the gene therapy treatment or gene integration composition.

130. The method according to any one of embodiments 116-129, wherein the method further comprises monitoring the subject's autoimmune response to the gene therapy.

131. A composition comprising a recombinant AAV construct comprising:

A polynucleotide cassette comprising:

An expression cassette comprising a first nucleic acid sequence and a second nucleic acid sequence, wherein the first nucleic acid sequence has 80% sequence identity to SEQ ID NO. 18, 19, 20, 21 or 22; and the second nucleic acid sequence

(i) is located 5' or 3' to the first nucleic acid sequence; and

(ii) facilitates the production of two independent gene products upon integration into the genome of the cell at a target integration site.

132. The composition of embodiment 131, wherein the recombinant viral vector is or comprises a capsid protein comprising an amino acid sequence having at least 95% sequence identity to the amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAVNP59.

133. The composition of embodiment 131 or 132, wherein the composition further comprises an AAV2 ITR sequence selected from SEQ ID NOs 27-30 and/or an ITR sequence.

134. The composition according to any one of embodiments 131 to 133, wherein the second nucleic acid sequence has 80% sequence identity to SEQ ID NO. 6.

135. The composition according to any one of embodiments 131 to 133, wherein the second nucleic acid sequence encodes a P2A peptide having 90% sequence identity to SEQ ID NO. 7.

136. A recombinant AAV construct comprising a nucleic acid sequence comprising SEQ ID No. 23.

137. A recombinant AAV construct comprising a nucleic acid sequence comprising SEQ ID No. 24.

138. A recombinant AAV construct comprising a nucleic acid sequence comprising SEQ ID No. 25.

139. A recombinant AAV construct comprising a nucleic acid sequence comprising SEQ ID No. 26.

140. A composition comprising a recombinant AAV construct according to any one of embodiments 136 to 139.

141. A method of integrating a transgene into the genome of at least one cell population in a subject tissue, the method comprising administering to a subject a composition comprising the recombinant AAV construct according to any one of embodiments 136-139.

142. A composition comprising the recombinant viral vector according to any one of embodiments 54-70.

143. A method of treatment comprising administering the composition of any one of embodiments 1, 131 or 142, wherein the composition is administered to a subject at a dose of between 1E12 vg/kg and 1E14 vg/kg.

144. The method of embodiment 143, wherein the composition is administered to the subject at a dose of between 3E12 vg/kg and 1E13 vg/kg.

145. The method of embodiment 143, wherein the composition is administered to the subject at a dose of between 3E12 vg/kg and 3E13 vg/kg.

146. The method of embodiment 143, wherein the composition is administered to the subject at a dose of no more than 3E13 vg/kg.

147. The method of embodiment 143, wherein the composition is administered to the subject at a dose of no more than 3E12 vg/kg.

148. The method according to any one of embodiments 143-147, wherein the composition is administered only once.

149. The method according to any one of embodiments 143-147, wherein the composition is administered more than once.

150. The method according to any one of embodiments 143 to 149, wherein the subject is a neonate.

151. The method according to any one of embodiments 143-149, wherein the subject is between 0 days and 1 month old.

152. The method according to any one of embodiments 143-149, wherein the subject is between 3 months and 1 year old.

153. The method according to any one of embodiments 143-149, wherein the subject is between 1 year old and 5 years old.

154. The method according to any one of embodiments 143 to 149, wherein the subject is 5 years of age or older.
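As an arithmetic illustration of the weight-based dose ranges recited in embodiments 143 to 145, the following minimal Python sketch converts a dose in vector genomes per kilogram (vg/kg) into total vector genomes and reports which recited ranges contain a given dose. The function names, the inclusive treatment of "between", and the 3.5 kg body weight are assumptions made for illustration only and are not taken from the disclosure.

```python
# Illustrative sketch only. Names, inclusive bounds and the example
# body weight are assumptions, not part of the disclosure.

# Dose ranges in vg/kg as recited in embodiments 143 to 145.
DOSE_RANGES_VG_PER_KG = {
    "embodiment 143": (1e12, 1e14),
    "embodiment 144": (3e12, 1e13),
    "embodiment 145": (3e12, 3e13),
}


def total_vector_genomes(dose_vg_per_kg: float, body_weight_kg: float) -> float:
    """Return the total vector genomes for a weight-based dose."""
    if dose_vg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_vg_per_kg * body_weight_kg


def ranges_containing(dose_vg_per_kg: float) -> list[str]:
    """List the recited ranges that contain the dose (bounds assumed inclusive)."""
    return [
        name
        for name, (low, high) in DOSE_RANGES_VG_PER_KG.items()
        if low <= dose_vg_per_kg <= high
    ]


if __name__ == "__main__":
    # Hypothetical example: 1E13 vg/kg given to a 3.5 kg subject.
    print(total_vector_genomes(1e13, 3.5))
    print(ranges_containing(1e13))
```

Under these assumptions, a dose of 1E13 vg/kg lies within all three ranges, whereas a dose of 5E13 vg/kg lies only within the range of embodiment 143.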

Equivalents

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the invention is not intended to be limited by the foregoing description, but rather is set forth in the following claims.

Claims

1. A composition comprising:

A closed circular cDNA integration gene therapy construct comprising, from 5' to 3', a sequence encoding (a) a 5' homology arm of between 1kb and 1.6kb in length; (b) a P2A coding sequence encoding a P2A peptide; (c) a therapeutic payload; and (d) a polynucleotide sequence of a 3' homology arm between 1kb and 1.6kb in length, wherein:

The therapeutic payload comprises a transgene sequence encoding fumarylacetoacetate hydrolase (FAH);

the homology arm sequence facilitates integration of the construct at the endogenous albumin target site by homologous recombination, such that the albumin locus is capable of resulting in simultaneous production of albumin-2A and transgene as separate proteins.

2. The composition according to claim 1, wherein the closed circular DNA comprises the sequence SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25 or SEQ ID No. 26.

3. The composition according to claim 1 or 2, wherein the closed circular DNA consists of the sequence: SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25 or SEQ ID NO. 26.

4. The composition according to any of the preceding claims, wherein the 5' homology arm sequence comprises SEQ ID No. 1 or SEQ ID No. 3.

5. The composition according to any of the preceding claims, wherein the 3' homology arm sequence comprises SEQ ID No. 2.

6. The composition according to any of the preceding claims, wherein the P2A coding sequence comprises SEQ ID No. 6.

7. The composition of claim 6, wherein the P2A coding sequence encodes a peptide comprising SEQ ID NO. 7.

8. The composition according to any one of the preceding claims, wherein the transgene sequence encoding a FAH comprises SEQ ID No. 20, SEQ ID No. 21 or SEQ ID No. 22.

9. The composition according to any one of the preceding claims, wherein the transgene sequence encoding a FAH consists of: SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22.

10. The composition according to any one of the preceding claims, wherein the composition further comprises an AAV capsid protein.

11. The composition according to claim 10, wherein the AAV capsid protein comprises an amino acid sequence having at least 95% sequence identity to an amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAVNP59.

12. A method of treating hereditary tyrosinemia type 1 (HT1), the method comprising administering to a subject a dose of a composition comprising:

13. The method according to claim 12, wherein the closed circular DNA comprises the sequence SEQ ID No. 23, SEQ ID No. 24, SEQ ID No. 25 or SEQ ID No. 26.

14. The method according to claim 12 or 13, wherein the closed circular DNA consists of the sequence: SEQ ID NO. 23, SEQ ID NO. 24, SEQ ID NO. 25 or SEQ ID NO. 26.

15. The method according to any one of claims 12 to 14, wherein the 5' homology arm sequence comprises SEQ ID No.1 or SEQ ID No. 3.

16. The method according to any one of claims 12 to 15, wherein the 3' homology arm sequence comprises SEQ ID No. 2.

17. The method according to any one of claims 12 to 16, wherein the P2A coding sequence comprises SEQ ID No. 6.

18. The method of claim 17, wherein the P2A coding sequence encodes a peptide comprising SEQ ID No. 7.

19. The method according to any one of claims 12 to 18, wherein the transgene sequence encoding FAH comprises SEQ ID No. 20, SEQ ID No. 21 or SEQ ID No. 22.

20. The method according to any one of claims 12 to 18, wherein the transgene sequence encoding FAH consists of: SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22.

21. The method according to any one of claims 12 to 20, wherein the composition further comprises an AAV capsid protein.

22. The method according to claim 21, wherein the AAV capsid protein comprises an amino acid sequence having at least 95% sequence identity to an amino acid sequence of AAV8, AAV-DJ, AAV-LK03, sL65 or AAVNP59.

23. The method according to any one of claims 12 to 22, wherein the composition is administered to the subject at a dose of between 1E12 vg/kg and 1E14 vg/kg.

24. The method of claim 23, wherein the composition is administered to the subject at a dose of between 3E12 vg/kg and 1E13 vg/kg.

25. The method of claim 23, wherein the composition is administered to the subject at a dose of between 3E12 vg/kg and 3E13 vg/kg.

26. The method according to claim 23, wherein the composition is administered to the subject at a dose of no more than 3E13 vg/kg.

27. The method according to claim 23, wherein the composition is administered to the subject at a dose of no more than 3E12 vg/kg.

28. The method according to any one of claims 12 to 27, wherein the composition is administered only once.

29. The method according to any one of claims 12 to 27, wherein the composition is administered more than once.

30. The method according to any one of claims 12 to 27, wherein the subject is a newborn infant.

31. The method according to any one of claims 12 to 27, wherein the subject is between 0 days and 1 month old.

32. The method according to any one of claims 12 to 27, wherein the subject is between 3 months and 1 year old.

33. The method according to any one of claims 12 to 27, wherein the subject is between 1 and 5 years of age.

34. The method according to any one of claims 12 to 27, wherein the subject is 5 years of age or older.

35. A closed circular DNA consisting of a polynucleotide sequence consisting of SEQ ID NO. 23.

36. A closed circular DNA consisting of a polynucleotide sequence consisting of SEQ ID NO. 24.

37. A closed circular DNA consisting of a polynucleotide sequence consisting of SEQ ID NO. 25.

38. A closed circular DNA consisting of a polynucleotide sequence consisting of SEQ ID NO. 26.

39. A liver-targeting recombinant engineered adeno-associated viral vector for use in the treatment of hereditary tyrosinemia type 1 (HT1) and encoding human fumarylacetoacetate hydrolase (FAH), the viral vector comprising a closed circular cDNA polynucleotide sequence comprising:

A human FAH polynucleotide sequence encoding a functional FAH comprising SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22 preceded by a 2A-peptide sequence encoding a 2A-peptide comprising SEQ ID NO. 7;

The human FAH polynucleotide sequence and the 2A-peptide sequence are flanked together by a 3' homology arm polynucleotide sequence comprising SEQ ID NO. 2 and a 5' homology arm polynucleotide sequence comprising SEQ ID NO. 1 or SEQ ID NO. 3.

40. A liver-targeting recombinant engineered adeno-associated viral vector encoding human Fumarylacetoacetic Acid Hydrolase (FAH) or a codon-optimized variant thereof preceded by a 2A-peptide coding sequence and flanked by homology arms spanning the albumin stop codon to facilitate site-specific homologous recombination without the need for an exogenous nuclease or promoter in the vector genome, the viral vector consisting of a closed circular cDNA polynucleotide sequence,

The closed circular cDNA polynucleotide sequence comprises a therapeutic transgene sequence encoding a functional human fumarylacetoacetate hydrolase (FAH), preceded by a 2A-peptide coding sequence, which together are flanked by gene homology arms, each 1.0 kb to 1.6 kb in length, consisting of a 5' homology arm sequence and a 3' homology arm sequence.

41. The adeno-associated viral vector of claim 38, wherein:

The 5' homology arm sequence consists of SEQ ID NO. 1 or SEQ ID NO. 3;

The 3' homology arm sequence consists of SEQ ID NO. 2; and

The P2A coding sequence encodes a peptide comprising SEQ ID NO. 7.

42. The adeno-associated viral vector according to any one of claims 38 to 39, wherein the transgene sequence encoding human FAH or a codon-optimized variant thereof is a cDNA sequence comprising SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22.

43. The adeno-associated viral vector according to claim 40, wherein the transgene sequence encoding FAH consists of SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22.

44. A method of treating hereditary tyrosinemia type 1 (HT1), the method comprising administering to a subject in need thereof a therapeutically effective dose of a liver-targeting recombinant engineered adeno-associated closed circular cDNA viral vector expressing a functional therapeutic human fumarylacetoacetate hydrolase (FAH) preceded by a 2A-peptide coding sequence and flanked by 1.0kb and 1.6kb of gene homology arms spanning the albumin stop codon to promote site-specific homologous recombination in the absence of exogenous nucleases or promoters in the vector genome, wherein the homology arm sequences promote integration of the construct at the endogenous albumin target site by homologous recombination such that the albumin locus is capable of resulting in simultaneous production of albumin-2A and the human FAH as separate proteins.

45. The method of claim 42, wherein the therapeutic transgene sequence encoding human FAH consists of SEQ ID NO. 20.

46. The method of claim 42, wherein the therapeutic transgene sequence encoding human FAH consists of SEQ ID NO. 21.

47. The method of claim 42, wherein the therapeutic transgene sequence encoding human FAH consists of SEQ ID NO. 22.

48. The method of any one of claims 42 to 45, wherein each gene homology arm is independently 1.0kb or 1.6kb.

49. The method of any one of claims 42-46, wherein the gene homology arms consist of a 5' homology arm sequence and a 3' homology arm sequence, and

The 5' homology arm sequence comprises SEQ ID NO. 1 or SEQ ID NO. 3; and

The 3' homology arm sequence comprises SEQ ID NO. 2.

50. The method of any one of claims 42 to 46, wherein the recombinant engineered adeno-associated closed cDNA viral vector comprises a P2A coding sequence encoding a peptide of SEQ ID No. 7.

51. The method according to any one of claims 42 to 48, wherein the P2A coding sequence comprises SEQ ID NO. 6.

52. A method of treating hereditary tyrosinemia type 1 (HT1), the method comprising administering to a subject in need thereof a therapeutically effective dose of a liver-targeting recombinant engineered adeno-associated viral vector encoding human fumarylacetoacetate hydrolase (FAH), the viral vector comprising a cDNA polynucleotide sequence comprising:

A FAH polynucleotide sequence encoding a functional human FAH, said FAH polynucleotide sequence comprising SEQ ID NO. 20, SEQ ID NO. 21 or SEQ ID NO. 22, said FAH polynucleotide sequence being preceded by a 2A-peptide sequence encoding a 2A-peptide comprising SEQ ID NO. 7;

The FAH polynucleotide sequence and the 2A-peptide sequence are flanked together by a 3' homology arm polynucleotide sequence comprising SEQ ID NO. 2 and a 5' homology arm polynucleotide sequence comprising SEQ ID NO. 1 or SEQ ID NO. 3.
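The 5'-to-3' element order and the homology arm length limits recited in claim 1 can be illustrated with a minimal Python sketch that checks a construct description against those two constraints. The class, function and element names are hypothetical, 1 kb is assumed to equal 1,000 bp, the length bounds are assumed to be inclusive, and the element lengths in the example are placeholders rather than values from the sequence listing.

```python
# Illustrative sketch only. All names and example lengths are
# assumptions; they do not come from the sequence listing.
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str        # one of the labels in EXPECTED_ORDER
    length_bp: int   # element length in base pairs


# Order recited in claim 1, read 5' to 3'.
EXPECTED_ORDER = ("5_homology_arm", "P2A", "payload", "3_homology_arm")

# Homology arm limits of 1 kb to 1.6 kb, assuming 1 kb = 1,000 bp
# and inclusive bounds.
ARM_MIN_BP, ARM_MAX_BP = 1000, 1600


def check_construct(elements: list[Element]) -> list[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    kinds = tuple(e.kind for e in elements)
    if kinds != EXPECTED_ORDER:
        problems.append(f"expected order {EXPECTED_ORDER}, got {kinds}")
    for e in elements:
        is_arm = e.kind.endswith("homology_arm")
        if is_arm and not ARM_MIN_BP <= e.length_bp <= ARM_MAX_BP:
            problems.append(f"{e.kind} length {e.length_bp} bp is outside 1000-1600 bp")
    return problems


if __name__ == "__main__":
    # Placeholder lengths chosen for illustration only.
    construct = [
        Element("5_homology_arm", 1000),
        Element("P2A", 66),
        Element("payload", 1260),
        Element("3_homology_arm", 1600),
    ]
    print(check_construct(construct))
```

The sketch checks only element order and arm length. It does not examine sequence content, sequence identity to any SEQ ID NO., or the absence of a promoter.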