WO2014004638A2 - Methods and compositions for enhancing gene expression - Google Patents
Methods and compositions for enhancing gene expression
- Publication number
- WO2014004638A2 (application PCT/US2013/047837)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- organism
- intron
- gene
- expression
- utr
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
Definitions
- sequence listing is submitted electronically via EFS- Web as an ASCII formatted sequence listing with a file named 435025SeqLst.txt, created on June 26, 2013, and having a size of 93.7 kilobytes, and is filed concurrently with the specification.
- the sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
- transgenic cells and organisms comprising a heterologous gene sequence are now routinely practiced by molecular biologists. Methods for incorporating an isolated gene sequence into an expression cassette, producing transformation vectors, and transforming many types of cells and organisms are well known.
- the regulation or control of expression of the heterologous gene and the protein encoded by the gene can often be critical in the development of a transgenic organism for commercial use. For example, in transgenic plants comprising a heterologous gene that confers tolerance to herbicide that is normally toxic to the plant, it can be critical to have the heterologous gene expressed in a temporal and spatial manner that corresponds to when the plant is exposed to the herbicide and to what parts of the plant the herbicide normally exerts its phytotoxic effect.
- a number of genetic regulatory elements are known to play a role in regulating the expression of a gene in plants and other organisms including, for example, promoters, 5'-untranslated regions (UTRs), 3'-untranslated regions, and expression-enhancing introns.
- Silencing of transgenes previously showing stable expression can also be triggered 'de novo' when a new transgene is added by crossing or re-transformation if, for example, the same promoter has been used in both transgenes in an effort to promote coordinated expression (Halpin (2005) Plant Biotech. J. 3 : 141-155).
- the use of the same promoter in multiple transgenes in a single plant is due to the lack of more than one promoter that gives the desired pattern and level of expression.
- the Cauliflower mosaic virus (CaMV) 35S promoter is frequently used as the promoter in plant transgenes because it provides for high-level constitutive expression of an operably linked gene of interest.
- the CaMV 35S promoter is often used to drive the high-level constitutive expression of two or more transgenes in the same plant.
- additional promoters and other genetic regulatory elements are needed to avoid gene silencing that might be caused by the use of a particular genetic regulatory element more than once when two, three, four, or more transgenes are stacked in a single crop plant.
- a common approach for identifying additional promoters that can be used to drive high-level, constitutive expression of an operably linked heterologous nucleotide sequence in plants involves screening plants to identify plant genes that display the high-level constitutive expression across most tissue and/or cell types.
- Methods are provided for making an expression construct for enhancing gene expression in an organism.
- the methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism.
- the first intron is the first intron from the 5' end of the transcribed region of a gene.
- the methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
- the methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide.
- polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron.
- the expression construct provides for enhanced expression of the operably linked polynucleotide in a target organism when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
- the methods comprise introducing into at least one cell of a target organism an expression construct comprising a promoter operably linked to a polynucleotide.
- the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron, which is derived from a native gene that is highly expressed in a constitutive manner in an organism.
- the promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
- the methods can further comprise regenerating a target organism from at least one cell comprising the expression construct.
- the target organism that is produced by the methods of the present invention is capable of expressing the polynucleotide when the target organism or cell thereof is exposed to conditions favorable for the expression of the polynucleotide.
- the target organism is capable of enhanced expression of the polynucleotide when compared to the expression in the target organism of the polynucleotide from a control expression construct which lacks the first intron.
- the methods comprise obtaining a target organism comprising an expression construct or at least one cell thereof.
- the expression construct comprises a promoter operably linked to a polynucleotide and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed.
- the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron.
- the first intron is derived from a native gene that is highly expressed in a constitutive manner in an organism.
- the promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
- expression of the polynucleotide is increased in a target organism comprising the expression construct or in at least one cell thereof, when compared to the expression of the polynucleotide in the target organism comprising a control expression construct which lacks the first intron or in at least one cell thereof.
- the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
- Methods are provided for making a regulatory construct.
- the methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism.
- the first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron.
- the methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism.
- the methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron.
- the first intron is at or near the 3' end of the 5'-UTR.
- the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron.
- the methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
- Nucleic acid molecules comprising the expression constructs and regulatory constructs of the present invention are provided. Additionally provided are organisms and host cells comprising the expression constructs and regulatory constructs of the present invention. In one embodiment of the invention, the organisms and host cells include, for example, plants, seeds, plant parts, and plant cells comprising at least one expression construct and/or at least one regulatory construct of the present invention.
- nucleotide sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases.
- the nucleotide sequences follow the standard convention of beginning at the 5' end of the sequence and proceeding forward (i.e., from left to right in each line) to the 3' end. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
- SEQ ID NO: 1 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G13440.
- the first intron is located at nucleotide positions 1111 to 1203.
- the start of transcription is at nucleotide 955. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1108, 1109, 1110, 1204, and 1205.
- SEQ ID NO: 2 sets forth the nucleotide sequence of SEQ ID NO: 1 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 3 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G22840.
- the first intron is located at nucleotide positions 296 to 774.
- the start of transcription is at nucleotide 196. Additional nucleotides added to make a consensus splice site are at nucleotide positions 293, 294, 295, 775, and 776.
- SEQ ID NO: 4 sets forth the nucleotide sequence of SEQ ID NO: 3 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 5 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G52300.
- the first intron is located at nucleotide positions 1100 to 1201.
- the start of transcription is at nucleotide 1017. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1099, 1202, and 1203.
- SEQ ID NO: 6 sets forth the nucleotide sequence of SEQ ID NO: 5 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 7 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT4G37830.
- the first intron is located at nucleotide positions 861 to 1203.
- the start of transcription is at nucleotide 786. Additional nucleotides added to make a consensus splice site are at nucleotide positions 858, 859, 860, 1204, and 1205.
- SEQ ID NO: 8 sets forth the nucleotide sequence of SEQ ID NO: 7 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 9 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G51650.
- the first intron is located at nucleotide positions 819 to 1567.
- the start of transcription is at nucleotide 751. Additional nucleotides added to make a consensus splice site are at nucleotide positions 816, 817, 818, 1568, and 1569.
- SEQ ID NO: 10 sets forth the nucleotide sequence of SEQ ID NO: 9 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 11 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G48140.
- the first intron is located at nucleotide positions 1045 to 1201.
- the start of transcription is at nucleotide 929. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1044, 1202, and 1203.
- SEQ ID NO: 12 sets forth the nucleotide sequence of SEQ ID NO: 11 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 13 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G02780.
- the first intron is located at nucleotide positions 1003 to 1343.
- the start of transcription is at nucleotide 926. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1000, 1001, 1002, 1344, and 1345.
- SEQ ID NO: 14 sets forth the nucleotide sequence of SEQ ID NO: 13 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 15 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G01280.
- the first intron is located at nucleotide positions 604 to 1102.
- the start of transcription is at nucleotide 448. Additional nucleotides added to make a consensus splice site are at nucleotide positions 601, 602, 603, 1103, and 1104.
- SEQ ID NO: 16 sets forth the nucleotide sequence of SEQ ID NO: 15 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 17 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G67430.
- the first intron is located at nucleotide positions 1783 to 1891.
- the start of transcription is at nucleotide 1730. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1780, 1781, 1782, 1892, and 1893.
- SEQ ID NO: 18 sets forth the nucleotide sequence of SEQ ID NO: 17 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 19 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G76200. The first intron is located at nucleotide positions 758 to 1073. The start of transcription is at nucleotide 654. Additional nucleotides added to make a consensus splice site are at nucleotide positions 755, 756, 757, 1074, and 1075.
- SEQ ID NO: 20 sets forth the nucleotide sequence of SEQ ID NO: 19 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 21 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G31490.
- the first intron is located at nucleotide positions 704 to 1430.
- the start of transcription is at nucleotide 624. Additional nucleotides added to make a consensus splice site are at nucleotide positions 701, 702, 703, 1431, and 1432.
- SEQ ID NO: 22 sets forth the nucleotide sequence of SEQ ID NO: 21 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 23 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT5G08690.
- the first intron is located at nucleotide positions 776 to 1077.
- the start of transcription is at nucleotide 747. Additional nucleotides added to make a consensus splice site are at nucleotide positions 773, 774, 775, 1078, and 1079.
- SEQ ID NO: 24 sets forth the nucleotide sequence of SEQ ID NO: 23 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 25 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G07600.
- the first intron is located at nucleotide positions 1504 to 1783.
- the start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1784, and 1785.
- SEQ ID NO: 26 sets forth the nucleotide sequence of SEQ ID NO: 25 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 27 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G78380.
- the first intron is located at nucleotide positions 1504 to 2004.
- the start of transcription is at nucleotide 1414. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2005, and 2006.
- SEQ ID NO: 28 sets forth the nucleotide sequence of SEQ ID NO: 27 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 29 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G33040.
- the first intron is located at nucleotide positions 552 to 952.
- the start of transcription is at nucleotide 415. Additional nucleotides added to make a consensus splice site are at nucleotide positions 551, 953, and 954.
- SEQ ID NO: 30 sets forth the nucleotide sequence of SEQ ID NO: 29 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 31 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g21940.
- the first intron is located at nucleotide positions 1504 to 2482.
- the start of transcription is at nucleotide 1406. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2483, and 2484.
- SEQ ID NO: 32 sets forth the nucleotide sequence of SEQ ID NO: 31 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 33 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g45950.
- the first intron is located at nucleotide positions 1504 to 1656.
- the start of transcription is at nucleotide 1419. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1657, and 1658.
- SEQ ID NO: 34 sets forth the nucleotide sequence of SEQ ID NO: 33 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 35 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g47760.
- the first intron is located at nucleotide positions 729 to 2633.
- the start of transcription is at nucleotide 638. Additional nucleotides added to make a consensus splice site are at nucleotide positions 726, 727, 728, 2634, and 2635.
- SEQ ID NO: 36 sets forth the nucleotide sequence of SEQ ID NO: 35 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 37 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os02g02130. The first intron is located at nucleotide positions 1504 to 1586. The start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1587, and 1588.
- SEQ ID NO: 38 sets forth the nucleotide sequence of SEQ ID NO: 37 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 39 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g56190.
- the first intron is located at nucleotide positions 1504 to 1615.
- the start of transcription is at nucleotide 1437. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1616, and 1617.
- SEQ ID NO: 40 sets forth the nucleotide sequence of SEQ ID NO: 39 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 41 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g47980.
- the first intron is located at nucleotide positions 940 to 1553.
- the start of transcription is at nucleotide 829. Additional nucleotides added to make a consensus splice site are at nucleotide positions 937, 938, 939, 1554, and 1555.
- SEQ ID NO: 42 sets forth the nucleotide sequence of SEQ ID NO: 41 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 43 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os01g46610.
- the first intron is located at nucleotide positions 1504 to 2228.
- the start of transcription is at nucleotide 1384. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2229, and 2230.
- SEQ ID NO: 44 sets forth the nucleotide sequence of SEQ ID NO: 43 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 45 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os04g28180.
- the first intron is located at nucleotide positions 1504 to 1646.
- the start of transcription is at nucleotide 1399. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1647, and 1648.
- SEQ ID NO: 46 sets forth the nucleotide sequence of SEQ ID NO: 45 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 47 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g01820.
- the first intron is located at nucleotide positions 1504 to 2453.
- the start of transcription is at nucleotide 1229. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2454, and 2455.
- SEQ ID NO: 48 sets forth the nucleotide sequence of SEQ ID NO: 47 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 49 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g11390.
- the first intron is located at nucleotide positions 1504 to 2798.
- the start of transcription is at nucleotide 1431. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2799, and 2800.
- SEQ ID NO: 50 sets forth the nucleotide sequence of SEQ ID NO: 49 without the first intron and any nucleotides that were added to form a consensus splice site.
- SEQ ID NO: 51 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT4G37830 and the first intron from gene accession AT1G52300.
- the first intron is located at nucleotide positions 861 to 960.
- the start of transcription is at nucleotide 786.
- SEQ ID NO: 52 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT1G52300 and the first intron from gene accession AT4G37830.
- the first intron is located at nucleotide positions 1100 to 1442.
- the start of transcription is at nucleotide 1017.
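The relationship between each construct sequence and its intronless counterpart (for example, SEQ ID NO: 1 and SEQ ID NO: 2) can be illustrated with a minimal Python sketch. It removes a first intron and any added splice-site nucleotides using 1-based, inclusive positions of the kind listed above. The function name `remove_positions` and the toy sequence are hypothetical illustrations; the actual sequences are in the sequence listing and are not reproduced here.

```python
def remove_positions(seq, intron_start, intron_end, added_positions=()):
    """Return seq without the intron and without any added splice-site nucleotides.

    All positions are 1-based and inclusive, matching the SEQ ID NO descriptions.
    """
    drop = set(range(intron_start, intron_end + 1)) | set(added_positions)
    if drop and (min(drop) < 1 or max(drop) > len(seq)):
        raise ValueError("position outside the sequence")
    # Keep every base whose 1-based position is not marked for removal.
    return "".join(base for pos, base in enumerate(seq, start=1) if pos not in drop)


# Toy 20-nt sequence (hypothetical, not from the sequence listing).
# Positions 6-7 and 14-15 stand in for added splice-site nucleotides,
# positions 8-13 stand in for the intron.
toy = "AAAAA" + "ag" + "GTNNAG" + "gt" + "CCCCC"
intronless = remove_positions(toy, 8, 13, added_positions=(6, 7, 14, 15))
print(intronless)  # AAAAACCCCC
```

Applied to the coordinates given for SEQ ID NO: 1, the same call would drop the 93 intron nucleotides at positions 1111 to 1203 plus the 5 added nucleotides at positions 1108, 1109, 1110, 1204, and 1205.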
- FIG. 1 is a graphical representation of root expression enhancement of Arabidopsis promoters by cognate first introns.
- Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in Arabidopsis roots and compared to a control expression construct lacking the first intron (i.e., - intron variant).
- Average intron-mediated enhancement is expressed on the y-axis as 2^n-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively).
- the individual promoters used are listed below the x-axis.
- FIG. 2 is a graphical representation of expression enhancement of rice promoters by cognate first introns.
- Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in corn and compared to a control expression construct lacking the first intron.
- Average intron-mediated enhancement (IME) is expressed on the y-axis as 2^n-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively).
- the individual promoters used are listed below the x-axis.
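The power-of-two scale used in FIG. 1 and FIG. 2 can be illustrated with a short Python sketch. It computes the exponent n for which the expression from an intron-containing construct is 2^n times the expression from its intronless control. The function name `log2_enhancement` and the example readings are hypothetical and are not data from the figures.

```python
import math


def log2_enhancement(with_intron, without_intron):
    """Exponent n such that with_intron / without_intron == 2**n."""
    if with_intron <= 0 or without_intron <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(with_intron / without_intron)


# Hypothetical reporter readings in arbitrary units, not measured data.
print(log2_enhancement(400.0, 100.0))   # 2.0, i.e. 4-fold
print(log2_enhancement(1600.0, 100.0))  # 4.0, i.e. 16-fold
```

On such a scale a value of 0 would mean no enhancement, and a negative value would mean that the intron-containing construct was expressed at a lower level than the control.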
- "an element" means one or more elements.
- the word "comprise,” or variations such as “comprises” or “comprising,” will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
- expression construct refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter, a 5'-untranslated region (5'-UTR), and a translated region, wherein the 5'-UTR comprises a first intron from a native gene of an organism.
- the transcribed region of the expression construct comprises the 5'-UTR, the first intron, and the translated region.
- the expression constructs of the present invention can further comprise one or more of the following elements, each in one or more copies: an enhancer, an additional intron, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
- regulatory construct refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter and a 5'-UTR, wherein the 5'-UTR comprises a first intron from a native gene of an organism.
- the regulatory constructs of the present invention can further comprise one or more of the following elements, each in one or more copies: an enhancer, an additional intron, a translated region or coding sequence, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
- RNA transcript of a gene or an expression construct or a regulatory construct of the present invention also comprises a 5'-UTR.
- any introns that occur in a 5'-UTR of a gene or expression construct or regulatory construct are not found in the corresponding mature RNA transcript produced in vivo as such introns are typically spliced out by the host organism or cell thereof, unless the intron is non-functional in the host organism or the host organism is incompetent for splicing out such introns.
- first intron refers to the first intron from the 5' end of a native gene of an organism.
- the first intron can be found within the 5'-UTR or the translated region of the native gene.
- the first intron is between the first protein coding exon and the second protein coding exon.
- typically the 5' end of a first intron that is capable of enhancing expression as disclosed herein is within about the first 1000 base pairs (bp) after the transcriptional start site (in a 5' to 3' direction) and is preferably within about the first 500 bp after the transcription start site.
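As a worked illustration of this distance criterion, the Python sketch below computes how far the 5' end of a first intron lies from the transcription start site and checks it against the approximately 1000 bp and 500 bp windows. The function names are hypothetical, and counting the offset as the simple difference between the two listed positions is an assumption made for illustration. The coordinates are those given for SEQ ID NO: 1.

```python
def intron_offset_from_tss(tss, intron_start):
    """Distance in bp from the transcription start site to the intron's 5' end.

    Assumption: the offset is the difference between the two 1-based positions.
    """
    if intron_start < tss:
        raise ValueError("intron must start at or after the transcription start site")
    return intron_start - tss


def within_window(tss, intron_start, window=1000):
    """True if the intron's 5' end lies within `window` bp of the start site."""
    return intron_offset_from_tss(tss, intron_start) <= window


# SEQ ID NO: 1: transcription starts at 955, first intron starts at 1111.
print(intron_offset_from_tss(955, 1111))  # 156
print(within_window(955, 1111, 1000))     # True
print(within_window(955, 1111, 500))      # True
```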
- an “expression-enhancing intron” or “enhancing intron” is an intron that is capable of causing an increase in the expression of a gene or polynucleotide to which it is operably linked.
- a "first intron" of the present invention is an expression-enhancing intron. While the present invention is not known to depend on a particular biological mechanism, it is believed that the expression-enhancing introns of the present invention enhance expression through intron-mediated enhancement (IME). It is recognized that naturally occurring introns that enhance expression through IME are typically found within 1 Kb of the transcription start site of their native genes (see, Rose et al. (2008) Plant Cell 20:543-551).
- Such introns are usually the first intron, whether the first intron is in the 5'-UTR or the coding sequence, and are in a transcribed region.
- Introns that enhance expression solely through IME do not enhance gene expression when they are inserted into a non-transcribed region of a gene, such as, for example, a promoter. That is, they do not function as transcriptional enhancers.
- the first introns of the present invention are capable of enhancing gene expression when they are found in a transcribed region of a gene but not when they occur in a non-transcribed region such as, for example, a promoter.
- the term "translated region” refers to the portion of a gene or expression construct of the present invention or its corresponding RNA transcript that encodes a polypeptide or protein of interest.
- the translated region comprises the start codon (e.g., ATG) for translation through the last codon of the protein or polypeptide encoded thereby. It is recognized that the translated region of a gene or expression construct can comprise one or more introns.
- any introns that occur in the translated region of a gene or expression construct of the present invention are not typically found in the corresponding mature RNA transcript produced in vivo as such introns are normally spliced out by the host organism or cell thereof unless the intron or introns are non-functional and/or the host organism is incompetent for splicing out such introns.
- native gene refers to a gene that is part of a natural genome of an organism and that was not introduced into the organism or a progenitor thereof by artificial means that do not involve the transfer of genes from one organism to another organism by sexual reproduction.
- Such artificial means include, for example, any methods involving the introduction of recombinant DNA or other recombinant nucleic acid molecules into the organism or a progenitor thereof.
- a gene that is introduced into a progenitor of an organism by artificial means does not become a native gene when it is transferred from the progenitor to the organism via sexual reproduction.
- recombinant DNA refers to DNA and other recombinant nucleic acid molecules that are an artificial or non-naturally occurring combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not all found together in the same form in nature.
- recombinant nucleic acid molecules may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature.
- an "expression construct” and a "regulatory construct” each comprise recombinant DNA.
- enhancing gene expression is intended to mean enhancing or increasing the expression of a gene or its gene product, particularly a protein or polypeptide.
- Gene expression can be determined by monitoring the formation of a transcript of a gene or polynucleotide or gene of interest of the present invention, a protein encoded by the transcript, or even an activity or function of the encoded protein. In preferred embodiments of the present invention, gene expression is determined by monitoring the level of a protein encoded by the gene or the activity or function of the encoded protein.
- the expression of a polynucleotide or gene of interest of the present invention can be assessed in an organism or at least one cell thereof by determining the level of the protein encoded by the translated region of the polynucleotide or gene of interest or the activity or function of the encoded protein.
- the polynucleotide or gene of interest comprises a translated region which encodes green fluorescent protein (GFP), and expression of the polynucleotide or gene of interest can be determined by measuring green fluorescence emitted from the GFP protein when it is exposed to blue light.
- the polynucleotide or gene of interest comprises a translated region which encodes β-glucuronidase (GUS) and expression of the polynucleotide can be determined by measuring GUS activity using the MUG fluorometric assay.
- GUS f3-glucuronidase
- a "promoter” refers to a nucleic acid that is capable of controlling the expression of an operably linked coding sequence or other sequence encoding an RNA.
- the promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, nucleic acid fragments of some variation may have identical promoter activity.
- an “enhancer” is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Enhancers may be found in both non-transcribed and transcribed regions of a gene. Typically, the promoter stimulating activity of an enhancer is insensitive with respect to the position and orientation (i.e., can be inverted) of the enhancer within a gene.
- Promoters that cause an operably linked gene or polynucleotide to be expressed in most cell types of an organism and at most times are commonly referred to as “constitutive promoters".
- the constitutive promoters of the present invention cause an operably linked gene or polynucleotide to be expressed in all or substantially all tissues and stages of development and are minimally responsive to abiotic stimuli.
- Expression of a gene or polynucleotide in most cell types of an organism and at most times is referred to herein as "constitutive gene expression", "constitutive expression", "expressed constitutively", "expression in a constitutive manner", or expression in a "constitutive pattern". It is understood that, for the terms "constitutive promoter" and "constitutive expression", some variation in absolute levels of expression or activity can exist among different tissues and stages of development of an organism.
- the present invention provides novel expression constructs comprising a promoter operably linked to a polynucleotide. It is recognized that nucleic acid molecules comprising such novel expression constructs can be synthesized or produced using a number of methods known in the art. As used herein, "synthesizing an expression construct" or "producing an expression construct" are interchangeable terms that are intended to mean the making of an expression construct by any known method including, but not limited to, chemical synthesis of the entire nucleic acid molecule or part or parts thereof, modification of a pre-existing nucleic acid molecule by molecular biology methods such as, for example, restriction endonuclease digestion, DNA amplification by polymerase and ligation, and the combination of chemical synthesis and modification.
- progeny comprises any subsequent generation of an organism or a host cell, whether the result of sexual reproduction or asexual reproduction.
- a progeny of the present invention is made by the methods of the present invention and/or comprises an expression construct of the present invention.
- progenitor or “progenitor organism” refers to an ancestor of an organism or host cell.
- methods are described that can involve the use of an organism or cell comprising an expression construct of the present invention wherein the organism or cell is descended from a progenitor into which the expression construct was introduced.
- the expression construct was stably introduced into the genome of the progenitor by, for example, a stable transformation method described herein or otherwise known in the art.
- an "organism" refers to any life form that has genetic material comprising nucleic acids including, but not limited to, prokaryotes, eukaryotes, and viruses.
- Organisms include, for example, plants, animals, fungi, bacteria, and viruses, and cells and parts thereof.
- Preferred organisms of the present invention are eukaryotic organisms, including, for example, plants, animals, fungi, and protists.
- a "target organism” is the organism into which an expression construct of the present invention is introduced, particularly for the purpose of expressing the protein encoded by the translated region of the expression construct.
- a "gene of interest" is intended to mean any nucleotide sequence that can be expressed when operably linked to a promoter or a regulatory construct of the present invention.
- a gene of interest of the present invention may, but need not, encode a protein.
- a translated region of the present invention can be a gene of interest.
- the gene of interest does not by itself comprise a functional promoter.
- the gene of interest does not comprise a full-length 5'-UTR. More preferably, the gene of interest is a translated region.
- heterologous gene is any nucleic acid molecule or polynucleotide that is expressed from a nucleotide construct of the present invention.
- a heterologous gene can comprise a nucleotide sequence that is native or endogenous to an organism or can be foreign.
- while the present invention does not depend on a particular method of determining if the expression construct of the present invention is capable of enhancing gene expression in a target organism, gene expression is typically determined by transforming the target organism or at least one cell thereof with a polynucleotide construct comprising the expression construct.
- the expression construct can further comprise additional genetic regulatory elements, if desired or necessary for expression of the translated region in the organism or at least one cell thereof.
- determining whether the expression construct is capable of enhancing the expression of an operably linked gene in the desired manner in the target organism or any other organism of interest can depend on any number of factors including, for example, the type of genetic regulatory element (e.g., promoter, a 5'-untranslated region (UTR), a 3'-untranslated region, an intron, a terminator, a chromatin control element), the presence of additional genetic elements in the construct, the gene of interest to be expressed, the organism or part or cell thereof in which expression is assayed, the expression assay, the detection method (e.g., GFP visible fluorescence, detection of GFP RNA by qPCR), the environmental conditions during the assay, and the like.
- a "control expression construct” is the same or substantially the same as an expression construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein.
- a control expression construct lacks a first intron but otherwise comprises the same promoter, 5'-UTR, and translated region as an expression construct of the present invention.
- a control expression construct lacks a first intron but otherwise has the same nucleotide sequence as an expression construct of the present invention, except for the missing portion that would correspond to the first intron in the expression construct.
- a "control regulatory construct” is the same or substantially the same as a regulatory construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein.
- a control regulatory construct lacks a first intron but otherwise comprises the same promoter and 5'-UTR as a regulatory construct of the present invention.
- a control regulatory construct lacks a first intron but otherwise has the same nucleotide sequence as a regulatory construct of the present invention, except for the missing portion that would correspond to the first intron in the regulatory construct.
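The relationship between a construct and its control can be illustrated with sequence strings. The following is a minimal sketch, assuming the construct and the first intron are available as plain DNA strings and that the intron occurs exactly once in the construct; the function name and the short placeholder sequences are hypothetical and are not sequences from this disclosure.

```python
def make_control(construct: str, first_intron: str) -> str:
    """Return the construct sequence with its single copy of the first intron removed."""
    # Assumption: the intron sequence appears exactly once in the construct.
    if construct.count(first_intron) != 1:
        raise ValueError("expected exactly one copy of the first intron")
    return construct.replace(first_intron, "", 1)


# Hypothetical placeholder elements, far shorter than real regulatory elements.
promoter = "TTGACA"
five_utr = "AACC"
first_intron = "GTAAGTTGCAG"
translated_region = "ATGGCTTAA"

expression_construct = promoter + five_utr + first_intron + translated_region
control_construct = make_control(expression_construct, first_intron)
print(control_construct)
```

The control differs from the construct only by the missing intron, so any difference in measured expression can be attributed to that element.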
- reporter refers to a nucleic acid molecule encoding a detectable marker.
- Reporter genes include, for example, luciferase (e.g., firefly luciferase or Renilla luciferase), β-galactosidase, β-glucuronidase (GUS), chloramphenicol acetyl transferase (CAT), and a fluorescent protein (e.g., green fluorescent protein (GFP), red fluorescent protein (DsRed), yellow fluorescent protein, blue fluorescent protein, cyan fluorescent protein, or variants thereof, including enhanced variants such as enhanced GFP (eGFP)).
- Reporter genes are detectable by a reporter assay. Reporter assays can measure the level of reporter gene expression or activity by any number of means, including, for example, measuring the level of reporter mRNA, the level of reporter protein, or the amount of reporter protein activity. Reporter assays are known in the art or otherwise disclosed herein.
- the present invention provides methods and compositions for enhancing gene expression in organisms, particularly eukaryotic organisms. Such methods and compositions can be used for the expression of polynucleotides, particularly the proteins encoded thereby, constitutively and at a high level in a target organism. Thus, the methods and compositions of the present invention find use in the production of any protein of interest in a eukaryotic organism or cells thereof.
- the target organisms are plants, particularly monocot and dicot plants, more particularly monocot and dicot plants that are crop plants or that are suitable for the production of a protein of interest when grown in fields, greenhouses and/or controlled-environment facilities.
- the present invention was made during the course of research related to the discovery and characterization of promoters that can be used to drive the expression of operably linked polynucleotides constitutively and at a high level in plants.
- Such promoters are known as strong constitutive promoters.
- the present inventors discovered that the expression of a polynucleotide can be increased by adding to a polynucleotide construct comprising a constitutive promoter an operably linked intron from the same plant gene as the promoter or an intron from a different plant gene that is also known to be expressed constitutively and at a high level.
- the present invention provides methods for making an expression construct for enhancing gene expression in an organism.
- the methods comprise selecting a first intron that is derived from a first gene that is highly expressed in a constitutive manner in a first organism.
- the first intron is the first intron from the 5' end of the first gene, and the first gene is a gene that is native to the first organism.
- Such a native gene is part of the natural genome of the first organism and was not introduced into the organism or a progenitor organism by artificial means.
- the methods further comprise selecting a promoter.
- the promoter can be selected before, after, or at the same time as, the first intron is selected.
- the promoter can be a promoter derived from the first gene or a promoter derived from a second gene that is highly expressed in a constitutive manner either in the first organism or in a second organism.
- the second gene is native to either the first organism or the second organism.
- the methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and wherein the 5'-UTR or translated region comprises the first intron.
- the 5'-UTR or any part thereof can be derived from the native 5'-UTR of the first gene, the second gene, or a different gene, or can be synthetic or artificial.
- an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron. More preferably, an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism without significantly altering the constitutive manner of expression of the
- an expression construct made by the methods disclosed provides for at least about a 1.25, 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or 100-fold increase in expression of the
- an expression construct of the present invention can provide for an approximately 2 to 70-fold increase in expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
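The fold increase referred to here is a ratio of the expression measured from the intron-containing construct to the expression measured from the control construct. A minimal sketch of that calculation follows, assuming replicate reporter readings (for example, GFP fluorescence) that have already been background-corrected; the function name and the sample readings are hypothetical and are not data from this disclosure.

```python
from statistics import mean


def fold_enhancement(construct_readings, control_readings):
    """Ratio of mean reporter signal: intron-containing construct over control."""
    control_mean = mean(control_readings)
    if control_mean <= 0:
        raise ValueError("control signal must be positive to compute a ratio")
    return mean(construct_readings) / control_mean


# Made-up readings in arbitrary fluorescence units, for illustration only.
with_intron = [20.0, 22.0, 18.0]
without_intron = [2.0, 2.0, 2.0]
print(fold_enhancement(with_intron, without_intron))
```

A ratio of means is the simplest summary; a real analysis would normally also report variability across replicates and independent transformation events.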
- the methods of the present invention can involve a first organism and a target organism.
- the first organism and the target organism can be the same species or different species. In embodiments in which the first organism and the target organism are not the same species, the first organism and the target organism are typically from related species.
- the first organism and the target organism can be two different plant species, preferably two different monocot or dicot plant species, more preferably two different plant species within the same taxonomic family, most preferably two different plant species within the same genus.
- the methods involve a first organism, a second organism, and a target organism.
- the first organism, the second organism, and the target organism can be the same species or two or more different species.
- in embodiments in which the first organism, the second organism, and the target organism are not all the same species, the first organism, the second organism, and the target organism are typically from two or more related species.
- the first organism, the second organism, and the target organism can be three different plant species, preferably three different monocot or dicot plant species, more preferably three different plant species within the same taxonomic family, most preferably three different plant species within the same genus.
- the expression construct comprises a promoter operably linked to a polynucleotide for transcription of the polynucleotide.
- a polynucleotide of the present invention comprises a transcribed region.
- the polynucleotide represents the region of the expression construct that is transcribed so as to produce an RNA molecule or transcript. It is recognized that the initial RNA molecule or transcript that is produced may be further modified in the organism or cell thereof so as to produce a mature RNA transcript. Modifications can include, for example, splicing out one or more introns including, but not limited to, the first intron.
- the polynucleotide comprises the 5'-UTR, the first intron, and the translated region, and either the 5'-UTR or the translated region comprises the first intron.
- the first intron is between the first and second exons of the translated region.
- the 5'-UTR comprises the first intron.
- the first intron is at or near the 3' end of the 5'-UTR. More preferably, the first intron is at the 3' end of the 5'-UTR immediately before the translational start site.
- the 3' end of the 5'-UTR is the nucleotide immediately before the first nucleotide of the start codon for translation.
- the start codon will be ATG.
- it is recognized that other start codons are known to be used by some organisms and that the present invention does not depend on a particular start codon.
- non-intron sequences of 5'-UTRs are typically in the range of about 30 bp to about 200 bp, preferably about 50 to about 150 bp, although substantially larger or smaller 5'-UTRs are also encompassed by the present invention.
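Placing the first intron at the 3' end of the 5'-UTR, immediately before the start codon, can be sketched as a simple string operation. This is only an illustration under stated assumptions: the 5'-UTR and translated region are supplied as one DNA string, the length of the 5'-UTR is known, and the start codon defaults to ATG but can be overridden. The function name and placeholder sequences are hypothetical.

```python
def insert_intron_before_start(utr_plus_cds: str, utr_length: int,
                               intron: str, start_codon: str = "ATG") -> str:
    """Insert the intron directly after the last nucleotide of the 5'-UTR."""
    found = utr_plus_cds[utr_length:utr_length + len(start_codon)].upper()
    # Guard against an off-by-one 5'-UTR length: the start codon must follow it.
    if found != start_codon.upper():
        raise ValueError("no start codon immediately after the 5'-UTR")
    return utr_plus_cds[:utr_length] + intron + utr_plus_cds[utr_length:]


# Hypothetical placeholder sequences (real 5'-UTRs are typically much longer).
five_utr = "AACCGG"
translated_region = "ATGGCTTAA"
first_intron = "GTAAGTTGCAG"

polynucleotide = insert_intron_before_start(
    five_utr + translated_region, len(five_utr), first_intron
)
print(polynucleotide)
```

Once the intron is spliced out of the transcript, the mature RNA has the same 5'-UTR and translated region as the input sequence.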
- the expression constructs and regulatory constructs of the present invention comprise promoters and first introns that are derived from native genes.
- the 5'-UTR or portion thereof can also be derived from a native gene.
- the promoters, first introns, and the 5'-UTRs can be identical to or substantially the same as the corresponding element in its native gene. It is recognized that promoters, first introns, and 5'-UTRs of the present invention that are each derived from a native gene can be modified so that their sequences are no longer identical to the corresponding sequences in the native gene.
- modifications include, for example, the addition of consensus splice sites on one or both ends of an intron, removal of cryptic splice sites, and sequence modifications that increase transcription. Generally, any such modifications will not alter constitutive expression of the promoters and the function of the first introns, but it is recognized that such modifications may enhance gene expression.
- the first introns comprise consensus splice sites on both the 5' and 3' ends.
- the first introns comprise consensus splice sites on both the 5' and 3' ends, wherein the consensus splice sites are selected, or designed, to be efficiently spliced out when present in a transcript in the organism of interest.
- a first intron that is not spliced out may be disruptive to translation when the first intron is located in the 5'-UTR, particularly when located near the 3'-end of the 5'-UTR.
- first introns that are located within the translated region and that are not spliced out at all or spliced out inefficiently can have the unintended effect of reducing or eliminating the expression of the protein of interest.
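Because an unspliced intron can reduce or eliminate expression, a candidate intron can be screened before it is built into a construct. The sketch below checks only the canonical GT...AG boundary dinucleotides found at the ends of most spliceosomal introns. Full consensus splice sites are longer and differ among organisms, so this is a first-pass filter under that simplifying assumption and not a predictor of splicing efficiency. The function name and test sequences are hypothetical.

```python
def has_canonical_boundaries(intron: str) -> bool:
    """True if the DNA intron starts with GT and ends with AG (GT-AG rule)."""
    seq = intron.upper()
    # Require at least four nucleotides so the two dinucleotides do not overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical placeholder introns.
print(has_canonical_boundaries("GTAAGTTTTTGCAG"))
print(has_canonical_boundaries("ATAAGTTTTTGCAC"))
```

An intron that fails this check is a candidate for the splice-site modifications described above, or for replacement by a different first intron.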
- the methods of the present invention can comprise selecting a first intron and/or a promoter that is derived from a gene that is highly expressed in a constitutive manner in an organism.
- the selected first intron and promoter can be derived from the same gene, from different genes in the same organism, or even from different genes in different organisms.
- the first intron and/or a promoter can be selected from the promoters and first introns of genes that are known to be highly expressed in a constitutive manner.
- Such promoters and first introns and methods for identifying them are generally known in the art. See, for example: U.S. Patent Application No.
- the methods of the present invention can further comprise identifying highly expressed constitutive genes from any organism and then selecting first introns and/or promoters from the newly identified highly expressed constitutive genes. Any method known in the art for the identification of highly expressed constitutive genes can be used in the methods disclosed herein. See, for example: U.S. Patent Application No. 13/528,515, filed June 20, 2012; WO 2011/079197; and WO 2012/006426.
- the present invention provides methods for making a regulatory construct.
- the methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism.
- the first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron.
- the methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner in the same organism as the gene from which the first intron was derived or in a different organism.
- the methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron.
- the first intron is at or near the 3' end of the 5'-UTR.
- the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron.
- the methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
- a regulatory construct of the present invention is essentially the same as an expression construct of the present invention but without an operably linked translated region.
- the descriptions herein of the various elements of, and the arrangement within, the expression constructs of the present invention are also germane to the regulatory constructs of the present invention with the exception that the regulatory constructs are not required to comprise an operably linked translated region.
- the expression constructs of the present invention find use in the making of organisms or cells that express a heterologous gene in a constitutive manner and at high level.
- the present invention provides methods for making an organism for expressing a heterologous gene. The methods comprise introducing into at least one cell of a target organism an expression construct of the present invention.
- Such an expression construct comprises a promoter operably linked to a polynucleotide, wherein:
- the polynucleotide comprises a 5'-UTR and a translated region
- the 5'-UTR or the translated region comprises a first intron
- the first intron is the first intron from the 5' end of the first gene
- the first gene is native to the first organism
- the promoter is derived from the first gene or from a second gene
- the second gene is native to at least one of the first organism and
- the methods for making an organism for expressing a heterologous gene can further comprise regenerating from the at least one cell a target organism comprising the expression construct.
- the target organism or cell is capable of expressing the polynucleotide when the target organism or cell is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time, and the polynucleotide is expressed at an increased level in the target organism or at least one cell thereof when compared to the expression of the polynucleotide in the target organism or at least one cell thereof comprising a control expression construct which lacks the first intron.
- expression of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
- the methods for making an organism for expressing a heterologous gene can further comprise producing additional organisms or progeny by one or more rounds of sexual or asexual reproduction and optionally selecting for progeny comprising the expression construct.
- the methods of the present invention are not limited to making the initial organism or the initial cell into which the expression construct was introduced but also encompass all progeny cells and organisms, however produced, that are descended from the initial organism and/or the initial cell and that comprise the expression construct.
- the expression constructs of the present invention find use in methods for expressing a heterologous gene in an organism.
- the present invention provides methods for expressing a heterologous gene in an organism. The methods involve obtaining a target organism comprising an expression construct of the present invention or at least one cell thereof and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed.
- the polynucleotide is expressed at an increased level in the target organism or cell thereof when compared to the expression of the polynucleotide in the target organism or cell thereof comprising a control expression construct which lacks the first intron.
- the methods further comprise producing the target organism or a progenitor thereof by introducing the expression construct into at least one cell of an organism and regenerating the at least one cell into the target organism or a progenitor thereof comprising the expression construct.
- the methods for expressing a heterologous gene in an organism can further comprise making the expression construct as described herein above.
- the present invention additionally provides nucleic acid molecules, vectors, and expression cassettes comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention. Further provided are non-human organisms and non-human host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs as disclosed herein.
- the invention further provides expression cassettes, plants, plant parts, plant cells, seeds and host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention.
- expression constructs and regulatory constructs can comprise promoters, first introns, and/or 5'-UTRs that are identical in nucleotide sequence to corresponding promoters, first introns, and/or 5'-UTRs in one or more native genes.
- the expression constructs and regulatory constructs of the present invention are not known to be naturally occurring.
- the expression constructs and regulatory constructs of the present invention are recombinant nucleic acids that are not native to the genome of an organism.
- the invention encompasses isolated or substantially purified nucleic acid molecule or polynucleotide compositions.
- An "isolated” or “purified” nucleic acid molecule or polynucleotide, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the nucleic acid molecule or polynucleotide as found in its naturally occurring environment.
- an isolated or purified nucleic acid molecule or polynucleotide is substantially free of other cellular material or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
- the present invention encompasses fragments and variants of the disclosed nucleic acid molecules or polynucleotides.
- by "fragment" is intended a portion of the nucleic acid molecule or polynucleotide. Fragments of a polynucleotide comprising nucleic acid sequences retain biological activity of the full-length nucleic acid molecule or polynucleotide. Alternatively, fragments of a polynucleotide that are useful as hybridization probes generally do not encode proteins that retain biological activity or do not retain promoter activity. Thus, fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides, and up to the full-length polynucleotide of the invention.
- a fragment of a polynucleotide of the invention may encode a biologically active portion of a polynucleotide.
- a biologically active portion of a polynucleotide can be prepared by isolating a portion of one of the polynucleotides of the invention that comprises the genetic regulatory element and assessing activity as described herein.
- Polynucleotides that are fragments of a nucleotide sequence of the present invention comprise at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1,000, 1,100, 1,200, 1,300, 1,400, 1,500, 1,600, 1,700, 1,800, 1,900, 2,000, 2,100, 2,200, 2,300, 2,400, 2,500, 2,600, or 2,700 contiguous nucleotides, or up to the number of nucleotides present in a full-length polynucleotide disclosed herein.
- a variant comprises a polynucleotide having deletions (i.e., truncations) at the 5' and/or 3' end; deletion and/or addition of one or more nucleotides at one or more internal sites in the reference polynucleotide; and/or substitution of one or more nucleotides at one or more sites in the reference polynucleotide.
- a "reference" polynucleotide comprises a nucleotide sequence produced by the methods disclosed herein.
- Variant polynucleotides also include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still comprise biological activity.
- variants of a particular polynucleotide or nucleic acid molecule of the invention will have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters as described elsewhere herein.
- Variant polynucleotides also encompass sequences derived from a mutagenic and recombinogenic procedure such as DNA shuffling.
- Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) PNAS 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) PNAS 94:4504-4509; Crameri et al. (1998) Nature 391:288-291; and U.S. Patent Nos.
- oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest.
- Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, New York). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds.
- PCR methods include, for example, those using nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, and partially-mismatched primers.
- polynucleotide molecules of the present invention encompass polynucleotide molecules comprising a nucleotide sequence that is sufficiently identical to one of the nucleotide sequences set forth in any one or more of SEQ ID NOS: 1-52.
- the term "sufficiently identical" is used herein to refer to a first nucleotide sequence that contains a sufficient or minimum number of identical or equivalent nucleotides to a second nucleotide sequence such that the first and second nucleotide sequences have a common structural domain and/or common functional activity.
- nucleotide sequences that contain a common structural domain having at least about 85% or 90% identity, preferably 95% identity, more preferably 96%, 97%, 98% or 99% identity are defined herein as sufficiently identical.
- the sequences are aligned for optimal comparison purposes.
- the two sequences are the same length.
- the percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
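Counting exact matches over two sequences of the same length can be sketched as follows. This is a minimal illustration that assumes the sequences are already aligned and equal in length and that treats any gap character as a mismatch. It is not the BLAST or MUSCLE calculation referenced elsewhere in this description. The function names are hypothetical, and the 85% default threshold is an assumption taken from the lower bound stated above.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are exact, non-gap matches."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def is_sufficiently_identical(seq_a: str, seq_b: str, threshold: float = 85.0) -> bool:
    """Apply an identity cutoff; the default threshold is an assumed example value."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical ten-nucleotide sequences differing at the final position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(is_sufficiently_identical("ACGTACGTAC", "ACGTACGTAA"))
```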
- the determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
- a preferred, nonlimiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) PNAS 87:2264, modified as in Karlin and Altschul
- PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See
- sequence identity values for pairs of sequences provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters (Altschul et al., (1997) Nucleic Acids Res. 25:3389-402) using the full-length sequences of the invention.
- sequence identity values for multiple sequence alignments provided herein refer to the value obtained using MUSCLE (Version 3.8) using default parameters using the full-length sequences of the invention. MUSCLE is available at http://www.drive5.com/muscle/ or http://www.ebi.ac.uk/Tools/msa/muscle/. See, Edgar (2004) Nucleic Acids Res.
- use of the terms "polynucleotide" and "nucleic acid" is not intended to limit the present invention to polynucleotides and nucleic acids comprising DNA.
- polynucleotides and nucleic acids can comprise ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues.
- the polynucleotides and nucleic acids of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
- the expression constructs and regulatory constructs of the present invention can be provided in expression cassettes for expression in the plant or other organism or host cell of interest. It is recognized that the expression constructs of the present invention and expression cassettes comprising one or more of such expression constructs can be used for the expression in both human and non-human host cells including, but not limited to, host cells from plants, animals, fungi, protists, and algae. In one
- the host cells are human host cells or a host cell line that is incapable of differentiating into a human being.
- the expression cassette can include additional 5' and 3' regulatory sequences operably linked to the expression construct or regulatory construct.
- "Operably linked" is intended to mean a functional linkage between two or more elements.
- an operable linkage between one or more genetic regulatory elements and a gene of interest is a functional link between the gene of interest and the one or more genetic regulatory elements that allows for expression of the gene of interest.
- Operably linked elements may be contiguous or non-contiguous.
- an "operably linked intron” is an intron that is functional and splices out of a polynucleotide when in a host organism capable of splicing out such a functional intron.
- an "operably linked intron” is one that is functional and splices out of a coding region or translated region of an RNA without disrupting the reading frame for translation when the polynucleotide is in a host organism capable of splicing out such a functional intron. It is understood that the term "in operable linkage” as used herein has the same meaning as “operably linked”.
- the expression cassette may additionally contain at least one additional gene to be co-transformed into the organism.
- the additional gene(s) can be provided on multiple expression cassettes.
- Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotide to be under the transcriptional regulation of the regulatory regions.
- the expression cassette may additionally contain selectable marker genes.
- the expression cassette can comprise in the 5'-3' direction of transcription, a transcriptional initiation region (i.e., a promoter), a translational initiation region, a nucleotide sequence to be expressed, a translational stop site, and a transcriptional termination region (i.e., termination region) functional in plants or other organism or host cell.
- the expression cassette further comprises a first intron either in the 5'-UTR or coding region.
- the regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the polynucleotide to be expressed may be native/analogous to the host cell or to each other. Alternatively, any of the regulatory regions and/or the polynucleotide to be expressed may be
- heterologous in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
- a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide.
- a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
- the termination region may be native with the transcriptional initiation region, may be native with the operably linked polynucleotide of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the polynucleotide of interest, the plant host, or any combination thereof.
- Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also
- a promoter of the present invention for gene expression in plants is capable of directing the constitutive expression of an operably linked gene of interest in a plant, a plant part, and/or a plant cell.
- the genes of interest may be optimized for increased expression in the transformed plant. That is, the polynucleotides can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Patent Nos. 5,380,831 and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference.
- Additional sequence modifications are known to enhance gene expression in a cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression.
- the G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.
- the expression cassettes may additionally contain heterologous 5' UTRs (also known as 5' leader sequences). Such 5' UTRs can act to enhance translation.
- Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al. (1989) PNAS USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al.
- AMV RNA 4 alfalfa mosaic virus
- TMV tobacco mosaic virus leader
- MCMV maize chlorotic mottle virus leader
- the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame.
- adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like.
- in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions may be involved.
- the expression cassette can also comprise a selectable marker gene for the selection of transformed cells.
- Selectable marker genes are utilized for the selection of transformed cells or tissues.
- Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D).
- Additional selectable markers include phenotypic markers such as β-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su et al.
- the above list of selectable marker genes is not meant to be limiting. Any selectable marker gene can be used in the present invention.
- the methods of the invention involve introducing an expression construct or regulatory construct into an organism.
- by "introducing" is intended presenting the expression construct to the organism in such a manner that the construct gains access to the interior of a cell of the organism.
- the methods of the invention do not depend on a particular method for introducing an expression construct or regulatory construct into an organism, only that the expression construct or regulatory construct gains access to the interior of at least one cell of the organism.
- Methods for introducing expression constructs, regulatory constructs, and other polynucleotides into various organisms such as, for example, plants, animals, fungi, protists, and bacteria are known in the art including, but not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.
- by "stable transformation" is intended that the polynucleotide construct introduced into an organism integrates into a genome of the organism and is capable of being inherited by progeny thereof.
- by "transient transformation" is intended that a polynucleotide construct introduced into an organism does not integrate into a genome of the organism.
- the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in the organism or host cell.
- the selection of the vector depends on the preferred transformation technique and the species of target organism or host cell to be transformed.
- the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in a plant or plant cell.
- the selection of the vector depends on the preferred transformation technique and the target plant species to be transformed.
- nucleic acid molecules, expression constructs, and regulatory constructs of the invention may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating a nucleic acid molecule or an expression construct of the invention within a viral DNA or RNA molecule. It is recognized that a protein of the invention may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein.
- the cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting hybrid having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved. In this manner, the present invention provides transformed seed (also referred to as "transgenic seed") having a polynucleotide construct of the invention, for example, an expression cassette of the invention, stably incorporated into their genome.
- the nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be provided to a plant or other organism using a variety of transient transformation methods.
- transient transformation methods include, but are not limited to, the introduction of the sequence or variants and fragments thereof directly into the plant or other organism or the introduction of a transcript into the plant.
- Such methods include, for example, microinjection, electroporation, or particle bombardment. See, for example, Crossway et al. (1986) Mol. Gen. Genet. 202:179-185; Nomura et al. (1986) Plant Sci. 44:53-58; Hepler et al. (1994) PNAS 91:2176-2180 and Hush et al.
- polynucleotide can be transiently transformed into the plant or other organism using any other technique known in the art.
- nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be used for transformation of any plant species, including, but not limited to, monocots and dicots.
- plant species of interest include, but are not limited to, Arabidopsis thaliana, peppers (Capsicum spp.; e.g., Capsicum annuum, C. baccatum, C. chinense, C. frutescens, C.
- juncea particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), green millet (Setaria viridis), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (
- Wolffiella spp., and Wolffia spp.), algae (e.g., Chlamydomonas reinhardtii, Botryococcus braunii, Chlorella spp., Dunaliella tertiolecta, Gracilaria spp.), oats, barley, vegetables, ornamentals, and conifers.
- the term plant includes plant cells, plant protoplasts, plant cell or tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruits, roots, root tips, anthers, and the like. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced expression constructs or polynucleotides.
- Various changes in phenotype are of interest including modifying the fatty acid composition in a plant, altering the amino acid content of a plant, altering a plant's pathogen defense mechanism, and the like. These results can be achieved by providing expression of heterologous products or increased expression of endogenous products in plants.
- the present invention provides methods for expressing heterologous genes in organisms.
- a heterologous gene of the present invention can be any gene of interest that can be expressed by the methods of the present invention.
- Genes of interest encode proteins of interest.
- a translated region of the present invention can comprise a gene of interest that encodes a protein of interest.
- genes of interest are reflective of the commercial markets and interests of those involved in the development of the crop. Crops and markets of interest change, and as developing nations open up world markets, new crops and technologies will emerge also. In addition, as our understanding of agronomic traits and characteristics such as yield and heterosis increases, the choice of genes for transformation will change accordingly.
- General categories of genes of interest include, for example, those genes involved in information, such as zinc fingers, those involved in communication, such as kinases, and those involved in housekeeping, such as heat shock proteins. More specific categories of transgenes, for example, include genes encoding important traits for agronomics, insect resistance, disease resistance, herbicide resistance, sterility, grain characteristics, yield, abiotic stress tolerance, and commercial products. Genes of interest include, generally, those involved in oil, starch, carbohydrate, or nutrient metabolism. In addition, genes of interest include genes encoding enzymes and other proteins from plants and other sources including prokaryotes and other eukaryotes.
- Promoters and introns from Arabidopsis and rice highly expressed constitutive genes were used to make expression constructs comprising a promoter and intron from the same gene operably linked to a reporter gene and control expression constructs comprising a promoter operably linked to the reporter gene.
- the Arabidopsis and rice genes were previously identified as being highly expressed constitutive genes as reported in WO 2011/079197 (see also, U.S. Patent Application No. 13/528,515, filed June 20, 2012) and WO 2012/006426.
- the accession numbers of the genes are listed in Tables 1 and 2 along with cross-references to the sequence identifiers for the constructs in these publications. Table 1. Gene Accessions from Arabidopsis thaliana
- intron-mediated enhancement was calculated as average expression with a + intron construct divided by average expression with the corresponding - intron construct.
- the constructs from Arabidopsis were tested for GFP expression in Arabidopsis thaliana by calculating the GFP index, and the rice constructs were tested for GUS expression in corn (Zea mays) by determining GUS enzymatic activity, as described in WO 2011/079197 and WO 2012/006426.
- IME was calculated for each of the 14 (Arabidopsis) or 10 (corn) tissues/zones/stages, and these values were then averaged for presentation in Figures 1 and 2.
- the presence of the first introns in the constructs enhanced expression in most cases (12 of 15 cases in Arabidopsis, 10 of 10 cases in rice), with the expression enhancement ranging from 2- to 70-fold in both Arabidopsis and corn; median IME in corn was 11.5-fold
- the starred Arabidopsis IME values in Figure 1 and all of the corn IME values in Figure 2 are minimal estimates for IME because there was no detectable expression in the absence of an intron in one or more of the tissues tested.
- the IME value is calculated using background GFP or GUS values, respectively, for the tissues with no detectable expression in the -intron transgenics.
- Arabidopsis expression measurements are from the root epidermis, cortex, endodermis, and stele in each of the meristematic, elongation, and maturation zones, as well as the root cap and quiescent center (14 measurements throughout root
- Corn expression measurements are from V3-root, V7-root, VT-root, V3-leaf, V7-leaf, VT-leaf, VT-anther, VT-silk, 21-DAP-embryo, and 21-DAP-endosperm (10 measurements throughout plant development total) from R0 seedlings.
- IME was also determined in shoot tissue from two representative Arabidopsis promoters using quantitative PCR analysis (qRT-PCR) and northern blot analysis.
- Plant tissues were harvested from 2-3 week old seedlings and homogenized in liquid nitrogen by grinding with mortar and pestle.
- Total RNA was extracted from tissues using the RNeasy kit (Qiagen). Gel resolution, transfer and crosslinking were done with the NorthernMax kit (Ambion).
- Probes for GFP and the housekeeping gene ATPK1 were labeled with the Prime-A-Gene kit (Promega). Unincorporated labels were removed via Micro Bio-spin P30 Tris chromatography columns (BioRad). Following overnight hybridization, membranes were washed in 2X SSC with 0.1% SDS, dried, and screened at ⁇ ⁇ using the Scan PhosphoImager. Bands were quantified via ImageQuant software.
- cDNA was generated from total RNA using SuperScript III reverse transcriptase (Invitrogen) per manufacturer's instructions. Quantitative PCR was performed with iQ Multiplex Powermix (Bio-Rad) supplemented with the appropriate primers and probes (see below) on an iCycler iQ real-time detection system (Bio-Rad) using the following thermal-cycler program: (1) 9 min at 95°C; (2) 15 s at 94°C; (3) 30 s at 57°C; (4) 30 s at 72°C; repeat 40 cycles of steps 2-4. Amplification data recorded by the iQ software (Bio-Rad) were exported to the LinRegPCR program (Ruijter et al. (2009) Nucleic Acids Res.
- PCR efficiency and cycle threshold values were used to calculate GFP transgene copy number and expression relative to the 35S:GFP control using REST-MCS beta tool (Pfaffl et al. (2002) Nucleic Acids Res. 30(9):e36).
- Relative GFP expression in each tissue was calculated by normalizing the amplification of GFP in cDNA to the amplification of ubiquitin-conjugating enzyme 9 (UBC9), a "housekeeping gene", and subsequent normalization to 35S:GFP.
- PDS 1 Probe 5'-/5TEX615/TCGGTGTTAGAGCCGTTGCGATTGAA/3IAbRQSp/.
- 5TEX615 indicate the presence of 5' fluorophore modifications while 3IAbRQSp and 3IABkFQ indicate the presence of 3' quencher modifications
- IME Expression enhancement
- Tables 4 and 5 demonstrate the absolute expression activity of the + intron variants when compared to well-known, highly expressing constitutive control promoters.
- expression constructs with Arabidopsis promoters and cognate introns were compared to the CaMV 35S promoter for expression in Arabidopsis roots.
- GFP expression in Arabidopsis was measured as the GFP index as described in WO
- tissue/stages from 5-10 lines per promoter.
- the introns that have been identified can enhance the expression of heterologous promoters.
- introns were swapped between two promoters from Figure 1 and tested for expression enhancement by northern analysis of shoot tissue as described above.
- the result in Table 6 for the AT1G52300/AT4G37830 construct is based on one single-copy, homozygous line of each of the - and + intron variants.
- the result in Table 6 for the AT4G37830/AT1G52300 construct is based on 2 (- intron variant) and 5 (+ intron variant) single-copy, homozygous lines.
- Table 6 Intron-Mediated Enhancement (IME) of Heterologous Promoters
- the present invention demonstrates how to identify enhancing introns - by taking the first introns from genes selected for particular properties (e.g., high and uniform expression in all cell types, organs, tissues).
- the first introns are usually in the coding region but as disclosed herein the enhancing property of the first introns is modular because the first introns can be moved to the 3' end of 5'-UTRs of cloned promoters and still provide effective enhancement. This is important because the present invention demonstrates that it is not necessary to make fusion constructs comprising a first intron inserted within the translated region of a gene of interest.
- regulatory constructs can be prepared which comprise a promoter operably linked to a 5'-UTR which comprises a first intron preferably at or near the 3' end of the 5'-UTR.
- a construct can be operably linked to any gene of interest with relative ease without making any modification to the translated region of the gene of interest.
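The intron-mediated enhancement (IME) calculation described in the list above (average expression with a + intron construct divided by average expression with the corresponding - intron construct, computed per tissue and then averaged) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code behind the reported results: the function names, the input layout, and the rule that a - intron average at or below background counts as "no detectable expression" are all hypothetical, and the readings are made up.

```python
from statistics import mean


def tissue_ime(plus_values, minus_values, background):
    """IME for one tissue: mean(+ intron) / mean(- intron).

    Assumption: if the - intron mean is at or below `background`, expression
    is treated as undetectable, the background value is used as the
    denominator, and the result is flagged as a minimal estimate.
    """
    plus_avg = mean(plus_values)
    minus_avg = mean(minus_values)
    is_minimal = minus_avg <= background
    denominator = background if is_minimal else minus_avg
    return plus_avg / denominator, is_minimal


def average_ime(by_tissue, background):
    """Average the per-tissue IME values; flag the average as a minimal
    estimate if any tissue had no detectable - intron expression."""
    results = [tissue_ime(plus, minus, background)
               for plus, minus in by_tissue.values()]
    return mean(r[0] for r in results), any(r[1] for r in results)


# Made-up readings per tissue: (+ intron lines, - intron lines).
example = {
    "root": ([20.0, 40.0], [2.0, 4.0]),
    "leaf": ([10.0, 30.0], [0.5, 0.5]),
}
ime, minimal = average_ime(example, background=1.0)
print(ime, minimal)
```

With these made-up inputs the root ratio is 10 and the leaf ratio is 20 (a minimal estimate, because the - intron signal is below background), so the averaged IME is 15 and is flagged as minimal.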
Abstract
Methods for making expression constructs for enhancing gene expression in an organism are provided. The expression constructs comprise a constitutive promoter operably linked to a polynucleotide, which comprises a 5'-untranslated region (5'-UTR) and a translated region. The 5'-UTR comprises the first intron of a gene that is native to an organism and expressed constitutively. Methods of using the expression constructs to enhance the expression of a gene in an organism and compositions comprising the expression constructs are further provided.
Description
METHODS AND COMPOSITIONS FOR ENHANCING GENE EXPRESSION
CROSS-REFERENCE TO RELATED APPLICATION
This application claims the benefit of U.S. Application No. 61/666,318, filed June 29, 2012, which is hereby incorporated herein in its entirety by reference.
FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with United States Government support under STTR 0957836 awarded by the National Science Foundation. The United States Government has certain rights in the invention.
REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS WEB
The official copy of the sequence listing is submitted electronically via EFS-Web as an ASCII formatted sequence listing with a file named 435025SeqLst.txt, created on June 26, 2013, and having a size of 93.7 kilobytes, and is filed concurrently with the specification. The sequence listing contained in this ASCII formatted document is part of the specification and is herein incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
The production of transgenic cells and organisms comprising a heterologous gene sequence is now routinely practiced by molecular biologists. Methods for incorporating an isolated gene sequence into an expression cassette, producing transformation vectors, and transforming many types of cells and organisms are well known. The regulation or control of expression of the heterologous gene and the protein encoded by the gene can often be critical in the development of a transgenic organism for commercial use. For example, in transgenic plants comprising a heterologous gene that confers tolerance to a herbicide that is normally toxic to the plant, it can be critical to have the heterologous gene expressed in a temporal and spatial manner that corresponds to when the plant is exposed to the herbicide and to what parts of the plant the herbicide normally exerts its phytotoxic effect.
A number of genetic regulatory elements are known to play a role in regulating the expression of a gene in plants and other organisms including, for example, promoters, 5'-untranslated regions (UTRs), 3'-untranslated regions, and expression-enhancing introns. To express a transgene in a plant or organism, one or more of these genetic regulatory elements is operably linked for expression to a nucleic acid sequence or gene of interest.
Recently, it has become commonplace to introduce or "stack" multiple transgenes into a single transgenic crop plant. The stacking of multiple transgenes into a single transgenic plant has, however, proved to be problematic, particularly when the same genetic regulatory elements are used in more than one of the stacked transgenes. The use of multiple copies of the same regulatory sequence within two or more transgenes in a single plant is known to promote the activation of gene silencing mechanisms (Halpin (2005) Plant Biotech. J. 3:141-155). Silencing of transgenes previously showing stable expression can also be triggered 'de novo' when a new transgene is added by crossing or re-transformation if, for example, the same promoter has been used in both transgenes in an effort to promote coordinated expression (Halpin (2005) Plant Biotech. J. 3:141-155). Often, the use of the same promoter in multiple transgenes in a single plant is due to the lack of more than one promoter that gives the desired pattern and level of expression. For example, the Cauliflower mosaic virus (CaMV) 35S promoter is frequently used as the promoter in plant transgenes because it provides for high-level constitutive expression of an operably linked gene of interest. Because of a lack of suitable alternative promoters, the CaMV 35S promoter is often used to drive the high-level constitutive expression of two or more transgenes in the same plant. Thus, additional promoters and other genetic regulatory elements are needed to avoid gene silencing that might be caused by the use of a particular genetic regulatory element more than once when two, three, four, or more transgenes are stacked in a single crop plant.
A common approach for identifying additional promoters that can be used to drive high-level, constitutive expression of an operably linked heterologous nucleotide sequence in plants involves screening plants to identify plant genes that display high-level constitutive expression across most tissue and/or cell types. Often, however, this approach yields less than satisfactory results when the promoter from the plant gene is separated from its native downstream transcribed region, operably linked to a reporter gene or other gene of interest, introduced into a plant or plant cell, and assayed for the level of expression of the reporter gene or other gene of interest. In many cases, such promoters fail to display the same high-level constitutive expression of the operably linked gene as the promoter displays when it occurs in its native position operably linked to its native transcribed region. Thus, new approaches are needed to provide additional promoters suitable for driving high-level, constitutive expression of an operably linked heterologous nucleotide sequence in plants.
BRIEF SUMMARY OF THE INVENTION
Methods are provided for making an expression construct for enhancing gene expression in an organism. The methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism. The first intron is the first intron from the 5' end of the transcribed region of a gene. The methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. The methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide. The polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron.
Preferably, the expression construct provides for enhanced expression of the operably linked polynucleotide in a target organism when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
Additionally provided are methods for making an organism for expressing a heterologous gene. The methods comprise introducing into at least one cell of a target organism an expression construct comprising a promoter operably linked to a polynucleotide. The polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron, which is derived from a native gene that is highly expressed in a constitutive manner in an organism. The promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. The methods can further comprise regenerating a target organism from at least one cell comprising the expression construct. The target organism that is produced by the methods of the present invention is capable of expressing the polynucleotide when the target organism or cell thereof is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time. Preferably, the target organism is capable of enhanced expression of the polynucleotide when compared to the expression in the target organism of the polynucleotide from a control expression construct which lacks the first intron.
Further provided are methods for expressing a heterologous gene in an organism. The methods comprise obtaining a target organism, or at least one cell thereof, comprising an expression construct that comprises a promoter operably linked to a polynucleotide, and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed. For this method, the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, and the 5'-UTR or translated region comprises the first intron. The first intron is derived from a native gene that is highly expressed in a constitutive manner in an organism. The promoter can be derived from the native gene or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. Preferably, expression of the polynucleotide is increased in a target organism comprising the expression construct or in at least one cell thereof, when compared to the expression of the polynucleotide in the target organism comprising a control expression construct which lacks the first intron or in at least one cell thereof. In certain embodiments, the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
Methods are provided for making a regulatory construct. The methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism. The first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron. The methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner either in the same organism as the gene from which the first intron was derived or in a different organism. The methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron. Preferably, the first intron is at or near the 3' end of the 5'-UTR. Also preferably, the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron. The methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
Nucleic acid molecules comprising the expression constructs and regulatory constructs of the present invention are provided. Additionally provided are organisms and host cells comprising the expression constructs and regulatory constructs of the present invention. In one embodiment of the invention, the organisms and host cells include, for example, plants, seeds, plant parts, and plant cells comprising at least one expression construct and/or at least one regulatory construct of the present invention.
SEQUENCE LISTING
The nucleotide sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases. The nucleotide sequences follow the standard convention of beginning at the 5' end of the sequence and proceeding forward (i.e., from left to right in each line) to the 3' end. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood to be included by any reference to the displayed strand.
SEQ ID NO: 1 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G13440. The first intron is located at nucleotide positions 1111 to 1203. The start of transcription is at nucleotide 955. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1108, 1109, 1110, 1204, and 1205.
SEQ ID NO: 2 sets forth the nucleotide sequence of SEQ ID NO: 1 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 3 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G22840. The first intron is located at nucleotide positions 296 to 774. The start of transcription is at nucleotide 196. Additional nucleotides added to make a consensus splice site are at nucleotide positions 293, 294, 295, 775, and 776.
SEQ ID NO: 4 sets forth the nucleotide sequence of SEQ ID NO: 3 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 5 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G52300. The first intron is located at nucleotide positions 1100 to 1201. The start of transcription is at nucleotide 1017. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1099, 1202, and 1203.
SEQ ID NO: 6 sets forth the nucleotide sequence of SEQ ID NO: 5 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 7 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT4G37830. The first intron is located at nucleotide positions 861 to 1203. The start of transcription is at nucleotide 786. Additional nucleotides added to make a consensus splice site are at nucleotide positions 858, 859, 860, 1204, and 1205.
SEQ ID NO: 8 sets forth the nucleotide sequence of SEQ ID NO: 7 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 9 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G51650. The first intron is located at nucleotide positions 819 to 1567. The start of transcription is at nucleotide 751. Additional nucleotides added to make a consensus splice site are at nucleotide positions 816, 817, 818, 1568, and 1569.
SEQ ID NO: 10 sets forth the nucleotide sequence of SEQ ID NO: 9 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 11 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G48140. The first intron is located at nucleotide positions 1045 to 1201. The start of transcription is at nucleotide 929. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1044, 1202, and 1203.
SEQ ID NO: 12 sets forth the nucleotide sequence of SEQ ID NO: 11 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 13 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G02780. The first intron is located at nucleotide positions 1003 to 1343. The start of transcription is at nucleotide 926. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1000, 1001, 1002, 1344, and 1345.
SEQ ID NO: 14 sets forth the nucleotide sequence of SEQ ID NO: 13 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 15 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT3G01280. The first intron is located at nucleotide positions 604 to 1102. The start of transcription is at nucleotide 448. Additional nucleotides added to make a consensus splice site are at nucleotide positions 601, 602, 603, 1103, and 1104.
SEQ ID NO: 16 sets forth the nucleotide sequence of SEQ ID NO: 15 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 17 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G67430. The first intron is located at nucleotide positions 1783 to 1891. The start of transcription is at nucleotide 1730. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1780, 1781, 1782, 1892, and 1893.
SEQ ID NO: 18 sets forth the nucleotide sequence of SEQ ID NO: 17 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 19 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G76200. The first intron is located at nucleotide positions 758 to 1073. The start of transcription is at nucleotide 654. Additional nucleotides added to make a consensus splice site are at nucleotide positions 755, 756, 757, 1074, and 1075.
SEQ ID NO: 20 sets forth the nucleotide sequence of SEQ ID NO: 19 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 21 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G31490. The first intron is located at nucleotide positions 704 to 1430. The start of transcription is at nucleotide 624. Additional nucleotides added to make a consensus splice site are at nucleotide positions 701, 702, 703, 1431, and 1432.
SEQ ID NO: 22 sets forth the nucleotide sequence of SEQ ID NO: 21 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 23 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT5G08690. The first intron is located at nucleotide positions 776 to 1077. The start of transcription is at nucleotide 747. Additional nucleotides added to make a consensus splice site are at nucleotide positions 773, 774, 775, 1078, and 1079.
SEQ ID NO: 24 sets forth the nucleotide sequence of SEQ ID NO: 23 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 25 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G07600. The first intron is located at nucleotide positions 1504 to 1783. The start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1784, and 1785.
SEQ ID NO: 26 sets forth the nucleotide sequence of SEQ ID NO: 25 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 27 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT1G78380. The first intron is located at nucleotide positions 1504 to 2004. The start of transcription is at nucleotide 1414. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2005, and 2006.
SEQ ID NO: 28 sets forth the nucleotide sequence of SEQ ID NO: 27 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 29 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession AT2G33040. The first intron is located at nucleotide positions 552 to 952. The start of transcription is at nucleotide 415. Additional nucleotides added to make a consensus splice site are at nucleotide positions 551, 953, and 954.
SEQ ID NO: 30 sets forth the nucleotide sequence of SEQ ID NO: 29 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 31 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g21940. The first intron is located at nucleotide positions 1504 to 2482. The start of transcription is at nucleotide 1406. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2483, and 2484.
SEQ ID NO: 32 sets forth the nucleotide sequence of SEQ ID NO: 31 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 33 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g45950. The first intron is located at nucleotide positions 1504 to 1656. The start of transcription is at nucleotide 1419. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1657, and 1658.
SEQ ID NO: 34 sets forth the nucleotide sequence of SEQ ID NO: 33 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 35 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g47760. The first intron is located at nucleotide positions 729 to 2633. The start of transcription is at nucleotide 638. Additional nucleotides added to make a consensus splice site are at nucleotide positions 726, 727, 728, 2634, and 2635.
SEQ ID NO: 36 sets forth the nucleotide sequence of SEQ ID NO: 35 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 37 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os02g02130. The first intron is located at nucleotide positions 1504 to 1586. The start of transcription is at nucleotide 1501. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1502, 1503, 1587, and 1588.
SEQ ID NO: 38 sets forth the nucleotide sequence of SEQ ID NO: 37 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 39 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os03g56190. The first intron is located at nucleotide positions 1504 to 1615. The start of transcription is at nucleotide 1437. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1616, and 1617.
SEQ ID NO: 40 sets forth the nucleotide sequence of SEQ ID NO: 39 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 41 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g47980. The first intron is located at nucleotide positions 940 to 1553. The start of transcription is at nucleotide 829. Additional nucleotides added to make a consensus splice site are at nucleotide positions 937, 938, 939, 1554, and 1555.
SEQ ID NO: 42 sets forth the nucleotide sequence of SEQ ID NO: 41 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 43 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os01g46610. The first intron is located at nucleotide positions 1504 to 2228. The start of transcription is at nucleotide 1384. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2229, and 2230.
SEQ ID NO: 44 sets forth the nucleotide sequence of SEQ ID NO: 43 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 45 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os04g28180. The first intron is located at nucleotide positions 1504 to 1646. The start of transcription is at nucleotide 1399. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 1647, and 1648.
SEQ ID NO: 46 sets forth the nucleotide sequence of SEQ ID NO: 45 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 47 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os05g01820. The first intron is located at nucleotide positions 1504 to 2453. The start of transcription is at nucleotide 1229. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2454, and 2455.
SEQ ID NO: 48 sets forth the nucleotide sequence of SEQ ID NO: 47 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 49 sets forth the nucleotide sequence of the regulatory construct comprising the promoter and the first intron from gene accession Os11g11390. The first intron is located at nucleotide positions 1504 to 2798. The start of transcription is at nucleotide 1431. Additional nucleotides added to make a consensus splice site are at nucleotide positions 1501, 1502, 1503, 2799, and 2780.
SEQ ID NO: 50 sets forth the nucleotide sequence of SEQ ID NO: 49 without the first intron and any nucleotides that were added to form a consensus splice site.
SEQ ID NO: 51 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT4G37830 and the first intron from gene accession AT1G52300. The first intron is located at nucleotide positions 861 to 960. The start of transcription is at nucleotide 786.
SEQ ID NO: 52 sets forth the nucleotide sequence of the regulatory construct comprising the promoter from gene accession AT1G52300 and the first intron from gene accession AT4G37830. The first intron is located at nucleotide positions 1100 to 1442. The start of transcription is at nucleotide 1017.
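The relationship between a regulatory construct in the listing and its intron-less counterpart (for example, SEQ ID NO: 1 and SEQ ID NO: 2) can be illustrated with a short sketch. It assumes that the nucleotide positions given above are 1-based and inclusive; the function name `remove_intron` and the toy sequence are hypothetical and are not part of the sequence listing.

```python
def remove_intron(seq, intron_start, intron_end, added_positions=()):
    """Return seq without the intron and without added splice-site nucleotides.

    All positions are 1-based and inclusive (an assumption about the listing).
    """
    drop = set(range(intron_start, intron_end + 1)) | set(added_positions)
    return "".join(base for pos, base in enumerate(seq, start=1) if pos not in drop)


# Toy example (not a real sequence): intron at positions 6 to 9,
# added splice-site nucleotides at positions 5 and 10.
toy = "AAAAgGTAGgCCCC"
intronless = remove_intron(toy, 6, 9, added_positions=(5, 10))
print(intronless)  # AAAACCCC
```

Applied to the positions reported for SEQ ID NO: 1 (intron at 1111 to 1203, added nucleotides at 1108, 1109, 1110, 1204, and 1205), this operation would remove 98 nucleotides in total.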
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 is a graphical representation of root expression enhancement of Arabidopsis promoters by cognate first introns. Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in Arabidopsis roots and compared to a control expression construct lacking the first intron (i.e., the -intron variant). Average intron-mediated enhancement (IME) is expressed on the y-axis as 2^x-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively). The dashed line at 2^0 (= 1) indicates the relative expression of the -intron variants. The individual promoters used are listed below the x-axis.
FIG. 2 is a graphical representation of expression enhancement of rice promoters by cognate first introns. Expression constructs comprising a promoter, a 5'-UTR comprising the cognate first intron, and a translated region comprising the coding sequence of a reporter gene were tested for expression in corn and compared to a control expression construct lacking the first intron. Average intron-mediated enhancement (IME) is expressed on the y-axis as 2^x-fold enhancement (e.g., 2^2 and 2^4 stand for 4-fold and 16-fold expression enhancement, respectively). The individual promoters used are listed below the x-axis.
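The power-of-two scale described for FIGS. 1 and 2, on which an exponent of 2 denotes 4-fold enhancement and an exponent of 4 denotes 16-fold enhancement, can be illustrated with a short calculation. The sketch below assumes that enhancement is the ratio of reporter expression from the intron-containing construct to that from the control construct lacking the first intron; the function name `ime_log2` and the numeric readings are hypothetical and are not data from the figures.

```python
import math


def ime_log2(expr_with_intron, expr_without_intron):
    """Return x such that the fold enhancement equals 2**x."""
    if expr_with_intron <= 0 or expr_without_intron <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(expr_with_intron / expr_without_intron)


# Hypothetical reporter readings in arbitrary units.
print(ime_log2(400.0, 100.0))   # 4-fold enhancement, x = 2.0
print(ime_log2(1600.0, 100.0))  # 16-fold enhancement, x = 4.0
print(ime_log2(100.0, 100.0))   # no enhancement, x = 0.0
```

On this scale a control construct lacking the first intron sits at an exponent of 0, which corresponds to a relative expression of 1.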
DETAILED DESCRIPTION OF THE INVENTION
The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the inventions are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
In the context of this disclosure, a number of terms and abbreviations are used. The following definitions are provided.
The articles "a" and "an" are used herein to refer to one or more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one or more element.
Throughout the specification the word "comprise," or variations such as "comprises" or "comprising," will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
As used herein, the term "expression construct" refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter, a 5'-untranslated region (5'-UTR), and a translated region, wherein the 5'-UTR comprises a first intron from a native gene of an organism. The transcribed region of the expression construct comprises the 5'-UTR, the first intron, and the translated region. In particular embodiments, the expression constructs of the present invention can further comprise one or more copies of any one or more of the following elements: an enhancer, an additional intron, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
As used herein, the term "regulatory construct" refers to a recombinant DNA or nucleic acid, which comprises in a 5'-to-3' order and in operable linkage a promoter and a 5'-UTR, wherein the 5'-UTR comprises a first intron from a native gene of an organism. In particular embodiments, the regulatory constructs of the present invention can further comprise one or more copies of any one or more of the following elements: an enhancer, an additional intron, a translated region or coding sequence, a 3'-untranslated region, a transcriptional terminator, and a chromatin control element.
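The 5'-to-3' ordering of elements in the two definitions above can be sketched as a simple data model. This is an illustration only: it treats operable linkage as plain concatenation, which is a simplification, and the class names, field names, and placeholder sequences are hypothetical and do not come from the sequence listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegulatoryConstruct:
    """Promoter followed by a 5'-UTR that contains the first intron."""
    promoter: str
    utr_before_intron: str  # 5'-UTR sequence upstream of the intron
    first_intron: str
    utr_after_intron: str   # 5'-UTR sequence downstream of the intron

    def sequence(self, with_intron: bool = True) -> str:
        intron = self.first_intron if with_intron else ""
        return self.promoter + self.utr_before_intron + intron + self.utr_after_intron


@dataclass(frozen=True)
class ExpressionConstruct:
    """Regulatory construct followed by a translated region."""
    regulatory: RegulatoryConstruct
    translated_region: str

    def sequence(self, with_intron: bool = True) -> str:
        return self.regulatory.sequence(with_intron) + self.translated_region


# Toy placeholder sequences, not taken from the sequence listing.
reg = RegulatoryConstruct("TATAAA", "GGC", "GTAAGTTTGCAG", "ACC")
construct = ExpressionConstruct(reg, "ATGGCCTAA")
print(construct.sequence())                   # intron-containing construct
print(construct.sequence(with_intron=False))  # control lacking the first intron
```

The `with_intron=False` form corresponds to the control construct lacking the first intron against which enhanced expression is compared.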
As used herein, the terms "5'-untranslated region" or "5'-UTR" refer to a portion of the transcribed region of a gene or an expression construct of the present invention that extends from the transcriptional start site and ends with the nucleotide immediately before the first nucleotide of the start codon for translation. As its name implies, the 5'-UTR does not normally serve as a template for translation and thus, is referred to as an untranslated region. The "5'-UTR" is, however, transcribed into RNA. Thus, an RNA transcript of a gene or an expression construct or a regulatory construct of the present invention also comprises a 5'-UTR. In most cases, any introns that occur in a 5'-UTR of a gene or expression construct or regulatory construct are not found in the corresponding mature RNA transcript produced in vivo as such introns are typically spliced out by the host organism or cell thereof, unless the intron is non-functional in the host organism or the host organism is incompetent for splicing out such introns.
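The boundaries of the 5'-UTR given in this definition can be sketched as follows. The example assumes 1-based positions on the DNA of a gene or construct, so the returned region is unspliced and still contains any intron located in the 5'-UTR; the function name `five_prime_utr` and the toy sequence are hypothetical.

```python
def five_prime_utr(dna, tss, start_codon_pos):
    """Return the 5'-UTR of a gene or construct sequence.

    tss: 1-based position of the transcriptional start site.
    start_codon_pos: 1-based position of the first nucleotide of the start codon.
    The region runs from tss through start_codon_pos - 1.
    """
    if not (1 <= tss <= start_codon_pos <= len(dna)):
        raise ValueError("positions must satisfy 1 <= tss <= start_codon_pos <= len(dna)")
    return dna[tss - 1:start_codon_pos - 1]


# Toy sequence: 4 nt of promoter, 5 nt of 5'-UTR, then a start codon.
toy_gene = "TTTT" + "GGCAC" + "ATGAAATAA"
print(five_prime_utr(toy_gene, tss=5, start_codon_pos=10))  # GGCAC
```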
As used herein, the term "first intron" refers to the first intron from the 5' end of a native gene of an organism. The first intron can be found within the 5'-UTR or the translated region of the native gene. When the first intron is located within the translated region of the gene, the first intron is between the first protein coding exon
and the second protein coding exon. While the present invention does not depend on the location of the first intron within a native gene, typically the 5' end of a first intron that is capable of enhancing expression as disclosed herein is within about the first 1000 base pairs (bp) after the transcriptional start site (in a 5' to 3' direction) and is preferably within about the first 500 bp after the transcription start site.
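The positional guideline in this definition can be sketched as a simple check. The example treats "about 1000 bp" and "about 500 bp" as hard cutoffs, which is a simplification, and measures distance as the difference between the 5' end of the intron and the transcriptional start site; the function name `classify_first_intron` and its labels are hypothetical. The first call reuses the construct positions reported for SEQ ID NO: 1 purely as illustrative inputs.

```python
def classify_first_intron(tss, intron_start, preferred=500, limit=1000):
    """Classify the 5' end of a first intron by its distance from the TSS.

    Both positions are 1-based coordinates on the same sequence.
    """
    distance = intron_start - tss
    if distance < 0:
        raise ValueError("intron must start at or after the transcriptional start site")
    if distance <= preferred:
        return "preferred"   # within about the first 500 bp
    if distance <= limit:
        return "typical"     # within about the first 1000 bp
    return "outside"         # beyond the typical range


# Start of transcription at 955 and intron beginning at 1111 (156 bp apart).
print(classify_first_intron(955, 1111))  # preferred
print(classify_first_intron(1, 701))     # typical
print(classify_first_intron(1, 1501))    # outside
```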
An "expression-enhancing intron" or "enhancing intron" is an intron that is capable of causing an increase in the expression of a gene or polynucleotide to which it is operably linked. A "first intron" of the present invention is an expression-enhancing intron. While the present invention is not known to depend on a particular biological mechanism, it is believed that the expression-enhancing introns of the present invention enhance expression through intron-mediated enhancement (IME). It is recognized that naturally occurring introns that enhance expression through IME are typically found within 1 kb of the transcription start site of their native genes (see, Rose et al. (2008) Plant Cell 20:543-551). Such introns are usually the first intron, whether the first intron is in the 5'-UTR or the coding sequence, and are in a transcribed region. Introns that enhance expression solely through IME do not enhance gene expression when they are inserted into a non-transcribed region of a gene, such as, for example, a promoter. That is, they do not function as transcriptional enhancers. Unless stated otherwise or apparent from the context, the first introns of the present invention are capable of enhancing gene expression when they are found in a transcribed region of a gene but not when they occur in a non-transcribed region such as, for example, a promoter.
As used herein, the term "translated region" refers to the portion of a gene or expression construct of the present invention or its corresponding RNA transcript that encodes a polypeptide or protein of interest. Thus, the translated region comprises the start codon (e.g., ATG) for translation through the last codon of the protein or polypeptide encoded thereby. It is recognized that the translated region of a gene or expression construct can comprise one or more introns. It is further recognized that any introns that occur in the translated region of a gene or expression construct of the present invention are not typically found in the corresponding mature RNA transcript produced in vivo as such introns are normally spliced out by the host organism or cell thereof unless the intron or introns are non-functional and/or the host organism is incompetent for splicing out such introns.
As used herein, "native gene" refers to a gene that is part of a natural genome of an organism and that was not introduced into the organism or a progenitor thereof by artificial means that do not involve the transfer of genes from one organism to another organism by sexual reproduction. Such artificial means include, for example, any methods involving the introduction of recombinant DNA or other recombinant nucleic acid molecules into the organism or a progenitor thereof. A gene that is introduced into a progenitor of an organism by artificial means does not become a native gene when it is transferred from the progenitor to the organism via sexual reproduction.
The terms "recombinant DNA", "recombinant nucleic acid molecule", and similar terms refer to DNA and other recombinant nucleic acid molecules that are an artificial or non-naturally occurring combination of nucleic acid fragments, e.g., regulatory and coding sequences that are not all found together in the same form in nature. For example, recombinant nucleic acid molecules may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. As used herein, an "expression construct" and a "regulatory construct" each comprise recombinant DNA.
By "enhancing gene expression" is intended to mean enhancing or increasing the expression of a gene or its gene product, particularly a protein or polypeptide. Gene expression can be determined by monitoring the formation of a transcript of a gene or polynucleotide or gene of interest of the present invention, a protein encoded by the transcript, or even an activity or function of the encoded protein. In preferred embodiments of the present invention, gene expression is determined by monitoring the level of a protein encoded by the gene or the activity or function of the encoded protein. Thus, it is understood that the expression of a polynucleotide or gene of interest of the present invention can be assessed in an organism or at least one cell thereof by determining the level of the protein encoded by the translated region of the polynucleotide or gene of interest or the activity or function of the encoded protein. In some embodiments of the present invention, the polynucleotide or gene of interest comprises a translated region which encodes green fluorescent protein (GFP), and expression of the polynucleotide or gene of interest can be determined by measuring green fluorescence emitted from the GFP protein when it is exposed to blue light. In other embodiments, the polynucleotide or gene of interest comprises a translated region which encodes β-glucuronidase (GUS) and expression of the polynucleotide can be determined by measuring GUS activity using the MUG fluorometric assay.
As used herein, a "promoter" refers to a nucleic acid that is capable of controlling the expression of an operably linked coding sequence or other sequence encoding an RNA. The promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, nucleic acid fragments of some variation may have identical promoter activity.
An "enhancer" is a DNA sequence that can stimulate promoter activity, and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue-specificity of a promoter. Enhancers may be found in both non-transcribed and transcribed regions of a gene. Typically, the promoter stimulating activity of an enhancer is insensitive with respect to the position and orientation (i.e., can be inverted) of the enhancer within a gene.
Promoters that cause an operably linked gene or polynucleotide to be expressed in most cell types of an organism and at most times are commonly referred to as "constitutive promoters". Preferably, the constitutive promoters of the present invention cause an operably linked gene or polynucleotide to be expressed in all or substantially all tissues and stages of development and are minimally responsive to abiotic stimuli. Expression of a gene or polynucleotide in most cell types of an organism and at most times is referred to herein as "constitutive gene expression" or "constitutive expression", "expressed constitutively", "expression in a constitutive manner", or expression in a "constitutive pattern". It is understood that, for the terms "constitutive promoter" and "constitutive expression", some variation in absolute levels of expression or activity can exist among different tissues and stages of development of an organism.
The present invention provides novel expression constructs comprising a promoter operably linked to a polynucleotide. It is recognized that nucleic acid
molecules comprising such novel expression constructs can be synthesized or produced using a number of methods known in the art. As used herein, "synthesizing an expression construct" or "producing an expression construct" are interchangeable terms that are intended to mean the making of an expression construct by any known method including, but not limited to, chemical synthesis of the entire nucleic acid molecule or part or parts thereof, modification of a pre-existing nucleic acid molecule by molecular biology methods such as, for example, restriction endonuclease digestion, DNA amplification by polymerase and ligation, and the combination of chemical synthesis and modification.
As used herein, "progeny" comprises any subsequent generation of an organism or a host cell, whether the result of sexual reproduction or asexual reproduction.
Preferably, a progeny of the present invention is made by the methods of the present invention and/or comprises an expression construct of the present invention.
As used herein, "progenitor" or "progenitor organism" refers to an ancestor of an organism or host cell. In certain embodiments of the invention, methods are described that can involve the use of an organism or cell comprising an expression construct of the present invention wherein the organism or cell is descended from a progenitor into which the expression construct was introduced. Preferably, the expression construct was stably introduced into the genome of the progenitor by, for example, a stable transformation method described herein or otherwise known in the art.
As used herein, an "organism" refers to any life form that has genetic material comprising nucleic acids including, but not limited to, prokaryotes, eukaryotes, and viruses. Organisms include, for example, plants, animals, fungi, bacteria, and viruses, and cells and parts thereof. Preferred organisms of the present invention are eukaryotic organisms, including, for example, plants, animals, fungi, and protists.
As used herein, a "target organism" is the organism into which an expression construct of the present invention is introduced, particularly for the purpose of expressing the protein encoded by the translated region of the expression construct.
By "gene of interest" is intended any nucleotide sequence that can be expressed when operably linked to a promoter or a regulatory construct of the present invention. A gene of interest of the present invention may, but need not, encode a protein. A translated region of the present invention can be a gene of interest. Unless stated otherwise or readily apparent from the context, when a gene of interest of the present invention is said to be operably linked to a promoter of the invention, the gene of interest does not by itself comprise a functional promoter. Preferably, the gene of interest does not comprise a full-length 5'-UTR. More preferably, the gene of interest is a translated region.
As used herein, a "heterologous gene" is any nucleic acid molecule or polynucleotide that is expressed from a nucleotide construct of the present invention. Such a heterologous gene can comprise a nucleotide sequence that is native or endogenous to an organism or can be foreign.
While the present invention does not depend on a particular method of determining if the expression construct of the present invention is capable of enhancing gene expression in a target organism, typically gene expression is determined by transforming the target organism or at least one cell thereof with a polynucleotide construct comprising the expression construct. The expression construct can further comprise additional genetic regulatory elements, if desired or necessary for expression of the translated region in the organism or at least one cell thereof.
Those of skill in the art will appreciate that determining whether the expression construct is capable of enhancing the expression of an operably linked gene in the desired manner in the target organism or any other organism of interest can depend on any number of factors including, for example, the type of genetic regulatory element (e.g., a promoter, a 5'-untranslated region (UTR), a 3'-untranslated region, an intron, a terminator, a chromatin control element), the presence of additional genetic elements in the construct, the gene of interest to be expressed, the organism or part or cell thereof in which expression is assayed, the expression assay, the detection method (e.g., visible GFP fluorescence, detection of GFP RNA by qPCR), the environmental conditions during the assay, and the like.
As used herein, a "control expression construct" is the same or substantially the same as an expression construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein. Preferably, a control expression construct lacks a first intron but otherwise comprises the same promoter, 5'-UTR, and translated region as an expression construct of the present invention. More preferably, a control expression construct lacks a first intron but
otherwise has the same nucleotide sequence as an expression construct of the present invention, except for the missing portion that would correspond to the first intron in the expression construct.
Similarly, as used herein, a "control regulatory construct" is the same or substantially the same as a regulatory construct of the present invention but lacks a first intron and can be used as a control in gene expression determinations as disclosed herein. Preferably, a control regulatory construct lacks a first intron but otherwise comprises the same promoter and 5'-UTR as a regulatory construct of the present invention. More preferably, a control regulatory construct lacks a first intron but otherwise has the same nucleotide sequence as a regulatory construct of the present invention, except for the missing portion that would correspond to the first intron in the regulatory construct.
As used herein, a "reporter" or a "reporter gene" refers to a nucleic acid molecule encoding a detectable marker. Reporter genes include, for example, luciferase (e.g., firefly luciferase or Renilla luciferase), β-galactosidase, β-glucuronidase (GUS), chloramphenicol acetyl transferase (CAT), and a fluorescent protein (e.g., green fluorescent protein (GFP), red fluorescent protein (DsRed), yellow fluorescent protein, blue fluorescent protein, cyan fluorescent protein, or variants thereof, including enhanced variants such as enhanced GFP (eGFP)). Reporter genes are detectable by a reporter assay. Reporter assays can measure the level of reporter gene expression or activity by any number of means, including, for example, measuring the level of reporter mRNA, the level of reporter protein, or the amount of reporter protein activity. Reporter assays are known in the art or otherwise disclosed herein.
The present invention provides methods and compositions for enhancing gene expression in organisms, particularly eukaryotic organisms. Such methods and compositions can be used for the expression of polynucleotides, particularly the proteins encoded thereby, constitutively and at a high level in a target organism. Thus, the methods and compositions of the present invention find use in the production of any protein of interest in a eukaryotic organism or cells thereof. In preferred embodiments of the invention, the target organisms are plants, particularly monocot and dicot plants, more particularly monocot and dicot plants that are crop plants or that are suitable for
the production of a protein of interest when grown in fields, greenhouses and/or controlled-environment facilities.
The present invention was made during the course of research related to the discovery and characterization of promoters that can be used to drive the expression of operably linked polynucleotides constitutively and at a high level in plants. Such promoters are known as strong constitutive promoters. During the course of that research, the present inventors discovered that the expression of a polynucleotide can be increased by adding to a polynucleotide construct comprising a constitutive promoter an operably linked intron from the same plant gene as the promoter or an intron from a different plant gene that is also known to be expressed constitutively and at a high level.
In one aspect, the present invention provides methods for making an expression construct for enhancing gene expression in an organism. The methods comprise selecting a first intron that is derived from a first gene that is highly expressed in a constitutive manner in a first organism. The first intron is the first intron from the 5' end of the first gene, and the first gene is a gene that is native to the first organism. Such a native gene is part of the natural genome of the first organism and was not introduced into the organism or a progenitor organism by artificial means. The methods further comprise selecting a promoter. The promoter can be selected before, after, or at the same time as, the first intron is selected. The promoter can be a promoter derived from the first gene or a promoter derived from a second gene that is highly expressed in a constitutive manner either in the first organism or in a second organism. The second gene is native to either the first organism or the second organism. The methods further comprise synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the
polynucleotide comprises a 5'-untranslated region (5'-UTR), the first intron, and a translated region, and wherein the 5'-UTR or translated region comprises the first intron. The 5'-UTR or any part thereof can be derived from the native 5'-UTR of the first gene, the second gene, or a different gene, or can be synthetic or artificial.
Preferably, an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron. More preferably, an expression construct made by the methods disclosed herein provides for enhanced or increased expression of the polynucleotide in a target organism without significantly altering the constitutive manner of expression of the polynucleotide in the target organism relative to a control expression construct which lacks the first intron. In preferred embodiments, an expression construct made by the methods disclosed herein provides for at least about a 1.25, 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, or 100-fold increase in expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron. Typically, an expression construct of the present invention can provide for an approximately 2 to 70-fold increase in expression of the polynucleotide in a target organism, when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
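The comparison against a control expression construct can be illustrated with a minimal sketch. The function names, the choice of measurement (for example, a reporter signal in arbitrary units), and the numeric values are hypothetical and are not taken from the present disclosure; the sketch only shows how a fold increase relative to the intron-less control could be computed.

```python
def fold_increase(construct_level: float, control_level: float) -> float:
    """Return expression from the intron-containing construct divided by
    expression from the control construct that lacks the first intron.

    Both levels are assumed to be measured with the same assay and units.
    """
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    return construct_level / control_level


def in_typical_range(fold: float, low: float = 2.0, high: float = 70.0) -> bool:
    """Check whether a fold increase falls within the approximately
    2 to 70-fold range described as typical in the text."""
    return low <= fold <= high


# Hypothetical reporter readings (arbitrary units), for illustration only.
example_fold = fold_increase(construct_level=500.0, control_level=100.0)
print(example_fold, in_typical_range(example_fold))
```

Any assay that reports expression of the polynucleotide on a common scale for both constructs could supply the two input values.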
In certain embodiments, the methods of the present invention can involve a first organism and a target organism. The first organism and the target organism can be the same species or different species. In embodiments in which the first organism and the target organism are not the same species, the first organism and the target organism are typically from related species. For example, the first organism and the target organism can be two different plant species, preferably two different monocot or dicot plant species, more preferably two different plant species within the same taxonomic family, most preferably two different plant species within the same genus.
In other embodiments, the methods involve a first organism, a second organism, and a target organism. The first organism, the second organism, and the target organism can be the same species or two or more different species. In embodiments in which the first organism, the second organism, and the target organism are not all the same species, the first organism, the second organism, and the target organism are typically from two or more related species. For example, the first organism, the second organism, and the target organism can be three different plant species, preferably three different monocot or dicot plant species, more preferably three different plant species within the same taxonomic family, most preferably three different plant species within the same genus.
In the methods of the present invention for making an expression construct for enhancing gene expression in an organism, the expression construct comprises a promoter operably linked to a polynucleotide for transcription of the polynucleotide. Thus, a polynucleotide of the present invention comprises a transcribed region. When an expression construct of the present invention is introduced into a target organism or at least one cell thereof, the polynucleotide represents the region of the expression construct that is transcribed so as to produce an RNA molecule or transcript. It is recognized that the initial RNA molecule or transcript that is produced may be further modified in the organism or cell thereof so as to produce a mature RNA transcript. Modifications can include, for example, splicing out one or more introns including, but not limited to, the first intron.
As described above, the polynucleotide comprises the 5'-UTR, the first intron, and the translated region, and either the 5'-UTR or the translated region comprises the first intron. In embodiments of the invention in which the translated region comprises the first intron, the first intron is between the first and second exons of the translated region. In other embodiments, the 5'-UTR comprises the first intron. Preferably in these embodiments, the first intron is at or near the 3' end of the 5'-UTR. More preferably, the first intron is at the 3' end of the 5'-UTR immediately before the translational start site. It is recognized that the 3' end of the 5'-UTR is the nucleotide immediately before the first nucleotide of the start codon for translation. Typically, the start codon will be ATG. However, it is recognized that other start codons are known to be used by some organisms and that the present invention does not depend on a particular start codon.
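The arrangement in which the first intron sits at the 3' end of the 5'-UTR, immediately before the start codon, can be sketched in a few lines. The function names and the toy sequences below are hypothetical and are not sequences of the invention; the sketch assumes an ATG start codon by default and models splicing simply as removal of the exact intron sequence.

```python
def build_transcribed_region(utr5: str, intron: str, translated: str,
                             start_codon: str = "ATG") -> str:
    """Concatenate 5'-UTR, first intron, and translated region so that the
    intron lies at the 3' end of the 5'-UTR, directly before the start codon."""
    if not translated.upper().startswith(start_codon):
        raise ValueError("translated region must begin with the start codon")
    return utr5 + intron + translated


def splice_out(transcript: str, intron: str) -> str:
    """Remove the first occurrence of the intron, mimicking production of a
    mature transcript from the initial transcript."""
    index = transcript.find(intron)
    if index < 0:
        raise ValueError("intron not found in transcript")
    return transcript[:index] + transcript[index + len(intron):]


# Toy sequences for illustration only (hypothetical, not from the disclosure).
utr5 = "GGAACC"
intron = "GTAAGTTTTCAG"
translated = "ATGGCCTAA"

initial = build_transcribed_region(utr5, intron, translated)
mature = splice_out(initial, intron)
print(initial)
print(mature)
```

In the mature sequence the 5'-UTR is directly followed by the start codon, which is the arrangement the text describes for a correctly spliced transcript.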
While the present invention does not depend on a 5'-UTR of a certain size, it is recognized that non-intron sequences of 5'-UTRs are typically in the range of about 30 bp to about 200 bp, preferably about 50 to about 150 bp, although substantially larger or smaller 5'-UTRs are also encompassed by the present invention.
The expression constructs and regulatory constructs of the present invention comprise promoters and first introns that are derived from native genes. In some embodiments of the invention, the 5'-UTR or portion thereof can also be derived from a native gene. In certain embodiments, the promoters, first introns, and the 5'-UTRs can be identical to or substantially the same as the corresponding element in its native gene.
It is recognized that promoters, first introns, and 5'-UTRs of the present invention that are each derived from a native gene can be modified so that their sequences are no longer identical to the corresponding sequences in the native gene. Such modifications include, for example, the addition of consensus splice sites on one or both ends of an intron, removal of cryptic splice sites, and sequence modifications that increase transcription. Generally, any such modifications will not alter constitutive expression of the promoters and the function of the first introns, but it is recognized that such modifications may enhance gene expression. In preferred embodiments of the invention, the first introns comprise consensus splice sites on both the 5' and 3' ends. In particularly preferred embodiments of the invention, the first introns comprise consensus splice sites on both the 5' and 3' ends, wherein the consensus splice sites are selected, or designed, to be efficiently spliced out when present in a transcript in the organism of interest. While the present invention is not bound by a particular biological mechanism, it is recognized that a first intron that is not spliced out may be disruptive to translation when the first intron is located in the 5'-UTR, particularly when located near the 3'-end of the 5'-UTR. Moreover, first introns that are located within the translated region and that are not spliced out at all or spliced out inefficiently can have the unintended effect of reducing or eliminating the expression of the protein of interest.
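A simple screen of a candidate first intron for splice site dinucleotides might look like the following sketch. The text above does not specify the consensus sequences; the sketch assumes only the widely observed GT...AG boundary dinucleotides of most eukaryotic nuclear introns, which is a much weaker test than a full, organism-specific consensus. The function name and test sequences are hypothetical.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """Return True if the intron begins with GT and ends with AG.

    Assumption: only the terminal dinucleotides are checked (GT at the
    5' end, AG at the 3' end). RNA input (U instead of T) is accepted.
    """
    seq = intron.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Toy sequences for illustration only (hypothetical).
print(has_canonical_splice_sites("GTAAGTTTTCAG"))
print(has_canonical_splice_sites("ATAAGTTTTCAG"))
```

A check of this kind cannot predict splicing efficiency in a given target organism; it only flags introns whose ends lack the most common boundary dinucleotides.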
The methods of the present invention can comprise selecting a first intron and/or a promoter that is derived from a gene that is highly expressed in a constitutive manner in an organism. The selected first intron and promoter can be derived from the same gene, from different genes in the same organism, or even from different genes in different organisms. Generally, the first intron and/or the promoter can be selected from the promoters and first introns of genes that are known to be highly expressed in a constitutive manner. Such promoters and first introns and methods for identifying them are generally known in the art. See, for example: U.S. Patent Application No. 13/528,515, filed June 20, 2012; WO 2011/079197; and WO 2012/006426; all of which are herein incorporated in their entirety by reference. If desired, the methods of the present invention can further comprise identifying highly expressed constitutive genes from any organism and then selecting first introns and/or promoters from the newly identified highly expressed constitutive genes. Any method known in the art for the identification of highly expressed constitutive genes can be used in the methods disclosed herein. See, for example: U.S. Patent Application No. 13/528,515, filed June 20, 2012; WO 2011/079197; and WO 2012/006426.
In another aspect, the present invention provides methods for making a regulatory construct. The methods involve selecting a first intron that is derived from a native gene that is highly expressed in a constitutive manner in an organism. The first intron is the first intron from the 5' end of the gene, if the gene contains more than one intron. The methods further comprise selecting a promoter that is derived from the same gene as the first intron or from a different gene that is highly expressed in a constitutive manner in the same organism as the gene from which the first intron was derived or in a different organism. The methods further comprise synthesizing a regulatory construct comprising the promoter operably linked to a 5'-UTR, which comprises the first intron. Preferably, the first intron is at or near the 3' end of the 5'-UTR. Also preferably, the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron. The methods can further comprise operably linking a gene of interest to the regulatory construct for expression of the gene of interest in a target organism.
It is recognized that a regulatory construct of the present invention is essentially the same as an expression construct of the present invention but without an operably linked translated region. Thus, it is further recognized that the descriptions herein of the various elements of, and the arrangement within, the expression constructs of the present invention are also germane to the regulatory constructs of the present invention with the exception that the regulatory constructs are not required to comprise an operably linked translated region.
The expression constructs of the present invention find use in the making of organisms or cells that express a heterologous gene in a constitutive manner and at a high level. Thus, in yet another aspect, the present invention provides methods for making an organism for expressing a heterologous gene. The methods comprise introducing into at least one cell of a target organism an expression construct of the present invention. Such an expression construct comprises a promoter operably linked to a polynucleotide, wherein:
(a) the polynucleotide comprises a 5'-UTR and a translated region,
(b) the 5'-UTR or the translated region comprises a first intron,
(c) the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism,
(d) the first intron is the first intron from the 5' end of the first gene,
(e) the first gene is native to the first organism,
(f) the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, and
(g) the second gene is native to at least one of the first organism and the second organism.
The methods for making an organism for expressing a heterologous gene can further comprise regenerating from the at least one cell a target organism comprising the expression construct. Preferably, the target organism or cell is capable of expressing the polynucleotide when the target organism or cell is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time, and the polynucleotide is expressed at an increased level in the target organism or at least one cell thereof when compared to the expression of the polynucleotide in the target organism or at least one cell thereof comprising a control expression construct which lacks the first intron. In preferred embodiments of the invention, expression of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
The methods for making an organism for expressing a heterologous gene can further comprise producing additional organisms or progeny by one or more rounds of sexual or asexual reproduction and optionally selecting for progeny comprising the expression construct. Thus, the methods of the present invention are not limited to making the initial organism or the initial cell into which the expression construct was introduced but also encompass all progeny cells and organisms, however produced, that are descended from the initial organism and/or the initial cell and that comprise the expression construct.
The expression constructs of the present invention, as well as the organisms and cells of the present invention that comprise such expression constructs, find use in methods for expressing a heterologous gene in an organism. Thus, in yet another aspect, the present invention provides methods for expressing a heterologous gene in an organism. The methods involve obtaining a target organism comprising an expression construct of the present invention or at least one cell thereof and exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed. Preferably, the polynucleotide is expressed at an increased level in the target organism or cell thereof when compared to the expression of the polynucleotide in the target organism or cell thereof comprising a control expression construct which lacks the first intron. In some embodiments, the methods further comprise producing the target organism or a progenitor thereof by introducing the expression construct into at least one cell of an organism and regenerating the at least one cell into the target organism or a progenitor thereof comprising the expression construct. In certain embodiments, the methods for expressing a heterologous gene in an organism can further comprise making the expression construct as described herein above.
The present invention additionally provides nucleic acid molecules, vectors, and expression cassettes comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention. Further provided are non-human organisms and non-human host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs as disclosed herein. The invention further provides expression cassettes, plants, plant parts, plant cells, seeds and host cells comprising at least one of the expression constructs and/or at least one of the regulatory constructs of the present invention.
While the expression constructs and regulatory constructs can comprise promoters, first introns, and/or 5'-UTRs that are identical in nucleotide sequence to corresponding promoters, first introns, and/or 5'-UTRs in one or more native genes, the expression constructs and regulatory constructs of the present invention are not known to be naturally occurring. The expression constructs and regulatory constructs of the present invention are recombinant nucleic acids that are not native to the genome of an organism.
The invention encompasses isolated or substantially purified nucleic acid molecule or polynucleotide compositions. An "isolated" or "purified" nucleic acid molecule or polynucleotide, or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the nucleic acid molecule or polynucleotide as found in its naturally occurring environment. Thus, an isolated or purified nucleic acid molecule or polynucleotide is substantially free of other cellular material or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.
The invention encompasses fragments and variants of the disclosed nucleic acid molecules or polynucleotides. By "fragment" is intended a portion of the nucleic acid molecule or polynucleotide. Fragments of a polynucleotide comprising nucleic acid sequences retain biological activity of the full-length nucleic acid molecule or polynucleotide. Alternatively, fragments of a polynucleotide that are useful as hybridization probes generally do not encode proteins that retain biological activity or do not retain promoter activity. Thus, fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides, and up to the full-length polynucleotide of the invention.
A fragment of a polynucleotide of the invention may encode a biologically active portion of a polynucleotide. A biologically active portion of a polynucleotide can be prepared by isolating a portion of one of the polynucleotides of the invention that comprises the genetic regulatory element and assessing activity as described herein. Polynucleotides that are fragments of a nucleotide sequence of the present invention comprise at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1,000, 1,100, 1,200, 1,300, 1,400, 1,500, 1,600, 1,700, 1,800, 1,900, 2,000, 2,100, 2,200, 2,300, 2,400, 2,500, 2,600, or 2,700 contiguous nucleotides, or up to the number of nucleotides present in a full-length polynucleotide disclosed herein.
"Variants" is intended to mean substantially similar sequences. For
polynucleotides, a variant comprises a polynucleotide having deletions (i.e.,
truncations) at the 5' and/or 3' end; deletion and/or addition of one or more nucleotides at one or more internal sites in the reference polynucleotide; and/or substitution of one or more nucleotides at one or more sites in the reference polynucleotide. As used herein, a "reference" polynucleotide comprises a nucleotide sequence produced by the methods disclosed herein. Variant polynucleotides also include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still comprise biological activity. Generally, variants of a particular polynucleotide or nucleic acid molecule of the invention will have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters as described elsewhere herein.
Variant polynucleotides also encompass sequences derived from a mutagenic and recombinogenic procedure such as DNA shuffling. Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) PNAS 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) PNAS 94:4504-4509; Crameri et al. (1998) Nature 391:288-291; and U.S. Patent Nos. 5,605,793 and 5,837,458.
For PCR amplifications of the polynucleotides disclosed herein, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any plant of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, New York). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.
It is recognized that the polynucleotide molecules of the present invention encompass polynucleotide molecules comprising a nucleotide sequence that is sufficiently identical to one of the nucleotide sequences set forth in any one or more of SEQ ID NOS: 1-52. The term "sufficiently identical" is used herein to refer to a first nucleotide sequence that contains a sufficient or minimum number of identical or equivalent nucleotides to a second nucleotide sequence such that the first and second nucleotide sequences have a common structural domain and/or common functional activity. For example, nucleotide sequences that contain a common structural domain having at least about 85% or 90% identity, preferably 95% identity, more preferably 96%, 97%, 98% or 99% identity are defined herein as sufficiently identical.
To determine the percent identity of two nucleic acids, the sequences are aligned for optimal comparison purposes. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., percent identity = number of identical positions/total number of positions (e.g., overlapping positions) x 100). In one embodiment, the two sequences are the same length. The percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
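The percent identity formula stated above (identical positions divided by total positions, multiplied by 100) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps; the function name, the `count_gaps` option, and the toy sequences are hypothetical, and this is not the BLAST or MUSCLE procedure referred to elsewhere herein.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     count_gaps: bool = True) -> float:
    """Percent identity = identical positions / total positions x 100.

    aligned_a and aligned_b are pre-aligned sequences of equal length.
    If count_gaps is False, columns containing a gap are excluded so that
    only overlapping positions contribute to the total.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    total = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for either sequence
        has_gap = a == "-" or b == "-"
        if has_gap and not count_gaps:
            continue  # ignore gapped columns when gaps are not allowed
        total += 1
        if a == b:
            identical += 1  # exact matches only
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


# Toy alignments for illustration only (hypothetical sequences).
print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT", count_gaps=False))
```

Whether gapped columns count toward the total changes the result, which is why the sketch exposes that choice as an explicit parameter.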
The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) PNAS 87:2264, modified as in Karlin and Altschul (1993) PNAS 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12, to obtain nucleotide sequences homologous to the polynucleotide molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389.
Alternatively, PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See
http://www.ncbi.nlm.nih.gov. Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of
Myers and Miller (1988) CABIOS 4:11-17. Such an algorithm is incorporated into the ALIGN program (version 2.0), which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM 120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Alignment may also be performed manually by inspection.
Unless otherwise stated, sequence identity values for pairs of sequences provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters (Altschul et al. (1997) Nucleic Acids Res. 25:3389-402) using the full-length sequences of the invention. Unless otherwise stated, sequence identity values for multiple sequence alignments provided herein refer to the value obtained using MUSCLE (Version 3.8) using default parameters using the full-length sequences of the invention. MUSCLE is available at http://www.drive5.com/muscle/ or http://www.ebi.ac.uk/Tools/msa/muscle/. See Edgar (2004) Nucleic Acids Res. 32(5):1792-1797; herein incorporated by reference.
The use of the terms "polynucleotide" and "nucleic acid" is not intended to limit the present invention to polynucleotides and nucleic acids comprising DNA. Those of ordinary skill in the art will recognize that polynucleotides and nucleic acids can comprise ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues. The polynucleotides and nucleic acids of the invention also encompass all forms of sequences including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
The expression constructs and regulatory constructs of the present invention can be provided in expression cassettes for expression in the plant or other organism or host cell of interest. It is recognized that the expression constructs of the present invention and expression cassettes comprising one or more of such expression constructs can be used for expression in both human and non-human host cells including, but not limited to, host cells from plants, animals, fungi, protists, and algae. In one embodiment of the invention, the host cells are human host cells or a host cell line that is incapable of differentiating into a human being.
The expression cassette can include additional 5' and 3' regulatory sequences operably linked to the expression construct or regulatory construct. "Operably linked" is intended to mean a functional linkage between two or more elements. For example, an operable linkage between one or more genetic regulatory elements and a gene of interest is a functional link between the gene of interest and the one or more genetic regulatory elements that allows for expression of the gene of interest. Operably linked elements may be contiguous or non-contiguous. When used to refer to the joining of two protein coding regions, by "operably linked" is intended that the coding regions are in the same reading frame. With respect to introns, an "operably linked intron" is an intron that is functional and splices out of a polynucleotide when in a host organism capable of splicing out such a functional intron. In the case of introns within a coding region or translated region of a gene, an "operably linked intron" is one that is functional and splices out of a coding region or translated region of an RNA without disrupting the reading frame for translation when the polynucleotide is in a host organism capable of splicing out such a functional intron. It is understood that the term "in operable linkage" as used herein has the same meaning as "operably linked".
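The requirement that an intron within a coding region splice out without disrupting the reading frame can be illustrated with a minimal Python sketch. The function names, the toy sequences, and the check for canonical GT...AG intron boundaries are assumptions made for illustration; sequences are written in DNA notation (T rather than U), and the sketch does not model the cellular splicing machinery.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice_out(pre_mrna: str, intron_start: int, intron_end: int) -> str:
    """Return the sequence with pre_mrna[intron_start:intron_end] removed.

    Assumes a canonical intron that begins with GT and ends with AG.
    """
    intron = pre_mrna[intron_start:intron_end].upper()
    if not (intron.startswith("GT") and intron.endswith("AG")):
        raise ValueError("intron lacks canonical GT...AG boundaries")
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


def frame_is_intact(cds: str) -> bool:
    """True if cds starts with ATG, is a whole number of codons long,
    ends with a stop codon, and has no earlier in-frame stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[-1] not in STOP_CODONS:
        return False
    return not any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    exon1, intron, exon2 = "ATGGCT", "GTAAGTTTTCAG", "GAATAA"  # toy sequences
    pre = exon1 + intron + exon2
    spliced = splice_out(pre, len(exon1), len(exon1) + len(intron))
    print(spliced, frame_is_intact(spliced))
```

In this toy case the intron sits between two complete codons, so removing it rejoins the exons in frame; an intron placed within a codon is equally acceptable provided the rejoined exons restore the frame.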
The expression cassette may additionally contain at least one additional gene to be co-transformed into the organism. Alternatively, the additional gene(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotide to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain selectable marker genes.
The expression cassette can comprise, in the 5'-3' direction of transcription, a transcriptional initiation region (i.e., a promoter), a translational initiation region, a nucleotide sequence to be expressed, a translational stop site, and a transcriptional termination region (i.e., termination region) functional in plants or other organism or host cell. The expression cassette further comprises a first intron either in the 5'-UTR or coding region. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the polynucleotide to be expressed may be native/analogous to the host cell or to each other. Alternatively, any of the regulatory regions and/or the polynucleotide to be expressed may be heterologous to the host cell or to each other. As used herein, "heterologous" in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide. As used herein, a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
The termination region may be native with the transcriptional initiation region, may be native with the operably linked polynucleotide of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the polynucleotide of interest, the plant host, or any combination thereof. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res. 15:9627-9639.
Unless stated otherwise or obvious from the context, a promoter of the present invention for gene expression in plants is capable of directing the constitutive expression of an operably linked gene of interest in a plant, a plant part, and/or a plant cell.
Where appropriate, the genes of interest may be optimized for increased expression in the transformed plant. That is, the polynucleotides can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Patent Nos. 5,380,831 and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference.
Additional sequence modifications are known to enhance gene expression in a cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.
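As a minimal sketch of the G-C comparison described above, the following Python computes the G-C content of a candidate sequence and its deviation from the average of a set of genes expressed in the host. The function names and the example sequences are hypothetical and serve only to illustrate the calculation.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def gc_deviation(candidate: str, host_genes: list[str]) -> float:
    """Candidate G-C content minus the mean G-C content of the host genes."""
    if not host_genes:
        raise ValueError("at least one host gene is required")
    host_mean = sum(gc_content(gene) for gene in host_genes) / len(host_genes)
    return gc_content(candidate) - host_mean


if __name__ == "__main__":
    # Toy sequences; real use would supply coding sequences of known host genes.
    print(gc_deviation("ATAT", ["ATGC", "GGCC"]))
```

A negative deviation indicates a candidate that is A-T rich relative to the host average and might be a candidate for adjustment.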
The expression cassettes may additionally contain heterologous 5' UTRs (also known as 5' leader sequences). Such 5' UTRs can act to enhance translation. Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al. (1989) PNAS USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al. (1991) Nature 353:90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al. (1987) Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss, New York), pp. 237-256); and maize chlorotic mottle virus leader (MCMV) (Lommel et al. (1991) Virology 81:382-385). See also Della-Cioppa et al. (1987) Plant Physiol. 84:965-968.
In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
The expression cassette can also comprise a selectable marker gene for the selection of transformed cells or tissues. Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). Additional selectable markers include phenotypic markers such as β-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su et al. (2004) Biotechnol Bioeng. 85:610-9 and Fetter et al. (2004) Plant Cell 16:215-28), cyan fluorescent protein (CYP) (Bolte et al. (2004) J. Cell Science 117:943-54 and Kato et al. (2002) Plant Physiol. 129:913-42), and yellow fluorescent protein (PhiYFP™ from Evrogen, see Bolte et al. (2004) J. Cell Science 117:943-54). For additional selectable markers, see generally Yarranton (1992) Curr. Opin. Biotech. 3:506-511; Christopherson et al. (1992) PNAS 89:6314-6318; Yao et al. (1992) Cell 71:63-72; Reznikoff (1992) Mol. Microbiol. 6:2419-2422; Barkley et al. (1980) in The Operon, pp. 177-220; Hu et al. (1987) Cell 48:555-566; Brown et al. (1987) Cell 49:603-612; Figge et al. (1988) Cell 52:713-722; Deuschle et al. (1989) PNAS 86:5400-5404; Fuerst et al. (1989) PNAS 86:2549-2553; Deuschle et al. (1990) Science 248:480-483; Gossen (1993) Ph.D. Thesis, University of Heidelberg; Reines et al. (1993) PNAS 90:1917-1921; Labow et al. (1990) Mol. Cell. Biol. 10:3343-3356; Zambretti et al. (1992) PNAS 89:3952-3956; Baim et al. (1991) PNAS 88:5072-5076; Wyborski et al. (1991) Nucleic Acids Res. 19:4647-4653; Hillen and Wissman (1989) Topics Mol. Struc. Biol. 10:143-162; Degenkolb et al. (1991) Antimicrob. Agents Chemother. 35:1591-1595; Kleinschmidt et al. (1988) Biochemistry 27:1094-1104; Bonin (1993) Ph.D. Thesis, University of Heidelberg; Gossen et al. (1992) PNAS 89:5547-5551; Oliva et al. (1992) Antimicrob. Agents Chemother. 36:913-919; Hlavka et al. (1985) Handbook of Experimental Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al. (1988) Nature 334:721-724. Such disclosures are herein incorporated by reference.
The above list of selectable marker genes is not meant to be limiting. Any selectable marker gene can be used in the present invention.
Numerous plant transformation vectors and methods for transforming plants are available. See, for example, An, G. et al. (1986) Plant Physiol. 81:301-305; Fry, J., et al. (1987) Plant Cell Rep. 6:321-325; Block, M. (1988) Theor. Appl Genet. 76:767-774; Hinchee, et al. (1990) Stadler. Genet. Symp. 203212.203-212; Cousins, et al. (1991) Aust. J. Plant Physiol. 18:481-494; Chee, P. P. and Slightom, J. L. (1992) Gene 118:255-260; Christou, et al. (1992) Trends. Biotechnol. 10:239-246; D'Halluin, et al. (1992) Bio/Technol. 10:309-314; Dhir, et al. (1992) Plant Physiol. 99:81-88; Casas et al. (1993) PNAS 90:11212-11216; Christou, P. (1993) In Vitro Cell. Dev. Biol.-Plant 29P:119-124; Davies, et al. (1993) Plant Cell Rep. 12:180-183; Dong, J. A. and Mchughen, A. (1993) Plant Sci. 91:139-148; Franklin, C. I. and Trieu, T. N. (1993) Plant. Physiol. 102:167; Golovkin, et al. (1993) Plant Sci. 90:41-52; Guo Chin Sci. Bull. 38:2072-2078; Asano, et al. (1994) Plant Cell Rep. 13; Ayeres N. M. and Park, W. D. (1994) Crit. Rev. Plant. Sci. 13:219-239; Barcelo, et al. (1994) Plant. J. 5:583-592; Becker, et al. (1994) Plant. J. 5:299-307; Borkowska et al. (1994) Acta. Physiol Plant. 16:225-230; Christou, P. (1994) Agro. Food. Ind. Hi Tech. 5:17-27; Eapen et al. (1994) Plant Cell Rep. 13:582-586; Hartman, et al. (1994) Bio-Technology 12:919-923; Ritala, et al. (1994) Plant. Mol. Biol. 24:317-325; and Wan, Y. C. and Lemaux, P. G. (1994) Plant Physiol. 104:37-48.
The methods of the invention involve introducing an expression construct or regulatory construct into an organism. By "introducing" is intended presenting to the organism the expression construct in such a manner that the construct gains access to the interior of a cell of the organism. The methods of the invention do not depend on a particular method for introducing an expression construct or regulatory construct into an organism, only that the expression construct or regulatory construct gains access to the interior of at least one cell of the organism. Methods for introducing expression constructs, regulatory constructs, and other polynucleotides into various organisms such as, for example, plants, animals, fungi, protists, and bacteria are known in the art including, but not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.
By "stable transformation" is intended that the polynucleotide construct introduced into an organism integrates into a genome of the organism and is capable of being inherited by progeny thereof. By "transient transformation" is intended that a polynucleotide construct introduced into an organism does not integrate into a genome of the organism.
For the transformation of target organisms and host cells, the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in the organism or host cell. The selection of the vector depends on the preferred transformation technique and the species of target organism or host cell to be transformed.
For the transformation of plants and plant cells, the expression constructs and regulatory constructs of the invention are inserted using standard techniques into any vector known in the art that is suitable for expression of the nucleotide sequences in a plant or plant cell. The selection of the vector depends on the preferred transformation technique and the target plant species to be transformed.
Methodologies for constructing plant expression cassettes and introducing foreign nucleic acids into plants are generally known in the art and have been previously described. For example, foreign DNA can be introduced into plants using tumor-inducing (Ti) plasmid vectors. Other methods utilized for the delivery of foreign DNA or other foreign nucleic acids involve the use of PEG-mediated protoplast transformation, electroporation, microinjection, whiskers, and biolistics or microprojectile bombardment for direct DNA uptake. Such methods are known in the art (U.S. Pat. No. 5,405,765 to Vasil et al.; Bilang et al. (1991) Gene 100:247-250; Scheid et al. (1991) Mol. Gen. Genet. 228:104-112; Guerche et al. (1987) Plant Science 52:111-116; Neuhause et al. (1987) Theor. Appl Genet. 75:30-36; Klein et al. (1987) Nature 327:70-73; Howell et al. (1980) Science 208:1265; Horsch et al. (1985) Science 227:1229-1231; DeBlock et al. (1989) Plant Physiology 91:694-701; Methods for Plant Molecular Biology (Weissbach and Weissbach, eds.) Academic Press, Inc. (1988); and Methods in Plant Molecular Biology (Schuler and Zielinski, eds.) Academic Press, Inc. (1989)). The method of transformation depends upon the plant cell to be transformed, stability of vectors used, expression level of gene products and other parameters.
Other suitable methods of introducing nucleotide sequences into plant cells and subsequent insertion into the plant genome include microinjection as described by Crossway et al. (1986) Biotechniques 4:320-334, electroporation as described by Riggs et al. (1986) PNAS 83:5602-5606, Agrobacterium-mediated transformation as described by Townsend et al., U.S. Patent No. 5,563,055, Zhao et al., U.S. Patent No. 5,981,840, Yukou et al., WO 94/000977, and Hideaki et al., WO 95/06722, direct gene transfer as described by Paszkowski et al. (1984) EMBO J. 3:2717-2722, and ballistic particle acceleration as described in, for example, Sanford et al., U.S. Patent No. 4,945,050; Tomes et al., U.S. Patent No. 5,879,918; Tomes et al., U.S. Patent No. 5,886,244; Bidney et al., U.S. Patent No. 5,932,782; Tomes et al. (1995) "Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment," in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); McCabe et al. (1988) Biotechnology 6:923-926; and Lec1 transformation (WO 00/28058). Also see Weissinger et al. (1988) Ann. Rev. Genet. 22:421-477; Sanford et al. (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al. (1988) Plant Physiol. 87:671-674 (soybean); McCabe et al. (1988) Bio/Technology 6:923-926 (soybean); Finer and McMullen (1991) In Vitro Cell Dev. Biol. 27P:175-182 (soybean); Singh et al. (1998) Theor. Appl. Genet. 96:319-324 (soybean); Datta et al. (1990) Biotechnology 8:736-740 (rice); Klein et al. (1988) PNAS 85:4305-4309 (maize); Klein et al. (1988) Biotechnology 6:559-563 (maize); Tomes, U.S. Patent No. 5,240,855; Buising et al., U.S. Patent Nos. 5,322,783 and 5,324,646; Tomes et al. (1995) "Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment," in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg (Springer-Verlag, Berlin) (maize); Klein et al. (1988) Plant Physiol. 91:440-444 (maize); Fromm et al. (1990) Biotechnology 8:833-839 (maize); Hooykaas-Van Slogteren et al. (1984) Nature (London) 311:763-764; Bowen et al., U.S. Patent No. 5,736,369 (cereals); Bytebier et al. (1987) PNAS 84:5345-5349 (Liliaceae); De Wet et al. (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al. (Longman, New York), pp. 197-209 (pollen); Kaeppler et al. (1990) Plant Cell Reports 9:415-418 and Kaeppler et al. (1992) Theor. Appl. Genet. 84:560-566 (whisker-mediated transformation); D'Halluin et al. (1992) Plant Cell 4:1495-1505 (electroporation); Li et al. (1993) Plant Cell Reports 12:250-255 and Christou and Ford (1995) Annals of Botany 75:407-413 (rice); Osjoda et al. (1996) Nature Biotechnology 14:745-750 (maize via Agrobacterium tumefaciens); all of which are herein incorporated by reference.
The nucleic acid molecules, expression constructs, and regulatory constructs of the invention may be introduced into plants by contacting plants with a virus or viral nucleic acids. Generally, such methods involve incorporating a nucleic acid molecule or an expression construct of the invention within a viral DNA or RNA molecule. It is recognized that a protein of the invention may be initially synthesized as part of a viral polyprotein, which later may be processed by proteolysis in vivo or in vitro to produce the desired recombinant protein.
The cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting hybrid having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved. In this manner, the present invention provides transformed seed (also referred to as "transgenic seed") having a polynucleotide construct of the invention, for example, an expression cassette of the invention, stably incorporated into their genome.
In specific embodiments, the nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be provided to a plant or other organism using a variety of transient transformation methods. Such transient transformation methods include, but are not limited to, the introduction of the sequence or variants and fragments thereof directly into the plant or other organism or the introduction of a transcript into the plant. Such methods include, for example, microinjection, electroporation, or particle bombardment. See, for example, Crossway et al. (1986) Mol. Gen. Genet. 202:179-185; Nomura et al. (1986) Plant Sci. 44:53-58; Hepler et al. (1994) PNAS 91:2176-2180; Hush et al. (1994) The Journal of Cell Science 107:775-784; Sheen, J. (2002) A transient expression assay using maize mesophyll protoplasts, http://genetics.mgh.harvard.edu/sheenweb/; and Anderson et al., U.S. Pat. No. 7,645,919 B2, all of which are herein incorporated by reference.
Alternatively, the polynucleotide can be transiently transformed into the plant or other organism using any other technique known in the art.
The nucleic acid molecules, expression constructs, and regulatory constructs of the present invention can be used for transformation of any plant species, including, but not limited to, monocots and dicots. Examples of plant species of interest include, but are not limited to, Arabidopsis thaliana, peppers (Capsicum spp.; e.g., Capsicum annuum, C. baccatum, C. chinense, C. frutescens, C. pubescens, and the like), tomatoes (Lycopersicon esculentum), tobacco (Nicotiana tabacum), eggplant (Solanum melongena), petunia (Petunia spp., e.g., Petunia x hybrida or Petunia hybrida), corn or maize (Zea mays), Brassica spp. (e.g., B. napus, B. rapa, B. juncea), particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), green millet (Setaria viridis), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum spp.), switchgrass (Panicum virgatum), duckweed (e.g., Lemna spp., Spirodela spp., Landoltia spp., Wolffiella spp., and Wolffia spp.), algae (e.g., Chlamydomonas reinhardtii, Botryococcus braunii, Chlorella spp., Dunaliella tertiolecta, Gracilaria spp.), oats, barley, vegetables, ornamentals, and conifers.
As used herein, the term plant includes plant cells, plant protoplasts, plant cell or tissue cultures from which plants can be regenerated, plant calli, plant clumps, and plant cells that are intact in plants or parts of plants such as embryos, pollen, ovules, seeds, leaves, flowers, branches, fruits, roots, root tips, anthers, and the like. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the invention, provided that these parts comprise the introduced expression constructs or polynucleotides.
Various changes in phenotype are of interest including modifying the fatty acid composition in a plant, altering the amino acid content of a plant, altering a plant's pathogen defense mechanism, and the like. These results can be achieved by providing expression of heterologous products or increased expression of endogenous products in plants.
The present invention provides methods for expressing heterologous genes in organisms. A heterologous gene of the present invention can be any gene of interest that can be expressed by the methods of the present invention. Genes of interest encode proteins of interest. Thus, a translated region of the present invention can comprise a gene of interest that encodes a protein of interest.
Genes of interest are reflective of the commercial markets and interests of those involved in the development of the crop. Crops and markets of interest change, and as developing nations open up world markets, new crops and technologies will emerge also. In addition, as our understanding of agronomic traits and characteristics such as yield and heterosis increases, the choice of genes for transformation will change accordingly. General categories of genes of interest include, for example, those genes involved in information, such as zinc fingers, those involved in communication, such as kinases, and those involved in housekeeping, such as heat shock proteins. More specific categories of transgenes, for example, include genes encoding important traits for agronomics, insect resistance, disease resistance, herbicide resistance, sterility, grain characteristics, yield, abiotic stress tolerance, and commercial products. Genes of interest include, generally, those involved in oil, starch, carbohydrate, or nutrient metabolism. In addition, genes of interest include genes encoding enzymes and other proteins from plants and other sources including prokaryotes and other eukaryotes.
The following examples are offered by way of illustration and not by way of limitation.
EXAMPLE 1
Promoters and introns from highly expressed constitutive genes of Arabidopsis and rice were used to make expression constructs comprising a promoter and intron from the same gene operably linked to a reporter gene, and control expression constructs comprising a promoter operably linked to the reporter gene. The Arabidopsis and rice genes were previously identified as being highly expressed constitutive genes as reported in WO 2011/079197 (see also U.S. Patent Application No. 13/528,515, filed June 20, 2012) and WO 2012/006426. The accession numbers of the genes are listed in Tables 1 and 2 along with cross-references to the sequence identifiers for the constructs in these publications.
Table 1. Gene Accessions from Arabidopsis thaliana
Table 2. Gene Accessions from Rice (Oryza sativa)
In Figures 1 and 2, intron-mediated enhancement (IME) was calculated as average expression with a (+) intron construct divided by average expression with the corresponding (-) intron construct. The constructs from Arabidopsis were tested for GFP expression in Arabidopsis thaliana by calculating the GFP index, and the rice constructs were tested for GUS expression in corn (Zea mays) by determining GUS enzymatic activity, as described in WO 2011/079197 and WO 2012/006426. IME was calculated for each of the 14 (Arabidopsis) or 10 (corn) tissues/zones/stages, and these values were then averaged for presentation in Figures 1 and 2. The presence of the first introns in the constructs enhanced expression in most cases (12 of 15 cases in Arabidopsis, 10 of 10 cases in rice), with the expression enhancement ranging from 2- to 70-fold in both Arabidopsis and corn (median IME in Arabidopsis was 4.1-fold, median IME in corn was 11.5-fold). It is noted that the starred Arabidopsis IME values in Figure 1 and all of the corn IME values in Figure 2 are minimal estimates for IME because there was no detectable expression in the absence of an intron in one or more of the tissues tested. In these cases, the IME value is calculated using background GFP or GUS values, respectively, for the tissues with no detectable expression in the (-) intron transgenics. Arabidopsis expression measurements are from the root epidermis, cortex, endodermis, and stele in each of the meristematic, elongation, and maturation zones, as well as the root cap and quiescent center (14 measurements throughout root development total) of T2 seedlings. Corn expression measurements are from V3-root, V7-root, VT-root, V3-leaf, V7-leaf, VT-leaf, VT-anther, VT-silk, 21-DAP-embryo, and 21-DAP-endosperm (10 measurements throughout plant development total) from R0 seedlings.
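The IME calculation described above can be sketched in Python as follows. This is an illustrative sketch only: the function names, the data layout, and the rule that a (-) intron average at or below the supplied background value counts as "no detectable expression" are assumptions, and the numbers in the usage example are invented.

```python
from statistics import mean


def ime(plus_intron, minus_intron, background=None):
    """Return (fold_enhancement, is_minimal_estimate) for one tissue.

    IME = average (+) intron expression / average (-) intron expression.
    If a background value is given and the (-) intron average does not
    exceed it, the background is used as the denominator and the result
    is flagged as a minimal estimate.
    """
    plus_avg = mean(plus_intron)
    minus_avg = mean(minus_intron)
    is_minimal = False
    if background is not None and minus_avg <= background:
        minus_avg = background
        is_minimal = True
    if minus_avg <= 0:
        raise ValueError("(-) intron expression must be positive or a background supplied")
    return plus_avg / minus_avg, is_minimal


def mean_ime_across_tissues(per_tissue, background=None):
    """Average the per-tissue IME values.

    per_tissue maps a tissue name to (plus_intron_values, minus_intron_values).
    The result is flagged minimal if any tissue required the background value.
    """
    results = [ime(plus, minus, background) for plus, minus in per_tissue.values()]
    folds = [fold for fold, _ in results]
    return mean(folds), any(flag for _, flag in results)


if __name__ == "__main__":
    data = {"root": ([8.0], [2.0]), "leaf": ([6.0], [0.0])}  # invented values
    print(mean_ime_across_tissues(data, background=1.0))
```

Flagging the result when any tissue falls back to background mirrors the starred, minimal-estimate values noted for Figures 1 and 2.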
IME was also determined in shoot tissue from two representative Arabidopsis promoters using quantitative PCR analysis (qRT-PCR) and Northern blot analysis. For the Northern blot analysis of GFP expression, independent transgenic lines that exhibited single locus segregation of antibiotic resistance marker expression were selected for GFP transgene copy number and expression analysis. T3 seeds from the transgenic lines, as well as wild-type (WT) Col controls, were surface sterilized, stratified and grown on 1X MS media. Shoot tissues were harvested from 2-3 week old seedlings and homogenized in liquid nitrogen by grinding with mortar and pestle. Total RNA was extracted from tissues using the RNeasy kit (Qiagen). Gel resolution, transfer and crosslinking were done with the NorthernMax kit (Ambion). Probes for GFP and the housekeeping gene ATPK1 were labeled with the Prime-A-Gene kit (Promega). Unincorporated labels were removed via Micro Bio-spin P30 Tris chromatography columns (BioRad). Following overnight hybridization, membranes were washed in 2X SSC with 0.1% SDS, dried, and screened at 100 μm using the Scan PhosphorImager. Bands were quantified via ImageQuant software.
For the quantitative PCR analysis of GFP expression, independent transgenic lines that exhibited single locus segregation of antibiotic resistance marker expression were selected for GFP transgene copy number and expression analysis. T3 seeds from the tested transgenic lines, as well as a CaMV 35S:GFP construct and wild-type (WT) Col controls, were surface sterilized, stratified and sown on 90 μm nylon mesh on 1X MS media. Shoots were harvested from pools of ~100 one-week-old seedlings per line. Shoot tissue was homogenized in liquid nitrogen by bead milling followed by passage through QIAshredder columns (Qiagen). Genomic DNA and total RNA were extracted from tissues using Allprep DNA/RNA kits (Qiagen). cDNA was generated from total RNA using SuperScript III reverse transcriptase (Invitrogen) per manufacturer's instructions. Quantitative PCR was performed with iQ Multiplex Powermix (Bio-Rad) supplemented with the appropriate primers and probes (see below) on an iCycler iQ real-time detection system (Bio-Rad) using the following thermal-cycler program: (1) 9 min at 95°C; (2) 15 s at 94°C; (3) 30 s at 57°C; (4) 30 s at 72°C; repeat 40 cycles of steps 2-4. Amplification data recorded by the iQ software (Bio-Rad) were exported to the LinRegPCR program (Ruijter et al. (2009) Nucleic Acids Res. 37(6):e45) to determine PCR efficiency and cycle threshold values, which were used to calculate GFP transgene copy number and expression relative to the 35S:GFP control using the REST-MCS beta tool (Pfaffl et al. (2002) Nucleic Acids Res. 30(9):e36). Relative GFP expression in each tissue was calculated by normalizing the amplification of GFP in cDNA to the amplification of ubiquitin-conjugating enzyme 9 (UBC9), a "housekeeping gene", and subsequent normalization to 35S:GFP. Primers used for PCR are as follows:
ER-GFP F 5' - CGTGCAGGAGAGGACCAT;
ER-GFP R 5' - TGTCTCCCTCAAACTTGACTTCAG;
ER-GFP Probe 5' - 56-FAM/TCCCGTCGTCCTTGAAGAAG/3IABkFQ;
UBC9 F 5' - ATGGAAGCATCTGCCTCGACATCT;
UBC9 R 5' - AGGATCATCTGGGTTTGGATCCGT;
UBC9 Probe 5' - 5TEX615/AGCAGTGGAGTCCTGCTCTCACAATT/3IAbRQSp;
PDS 1 F 5' - TCACGGCTCTTGTCGTTCCTTCTT;
PDS 1 R 5' - TGGAGAAAGCTGACTCTGCGTCTT;
PDS 1 Probe 5' - 5TEX615/TCGGTGTTAGAGCCGTTGCGATTGAA/3IAbRQSp.
56-FAM and 5TEX615 indicate the presence of 5' fluorophore modifications, while 3IAbRQSp and 3IABkFQ indicate the presence of 3' quencher modifications (Integrated DNA Technologies, Coralville, Iowa USA) on the real time PCR probes.
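The relative-expression step described above (GFP normalized to UBC9, then to the 35S:GFP control) can be illustrated with an efficiency-corrected ratio of the kind computed by REST-style tools. This is only a minimal sketch, not the procedure actually run by the REST-MCS tool: the function name, argument names, and all numeric values are hypothetical, and efficiencies are assumed to be expressed as fold increase per cycle (2.0 meaning perfect doubling).

```python
def relative_expression(e_target, e_ref,
                        ct_target_sample, ct_target_control,
                        ct_ref_sample, ct_ref_control):
    """Efficiency-corrected expression ratio of a sample versus a control.

    e_target, e_ref: amplification efficiencies (fold increase per cycle)
    for the target gene (e.g. GFP) and the reference gene (e.g. UBC9).
    ct_*: cycle threshold values for each gene in sample and control.
    """
    # Ct difference (control minus sample) for each gene
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    # Target change corrected by the reference-gene change
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Hypothetical values: the target crosses threshold 2 cycles earlier in
# the sample than in the control, and the reference gene is unchanged.
ratio = relative_expression(2.0, 2.0, 22.0, 24.0, 20.0, 20.0)
print(ratio)  # 4.0
```

With perfect efficiencies and an unchanged reference gene, a two-cycle shift corresponds to a four-fold difference; lower measured efficiencies shrink that ratio accordingly.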
Expression enhancement (IME) was calculated as the average expression with the (+) intron construct divided by the average expression with the (-) intron construct.
Measurements are the average of shoots of 3-5 independent, homozygous, single-copy T3 lines per intron variant.
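The IME calculation stated above is a ratio of two averages. The following sketch shows that arithmetic; the function name and the per-line expression values are hypothetical and are not data from the tables.

```python
from statistics import mean


def intron_mediated_enhancement(plus_intron, minus_intron):
    """Return IME: mean (+) intron expression / mean (-) intron expression.

    Each argument is a list of expression measurements, one per
    independent transgenic line.
    """
    avg_plus = mean(plus_intron)
    avg_minus = mean(minus_intron)
    if avg_minus <= 0:
        # With no (-) intron signal above background the ratio is
        # undefined; only a minimal estimate could be reported.
        raise ValueError("(-) intron expression must be above zero")
    return avg_plus / avg_minus


# Hypothetical measurements from three lines per variant
plus_lines = [12.0, 15.0, 9.0]   # mean 12.0
minus_lines = [1.0, 2.0, 3.0]    # mean 2.0
ime = intron_mediated_enhancement(plus_lines, minus_lines)
print(ime)  # 6.0
```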
Table 3. Shoot Intron Expression Enhancement (IME) of Arabidopsis
Promoters by Cognate First Introns
* Minimal estimate of IME since no
expression above background was detected
in the -intron variants.
Tables 4 and 5 demonstrate the absolute expression activity of the + intron variants when compared to well-known, high constitutive expressing control promoters. In Table 4, expression constructs with Arabidopsis promoters and cognate introns were compared to the CaMV 35S promoter for expression in Arabidopsis roots. GFP expression in Arabidopsis was measured as the GFP index as described in WO 2011/079197 and WO 2012/006426. In Table 5, expression constructs with rice promoters and cognate introns were compared to an enhanced rice actin 1 (eACT1) promoter for expression in corn. GUS expression in corn was measured from GUS activity assays as described in WO 2011/079197. These results demonstrate that IME is important for achieving expression approaching and comparable to well-known, high constitutive expressing control promoters.
Table 4. Expression of Arabidopsis + Intron Promoter Constructs in Arabidopsis Roots
* Average GFP index from 14 root tissues/zones
of two independent lines per promoter. Results from a CaMV 35S-promoter control (- intron)
are shown for comparison.
Table 5. Expression of Rice + Intron Promoter Constructs in Corn Plants
* Average GUS activity measured in 10 corn
tissue/stages from 5-10 lines per promoter.
Results from an enhanced rice actin 1 (eACT1) promoter control (+ intron) are shown for
comparison.
In addition to enhancing the expression of their cognate promoters, the introns that have been identified can enhance the expression of heterologous promoters. In this example, introns were swapped between two promoters from Figure 1 and tested for expression enhancement by northern analysis of shoot tissue as described above. The result in Table 6 for the AT1G52300/AT4G37830 construct is based on 1 single copy, homozygous line of each of the - and + intron variants. The result in Table 6 for the AT4G37830/AT1G52300 construct is based on 2 (- intron variant) and 5 (+ intron variant) single copy, homozygous lines.
Table 6. Intron-Mediated Enhancement (IME) of Heterologous Promoters
* IME was calculated as average expression with a + intron construct divided by average expression with the corresponding (-) intron construct.
The results provided in Figures 1-2 and Tables 3-6 demonstrate that the phenomenon of intron enhancement of gene expression by a first intron is widespread and that IME contributes to the expression of most highly expressed constitutive genes in both monocot and dicot plants. All of the tested promoters are from highly expressed constitutive genes. The present inventors have only been able to recapitulate high expression with cloned promoters when the first introns were included. In contrast, the current dogma has been that there were just a few monocot introns with enhancing properties, and that intron enhancement is not important in dicots. Furthermore, the present invention demonstrates how to identify enhancing introns: by taking the first introns from genes selected for particular properties (e.g., high and uniform expression in all cell types, organs, tissues). The first introns are usually in the coding region, but as disclosed herein the enhancing property of the first introns is modular because the first introns can be moved to the 3' end of 5'-UTRs of cloned promoters and still provide effective enhancement. This is important because the present invention demonstrates that it is not necessary to make fusion constructs comprising a first intron inserted within the translated region of a gene of interest. Instead, regulatory constructs can be prepared which comprise a promoter operably linked to a 5'-UTR which comprises a first intron, preferably at or near the 3' end of the 5'-UTR. Such a construct can be operably linked to any gene of interest with relative ease without making any modification to the translated region of the gene of interest.
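The modular layout described above (promoter, then a 5'-UTR carrying the first intron at or near its 3' end, then an unmodified translated region) can be pictured as simple sequence concatenation. The sketch below is purely illustrative: the class, the function, the `utr_tail` parameter, and the toy sequences are all hypothetical and do not correspond to any sequence disclosed in this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegulatoryConstruct:
    promoter: str      # promoter sequence
    utr5: str          # 5'-UTR sequence without the intron
    first_intron: str  # first intron taken from the source gene

    def sequence(self, utr_tail: int = 0) -> str:
        """Promoter plus 5'-UTR, with the intron inserted `utr_tail`
        bases upstream of the 3' end of the 5'-UTR (0 = at the 3' end)."""
        if not 0 <= utr_tail <= len(self.utr5):
            raise ValueError("utr_tail must lie within the 5'-UTR")
        split = len(self.utr5) - utr_tail
        return (self.promoter + self.utr5[:split]
                + self.first_intron + self.utr5[split:])


def link_gene(construct: RegulatoryConstruct, coding_sequence: str,
              utr_tail: int = 0) -> str:
    """Append an unmodified translated region to the regulatory construct."""
    return construct.sequence(utr_tail) + coding_sequence


# Toy sequences for illustration only
reg = RegulatoryConstruct(promoter="TATAAA", utr5="ACCGGT",
                          first_intron="GTAAGTTTCAG")
cds = "ATGGCCTAA"
full = link_gene(reg, cds, utr_tail=2)
print(full)  # TATAAAACCGGTAAGTTTCAGGTATGGCCTAA
```

Because the intron sits entirely within the 5'-UTR, the coding sequence passed to `link_gene` is appended unchanged, which mirrors the point that no modification of the translated region is needed.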
All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
Claims
THAT WHICH IS CLAIMED:
1. A method for making an expression construct for enhancing gene expression in an organism, said method comprising:
(a) selecting a first intron, wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism, wherein the first intron is the first intron from the 5' end of the first gene, and wherein the first gene is native to the first organism;
(b) selecting a promoter, wherein the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, wherein the second gene is native to at least one of the first organism and the second organism; and
(c) synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), a first intron, and a translated region, and wherein the 5'-UTR or the translated region comprise the first intron.
2. The method of claim 1, wherein the expression construct provides for enhanced expression of the polynucleotide in a target organism when compared to the expression of the polynucleotide in the target organism from a control expression construct which lacks the first intron.
3. The method of claim 2, wherein the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
4. The method of claim 1, wherein the first organism and the second organism are from the same species.
5. The method of claim 1, wherein the first organism and the second organism are from different species.
6. The method of claim 2, wherein the target organism is the same species as at least one of the first organism and the second organism.
7. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
8. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
9. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
10. The method of claim 1, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
11. The method of claim 1, wherein the 5'-UTR comprises the first intron.
12. The method of claim 11, wherein the first intron is at or near the 3' end of the 5'-UTR.
13. The method of claim 1, wherein the translated region comprises the first intron.
14. The method of claim 13, wherein the first intron is between the first and second exons of the translated region.
15. The method of claim 1, wherein the first organism and the second organism are eukaryotic organisms.
16. The method of claim 15, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
17. The method of claim 1, wherein the first organism and the second organism are plants.
18. The method of claim 17, wherein the plant is a dicot or a monocot.
19. The method of claim 18, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
20. The method of claim 15, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
21. An expression construct according to any one of claims 1-20.
22. A non-human organism or a non-human host cell comprising the expression construct of claim 21.
23. A method for making an organism for expressing a heterologous gene, said method comprising introducing into at least one cell of a target organism an expression construct comprising a promoter operably linked to a polynucleotide, wherein:
(a) the polynucleotide comprises a 5'-UTR, a first intron, and a translated region,
(b) the 5'-UTR or translated region comprises the first intron,
(c) wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism,
(d) the first intron is the first intron from the 5' end of the first gene,
(e) the first gene is native to the first organism,
(f) the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, and
(g) the second gene is native to at least one of the first organism and the second organism.
24. The method of claim 23, further comprising regenerating from the at least one cell a target organism comprising the expression construct.
25. The method of claim 24, wherein the target organism is capable of expressing the polynucleotide when the target organism is exposed to conditions favorable for the expression of the polynucleotide for a sufficient period of time.
26. The method of claim 25, wherein the polynucleotide is expressed at an increased level in the target organism or at least one cell thereof when compared to the expression of the polynucleotide in the target organism or at least one cell thereof comprising a control expression construct which lacks the first intron.
27. The method of claim 25, wherein the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
28. The method of claim 23, wherein the first organism and the second organism are from the same species.
29. The method of claim 23, wherein the first organism and the second organism are from different species.
30. The method of claim 23, wherein the target organism is the same species of organism as at least one of the first organism and the second organism.
31. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
32. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
33. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
34. The method of claim 23, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
35. The method of claim 23, wherein the 5'-UTR comprises the first intron.
36. The method of claim 35, wherein the first intron is at or near the 3' end of the 5'-UTR.
37. The method of claim 23, wherein the translated region comprises the first intron.
38. The method of claim 37, wherein the first intron is between the first and second exons of the translated region.
39. The method of claim 23, wherein the first organism, the second organism, and the target are eukaryotic organisms.
40. The method of claim 39, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
41. The method of claim 23, wherein the first organism, the second organism, and the target organism are plants.
42. The method of claim 31, wherein the plant is a dicot or a monocot.
43. The method of claim 42, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
44. The method of claim 42, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
45. The method of claim 23, wherein the target organism is a plant.
46. A plant of claim 45 or descendant thereof that comprises the expression construct.
47. The plant or descendant of claim 46, wherein the plant is a seed.
48. A non-human organism of any one of claims 23-45 or cell or descendant thereof, wherein the non-human organism or cell or descendant thereof comprises the expression construct.
49. A method for expressing a heterologous gene in an organism, said method comprising:
(a) obtaining a target organism comprising an expression construct or at least one cell thereof, wherein the nucleic acid comprises a promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-UTR, a first intron, and a translated region, wherein the 5'-UTR or translated region comprises the first intron, wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism, wherein the first intron is the first intron from the 5' end of the first gene, wherein the first gene is native to the first organism, wherein the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, and wherein the second gene is native to at least one of the first organism and the second organism; and
(b) exposing the target organism or cell thereof to conditions favorable for the expression of the polynucleotide for a sufficient period of time, whereby the polynucleotide is expressed.
50. The method of claim 49, wherein the polynucleotide is expressed at an increased level in the target organism or cell thereof when compared to the expression of the polynucleotide in the target organism or cell thereof comprising a control expression construct which lacks the first intron.
51. The method of claim 50, wherein the expression level of the polynucleotide is determined by measuring the level of the protein encoded by the translated region or by assaying the activity or function of the protein.
52. The method of claim 49, wherein the target organism or a progenitor thereof is produced by introducing the expression construct into at least one cell of an organism and regenerating the at least one cell into the target organism or a progenitor thereof comprising the expression construct.
53. The method of claim 49, wherein the first organism and the second organism are from the same species.
54. The method of claim 49, wherein the first organism and the second organism are from different species.
55. The method of claim 49, wherein the target organism is the same species of organism as at least one of the first organism and the second organism.
56. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
57. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
58. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
59. The method of claim 49, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
60. The method of claim 49, wherein the 5'-UTR comprises the first intron.
61. The method of claim 60, wherein the first intron is at or near the 3' end of the 5'-UTR.
62. The method of claim 49, wherein the translated region comprises the first intron.
63. The method of claim 62, wherein the first intron is between the first and second exons of the translated region.
64. The method of claim 49, wherein the first organism, the second organism, and the target are eukaryotic organisms.
65. The method of claim 64, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
66. The method of claim 49, wherein the first organism, the second organism, and the target organism are plants.
67. The method of claim 66, wherein the plant is a dicot or a monocot.
68. The method of claim 66, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
69. The method of claim 66, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
70. The method of claim 49, wherein the target organism is a plant.
71. The method of claim 70, further comprising regenerating the at least one cell into a plant comprising the expression construct.
72. A plant of claim 71 or descendant thereof, wherein the descendant comprises the expression construct.
73. The plant or descendant of claim 72, wherein the plant or descendant is a seed.
74. A non-human organism of any one of claims 49-71 or cell or descendant thereof, wherein the non-human organism or cell or descendant thereof comprises the expression construct.
75. A method for making a regulatory construct, said method comprising:
(a) selecting a first intron, wherein the first intron is derived from a first gene that is highly expressed in a constitutive manner in a first organism, wherein the first intron is the first intron from the 5' end of the first gene, and wherein the first gene is native to the first organism;
(b) selecting a promoter, wherein the promoter is derived from the first gene or from a second gene that is highly expressed in a constitutive manner in the first organism or in a second organism, wherein the second gene is native to at least one of the first organism and the second organism; and
(c) synthesizing an expression construct comprising the promoter operably linked to a polynucleotide, wherein the polynucleotide comprises a 5'-untranslated region (5'-UTR), a first intron, and a translated region, and wherein the 5'-UTR or the translated region comprise the first intron.
76. The method of claim 75, wherein the regulatory construct provides for enhanced expression of an operably linked gene of interest in a target organism when compared to the expression of the gene of interest in the target organism from a control regulatory construct which lacks the first intron.
77. The method of claim 75, wherein the first organism and the second organism are from the same species.
78. The method of claim 75, wherein the first organism and the second organism are from different species.
79. The method of claim 76, wherein the target organism is the same species as at least one of the first organism and the second organism.
80. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are derived from the same gene.
81. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same gene.
82. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are derived from the same organism.
83. The method of claim 75, wherein the first intron, the promoter, and the 5'-UTR are not all derived from the same organism.
85. The method of claim 75, wherein the first intron is at or near the 3' end of the 5'-UTR.
86. The method of claim 1, wherein the first organism and the second organism are eukaryotic organisms.
87. The method of claim 86, wherein the eukaryotic organisms are selected from the group consisting of plants, animals, fungi, and protists.
88. The method of claim 75, wherein the first organism and the second organism are plants.
89. The method of claim 88, wherein the plant is a dicot or a monocot.
90. The method of claim 89, wherein the dicot is selected from the group consisting of Arabidopsis, soybean, cotton, tomato, potato, papaya, alfalfa, and Brassica sp.
91. The method of claim 89, wherein the monocot is selected from the group consisting of maize, sorghum, wheat, rice, switchgrass, sugarcane, millet, duckweed, Brachypodium, banana, and barley.
92. The method of claim 75, further comprising operably linking a gene of interest to the regulatory construct.
93. A regulatory construct according to any one of claims 75-92.
94. A non-human organism or a non-human host cell comprising the regulatory construct of claim 93.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201261666318P | 2012-06-29 | 2012-06-29 | |
US61/666,318 | 2012-06-29 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2014004638A2 true WO2014004638A2 (en) | 2014-01-03 |
WO2014004638A3 WO2014004638A3 (en) | 2014-03-13 |
Family
ID=48746154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2013/047837 WO2014004638A2 (en) | 2012-06-29 | 2013-06-26 | Methods and compositions for enhancing gene expression |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2014004638A2 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE506441T1 (en) * | 1998-08-19 | 2011-05-15 | Monsanto Technology Llc | PLANT EXPRESSION VECTORS |
CA2599405A1 (en) * | 2005-03-08 | 2006-09-14 | Basf Plant Science Gmbh | Expression enhancing intron sequences |
US20130145502A1 (en) * | 2010-06-09 | 2013-06-06 | Pioneer Hi-Bred International, Inc. | Regulatory sequences for modulating transgene expression in plants |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016134213A3 (en) * | 2015-02-19 | 2016-11-03 | Danisco Us Inc | Enhanced protein expression |
WO2018136594A1 (en) * | 2017-01-19 | 2018-07-26 | Monsanto Technology Llc | Plant regulatory elements and uses thereof |
US10196648B2 (en) | 2017-01-19 | 2019-02-05 | Monsanto Technology Llc | Plant regulatory elements and uses thereof |
US10870863B2 (en) | 2017-01-19 | 2020-12-22 | Monsanto Technology Llc | Plant regulatory elements and uses thereof |
EA039606B1 (en) * | 2017-01-19 | 2022-02-16 | Монсанто Текнолоджи Ллс | PLANT REGULATORY ELEMENTS AND THEIR USE |
US11519002B2 (en) | 2017-01-19 | 2022-12-06 | Monsanto Technology Llc | Plant regulatory elements and uses thereof |
US12043842B2 (en) | 2017-01-19 | 2024-07-23 | Monsanto Technology Llc | Plant regulatory elements and uses thereof |
Also Published As
Publication number | Publication date |
---|---|
WO2014004638A3 (en) | 2014-03-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13733532 Country of ref document: EP Kind code of ref document: A2 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13733532 Country of ref document: EP Kind code of ref document: A2 |