NO175640B

Info

Publication number: NO175640B
Application number: NO863000A
Authority: NO
Priority date: 1985-07-27
Filing date: 1986-07-25
Publication date: 1994-08-01
Also published as: JPH07113040B2; ES2000278A6; AU6056586A; PT83065A; HU197356B; ZA865556B; JPS6229600A; IE59127B1; EP0211299B1; IL79522A; PT83065B; NO175640C; FI863041A0; EP0211299A2; DK355486D0; IE861977L; EP0211299A3; NZ216967A; GR861957B; DK172618B1

Abstract

1. Claims for the Contracting States: BE, CH, DE, FR, GB, IT, LI, LU, NL, SE

A fusion protein of the general formula Met - Xn - D' - Y - Z in which n is zero or 1, X is a sequence of 1 to 12 genetically codable amino acids, D' is a sequence of about 70 amino acids in the region of the sequence of amino acids 23 - 93 of the D-peptide in the trp operon of E. coli, Y denotes a sequence of one or more genetically codable amino acids which permits the following amino acid sequence Z to be cleaved off, and Z is a sequence of genetically codable amino acids.

1. Claims for the Contracting State: AT

A process for the preparation of a fusion protein of the general formula (1) Met - Xn - D' - Y - Z in which n is zero or 1, X is a sequence of 1 to 12 genetically codable amino acids, D' is a sequence of about 70 amino acids in the region of the sequence of amino acids 23 - 93 of the D-peptide in the trp operon of E. coli, Y denotes a sequence of one or more genetically codable amino acids which permits the following amino acid sequence Z to be cleaved off, and Z is a sequence of genetically codable amino acids, characterized by expressing a gene structure which codes for these fusion proteins in a host cell, and separating the fusion protein.

Description

The present invention relates to fusion proteins and the DNA that codes for them.

In the production of small eukaryotic proteins with a molecular weight of up to 15,000 daltons by genetic engineering, often only a low yield is achieved in bacteria. The proteins formed are probably degraded rapidly by the host's own proteases. Such proteins are therefore conveniently produced as fusion proteins, in particular with a host-derived protein portion that is subsequently cleaved off.

It has now been found that a segment of the D protein from the trp operon of E. coli consisting of only about 70 amino acids is particularly suitable for forming fusion proteins, specifically the region of amino acids 23 to 93 (C. Yanofsky et al., Nucleic Acids Res. 9 (1981) 6647), hereinafter also referred to as the "D'-peptide". Between the carboxy terminus of this peptide and the amino acid sequence of the desired eukaryotic protein lies a sequence of one or more genetically codable amino acids that permits chemical or enzymatic cleavage of the desired protein. In preferred embodiments of the present invention, a short amino acid sequence consisting of Lys-Ala adjoins the amino terminus, optionally followed by a sequence of 1 to 10, in particular 1 to 3, further genetically codable amino acids, preferably 2 amino acids, in particular Lys-Gly.

The present invention provides a fusion protein characterized by the general formula

Met - Xn - D' - Y - Z

in which

n is zero or 1,

X is a sequence of 1 to 12 genetically codable amino acids,

D' is a sequence of about 70 amino acids in the region of amino acid sequence 23-93 of the D-peptide in the trp operon of E. coli,

Y is a sequence of one or more genetically codable amino acids that permits the following amino acid sequence Z to be cleaved off, and

Z is a sequence of genetically codable amino acids.
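As an illustration only, the general formula can be written as a small Python sketch that assembles the segments in the stated order. The function name `assemble_fusion`, the placeholder token `D'` standing in for the roughly 70-residue segment, and the placeholder `Z` for the desired protein are assumptions made for this sketch and are not part of the specification.

```python
def assemble_fusion(y, z, x=None):
    """Return the segment order Met - Xn - D' - Y - Z as a string.

    x, y and z are lists of three-letter amino acid names. Passing x=None
    (or an empty list) corresponds to n = 0; otherwise n = 1.
    """
    x = list(x) if x else []
    if len(x) > 12:
        raise ValueError("X may comprise at most 12 amino acids")
    if not y:
        raise ValueError("Y must comprise at least one amino acid")
    if not z:
        raise ValueError("Z must not be empty")
    # D' is kept as a single placeholder token (hypothetical shorthand).
    return " - ".join(["Met", *x, "D'", *y, *z])


# Preferred X = Lys-Ala, Y = Met (cyanogen bromide site), Z as a placeholder.
print(assemble_fusion(y=["Met"], z=["Z"], x=["Lys", "Ala"]))
```

With n = 0 the X segment is simply omitted, so the same call without `x` yields Met - D' - Met - Z.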

The present invention further provides DNA characterized in that it codes for the fusion protein described above, in which the following sequence codes for D':

the following DNA sequence (coding strand) codes for X:

5' AAA GCA AAG GGC 3'
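The coding strand given for X can be checked against the standard genetic code. The following Python sketch is an illustration, not part of the specification; the helper name `translate` and the compact construction of the codon table are choices made here. It translates the twelve nucleotides into one-letter amino acid codes.

```python
from itertools import product

# Standard genetic code, codons enumerated with the bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA sequence (5' to 3') into one-letter codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(translate("AAA GCA AAG GGC"))  # KAKG
```

The result KAKG corresponds to Lys-Ala-Lys-Gly, i.e. the preferred Lys-Ala sequence followed by Lys-Gly.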

and, in the nucleotide sequence coding for the amino acids of Y, Y contains C-terminally Met, Cys, Trp, Arg or Lys, or one of the groups

(Asp)m - Pro, Glu - (Asp)m - Pro or Ile - Glu - Gly - Arg,

in which m is 1, 2 or 3, or Y consists of these amino acids or one of these groups.

It is of course advantageous for the unwanted (host-derived) portion of the fusion protein to be as small as possible, since the cell then produces little "ballast" and the yield of the desired protein is correspondingly high. In addition, cleaving off the unwanted portion then gives rise to fewer by-products, which simplifies work-up. Working against this is the fact that the (presumed) "protective function" of the unwanted portion can be expected only above a certain size. Surprisingly, it has been found that the segment of the D protein selected according to the invention fulfils this task even though it comprises only about 70 amino acids.

In many cases, above all in the preferred embodiment in which X stands for Lys-Ala and the N-terminus contains this sequence, an insoluble fusion protein is formed. It can be separated easily from the soluble proteins, which greatly simplifies work-up and increases the yield. The formation of an insoluble fusion protein is surprising because, on the one hand, the bacterial portion is relatively small at only about 70 amino acids and, on the other hand, it is part of a protein that is present in solution in the host cell.

"About 70 amino acids in the region of amino acid sequence 23 to 93 of the D-peptide" means that variations are possible in a manner known per se, i.e. individual amino acids can be omitted, replaced or exchanged without significantly changing the properties of the fusion protein according to the invention.

The desired eukaryotic protein is preferably a biologically active protein such as a hirudin, or a precursor of such a protein, such as human proinsulin.

The fusion protein is obtained by expression in a suitable system and, in the particularly preferred embodiment, is isolated after disruption of the host cells from the precipitate, in which it is enriched owing to its low solubility. This allows a simple separation from the soluble constituents of the cell.

Suitable host cells are all cells for which expression systems are known, i.e. mammalian cells and microorganisms, preferably bacteria and in particular E. coli, since the bacterial portion of the fusion protein is a host-derived protein of E. coli.

The DNA sequence that codes for the fusion protein according to the invention is incorporated in a known manner into a vector that ensures good expression in the chosen expression system.

In bacterial hosts, the promoter and operator are conveniently chosen from the group lac, tac, PL or PR of phage λ, hsp, omp, or a synthetic promoter as described, for example, in German published specification No. 3 430 683 (European patent publication 0 173 149).

Particularly suitable is a vector that contains the following elements from the trp operon of E. coli: the promoter, the operator and the ribosome binding site of the L-peptide. In the coding region, these are particularly conveniently followed by the first three amino acids of this L-peptide and then, via a short amino acid sequence, by amino acids 23 to 93 of the D protein of the trp operon.

The intermediate sequence Y, which permits the desired polypeptide to be cleaved off, depends on the composition of that polypeptide. If, for example, it contains no methionine, Y can stand for Met, after which chemical cleavage with cyanogen bromide follows. If the linker Y carries cysteine at its carboxy terminus, or if Y stands for Cys, an enzymatic cysteine-specific cleavage or a chemical cleavage, for example after S-cyanylation, can be carried out. If the bridge member Y carries tryptophan at its carboxy terminus, or if Y stands for Trp, chemical cleavage with N-bromosuccinimide can be carried out. If Y stands for Asp-Pro, proteolytic cleavage can be carried out in a manner known per se (D. Piszkiewicz et al., Biochemical and Biophysical Research Communications 40 (1970) 1173-1178). As was found, the Asp-Pro bond can also be made acid-labile when Y is

(Asp)m - Pro or Glu - (Asp)m - Pro

where m is 1, 2 or 3. This gives cleavage products that begin at the N-terminus with Pro or end at the C-terminus with Asp, respectively.

Examples of enzymatic cleavages are also known, and modified enzymes with improved specificity can also be used (C. S. Craik et al., Science 228 (1985) 291-297). If the desired eukaryotic peptide is human proinsulin, a peptide sequence can be chosen as sequence Y in which an amino acid cleavable with trypsin (Arg, Lys) is bound to the N-terminal amino acid (Phe) of proinsulin, for example Ala-Ser-Met-Thr-Arg, since in this case the arginine-specific cleavage can be carried out with the protease trypsin. If the desired protein does not contain the amino acid sequence Ile-Glu-Gly-Arg, cleavage with factor Xa is also possible (EP-A 0 161 937).
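The dependence of Y on the composition of the desired protein can be sketched as a simple screening step. The Python fragment below is a hedged illustration. The function name `linker_options` and the test sequences are hypothetical. Only the methionine and Ile-Glu-Gly-Arg conditions are stated explicitly in the text; applying the same "must be absent from the target" condition to tryptophan is an assumption made here by analogy.

```python
def linker_options(target):
    """List (Y, cleavage reagent) pairs compatible with a target protein.

    target is the desired protein in one-letter amino acid code.
    """
    options = []
    if "M" not in target:
        # Stated in the text: target without Met allows Y = Met.
        options.append(("Met", "cyanogen bromide"))
    if "W" not in target:
        # Assumption by analogy: target without Trp allows Y = Trp.
        options.append(("Trp", "N-bromosuccinimide"))
    if "IEGR" not in target:
        # Stated in the text: target without Ile-Glu-Gly-Arg allows factor Xa.
        options.append(("Ile-Glu-Gly-Arg", "factor Xa"))
    return options


# Made-up sequences, used only to exercise the function.
print(linker_options("GASTK"))
print(linker_options("GAMWK"))
```

A target containing both Met and Trp, as in the second call, is left with the factor Xa site only.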

In designing the sequence Y, it is also possible to take the synthetic circumstances into account and to incorporate suitable cleavage sites for restriction enzymes. The DNA sequence corresponding to amino acid sequence Y can therefore also serve as a linker or adapter.

The fusion protein according to the invention is advantageously expressed under the control of the trp operon of E. coli. A DNA segment containing the promoter and operator of the trp operon is commercially available. Expression of proteins under the control of the trp operon has been described many times, for example in EP-A2 0 036 776. The trp operon can be induced by the absence of L-tryptophan and/or the presence of indolyl-3-acrylic acid in the medium.

Under the control of the trp operator, the coding sequence for the L-peptide, which consists of 14 amino acids, is transcribed first. This L-peptide contains L-tryptophan at positions 10 and 11. The rate of protein synthesis of the L-peptide determines whether the subsequent structural genes are also translated or whether protein synthesis is interrupted. When L-tryptophan is scarce, the L-peptide is synthesized slowly because of the low concentration of tRNA for L-tryptophan, and the following proteins are synthesized. At high concentrations of L-tryptophan, by contrast, the corresponding section of the mRNA is read quickly and protein biosynthesis is interrupted because the mRNA adopts a terminator-like structure (C. Yanofsky et al., see above).

The frequency with which an mRNA is translated is strongly influenced by the nature of the nucleotides in the vicinity of the start codon. When a fusion protein is expressed by means of the trp operon, it therefore appears advantageous to use the nucleotides for the first amino acids of the L-peptide at the beginning of the structural gene for the fusion protein. In the preferred embodiment of the invention in the trp system, the nucleotides for the first three amino acids of the L-peptide (hereinafter the L'-peptide) were chosen as codons for the N-terminal amino acids of the fusion protein.

The invention therefore also relates to vectors, preferably plasmids, for the expression of fusion proteins, in which the DNA of the vectors has the following features, viewed from the 5' end (in suitable arrangement and in phase): a promoter, an operator, a ribosome binding site and the structural gene for the fusion protein, the latter containing amino acid sequence I (annex) before the sequence of the desired protein. Before the structural gene, or as its first triplet, is the start codon (ATG), optionally together with further codons for other amino acids arranged between the start codon and the D' sequence or between the D' sequence and the gene for the desired protein. The DNA sequence before the structural gene is chosen according to the amino acid composition of the desired protein, so that the desired protein can be cleaved from the fusion protein.

When the fusion protein according to the invention is expressed, it may prove expedient to alter individual triplets for the first amino acids after the ATG start codon in order to prevent possible base pairing at the mRNA level. Such alterations, as well as alterations, omissions or additions of individual amino acids in the D' protein, are familiar to the person skilled in the art.

Since smaller plasmids offer several advantages, a preferred embodiment of the invention consists in eliminating, from plasmids derived from pBR 322, a DNA segment containing the structural gene for tetracycline resistance. Preferably, the segment from the HindIII site at position 29 to the PvuII site at position 2066 is removed. With these plasmids it is particularly advantageous to remove a somewhat larger DNA segment by making use of the PvuII site at the beginning (in the reading direction) of the trp operon, which lies in a non-essential part. The large fragment produced with the two PvuII sites can then be ligated directly. The resulting plasmid, reduced by about 2 kbp, gives an increase in expression that may be attributable to an increased copy number in the host cell.
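The size of the deletion follows from the two site positions given above. As a rough check, sketched in Python with a hypothetical helper name `deleted_segment_bp` and treating the positions simply as coordinates on the pBR 322 map:

```python
def deleted_segment_bp(start_pos, end_pos):
    """Approximate length in bp of the segment between two map positions."""
    if end_pos <= start_pos:
        raise ValueError("end position must lie after the start position")
    return end_pos - start_pos


HINDIII_POS = 29    # HindIII site, position as given in the text
PVUII_POS = 2066    # PvuII site, position as given in the text

removed = deleted_segment_bp(HINDIII_POS, PVUII_POS)
print(removed, "bp, i.e. about", round(removed / 1000), "kbp")
```

The difference of 2037 bp agrees with the stated reduction of about 2 kbp. The exact fragment length depends on where each enzyme cuts within its recognition site, which this estimate ignores.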

In the following examples, the invention is explained in more detail.

Example 1

a) Chromosomal E. coli DNA is cut with HinfI, and the 492 bp fragment is isolated, which contains from the trp operon the promoter, the operator, the structural gene for the L-peptide, the attenuator and the codons for the first six amino acids of the trp-E structural gene. This fragment is filled in with deoxynucleotide triphosphates using Klenow polymerase, joined at both ends to an oligonucleotide containing a recognition site for HindIII, and recut with HindIII. The resulting HindIII fragment is ligated into the HindIII site of pBR 322. This gives the plasmid ptrpE2-1 (J. C. Edmann et al., Nature 291 (1981) 503-506), which is converted as described into the plasmid ptrpL1. Using the synthetic oligonucleotides (N1) and (N2)

which complement each other to form the double-stranded oligonucleotide

(N3)

the DNA sequence for the first three amino acids of the L-peptide is inserted into the ClaI site of the plasmid ptrpL1, and at the same time a restriction site (StuI) is created for the insertion of further DNA (Fig. 1). The plasmid ptrpL1 is reacted with the enzyme ClaI according to the manufacturer's instructions, the mixture is extracted with phenol, and the DNA is precipitated with ethanol. The opened plasmid is treated with alkaline phosphatase from E. coli to remove the phosphate groups at the 5' ends. The synthetic oligonucleotide is phosphorylated at the 5' ends and inserted with T4 DNA ligase into the opened, phosphatase-treated plasmid. After the ligase reaction is complete, E. coli 294 is transformed, and the transformants are selected for Amp resistance and the presence of a StuI restriction site.

About 80% of the clones obtained show the expected restriction site; the nucleotide sequence shown in Fig. 1 was confirmed by sequence analysis. This gives the plasmid pH120/14, which carries, after the ribosome binding site for the L-peptide, the nucleotide triplets for the first three amino acids of the L-peptide (L'-peptide), followed by a StuI site that permits the insertion of further DNA and thereby the formation of fusion proteins with the first three amino acids of the L-peptide.

b) The chemical synthesis of such oligonucleotides is described below, taking the oligonucleotide (N1) used above as an example:

Following the method of M. J. Gait et al., Nucleic Acids Research 8 (1980) 1081-1096, the nucleoside at the 3' end, in this case guanosine, is covalently bound via the 3'-hydroxy function to a glass bead support (CPG = controlled pore glass) functionalized with LCAA (= long chain alkyl amine) (from Pierce). For this purpose the guanosine, as N-2-isobutyroyl-3'-O-succinoyl-5'-dimethoxytrityl ether, is reacted with the modified support in the presence of N,N'-dicyclohexylcarbodiimide and 4-dimethylaminopyridine, whereby the free carboxyl group of the succinoyl residue acylates the amino group of the long-chain amine on the support.

In the subsequent synthesis steps, the base components are used as 5'-O-dimethoxytrityl-nucleoside-3'-phosphorous acid monomethyl ester dialkylamide or chloride, with adenine present as the N6-benzoyl compound, cytosine as the N4-benzoyl compound, guanine as the N2-isobutyryl compound, and thymine, which contains no amino group, without a protecting group. 40 mg of the support, containing 1 µmol of bound guanosine, are treated successively with the following reagents:

a) methylene chloride,

b) 10% trichloroacetic acid in methylene chloride,

c) methanol,

d) tetrahydrofuran,

e) acetonitrile,

f) 15 µmol of the corresponding nucleoside phosphite and 50 µmol of tetrazole in 0.3 ml of anhydrous acetonitrile (5 minutes),

g) 20% acetic anhydride in tetrahydrofuran with 40% lutidine and 10% dimethylaminopyridine (2 minutes),

h) tetrahydrofuran,

i) tetrahydrofuran with 20% water and 40% lutidine,

j) 3% iodine in collidine/water/tetrahydrofuran in a volume ratio of 5:4:1,

k) tetrahydrofuran and

l) methanol.

Here "phosphite" means the deoxyribose-3'-monophosphorous acid monomethyl ester in which the third valence is saturated by chlorine or a tertiary amino group, for example a diisopropylamino residue. The yields of the individual synthesis steps can be determined spectrophotometrically after each detritylation reaction b) by measuring the absorption of the dimethoxytrityl cation at a wavelength of 496 nm.
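One common way to turn such absorbance readings into yields is to compare the reading of each detritylation with that of the preceding one. The Python sketch below illustrates this under stated assumptions: the absorbance is taken to be proportional to the amount of dimethoxytrityl cation released, all samples are measured at equal dilution, and the readings themselves are invented numbers, not data from this example. The function names are hypothetical.

```python
import math


def stepwise_yields(absorbances):
    """Yield of each coupling cycle as the ratio of successive readings."""
    if len(absorbances) < 2:
        raise ValueError("at least two detritylation readings are required")
    return [later / earlier for earlier, later in zip(absorbances, absorbances[1:])]


def overall_yield(absorbances):
    """Product of the stepwise yields (equals last reading / first reading)."""
    return math.prod(stepwise_yields(absorbances))


# Invented absorbance readings at 496 nm for three successive detritylations.
readings = [0.50, 0.49, 0.48]
print(stepwise_yields(readings))
print(overall_yield(readings))
```

Because the stepwise yields multiply, even small losses per cycle accumulate over a long oligonucleotide, which is why the per-step monitoring is useful.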

After the synthesis of the oligonucleotide is complete, the methyl phosphate protecting groups of the oligomer are removed with p-thiocresol and triethylamine.

The oligonucleotide is then cleaved from the solid support by treatment with ammonia for 3 hours. Treatment of the oligomer with concentrated ammonia for 2 to 3 days removes the amino protecting groups from the bases quantitatively. The crude product obtained is purified by high-pressure liquid chromatography (HPLC) or by polyacrylamide gel electrophoresis.

The other oligonucleotides are synthesized in the same way.

c) The plasmid ptrpE5-1 (R.A. Hallewell et al., Gene 9 (1980) 27-47) is digested with the restriction enzymes HindIII and SalI according to the manufacturer's instructions, and the approx. 602 bp DNA fragment is removed. The synthesized oligonucleotides (N4) and (N5) are phosphorylated, incubated together at 37°C and added by means of DNA ligase to the blunt-ended DNA for proinsulin (W. Wetekam et al., Gene 19 (1982) 179-183). After digestion with HindIII and SalI, the now extended proinsulin DNA is covalently inserted into the opened plasmid with the enzyme T4 DNA ligase (Fig. 2), giving the plasmid pH106/4.

The plasmid pH106/4 is first digested once more with SalI, the overhanging ends are filled in with Klenow polymerase to give blunt ends, and the DNA is then incubated with the enzyme MstI. A DNA fragment of approx. 500 bp is isolated, which contains the entire coding part of the proinsulin together with a section of approx. 210 bp of the D protein from the trp operon of E. coli. The DNA fragment is blunt-ended and is inserted into the StuI site of the plasmid pH120/14, giving the plasmid pH154/25 (Fig. 3). This plasmid is suitable for expression, under the control of the trp operon, of a fusion protein in which the L' and D' peptides are followed by the amino acid sequence Ala-Ser-Met-Thr-Arg, to which the amino acid sequence of proinsulin is attached.

Example 2

The plasmid pH154/25 (Fig. 3) is digested with the restriction enzymes BamHI and XmaIII. The overhanging ends are filled in with Klenow polymerase and joined with T4 DNA ligase. This gives the plasmid pH254 (Fig. 4), which is suitable for expression of a fusion protein with the amino acid sequence L',D'-proinsulin under the control of the trp promoter. The plasmid is somewhat smaller than pH154/25, which may be advantageous.
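The fill-in and blunt-end joining step can be illustrated with a small Python sketch. It assumes the standard recognition sequences BamHI G^GATCC and XmaIII C^GGCCG, both of which leave 5' overhangs, and it assumes for illustration that the end left of the BamHI cut is joined to the end right of the XmaIII cut; the patent does not state the orientation, and the function name is hypothetical.

```python
def filled_ends(site, cut):
    """Blunt ends produced by Klenow fill-in of a 5' overhang.

    site: palindromic recognition sequence (top strand, 5'->3')
    cut:  number of bases before the cut on the top strand
    Returns (end of the left fragment, start of the right fragment),
    both read on the top strand.
    """
    left = site[:len(site) - cut]   # recessed 3' end extended across the overhang
    right = site[cut:]              # overhang made double-stranded
    return left, right

# Assumed orientation: left end from BamHI, right end from XmaIII
bamhi_left, _ = filled_ends("GGATCC", 1)
_, xmaiii_right = filled_ends("CGGCCG", 1)
junction = bamhi_left + xmaiii_right
print(junction)
```

Reading the junction on the top strand shows which recognition sequences survive the ligation, which matters when a site is to be reused in later cloning steps.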

Example 3

Incubation of the plasmid pH254 (Example 2; Fig. 4) with the restriction enzymes MluI and SalI releases a DNA section of 280 bp, which is separated off. The residual plasmid is converted to the blunt-ended form with Klenow polymerase and recircularized covalently with DNA ligase. This gives the plasmid pH255 (Fig. 4), which is suitable for insertion of a structural gene at one of the restriction sites MluI, SalI and EcoRI. Under induction conditions, a fusion protein with the L',D' protein is formed. Further restriction sites can of course be introduced into the plasmid pH255 by means of suitable linkers.

Example 4

The plasmid pH154/25 (Fig. 3) is incubated with the enzymes MluI and EcoRI, and the released DNA fragment (approx. 300 bp) is removed. The ends of the residual plasmid are filled in with Klenow polymerase, and the ring is closed by the action of DNA ligase. The resulting plasmid pH256 (Fig. 5) can be used for insertion of structural genes into the EcoRI site.

Example 5

Removal of a 600 bp fragment from the plasmid pH256 (Example 4; Fig. 5) with the restriction enzymes BamHI and NruI gives the plasmid pH257 (Fig. 5). For this purpose, pH256 is first incubated with BamHI and blunt ends are produced with Klenow polymerase. After incubation with NruI and removal of the 600 bp fragment, pH257 is formed by incubation with DNA ligase.

Example 6

Insertion of the lac repressor (P.J. Farabaugh, Nature 274 (1978) 765-769) into the plasmid pKK 177-3 (Amann et al., Gene 25 (1983) 167) gives the plasmid pJF118. This is digested with EcoRI and SalI, and the residual plasmid is isolated.

From the plasmid pH106/4 (Fig. 2), a fragment of approx. 495 bp is obtained by digestion with SalI and incubation with MstI.

The synthetically prepared oligonucleotides (N6) and (N7) are phosphorylated and added with DNA ligase to the blunt-ended DNA fragment. Digestion with EcoRI and SalI exposes the overhanging ends, which permits ligation into the opened plasmid pJF118.

After transformation of the hybrid plasmid prepared in this way into E. coli 294, the correct clones are selected on the basis of the size of the restriction fragments. This plasmid is designated pJ120 (Fig. 6).

The fusion protein is expressed in shake flasks as follows: an overnight culture of E. coli 294 transformants containing the plasmid pJ120, grown in LB medium (J.H. Miller, "Experiments in Molecular Genetics", Cold Spring Harbor Laboratory, 1972) with 50 µg/ml ampicillin, is used to start a fresh culture at a dilution of approx. 1:100, and growth is followed by OD measurement. At OD = 0.5, isopropyl-β-D-thiogalactopyranoside (IPTG) is added to the culture to a concentration of 1 mM, and the bacteria are harvested after 150 to 180 minutes. The bacteria are boiled for 5 minutes in a buffer mixture (7 M urea, 0.1% SDS, 0.1 M sodium phosphate, pH 7.0) and the samples are applied to an SDS gel electrophoresis plate. After electrophoresis, bacteria containing the plasmid pJ120 give a protein band which corresponds to the size of the expected fusion protein and which reacts with antibodies against insulin. After isolation of the fusion protein, the expected proinsulin derivative can be released by cyanogen bromide cleavage. After disruption of the bacteria (French press; "Dyno-Mühle") and centrifugation, the L',D'-proinsulin fusion protein is found in the sediment, so that considerable amounts of the other proteins can already be removed with the supernatant.

The induction conditions given apply to shake cultures; for larger fermentations, correspondingly altered OD values and, where appropriate, slightly altered IPTG concentrations may be expedient.
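The induction arithmetic can be worked through as follows. This is a minimal sketch, assuming an IPTG molar mass of 238.3 g/mol and a hypothetical 100 mM stock solution and 100 ml culture; none of these stock or volume figures come from the patent, and the function names are illustrative.

```python
IPTG_MOLAR_MASS = 238.3  # g/mol

def iptg_mass_mg(volume_ml, conc_mM):
    """Mass of IPTG (mg) contained in volume_ml of a conc_mM solution."""
    return conc_mM * volume_ml / 1000.0 * IPTG_MOLAR_MASS

def stock_volume_ml(culture_ml, final_mM, stock_mM):
    """Stock volume to add so the final concentration is final_mM.

    Solves stock_mM * v = final_mM * (culture_ml + v) for v, so the
    dilution caused by the added stock itself is taken into account.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return culture_ml * final_mM / (stock_mM - final_mM)

# Hypothetical shake-flask example: 100 ml culture, 1 mM final, 100 mM stock
volume = stock_volume_ml(100.0, 1.0, 100.0)
mass = iptg_mass_mg(100.0, 1.0)
print(round(volume, 3), round(mass, 2))
```

For larger fermentations, where the text notes that the IPTG concentration may be adjusted, only the arguments change.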

Example 7

An overnight culture of E. coli 294 transformants containing the plasmid pH154/25 (Fig. 3) is prepared in LB medium with 50 µg/ml ampicillin and, the next morning, diluted approx. 1:100 in M9 medium (J.H. Miller, see above) with 2000 µg/ml casamino acids and 1 µg/ml thiamine. At OD = 0.5, indolyl-3-acrylic acid is added to a final concentration of 15 µg/ml. After incubation for 2 to 3 hours, the bacteria are centrifuged off. SDS gel electrophoresis shows, at the position expected for the fusion protein, a very distinct protein band which reacts with antibodies against insulin. After disruption of the bacteria and centrifugation, the L',D'-proinsulin fusion protein is found in the sediment, so that in this case too considerable amounts of the other proteins are removed with the supernatant.

Here too, the induction conditions given apply to shake cultures. Fermentations in larger volumes require altered concentrations of casamino acids or an addition of L-tryptophan.

Example 8

The plasmid pH154/25 (Fig. 3) is opened with EcoRI, and the overhanging single DNA strands are filled in with Klenow polymerase. The DNA obtained in this way is incubated with the enzyme MluI, and the DNA coding for insulin is cut out of the plasmid. This fragment is separated from the residual plasmid by gel electrophoresis, and the residual plasmid is isolated.

The plasmid according to Fig. 3 of German Offenlegungsschrift 3 429 430 (European patent publication EP-A1 0 171 024) is digested with the restriction enzymes AccI and SalI, and the DNA fragment containing the hirudin sequence is separated off. After the overhanging ends of the SalI cleavage site have been filled in with Klenow polymerase, the DNA segment is ligated with the synthetic DNA of formula (N8).

The ligation product is incubated with MluI. After inactivation of the enzyme at 65°C, the DNA mixture is treated with bovine alkaline phosphatase for 1 hour at 37°C. The phosphatase and the restriction enzyme are then removed from the mixture by phenol extraction, and the DNA is purified by ethanol precipitation. The DNA treated in this way is inserted with T4 ligase into the opened residual plasmid pH154/25, giving the plasmid pK150, which is characterized by restriction analysis and DNA sequencing according to Maxam and Gilbert (Fig. 7).

Example 9

E. coli 294 bacteria containing the plasmid pK150 (Fig. 7) are grown overnight at 37°C in LB medium with 30 to 50 µg/ml ampicillin. The culture is diluted 1:100 with M9 medium containing 2000 µg/ml casamino acids and 1 µg/ml thiamine and incubated at 37°C with constant mixing. At an OD of 0.5 or 1, indolyl-3-acrylic acid is added to a final concentration of 15 µg/ml, and incubation is continued for 2 to 3 hours or 16 hours respectively. The bacteria are then centrifuged off and disrupted under pressure in 0.1 M sodium phosphate buffer (pH 6.5). The sparingly soluble proteins are centrifuged off and analyzed by SDS-polyacrylamide gel electrophoresis. Cells whose trp operon has been induced are found to contain, in the range below 20,000 daltons but above 14,000 daltons, a new protein that is not found in non-induced cells. After isolation of the fusion protein and reaction with cyanogen bromide, hirudin is released.
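The observed mass window can be compared with a rough estimate of the fusion protein size. The sketch below assumes an average residue mass of about 110 Da, a D' portion of about 70 residues (as in the general formula) and hirudin with 65 residues; the N-terminal and linker lengths are hypothetical placeholders, since the exact composition of the pK150 product is not given in this passage.

```python
AVERAGE_RESIDUE_MASS = 110.0  # Da, rule-of-thumb average for amino acid residues

def approx_mass(residue_counts):
    """Approximate protein mass in daltons from residue counts per segment."""
    return sum(residue_counts.values()) * AVERAGE_RESIDUE_MASS

# Segment lengths: d_prime and hirudin as stated above,
# n_terminal and linker are assumed values for illustration only
parts = {"n_terminal": 10, "d_prime": 70, "linker": 5, "hirudin": 65}
mass = approx_mass(parts)
print(mass, 14_000 < mass < 20_000)
```

An estimate of this kind only indicates where to look on the gel; the band is identified by comparing induced and non-induced cells.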

Example 10

The following describes constructions which allow DNA sequences containing as many recognition sites as possible for different restriction enzymes to be introduced upstream of the 5' end of the trp-D sequence, so that the trp-D sequence can be incorporated as universally as possible into the various prokaryotic expression systems.

The plasmids pUC12 and pUC13 (Pharmacia P-L Biochemicals, 5401 St. Goar: "The Molecular Biology Catalogue" 1983, Appendix, page 89) contain a polylinker sequence. In the plasmid pUC13, the MstI-HindIII trp fragment from the plasmid pH106/4 (Fig. 2), fused with the HindIII-hirudin-SacI fragment from the plasmid pK150 (Fig. 7), is to be inserted between the restriction sites for XmaI and SacI.

For this purpose, the DNA of the plasmid pUC13 is first treated with the restriction enzyme XmaI. The ends of the linearized plasmid are filled in by a Klenow polymerase reaction. After ethanol precipitation, the DNA is treated with the enzyme SacI and precipitated again from the reaction mixture with ethanol. The DNA is then reacted in an aqueous ligation mixture with the MstI-HindIII trp-D fragment isolated from the plasmid pH106/4, the HindIII-SacI hirudin fragment isolated from the plasmid pK150, and T4 DNA ligase.

The plasmid pK160 prepared in this way contains, immediately before the start of the trp-D sequence, a multiple restriction enzyme recognition sequence comprising the cleavage sites for the enzymes XmaI, SmaI, BamHI, XbaI, HincII, SalI, AccI, PstI and HindIII. This construction also provides an EcoRI cleavage site downstream of the 3' end of the hirudin sequence (Fig. 8).

Example 11

The plasmid pH131/5 is prepared (according to German patent application P 35 14 113.1, Example 1, Fig. 1) as follows:

The plasmid ptrpL1 (J.C. Edman et al., Nature 291 (1981) 503-506) is opened with ClaI and ligated with the synthetically prepared, self-complementary oligonucleotide (N9).

The plasmid pH131/5 obtained in this way (Fig. 9) is opened at the newly introduced cleavage site for the restriction enzyme NcoI, and the resulting overhanging single-stranded ends are filled in by a Klenow polymerase reaction. The linearized, blunt-ended DNA is then cut with the enzyme EcoRI, and the larger of the DNA sequences formed is separated from the smaller one by ethanol precipitation. The residual plasmid DNA of pH131/5 obtained in this way is then ligated by a T4 ligase reaction with a fragment from pK160 coding for trp-D'-hirudin. This fragment was cleaved from the plasmid pK160 by opening the plasmid with HincII and EcoRI. The fragment is separated from the residual plasmid by gel electrophoresis and then eluted from the gel material. The ligation product is transformed into E. coli K12. The clones containing plasmid DNA are isolated and characterized by restriction analysis and DNA sequence analysis. The plasmid pK170 prepared in this way contains, fused to the trp operator, a DNA sequence which codes for Met-Asp-Ser-Arg-Gly-Ser-Pro-Gly-trp-D'-(hirudin) (Fig. 9).
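The N-terminal part of the pK170 product can be checked against the general formula Met-Xn-D'-Y-Z, in which X is a sequence of 1 to 12 amino acids. The following Python sketch parses the prefix given above; the helper name and the three-letter lookup table are illustrative and cover only the residues that occur here.

```python
# Three-letter to one-letter codes for the residues used in this prefix
THREE_TO_ONE = {"Met": "M", "Asp": "D", "Ser": "S", "Arg": "R", "Gly": "G", "Pro": "P"}

def x_segment(prefix):
    """Return the residues between the start Met and the D' portion."""
    residues = prefix.split("-")
    if residues[0] != "Met":
        raise ValueError("fusion protein must start with Met")
    return residues[1:]

prefix = "Met-Asp-Ser-Arg-Gly-Ser-Pro-Gly"  # sequence preceding trp-D' in pK170
x = x_segment(prefix)
one_letter = "".join(THREE_TO_ONE[r] for r in x)
print(len(x), one_letter, 1 <= len(x) <= 12)
```

With seven residues, the X segment of this construct falls inside the 1 to 12 range of the general formula.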

Example 12

The plasmid pJF118 (Example 6) is opened with EcoRI, and the overhanging DNA ends are converted to blunt ends by a Klenow polymerase reaction. The DNA treated in this way is then cut with the enzyme SalI and separated from the short EcoRI-SalI fragment by gel electrophoresis.

The plasmid pK170 (Example 11) is cleaved with NcoI, and the overhanging ends are converted to blunt ends with Klenow polymerase. The plasmid DNA is separated from the reaction mixture by ethanol precipitation and treated with the enzymes HindIII and BamHI. Two of the resulting fragments are isolated, namely the NcoI (filled-in end)-trpD'-HindIII fragment and the HindIII-hirudin-BamHI fragment (German Offenlegungsschrift 3 429 430). Both fragments are isolated after separation by gel electrophoresis.

In addition, the BamHI-SalI hirudin II fragment according to Fig. 2 of German Offenlegungsschrift 3 429 430 is isolated. In a ligation reaction, the four fragments, namely the pJF118 residual plasmid, the NcoI-trpD'-HindIII fragment, the HindIII-hirudin-BamHI fragment and the hirudin II fragment, are then reacted with one another, and the plasmid pK180 obtained (Fig. 10) is transformed into E. coli K12 W3110. Correct plasmids are identified by the presence of an EcoRI-trpD'-hirudin-SalI fragment in the plasmid DNA. The trpD'-hirudin sequence is now linked to the tac promoter. The fusion protein is expressed as in Example 6.

Example 13

In the plasmids derived from pBR322, such as pH120/14 (Example 1, Fig. 1), pH154/25 (Example 1, Fig. 3), pH256 (Example 4, Fig. 5), pK150 (Example 8, Fig. 7) and pK170 (Example 11, Fig. 9), there is, clockwise in the figures, between the start codon of the fusion protein and the nearest HindIII site (corresponding to HindIII at position 29 in pBR322), an additional PvuII site in the region of the fragment containing the trp promoter and operator, although outside the promoter region.

It has now been found that the yield of a cloned protein (or fusion protein) can be increased considerably by removing the DNA section bounded by this PvuII site and the PvuII site corresponding to position 2066 in pBR322.

The following describes, by way of example, the shortening of the plasmid pH154/25 to pH154/25*, which can be carried out in the same way for the other plasmids (the shortened plasmids being marked with an asterisk in each case). pH154/25 is digested with PvuII (according to the manufacturer's instructions), giving three fragments:

Fragment 1: from the PvuII restriction site of proinsulin to the PvuII restriction site corresponding to position 2066 in pBR322,

Fragment 2: from the PvuII restriction site near the trp promoter to the PvuII site of proinsulin, and

Fragment 3: from the PvuII site near the trp promoter fragment to the PvuII site corresponding to position 2066 in pBR322.

The fragments can be separated by electrophoresis on agarose and then isolated (Maniatis, "Molecular Cloning", Cold Spring Harbor 1982).
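The fragment pattern of such a digest can be predicted from the positions of the cleavage sites on the circular plasmid. The Python sketch below uses a hypothetical plasmid length and hypothetical cut positions, since the map coordinates of pH154/25 are not given in the text; it only illustrates that three sites on a circle yield three fragments whose sizes add up to the plasmid length.

```python
def circular_fragments(plasmid_length, cut_positions):
    """Fragment sizes (bp) after cutting a circular plasmid at the given positions."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_length]  # uncut circle
    sizes = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]  # next site, wrapping around the origin
        # A single site gives a difference of 0, which means the full-length linear molecule
        sizes.append((end - start) % plasmid_length or plasmid_length)
    return sizes

# Hypothetical 5000 bp plasmid with three hypothetical PvuII positions
sizes = circular_fragments(5000, [400, 1100, 2900])
print(sizes, sum(sizes))
```

Comparing predicted sizes with the bands on the agarose gel is what allows the fragments to be assigned before they are isolated and religated.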

Fragments 1 and 2 are joined under blunt-end conditions with the enzyme T4 DNA ligase. After transformation into E. coli 294, the colonies are examined to determine which contain a plasmid with the complete proinsulin sequence, and thus have the fragments in the desired arrangement. The plasmid pH154/25* is shown in Fig. 11.

On expression, carried out as described in the preceding examples, a considerable increase in the proportion of fusion protein is observed.

Eksempel 14 Example 14

Plasmidet pH154/25<*>(eksempel 13, fig. 11) nedbrytes med Hindlll og Sali og det lille fragmentet (med proinsulinsekvensen) fraskilles gelelektroforetisk. Det store fragmentet isoleres og ligeres med det syntetiske DNA (N10) The plasmid pH154/25<*> (Example 13, Fig. 11) is digested with HindIII and SalI and the small fragment (with the proinsulin sequence) is separated gel electrophoretically. The large fragment is isolated and ligated with the synthetic DNA (N10)

Plasmidet plntl3 (fig.12) oppstår. The plasmid plntl3 (fig.12) arises.

DNA (N10) koder for en aminosyresekvens som inneholder flere spaltningssteder for en kjemisk spaltning: DNA (N10) codes for an amino acid sequence that contains several cleavage sites for a chemical cleavage:

a) Met for cyanogen bromide,

b) Trp for N-bromosuccinimide (NBS or BSI),

c) Asp-Pro for proteolytic cleavage, the preceding Glu additionally weakening the Asp-Pro bond toward the action of acids.

Introducing this HindIII-SalI linker (N10) into the reading frame of an encoded polypeptide thus provides the stated options for chemical cleavage; the choice among them depends on the amino acid sequence of the desired protein and on its sensitivity to the cleavage reagents.
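The dependence on the amino acid sequence of the desired protein can be sketched in Python: a cleavage option is only suitable if its target residue or motif does not also occur within the desired protein. The rule table, the function name, and the example sequences below are illustrative assumptions and are not taken from the disclosure; sensitivity of the protein to the reagents themselves is not modelled.

```python
# Cleavage options offered by the linker, mapped to the residue or
# motif (one-letter amino acid code) at which each one acts.
CLEAVAGE_RULES = {
    "cyanogen bromide": "M",      # acts at Met
    "N-bromosuccinimide": "W",    # acts at Trp
    "Asp-Pro cleavage": "DP",     # acts at the Asp-Pro bond
}


def usable_cleavages(target: str) -> list[str]:
    """Return the options whose target motif is absent from the desired protein."""
    seq = target.upper()
    return [name for name, motif in CLEAVAGE_RULES.items() if motif not in seq]


# Hypothetical target sequences, made up for illustration.
no_conflicts = usable_cleavages("GAVLKSTE")   # all three options remain
has_trp = usable_cleavages("GAWLK")           # Trp present: NBS is excluded
all_blocked = usable_cleavages("MKWDPA")      # Met, Trp and Asp-Pro all present
```

A target containing an internal Trp, for example, would rule out N-bromosuccinimide while leaving the other two options open.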

The figures are not drawn to scale.

Amino acid sequence I

DNA sequence I

Claims

1. Fusion protein, characterized by the general formula Met-Xn-D'-Y-Z, in which n is zero or 1, X is a sequence of 1 to 12 genetically codable amino acids, D' is a sequence of about 70 amino acids in the region of amino acids 23-93 of the D-peptide in the trp operon of E. coli, Y is a sequence of one or more genetically codable amino acids that permits the following amino acid sequence Z to be cleaved off, and Z is a sequence of genetically codable amino acids.

2. Fusion protein according to claim 1, characterized in that n is 1 and X consists of up to 5 amino acids, with Lys-Ala preferably N-terminal in X, and/or that Y contains C-terminally Met, Cys, Trp, Arg or Lys or one of the groups (Asp)m-Pro, Glu-(Asp)m-Pro or Ile-Glu-Gly-Arg, in which m is 1, 2 or 3, or consists of one of these amino acids or one of these groups, and/or that Z is the amino acid sequence of human proinsulin or of a hirudin.

3. DNA, characterized in that it codes for the fusion protein according to claim 1 or 2, where the following sequence codes for D': for X codes the DNA sequence (coding strand):

in the nucleotide sequence of Y that codes for the amino acids, Y C-terminally contains Met, Cys, Trp, Arg or Lys or one of the groups

wherein m means 1, 2 or 3, or consists of these amino acids or one of these groups.