US20240218052A1 - Methods of screening and expression of disulfide-bonded binding polypeptides

Info

Publication number: US20240218052A1
Application number: US18/285,827
Authority: US
Inventors: Duncan McGregor; Ruiqi HUANG; Gabrielle WARNER JENKINS; Vaughn Smider
Original assignee: Applied Biomedical Science Institute
Current assignee: Applied Biomedical Science Institute
Priority date: 2021-05-12
Filing date: 2022-05-11
Publication date: 2024-07-04
Also published as: IL308087A; KR20240007256A; AU2022272307A1; WO2022241058A1; EP4337690A1; JP2024521987A; CA3218571A1

Abstract

The present disclosure relates to methods of producing and screening display libraries of disulfide-bonded binding polypeptides, for instance to identify binding peptides specific for a target molecule. In some embodiments, the binding peptides comprise an ultralong CDR3. The binding peptides can be derived from a bovine antibody comprising an ultralong CDR3, or they can be synthetic or semisynthetic. Also provided herein are display libraries comprising disulfide-bonded binding polypeptides. The present disclosure also relates to methods of producing or expressing soluble disulfide-bonded binding polypeptides, for instance using a suitable host cell. Also provided herein are compositions comprising soluble disulfide-bonded binding polypeptides.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 63/187,931, filed May 12, 2021, and U.S. Provisional Application No. 63/288,992, filed Dec. 13, 2021, the contents of each of which are hereby incorporated by reference in their entirety for all purposes.

STATEMENT OF GOVERNMENT INTEREST

This invention was made with government support under R01 GM105826 and R01 HD088400 awarded by the National Institutes of Health. The government has certain rights in the invention.

INCORPORATION BY REFERENCE OF SEQUENCE LISTING

The present application is being filed along with a Sequence Listing in electronic format. The Sequence Listing is provided as a file entitled 165772000440SEQLIST.txt, created May 10, 2022, which is 128 kilobytes in size. The information in the electronic format of the Sequence Listing is incorporated by reference in its entirety.

FIELD

BACKGROUND

Antibodies are natural proteins that the vertebrate immune system forms in response to foreign substances (antigens), primarily for defense against infection. Antibodies contain complementarity determining regions (CDRs) that mediate binding to a target antigen. Some bovine antibodies have unusually long variable heavy (VH) CDR3 sequences compared to those of other vertebrates. These long CDR3s, which can be up to 70 amino acids long, can form unique domains that protrude from the antibody surface, thereby permitting a unique antibody platform. Improved methods are needed for screening for and producing antibodies or portions thereof containing long CDR3s, as well as for screening for and producing other disulfide-bonded polypeptides.

SUMMARY

Provided herein in some embodiments is a method of preparing a cow ultralong CDR3 antibody display library, the method comprising: (a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library; (b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to a variable lambda light (VL) region selected from the group consisting of VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof; (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.
In some of any embodiments, the VL region is the BLV1H12 VL region.
Provided herein in some embodiments is a method of preparing a cow ultralong CDR3 antibody display library, the method comprising: (a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library; (b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof; (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.
In some of any embodiments, the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises immunizing the cow with a target antigen.
In some of any embodiments, the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles. In some of any embodiments, the amplified display particles are phage display particles. In some of any embodiments, the amplified display particles are phagemid particles. In some of any embodiments, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
Provided herein in some embodiments is a method of preparing a cow ultralong CDR3 antibody phage display library, the method comprising: (a) immunizing a cow with a target antigen; (b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow; (c) amplifying sequences encoding a plurality of VH regions of the IgHV1-7 family from the cDNA template library; (d) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof, and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein; (e) transforming suitable host cells with the plurality of replicable expression vectors; (f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and (g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an scFv.
In some of any embodiments, the BLV1H12 lambda VL region is set forth in SEQ ID NO: 2. In some of any embodiments, the BLV1H12 lambda VL region is a humanized variant of the lambda VL region of BLV1H12. In some of any embodiments, the humanized variant comprises one or more of amino acid replacements S2A, T5N, P8S, A12G, A13S, and P14L based on Kabat numbering, amino acid replacements I29V and N32G in the CDR1 region, and/or amino acid substitution of DNN to GDT in the CDR2 region. In some of any embodiments, the humanized variant comprises the sequence set forth in SEQ ID NO: 107.
In some of any embodiments, the amplified VH region is joined to the BLV1H12 lambda VL region indirectly via a peptide linker. In some of any embodiments, the peptide linker is (Gly₄Ser)₃ (SEQ ID NO: 94).
In some of any embodiments, the plurality of VH regions of the IgHV1-7 family from the cDNA template library are amplified with a forward primer comprising the sequence set forth in SEQ ID NO: 84 and a reverse primer comprising the sequence set forth in SEQ ID NO: 85.
In some of any embodiments, prior to the constructing, the method further comprises performing a size separation on the sequences encoding the plurality of amplified VH regions to enrich for VH regions with an ultralong CDR3. In some of any embodiments, the size separation is performed by gel electrophoresis. In some of any embodiments, the gel electrophoresis is performed using a 1.2%, 1.5%, or 2% agarose gel, optionally using a 2% agarose gel. In some of any embodiments, the size separation comprises separating sequences of, of about, or greater than 550 base pairs in length from the sequences encoding the plurality of amplified VH regions, wherein the sequences of, of about, or greater than 550 base pairs in length comprise sequences encoding VH regions with an ultralong CDR3.
In some of any embodiments, the gel electrophoresis is performed using a 2% agarose gel.
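The gel-based size separation has a simple in silico counterpart when the amplicon sequences are known, for example from sequencing reads. The following is a minimal sketch under that assumption: the function name and the toy sequences are hypothetical, and only the 550 base pair cutoff comes from the text above.

```python
from typing import Iterable, List

# Cutoff taken from the text: amplicons of about 550 bp or longer are
# expected to include VH regions with an ultralong CDR3.
MIN_ULTRALONG_AMPLICON_BP = 550


def enrich_long_amplicons(amplicons: Iterable[str],
                          min_bp: int = MIN_ULTRALONG_AMPLICON_BP) -> List[str]:
    """Keep amplicons that are at least `min_bp` long (hypothetical helper)."""
    return [seq for seq in amplicons if len(seq) >= min_bp]


if __name__ == "__main__":
    # Toy amplicons (placeholders, not real VH sequences).
    pool = ["A" * 420, "C" * 550, "G" * 610]
    kept = enrich_long_amplicons(pool)
    print([len(seq) for seq in kept])  # prints [550, 610]
```

A length filter of this kind only mimics excising the upper band from the gel; it does not confirm that a retained amplicon actually encodes an ultralong CDR3.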
In some of any embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 30% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 40% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 50% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.
In some of any embodiments, the ultralong CDR3 is a peptide sequence of 25-70 amino acids comprising a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
In some of any embodiments, the ultralong CDR3 is 40 to 60 amino acids in length. In some of any embodiments, the ultralong CDR3 is at least 42 amino acids in length. In some of any embodiments, the ultralong CDR3 is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
In some of any embodiments, the ultralong CDR3 comprises at least 4 cysteine residues. In some of any embodiments, the ultralong CDR3 contains 4 cysteine residues. In some of any embodiments, the ultralong CDR3 contains 6, 8, 10, or 12 cysteine residues.
In some of any embodiments, the ultralong CDR3 has at least 2 disulfide bonds. In some of any embodiments, the ultralong CDR3 has 2 disulfide bonds. In some of any embodiments, the ultralong CDR3 has 3, 4 or 5 disulfide bonds. In some of any embodiments, the method further comprises identifying the CDR3-knob sequence in the scFv sequence.
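The length and cysteine criteria above, together with the library-composition thresholds (for example, at least about 30% of particles carrying an ultralong CDR3), can be checked computationally once CDR3 amino acid sequences are in hand. The following is a minimal sketch, assuming CDR3s are given as one-letter amino acid strings. The function names and toy sequences are hypothetical. Counting cysteines gives only an upper bound on the number of disulfide bonds; it does not show that the bonds actually form.

```python
from typing import Iterable


def is_ultralong_cdr3(cdr3: str, min_len: int = 25, max_len: int = 70,
                      min_cys: int = 2, max_cys: int = 12) -> bool:
    """Check the 25-70 amino acid and 2-12 cysteine criteria from the text."""
    n_cys = cdr3.upper().count("C")
    return min_len <= len(cdr3) <= max_len and min_cys <= n_cys <= max_cys


def max_disulfide_bonds(cdr3: str) -> int:
    """Upper bound on disulfide bonds: one bond per pair of cysteines."""
    return cdr3.upper().count("C") // 2


def fraction_ultralong(cdr3s: Iterable[str]) -> float:
    """Fraction of sequences that meet the ultralong CDR3 criteria."""
    cdr3s = list(cdr3s)
    if not cdr3s:
        return 0.0
    return sum(is_ultralong_cdr3(s) for s in cdr3s) / len(cdr3s)


if __name__ == "__main__":
    # Toy sequences (placeholders, not from the sequence listing).
    long_toy = "G" * 10 + "C" + "S" * 8 + "C" + "T" * 8 + "C" + "Y" * 8 + "C" + "W" * 7
    library = [long_toy, "ARDYW", "A" * 30, long_toy]
    print(len(long_toy), max_disulfide_bonds(long_toy))  # prints 45 2
    print(fraction_ultralong(library))                   # prints 0.5
```

With 2 to 12 cysteines, the pair count spans 1 to 6, which matches the 1-6 disulfide bond range stated above.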
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob display library, the method comprising: (a) amplifying sequences encoding a plurality of CDR3-knob only antibodies from a cow antibody variable heavy (VH) chain complementary DNA (cDNA) template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region; (b) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a first nucleic acid sequence encoding an amplified CDR3 knob; (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an amplified CDR3 knob.
In some of any embodiments, the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises immunizing the cow with a target antigen.
In some of any embodiments, the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles. In some of any embodiments, the amplified display particles are phage display particles. In some of any embodiments, the amplified display particles are phagemid particles. In some of any embodiments, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob phage display library, the method comprising: (a) immunizing a cow with a target antigen; (b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow; (c) amplifying sequences encoding a plurality of CDR3-knob only antibodies from the cDNA template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region; (d) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding an amplified CDR3 knob and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein; (e) transforming suitable host cells with the plurality of replicable expression vectors; (f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and (g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an amplified CDR3 knob.
In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 7-11 and 121-130.
In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 7-11. In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 8-11. In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 121-130. In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 123, 127, and 128.
In some of any embodiments, the primers comprise two or more of the primers set forth in SEQ ID NO: 7-11 and 121-130. In some of any embodiments, the primers comprise two or more of the primers set forth in SEQ ID NO: 8-11 and 123, 127, and 128. In some of any embodiments, the primers comprise three or more of the primers set forth in SEQ ID NO: 8-11 and 123, 127, and 128. In some of any embodiments, the primers comprise four or more of the primers set forth in SEQ ID NO: 8-11 and 123, 127, and 128.
In some of any embodiments, the primers comprise a primer consisting of the sequence set forth in SEQ ID NO: 8, a primer consisting of the sequence set forth in SEQ ID NO: 9, a primer consisting of the sequence set forth in SEQ ID NO: 10, and a primer consisting of the sequence set forth in SEQ ID NO: 11.
In some of any embodiments, the primers comprise a primer consisting of the sequence set forth in SEQ ID NO: 123, a primer consisting of the sequence set forth in SEQ ID NO: 127, and a primer consisting of the sequence set forth in SEQ ID NO: 128.
In some of any embodiments, the primers comprise a primer consisting of the sequence set forth in SEQ ID NO: 8, a primer consisting of the sequence set forth in SEQ ID NO: 9, a primer consisting of the sequence set forth in SEQ ID NO: 10, a primer consisting of the sequence set forth in SEQ ID NO: 11, a primer consisting of the sequence set forth in SEQ ID NO: 123, a primer consisting of the sequence set forth in SEQ ID NO: 127, and a primer consisting of the sequence set forth in SEQ ID NO: 128.
In some of any embodiments, the method further comprises identifying the CDR3-knob from the cow antibody variable heavy (VH) chain template sequences. In some of any embodiments, the CDR3-knob is identified from an antibody sequence by an algorithm comprising: identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and determining the sequence of the CDR3-knob, in which: the CDR3-knob has the amino acid sequence length K; the sequence begins at position X+1 and ends at position X+K; and K = L − 2X; wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3.
In some of any embodiments, the antibody sequence is a bovine antibody. In some of any embodiments, the identified CDR3-knob is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.
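The knob-identification rule above (length K equal to L minus 2X, spanning positions X+1 to X+K) can be sketched in a few lines. This is an illustrative reading, not a definitive implementation, and it rests on three assumptions that the text does not settle:

- The input segment already runs from the conserved framework 3 cysteine (position 1) to the conserved framework 4 tryptophan.
- X counts the residues from that cysteine up to, but not including, the first DH-encoded cysteine, so the knob starts at that cysteine.
- When no index is supplied, the first cysteine after position 1 is taken to be the DH-encoded one.

The function name, parameters, and toy sequence are hypothetical.

```python
from typing import Optional


def extract_knob(segment: str, dh_cys_index: Optional[int] = None,
                 extend_n: int = 0, extend_c: int = 0) -> str:
    """Return the CDR3-knob from a segment spanning FR3 Cys to FR4 Trp.

    `dh_cys_index` is the 0-based index of the first DH-encoded cysteine
    within `segment`; if omitted, the next cysteine after the FR3 Cys is
    used (an assumption). `extend_n` / `extend_c` optionally widen the
    result by up to five residues on either side.
    """
    segment = segment.upper()
    if len(segment) < 3 or segment[0] != "C" or segment[-1] != "W":
        raise ValueError("segment must start at the FR3 Cys and end at the FR4 Trp")
    if not (0 <= extend_n <= 5 and 0 <= extend_c <= 5):
        raise ValueError("extensions are limited to 0-5 residues")
    if dh_cys_index is None:
        dh_cys_index = segment.find("C", 1)
        if dh_cys_index == -1:
            raise ValueError("no cysteine found after the FR3 Cys")
    if segment[dh_cys_index] != "C":
        raise ValueError("dh_cys_index does not point at a cysteine")

    L = len(segment)   # residues from FR3 Cys through FR4 Trp
    X = dh_cys_index   # residues preceding the first DH cysteine
    K = L - 2 * X      # knob length
    if K <= 0:
        raise ValueError("computed knob length is not positive")

    # 1-based positions X+1 .. X+K correspond to the 0-based slice [X, X+K).
    start = max(0, X - extend_n)
    end = min(L, X + K + extend_c)
    return segment[start:end]


if __name__ == "__main__":
    # Made-up toy segment: 10-residue ascending part, 12-residue knob,
    # 10-residue descending part ending in Trp.
    toy = "CTTVHQETKK" + "CPDGYSCGACDC" + "SYTYNYEHVW"
    print(extract_knob(toy))                            # prints CPDGYSCGACDC
    print(extract_knob(toy, extend_n=1, extend_c=1))    # prints KCPDGYSCGACDCS
```

Under this reading the rule trims X residues from each end of the segment, treating the ascending and descending stalks as equal in length.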
In some of any embodiments, each of the plurality of CDR3-knob only antibodies comprises a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds. In some of any embodiments, the peptide sequence is 40 to 60 amino acids in length. In some of any embodiments, the peptide sequence is at least 42 amino acids in length. In some of any embodiments, the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
In some of any embodiments, the peptide sequence comprises at least 4 cysteine residues. In some of any embodiments, the peptide sequence contains 4 cysteine residues. In some of any embodiments, the peptide sequence contains 6, 8, 10, or 12 cysteine residues.
In some of any embodiments, the peptide sequence has at least 2 disulfide bonds. In some of any embodiments, the peptide sequence has 2 disulfide bonds. In some of any embodiments, the peptide sequence has 3, 4 or 5 disulfide bonds.
In some of any embodiments, the target antigen is a nonvirulent bacterium, a virus, a viral protein, an immunomodulatory protein (e.g., a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof. In some of any embodiments, the immunomodulatory protein is a checkpoint molecule.
In some of any embodiments, the cDNA template library was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5), and IgG-specific (SEQ ID NO: 3 and 6) primers. In some of any embodiments, the cDNA template library is synthesized using a pool of IgM, IgA, and IgG-specific primers comprising a primer comprising or consisting of the sequence set forth in SEQ ID NO: 4, a primer comprising or consisting of the sequence set forth in SEQ ID NO: 5, a primer comprising or consisting of the sequence set forth in SEQ ID NO: 3, and a primer comprising or consisting of the sequence set forth in SEQ ID NO: 6.
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob display library, the method comprising: (a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds; (b) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (c) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising a CDR3 knob.
In some of any embodiments, the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles. In some of any embodiments, the amplified display particles are phage display particles. In some of any embodiments, the amplified display particles are phagemid particles. In some of any embodiments, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob phage display library, the method comprising: (a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein; (b) transforming suitable host cells with the plurality of replicable expression vectors; (c) infecting the transformed host cells with a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and (d) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and a CDR3 knob.
In some of any embodiments, at least one of the plurality of CDR3-knob antibodies is identified from an antibody sequence by an algorithm comprising: identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and determining the sequence of the CDR3-knob, in which: the CDR3-knob has the amino acid sequence length K; the sequence begins at position X+1 and ends at position X+K; and K = L − 2X; wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some of any embodiments, the antibody sequence is a bovine antibody. In some of any embodiments, the at least one CDR3-knob antibody has a sequence that is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.
In some of any embodiments, the peptide sequence comprises an ascending stalk domain and a descending stalk domain, wherein the cysteine motif is between the ascending and descending stalk domains.
In some of any embodiments, the peptide sequence is amplified from DNA from a cow immunized with a target antigen. In some of any embodiments, the peptide sequence is amplified from a variable heavy chain cDNA library from the immunized cow using primers specific for either side of the stalk domain of a cow ultralong CDR3 region.
In some of any embodiments, the peptide sequence does not comprise an ascending stalk domain N-terminal to the cysteine motif. In some of any embodiments, the peptide sequence does not comprise a descending stalk domain C-terminal to the cysteine motif.
In some of any embodiments, the ascending stalk domain comprises the sequence CX₂TVX₅Q, wherein X₂ and X₅ are any amino acid. In some of any embodiments, X₂ is Ser, Thr, Gly, Asn, Ala, or Pro, and X₅ is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu. In some of any embodiments, X₂ is Ser, Ala, or Thr, and X₅ is His or Tyr.
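The three levels of the ascending stalk motif translate directly into regular expressions over one-letter amino acid codes. The following is a minimal sketch: the pattern names, function name, and test sequences are hypothetical, and the residue sets are transcribed from the paragraph above.

```python
import re
from typing import Optional

AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

STALK_PATTERNS = {
    # X2 and X5 are any amino acid.
    "any": re.compile(f"C[{AA}]TV[{AA}]Q"),
    # X2 in {S,T,G,N,A,P}; X5 in {H,Q,R,K,G,T,Y,F,W,M,I,V,L}.
    "restricted": re.compile("C[STGNAP]TV[HQRKGTYFWMIVL]Q"),
    # X2 in {S,A,T}; X5 in {H,Y}.
    "narrow": re.compile("C[SAT]TV[HY]Q"),
}


def find_ascending_stalk(seq: str, level: str = "any") -> Optional[int]:
    """Return the 0-based start of the first motif match, or None."""
    match = STALK_PATTERNS[level].search(seq.upper())
    return match.start() if match else None


if __name__ == "__main__":
    print(find_ascending_stalk("AAYYCTTVHQETK", "narrow"))   # prints 4
    print(find_ascending_stalk("CSTVKQ", "restricted"))      # prints 0
    print(find_ascending_stalk("CSTVKQ", "narrow"))          # prints None
```

A match locates only the start of the ascending stalk. Finding the knob itself still requires the position rule described for CDR3-knob identification.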
In some of any embodiments, the peptide sequence is a synthetic CDR3-knob. In some of any embodiments, the peptide sequence is a cyclotide or modified cyclotide. In some of any embodiments, the peptide sequence is a semisynthetic CDR3-knob derived from a bovine CDR3-knob.
In some of any embodiments, the peptide sequence is 40 to 60 amino acids in length. In some of any embodiments, the peptide sequence is at least 42 amino acids in length. In some of any embodiments, the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
In some of any embodiments, the peptide sequence comprises at least 4 cysteine residues. In some of any embodiments, the peptide sequence contains 4 cysteine residues. In some of any embodiments, the peptide sequence contains 6, 8, 10, or 12 cysteine residues.
In some of any embodiments, the peptide sequence has at least 2 disulfide bonds. In some of any embodiments, the peptide sequence has 2 disulfide bonds. In some of any embodiments, the peptide sequence has 3, 4 or 5 disulfide bonds.
In some of any embodiments, the plurality of CDR3 knobs are mutated at one or more selected positions within the nucleic acid sequence encoding the peptide sequence, wherein the plurality of replicable expression vectors are a family of mutated vectors.
In some of any embodiments, the expression vector further comprises a secretory signal sequence. In some of any embodiments, the secretory signal sequence is a pelB signal sequence.
In some of any embodiments, the suitable host cells are E. coli cells. In some of any embodiments, the suitable host cells are TG1 electrocompetent cells.
In some of any embodiments, the phagemid particles are derived from M13 phage. In some of any embodiments, the coat protein is the M13 phage gene III coat protein (pIII). In some of any embodiments, the helper phage is selected from the group consisting of M13K07, M13R408, M13-VCS, and Phi X 174. In some of any embodiments, the helper phage is M13K07.
In some of any embodiments, the display particles on average display one copy of the fusion protein on the surface of the particle.
Provided herein in some embodiments is a library of display particles produced by any of the provided methods.
Provided herein in some embodiments is a replicable expression vector comprising a gene fusion encoding a fusion protein comprising a first nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a variable lambda light (VL) region selected from VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof.
Provided herein in some embodiments is a replicable expression vector comprising a gene fusion encoding a fusion protein comprising a first nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a BLV1H12 lambda variable light (VL) region or a humanized variant thereof.
In some of any embodiments, the replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein.
Provided herein in some embodiments is a display particle encoded by any of the provided replicable expression vectors.
Provided herein in some embodiments is a library of display particles comprising a plurality of any of the provided display particles.
In some of any embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 30% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 40% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 50% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.
Provided herein in some embodiments is a replicable expression vector comprising a gene fusion encoding a fusion protein that comprises a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form disulfide bonds.
In some of any embodiments, the replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein.
Provided herein in some embodiments is a display particle encoded by any of the provided replicable expression vectors.
Provided herein in some embodiments is a library of display particles comprising a plurality of any of the provided display particles.
In some of any embodiments, the display particles are phage display particles. In some of any embodiments, the display particles are phagemid particles.
Provided herein in some embodiments is a method for selecting an antibody binding protein, the method comprising: (1) contacting any of the provided libraries of display particles with a target molecule under conditions to allow binding of a display particle to the target molecule; and (2) separating the display particles that bind from those that do not, thereby selecting display particles comprising an antibody binding protein that binds to the target molecule.
In some of any embodiments, the display particles are phage display particles. In some of any embodiments, the display particles are phagemid particles.
In some of any embodiments, the target molecule is a nonvirulent bacterium, a virus, a viral protein, an immunomodulatory protein (e.g., a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof. In some of any embodiments, the target molecule is a coronavirus, a coronavirus pseudovirus, a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein. In some of any embodiments, the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some of any embodiments, the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant.
In some of any embodiments, the method further comprises (i) infecting suitable host cells with replicable expression vectors encoding the selected display particles that bind in (2); (ii) collecting the amplified display particles; and (iii) repeating steps (1) and (2) using the amplified display particles as the library of display particles. In some of any embodiments, the display particles are phagemid particles, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles.
In some of any embodiments, the steps are repeated one or more times. In some of any embodiments, the steps are repeated with the same target molecule or a different target molecule. In some of any embodiments, the steps are repeated with a different target molecule and the different target molecule is related to the target molecule. In some of any embodiments, the different target molecule is the same type of pathogen as, in the same group of pathogen as, or a variant of the target molecule.
In some of any embodiments, the method further comprises sequencing the fusion gene in the selected display particles to identify the antibody binding protein.
In some of any embodiments, the method further comprises producing a full-length IgG or a Fab from the selected antibody binding protein.
In some of any embodiments, the antibody binding protein is a scFv, and the method comprises constructing a heavy chain or a portion thereof comprising joining the VH region of the scFv with a constant region or a portion thereof. In some of any embodiments, the method comprises constructing a humanized VH region by replacing a knob region of the ultralong CDR3 region of a humanized bovine VH region with an ultralong CDR3 region of a selected antibody binding protein. In some of any embodiments, the ultralong CDR3 region of a selected antibody binding protein is replaced between an ascending stalk strand and a descending stalk strand of a humanized bovine VH region. In some of any embodiments, the VH region comprises the formula V1-X-V2, wherein the V1 region of the heavy chain comprises the sequence set forth in SEQ ID NO: 111; the X region comprises the ultralong CDR3 of a selected antibody binding protein; and the V2 region comprises the sequence set forth in SEQ ID NO: 112. In some of any embodiments, the method further comprises constructing a heavy chain or a portion thereof comprising joining the humanized VH region with a constant region or a portion thereof. In some of any embodiments, the heavy chain or the portion thereof is a human IgG1 heavy chain or portion thereof.
In some of any embodiments, the method further comprises co-expressing the heavy chain or portion thereof with a light chain. In some of any embodiments, the light chain is a bovine light chain of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, or F18, or is a humanized variant thereof. In some of any embodiments, the light chain is a BLV1H12 light chain (SEQ ID NO: 113) or a humanized variant thereof. In some of any embodiments, the light chain is a humanized light chain set forth in SEQ ID NO: 114. In some of any embodiments, the light chain is a BLV5B8 light chain (SEQ ID NO: 115) or a humanized variant thereof. In some of any embodiments, the light chain is a human light chain. In some of any embodiments, the light chain is selected from the group consisting of VL1-47, VL1-40, VL1-51, and VL2-18. In some of any embodiments, the light chain is set forth in any one of SEQ ID NO: 116-120.
In some of any embodiments, the light chain is a BLV1H12 light chain comprising the sequence set forth in SEQ ID NO: 113 or a humanized variant thereof. In some of any embodiments, the light chain is a BLV5B8 light chain comprising the sequence set forth in SEQ ID NO: 115 or a humanized variant thereof.
Provided herein in some embodiments is a method for producing a soluble ultralong CDR3 knob, comprising: (a) transforming E. coli with an expression vector encoding a fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds; (b) culturing the bacteria under conditions permissive of expression of the fusion protein; (c) isolating the fusion protein from supernatant of a bacterial cell lysate; and (d) cleaving the cleavable linker of the fusion protein, thereby producing a soluble ultralong CDR3 knob comprising 1-6 disulfide bonds free of the bacterial chaperone.
In some of any embodiments, the ultralong CDR3 knob is an antibody binding protein selected by any of the provided methods.
In some of any embodiments, the ultralong CDR3 knob is an antibody binding protein identified by any of the provided methods.
In some of any embodiments, the fusion protein has increased solubility relative to the ultralong CDR3 knob alone. In some of any embodiments, the bacterial chaperone is thioredoxin A (TrxA).
In some of any embodiments, the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106). In some of any embodiments, cleaving the cleavable linker comprises adding enterokinase to the supernatant.
In some of any embodiments, the soluble ultralong CDR3 knob comprises a further linker to allow for cyclizing the soluble ultralong CDR3 knob via chemical or enzymatic methods. In some of any embodiments, the further linker allows for sortase-mediated cyclization. In some of any embodiments, the method further comprises cyclizing the soluble ultralong CDR3 knob.
In some of any embodiments, the method further comprises (e) removing the enterokinase and/or the bacterial chaperone from the solution comprising the soluble ultralong CDR3 knob.
In some of any embodiments, the method further comprises enriching for the soluble ultralong CDR3 knob from the solution comprising the soluble ultralong CDR3 knob. In some of any embodiments, the enriching comprises size exclusion chromatography.
In some of any embodiments, the method further comprises producing a multispecific binding molecule comprising the soluble ultralong CDR3 knob.
In some of any embodiments, the ultralong CDR3 knob is 3-8 kDa in size. In some of any embodiments, the ultralong CDR3 knob is 4-5 kDa in size.
Provided herein in some embodiments is a fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
In some of any embodiments, the bacterial chaperone is thioredoxin A (TrxA).
In some of any embodiments, the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106).
In some of any embodiments, the ultralong CDR3 knob comprises 1-6 disulfide bonds.
Provided herein in some embodiments is a composition comprising any of the provided fusion proteins.
Provided herein is a method of identifying a CDR3 knob sequence from an antibody sequence, the method comprising identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and determining the sequence of the CDR3 knob, in which: the CDR3 knob has the amino acid sequence length K; the sequence begins at position X+1 and ends at X+K; and K=L−2X; wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some of any embodiments, the antibody sequence is a bovine antibody sequence. In some of any embodiments, the CDR3 knob has a sequence that is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.
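The relationship K=L−2X can be illustrated with a short sketch. The following Python example is a minimal illustration only, not a definitive implementation of the provided method. It assumes that the input segment already runs from the conserved framework 3 cysteine through the conserved framework 4 tryptophan, that X counts the residues preceding the first knob cysteine (so that position X+1 is that cysteine), and that the first cysteine-proline (CP) motif after the framework 3 cysteine marks the start of the knob. The function names and the toy segment are hypothetical and do not correspond to any sequence in the Sequence Listing.

```python
def first_cp_offset(segment: str) -> int:
    """Return X, assumed here to be the number of residues that precede
    the first 'CP' motif found after the framework 3 cysteine."""
    idx = segment.find("CP", 1)  # skip position 0, the framework 3 cysteine
    if idx == -1:
        raise ValueError("no CP motif found after the framework 3 cysteine")
    return idx


def extract_knob(segment: str, x: int, n_ext: int = 0, c_ext: int = 0) -> str:
    """Return the knob of length K = L - 2X from a segment that starts at
    the conserved framework 3 Cys and ends at the conserved framework 4 Trp.

    n_ext and c_ext optionally extend the knob by a few residues at the
    N and/or C terminus.
    """
    if not (segment.startswith("C") and segment.endswith("W")):
        raise ValueError("segment must run from the FR3 Cys to the FR4 Trp")
    L = len(segment)
    K = L - 2 * x
    if x < 1 or K <= 0:
        raise ValueError("X is not compatible with the segment length")
    start = x - n_ext        # 0-based index of position X+1, minus extension
    end = x + K + c_ext      # exclusive end index of position X+K, plus extension
    if start < 0 or end > L:
        raise ValueError("extension runs past the segment")
    return segment[start:end]


# Hypothetical toy segment: FR3 Cys + 5 stalk residues + 9-residue knob
# + 6 residues ending at the FR4 Trp (L = 21).
toy_segment = "C" + "TTVHQ" + "CPDGYSCGY" + "TYEFHW"
x = first_cp_offset(toy_segment)          # X = 6
knob = extract_knob(toy_segment, x)       # K = 21 - 2*6 = 9
extended = extract_knob(toy_segment, x, n_ext=1, c_ext=1)
print(x, knob, extended)
```

In this sketch the same number of residues, X, is trimmed from each end of the segment, which is what K=L−2X implies under the stated assumptions. For real antibody sequences the positions of the conserved residues would need to be determined by alignment rather than by the simple motif search used here.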
Provided herein in some embodiments is a purified soluble ultralong CDR3 knob produced by any of the provided methods, wherein the soluble ultralong CDR3 is 25-75 amino acids in length and comprises 1-6 disulfide bonds.
In some of any embodiments, the ultralong CDR3 knob is 3-8 kDa in size. In some of any embodiments, the ultralong CDR3 knob is 4-5 kDa in size.
In some embodiments, the ultralong CDR3 knob has an amino acid sequence length K; and the sequence begins at position X+1 and ends at X+K; and K=L−2X; and wherein L is the number of amino acids in an amino acid sequence of an antibody starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some embodiments, the antibody sequence is a bovine antibody. In some embodiments, the knob sequence has a sequence that is further extended by one, two, three, four, or five amino acids at the N and/or C termini.
Provided herein is a peptide knob with a sequence of length K, wherein: the knob has an amino acid sequence length K; the sequence begins at position X+1 and ends at X+K; and K=L−2X; and wherein L is the number of amino acids in an amino acid sequence of an antibody starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some embodiments, the antibody sequence is a bovine antibody. In some embodiments, the knob sequence has a sequence that is further extended by one, two, three, four, or five amino acids at the N and/or C termini.
Provided herein in some embodiments is a composition comprising any of the provided purified soluble ultralong CDR3 knobs.
In some of any embodiments, the composition further comprises a pharmaceutically acceptable carrier.
In some of any embodiments, the composition is formulated for parenteral administration. In some of any embodiments, the composition is formulated for intravenous, intramuscular, topical, otic, conjunctival, nasal, inhalation, or subcutaneous administration. In some of any embodiments, the composition is formulated for administration by inhalation.

BRIEF DESCRIPTION OF FIGURES

FIG. 1 depicts a schematic of an exemplary ultralong CDR3 cow antibody, including the “knob” peptide having a size between 4 and 6 kDa.

FIG. 2A depicts binding of immunized calf serum against the RBD domain of the SARS CoV-2 S protein via ELISA. Neutralization activity of the sera IgG against SARS-CoV-2 pseudovirus is shown in FIG. 2B.

FIG. 3A depicts the pIII phage fusion constructs in each display library (i.e., scFv and “knob” display).

FIG. 3B displays a schematic of pTAU1 phage vector multiple cloning site, used for direct cloning of bovine CDR3 knob DNA fragments as NcoI-NotI fragments. A schematic of pTAU1-BLV1H12(-VH) phage scFv vector multiple cloning site used for cloning of bovine VH DNA fragments as NcoI-XhoI fragments in-frame with BLV1H12 V-lambda DNA is shown in FIG. 3C. FIG. 3D depicts the separation between Ultralong VH fragments and shorter VH fragments without the Ultralong CDR3 region on an agarose gel.

Sequence alignments for exemplary ultralong antibodies R2C1 (SKD, SEQ ID NO: 68), R2C3 (SKM, SEQ ID NO: 69), R4C1 (SEQ ID NO: 70), R5C1 (SEQ ID NO: 71), SR3A3 (SEQ ID NO: 72), RR2F12 (SEQ ID NO: 73), and RR2G3 (SEQ ID NO: 74) are shown in FIG. 4. A germline sequence is also shown (SEQ ID NO: 75).

FIG. 5A depicts binding of exemplary chimeric bovine-human IgG1 antibodies to spike protein; binding to the RBD is shown in FIG. 5B. FIG. 5C shows ELISA binding of IgG antibodies to recombinant stabilized spike proteins derived from several SARS CoV strains.

FIG. 5D shows ELISA binding curves of select IgG antibodies against the omicron variant RBD (left) or recombinant stabilized spike trimer (right).

FIG. 5E reflects exemplary ELISA data of R4C1 and R2D9 on SARS-CoV-2 compared to SARS-CoV-1. FIG. 5F shows ELISA binding activity for three different exemplary antibody knob candidates against WT (Wuhan) SARS CoV-2 spike protein. FIG. 5G depicts a modified western blot using SDS and detected with biotinylated RBD.

FIG. 6A displays a schematic of the pET32b vector cloning site used for trxA-CDR3-knob fusion and CDR3-knob expression. A schematic of purification process from bacterial lysate is shown in FIG. 6B. FIG. 6C depicts CDR3-knob SDS-PAGE showing efficient purification of soluble CDR3-KNOB from E. coli lysate. FIG. 6D depicts an exemplary SDS-PAGE gel of several purified ultralong CDR H3 knob peptides.

FIG. 7A shows the results of a Wuhan-Hu-1 spike protein capture ELISA, using serial dilutions of IMAC purified trxA-fusions. Binding for the TrxA-R2G3 fusion protein is also shown in FIG. 7B.

FIG. 8A depicts a background-subtracted ELISA of soluble biotinylated RBD binding to exemplary purified R2-G3 CDR3-knob. Soluble R2G3 knob binding relative to a reference anti-spike antibody (CR3022) is shown in FIG. 8B.

Amino acid sequences of exemplary truncated R2G3 mutants are shown in FIG. 8C. Exemplary truncated R2G3 mutants include R2G3 TRUNC1 (SEQ ID NO: 87), R2G3 TRUNC2 (SEQ ID NO: 88), R2G3 TRUNC3 (SEQ ID NO: 89), R2G3 TRUNC3A (SEQ ID NO: 90), R2G3 TRUNC3B (SEQ ID NO: 91), R2G3 TRUNC4 (SEQ ID NO: 92), and R2G3 TRUNC5 (SEQ ID NO: 93). The parental R2G3 variant from which the exemplary truncated mutants were derived is also shown (SEQ ID NO: 86).

FIG. 8D depicts an SDS-PAGE of R2G3 truncations after bacterial expression and purification. Results of an ELISA for binding of biotinylated RBD by coated CDR3-knob truncations are shown in FIG. 8E.

FIG. 9A depicts a size exclusion chromatograph for purified R4C1 knobs. An electrophoresis gel of two fractions (A4 and A7) is shown in FIG. 9B.

FIG. 9C depicts a size exclusion chromatograph for purified R2G3 knobs. An electrophoresis gel of a fraction (A6) is shown in FIG. 9D.

Results of a pseudoviral luciferase assay are shown in FIG. 10 for four exemplary Ultralong CDR3 antibodies (F12, G3, SKD, and SKM) against wild-type (FIG. 10A), “UK” variant (FIG. 10B), “484K” variant (FIG. 10C), and “SA” variant (FIG. 10D) SARS CoV-2 spike protein expressing viruses.

FIG. 11A shows the IC50 values of different IgG antibodies against pseudoviruses from various coronavirus strains. FIG. 11B shows a comparison of the R2G3 IgG, Fab, and knob in neutralization of wild-type SARS-CoV-2 pseudovirus.

FIG. 12 is a depiction of multispecific knob peptide compositions and formats. A plurality of paratope knob peptides can be attached to an immunoglobulin, including as a homodimer or heterodimer, to provide a multispecific binding polypeptide. A plurality of paratope knob peptides also may be linked directly in tandem, such as via a linker. A plurality of knob peptides also may be combined as a mixture or cocktail to provide a combined polyclonal composition.

FIG. 13A depicts the crystal structure of BLV1H12 Fab (PDB 4k3d). An enlarged view of the stalk and knob region, with the framework 3 cysteine, knob position 1 cysteines, and framework 4 tryptophan side chains, is shown in FIG. 13B.

A sequence alignment of the stalk and knob regions for 12 exemplary antibodies is shown in FIG. 14. The knob regions are flanked by the ascending and descending stalk regions, which are shown with white letters highlighted in black.

FIG. 15 is a schematic representation of the stalk and knob domain (L), containing the CDR H3 plus three residues on the N-terminal end.

Binding of biotinylated RBD by coated CDR3-knob truncations as assessed via ELISA is shown in FIG. 16A. An exemplary SDS-PAGE of R2G3 truncations after bacterial expression and purification is shown in FIG. 16B.

FIG. 17A shows ELISA binding of biotinylated RBD by coated CDR3-knob N-terminal truncations, and an exemplary SDS-PAGE of R2G3 N-terminal truncations after bacterial expression and purification is shown in FIG. 17B.

FIG. 18A shows a sequence alignment of primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region. FIG. 18B shows the PCR products obtained by amplification using the primers.

DETAILED DESCRIPTION

Provided herein in some embodiments are display libraries and methods of preparing display libraries, including cow or synthetic ultralong CDR3 display libraries or cyclotide display libraries, as well as methods of screening said libraries for binding molecules specific for a target molecule. In some embodiments, the display libraries are derived from sequences selectively amplified from the cDNA of immunized cows, for instance in order to enrich or select for sequences encoding an ultralong CDR3. Also provided herein in some embodiments are methods of producing soluble peptides, in some instances producing soluble ultralong CDR3 knobs. The soluble ultralong CDR3 knobs produced can be bovine or synthetic. Soluble peptides produced according to the provided methods also include cyclotides.
In some aspects, the provided methods allow for the screening and production of disulfide bonded knob peptides, including those derived from cow antibodies including an ultralong CDR3, that can be independently expressed and produced according to the provided methods as an independent binding unit. In some aspects, the provided methods offer a simple, immunization-based discovery platform. This platform offers peptide structural diversity that is greater than that of in vitro display-based platforms, with each screened and produced knob peptide potentially having its own novel disulfide-bonded structure. This platform also allows for rapid hit discovery against target molecules.
As described herein, cow antibodies have a unique structure containing an ultralong CDR3 sequence that forms a structure where a subdomain with an unusual architecture is formed from a “stalk”, composed of two 12-residue, anti-parallel β-strands (ascending and descending strands), and a longer, e.g., 39-residue, disulfide-rich “knob” that sits atop the stalk, far from the canonical antibody paratope. The knob region of the ultralong CDR3 confers antigen binding. Unlike antibodies from other species, such as human and mouse, the CDR regions L1, L2, L3, H1 and H2 of a bovine or bovine-derived antibody exhibit less sequence diversity as most of their sequence diversity is in CDR H3 (Stanfield et al. 2016 Sci. Immunol, 1(1): doi:10.1126/sciimmunol.aaf7962). Thus, for bovine or bovine-derived antibodies, antigen binding is mainly or only through CDR H3 and the other CDRs do not contribute to antigen binding.
Available methods of analysis and exploitation of the unique ultralong CDR H3 structure are not entirely satisfactory. In many cases, methods require excision and purification of the isolated knob domain (Macpherson et al. 2020 PLOS Biology, 18(9): e3000821). Such methods are not easily amenable to good manufacturing practices for generating therapeutic molecules and also are inefficient in terms of the amount of knob protein that can be produced. Further, the use of enzymes for excision of the knob may also compromise the integrity of the isolated protein.
Remarkably, it is found herein that a disulfide-bonded knob peptide derived from an ultralong CDR-H3 of a bovine antibody can be expressed and produced according to the provided methods as an independent binding unit, and that it retains picomolar binding affinity and neutralizing activity against a target molecule, e.g., SARS-CoV2. This knob peptide is only roughly 4-5 kDa in size, e.g., about 4.4 kDa, and represents the smallest independent antigen binding domain. It exhibits high affinity and epitope coverage, similar to a larger antibody. Its small size approaches the size of small molecules and thereby opens up the utility of the antigen binding domain as a novel therapeutic. For instance, its small size allows for better tissue penetration and also permits alveolar delivery. Further, the provided knob peptides are stable by virtue of their rigid disulfide-bonded small domain. This stable structure avoids the aggregates seen with nanobodies and other immunoglobulin domain-based fragments. As also demonstrated herein, the knob peptide can be produced in high yield in E. coli according to the provided methods, making it highly developable as a therapeutic molecule. Peptides generated according to the provided methods can target known viruses or viral classes, either as a mAb or as a knob. In some aspects, mAbs and knobs can be ready for rapid discovery and production in the event of pandemic outbreaks, and can be quickly pivoted in the case of new strains of disease. In some aspects, mAb and knob production according to the provided methods can move quickly to GMP standards. In some aspects, knobs can be used for “cocktails” of treatment regimens.
Also provided herein are compositions containing any of the knob peptides screened and produced according to the provided methods. In some embodiments, the compositions can be monoclonal, containing a single knob peptide that provides a single paratope for binding a desired antigen, such as SARS-CoV2. In other embodiments, provided compositions are polyclonal and contain a mixture or cocktail of different knob peptides directed against different epitopes of an antigen or different antigens (FIG. 12).
Further provided herein are multispecific binding formats that exploit the small and unique size of the knob peptides (FIG. 12). For instance, different knob paratopes can be engineered into the backbone of a human or humanized ultralong CDR-H3 full length antibody in which dimerization of the Fc provides a bivalent or multivalent format. In some cases, a “knobs-into-holes” Fc engineering strategy can be used to produce a heterodimeric bispecific or multispecific format containing two, three, four or more different knob peptides each providing a different paratope for binding to a desired antigen, such as Spike protein of SARS-CoV2.
Also provided herein are methods of treatment and uses of the provided binding polypeptides, including antibodies or antigen-binding fragments or knob polypeptides, and compositions thereof.

I. Definitions

Unless defined otherwise, all terms of art, notations and other technical and scientific terms or terminology used herein are intended to have the same meaning as is commonly understood by one of ordinary skill in the art to which the claimed subject matter pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art.
As used herein, the articles “a” and “an” refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.
Throughout this disclosure, various aspects of the claimed subject matter are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the claimed subject matter. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the claimed subject matter. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the claimed subject matter, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the claimed subject matter. This applies regardless of the breadth of the range.
As used herein, the term “about” will be understood by persons of ordinary skill in the art and will vary to some extent on the context in which it is used. As used herein, “about” when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate to perform the disclosed methods.
An “ultralong CDR3” or an “ultralong CDR3 sequence”, used interchangeably herein, comprises a CDR3 or CDR3 sequence that is not derived from a human antibody sequence. An ultralong CDR3 may be 35 amino acids in length or longer, for example, 40 amino acids in length or longer, 45 amino acids in length or longer, 50 amino acids in length or longer, 55 amino acids in length or longer, or 60 amino acids in length or longer. In some embodiments, the ultralong CDR3 is 25-70 amino acids in length, such as 40-70 amino acids in length. Typically, the ultralong CDR3 is a heavy chain CDR3 (CDR-H3 or CDRH3). An ultralong CDR-H3 exhibits features of a CDRH3 of a ruminant (e.g., bovine) sequence. The structure of an ultralong CDR3 includes a “stalk”, composed of ascending and descending strands (e.g. each about 12 amino acids in length), and a disulfide-rich “knob” that sits atop the stalk. The unique “stalk and knob” structure of the ultralong CDR3 results in the two antiparallel β-strands (an ascending and descending stalk strand) supporting a disulfide bonded knob protruding out of the antibody surface to form a mini antigen binding domain. In some embodiments, the ultralong CDR3 antibodies comprise, in order, an ascending stalk region, a knob region, and a descending stalk region.
As used herein, a “CDR3-knob” or “knob,” which are used interchangeably, refers to a portion of an ultralong CDR3 that is a peptide sequence of 40-70 amino acids in length, where said CDR3-knob has at least 4 non-canonical Cys residues, such as 6, 8, 10 or up to 12 non-canonical cysteine residues, and forms 2-6 disulfide bonds. Typically, a knob contains an initial cysteine residue with the amino acid motif cysteine-proline (CP). In some cases, a CDR3-knob may be positioned between an ascending stalk (Stalk A) and a descending stalk (Stalk B) in an antibody or antigen-binding fragment containing the ultralong CDR3, in which the CDR3-knob protrudes out of the antibody interface to form an antigen binding site with an antigen. In other cases, a CDR3-knob may be independently produced as a “knob” peptide as described herein.
As used herein, a “knob peptide”, “CDR3-knob peptide” or “knob-only peptide,” which are terms used interchangeably, refers to an independently produced linear disulfide-bonded peptide that is 40-70 amino acids in length, and contains 2-6 disulfide bonds formed by at least 4 non-canonical Cys residues, such as 6, 8, 10 or up to 12 non-canonical cysteine residues. A knob peptide may be derived from an ultralong CDR3 or can be produced synthetically. Typically, the first cysteine of the peptide sequence is part of the amino acid motif cysteine-proline (CP). A knob peptide is a linear molecule that is not able to undergo cyclization to form a cyclic molecule.
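The sequence-level features recited in the definitions above (a length of 40-70 amino acids, 4 to 12 cysteine residues, and an initial cysteine-proline motif) can be checked with a short sketch. The Python example below is illustrative only. It assumes a one-letter amino acid string for an isolated knob peptide, so that every cysteine in the string is counted; the function name and the toy peptide are hypothetical. The number of disulfide bonds reported is only the upper bound implied by the cysteine count, since actual disulfide pairing cannot be determined from sequence alone.

```python
def knob_features(peptide: str) -> dict:
    """Summarize sequence features of a candidate knob peptide."""
    n_cys = peptide.count("C")
    first_cys = peptide.find("C")  # -1 if the peptide has no cysteine
    return {
        "length": len(peptide),
        "length_ok": 40 <= len(peptide) <= 70,
        "cysteines": n_cys,
        "cysteines_ok": 4 <= n_cys <= 12,
        # Upper bound only: each disulfide bond consumes two cysteines.
        "max_disulfides": n_cys // 2,
        # True when the first cysteine is immediately followed by proline.
        "starts_with_cp_motif": first_cys != -1
        and peptide[first_cys:first_cys + 2] == "CP",
    }


# Hypothetical 43-residue toy peptide with four cysteines.
toy_knob = "CPDG" + "S" * 10 + "C" + "T" * 10 + "C" + "G" * 10 + "C" + "Y" * 6
features = knob_features(toy_knob)
print(features)
```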
“Substantially similar,” or “substantially the same”, refers to a sufficiently high degree of similarity between two numeric values (generally one associated with an antibody disclosed herein and the other associated with a reference/comparator antibody) such that one of skill in the art would consider the difference between the two values to be of little or no biological and/or statistical significance within the context of the biological characteristic measured by said values (e.g., Kd values). The difference between said two values is preferably less than about 50%, preferably less than about 40%, preferably less than about 30%, preferably less than about 20%, preferably less than about 10% as a function of the value for the reference/comparator antibody.
“Binding affinity” generally refers to the strength of the sum total of noncovalent interactions between a single binding site of a molecule (e.g., an antibody) and its binding partner (e.g., an antigen). Unless indicated otherwise, “binding affinity” refers to intrinsic binding affinity which reflects a 1:1 interaction between members of a binding pair (e.g., antibody and antigen). The affinity of a molecule X for its partner Y can generally be represented by the dissociation constant. Low-affinity antibodies generally bind antigen slowly and tend to dissociate readily, whereas high-affinity antibodies generally bind antigen faster and tend to remain bound longer. A variety of methods of measuring binding affinity are known in the art, any of which can be used for purposes of the present disclosure.
“Percent (%) amino acid sequence identity” with respect to a peptide or polypeptide sequence refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the specific peptide or polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or MegAlign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
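As a minimal illustration of the calculation, the Python sketch below computes percent identity from two sequences that have already been aligned, with gaps written as “-”. It is not a substitute for the alignment software named above. The choice of denominator (the number of residues in the reference sequence) is an assumption made for this example, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str,
                     gap: str = "-") -> float:
    """Percent of reference residues matched by an identical candidate
    residue, given two pre-aligned sequences of equal length."""
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1
        for a, b in zip(aligned_candidate, aligned_reference)
        if a == b and a != gap
    )
    # Assumption: normalize by the number of residues in the reference.
    ref_len = sum(1 for b in aligned_reference if b != gap)
    if ref_len == 0:
        raise ValueError("reference sequence contains no residues")
    return 100.0 * matches / ref_len


# Toy alignment: four identical columns out of six reference residues.
identity = percent_identity("ACDE-G", "ACDFHG")
print(round(identity, 2))
```

Conservative substitutions are not counted as identities in this sketch, consistent with the definition above.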
“Polypeptide,” “peptide,” “protein,” and “protein fragment” may be used interchangeably to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
“Amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function similarly to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, gamma-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, e.g., an alpha carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs can have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions similarly to a naturally occurring amino acid.
“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. “Amino acid variants” refers to amino acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical or associated (e.g., naturally contiguous) sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to another of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes silent variations of the nucleic acid. One of skill will recognize that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, silent variations of a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to the expression product, but not with respect to actual probe sequences. As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are “conservatively modified variants,” including where the alteration results in the substitution of an amino acid with a chemically similar amino acid.
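The notion of silent variations can be illustrated with a short sketch. The Python example below enumerates every RNA coding sequence for a toy peptide using a deliberately partial codon table that contains only alanine (four codons), methionine (AUG) and tryptophan (UGG, written TGG in DNA notation). The table, function name and toy peptide are assumptions made for illustration; a complete genetic code table would be needed for real sequences.

```python
from itertools import product

# Partial codon table (RNA notation) covering only the residues used below.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],  # alanine: four synonymous codons
    "M": ["AUG"],                       # methionine: single codon
    "W": ["UGG"],                       # tryptophan: single codon
}


def silent_variants(peptide: str) -> list[str]:
    """Return every RNA sequence that encodes the given peptide,
    using only the residues present in the partial CODONS table."""
    choices = [CODONS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


# Met-Ala-Trp has 1 x 4 x 1 = 4 coding sequences, all silent variations
# of one another.
variants = silent_variants("MAW")
print(variants)
```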
Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles disclosed herein. Typically conservative substitutions include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
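As a minimal sketch, the eight exemplary substitution groups listed above can be encoded as sets of one-letter codes and queried directly. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and the grouping is only the exemplary one recited here; other substitution tables known in the art would give different answers.

```python
# The eight exemplary groups from the text, as sets of one-letter codes.
# Methionine (M) appears in both group 5 and group 8, as in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one exemplary group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```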
“Humanized” or “Human engineered” forms of non-human (e.g., bovine) antibodies are chimeric antibodies that contain amino acids represented in human immunoglobulin sequences, including, for example, wherein minimal sequence is derived from non-human immunoglobulin. For example, humanized or human engineered antibodies may be non-human (e.g., bovine) antibodies in which some residues are substituted by residues from analogous sites in human antibodies (see, e.g., U.S. Pat. No. 5,766,886). A humanized antibody optionally may also comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin. For further details, see Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992). See also the following review articles and references cited therein: Vaswani and Hamilton, Ann. Allergy, Asthma & Immunol. 1: 105-115 (1998); Harris, Biochem. Soc. Transactions 23:1035-1038 (1995); Hurle and Gross, Curr. Op. Biotech. 5:428-433 (1994).
A “variable domain” with reference to an antibody refers to a specific Ig domain of an antibody heavy or light chain that contains a sequence of amino acids that varies among different antibodies. Each light chain and each heavy chain has one variable region domain (VL and VH, respectively). The variable domains provide antigen specificity, and thus are responsible for antigen recognition. Each variable region contains CDRs that are part of the antigen binding site domain and framework regions (FRs).
A “constant region domain” refers to a domain in an antibody heavy or light chain that contains a sequence of amino acids that is comparatively more conserved among antibodies than the variable region domain. Each light chain has a single light chain constant region (CL) domain and each heavy chain contains one or more heavy chain constant region (CH) domains, which include CH1, CH2, CH3 and, in some cases, CH4. Full-length IgA, IgD and IgG isotypes contain CH1, CH2, CH3 and a hinge region, while IgE and IgM contain CH1, CH2, CH3 and CH4. CH1 and CL domains extend the Fab arm of the antibody molecule, thus contributing to the interaction with antigen and rotation of the antibody arms. Antibody constant regions can serve effector functions, such as, but not limited to, clearance of antigens, pathogens and toxins to which the antibody specifically binds, e.g., through interactions with various cells, biomolecules and tissues.
The terms “complementarity determining region,” and “CDR,” synonymous with “hypervariable region” or “HVR,” are known in the art to refer to non-contiguous sequences of amino acids within antibody variable regions, which confer antigen specificity and/or binding affinity. In general, there are three CDRs in each heavy chain variable region (CDR-H1, CDR-H2, CDR-H3) and three CDRs in each light chain variable region (CDR-L1, CDR-L2, CDR-L3). “Framework regions” and “FR” are known in the art to refer to the non-CDR portions of the variable regions of the heavy and light chains. In general, there are four FRs in each full-length heavy chain variable region (FR-H1, FR-H2, FR-H3, and FR-H4), and four FRs in each full-length light chain variable region (FR-L1, FR-L2, FR-L3, and FR-L4).
The precise amino acid sequence boundaries of a given CDR or FR can be readily determined using any of a number of well-known schemes, including those described by Kabat et al. (1991), “Sequences of Proteins of Immunological Interest,” 5th Ed. Public Health Service, National Institutes of Health, Bethesda, MD (“Kabat” numbering scheme); Al-Lazikani et al., (1997) JMB 273, 927-948 (“Chothia” numbering scheme); MacCallum et al., “Antibody-antigen interactions: Contact analysis and binding site topography,” J. Mol. Biol. 262:732-745 (1996) (“Contact” numbering scheme); Lefranc M P et al., “IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains,” Dev Comp Immunol, 2003 January; 27(1):55-77 (“IMGT” numbering scheme); Honegger A and Plückthun A, “Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis tool,” J Mol Biol, 2001 Jun. 8; 309(3):657-70 (“Aho” numbering scheme); and Martin et al., “Modeling antibody hypervariable loops: a combined algorithm,” PNAS, 1989, 86(23):9268-9272 (“AbM” numbering scheme).
The boundaries of a given CDR or FR may vary depending on the scheme used for identification. For example, the Kabat scheme is based on sequence alignments, while the Chothia scheme is based on structural information. Numbering for both the Kabat and Chothia schemes is based upon the most common antibody region sequence lengths, with insertions accommodated by insertion letters, for example, “30a,” and deletions appearing in some antibodies. The two schemes place certain insertions and deletions (“indels”) at different positions, resulting in differential numbering. The Contact scheme is based on analysis of complex crystal structures and is similar in many respects to the Chothia numbering scheme. The AbM scheme is a compromise between Kabat and Chothia definitions based on that used by Oxford Molecular's AbM antibody modeling software.
Table 1, below, lists exemplary position boundaries of CDR-L1, CDR-L2, CDR-L3 and CDR-H1, CDR-H2, CDR-H3 as identified by Kabat, Chothia, AbM, and Contact schemes, respectively. For CDR-H1, residue numbering is listed using both the Kabat and Chothia numbering schemes. FRs are located between CDRs, for example, with FR-L1 located before CDR-L1, FR-L2 located between CDR-L1 and CDR-L2, FR-L3 located between CDR-L2 and CDR-L3 and so forth. It is noted that because the shown Kabat numbering scheme places insertions at H35A and H35B, the end of the Chothia CDR-H1 loop when numbered using the shown Kabat numbering convention varies between H32 and H34, depending on the length of the loop.

TABLE 1

Boundaries of CDRs according to various numbering schemes.

CDR	Kabat	Chothia	AbM	Contact

CDR-L1	L24--L34	L24--L34	L24--L34	L30--L36
CDR-L2	L50--L56	L50--L56	L50--L56	L46--L55
CDR-L3	L89--L97	L89--L97	L89--L97	L89--L96
CDR-H1 (Kabat Numbering¹)	H31--H35B	H26--H32..34	H26--H35B	H30--H35B
CDR-H1 (Chothia Numbering²)	H31--H35	H26--H32	H26--H35	H30--H35
CDR-H2	H50--H65	H52--H56	H50--H58	H47--H58
CDR-H3	H95--H102	H95--H102	H95--H102	H93--H101

¹Kabat et al. (1991), “Sequences of Proteins of Immunological Interest,” 5th Ed. Public Health Service, National Institutes of Health, Bethesda, MD
²Al-Lazikani et al., (1997) JMB 273, 927-948
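For illustration, the boundaries of Table 1 can be held in a small lookup structure so that the scheme-dependence of a CDR definition is explicit. The sketch below is a hypothetical helper rather than part of any numbering tool; it transcribes Table 1 and, for CDR-H1, uses only the row given under Chothia numbering, which is a simplifying assumption.

```python
SCHEMES = ("Kabat", "Chothia", "AbM", "Contact")

# (start, end) positions transcribed from Table 1, in SCHEMES order.
# CDR-H1 uses the Chothia-numbering row only (simplifying assumption).
_ROWS = {
    "CDR-L1": (("L24", "L34"), ("L24", "L34"), ("L24", "L34"), ("L30", "L36")),
    "CDR-L2": (("L50", "L56"), ("L50", "L56"), ("L50", "L56"), ("L46", "L55")),
    "CDR-L3": (("L89", "L97"), ("L89", "L97"), ("L89", "L97"), ("L89", "L96")),
    "CDR-H1": (("H31", "H35"), ("H26", "H32"), ("H26", "H35"), ("H30", "H35")),
    "CDR-H2": (("H50", "H65"), ("H52", "H56"), ("H50", "H58"), ("H47", "H58")),
    "CDR-H3": (("H95", "H102"), ("H95", "H102"), ("H95", "H102"), ("H93", "H101")),
}

CDR_BOUNDARIES = {cdr: dict(zip(SCHEMES, spans)) for cdr, spans in _ROWS.items()}


def boundaries(cdr, scheme):
    """Return the (start, end) positions of a CDR under the given scheme."""
    return CDR_BOUNDARIES[cdr][scheme]


def schemes_matching(cdr, reference="Kabat"):
    """List the other schemes whose boundaries equal the reference scheme's."""
    ref = CDR_BOUNDARIES[cdr][reference]
    return sorted(s for s in SCHEMES
                  if s != reference and CDR_BOUNDARIES[cdr][s] == ref)
```

A query such as `schemes_matching("CDR-H2")` makes plain that no two schemes in Table 1 agree on the CDR-H2 boundaries, which is why the scheme is specified where it matters.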

Thus, unless otherwise specified, a “CDR” or “complementarity determining region,” or individual specified CDRs (e.g., CDR-H1, CDR-H2, CDR-H3), of a given antibody or region thereof, such as a variable region thereof, should be understood to encompass a (or the specific) complementarity determining region as defined by any of the aforementioned schemes. For example, where it is stated that a particular CDR (e.g., a CDR-H3) contains the amino acid sequence of a corresponding CDR in a given V_H or V_L region amino acid sequence, it is understood that such a CDR has a sequence of the corresponding CDR (e.g., CDR-H3) within the variable region, as defined by any of the aforementioned schemes. In some embodiments, specific CDR sequences are specified. Exemplary CDR sequences of provided antibodies are described using various numbering schemes, although it is understood that a provided antibody can include CDRs as described according to any of the other aforementioned numbering schemes or other numbering schemes known to a skilled artisan.
Likewise, unless otherwise specified, a FR or individual specified FR(s) (e.g., FR-H1, FR-H2, FR-H3, FR-H4), of a given antibody or region thereof, such as a variable region thereof, should be understood to encompass a (or the specific) framework region as defined by any of the known schemes. In some instances, the scheme for identification of a particular CDR, FR, or FRs or CDRs is specified, such as the CDR as defined by the Kabat, Chothia, AbM or Contact method. In other cases, the particular amino acid sequence of a CDR or FR is given.
An antibody containing an ultralong CDR3 is an antibody that contains a variable heavy (VH) chain with an ultralong CDR3. An antibody may further include pairing of the VH chain with a variable light (VL) chain. In some embodiments, the antibodies or antigen-binding fragments include a heavy chain variable region and a light chain variable region. Thus, the term antibody includes full-length antibodies and portions thereof, including antibody fragments, wherein such contain a heavy chain or portion thereof and/or a light chain or portion thereof. An antibody can contain two heavy chains (which can be denoted H and H′) and two light chains (which can be denoted L and L′), in which each L chain is linked to an H chain by a covalent disulfide bond and the two H chains are linked to each other by disulfide bonds. The terms “full-length antibody” and “intact antibody” are used interchangeably to refer to an antibody in its substantially intact form, as opposed to an antibody fragment. A full-length antibody is an antibody typically having two full-length heavy chains (e.g., VH-CH1-CH2-CH3 or VH-CH1-CH2-CH3-CH4) and two full-length light chains (VL-CL) and hinge regions.
The term “antibody” herein is used in the broadest sense and includes polyclonal and monoclonal antibodies, including intact antibodies and functional (antigen-binding) antibody fragments, including fragment antigen binding (Fab) fragments, F(ab′)₂ fragments, Fab′ fragments, Fv fragments, recombinant IgG (rIgG) fragments, heavy chain variable (V_H) regions capable of specifically binding, and single chain variable fragments (scFv).
An “antibody fragment” comprises a portion of an intact antibody, the antigen binding and/or the variable region of the intact antibody. Antibody fragments include, but are not limited to, Fab fragments, Fab′ fragments, F(ab′)₂ fragments, Fv fragments, disulfide-linked Fvs (dsFv), Fd fragments, Fd′ fragments; single-chain antibody molecules, including single-chain Fvs (scFv) or single-chain Fabs (scFab); antigen-binding fragments of any of the above and multispecific antibodies from antibody fragments.
A “Fab fragment” is an antibody fragment that results from digestion of a full-length immunoglobulin with papain, or a fragment having the same structure that is produced synthetically, e.g., by recombinant methods. A Fab fragment contains a light chain (containing a V_L and C_L) and another chain containing a variable domain of a heavy chain (V_H) and one constant region domain of the heavy chain (C_H1).
An “scFv fragment” refers to an antibody fragment that contains a variable light chain (V_L) and variable heavy chain (V_H), covalently connected by a polypeptide linker in any order. The linker is of a length such that the two variable domains are bridged without substantial interference. Exemplary linkers are (Gly-Ser)_n residues with some Glu or Lys residues dispersed throughout to increase solubility.
The term, “corresponding to” with reference to positions of a protein, such as recitation that nucleotides or amino acid positions “correspond to” nucleotides or amino acid positions in a disclosed sequence, such as set forth in the Sequence listing, refers to nucleotides or amino acid positions identified upon alignment with the disclosed sequence based on structural sequence alignment or using a standard alignment algorithm, such as the GAP algorithm. For example, corresponding residues of a similar sequence (e.g. fragment or species variant) can be determined by alignment to a reference sequence by structural alignment methods. By aligning the sequences, one skilled in the art can identify corresponding residues, for example, using conserved and identical amino acid residues as guides.
The term “effective amount” or “therapeutically effective amount” as used herein means an amount of a pharmaceutical composition which is sufficient to significantly and positively modify the symptoms and/or conditions to be treated (e.g., provide a positive clinical response). The effective amount of an active ingredient for use in a pharmaceutical composition will vary with the particular condition being treated, the severity of the condition, the duration of treatment, the nature of concurrent therapy, the particular active ingredient(s) being employed, the particular pharmaceutically-acceptable excipient(s) and/or carrier(s) utilized, and like factors within the knowledge and expertise of the attending physician.
As used herein, the term “pharmaceutically acceptable” refers to a material, such as a carrier or diluent, which does not abrogate the biological activity or properties of the compound, and is relatively nontoxic, i.e., the material may be administered to an individual without causing undesirable biological effects or interacting in a deleterious manner with any of the components of the composition in which it is contained.
As used herein, a composition refers to any mixture of two or more products, substances, or compounds, including cells. It may be a solution, a suspension, liquid, powder, a paste, aqueous, non-aqueous or any combination thereof.
As used herein, the term “pharmaceutical composition” refers to a mixture of at least one compound of the invention with other chemical components, such as carriers, stabilizers, diluents, dispersing agents, suspending agents, thickening agents, and/or excipients. The pharmaceutical composition facilitates administration of the compound to an organism. Multiple techniques of administering a compound exist in the art including, but not limited to, intravenous, oral, aerosol, parenteral, ophthalmic, pulmonary and topical administration and administration via inhalation.
As used herein, “disease or disorder” refers to a pathological condition in an organism resulting from a cause or condition including, but not limited to, infections, acquired conditions, and genetic conditions, and characterized by identifiable symptoms.
As used herein, the terms “treat,” “treating,” or “treatment” refer to ameliorating a disease or disorder, e.g., slowing or arresting or reducing the development of the disease or disorder, e.g., a root cause of the disorder or at least one of the clinical symptoms thereof.
As used herein, the term “subject” refers to an animal, including a mammal, such as a human being. The term subject and patient can be used interchangeably.
As used herein, “optional” or “optionally” means that the subsequently described event or circumstance does or does not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not. For example, an optionally substituted group means that the group is unsubstituted or is substituted.

II. Display Libraries and Selection Methods

Provided herein in some aspects are methods of preparing an ultralong CDR3 antibody display library. Also provided herein in some aspects are methods of preparing an ultralong CDR3-knob display library. In some embodiments, the display library is a phage display library. In some embodiments, the ultralong CDR3 antibodies or knobs are derived from cow antibodies, for instance based on antibodies produced by a cow immunized with a target antigen. In some embodiments, the ultralong CDR3 antibodies or knobs are synthetic. In some embodiments, the ultralong CDR3 antibodies or knobs are cyclotides or modified cyclotides, e.g., containing an exogenous peptide sequence.

A. Library Production Methods

Techniques for manipulating nucleic acids, such as those for generating mutation in sequences, subcloning, labeling, probing, sequencing, hybridization and so forth, are described in detail in scientific publications and patent documents. See, for example, Sambrook J, Russell D W (2001) Molecular Cloning: a Laboratory Manual, 3rd ed. Cold Spring Harbor Laboratory Press, New York; Current Protocols in Molecular Biology, Ausubel ed., John Wiley & Sons, Inc., New York (1997); Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I, Theory and Nucleic Acid Preparation, Tijssen ed., Elsevier, N.Y. (1993).
Any known methods for generating libraries containing variant polynucleotides and/or polypeptides can be used with the provided methods and vectors to generate display libraries, e.g. phage display libraries, and to select binding proteins from the libraries. The libraries can be used in screening assays to select binding proteins from the library for any antigen, including, for example, any virus, bacterial, other pathogenic, an immunomodulatory protein (e.g. a checkpoint molecule), or cancer antigen. To facilitate screening, antibody libraries typically are screened using a display technique, such that there is a physical link between the individual molecules of the library (phenotype) and the genetic information encoding them (genotype). These methods include, but are not limited to, cell display, including bacterial display, yeast display, mammalian display, phage display (Smith, G. P. (1985) Science 228:1315-1317), mRNA display, ribosome display and DNA display.
In some embodiments, the provided libraries are phage display libraries. In some embodiments, the phage display library is produced through use of a phagemid encoding at least a portion of a phage coat protein, in addition to encoding the polypeptide for display. In some embodiments, the phagemid particles are derived from M13 phage. In some embodiments, the coat protein is the M13 phage gene III coat protein (pIII).
In some embodiments, a phage display library is produced by fusion of a candidate binding polypeptide as described herein, such as an ultralong CDR3 scFv antibody fragment or an ultralong CDR3 knob peptide, with a gene III minor coat protein of an F-specific filamentous phage of Escherichia coli (Ff: f1, M13, or fd). Alternatively, other bacterial species can be used to produce the phage display library, including Pseudomonas fluorescens. In some embodiments, the gene III is a minor coat protein of M13 phage (also called pIII). The gene III minor coat protein (present in about 5 copies at one end of the virion) is involved in proper phage assembly and in infection by attachment to the pili of E. coli. Methods of phage display are known.
In some embodiments, a nucleic acid encoding a candidate binding polypeptide as described herein, such as an ultralong CDR3 scFv antibody fragment or an ultralong CDR3 knob peptide, is inserted into or constructed as part of a replicable expression vector, in which the nucleic acid is fused to a nucleic acid encoding at least a portion of a phage coat protein, such as pIII. In some embodiments, the nucleic acid encoding a candidate binding polypeptide as described herein, such as an ultralong CDR3 scFv antibody fragment or an ultralong CDR3 knob peptide, is fused to pIII.
In some embodiments, the replicable expression vector is a plasmid vector that generally contains a variety of components, including promoters, signal sequences, phenotypic selection genes, origin of replication sites, and other necessary components as are known to those of ordinary skill in the art. Promoters most commonly used in prokaryotic vectors include the lac Z promoter system, the alkaline phosphatase pho A promoter, the bacteriophage λPL promoter (a temperature sensitive promoter), the tac promoter (a hybrid trp-lac promoter that is regulated by the lac repressor), the tryptophan promoter, the bacteriophage T7 promoter, or other suitable microbial promoters. Examples of promoter systems include Lac Z, λPL, TAC, T7 polymerase, tryptophan, and alkaline phosphatase promoters and combinations thereof. Suitable prokaryotic signal sequences may be obtained from genes encoding, for example, LamB or OmpF (Wong et al., Gene, 68:193 1983), MalE, PhoA, the E. coli heat-stable enterotoxin II (STII) signal sequence, or a Pel B secretory signal sequence. In some embodiments, the expression vector will further contain a secretory signal sequence operably fused to the nucleic acid encoding the polypeptide. In some embodiments, the secretory sequence is a Pel B secretory signal sequence. In some embodiments, the replicable expression vector also may contain a phenotypic selection gene. Typical phenotypic selection genes are those encoding proteins that confer antibiotic resistance upon the host cell. By way of illustration, the ampicillin resistance gene (amp), the tetracycline resistance gene (tet), or a carbenicillin resistance gene may be used.
Suitable vectors containing the nucleic acid encoding the desired polypeptide are constructed using standard recombinant DNA procedures. Isolated DNA fragments to be combined to form the vector are cleaved, tailored, and ligated together in a specific order and orientation to generate the desired vector. In some embodiments, the DNA is cleaved using the appropriate restriction enzyme or enzymes in a suitable buffer. Appropriate buffers, DNA concentrations, and incubation times and temperatures are specified by the manufacturers of the restriction enzymes. Generally, incubation times of about one or two hours at 37° C. are adequate, although several enzymes require higher temperatures. After incubation, the enzymes and other contaminants are removed by extraction of the digestion solution with a mixture of phenol and chloroform, and the DNA is recovered from the aqueous fraction by precipitation with ethanol.
To ligate the DNA fragments together to form a functional vector, the ends of the DNA fragments must be compatible with each other. In some cases, the ends will be directly compatible after endonuclease digestion. However, it may be necessary to first convert the sticky ends commonly produced by endonuclease digestion to blunt ends to make them compatible for ligation. To blunt the ends, the DNA is treated in a suitable buffer for at least 15 minutes at 15° C. with 10 units of the Klenow fragment of DNA polymerase I (Klenow) in the presence of the four deoxynucleotide triphosphates. The DNA is then purified by phenol-chloroform extraction and ethanol precipitation.
The DNA fragments that are to be ligated together (previously digested with the appropriate restriction enzymes such that the ends of each fragment to be ligated are compatible) are put in solution. In some embodiments, the DNA fragments are provided in about equimolar amounts. In some embodiments, the solution will also contain ATP, ligase buffer, and a ligase such as T4 DNA ligase, such as at or about 10 units per 0.5 μg of DNA. If the DNA fragment is to be ligated into a vector, the vector is first linearized by cutting with the appropriate restriction endonuclease(s). The linearized vector is then treated with alkaline phosphatase or calf intestinal phosphatase. The phosphatasing prevents self-ligation of the vector during the ligation step.
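The arithmetic behind “about equimolar amounts” and “10 units per 0.5 μg of DNA” can be sketched as follows. This is an illustrative calculation only: the average mass of 660 g/mol per base pair is a commonly used approximation assumed here, and the function names are hypothetical.

```python
# Assumed average molar mass of one double-stranded base pair (g/mol).
AVG_BP_MASS = 660.0


def pmol_of_dsdna(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def insert_mass_for_equimolar(vector_ng, vector_bp, insert_bp):
    """Mass of insert (ng) giving the same number of moles as the vector."""
    # Equal moles means mass scales in proportion to fragment length.
    return vector_ng * insert_bp / vector_bp


def ligase_units(total_dna_ug, units_per_half_ug=10.0):
    """Units of ligase at the stated ratio of about 10 units per 0.5 ug DNA."""
    return total_dna_ug / 0.5 * units_per_half_ug
```

For example, under these assumptions 100 ng of a 5000 bp linearized vector is matched by 10 ng of a 500 bp insert, and 1 μg of total DNA calls for about 20 units of ligase.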
In some embodiments, a plurality of constructed replicable expression vectors are transformed into suitable host cells. Suitable host cells include prokaryotic host cells. In some embodiments, the host cells used for expressing or producing the display libraries are E. coli cells. Suitable prokaryotic host cells include E. coli strain JM101, E. coli K12 strain 294 (ATCC number 31,446), E. coli strain W3110 (ATCC number 27,325), E. coli X1776 (ATCC number 31,537), E. coli XL1-Blue (Stratagene), and E. coli B; however, many other strains of E. coli, such as HB101, NM522, NM538, NM539, and many other species and genera of prokaryotes may be used as well. In addition to the E. coli strains listed above, bacilli such as Bacillus subtilis, other Enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species may all be used as hosts. In some embodiments, the host cell is a protease deficient strain of E. coli. In some embodiments, the host cells are TG1 electrocompetent cells.
Transformation of prokaryotic cells is readily accomplished using the calcium chloride method as described in section 1.82 of Sambrook et al., supra. Alternatively, electroporation (Neumann et al., EMBO J., 1:841 1982) may be used to transform these cells. In some embodiments, the methods further include infecting the transformed host cells with a helper phage having a gene encoding the phage coat protein. In some embodiments, the methods further include the use of a helper phage in order to promote sufficient expression of the phagemid particles. In some embodiments, the helper phage is selected from the group consisting of M13K07, M13R408, M13-VCS, and Phi X 174. In some embodiments, the helper phage is M13K07. The transformed infected host cells are then cultured under conditions suitable for forming recombinant phagemid particles containing at least a portion of the plasmid and capable of transforming the host. The transformed cells are selected by growth on an antibiotic, for example tetracycline (tet) or ampicillin (amp), carbenicillin or other antibiotic depending on the particular expression vector, to which they are rendered resistant due to the presence of resistance genes on the vector.
After selection of the transformed cells, these cells are grown in culture and the plasmid DNA (or other vector with the foreign gene inserted) is then isolated. Plasmid DNA can be isolated using methods known in the art. The isolated DNA can be purified by methods known in the art. This purified plasmid DNA is then analyzed by restriction mapping and/or DNA sequencing.

1. Polypeptides for Display

In some embodiments, the polypeptides for display include an ultralong CDR3. In cow antibodies, the ultralong CDR3 sequence forms a structure in which a subdomain with an unusual architecture is formed from a “stalk”, composed of two 12-residue, anti-parallel β-strands (ascending and descending strands), and a longer, e.g., 39-residue, disulfide-rich “knob” that sits atop the stalk, far from the canonical antibody paratope. The long anti-parallel β-ribbon serves as a bridge to link the knob domain with the main antibody scaffold. The unique “stalk and knob” structure of the ultralong CDR3 results in the two antiparallel β-strands (an ascending and descending stalk strand) supporting a disulfide bonded knob protruding out of the antibody surface to form a mini antigen binding domain. In some embodiments, the ultralong CDR3 antibodies comprise, in order, an ascending stalk region, a knob region, and a descending stalk region.
In some embodiments, the ultralong CDR-H3 includes an ascending stalk domain (Stalk A), a disulfide-rich knob region, and a descending stalk domain (Stalk B), in which the knob region is positioned between the ascending and descending stalk domains. In some embodiments, the sequence of the ultralong CDR-H3 provides a structure of anti-parallel β-strands that protrude away from the antibody, in which the disulfide-rich knob region is positioned at the tip of the antibody (FIG. 1). Stalk A comprises mainly hydrophobic side chains and a relatively conserved motif at the base, which initiates the ascending strand. This conserved motif is typically found following the first cysteine residue in variable region sequences of the various bovine or cow sequences. In some embodiments, the base of Stalk A contains residues CTTVHQ (SEQ ID NO: 98), CATVHQ (SEQ ID NO: 99), CAIVQQ (SEQ ID NO: 100), or CATVDQ (SEQ ID NO: 101) that stabilize the base by interacting with residues of the CDR-H1. The Stalk A is connected by a variable number of residues, e.g., 2 to 8 amino acid residues, before a first conserved cysteine residue that forms part of the disulfide-bonded knob region. In some embodiments, the knob region includes a first conserved amino acid motif Cys-Pro (CP), in which the initial cysteine residue forms the first disulfide bond with another cysteine residue in the knob. The knob may include 2-12 cysteine residues that are able to form 2-6 disulfide bonds. The stalk can be of variable length, and Stalk B may comprise alternating aromatics that form a ladder through stacking interactions, which may contribute to the stability of the long solvent-exposed, two stranded β-ribbon (Wang et al. Cell. 2013, 153 (6): 1379-1393). In some embodiments, the Stalk B contains a conserved pattern of alternating tyrosines, sometimes with the motif YX₁YX₂Y (SEQ ID NO: 102), that support the knob structure.
In some embodiments, the ultralong CDR3 includes or is a peptide sequence of 25-70 amino acids. In some embodiments, the ultralong CDR3 is a peptide sequence that is between or between about 35 and 70 amino acids in length, 40 and 70 amino acids in length, 45 and 70 amino acids in length, 50 and 70 amino acids in length, 55 and 70 amino acids in length, or 60 and 70 amino acids in length.
In some embodiments, the ultralong CDR3 includes a cysteine motif. In some embodiments, the cysteine motif includes 2-20 cysteine residues, for instance between or between about 2 and 18, 2 and 16, 2 and 14, 2 and 12, 2 and 10, 2 and 8, 2 and 6, 2 and 4, 4 and 20, 4 and 18, 4 and 16, 4 and 14, 4 and 12, 4 and 10, 4 and 8, 4 and 6, 6 and 20, 6 and 18, 6 and 16, 6 and 14, 6 and 12, 6 and 10, 6 and 8, 8 and 20, 8 and 18, 8 and 16, 8 and 14, 8 and 12, 8 and 10, 10 and 20, 10 and 18, 10 and 16, 10 and 14, 10 and 12, 12 and 20, 12 and 18, 12 and 16, 12 and 14, 14 and 20, 14 and 18, 14 and 16, 16 and 20, 16 and 18, or 18 and 20 cysteine residues, each inclusive. In some embodiments, the cysteine motif includes 2-12 cysteine residues.
In some embodiments, the ultralong CDR3 knob includes 1-10 disulfide bonds, for instance between or between about 1 and 9, 1 and 8, 1 and 7, 1 and 6, 1 and 5, 1 and 4, 1 and 3, 1 and 2, 2 and 10, 2 and 9, 2 and 8, 2 and 7, 2 and 6, 2 and 5, 2 and 4, 2 and 3, 3 and 10, 3 and 9, 3 and 8, 3 and 7, 3 and 6, 3 and 5, 3 and 4, 4 and 10, 4 and 9, 4 and 8, 4 and 7, 4 and 6, 4 and 5, 5 and 10, 5 and 9, 5 and 8, 5 and 7, 5 and 6, 6 and 10, 6 and 9, 6 and 8, 6 and 7, 7 and 10, 7 and 9, 7 and 8, 8 and 10, 8 and 9, or 9 and 10 disulfide bonds, each inclusive. In some embodiments, the ultralong CDR3 knob includes 1-6 disulfide bonds.
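As a rough sketch, the length and cysteine ranges recited above can be checked computationally for a candidate sequence. The function name `knob_summary` and the default ranges (25-70 residues, 2-12 cysteines) are illustrative choices taken from the ranges in the text; the upper bound on disulfide bonds simply assumes every cysteine could be paired, which a real knob need not satisfy.

```python
def knob_summary(cdr3, length_range=(25, 70), cysteine_range=(2, 12)):
    """Summarize length and cysteine content of a candidate ultralong CDR3.

    The sequence is given in one-letter amino acid code. max_disulfides is
    an upper bound only: it assumes all cysteines pair with one another.
    """
    seq = cdr3.strip().upper()
    n_cys = seq.count("C")
    return {
        "length": len(seq),
        "length_in_range": length_range[0] <= len(seq) <= length_range[1],
        "cysteines": n_cys,
        "cysteines_in_range": cysteine_range[0] <= n_cys <= cysteine_range[1],
        "max_disulfides": n_cys // 2,
    }
```

A made-up 40-residue sequence containing four cysteines, for instance, falls within both default ranges and could form at most two disulfide bonds.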
In some embodiments, the ultralong CDR3 includes an ascending stalk domain. In some embodiments, the ultralong CDR3 includes a descending stalk domain. In some embodiments, the cysteine motif is between the ascending and descending stalk domains. In some embodiments, the ascending stalk domain includes the sequence CX₂TVX₅Q (SEQ ID NO: 103), wherein X₂ and X₅ are any amino acid. In some embodiments, X₂ is Ser, Thr, Gly, Asn, Ala, or Pro, and X₅ is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu (SEQ ID NO: 104). In some embodiments, X₂ is Ser, Ala, or Thr, and X₅ is His or Tyr (SEQ ID NO: 105).
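The stalk motifs lend themselves to expression as regular expressions over one-letter amino acid codes. The following is a minimal sketch, assuming the three tiers of the CX₂TVX₅Q motif described above and the YX₁YX₂Y motif of Stalk B; the pattern names and the tier labels returned by `classify_stalk_a` are hypothetical.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard residues, one-letter code

# C-X2-T-V-X5-Q with X2 and X5 being any amino acid
STALK_A_GENERAL = re.compile(rf"C[{AA}]TV[{AA}]Q")
# X2 in {S,T,G,N,A,P}; X5 in {H,Q,R,K,G,T,Y,F,W,M,I,V,L}
STALK_A_RESTRICTED = re.compile(r"C[STGNAP]TV[HQRKGTYFWMIVL]Q")
# X2 in {S,A,T}; X5 in {H,Y}
STALK_A_NARROW = re.compile(r"C[SAT]TV[HY]Q")
# Y-X1-Y-X2-Y alternating tyrosine motif of the descending stalk
STALK_B_TYR = re.compile(rf"Y[{AA}]Y[{AA}]Y")


def classify_stalk_a(sequence):
    """Return the most specific ascending-stalk tier found, or None."""
    seq = sequence.upper()
    if STALK_A_NARROW.search(seq):
        return "narrow"
    if STALK_A_RESTRICTED.search(seq):
        return "restricted"
    if STALK_A_GENERAL.search(seq):
        return "general"
    return None
```

Applied to the exemplary Stalk A bases, CTTVHQ and CATVHQ fall in the narrowest tier, CATVDQ matches only the general pattern, and CAIVQQ matches none of the three because it lacks the TV dipeptide.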
In other embodiments, the ultralong CDR3 does not include an ascending stalk domain N-terminal to the cysteine motif. In some embodiments, the ultralong CDR3 does not include a descending stalk domain C-terminal to the cysteine motif.
In some embodiments, the polypeptides for display, e.g., polypeptides including the ultralong CDR3, are derived from bovine antibodies. In some embodiments, the polypeptides for display are produced by amplifying sequences from a cow complementary DNA (cDNA) library. In some embodiments, the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from a cow. In some embodiments, the cDNA template library is synthesized using a pool of immunoglobulin-specific primers. In some embodiments, the cDNA template library is synthesized using a pool of IgM, IgA, and IgG-specific primers. Exemplary primers for use include those with sequences set forth in SEQ ID NO: 3 (IgG), SEQ ID NO: 4 (IgM), SEQ ID NO: 5 (IgA), and SEQ ID NO: 6 (IgG).
In some embodiments, the cow is immunized with a target antigen. In some embodiments, the target antigen is a nonvirulent bacterium, a virus, a viral protein, an immunomodulatory protein (e.g., a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof. In some embodiments, the target antigen is a viral protein. In some embodiments, the cow is immunized with multiple target antigens, for instance different viral antigens. In some embodiments, the different viral antigens are proteins associated with different variants, clades, or strains of a virus.
In some embodiments, the target antigen is a coronavirus, a coronavirus pseudovirus, or an antigen of such virus, such as a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein. Coronaviruses may be from the subfamily Orthocoronavirinae, which is one of two sub-families in the family Coronaviridae, order Nidovirales, and realm Riboviria. There are four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus. SARS CoV2 is a Betacoronavirus, belonging to the subgenus Sarbecovirus. In some embodiments, the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some embodiments, the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant. In some embodiments, the SARS CoV-2 specific antigen comprises a S trimer polypeptide. In some embodiments, the SARS CoV-2 specific antigen comprises a S monomer polypeptide. In some embodiments, the SARS CoV-2 specific antigen comprises a polynucleotide encoding a S trimer or monomer polypeptide. In some embodiments, the cow is immunized with multiple target antigens associated with any combination of coronaviruses 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some embodiments, the cow is immunized with multiple target antigens associated with any combination of SARS-CoV2 variants selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant.
In some embodiments, the antigen is a cancer antigen. In some embodiments, the antigen is selected from among ACTHR, endothelial cell Anxa-1, aminopeptidase N, anti-IL-6R, alpha-4-integrin, alpha-5-beta-3 integrin, alpha-5-beta-5 integrin, alpha-fetoprotein (AFP), ANPA, ANPB, APA, APN, APP, IAR, 2AR, AT1, B1, B2, BAGE1, BAGE2, B-cell receptor BB1, BB2, BB4, calcitonin receptor, cancer antigen 125 (CA 125), CCK1, CCK2, CD5, CD10, CD11a, CD13, CD14, CD19, CD20, CD22, CD25, CD30, CD33, CD38, CD45, CD52, CD56, CD68, CD90, CD133, CD7, CD15, CD34, CD44, CD206, CD271, CEA (CarcinoEmbryonic Antigen), CGRP, chemokine receptors, cell-surface annexin-1, cell-surface plectin-1, Cripto-1, CRLR, CXCR2, CXCR4, DCC, DLL3, E2 glycoprotein, EGFR, EGFRvIII, EMR1, Endosialin, EP2, EP4, EpCAM, EphA2, ET receptors, Fibronectin, Fibronectin ED-B, FGFR, frizzled receptors, GAGE1, GAGE2, GAGE3, GAGE4, GAGE5, GAGE6, GLP-1 receptor, G-protein coupled receptors of the Family A (Rhodopsin-like), G-protein coupled receptors of the Family B (Secretin receptor-like), G-protein coupled receptors of the Family C (Metabotropic Glutamate Receptor-like), GD2, GP100, GP120, Glypican-3, hemagglutinin, Heparin sulfates, HER1, HER2, HER3, HER4, HMFG, HPV 16/18 and E6/E7 antigens, hTERT, IL11-R, IL-13R, ITGAM, Kallikrein-9, Lewis Y, LH receptor, LHRH-R, LPA1, MAC-1, MAGE 1, MAGE 2, MAGE 3, MAGE 4, MART1, MC1R, Mesothelin, MUC1, MUC16, Neu (cell-surface Nucleolin), Neprilysin, Neuropilin-1, Neuropilin-2, NG2, NK1, NK2, NK3, NMB-R, Notch-1, NY-ESO-1, OT-R, mutant p53, p97 melanoma antigen, NTR2, NTR3, p32 (p32/gC1q-R/HABP1), p75, PAC1, PAR1, Patched (PTCH), PDGFR, PDFG receptors, PDT, Protease-cleaved collagen IV, proteinase 3, prohibitin, protein tyrosine kinase 7, PSA, PSMA, purinergic P2X family (e.g., P2X1-5), mutant Ras, RAMP1, RAMP2, RAMP3 patched, RET receptor, plexins, smoothened, sst1, sst2A, sst2B, sst3, sst4, sst5, substance P, TEMs, T-cell CD3 Receptor, TAG72, TGFBR1, TGFBR2, Tie-1, Tie-2, Trk-A, Trk-B, Trk-C, TR1, TRPA, TRPC, TRPV, TRPM, TRPML, TRPP (e.g., TRPV1-6, TRPA1, TRPC1-7, TRPM1-8, TRPP1-5, TRPML1-3), TSH receptor, VEGF receptors (VEGFR1 or Flt-1, VEGFR2 or FLK-1/KDR, and VEGF-3 or FLT-4), voltage-gated ion channels, VPAC1, VPAC2, Wilms tumor 1, Y1, Y2, Y4, and Y5.
In some embodiments, the antigen is HER1/EGFR, HER2/ERBB2, CD20, CD25 (IL-2Rα receptor), CD33, CD52, CD133, CD206, CEA, CEACAM1, CEACAM3, CEACAM5, CEACAM6, cancer antigen 125 (CA 125), alpha-fetoprotein (AFP), Lewis Y, TAG72, Caprin-1, mesothelin, PDGF receptor, PD-1, PD-L1, CTLA-4, IL-2 receptor, vascular endothelial growth factor (VEGF), CD30, EpCAM, EphA2, Glypican-3, gpA33, mucins, CAIX, PSMA, folate-binding protein, gangliosides (such as GD2, GD3, GM1 and GM2), VEGF receptor (VEGFR), integrin αVβ3, integrin α5β1, ERBB3, MET, IGF1R, EPHA3, TRAILR1, TRAILR2, RANKL, FAP, tenascin, AFP, BCR complex, CD3, CD18, CD44, CTLA-4, gp72, HLA-DR 10 β, HLA-DR antigen, IgE, MUC-1, nuC242, PEM antigen, metalloproteinases, Ephrin receptor, Ephrin ligands, HGF receptor, CXCR4, CXCR4, Bombesin receptor, and SK-1 antigen.
In some embodiments, the antigen is CD25, PD-1 (CD279), PD-L1 (CD274, B7-H1), PD-L2 (CD273, B7-DC), CTLA-4, LAG3 (CD223), TIM3 (HAVCR2), 4-1BB (CD137, TNFRSF9), CXCR2, CXCR4 (CD184), CD27, CEACAM1, Galectin 9, BTLA, CD160, VISTA (PD1 homologue), B7-H4 (VCTN1), CD80 (B7-1), CD86 (B7-2), CD28, HHLA2 (B7-H7), CD28H, CD155, CD226, TIGIT, CD96, Galectin 3, CD40, CD40L, CD70, LIGHT (TNFSF14), HVEM (TNFRSF14), B7-H3 (CD276), Ox40L (TNFSF4), CD137L (TNFSF9, GITRL), B7RP1, ICOS (CD278), ICOSL, KIR, GAL9, NKG2A (CD94), GARP, TL1A, TNFRSF25, TMIGD2, BTNL2, Butyrophilin family, CD48, CD244, Siglec family, CD30, CSF1R, MICA (MHC class I polypeptide-related sequence A), MICB (MHC class I polypeptide-related sequence B), NKG2D, KIR family (Killer-cell immunoglobulin-like receptor), LILR family (Leukocyte immunoglobulin-like receptors, CD85, ILTs, LIRs), SIRPA (Signal regulatory protein alpha), CD47 (IAP), Neuropilin 1 (NRP-1), a VEGFR, and VEGF.
In some embodiments, the antigen is an immunomodulatory protein (e.g., a checkpoint molecule). In some embodiments, the antigen is an immune checkpoint receptor ligand. Illustrative immune checkpoint molecules that may be targeted for blocking or inhibition include, but are not limited to, PD1 (CD279), PDL1 (CD274, B7-H1), PDL2 (CD273, B7-DC), CTLA-4, LAG3 (CD223), TIM3, 4-1BB (CD137), 4-1BBL (CD137L), GITR (TNFRSF18, AITR), CD40, Ox40 (CD134, TNFRSF4), CXCR2, tumor associated antigens (TAA), B7-H3, B7-H4, BTLA, HVEM, GAL9, B7H3, B7H4, VISTA, KIR, 2B4 (belongs to the CD2 family of molecules and is expressed on all NK, γδ, and memory CD8+ (αβ) T cells), CD160 (also referred to as BY55) and CGEN-15049. In some embodiments, the immune checkpoint molecule is CD25, PD-1, PD-L1, PD-L2, CTLA-4, LAG-3, TIM-3, 4-1BB, GITR, CD40, CD40L, OX40, OX40L, CXCR2, B7-H3, B7-H4, BTLA, HVEM, CD28, or VISTA.
In some embodiments, the polypeptides for display are synthetic. In some embodiments, the synthetic polypeptides include all or a portion of a bovine antibody, e.g., an ultralong CDR3 knob. In some embodiments, the synthetic polypeptide is a modified cyclotide. In some embodiments, the modified cyclotide includes an ultralong CDR3 knob sequence, e.g., of a cow.
In some embodiments, the polypeptides for display contain a variable heavy region containing the ultralong CDR-H3 and a variable light region. Particular formats include single chain formats, such as a single chain variable fragment (scFv). In other embodiments, the polypeptide for display is a smaller peptide of 25-70 amino acids, such as 40-70 amino acids, that is a knob peptide. Exemplary molecules for display and display libraries are described.
a. scFv Peptides for Display
In some embodiments, the polypeptide for display is a single-chain variable fragment (scFv). In some embodiments, the scFv includes a VH region having a cow ultralong CDR3. In some embodiments, the VH region is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some embodiments, the amplifying is by amplifying sequences encoding VH regions of bovine antibody families known or suspected to contain ultralong CDR3s. In some embodiments, sequences of VH regions of the IgHV1-7 family are amplified to produce sequences encoding the VH region of the scFv. In some embodiments, the VH regions of the IgHV1-7 family are amplified with a forward primer that includes the sequence set forth in SEQ ID NO: 84 and a reverse primer that includes the sequence set forth in SEQ ID NO: 85. In some embodiments, the forward primer and/or the reverse primer further include sequences specific to restriction enzyme sites in order to facilitate cloning. In some embodiments, the VH regions of the IgHV1-7 family are amplified with a forward primer set forth in SEQ ID NO: 12 and a reverse primer set forth in SEQ ID NO: 13.
In some embodiments, preparation of sequences for the VH regions of the polypeptides for display also includes a size separation step. In some embodiments, following amplification of VH region sequences, e.g., of the IgHV1-7 family, such as from a cow cDNA template library, sequences encoding VH regions with an ultralong CDR3 are separated from shorter sequences encoding VH regions without an ultralong CDR3. In some embodiments, the size separation step further enriches for amplified sequences encoding VH regions with an ultralong CDR3.
In some embodiments, the size separation step involves separating, from sequences encoding a plurality of amplified VH regions, sequences of, of about, or greater than 425, 450, 475, 500, 525, or 550 base pairs in length, wherein the sequences of, of about, or greater than 425, 450, 475, 500, 525, or 550 base pairs in length include the sequences encoding VH regions with an ultralong CDR3. In some embodiments, sequences of, of about, or greater than 550 base pairs in length are separated from the remaining sequences.
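For readers working with sequence data rather than a gel, the length cutoff described above has a simple in silico analogue. The following is a minimal sketch, assuming the amplified VH sequences are already available as nucleotide strings keyed by a clone name. The function name and the toy records are hypothetical, and the default cutoff of 550 base pairs is only one of the values recited above.

```python
def select_long_amplicons(amplicons, min_length_bp=550):
    """Keep amplicons whose length is at or above the cutoff (in base pairs)."""
    return {
        name: seq
        for name, seq in amplicons.items()
        if len(seq) >= min_length_bp
    }


# Toy records with placeholder sequences; only their lengths matter here.
toy_amplicons = {
    "clone_short": "A" * 400,
    "clone_cutoff": "A" * 550,
    "clone_long": "A" * 600,
}

selected = select_long_amplicons(toy_amplicons)
print(sorted(selected))
```

A lower cutoff from the recited list, for example 425 or 500 base pairs, can be supplied through `min_length_bp`.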
In some embodiments, the size separation is performed by agarose gel electrophoresis. In some embodiments, a 1.2%, 1.5%, or 2% agarose gel is used. In some embodiments, a 2% agarose gel is used.
In some embodiments, the scFv includes a VL region that is fixed across polypeptides of the display library. In some aspects, the use of a fixed VL region improves selection and/or screening for scFvs including a VH region with an ultralong CDR3. In some embodiments, the VL region is a variable lambda light (VL) region selected from the group consisting of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or is a humanized variant thereof. In some embodiments, the VL region is the BLV5B8 lambda VL region (SEQ ID NO: 110) or a humanized variant thereof. In some embodiments, the VL region is the BLV1H12 lambda VL region or a humanized variant thereof. In some embodiments, the BLV1H12 VL region is set forth in SEQ ID NO: 2. In some embodiments, the humanized variant comprises one or more of amino acid replacements S2A, T5N, P8S, A12G, A13S, and P14L based on Kabat numbering, amino acid replacements I29V and N32G in the CDR1 region, and/or amino acid substitution of DNN to GDT in the CDR2 region. In some embodiments, the humanized variant of BLV1H12 comprises the sequence set forth in SEQ ID NO: 107.
In some embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 30% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 40% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 50% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 60% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 70% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 80% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 90% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 95% of the displayed scFvs include a VH region comprising an ultralong CDR3 region.
In some embodiments, the VH and VL regions of the scFv are joined directly. In some embodiments, the VH and VL regions of the scFv are joined indirectly, e.g., via a peptide linker. In some embodiments, the peptide linker is a flexible linker. In some embodiments, the peptide linker is (Gly4 Ser)3 (SEQ ID NO: 94).
b. Knob Peptides for Display
In some embodiments, the polypeptide for display is an ultralong CDR3 knob, e.g., a cow ultralong CDR3. In some embodiments, the ultralong CDR3 knob is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.
In some embodiments, the amplifying is by amplifying sequences encoding ultralong CDR3 knobs. In some embodiments, primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region are used to amplify the sequences encoding ultralong CDR3 knobs. In some embodiments, the ultralong CDR3 knob comprises a portion of the ascending stalk domain, such as 1, 2, 3, 4, 5 or 6 amino acids. In some embodiments, the ultralong CDR3 knob comprises a portion of the descending stalk domain, such as 1, 2, 3, 4, 5, 6, 7, 8, or 9 amino acids. In some embodiments, the ascending stalk domain includes the sequence CX₂TVX₅Q, wherein X₂ and X₅ are any amino acid. In some embodiments, X₂ is Ser, Thr, Gly, Asn, Ala, or Pro, and X₅ is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu. In some embodiments, X₂ is Ser, Ala, or Thr, and X₅ is His or Tyr. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 7-11. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 8-11. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 121-130. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 123, 127, and 128.
In some embodiments, the primers used for amplifying are a pool of different primers specific for the ascending and descending stalk domains. In some embodiments, the pool of primers contains at least two, three, four, five, six, seven, eight, nine, or 10 different primers. In some embodiments, the pool of primers contains at least two, three, four, five, six, seven, eight, nine, or 10 different primers from the primers set forth in SEQ ID NO: 7-11 and 121-130. In some embodiments, the pool of primers contains at least two, three, four, five, six, or seven different primers from the primers set forth in SEQ ID NO: 8-11, 123, 127, and 128. In some embodiments, the pool of primers contains the primers set forth in SEQ ID NO: 8-11. In some embodiments, the pool of primers contains the primers set forth in SEQ ID NO: 123, 127, and 128. In some embodiments, the pool of primers contains the primers set forth in SEQ ID NO: 8-11, 123, 127, and 128.
In some embodiments, the knob peptide is a peptide identified using methods as described in Section II.C. Once identified, the knob peptide sequences can be amplified using methods known to a skilled artisan. In other embodiments, the knob peptide may be synthetically generated. A variety of techniques, including recombinant methods, chemical synthesis, or combinations thereof, may be employed. In some embodiments, chemical synthesis methods may include known chemical synthesis techniques, such as the phosphoramidite method. In some instances, a recombinant or synthetic nucleic acid may be generated through polymerase chain reaction (PCR).
c. Synthetic Peptides for Display
In some embodiments, the polypeptide for display is a synthetic peptide. In some embodiments, the synthetic peptide is a random sequence polypeptide with a cysteine motif and disulfide bonds as described herein, e.g., with 2-20 cysteine residues and 1-10 disulfide bonds. In some embodiments, the synthetic peptide has been selected from a random sequence library for having a cysteine motif and disulfide bonds as described herein, e.g., for having 2-20 cysteine residues and 1-10 disulfide bonds. Methods of producing a random sequence library are known.
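By way of illustration, the cysteine criterion described above (2-20 cysteine residues and 1-10 disulfide bonds) can be applied as a first-pass sequence filter. The sketch below assumes one-letter amino acid codes. Because actual disulfide pairing cannot be read from sequence alone, it reports only the maximum number of disulfide bonds the cysteine count could support, which is an assumption of this sketch rather than a method of the disclosure. All names and test peptides are hypothetical.

```python
def cysteine_profile(sequence):
    """Return (cysteine_count, max_possible_disulfides) for a peptide."""
    count = sequence.upper().count("C")
    # Each disulfide bond consumes two cysteines, so count // 2 is an upper bound.
    return count, count // 2


def meets_cysteine_criteria(sequence, min_cys=2, max_cys=20,
                            min_bonds=1, max_bonds=10):
    """Check the cysteine count and the upper bound on disulfide bonds."""
    count, max_disulfides = cysteine_profile(sequence)
    return (min_cys <= count <= max_cys
            and min_bonds <= max_disulfides <= max_bonds)


# Hypothetical peptides used only to exercise the filter.
print(cysteine_profile("GCAACGGCKKC"))
print(meets_cysteine_criteria("GCAACGGCKKC"))
print(meets_cysteine_criteria("GAKT"))
```

A peptide that passes this filter would still need experimental confirmation that the expected disulfide bonds actually form.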
In some embodiments, the polypeptide for display is a semisynthetic ultralong CDR3 knob. In some embodiments, the semisynthetic ultralong CDR3 knob is derived from a bovine ultralong CDR3 knob that has been used as a scaffold for modifications. In some embodiments, the bovine ultralong CDR3 knob has been modified to include random mutations, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds. In some embodiments, the bovine ultralong CDR3 knob has been modified to include an exogenous peptide sequence. In some embodiments, the bovine ultralong CDR3 knob has been modified to delete one or more peptide sequences therein, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds.
In some embodiments, the polypeptide for display is a cyclotide. In some embodiments, the polypeptide for display is a modified cyclotide, e.g., that has been modified to include an exogenous peptide sequence. In some embodiments, the modified cyclotide includes an ultralong CDR3 knob sequence or a portion thereof, including any as described herein or identified according to the provided methods.
Cysteine-knot microproteins (cyclotides) include a naturally occurring family of cysteine-knot microproteins or cyclotides found in various plant species. Cysteine-knot microproteins (cyclotides) are small peptides, typically consisting of about 30-40 amino acids, which can be found naturally as cyclic or linear forms, where the cyclic form has no free N- or C-terminal amino or carboxyl end. They have a defined structure based on three intra-molecular disulfide bonds and a small triple stranded β-sheet (Craik et al., 2001; Toxicon 39, 43-60). The cyclic proteins exhibit conserved cysteine residues defining a structure referred to herein as a “cysteine knot”. This family includes both naturally occurring cyclic molecules and their linear derivatives as well as linear molecules which have undergone cyclization. These molecules are useful as molecular framework structures having enhanced stability over less structured peptides. (Colgrave and Craik, 2004; Biochemistry 43, 5965-5975).
The main cyclotide features are a remarkable stability due to the cysteine knot, a small size making them readily accessible to chemical synthesis, and an excellent tolerance to sequence variations. The cyclotide scaffold is found in almost 30 different protein families among which conotoxins, spider toxins, squash inhibitors, agouti-related proteins and plant cyclotides are the most populated families. Cyclotides from plants in the Rubiaceae and Violaceae families are for the most part found to be head-to-tail cyclic peptides (Craik et al. 2010. Cell. Mol. Life Sci. 67:9-16). However, within the squash inhibitor family of cyclotides both cyclic and linear cyclotides have been identified from Momordica cochinchinensis: the cyclic trypsin inhibitors (MCoTI)-I and -II and their linear counterpart MCoTI-III (Hernandez et al. 2000. Biochemistry, 39, 5722-5730). It is now clear that both cyclic and linear variants can exist in different cyclotide families, but the impact of the cyclization is poorly understood. Cyclic peptides were expected to display improved stability, better resistance to proteases, and reduced flexibility when compared to their linear counterparts, hopefully resulting in enhanced biological activities. However, linear cyclotides have the advantage of being able to be more easily linked to other peptides or proteins.
For instance, cyclotides are commonly found in plants. In aspects of provided embodiments, cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae, and Violaceae plant species. In a preferred aspect, cyclotides of the invention are derived from linear or cyclic forms of cyclotides of the Momordicae species, including the squash serine protease inhibitor family (Otlewski & Krowarsch, Acta Biochim Pol. 1996; 43(3):431-44), and in a more preferred aspect from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97] below.

	Mcoti-I
	[SEQ ID NO: 95]
	GGVCPKILQRCRRDSDSPGACICRGNGYCGSGSD

	Mcoti-II
	[SEQ ID NO: 96]
	GGVCPKILKKCRRDSDSPGACICRGNGYCGSGSD

	Mcoti-III
	[SEQ ID NO: 97]
	ERACPRILKKCRRDSDSPGACICRGNGYCG

In some embodiments, the cyclotide molecular framework comprises a sequence of amino acids or analogues thereof forming a cysteine-knot backbone, wherein said cysteine-knot backbone comprises sufficient disulfide bonds, or chemical equivalents thereof, to confer a knotted topology on the three-dimensional structure of said cysteine-knot backbone, and wherein at least one exposed amino acid residue, such as on one or more beta turns and/or within one or more loops, is inserted or substituted (replaced) relative to the naturally occurring amino acid sequence. In some embodiments, the cyclotide is modified by the insertion of or substitution with an exogenous peptide sequence. Hence, the cyclotides described herein are modified cyclotides compared to a natural or wildtype unmodified cyclotide, in which the modified cyclotide has one or more loops inserted or substituted by one or more amino acid sequences, e.g., an exogenous peptide sequence. In aspects of provided embodiments, the modified cyclotides incorporate sufficient amino acid structure to provide high enzymatic stability.
In some embodiments, the modified cyclotide sequence may be defined as having a cysteine knot backbone moiety and an exogenous peptide sequence, said modified cyclotide comprising: i) an exogenous peptide sequence, wherein said sequence is about 2 to 50 amino acid residues; and ii) a cysteine knot backbone grafted to said sequence of step i), wherein said cysteine knot backbone comprises the structure (I):
wherein C₁ to C₆ are cysteine residues; wherein each of C₁ and C₄, C₂ and C₅, and C₃ and C₆ are connected by a disulfide bond to form a cysteine knot; wherein each X represents an amino acid residue in a loop, wherein said amino acid residues are the same or different; wherein d is about 1-2; wherein one or more of loops 1, 2, 3, 5 or 6 have an amino acid sequence comprising the sequence of clause i), wherein any loop comprising said sequence of clause i) comprises 2 to about 50 amino acids, and wherein for any of loops 1, 2, 3, 5, or 6 that do not contain said sequence of clause i), a, b, c, e, and f, are the same or different, and are each any number from 3-10, and b, c, e, and f are each any number from 1 to 20.
In some embodiments, the modified cyclotide sequence may be either linear or cyclic.
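As an illustration of the cysteine-knot connectivity described above (C₁-C₄, C₂-C₅, and C₃-C₆), the following sketch locates the six cysteines in a linear peptide, reports the number of residues in each loop, and lists the residue positions joined by each disulfide bond. It assumes one-letter codes and zero-based positions. It also assumes that loops 1 to 5 lie between consecutive cysteines and that loop 6 is formed from the residues after C₆ together with those before C₁ on cyclization. The function name and the test peptide are hypothetical, and inputs that do not contain exactly six cysteines are rejected rather than guessed at.

```python
# Pairing stated in the text, as 1-based cysteine ordinals.
KNOT_PAIRS = [(1, 4), (2, 5), (3, 6)]


def describe_cysteine_knot(sequence):
    """Return cysteine positions, loop lengths, and disulfide-paired positions."""
    seq = sequence.upper()
    positions = [i for i, residue in enumerate(seq) if residue == "C"]
    if len(positions) != 6:
        raise ValueError(f"expected 6 cysteines, found {len(positions)}")

    # Loops 1-5: residues strictly between consecutive cysteines.
    loops = {
        f"loop{n}": positions[n] - positions[n - 1] - 1
        for n in range(1, 6)
    }
    # Loop 6 (assumption): C-terminal tail plus N-terminal tail, joined on cyclization.
    loops["loop6"] = (len(seq) - positions[5] - 1) + positions[0]

    bonds = [(positions[a - 1], positions[b - 1]) for a, b in KNOT_PAIRS]
    return {
        "cysteine_positions": positions,
        "loop_lengths": loops,
        "disulfide_pairs": bonds,
    }


# Hypothetical six-cysteine peptide, not a sequence from the disclosure.
knot = describe_cysteine_knot("GCAAACGGGCKCTTCSSSSCG")
print(knot["loop_lengths"])
print(knot["disulfide_pairs"])
```

The loop lengths returned here correspond to the values a, b, c, d, e, and f in the description of structure (I), under the loop-numbering assumption stated above.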
In some embodiments, modified cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae, and Violaceae plant species. In some embodiments, the modified cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae species, including the squash serine protease inhibitor family (Otlewski & Krowarsch, Acta Biochim Pol. 1996; 43(3):431-44). In some embodiments, the modified cyclotides are derived from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97], the sequences of which are set forth above.

For instance, the unmodified or wildtype cyclotide can be a cyclotide set forth in any one of SEQ ID NO: 95-97 to which one or more loops thereof is inserted or substituted by one or more amino acid sequences (e.g., an exogenous peptide sequence). In particular embodiments, the modified cyclotides are derived from loop replacement libraries based on Mcoti-II (SEQ ID NO: 96).
In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 1. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 5. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 6, such as formed subject to cyclization.
In some embodiments, the exogenous peptide sequence that is inserted or replaced into an unmodified cyclotide, e.g. the cyclotide Mcoti-II (SEQ ID NO: 96), is 2 to 50 amino acid residues. In some embodiments, the exogenous peptide sequence is 2 to 40 amino acids, 2 to 30 amino acids, 2 to 25 amino acids, 2 to 20 amino acids, 2 to 15 amino acids, 2 to 10 amino acids, 2 to 5 amino acids, 5 to 50 amino acids, 5 to 40 amino acids, 5 to 30 amino acids, 5 to 25 amino acids, 5 to 20 amino acids, 5 to 15 amino acids, 5 to 10 amino acids, 10 to 50 amino acids, 10 to 40 amino acids, 10 to 30 amino acids, 10 to 25 amino acids, 10 to 15 amino acids, 15 to 50 amino acids, 15 to 40 amino acids, 15 to 30 amino acids, 15 to 25 amino acids, 15 to 20 amino acids, 20 to 50 amino acids, 20 to 40 amino acids, 20 to 30 amino acids, 20 to 25 amino acids, 25 to 50 amino acids, 25 to 40 amino acids, 25 to 30 amino acids, 30 to 50 amino acids, 30 to 40 amino acids, or 40 to 50 amino acids. In some embodiments, the exogenous peptide sequence is 2 to 30 amino acids, such as 2 to 24 amino acids, 2 to 18 amino acids, 2 to 12 amino acids, 2 to 6 amino acids, 6 to 30 amino acids, 6 to 24 amino acids, 6 to 18 amino acids, 6 to 12 amino acids, 12 to 30 amino acids, 12 to 24 amino acids, 12 to 18 amino acids, 18 to 30 amino acids, 18 to 24 amino acids or 24 to 30 amino acids.

B. Display Libraries

Also provided herein are libraries of display particles, e.g., phagemid particles, including any that are produced by any of the provided methods.
Also provided herein is a phagemid that comprises or is a replicable expression vector containing a gene fusion encoding a fusion protein that includes a first nucleic acid sequence encoding a single chain variable fragment with a cow variable heavy (VH) region that includes an ultralong CDR3 joined to a variable lambda light (VL) region selected from the group consisting of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof, and a second nucleic acid sequence encoding at least a portion of a phage coat protein. In some embodiments, the VL region is the VL region of BLV1H12.
Also provided herein is a phagemid that comprises or is a replicable expression vector containing a gene fusion encoding a fusion protein that includes a first nucleic acid sequence encoding a cow ultralong CDR3 knob and a second nucleic acid sequence encoding at least a portion of a phage coat protein.
Also provided herein is a phagemid that comprises or is a replicable expression vector containing a gene fusion encoding a fusion protein that includes a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif that includes 2-12 cysteine residues able to form disulfide bonds, and a second nucleic acid sequence encoding at least a portion of a phage coat protein.
In some embodiments, also provided herein are libraries of display particles, e.g., phagemid particles, that are encoded by any of the phagemids described herein.
In some embodiments, the display particles include an ultralong CDR3 knob, e.g., any as described herein.
In some embodiments, the display particles include a synthetic or semisynthetic ultralong CDR3 knob, e.g., any as described herein.
In some embodiments, the display particles include a cyclotide, e.g., any as described herein.
In some embodiments, the display particles include a modified cyclotide, e.g., any as described herein.
In some embodiments, the display particles include an scFv with a VH containing an ultralong CDR3 region. In some embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 30% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 35% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 40% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 45% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 50% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 60% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 70% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 80% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. 
In some embodiments, at least or at least about 90% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 95% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region.

C. Library Selection Methods

Also provided herein are methods for selecting, from any of the display libraries described herein, an antibody binding protein that is specific for a target molecule. In these methods, the display libraries are contacted with a target molecule, and those members of the library having the highest affinity for the target are separated from those of lower affinity. The high affinity binders are then amplified by any suitable system. This process is reiterated until polypeptides of the desired affinity are obtained.
For instance, the display library is a phage display library as described herein in which an ultralong CDR3 scFv polypeptide or a CDR3-knob peptide, is fused to a phage coat protein and displayed, usually on average as a single copy of each related polypeptide, on the surface of a phagemid particle containing DNA encoding that polypeptide. These phagemid particles are then contacted with a target molecule and those particles having the highest affinity for the target are separated from those of lower affinity. The high affinity binders are then amplified by infection of a bacterial host and the competitive binding step is repeated. This process is reiterated until polypeptides of the desired affinity are obtained.
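The effect of repeating the bind, separate, and amplify cycle described above can be illustrated with a deliberately simple numerical model. The sketch below is a toy calculation and not a description of any experiment in the disclosure. It assumes that the library contains only two classes of particles (binders and non-binders), that each class is recovered at a fixed fraction in every round, and that amplification does not change their ratio. The function name and all parameter values are hypothetical.

```python
def binder_fraction_by_round(initial_fraction, binder_recovery,
                             background_recovery, rounds):
    """Return the binder fraction before selection and after each round."""
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if binder_recovery <= 0.0 or background_recovery <= 0.0:
        raise ValueError("recovery values must be positive")

    fractions = [initial_fraction]
    fraction = initial_fraction
    for _ in range(rounds):
        # Particles retained on the target after washing, per class.
        kept_binders = fraction * binder_recovery
        kept_background = (1.0 - fraction) * background_recovery
        # Amplification is assumed uniform, so only the ratio carries over.
        fraction = kept_binders / (kept_binders + kept_background)
        fractions.append(fraction)
    return fractions


# Hypothetical values: one binder per million particles, 10% binder recovery,
# 0.01% background recovery, three rounds of selection.
trajectory = binder_fraction_by_round(1e-6, 0.1, 1e-4, 3)
for round_number, value in enumerate(trajectory):
    print(round_number, f"{value:.6f}")
```

In this model, the per-round enrichment is governed by the ratio of binder recovery to background recovery, which is why reducing nonspecific binding can matter as much as improving binder capture.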
In some embodiments, the provided methods include contacting any of the display libraries provided herein with a target molecule under conditions to allow binding of a display particle, e.g., a phagemid particle, to the target molecule. In some embodiments, the methods further include separating the display particles, e.g., the phagemid particles, that bind from those that do not, thereby selecting display particles, e.g., the phagemid particles, that include an antibody binding protein that binds to the target molecule. In some embodiments, the methods include sequencing the fusion gene in the selected particles to identify the antibody binding protein.
Target molecules may be isolated from natural sources or prepared by recombinant methods by procedures known in the art. The purified target molecule can be attached to a suitable matrix such as agarose beads, acrylamide beads, glass beads, cellulose, various acrylic copolymers, hydroxyalkyl methacrylate gels, polyacrylic and polymethacrylic copolymers, nylon, neutral and ionic carriers, and the like. Attachment of the target protein to the matrix may be accomplished by methods described in Methods in Enzymology, 44 (1976), or by other means known in the art.
After attachment of the target molecule to the matrix, the immobilized target can be contacted with the library of display particles, e.g., phagemid particles, under conditions suitable for binding of at least a portion of the display particles with the immobilized target molecules. Normally, the conditions, including pH, ionic strength, temperature and the like, will mimic physiological conditions. Exemplary “contacting” conditions may comprise incubation for 15 minutes to 4 hours, e.g. one hour, at 4°-37° C., e.g. at room temperature. However, these may be varied as appropriate depending on the nature of the interacting binding partners, etc. The mixture can be subjected to gentle rocking, mixing, or rotation. In addition, other appropriate reagents such as blocking agents to reduce nonspecific binding may be added. For example, 1-4% BSA or other suitable blocking agent (e.g. milk) may be used. It will be appreciated, however, that the contacting conditions can be varied and adapted by a skilled person depending on the aim of the screening method. For example, if the incubation temperature is, for example, room temperature or 37° C., this may increase the possibility of identifying binders which are stable under these conditions, e.g., in the case of incubation at 37° C., are stable under conditions found in the human body. Such a property might be extremely advantageous if one or both of the binding partners was a candidate to be used in some sort of therapeutic application, e.g. an antibody. Again, such adaptations to the conditions are within the ambit of the skilled person.
Bound display particles (“binders”) having high affinity for the immobilized target molecule can be separated from those having a low affinity (and thus do not bind to the target) by washing. Binders can be dissociated from the immobilized target molecules by a variety of methods. These methods include competitive dissociation using the wild-type ligand, altering pH and/or ionic strength, and other methods known in the art.
In some embodiments, the target molecule is a nonvirulent bacterium, a virus, a viral protein, a cancer antigen, a human IgG, or a recombinant protein thereof. In some embodiments, the target molecule is a viral protein. In some embodiments, the target molecule is a coronavirus, a coronavirus pseudovirus, a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein. In some embodiments, the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV and SARS-CoV2. In some embodiments, the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant or B.1.1.7 UK variant.
In some embodiments, the methods include steps wherein previously selected display particles are re-expressed and subjected to further selection steps, including with the same or a different target molecule. In some embodiments, the selection steps are repeated one or more times. In some embodiments, the further selection steps include infecting suitable host cells with replicable expression vectors encoding the previously selected display particles; collecting additional amplified display particles; and contacting the additional amplified display particles with the same or a different target antigen. In some embodiments, the different target molecule is related to the target molecule and is the same type of pathogen, the same group of pathogen, or a variant of the target molecule. In some embodiments, the target molecule and different target molecule are associated with any combination of coronaviruses 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some embodiments, the target molecule and different target molecule are associated with any combination of SARS-CoV2 variants selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, and B.1.1.7 UK variant.
Once one or more sets of binders have been selected or isolated in accordance with the provided methods, these can be subjected to further analysis. In some embodiments, the further analysis involves the isolation of binders by infection of bacteria as an amplification step, isolating the phage or phagemid DNA, and cloning the DNA sequence encoding the candidate binders contained in said phage or phagemid DNA into a suitable expression vector. Such an infection step can also allow the amplification of the binders. Alternatively, binders can be amplified at this stage by other appropriate methods, for example by PCR of the nucleic acids encoding said binders or the transformation of said nucleic acid into an appropriate host cell (in the context of a suitable expression vector).
Once the DNA encoding the binders is cloned in a suitable expression vector, the DNA encoding the binders can be sequenced or the protein can be expressed in a soluble form, e.g., including according to the methods provided herein, and subjected to appropriate binding studies to further characterize the candidates at the protein level. Appropriate binding studies will depend on the nature of the binders, and include, but are not limited to, ELISA, filter screening assays, FACS, or immunofluorescence assays, BiaCore affinity measurements or other methods to quantify binding constants, staining tissue slides or cells and other immunohistochemistry methods. One or more of these binding studies can be used to analyze the binders.
Also provided herein are methods for identifying an ultralong CDR H3 knob, such as a bovine CDR H3 knob, by amino acid sequence, including from a sequence library. In some aspects, methods for identifying an ultralong CDR H3 knob include defining the region of the knob domain, such as by reference to the formula described herein, e.g. set forth below.
In some embodiments, a method for identifying an ultralong CDR H3 knob includes defining the knob region N-terminal boundary as the first D_H cysteine in the “CPDG” motif. In some embodiments, the method further includes defining the C-terminal boundary as the position located by subtracting the number of ascending stalk residues from the framework 4 tryptophan position. In some aspects, the method can be used for identifying an ultralong CDR H3 knob from any antibody sequence. In particular embodiments, the antibody sequence is a bovine antibody, such as any of the antibodies described herein.
An expression of this embodiment of the method is shown below:

- Knob boundary position (C-terminal end)=Position of conserved framework 4 tryptophan−X; wherein X=number of amino acids, starting at the framework 3 canonical cysteine that defines the ascending stalk, and ending at the amino acid preceding the conserved first D region cysteine in the “CPDG” motif;
- Number of residues in the knob (K)=L−2X; wherein L=number of amino acids encompassing stalk and knob domains, starting at canonical framework 3 cysteine and ending at canonical framework 4 tryptophan;

$\text{K position} = (X + 1)\ \text{to}\ (X + K)$
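The boundary rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming the input string has already been trimmed to run from the canonical framework 3 cysteine through the conserved framework 4 tryptophan; the function name `knob_boundaries` and the example sequence are hypothetical and are not taken from the Sequence Listing.

```python
def knob_boundaries(region, motif="CPDG"):
    """Apply K = L - 2X to a stalk-plus-knob region.

    region: amino acids from the canonical framework 3 cysteine through the
            conserved framework 4 tryptophan, inclusive (an assumption of
            this sketch; locating those anchors is not handled here).
    Returns (X, K, start, end, knob), where start and end are 1-based,
    inclusive positions counted from the framework 3 cysteine.
    """
    if not (region.startswith("C") and region.endswith("W")):
        raise ValueError("region must start at the FR3 Cys and end at the FR4 Trp")
    # Search from index 1 so that the FR3 cysteine itself is never matched.
    motif_index = region.find(motif, 1)
    if motif_index < 0:
        raise ValueError(f"motif {motif!r} not found")
    L = len(region)      # stalk and knob, FR3 Cys to FR4 Trp
    X = motif_index      # residues from FR3 Cys up to the one before the motif Cys
    K = L - 2 * X        # number of residues in the knob
    if K <= 0:
        raise ValueError("region too short for the computed stalk length")
    start, end = X + 1, X + K          # end equals (FR4 Trp position) - X
    knob = region[start - 1:end]
    return X, K, start, end, knob


# Made-up toy region: 6-residue ascending stalk, 11-residue knob, 6 residues to Trp.
toy_region = "CTTVHQ" + "CPDGYSYGCGA" + "YEFHVW"
X, K, start, end, knob = knob_boundaries(toy_region)
print(X, K, start, end, knob)
```

Note that the formula subtracts the same X from both ends, so the sketch inherits the assumption that the descending stalk is treated as having the same length as the ascending stalk.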

III. Soluble Peptide Expression

Also provided herein in some embodiments are methods of producing soluble disulfide bond-containing peptides, including methods of producing any of the antibody binding proteins (also referred to as binders) identified by any of the methods described herein. The soluble peptides produced by the provided methods are peptides (e.g., of 25 to 70 amino acids in length) that contain 2 or more cysteine residues from which it is desired to produce a disulfide-bonded soluble protein. In some embodiments, the provided methods include transforming a host cell, e.g., E. coli, with an expression vector encoding the soluble peptide. In some embodiments, the expression vector encodes a fusion protein that includes the soluble peptide and a chaperone, e.g., a bacterial chaperone. In some embodiments, the soluble peptide and the chaperone, e.g., bacterial chaperone, are joined by a linker. In some embodiments, the linker is a cleavable linker.
Techniques for manipulating nucleic acids, such as those for generating mutation in sequences, subcloning, labeling, probing, sequencing, hybridization and so forth, are described in detail in scientific publications and patent documents. See, for example, Sambrook J, Russell D W (2001) Molecular Cloning: a Laboratory Manual, 3rd ed. Cold Spring Harbor Laboratory Press, New York; Current Protocols in Molecular Biology, Ausubel ed., John Wiley & Sons, Inc., New York (1997); Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I, Theory and Nucleic Acid Preparation, Tijssen ed., Elsevier, N.Y. (1993).
In some embodiments, the fusion protein has increased solubility relative to the soluble protein alone. In some aspects, this increased solubility is conferred at least in part by the inclusion of the chaperone, e.g., bacterial chaperone. In some aspects, the inclusion of the chaperone, e.g., bacterial chaperone, promotes solubility of the fusion protein while permitting disulfide bond formation in the soluble peptide, including in host cell environments that have been engineered or modified to promote disulfide bond formation. In some embodiments, the chaperone, e.g., bacterial chaperone, is thioredoxin A (TrxA).
In some embodiments, the provided methods further include culturing the host cell, e.g., the bacteria, such as E. coli, under conditions permissive of expression of the fusion protein. In some embodiments, the provided methods further include, following the culturing, isolating the expressed fusion protein from supernatant of a lysate of the host cell, e.g., the bacteria, such as E. coli. In some embodiments, the provided methods further include cleaving the cleavable linker, thereby producing the soluble peptide that is free of the bacterial chaperone.
In some embodiments, the cleavable linker is an enterokinase cleavage tag. In some embodiments, the cleavable linker includes the amino acid sequence DDDDK (SEQ ID NO: 106). In some embodiments, the cleaving of the cleavable linker includes adding enterokinase. In some embodiments, enterokinase is added to the supernatant of the host cell lysate. In some embodiments, the provided methods further include, following cleaving the cleavable linker, removing the enterokinase and/or the bacterial chaperone from the solution containing the soluble peptide.
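As a sequence-level illustration of the cleavage step, the following sketch splits a fusion protein at the DDDDK tag. It assumes that the chaperone is N-terminal to the tag and the soluble peptide is C-terminal to it, and it relies on the known specificity of enterokinase for cutting after the lysine of DDDDK. The function name and the toy sequences are hypothetical placeholders, not sequences from this disclosure.

```python
EK_SITE = "DDDDK"  # enterokinase cleavage tag (SEQ ID NO: 106 in the text)


def release_peptide(fusion, site=EK_SITE):
    """Split a fusion sequence at the first enterokinase site.

    Returns (n_terminal_fragment, released_peptide). The N-terminal fragment
    keeps the DDDDK tag because enterokinase cuts after the lysine.
    Assumes the chaperone-tag-peptide order, which is an assumption of this
    sketch rather than a requirement stated in the text.
    """
    index = fusion.find(site)
    if index < 0:
        raise ValueError("no enterokinase site found in fusion sequence")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


# Placeholder sequences for illustration only.
toy_chaperone = "MSDKIIHLT"        # stands in for a full chaperone sequence
toy_peptide = "CPDGYSYGCGA"        # stands in for the soluble peptide
toy_fusion = toy_chaperone + "GS" + EK_SITE + toy_peptide

n_fragment, peptide = release_peptide(toy_fusion)
print(n_fragment, peptide)
```

Because only the first match is used, a peptide that itself contains DDDDK would need separate handling.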
In some embodiments, the soluble peptide is up to 70 amino acids in length. In some embodiments, the soluble peptide is 40 to 60 amino acids in length. In some embodiments, the soluble peptide is at least 42 amino acids in length. In some embodiments, the soluble peptide is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids or 60 amino acids in length.
In some embodiments, the soluble peptide is 25-70 amino acids. For instance, in some embodiments the soluble peptide is 35 amino acids in length or longer, 40 amino acids in length or longer, 45 amino acids in length or longer, 50 amino acids in length or longer, 55 amino acids in length or longer, or 60 amino acids in length or longer. In some embodiments, the soluble peptide is between or between about 35 and 70 amino acids in length, 40 and 70 amino acids in length, 45 and 70 amino acids in length, 50 and 70 amino acids in length, 55 and 70 amino acids in length, or 60 and 70 amino acids in length.
In some embodiments, the soluble peptide is 6 to 50 amino acids, 6 to 40 amino acids, 6 to 30 amino acids, 6 to 25 amino acids, 6 to 20 amino acids, 6 to 15 amino acids, 6 to 10 amino acids, 10 to 50 amino acids, 10 to 40 amino acids, 10 to 30 amino acids, 10 to 25 amino acids, 10 to 15 amino acids, 15 to 50 amino acids, 15 to 40 amino acids, 15 to 30 amino acids, 15 to 25 amino acids, 15 to 20 amino acids, 20 to 50 amino acids, 20 to 40 amino acids, 20 to 30 amino acids, 20 to 25 amino acids, 25 to 50 amino acids, 25 to 40 amino acids, 25 to 30 amino acids, 30 to 50 amino acids, 30 to 40 amino acids, or 40 to 50 amino acids. In some embodiments, the soluble peptide is 6 to 30 amino acids, 6 to 24 amino acids, 6 to 18 amino acids, 6 to 12 amino acids, 12 to 30 amino acids, 12 to 24 amino acids, 12 to 18 amino acids, 18 to 30 amino acids, 18 to 24 amino acids or 24 to 30 amino acids.
In some embodiments, the soluble peptide includes a cysteine motif able to form disulfide bonds. In some embodiments, the cysteine motif includes 2-20 cysteine residues, for instance between or between about 2 and 18, 2 and 16, 2 and 14, 2 and 12, 2 and 10, 2 and 8, 2 and 6, 2 and 4, 4 and 20, 4 and 18, 4 and 16, 4 and 14, 4 and 12, 4 and 10, 4 and 8, 4 and 6, 6 and 20, 6 and 18, 6 and 16, 6 and 14, 6 and 12, 6 and 10, 6 and 8, 8 and 20, 8 and 18, 8 and 16, 8 and 14, 8 and 12, 8 and 10, 10 and 20, 10 and 18, 10 and 16, 10 and 14, 10 and 12, 12 and 20, 12 and 18, 12 and 16, 12 and 14, 14 and 20, 14 and 18, 14 and 16, 16 and 20, 16 and 18, or 18 and 20 cysteine residues, each inclusive. In some embodiments, the cysteine motif includes 2-12 cysteine residues. In some embodiments, the soluble peptide comprises at least 4 Cys residues. In some embodiments, the soluble peptide contains 4 Cys residues. In some embodiments, the soluble peptide contains 6, 8, 10, or 12 Cys residues.
In some embodiments, the soluble peptide includes 1-10 disulfide bonds, for instance between or between about 1 and 9, 1 and 8, 1 and 7, 1 and 6, 1 and 5, 1 and 4, 1 and 3, 1 and 2, 2 and 10, 2 and 9, 2 and 8, 2 and 7, 2 and 6, 2 and 5, 2 and 4, 2 and 3, 3 and 10, 3 and 9, 3 and 8, 3 and 7, 3 and 6, 3 and 5, 3 and 4, 4 and 10, 4 and 9, 4 and 8, 4 and 7, 4 and 6, 4 and 5, 5 and 10, 5 and 9, 5 and 8, 5 and 7, 5 and 6, 6 and 10, 6 and 9, 6 and 8, 6 and 7, 7 and 10, 7 and 9, 7 and 8, 8 and 10, 8 and 9, or 9 and 10 disulfide bonds, each inclusive. In some embodiments, the soluble peptide includes 1-6 disulfide bonds. In some embodiments, the soluble peptide contains 2-6 disulfide bonds. In some embodiments, the soluble peptide has at least 2 disulfide bonds. In some embodiments, the soluble peptide has 2 disulfide bonds. In some embodiments, the soluble peptide has 3, 4, or 5 disulfide bonds.
In some embodiments, the soluble peptide includes 3-6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide. In some embodiments, the soluble peptide includes 3, 4, 5, or 6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide.
In some embodiments, the soluble peptide includes at least 6 amino acids following the most C-terminal cysteine residue present in the soluble peptide. In some embodiments, the soluble peptide includes 6-9 amino acids following the most C-terminal cysteine residue present in the soluble peptide. In some embodiments, the soluble peptide includes 6, 7, 8, or 9 amino acids following the most C-terminal cysteine residue present in the soluble peptide.
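The length, cysteine-count, and flanking-residue features described in the preceding paragraphs can be tabulated for a candidate sequence with a short helper. This is a sketch only: the function name, the report keys, and the toy sequence are hypothetical, and the thresholds are drawn from separate "some embodiments" statements above, so a sequence failing one check is not thereby excluded from the disclosure.

```python
def describe_peptide(seq):
    """Report sequence features discussed above for a candidate soluble peptide."""
    cys_positions = [i for i, aa in enumerate(seq) if aa == "C"]
    n_cys = len(cys_positions)
    report = {
        "length": len(seq),
        "length_25_to_70": 25 <= len(seq) <= 70,
        "n_cys": n_cys,
        "at_least_2_cys": n_cys >= 2,
        # Each disulfide bond consumes two cysteines, so this is an upper bound.
        "max_disulfides": n_cys // 2,
    }
    if cys_positions:
        before = cys_positions[0]                    # residues preceding the first Cys
        after = len(seq) - 1 - cys_positions[-1]     # residues following the last Cys
        report["residues_before_first_cys"] = before
        report["residues_after_last_cys"] = after
        report["n_flank_3_to_6"] = 3 <= before <= 6
        report["c_flank_at_least_6"] = after >= 6
    return report


# Made-up toy sequence: 4 flanking residues, a 4-cysteine core, 8 flanking residues.
toy_peptide = "TVHQ" + "CPDGYSCGYGCSWTAPC" + "DYTYEFHV"
for key, value in describe_peptide(toy_peptide).items():
    print(key, value)
```

The helper only counts residues; it does not predict which cysteines actually pair.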
In some embodiments, the soluble peptide includes a flexible linker. In some embodiments, the flexible linker is included at the N-terminus of the soluble peptide. In some embodiments, the flexible linker is in addition to the 3-6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide. In some embodiments, the flexible linker is included in the 3-6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide. In some embodiments, the flexible linker is included at the C-terminus of the soluble peptide. In some embodiments, the flexible linker is in addition to the at least 6 amino acids following the most C-terminal cysteine residue present in the soluble peptide. In some embodiments, the flexible linker is included in the at least 6 amino acids following the most C-terminal cysteine residue present in the soluble peptide.
In some embodiments, the flexible linker is GGGGAMGS (SEQ ID NO: 108). In some embodiments, the flexible linker is GGS (SEQ ID NO: 109). In some embodiments, the flexible linker (e.g., GGGGAMGS, SEQ ID NO: 108) allows for cyclization of the soluble peptide. In some embodiments, the cyclization is via chemical or enzymatic methods. In some embodiments, the flexible linker (e.g., GGGGAMGS, SEQ ID NO: 108) allows for sortase-mediated cyclization of the soluble peptide. In some embodiments, the provided methods further include a step of cyclizing the soluble peptide, e.g., via chemical or enzymatic methods.
In some embodiments, the provided methods further include steps for enriching for the soluble peptide. In some embodiments, the provided methods further include separating the soluble peptide from any soluble aggregates present in solution, including soluble aggregates of the soluble peptide. In some embodiments, the separating involves separating the active soluble peptide from the larger, inactive or less active soluble aggregates thereof. In some embodiments, the separating is achieved using chromatographic methods. In some embodiments, the enriching or separating is by size exclusion chromatography. In some embodiments, the separating involves collecting one or more elution fractions containing the soluble peptide, but not the soluble aggregates thereof, thereby producing an enriched or purified composition of soluble peptides.
In some embodiments, the provided methods further include producing a multispecific binding molecule that includes the soluble peptide. In some embodiments, the multispecific binding molecule includes multiple copies of the soluble peptide. In some embodiments, the multispecific binding molecule includes different soluble peptides. In some embodiments, the multispecific binding molecule includes a flexible linker (e.g., Gly-Gly-Gly-Ser) between the soluble peptides (e.g., between the C-terminus of one soluble peptide copy and the N-Terminus of the other soluble peptide copy). In some embodiments, one soluble peptide is present in a VH region that is expressed with a light chain as an IgG, and the second soluble peptide is fused to the heavy chain constant region. In some embodiments, the multispecific binding molecule includes two VH regions with the same soluble peptide. In some embodiments, the multispecific binding molecule includes VH regions that include different soluble peptides, for instance using heavy chains with constant region mutations such that only the heterologous heavy chains effectively pair with one another to form a dimer. In some embodiments, these mutations are ‘knobs-into-holes’ mutations, such as T22Y on one chain and Y86T on the other chain in the CH3 domain of Fc.
In some embodiments, the expression vector further includes an inducible promoter sequence to control the expression of the fusion protein. The term “promoter sequence” as used herein refers to a DNA sequence, which is generally located upstream of a gene present in a DNA polymer, and provides a site for initiation of the transcription of said gene into mRNA. Promoter sequences suitable for use in this invention may be derived from viruses, bacteriophages, prokaryotic cells or eukaryotic cells, and may be a constitutive promoter or an inducible promoter.
In some embodiments, the inducible promoter sequence is operably linked to the sequence encoding the fusion protein. The term “operatively linked” as used herein means that a first sequence is disposed sufficiently close to a second sequence such that the first sequence can influence the second sequence or regions under the control of the second sequence. For instance, a promoter sequence may be operatively linked to a gene sequence, and is normally located at the 5′-terminus of the gene sequence such that the expression of the gene sequence is under the control of the promoter sequence. In addition, a regulatory sequence may be operatively linked to a promoter sequence so as to enhance the ability of the promoter sequence in promoting transcription. In such case, the regulatory sequence is generally located at the 5′-terminus of the promoter sequence.
Promoter sequences suitable for use in this invention are preferably derived from any one of the following: viruses, bacterial cells, yeast cells, fungal cells, algal cells, plant cells, insect cells, animal cells, and human cells. For example, a promoter useful in bacterial cells includes, but is not limited to, tac promoter, T7 promoter, T7 A1 promoter, lac promoter, trp promoter, trc promoter, araBAD promoter, and λPRPL promoter. A promoter useful in plant cells includes, e.g., 35S CaMV promoter, actin promoter, ubiquitin promoter, etc. Regulatory elements suitable for use in mammalian cells include CMV-HSV thymidine kinase promoters, SV40, RSV-promoters, CMV enhancers, or SV40 enhancers.
Vectors suitable for use in this invention include those commonly used in genetic engineering technology, such as bacteriophages, plasmids, cosmids, viruses, or retroviruses.
Vectors suitable for use in this invention may include other expression control elements, such as a transcription starting site, a transcription termination site, a ribosome binding site, an RNA splicing site, a polyadenylation site, a translation termination site, etc. Vectors suitable for use in this invention may further include additional regulatory elements, such as transcription/translation enhancer sequences, and at least a marker gene or reporter gene allowing for the screening of the vectors under suitable conditions. Marker genes suitable for use in this invention include, for instance, dihydrofolate reductase gene and G418 or neomycin resistance gene useful in eukaryotic cell cultures, and ampicillin, streptomycin, tetracycline or kanamycin resistance gene useful in E. coli and other bacterial cultures. Vectors suitable for use in this invention may further include a nucleic acid sequence encoding a secretion signal. These sequences are well known to those skilled in the art.
Depending on the vector and host cell system used, the recombinant gene product (protein) produced according to this invention may either remain within the recombinant cell, be secreted into the culture medium, be secreted into periplasm, or be retained on the outer surface of a cell membrane. The recombinant gene product (protein) produced by the method of this invention can be purified by using a variety of standard protein purification techniques, including, but not limited to, affinity chromatography, ion exchange chromatography, gel filtration, electrophoresis, reverse phase chromatography, chromatofocusing and the like. The recombinant gene product (protein) produced by the method of this invention is preferably recovered in “substantially pure” form. As used herein, the term “substantially pure” refers to a purity of a purified protein that allows for the effective use of said purified protein as a commercial product.

A. Host Cells

The term “host cell” is used to refer to a cell which has been transformed, transfected or infected or is capable of being transformed, transfected or infected with a nucleic acid sequence and then of expressing a selected gene of interest to recombinantly produce a protein of interest. The term includes the progeny of the parent cell, whether or not the progeny is identical in morphology or in genetic make-up to the original parent, so long as the selected gene or genetic modification is present.
The provided methods for producing a soluble peptide or a fusion protein containing the soluble peptide and a chaperone, e.g., bacterial chaperone, can be performed using any host organism which is capable of expressing heterologous polypeptides, and is capable of being genetically modified. A host organism is preferably a unicellular host organism; however, the use of multicellular organisms is also encompassed by the provided methods, provided the organism can be modified as described herein and a polypeptide of interest expressed therein. For purposes of clarity, the term “host cell” will be used herein throughout, but it should be understood that a host organism can be substituted for the host cell, unless unfeasible for technical reasons.
In some embodiments, the host cell is a prokaryotic cell, such as a bacterial cell. The host cell may be a gram-positive bacterial cell, such as Bacillus, or a gram-negative bacterial cell, such as E. coli. The host organisms may be aerobic or anaerobic organisms. In some embodiments, host cells are those which have characteristics which are favorable for expressing polypeptides, such as host cells having fewer proteases than other types of cells. Suitable bacteria for this purpose include archaebacteria and eubacteria, for example, Enterobacteriaceae. Other examples of useful bacteria include Escherichia, Enterobacter, Azotobacter, Erwinia, Bacillus, Pseudomonas, Klebsiella, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, and Paracoccus. Additional examples of useful bacteria include Corynebacterium, Lactococcus, Lactobacillus, and Streptomyces species, in particular Corynebacterium glutamicum, Lactococcus lactis, Lactobacillus plantarum, Streptomyces coelicolor, Streptomyces lividans. Suitable E. coli hosts include E. coli DHB4, E. coli BL-21 (which is deficient in both lon (Phillips et al. J. Bacteriol. 159: 283, 1984) and ompT proteases), E. coli AD494, E. coli W3110 (ATCC 27,325), E. coli 294 (ATCC 31,446), E. coli B, and E. coli X1776 (ATCC 31,537). Other strains include E. coli B834, which is methionine deficient and therefore enables high specific activity labeling of target proteins with ³⁵S-methionine or selenomethionine (Leahy et al. Science 258: 987, 1992). Yet other strains of interest include the BLR strain, and the K-12 strains HMS174 and NovaBlue, which are recA− derivatives that improve plasmid monomer yields and may help stabilize target plasmids containing repetitive sequences.
In some embodiments, the E. coli host cell used in the provided methods is engineered or modified to improve soluble expression of disulfide-bonded proteins in the E. coli cytosol. In some embodiments, the cytoplasmic thiol-redox equilibrium environment is changed via alteration in reducing pathways, such as thioredoxin reductase. In some embodiments, the E. coli host cell has an oxidizing cytoplasm that is permissive of disulfide bond formation. Various types of mutant strains, including SHuffle (New England Biolabs) and Origami™ (DE3) (Novagen, Germany), which lack glutathione reductase (Δgor), thioredoxin reductase, and/or glutathione biosynthesis pathways, are commercially available. In some embodiments, the E. coli strain transformed as part of the provided methods is the Origami™ (DE3) (Novagen, Germany) mutant strain.
Suitable Bacillus strains include Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis, Bacillus brevis, Bacillus alcalophilus, Bacillus clausii, Bacillus cereus, Bacillus pumilus, Bacillus thuringiensis, or Bacillus halodurans. The Gram-positive bacterium B. subtilis is a preferred organism for secretory protein production in the biotechnological industry. Its popularity is primarily based on the fact that B. subtilis lacks an outer membrane, which retains many proteins in the periplasm of Gram-negative bacteria such as Escherichia coli. Accordingly, the majority of B. subtilis proteins that are transported across the cytoplasmic membrane end up directly in the growth medium. Additionally, the lack of an outer membrane implies that proteins produced with B. subtilis are free from lipopolysaccharide (endotoxin). Other advantages of using B. subtilis as a protein production host are its high genetic amenability, the availability of strains with mutations in nearly all of the ~4100 genes, a toolbox with strains and vectors for gene expression, and the fact that this bacterium is generally recognized as safe (Braun et al., Curr. Opin. Biotechnol. 10:376-381, 1999; Kobayashi et al., Proc. Natl. Acad. Sci. U.S.A 100:4678-4683, 2003; Kunst et al. Nature 390:249-256, 1997; Zeigler et al., In E. Goldman and L. Green (ed.), Practical Handbook of Microbiology. CRC Press, Boca Raton, Fla., 2008).
In another embodiment, the host cell is a eukaryotic cell, such as a yeast cell or a mammalian cell. Examples of mammalian cells include, but are not limited to, Chinese hamster ovary cells (CHO) (ATCC No. CCL61), CHO DHFR− cells (Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216-4220 (1980)), human embryonic kidney (HEK) 293 or 293T cells (ATCC No. CRL1573), or 3T3 cells (ATCC No. CCL92). The selection of suitable mammalian host cells and methods for transformation, culture, amplification, screening and product production and purification are known in the art. Other suitable mammalian cell lines are the monkey COS-1 (ATCC No. CRL1650) and COS-7 cell lines (ATCC No. CRL1651), and the CV-1 cell line (ATCC No. CCL70). Further exemplary mammalian host cells include primate cell lines and rodent cell lines, including transformed cell lines. Normal diploid cells, cell strains derived from in vitro culture of primary tissue, as well as primary explants, are also suitable. Candidate cells may be genotypically deficient in the selection gene, or may contain a dominantly acting selection gene. Other suitable mammalian cell lines include, but are not limited to, mouse neuroblastoma N2A cells, HeLa, mouse L-929 cells, 3T3 lines derived from Swiss, Balb-c or NIH mice, BHK or HaK hamster cell lines, which are available from the ATCC. Each of these cell lines is known by and available to those skilled in the art of protein expression.
Many strains of yeast cells known to those skilled in the art are also available as host cells for the expression of the polypeptides described herein. Exemplary yeast cells include, for example, Saccharomyces cerevisiae and Pichia pastoris. Fungi, such as Aspergillus, are also available as host cells for the expression of the polypeptides described herein.
Additionally, where desired, insect cell systems may be utilized in the provided methods. Such systems are described for example in Kitts et al., Biotechniques, 14:810-817 (1993); Luckow, Curr. Opin. Biotechnol., 4:564-572 (1993); and Luckow et al., J. Virol., 67:4566-4579 (1993). Exemplary insect cells are Sf-9 and Hi5 (Invitrogen, Carlsbad, Calif.).

B. Soluble Peptides

In some embodiments, the soluble peptide produced in the provided methods is a soluble ultralong CDR3 knob. In some embodiments, the soluble peptide produced in the provided methods is a soluble synthetic or semisynthetic peptide. In some embodiments, the soluble peptide produced in the provided methods is a cyclotide. In some embodiments, the soluble peptide produced in the provided methods is a modified cyclotide. In some embodiments, the soluble peptide produced in the provided methods is a semisynthetic or modified ultralong CDR3 knob.

1. Soluble Bovine Ultralong CDR3 Knobs

In some embodiments, the soluble peptide produced in the provided methods is a soluble ultralong CDR3 knob. In some embodiments, the soluble ultralong CDR3 knob is a cow ultralong CDR3. In some embodiments, the soluble ultralong CDR3 knob is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some embodiments, the soluble ultralong CDR3 knob includes all or a portion of sequences that have been amplified from a cow cDNA template library according to any of the methods provided herein (see, e.g., Sections II-A-1-a and II-A-1-b). In some embodiments, the soluble ultralong CDR3 knob is any that has been identified or selected as a binder of a target molecule. In some embodiments, the soluble ultralong CDR3 knob is or is a portion of any ultralong CDR3 knob that has been identified or selected as a binder of a target molecule according to any of the methods provided herein (see, e.g., Sections II-C).

2. Soluble Synthetic Peptides

In some embodiments, the soluble peptide produced in the provided methods is a soluble synthetic or semisynthetic peptide. In some embodiments, the soluble peptide produced in the provided methods is a semisynthetic or modified ultralong CDR3 knob. In some embodiments, the soluble peptide produced in the provided methods is a cyclotide. In some embodiments, the soluble peptide produced in the provided methods is a modified cyclotide.
a. Soluble Synthetic Ultralong CDR3 Knobs
In some embodiments, the soluble peptide is a semisynthetic ultralong CDR3 knob. In some embodiments, the semisynthetic ultralong CDR3 knob is derived from a bovine ultralong CDR3 knob that has been used as a scaffold for modifications. In some embodiments, the bovine ultralong CDR3 knob is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some embodiments, the bovine ultralong CDR3 knob includes all or a portion of sequences that have been amplified from a cow cDNA template library according to any of the methods provided herein (see, e.g., Sections II-A-1-a and II-A-1-b). In some embodiments, the bovine ultralong CDR3 knob is any that has been identified or selected as a binder of a target molecule. In some embodiments, the bovine ultralong CDR3 knob is or is a portion of any ultralong CDR3 knob that has been identified or selected as a binder of a target molecule according to any of the methods provided herein (see, e.g., Sections II-C).
In some embodiments, the bovine ultralong CDR3 knob has been modified to include random mutations, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds. In some embodiments, the bovine ultralong CDR3 knob has been modified to include an exogenous peptide sequence. In some embodiments, the bovine ultralong CDR3 knob has been modified to delete one or more peptide sequences therein, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds.
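The cysteine-preservation constraint described above can be illustrated with a minimal Python sketch. It checks whether a mutated knob sequence keeps every cysteine position of its parent scaffold and stays within the 2-20 cysteine range. The function names, the toy sequences, and the simplifying assumptions (substitution-only variants of equal length, and each disulfide bond pairing exactly two cysteines) are hypothetical and are not taken from the disclosure or the Sequence Listing.

```python
def preserves_cysteine_motif(parent: str, variant: str,
                             min_cys: int = 2, max_cys: int = 20) -> bool:
    """Return True if `variant` keeps every cysteine position of `parent`
    and its total cysteine count lies within [min_cys, max_cys].

    Assumption for illustration: both sequences have the same length,
    i.e. only substitutions were introduced (no insertions or deletions).
    """
    if len(parent) != len(variant):
        return False
    parent_cys_positions = [i for i, aa in enumerate(parent) if aa == "C"]
    # Every parental cysteine must still be a cysteine in the variant.
    if any(variant[i] != "C" for i in parent_cys_positions):
        return False
    return min_cys <= variant.count("C") <= max_cys


def max_disulfide_bonds(sequence: str) -> int:
    """Upper bound on disulfide bonds: each bond consumes two cysteines."""
    return sequence.count("C") // 2


# Hypothetical toy sequences (not from the Sequence Listing).
parent = "TCPDGYSCGYGCGA"
ok_variant = "TCPEGWSCGYRCGA"   # substitutions only at non-cysteine positions
bad_variant = "TCPDGYSAGYGCGA"  # second cysteine replaced by alanine

print(preserves_cysteine_motif(parent, ok_variant))
print(preserves_cysteine_motif(parent, bad_variant))
print(max_disulfide_bonds(parent))
```

A check of this kind only screens sequences; it does not predict which cysteines actually pair or whether the variant folds.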
b. Soluble Cyclotides
In some embodiments, the soluble peptide produced in the provided methods is a soluble cyclotide. In some embodiments, the cyclotide is a cyclotide that has been modified to include an exogenous peptide sequence.
Cysteine-knot microproteins (cyclotides) include a naturally occurring family of cysteine-knot microproteins or cyclotides found in various plant species. Cysteine-knot microproteins (cyclotides) are small peptides, typically consisting of about 30-40 amino acids, which can be found naturally as cyclic or linear forms, where the cyclic form has no free N- or C-terminal amino or carboxyl end. They have a defined structure based on three intra-molecular disulfide bonds and a small triple stranded β-sheet (Craik et al., 2001; Toxicon 39, 43-60). The cyclic proteins exhibit conserved cysteine residues defining a structure referred to herein as a “cysteine knot”. This family includes both naturally occurring cyclic molecules and their linear derivatives as well as linear molecules which have undergone cyclization. These molecules are useful as molecular framework structures having enhanced stability over less structured peptides. (Colgrave and Craik, 2004; Biochemistry 43, 5965-5975).
The main cyclotide features are a remarkable stability due to the cysteine knot, a small size making them readily accessible to chemical synthesis, and an excellent tolerance to sequence variations. The cyclotide scaffold is found in almost 30 different protein families among which conotoxins, spider toxins, squash inhibitors, agouti-related proteins and plant cyclotides are the most populated families. Cyclotides from plants in the Rubiaceae and Violaceae families are for the most part found to be head-to-tail cyclic peptides (Craik et al. 2010. Cell. Mol. Life Sci. 67:9-16). However, within the squash inhibitor family of cyclotides both cyclic and linear cyclotides have been identified from Momordica cochinchinensis: the cyclic trypsin inhibitors (MCoTI)-I and -II and their linear counterpart MCoTI-III (Hernandez et al. 2000. Biochemistry, 39, 5722-5730). It is now clear that both cyclic and linear variants can exist in different cyclotide families, but the impact of the cyclization is poorly understood. Cyclic peptides were expected to display improved stability, better resistance to proteases, and reduced flexibility when compared to their linear counterparts, hopefully resulting in enhanced biological activities. However, linear cyclotides have the advantage of being able to be more easily linked to other peptides or proteins.
For instance, cyclotides are commonly found in plants. In aspects of provided embodiments, cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae and Violaceae plant species. In a preferred aspect, cyclotides of the invention are derived from linear or cyclic forms of cyclotides of the Momordicae species including the squash serine protease inhibitor family (Otlewski & Krowarsch, Acta Biochim Pol. 1996; 43(3):431-44), and in a more preferred aspect from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97] below.

For instance, the unmodified or wildtype cyclotide can be a cyclotide set forth in any one of SEQ ID NO: 95-97 to which one or more loops thereof is inserted or substituted by one or more amino acid sequences (e.g., an exogenous peptide sequence). In particular embodiments, the modified cyclotides are derived from loop replacement libraries based on MCoTI-II (SEQ ID NO: 96).
In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 1. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 5. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 6, such as formed subject to cyclization.

IV. Antibodies Comprising Peptides

Also provided herein in some embodiments are methods that include producing a full-length IgG or a Fab. In some embodiments, the full-length IgG or the Fab is produced from an antibody binding protein or peptide that is selected according to any of the methods provided herein. In some embodiments, the full-length IgG or the Fab is produced from a soluble peptide produced according to any of the methods provided herein.
In some embodiments, the antibody binding protein is a scFv, and the method includes constructing a heavy chain or a portion thereof comprising joining the VH region of the scFv with a constant region or a portion thereof.
In some embodiments, the method includes constructing a humanized VH region by replacing a knob region of the ultralong CDR3 region of a humanized bovine VH region with an ultralong CDR3 region of the selected antibody binding protein. In some embodiments, the ultralong CDR3 region of a selected antibody binding protein is replaced between an ascending stalk strand and a descending stalk strand of a humanized bovine VH region. In some embodiments, the VH region comprises the formula V1-X-V2, wherein the V1 region of the heavy chain comprises the sequence set forth in SEQ ID NO: 111; the X region comprises an ultralong CDR3 of a selected antibody; and the V2 region comprises the sequence set forth in SEQ ID NO: 112.
In some embodiments, the method further comprises constructing a heavy chain or a portion thereof comprising joining the humanized VH region with a constant region or a portion thereof. In some embodiments, the heavy chain or the portion thereof is a human IgG1 heavy chain or portion thereof. In some embodiments, the method further includes co-expressing the heavy chain or portion thereof with a light chain.
In some embodiments, the light chain is a bovine light chain of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8 or F18, or is a humanized variant thereof. In some embodiments, the light chain is a BLV1H12 light chain (SEQ ID NO: 113) or a humanized variant thereof. In some embodiments, the light chain is a humanized light chain set forth in SEQ ID NO: 114. In some embodiments, the light chain is a BLV5B8 light chain (SEQ ID NO: 115) or a humanized variant thereof. In some embodiments, the light chain is a human light chain. In some embodiments, the light chain is selected from the group consisting of VL1-47, VL1-40, VL1-51, and VL2-18. In some embodiments, the light chain is set forth in any one of SEQ ID NO: 116-120.
In some embodiments, an antibody binding protein or peptide selected or produced by the methods is formatted as a multispecific binding protein, comprising a plurality of any of the provided peptides, such as knob peptides. In some embodiments, the plurality of peptides, such as knob peptides, are paratopes. In some embodiments, the plurality of peptides, such as knob peptides, are 2, 3, or 4 peptides. Exemplary formats for generating a multispecific polypeptide are depicted in FIG. 12.
In some embodiments, one or more peptides, such as knob peptides, are linked in tandem in a single polypeptide chain separated with a flexible linker (e.g. GGGS or other similar flexible linker, including longer linkers of (GGGS)n where n is 1-3). In some embodiments, the tandem single polypeptide may include 2, 3, 4 or more peptides, such as knob peptides to produce a bivalent, trivalent, tetravalent or other multivalent molecule.
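The tandem format described above can be sketched in a few lines of Python. The example joins peptide sequences with a (GGGS)n linker, where n is 1-3 as stated in the text. The function name `build_tandem` and the placeholder peptide sequences are hypothetical; real knob sequences would come from the Sequence Listing or from a selection campaign.

```python
def build_tandem(peptides: list[str], n: int = 1, unit: str = "GGGS") -> str:
    """Concatenate peptides into one chain separated by a (unit)n linker.

    `n` is restricted to 1-3 to mirror the linker lengths named in the text;
    other linker lengths would need a different bound.
    """
    if not peptides:
        raise ValueError("at least one peptide is required")
    if not 1 <= n <= 3:
        raise ValueError("n must be between 1 and 3")
    linker = unit * n
    return linker.join(peptides)


# Placeholder sequences standing in for two different knob peptides.
knob_a = "CPDGYSC"
knob_b = "CTEGWRC"

bivalent = build_tandem([knob_a, knob_b], n=2)
print(bivalent)  # CPDGYSCGGGSGGGSCTEGWRC
```

The valency of the resulting molecule equals the number of peptides passed in, so a list of three peptides would give a trivalent chain.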
In some embodiments, the peptides, such as knob peptides, are re-formatted by replacement of a knob region of an ultralong CDR-H3 scaffold, including any of the humanized ultralong heavy chain molecules described herein. The heavy chain can be complexed with a light chain, such as any of the light chain molecules described herein. In some embodiments, when produced in a cell, a two chain polypeptide is formed by dimerization resulting from disulfide formation between two heavy chain molecules. In some embodiments, the modified immunoglobulin containing a peptide, such as a knob peptide, is a homodimer containing the peptide, e.g. knob peptide. In other embodiments, two different heavy chains may be co-expressed in a cell using a knob-into-hole engineering strategy or other strategy to produce a heterodimer in which two different heavy chains, each carrying a different peptide, e.g. knob peptide, may interact to form a heterodimer. In some embodiments, residues of the constant chain are modified by amino acid substitution to promote the heterodimer formation. In some of any embodiments, the one or more amino acid modifications are selected from a knob-into-hole modification and a charge mutation to reduce or prevent self-association due to charge repulsion. The heterodimer can be formed by transforming into a cell both a first nucleic acid molecule encoding a first polypeptide subunit and a second nucleic acid molecule encoding a second different polypeptide subunit. In some aspects, the heterodimer is produced upon expression and secretion from a cell as a result of covalent or non-covalent interaction between residues of the two polypeptide subunits to mediate formation of the dimer. In such processes, generally a mixture of dimeric molecules is formed, including homodimers and heterodimers. For the generation of heterodimers, additional steps for purification can be necessary.
For example, the first and second polypeptide can be engineered to include a tag with metal chelates or other epitope, where the tags are different. The tagged domains can be used for rapid purification by metal-chelate chromatography, and/or by antibodies, to allow for detection by western blots, immunoprecipitation, or activity depletion/blocking in bioassays. Methods include those described in U.S. Pat. No. 10,995,127. In some embodiments, a human IgG1 includes a T22Y amino acid substitution in the CH3 domain and a second IgG1 heavy chain includes a Y86T amino acid substitution in the heavy chain.

V. Immunization

In some embodiments, the provided methods include the use of or amplification from a cDNA template library that is prepared from RNA isolated from an immunized cow. In some embodiments, the methods further include immunizing a cow with a target antigen.
In some embodiments, the target antigen is a nonvirulent bacterium, a virus, a viral protein, a cancer antigen, a human IgG, or a recombinant protein thereof. In some embodiments, the target antigen is a virus or viral protein, e.g., that is associated with a coronavirus, e.g., SARS-CoV-2.
In some embodiments, a bovine is immunized by administering at least one dose of an antigenic composition comprising a target antigen or a group of related target antigens, e.g., antigens associated with variants of a virus. In some embodiments, the antigenic composition further comprises an adjuvant. The skilled person is familiar with many potentially useful adjuvants, such as Freund's complete adjuvant, alum, and squalene. See, e.g., US Patent Appl. Pub. No. 20150361160, which is incorporated by reference herein in its entirety for all purposes. Adjuvants which may be used in compositions of the invention include, but are not limited to oil emulsion compositions (oil-in-water emulsions and water-in-oil emulsions), complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA). In one embodiment, the adjuvant comprises RIBI, Iscomatrix, or ENABL CI (VaxLiant). Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as derivatives of enterobacterial lipopolysaccharide (LPS), Lipid A derivatives, immunostimulatory oligonucleotides and ADP-ribosylating toxins and detoxified derivatives thereof.
Methods for immunizing a bovine, such as cattle, to produce, for example, high titer colostrum, milk, serum, or immune tissues (e.g., PBMC), are known in the art. Such methods are disclosed, for example, in US Patent Appl. Pub. Nos. US20070053917 and US20130022619, each of which is incorporated by reference herein in its entirety for all purposes.
In some embodiments, the immunizing comprises administering a priming dose and at least one booster dose of the antigenic composition. In some embodiments, the immunizing comprises administering more than one booster dose of the antigenic composition. In one embodiment, the priming dose and at least one booster dose comprise the same antigenic composition. In some embodiments, the more than one booster dose comprise the same antigenic composition. The animal may be dosed with the immunogenic composition at intervals over a period of days, weeks or months. At the conclusion of the immunization regimen, the hyperimmune material such as blood, milk or colostrum is harvested. In one embodiment, the hyperimmune material is collected less than 2 months, less than 3 months, less than 4 months, less than 5 months, less than 6 months, less than 9 months, or less than 12 months after administering the priming dose. In one embodiment, the hyperimmune material is collected between about 3 months and about 6 months after administering the priming dose. In one embodiment, the hyperimmune material is collected between about 3 months and about 9 months after administering the priming dose. In some embodiments, the hyperimmune material is collected between about 3 months and about 12 months after administering the priming dose. In one embodiment, the hyperimmune material is collected between about 6 months and about 12 months after administering the priming dose.
In some embodiments, the methods further comprise isolating from the bovine a biological sample. In some embodiments, the biological sample is milk, blood, serum, colostrum, or peripheral blood mononuclear cells (PBMC). In one embodiment, the biological sample is collected less than 2 months, less than 3 months, less than 4 months, less than 5 months, less than 6 months, less than 9 months, or less than 12 months after administering the priming dose. In one embodiment, the biological sample is collected between about 3 months and about 6 months after administering the priming dose. In some embodiments, the biological sample is collected between about 3 months and about 9 months after administering the priming dose. In some embodiments, the biological sample is collected between about 3 months and about 12 months after administering the priming dose. In some embodiments, the biological sample is collected between about 6 months and about 12 months after administering the priming dose.
In some embodiments, the methods further include isolating a peripheral blood mononuclear cell (PBMC) from the bovine, and cloning a polynucleotide that encodes a candidate binding peptide, e.g., containing an ultralong CDR3. In one embodiment, the cloning the polynucleotide comprises performing single-cell RT-PCR amplification.

VI. Compositions and Formulations

Also provided are compositions comprising the binding polypeptides, such as antibodies or antigen-binding fragments or knob peptides, described herein, including pharmaceutical compositions and formulations. In one embodiment, a composition comprises a soluble peptide produced as described herein. In one embodiment, a composition comprises a fusion protein containing a soluble peptide, produced as described herein. In one embodiment, a composition comprises a soluble peptide identified for binding ability to a target molecule, e.g., identified as described herein. In some embodiments, a composition comprises a knob polypeptide or a synthetic peptide comprising an ultralong CDR3. The pharmaceutical compositions and formulations generally include one or more optional pharmaceutically acceptable carriers or excipients.
The term “pharmaceutical formulation” refers to a preparation which is in such form as to permit the biological activity of an active ingredient contained therein to be effective, and which contains no additional components which are unacceptably toxic to a subject to which the formulation would be administered.
A “pharmaceutically acceptable carrier” refers to an ingredient in a pharmaceutical formulation, other than an active ingredient, which is nontoxic to a subject. A pharmaceutically acceptable carrier includes, but is not limited to, a buffer, excipient, stabilizer, or preservative.
In some aspects, the choice of carrier is determined in part by the particular cell, binding molecule, and/or antibody, and/or by the method of administration. Accordingly, there are a variety of suitable formulations. For example, the pharmaceutical composition can contain preservatives. Suitable preservatives may include, for example, methylparaben, propylparaben, sodium benzoate, and benzalkonium chloride. In some aspects, a mixture of two or more preservatives is used. The preservative or mixtures thereof are typically present in an amount of about 0.0001% to about 2% by weight of the total composition. Carriers are described, e.g., by Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980). Pharmaceutically acceptable carriers are generally nontoxic to recipients at the dosages and concentrations employed, and include, but are not limited to: buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid and methionine; preservatives (such as octadecyldimethylbenzyl ammonium chloride; hexamethonium chloride; benzalkonium chloride; benzethonium chloride; phenol, butyl or benzyl alcohol; alkyl parabens such as methyl or propyl paraben; catechol; resorcinol; cyclohexanol; 3-pentanol; and m-cresol); low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, histidine, arginine, or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugars such as sucrose, mannitol, trehalose or sorbitol; salt-forming counter-ions such as sodium; metal complexes (e.g. Zn-protein complexes); and/or non-ionic surfactants such as polyethylene glycol (PEG).
Buffering agents in some aspects are included in the compositions. Suitable buffering agents include, for example, citric acid, sodium citrate, phosphoric acid, potassium phosphate, and various other acids and salts. In some aspects, a mixture of two or more buffering agents is used. The buffering agent or mixtures thereof are typically present in an amount of about 0.001% to about 4% by weight of the total composition. Methods for preparing administrable pharmaceutical compositions are known. Exemplary methods are described in more detail in, for example, Remington: The Science and Practice of Pharmacy, Lippincott Williams & Wilkins; 21st ed. (May 1, 2005).
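The weight-percent ranges given above for preservatives (about 0.0001% to about 2%) and buffering agents (about 0.001% to about 4%) can be checked with simple arithmetic. The Python sketch below is one way to do so; the function names, the constants, and the example masses are hypothetical, and the bounds are treated as hard limits even though the text qualifies them with "about".

```python
# Ranges in percent by weight of the total composition, as stated in the text.
PRESERVATIVE_RANGE = (0.0001, 2.0)
BUFFER_RANGE = (0.001, 4.0)


def weight_percent(component_mass_g: float, total_mass_g: float) -> float:
    """Percent by weight of one component in the total composition."""
    if total_mass_g <= 0:
        raise ValueError("total mass must be positive")
    return 100.0 * component_mass_g / total_mass_g


def within_range(pct: float, bounds: tuple[float, float]) -> bool:
    """True if pct lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= pct <= high


# Hypothetical formulation: 100 g total, 0.1 g preservative, 5 g buffer.
preservative_pct = weight_percent(0.1, 100.0)
buffer_pct = weight_percent(5.0, 100.0)

print(within_range(preservative_pct, PRESERVATIVE_RANGE))  # inside 0.0001-2%
print(within_range(buffer_pct, BUFFER_RANGE))              # 5% exceeds 4%
```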
Formulations of the antibodies described herein can include lyophilized formulations and aqueous solutions.
In some embodiments, an antibody described herein may be administered within a pharmaceutically-acceptable diluent, carrier, or excipient, in unit dose form. Conventional pharmaceutical practice may be employed to provide suitable formulations or compositions to administer to individuals being treated for SARS-CoV-2 infection. In some embodiments, the administration is prophylactic. Any appropriate route of administration may be employed, for example, administration may be parenteral, intravenous, intra-arterial, subcutaneous, intramuscular, intraperitoneal, intranasal, aerosol, suppository, oral administration, or via inhalation.
Formulations include those for oral, intravenous, intraperitoneal, subcutaneous, pulmonary, transdermal, intramuscular, intranasal, buccal, sublingual, or suppository administration. In some embodiments, the compositions are administered parenterally. The term “parenteral,” as used herein, includes intravenous, intramuscular, subcutaneous, rectal, vaginal, intracranial, intrathoracic, and intraperitoneal administration.
Compositions in some embodiments are provided as sterile liquid preparations, e.g., isotonic aqueous solutions, suspensions, emulsions, dispersions, or viscous compositions, which may in some aspects be buffered to a selected pH. Liquid preparations are normally easier to prepare than gels, other viscous compositions, and solid compositions. Additionally, liquid compositions are somewhat more convenient to administer, especially by injection. Viscous compositions, on the other hand, can be formulated within the appropriate viscosity range to provide longer contact periods with specific tissues. Liquid or viscous compositions can comprise carriers, which can be a solvent or dispersing medium containing, for example, water, saline, phosphate buffered saline, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol) and suitable mixtures thereof.
Sterile injectable solutions can be prepared by incorporating the binding molecule in a solvent, such as in admixture with a suitable carrier, diluent, or excipient such as sterile water, physiological saline, glucose, dextrose, or the like. The compositions can also be lyophilized. The compositions can contain auxiliary substances such as wetting, dispersing, or emulsifying agents (e.g., methylcellulose), pH buffering agents, gelling or viscosity enhancing additives, preservatives, flavoring agents, colors, and the like, depending upon the route of administration and the preparation desired. Standard texts may in some aspects be consulted to prepare suitable preparations.
Various additives which enhance the stability and sterility of the compositions, including antimicrobial preservatives, antioxidants, chelating agents, and buffers, can be added. Prevention of the action of microorganisms can be ensured by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, and the like. Prolonged absorption of the injectable pharmaceutical form can be brought about by the use of agents delaying absorption, for example, aluminum monostearate and gelatin.
Pharmaceutical compositions according to the invention may be, for example, in unit dose form, such as in the form of ampoules, vials, suppositories, tablets, pills, or capsules. The formulations can be administered to human individuals in therapeutically or prophylactic effective amounts (e.g., amounts which prevent, eliminate, or reduce a pathological condition) to provide therapy for a disease or condition. The preferred dosage of therapeutic agent to be administered is likely to depend on such variables as the type and extent of the disorder, the overall health status of the particular patient, the formulation of the compound excipients, and its route of administration.
In certain embodiments, the compositions described herein can be formulated for pulmonary administration, and in certain embodiments the composition is formulated for administration via inhalation (e.g., intrabronchial, intranasal or oral inhalation, intranasal drops). The composition may be administered with the use of a nebulizer, inhaler, atomizer, aerosolizer, mister, dry powder inhaler, metered dose inhaler, metered dose sprayer, metered dose mister, metered dose atomizer, or other suitable delivery device.
In some embodiments, the composition is a lyophilized composition. In some embodiments, the composition is formulated for aerosol administration, and in certain embodiments the composition is formulated for oral administration or administration via inhalation.
The pharmaceutical compositions described herein are prepared in a manner known per se, for example, by means of conventional dissolving, lyophilizing, mixing, granulating or confectioning processes. The pharmaceutical compositions may be formulated according to conventional pharmaceutical practice (see for example, in Remington: The Science and Practice of Pharmacy (21st ed.), ed. A. R. Gennaro, 2005, Lippincott Williams & Wilkins, Philadelphia, PA, and Encyclopedia of Pharmaceutical Technology, eds. J. Swarbrick and J. C. Boylan, 2013, Marcel Dekker, New York, NY).
In instances where aerosol administration is appropriate, the squalamine or a derivative thereof can be formulated as aerosols using standard procedures. The term “aerosol” includes any gas-borne suspended phase of a squalamine or a derivative thereof which is capable of being inhaled into the bronchioles or nasal passages, and includes dry powder and aqueous aerosol, and pulmonary and nasal aerosols. Specifically, aerosol includes a gas-borne suspension of droplets of squalamine or a derivative thereof, as may be produced in a metered dose inhaler or nebulizer, or in a mist sprayer. Aerosol also includes a dry powder composition of a compound of the invention suspended in air or other carrier gas, which may be delivered by insufflation from an inhaler device, for example. See Ganderton & Jones, Drug Delivery to the Respiratory Tract (Ellis Horwood, 1987); Gonda, Critical Reviews in Therapeutic Drug Carrier Systems, 6:273-313 (1990); and Raeburn et al., J. Pharmacol. Toxicol. Methods, 27:143-159 (1992).
The formulations to be used for in vivo administration are generally sterile. The injection compositions are prepared in customary manner under sterile conditions; the same applies also to introducing the compositions into ampoules or vials and sealing the containers. Sterility may be readily accomplished, e.g., by filtration through sterile filtration membranes.
The pharmaceutical composition in some aspects can employ time-released, delayed release, and sustained release delivery systems such that the delivery of the composition occurs prior to, and with sufficient time to cause, sensitization of the site to be treated. Many types of release delivery systems are available and known. Such systems can avoid repeated administrations of the composition, thereby increasing convenience to the subject and the physician.
The pharmaceutical composition in some embodiments contains the binding polypeptides, such as antibodies or antigen binding fragments, in amounts effective to treat or prevent the disease or condition, such as a therapeutically effective or prophylactically effective amount. Therapeutic or prophylactic efficacy in some embodiments is monitored by periodic assessment of treated subjects. For repeated administrations over several days or longer, depending on the condition, the treatment is repeated until a desired suppression of disease symptoms occurs. However, other dosage regimens may be useful and can be determined. The desired dosage can be delivered by a single bolus administration of the composition, by multiple bolus administrations of the composition, or by continuous infusion administration of the composition.

VII. Methods of Use

Provided herein are methods of treatment and uses for treating a disease or condition in a subject. In some embodiments, the methods and uses include administering a provided binding polypeptide, such as an antibody or antigen binding fragment or knob peptide, into a subject (e.g. a human). In some embodiments, the binding polypeptide or a composition containing same is administered to the subject by a parenteral administration. In some embodiments, the binding polypeptide or a composition containing same is administered intramuscularly, subcutaneously, intravenously, topically, orally or by inhalation. In particular embodiments, particularly for delivery of a knob peptide, the administration is by inhalation. In some embodiments, a provided binding polypeptide, such as a knob peptide, may be administered by aerosol administration, such as by delivery using an inhaler or nebulizer or a mist sprayer.
In some embodiments, provided embodiments relate to methods for treating or preventing a cancer or proliferative disease in a subject. In some embodiments, provided embodiments relate to methods for treating or preventing a coronavirus infection in a subject. In some embodiments, the methods are for prophylactic treatment of a viral infection in a subject at risk of a viral infection. In some embodiments, the methods are for treating a subject known or suspected of having a viral infection. In some embodiments, the methods may prevent a viral infection, such as a coronavirus infection, in a subject. In some embodiments, the methods may reduce signs or symptoms of the coronavirus infection in the subject, such as mitigate the presence or severity of one or more signs or symptoms. In some embodiments, the binding molecules, such as antibodies or antigen binding fragments or knob peptides, are administered to a subject in an effective amount to effect treatment of the infection. Also provided herein are uses of the binding polypeptides, such as antibodies or antigen binding fragments or knob peptides, in such methods and treatments, and in the preparation of a medicament in order to carry out such therapeutic methods. In some embodiments, the methods are carried out by administering the binding polypeptides, or compositions comprising the same, to the subject having, having had, or suspected of having the disease or condition. In some embodiments, the methods thereby treat the disease or condition or disorder in the subject. Also provided herein are uses of any of the compositions, such as pharmaceutical compositions provided herein, for the treatment of a disease or disorder associated with a coronavirus infection, for example, due to SARS-CoV-2.
In some embodiments, a provided binding polypeptide, such as an antibody or antigen binding fragment or a knob peptide, is administered to the subject in an effective or therapeutically effective amount. An effective or therapeutically effective dose of a provided binding polypeptide, such as an antibody or antigen binding fragment or knob peptide, for treating or preventing a viral infection is an amount sufficient to alleviate one or more signs and/or symptoms of the infection in the treated subject, whether by inducing the regression or elimination of such signs and/or symptoms or by inhibiting the progression of such signs and/or symptoms. The dose amount may vary depending upon the age and the size of a subject to be administered, target disease, conditions, route of administration, and the like. In an embodiment, an effective or therapeutically effective dose of a provided binding polypeptide, such as an antibody or antigen-binding fragment thereof or a knob peptide, for treating or preventing viral infection, e.g., in an adult human subject, is about 0.001 mg/kg to about 200 mg/kg, such as 0.01 mg/kg to 200 mg/kg or 0.1 mg/kg to 200 mg/kg. Depending on the severity of the infection, the frequency and the duration of the treatment can be adjusted.
The provided methods and uses include methods and uses for treating a viral infection in a subject. For instance, methods of treating include administering a provided binding polypeptide, such as an antibody or antigen-binding fragment or a knob peptide, to a subject having one or more signs or symptoms of a disease or infection, e.g., viral infection, at an effective or therapeutically effective amount or dose.
In some embodiments, the provided methods and uses include prophylactic methods and uses. In some embodiments, provided herein are methods for prophylactically administering a provided binding polypeptide, such as an antibody or antigen-binding fragment or a knob peptide, to a subject who is at risk of viral infection so as to prevent such infection. In some embodiments, the amount administered is an effective or therapeutically effective amount or dose. In some embodiments, the provided methods and uses prevent a viral infection in the subject. In some embodiments, preventing a viral infection by a provided method involves administering a provided binding polypeptide, such as an antibody or antigen binding fragment or knob peptide, to a subject to inhibit the manifestation of a disease or infection (e.g., viral infection) in the body of a subject. In some embodiments, the methods reduce one or more signs or symptoms of a viral infection.
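By way of illustration only, the weight-based dose range described above (about 0.001 mg/kg to about 200 mg/kg) can be converted to an absolute amount for a given subject by simple multiplication. The following is a minimal sketch; the function name `dose_range_mg` and the 70 kg body weight are hypothetical values chosen for the example and do not appear in the disclosure.

```python
def dose_range_mg(weight_kg, low_mg_per_kg=0.001, high_mg_per_kg=200.0):
    """Return (low, high) absolute doses in mg for a body weight in kg.

    The default bounds are the broadest mg/kg range stated in the text;
    narrower ranges (e.g. 0.01 or 0.1 mg/kg as the lower bound) can be
    passed in explicitly.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# Hypothetical 70 kg adult subject (assumed value, for illustration only)
low_mg, high_mg = dose_range_mg(70.0)
```

For the assumed 70 kg subject, this corresponds to approximately 0.07 mg to 14,000 mg; the narrower ranges of 0.01 mg/kg to 200 mg/kg or 0.1 mg/kg to 200 mg/kg raise only the lower bound.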

VIII. Exemplary Embodiments

Among the provided embodiments are:

- 1. A method of preparing a cow ultralong CDR3 antibody display library, the method comprising:
- (a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library;
- (b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to a variable lambda light (VL) region selected from the group consisting of VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof;
- (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and
- (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.
- 2. The method of embodiment 1, wherein the VL region is the BLV1H12 VL region.
- 3. A method of preparing a cow ultralong CDR3 antibody display library, the method comprising:
- (a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library;
- (b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof;
- (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and
- (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.
- 4. The method of any of embodiments 1-3, wherein the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.
- 5. The method of any of embodiments 1-3, further comprising preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.
- 6. The method of embodiment 4 or embodiment 5, further comprising immunizing the cow with a target antigen.
- 7. The method of any of embodiments 1-6, wherein the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles.
- 8. The method of any of embodiments 1-7, wherein the amplified display particles are phage display particles.
- 9. The method of any of embodiments 1-8, wherein the amplified display particles are phagemid particles.
- 10. The method of embodiment 9, wherein each replicable expression vector further comprises a second nucleic acid encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
- 11. A method of preparing a cow ultralong CDR3 antibody phage display library, the method comprising:
- (a) immunizing a cow with a target antigen;
- (b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow;
- (c) amplifying sequences encoding a plurality of VH regions of the IgHV1-7 family from the cDNA template library;
- (d) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof, and (2) a second nucleic acid encoding at least a portion of a phage coat protein;
- (e) transforming suitable host cells with the plurality of replicable expression vectors;
- (f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and
- (g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an scFv.
- 12. The method of any of embodiments 1-11, wherein the BLV1H12 lambda VL region is set forth in SEQ ID NO: 2.
- 13. The method of any of embodiments 1-11, wherein the BLV1H12 lambda VL region is a humanized variant of the lambda VL region of BLV1H12.
- 14. The method of embodiment 13, wherein the humanized variant comprises one or more of amino acid replacements S2A, T5N, P8S, A12G, A13S, and P14L based on Kabat numbering, amino acid replacements I29V and N32G in the CDR1 region, and/or amino acid substitution of DNN to GDT in the CDR2 region.
- 15. The method of embodiment 13 or embodiment 14, wherein the humanized variant comprises the sequence set forth in SEQ ID NO: 107.
- 16. The method of any of embodiments 2-15, wherein the amplified VH region is joined to the BLV1H12 lambda VL region indirectly via a peptide linker.
- 17. The method of embodiment 16, wherein the peptide linker is (Gly₄Ser)₃ (SEQ ID NO: 94).
- 18. The method of any of embodiments 1-17, wherein the plurality of VH regions of the IgHV1-7 family from the cDNA template library are amplified with a forward primer comprising the sequence set forth in SEQ ID NO:84 and a reverse primer comprising the sequence set forth in SEQ ID NO:85.
- 19. The method of any of embodiments 1-18, wherein prior to the constructing, the method further comprises performing a size separation on the sequences encoding the plurality of amplified VH regions to enrich for VH regions with an ultralong CDR3.
- 20. The method of embodiment 19, wherein the size separation is performed by gel electrophoresis.
- 21. The method of embodiment 20, wherein the gel electrophoresis is performed using a 1.2%, 1.5%, or 2% agarose gel, optionally using a 2% agarose gel.
- 22. The method of any of embodiments 19-21, wherein the size separation comprises separating sequences of, of about, or greater than 550 base pairs in length from the sequences encoding the plurality of amplified VH regions, wherein the sequences of, of about, or greater than 550 base pairs in length comprise sequences encoding VH regions with an ultralong CDR3.
- 23. The method of any of embodiments 1-22, wherein at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.
- 24. The method of any of embodiments 1-23, wherein at least or at least about 30% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.
- 25. The method of any of embodiments 1-24, wherein at least or at least about 40% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.
- 26. The method of any of embodiments 1-25, wherein at least or at least about 50% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.
- 27. The method of any of embodiments 1-26, wherein the ultralong CDR3 is a peptide sequence of 25-70 amino acids comprising a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
- 28. The method of any of embodiments 1-27, wherein the ultralong CDR3 is 40 to 60 amino acids in length.
- 29. The method of any of embodiments 1-28, wherein the ultralong CDR3 is at least 42 amino acids in length.
- 30. The method of any of embodiments 1-29, wherein the ultralong CDR3 is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
- 31. The method of any of embodiments 1-30, wherein the ultralong CDR3 comprises at least 4 cysteine residues.
- 32. The method of any of embodiments 1-31, wherein the ultralong CDR3 contains 4 cysteine residues.
- 33. The method of any of embodiments 1-31, wherein the ultralong CDR3 contains 6, 8, 10, or 12 cysteine residues.
- 34. The method of any of embodiments 1-33, wherein the ultralong CDR3 has at least 2 disulfide bonds.
- 35. The method of any of embodiments 1-34, wherein the ultralong CDR3 has 2 disulfide bonds.
- 36. The method of any of embodiments 1-34, wherein the ultralong CDR3 has 3, 4 or 5 disulfide bonds.
- 37. A method of preparing an ultralong CDR3-knob display library, the method comprising:
- (a) amplifying sequences encoding a plurality of CDR3-knob only antibodies from a cow antibody variable heavy (VH) chain complementary DNA (cDNA) template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region;
- (b) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a first nucleic acid sequence encoding an amplified CDR3 knob;
- (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and
- (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an amplified CDR3 knob.
- 38. The method of embodiment 37, wherein the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.
- 39. The method of embodiment 37, further comprising preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.
- 40. The method of embodiment 38 or embodiment 39, further comprising immunizing the cow with a target antigen.
- 41. The method of any of embodiments 37-40, wherein the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles.
- 42. The method of any of embodiments 37-41, wherein the amplified display particles are phage display particles.
- 43. The method of any of embodiments 37-42, wherein the amplified display particles are phagemid particles.
- 44. The method of embodiment 43, wherein each replicable expression vector further comprises a second nucleic acid encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
- 45. A method of preparing an ultralong CDR3-knob phage display library, the method comprising:
- (a) immunizing a cow with a target antigen;
- (b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow;
- (c) amplifying sequences encoding a plurality of CDR3-knob only antibodies from the cDNA template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region;
- (d) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding an amplified CDR3 knob and (2) a second nucleic acid encoding at least a portion of a phage coat protein;
- (e) transforming suitable host cells with the plurality of replicable expression vectors;
- (f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and
- (g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an amplified CDR3 knob.
- 46. The method of any of embodiments 37-45, wherein the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 7-11.
- 47. The method of any of embodiments 37-46, wherein each of the plurality of CDR3-knob only antibodies comprises a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
- 48. The method of embodiment 47, wherein the peptide sequence is 40 to 60 amino acids in length.
- 49. The method of embodiment 47 or embodiment 48, wherein the peptide sequence is at least 42 amino acids in length.
- 50. The method of any of embodiments 47-49, wherein the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
- 51. The method of any of embodiments 47-50, wherein the peptide sequence comprises at least 4 cysteine residues.
- 52. The method of any of embodiments 47-51, wherein the peptide sequence contains 4 cysteine residues.
- 53. The method of any of embodiments 47-51, wherein the peptide sequence contains 6, 8, 10, or 12 cysteine residues.
- 54. The method of any of embodiments 47-53, wherein the peptide sequence has at least 2 disulfide bonds.
- 55. The method of any of embodiments 47-54, wherein the peptide sequence has 2 disulfide bonds.
- 56. The method of any of embodiments 47-54, wherein the peptide sequence has 3, 4 or 5 disulfide bonds.
- 57. The method of any of embodiments 6-36 and 40-56, wherein the target antigen is a nonvirulent bacteria, a virus, a viral protein, an immunomodulatory protein (e.g. a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof.
- 58. The method of any of embodiments 1-57, wherein the cDNA template library was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5), and IgG-specific (SEQ ID NO: 3 and 6) primers.
- 59. A method of preparing an ultralong CDR3-knob display library, the method comprising:
- (a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds;
- (b) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and
- (c) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising a CDR3 knob.
- 60. The method of embodiment 59, wherein the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles.
- 61. The method of embodiment 59 or embodiment 60, wherein the amplified display particles are phage display particles.
- 62. The method of any of embodiments 59-61, wherein the amplified display particles are phagemid particles.
- 63. The method of embodiment 62, wherein each replicable expression vector further comprises a second nucleic acid encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
- 64. A method of preparing an ultralong CDR3-knob phage display library, the method comprising:
- (a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds and (2) a second nucleic acid encoding at least a portion of a phage coat protein;
- (b) transforming suitable host cells with a plurality of replicable expression vectors;
- (c) infecting the transformed host cells with a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and
- (d) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and a CDR3 knob.
- 65. The method of any of embodiments 27-36 and 47-64, wherein the peptide sequence comprises an ascending stalk domain and a descending stalk domain, wherein the cysteine motif is between the ascending and descending stalk domains.
- 66. The method of embodiment 64 or embodiment 65, wherein the peptide sequence is amplified from DNA from a cow immunized with a target antigen.
- 67. The method of embodiment 66, wherein the peptide sequence is amplified from a variable heavy chain cDNA library from the immunized cow using primers specific for either side of the stalk domain of a cow ultralong CDR3 region.
- 68. The method of any of embodiments 27-36, 47-64, 66, and 67, wherein the peptide sequence does not comprise an ascending stalk domain N-terminal to the cysteine motif.
- 69. The method of any of embodiments 27-36, 47-64, and 66-68, wherein the peptide sequence does not comprise a descending stalk domain C-terminal to the cysteine motif.
- 70. The method of any of embodiments 65-67 and 69, wherein the ascending stalk domain comprises the sequence CX₂TVX₅Q, wherein X₂ and X₅ are any amino acid.
- 71. The method of embodiment 70, wherein X₂ is Ser, Thr, Gly, Asn, Ala, or Pro, and X₅ is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu.
- 72. The method of embodiment 70 or embodiment 71, wherein X₂ is Ser, Ala, or Thr, and X₅ is His or Tyr.
- 73. The method of any of embodiments 64, 65, and 68-72, wherein the peptide sequence is a synthetic CDR3-knob.
- 74. The method of any of embodiments 64, 65, and 68-73, wherein the peptide sequence is a cyclotide or modified cyclotide.
- 75. The method of any of embodiments 64, 65, and 68-73, wherein the peptide sequence is a semisynthetic CDR3-knob derived from a bovine CDR3-knob.
- 76. The method of any of embodiments 64-75, wherein the peptide sequence is 40 to 60 amino acids in length.
- 77. The method of any of embodiments 64-76, wherein the peptide sequence is at least 42 amino acids in length.
- 78. The method of any of embodiments 64-77, wherein the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
- 79. The method of any of embodiments 64-78, wherein the peptide sequence comprises at least 4 cysteine residues.
- 80. The method of any of embodiments 64-79, wherein the peptide sequence contains 4 cysteine residues.
- 81. The method of any of embodiments 64-79, wherein the peptide sequence contains 6, 8, 10, or 12 cysteine residues.
- 82. The method of any of embodiments 64-81, wherein the peptide sequence has at least 2 disulfide bonds.
- 83. The method of any of embodiments 64-82, wherein the peptide sequence has 2 disulfide bonds.
- 84. The method of any of embodiments 64-82, wherein the peptide sequence has 3, 4 or 5 disulfide bonds.
- 85. The method of any of embodiments 64, 65, and 68-84, wherein the plurality of CDR3 knobs are mutated at one or more selected positions within the nucleic acid sequence encoding the peptide sequence, wherein the plurality of replicable expression vectors are a family of mutated vectors.
- 86. The method of any of embodiments 1-85, wherein the expression vector further comprises a secretory signal sequence.
- 87. The method of embodiment 86, wherein the secretory signal sequence is a pelB signal sequence.
- 88. The method of any of embodiments 1-87, wherein the suitable host cells are E. coli cells.
- 89. The method of any of embodiments 1-88, wherein the suitable host cells are TG1 electrocompetent cells.
- 90. The method of any of embodiments 9-36, 43-58, and 62-89, wherein the phagemid particles are derived from M13 phage.
- 91. The method of any of embodiments 10-36, 44-58, and 63-90, wherein the coat protein is the M13 phage gene III coat protein (pIII).
- 92. The method of any of embodiments 10-36, 44-58, and 63-91, wherein the helper phage is selected from the group consisting of M13K07, M13R408, M13-VCS, and Phi X 174.
- 93. The method of any of embodiments 10-36, 44-58, and 63-92, wherein the helper phage is M13K07.
- 94. The method of any of embodiments 1-93, wherein the display particles on average display one copy of the fusion protein on the surface of the particle.
- 95. A library of display particles produced by the method of any of embodiments 1-94.
- 96. A replicable expression vector comprising a gene fusion encoding a fusion protein that comprises a first nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a variable lambda light (VL) region selected from VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof.
- 97. A replicable expression vector comprising a gene fusion encoding a fusion protein that comprises a first nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a BLV1H12 lambda variable light (VL) region or a humanized variant thereof.
- 98. The replicable expression vector of embodiment 96 or embodiment 97, further comprising a second nucleic acid sequence encoding at least a portion of a phage coat protein.
- 99. A display particle encoded by the replicable expression vectors of any of embodiments 96-98.
- 100. A library of display particles comprising a plurality of display particles of embodiment 95 or embodiment 99.
- 101. The library of embodiment 100, wherein at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.
- 102. The library of embodiment 100 or embodiment 101, wherein at least or at least about 30% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.
- 103. The library of any of embodiments 100-102, wherein at least or at least about 40% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.
- 104. The library of any of embodiments 100-103, wherein at least or at least about 50% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.
- 105. A replicable expression vector comprising a gene fusion encoding a fusion protein that comprises a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form disulfide bonds.
- 106. The replicable expression vector of embodiment 105, further comprising a second nucleic acid sequence encoding at least a portion of a phage coat protein.
- 107. A display particle encoded by the replicable expression vectors of embodiment 105 or embodiment 106.
- 108. A library of display particles comprising a plurality of display particles of embodiment 107.
- 109. The library of any of embodiments 95, 100-104, and 108, wherein the display particles are phage display particles.
- 110. The library of any of embodiments 95, 100-104, 108, and 109, wherein the display particles are phagemid particles.
- 111. A method for selecting an antibody binding protein, the method comprising:
- (1) contacting the library of display particles of any of embodiments 95, 100-104, and 108-110 with a target molecule under conditions to allow binding of a display particle to the target molecule; and
- (2) separating the display particles that bind from those that do not, thereby selecting display particles comprising an antibody binding protein that binds to the target molecule.
- 112. The method of embodiment 111, wherein the display particles are phage display particles.
- 113. The method of embodiment 111 or embodiment 112, wherein the display particles are phagemid particles.
- 114. The method of any of embodiments 111-113, wherein the target molecule is a nonvirulent bacteria, a virus, a viral protein, an immunomodulatory protein (e.g. a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof.
- 115. The method of any of embodiments 111-114, wherein the target molecule is a coronavirus, a coronavirus pseudovirus, a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein.
- 116. The method of embodiment 115, wherein the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2.
- 117. The method of embodiment 115 or embodiment 116, wherein the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant.
- 118. The method of any of embodiments 111-117, further comprising:
- (i) infecting suitable host cells with replicable expression vectors encoding the selected display particles that bind in (2);
- (ii) collecting the amplified display particles; and
- (iii) repeating steps (1) and (2) using the amplified display particles as the library of display particles.
- 119. The method of embodiment 118, wherein the display particles are phagemid particles, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles.
- 120. The method of embodiment 118 or embodiment 119, wherein the steps are repeated one or more times.
- 121. The method of any of embodiments 118-120, wherein the steps are repeated with the same target molecule or a different target molecule.
- 122. The method of embodiment 121, wherein the steps are repeated with a different target molecule and the different target molecule is related to the target molecule.
- 123. The method of embodiment 121 or embodiment 122, wherein the different target molecule is the same type of pathogen as, the same group of pathogen as, or a variant of the target molecule.
- 124. The method of any of embodiments 111-123, further comprising sequencing the fusion gene in the selected display particles to identify the antibody binding protein.
- 125. The method of embodiment 124, further comprising producing a full-length IgG or a Fab from the selected antibody binding protein.
- 126. The method of embodiment 124 or embodiment 125, wherein the antibody binding protein is a scFv, and the method comprises constructing a heavy chain or a portion thereof comprising joining the VH region of the scFv with a constant region or a portion thereof.
- 127. The method of embodiment 124 or embodiment 125, wherein the method comprises constructing a humanized VH region by replacing a knob region of the ultralong CDR3 region of a humanized bovine VH region with an ultralong CDR3 region of a selected antibody binding protein.
- 128. The method of embodiment 127, wherein the ultralong CDR3 region of a selected antibody binding protein is replaced between an ascending stalk strand and a descending stalk strand of a humanized bovine VH region.
- 129. The method of embodiment 128, wherein the VH region comprises the formula V1-X-V2, wherein the V1 region of the heavy chain comprises the sequence set forth in SEQ ID NO: 111; the X region comprises an ultralong CDR3 of a selected antibody; and the V2 region comprises the sequence set forth in SEQ ID NO: 112.
- 130. The method of any of embodiments 127-129, wherein the method further comprises constructing a heavy chain or a portion thereof comprising joining the humanized VH region with a constant region or a portion thereof.
- 131. The method of embodiment 126 or embodiment 130, wherein the heavy chain or the portion thereof is a human IgG1 heavy chain or portion thereof.
- 132. The method of any of embodiments 126, 130, and 131, further comprising co-expressing the heavy chain or portion thereof with a light chain.
- 133. The method of embodiment 132, wherein the light chain is a bovine light chain of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, or F18, or is a humanized variant thereof.
- 134. The method of embodiment 132 or embodiment 133, wherein the light chain is a BLV1H12 light chain (SEQ ID NO: 113) or a humanized variant thereof.
- 135. The method of any of embodiments 131-134, wherein the light chain is a humanized light chain set forth in SEQ ID NO: 114.
- 136. The method of embodiment 132 or embodiment 133, wherein the light chain is a BLV5B8 light chain (SEQ ID NO: 115) or a humanized variant thereof.
- 137. The method of embodiment 132, wherein the light chain is a human light chain.
- 138. The method of embodiment 132 or embodiment 137, wherein the light chain is selected from the group consisting of VL1-47, VL1-40, VL1-51, and VL2-18.
- 139. The method of any of embodiments 132, 137, and 138, wherein the light chain is set forth in any one of SEQ ID NO: 116-120.
- 140. A method for producing a soluble ultralong CDR3 knob, comprising:
- (a) transforming E. coli with an expression vector encoding a fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds;
- (b) culturing the bacteria under conditions permissive of expression of the fusion protein;
- (c) isolating the fusion protein from supernatant of a bacterial cell lysate; and
- (d) cleaving the cleavable linker of the fusion protein, thereby producing a soluble ultralong CDR3 knob comprising 1-6 disulfide bonds free of the bacterial chaperone.
- 141. The method of embodiment 140, wherein the ultralong CDR3 knob is an antibody binding protein identified by the method of any of embodiments 111-124.
- 142. The method of embodiment 140 or embodiment 141, wherein the fusion protein has increased solubility relative to the ultralong CDR3 knob alone.
- 143. The method of any of embodiments 140-142, wherein the bacterial chaperone is thioredoxin A (TrxA).
- 144. The method of any of embodiments 140-143, wherein the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106).
- 145. The method of any of embodiments 140-144, wherein cleaving the cleavable linker comprises adding enterokinase to the supernatant.
- 146. The method of any of embodiments 140-145, wherein the soluble ultralong CDR3 knob comprises a further linker to allow for cyclizing the soluble ultralong CDR3 knob via chemical or enzymatic methods, optionally wherein the further linker allows for sortase-mediated cyclization.
- 147. The method of embodiment 146, further comprising cyclizing the soluble ultralong CDR3 knob.
- 148. The method of any of embodiments 140-147, further comprising (e) removing the enterokinase and/or the bacterial chaperone from the solution comprising the soluble ultralong CDR3 knob.
- 149. The method of any of embodiments 140-148, further comprising enriching for the soluble ultralong CDR3 knob from the solution comprising the soluble ultralong CDR3 knob, optionally wherein the enriching comprises size exclusion chromatography.
- 150. The method of any of embodiments 140-149, further comprising producing a multispecific binding molecule comprising the soluble ultralong CDR3 knob.
- 151. The method of any of embodiments 140-150, wherein the ultralong CDR3 knob is 3-8 kDa or 4-5 kDa in size.
- 152. A fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
- 153. The fusion protein of embodiment 152, wherein the bacterial chaperone is thioredoxin A (TrxA).
- 154. The fusion protein of embodiment 152 or embodiment 153, wherein the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106).
- 155. The fusion protein of any of embodiments 152-154, wherein the ultralong CDR3 knob comprises 1-6 disulfide bonds.
- 156. A composition comprising the fusion protein of any of embodiments 152-155.
- 157. A purified soluble ultralong CDR3 knob produced by the method of any of embodiments 140-151, wherein the soluble ultralong CDR3 is 25-75 amino acids in length and comprises 1-6 disulfide bonds.
- 158. The purified soluble ultralong CDR3 knob of embodiment 157, wherein the ultralong CDR3 knob is 3-8 kDa in size.
- 159. The purified soluble ultralong CDR3 knob of embodiment 157 or embodiment 158, wherein the ultralong CDR3 knob is 4-5 kDa in size.
- 160. A composition comprising the purified soluble ultralong CDR3 of any of embodiments 157-159.
- 161. The composition of embodiment 160, further comprising a pharmaceutically acceptable carrier.
- 162. The composition of embodiment 160 or embodiment 161 that is formulated for parenteral administration.
- 163. The composition of any of embodiments 160-162 that is formulated for intravenous, intramuscular, topical, otic, conjunctival, nasal, inhalation, or subcutaneous administration.
- 164. The composition of any of embodiments 160-163 that is formulated for administration by inhalation.

EXAMPLES

The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.

Example 1: Generation of Anti-SARS CoV-2 Antibodies

Cows were immunized with SARS CoV-2 Spike protein or the receptor binding domain (RBD) portion thereof, and sera were collected to assess binding activity.

A. Spike Protein and Receptor Binding Domain Expression and Purification

SARS CoV-2 spike trimer protein from the parental Wuhan-Hu-1 isolate (NCBI YP_009724390.1) or the B.1.351 “South African” variant with the mutation E484K (and K417N and N501Y), or the parental receptor binding domain (RBD) protein (amino acids 319 to 541 of the spike protein), were produced by transfection of HEK293 cells. Approximately 120×10⁶ HEK293 Freestyle cells with 293fectin (Invitrogen) were combined with 120 μg of pCAGGS-based vector containing (1) the sequence encoding the extracellular domain of the Spike protein with furin-cleavage site removed and K986P and V987P stabilizing mutations, T4-fibritin trimerization domain and C-terminal 6×His-tag, or (2) spike RBD domain (amino acids 319 to 541 of the spike protein) with C-terminal 6×His-tag.
Cells were shaken at 37° C. for 4 days with 8% CO2 with 150 μl TCM-ProteaseArrest tissue culture protease inhibitor (G-Biosciences) added on day 3. The supernatant containing secreted spike or RBD protein was clarified by centrifugation at 4000 RPM for 5 minutes followed by filtration through a 0.45 μm PES filter. The supernatant was concentrated and buffer-exchanged into PBS using Amicon Ultra Centrifugal Filter units (MWCO=50,000 for S protein preparation and 10,000 for the RBD protein) (EMD-Millipore) at 4° C. The concentrated supernatant was then purified using TALON cobalt metal affinity resin (Takara Bio) following the manufacturer's protocol, except that 50 mM, 100 mM, 200 mM, 300 mM and 400 mM imidazole gradient elution fractions (1 column volume of each) were collected. Each elution fraction was resolved on an SDS-PAGE gel stained with InstantBlue Coomassie Protein Stain (Abcam). Fractions containing a single spike protein band or a single RBD band were pooled, buffer-exchanged into PBS as described above, and the concentration of protein quantified using Nanodrop One (Thermo Scientific) based on the extinction coefficient and molecular weight of the spike or RBD protein, respectively.

B. Immunization Protocol

Two calves were immunized with purified Wuhan-Hu-1 spike protein or RBD protein variant with 200 μg/dose spread over 5 neck locations and boosted according to published methods (Sok et al. Nature 2017, 548(7665):108-111; Wang et al. Cell 2013, 153(6):1379-1393). Serum was collected and IgG ELISAs performed against the RBD domain of the SARS-CoV-2 spike on serum from the RBD immunized calf at a serum dilution range from 1:100 to 1:10,000. Spike protein reactivity was observed 7-21 days post-immunizations. As shown in FIG. 2A, binding activity for the RBD domain was significant after the first immunization.
Serum IgG was also assessed for neutralization of Spike protein and virus using a plaque reduction and neutralization test (PRNT). In this in vitro assay, virus and serum IgG are pre-incubated together before being concomitantly applied to permissive cells such that virus successfully bound by antibody can no longer penetrate cells and/or can no longer further propagate infection. As a result, foci of infection and cell damage called “plaques” appear to be smaller in size and/or number when the cellular monolayer is stained.
A pseudovirus expressing the SARS CoV-2 Spike protein was used as a model virus to assay percent neutralization of serum IgG from both parental Spike protein and RBD immunized cows in Vero6 cells. Compared with natural virus, the pseudovirus can be handled with BSL-2 considerations at high titer and can only infect cells in a single round. As shown in FIG. 2B, IgG obtained from cows in either of the immunization protocols was able to successfully neutralize the pseudovirus in a dose dependent manner. At higher concentrations, serum IgG (ng/mL) from cows immunized with the RBD alone was observed to neutralize 100% of pseudovirus.
Taken together, these results support that immunized cow serum, and antibodies contained therein, can neutralize SARS-CoV-2.

Example 2: Generation of Ultralong CDR3 scFv Antibody or CDR3-Knob Only Phage Display Libraries for Antibody Discovery

Peripheral blood mononuclear cells (PBMCs) were collected from the immunized cows described in Example 1, and RNA was extracted and used to generate two phage display libraries as described below. Specifically, approximately 1-5×10⁷ PBMCs were collected 14-64 days post-immunization and stored prior to RNA extraction and cDNA synthesis.
Two library strategies were employed, either using the antibodies in an scFv format with variable heavy chain (VH) and variable light chain (VL) fragments joined by a flexible linker peptide ((Gly₄Ser)₃ 15 amino acid linker, SEQ ID NO: 94), or using independent CDR3-knobs. In both approaches, the scFv or CDR3-knobs were fused to pIII via a flexible Gly₄Ser linker. FIG. 3A depicts the pIII fusion constructs in each display library. The generation of the display libraries is summarized below.

A. ScFv Library Construction

In the first strategy, immune cow derived VH DNA fragments were combined with a fixed light chain BLV1H12 (Stanfield et al. Science immunology 2016, 1(1):aaf7962.). RBD and full length spike protein immune libraries were constructed for different immunization time points.
RNA was isolated from 5×10⁶-10⁷ bovine PBMCs using an RNeasy kit (Qiagen). Immune cow antibody VH repertoires were obtained through cDNA synthesis from 5 μg total RNA using Superscript IV First-Strand cDNA synthesis kit (ThermoFisher, #18091050), followed by PCR amplification. To generate a VH template library, the cDNA template for VHs was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5) and IgG-specific (SEQ ID NO: 3 and 6) primers.
In these hybrid libraries, full length donor ultra-long VHs were amplified from the VH template library with a VH family specific primer pair. Specifically, both VH regions were amplified with FR1 and FR4 primers specific for the bovine IgHV1-7 family (SEQ ID NO: 12 and 13, respectively) in order to enrich for VH regions with ultralong CDR3 regions. The amplified products were combined with Linker-BLV1H12 lambda light chain variable region (BLV1H12 light chain set forth in SEQ ID NO: 2 and encoded by a DNA sequence set forth in SEQ ID NO: 1) by cloning into the pre-cloned pTAU1 pIII fusion phage display vector pTAU1-BLV1H12(-VH) (see FIG. 3C). The amplified products were subjected to 2 hours digestion with NcoI and XhoI (NEB) and subcloned into pTAU1-BLV1H12(-VH) as NcoI-XhoI fragments for separation of the VH and VL by the flexible linker peptide ((Gly₄Ser)₃, SEQ ID NO: 94). In a further step, some ultra-long VH fragments were additionally enriched by separation from shorter VH fragments using agarose gel electrophoresis, prior to digestion with NcoI and XhoI restriction enzymes. As shown in FIG. 3D, a 2% agarose gel achieved the most separation between ultra-long VH fragments (˜550 base pairs in length) and shorter VH fragments without ultralong CDR3 regions (˜400 base pairs in length).
Next, the digested fragments and vector were ligated overnight with T4 DNA ligase at 16° C. Final libraries were obtained by electroporation of electrocompetent TG1 cells (Lucigen) with the purified ligation products. Each library was a minimum of 10⁷ clones, with >90% containing inserts.

B. CDR3-Knob Library Construction

In a second strategy, a library of VH templates was generated substantially as described in the first strategy. Then, ultra-long VH only, immune cow derived CDR3-knob (also called “CDR3-knob only”) libraries were built by amplifying stalk-knob CDRs from the VH template library using conserved primers and cloning as pIII fusions into the pTAU1 phage display pIII fusion vector.
Specifically, RNA was isolated from 5×10⁶-10⁷ bovine PBMCs using an RNeasy kit (Qiagen). Immune cow antibody CDR3-knob repertoires were obtained through cDNA synthesis from 5 μg total RNA using Superscript IV First-Strand cDNA synthesis kit (ThermoFisher), followed by PCR amplification. To generate the VH template library, the cDNA template for CDR3-knobs was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5) and IgG-specific (SEQ ID NO: 3 and 6) primers.
Primary stalk-knob CDR3s were amplified from first-strand cDNA, with IgHV1-7 family specific primers specific for either side of the stalk domain of the CDR3 region (SEQ ID NO: 7-11). These were then cloned into pTAU1 phage vector as NcoI-NotI fragments following 2 hours digestion with NcoI and NotI (NEB), and ligated overnight with T4 DNA ligase at 16° C. (see FIG. 3B). Final libraries were obtained by electroporation of electrocompetent TG1 cells (Lucigen) with the purified ligation products. Each library was a minimum of 10⁷ clones, with >90% containing inserts.

Example 3: Screening of Phage Display Libraries and Selection of Ultra-Long VH or CDR3-Knob Domains Against SARS CoV-2

The VH ultra-long CDR3 scFv antibody or CDR3-knob only libraries generated as described in Example 2 were subjected to two to five rounds of phage display selections against SARS CoV-2 target proteins (either parental Wuhan-Hu-1 or “South African” B.1.351 variant Spike proteins, or parental Wuhan-Hu-1 RBD). Spike protein from either viral isolate or parental RBD was coated onto NUNC immunotubes with 1 mL of 10 μg/mL of target protein in PBS overnight at 4° C. Tubes were then blocked for 1 hour at room temperature on a blood mixer with 3-4 mL 2% milk powder dissolved in PBS, and washed 3 times with PBS.
For each selection, approximately 10¹² phage particles from different immunized scFv or CDR3-knob libraries generated as described in Example 2 were added to 1 mL 4% milk powder dissolved in PBS, made up to 2 mL total volume with PBS, and then added to the tubes with target protein and incubated on the blood mixer for 2 hours at room temperature. Tubes were then washed 10 times with PBS/0.1% Tween 20 and 10 times with PBS.
Bound phage were recovered with 1 mL fresh 0.1M triethylamine for 10 minutes on the blood mixer and neutralized with 0.5 mL 1M tris (pH 7.0) on ice. Log-phase TG1 Phage-Competent™ cells were infected with eluted phage for 1 hour at 37° C./200 rpm, and then grown at 30° C. overnight on 2×TY agar supplemented with 2% glucose/50 μg/mL carbenicillin.
After each round of selection described above, TG1 bacteria were scraped off the master plates into 20 mL 2×TY media supplemented with 20% glycerol/2% glucose/50 μg/mL carbenicillin. Approximately 4-5 mL of this solution was added to 20 mL of 2×TY media supplemented with 2% glucose/50 μg/mL carbenicillin containing 100 μL M13K07 helper phage (MOI=10). This suspension was incubated at 37° C./200 rpm for 1 hour, and added to 200 mL 2×TY/0.2M sucrose/50 μg/mL carbenicillin/25 μg/mL kanamycin/20 μM IPTG before incubating overnight at 30° C./200 rpm. Amplified phage were precipitated from cleared culture supernatants with 1/5 volume 2.5M NaCl, 20% PEG 8000 in a 250 mL Oakridge centrifuge tube after incubation on ice for 1 hour. The phage containing material was pelleted at 14,000 g in a Sorvall centrifuge for 20 minutes, resuspended in 2 mL PBS, and 1 mL reserved for use in the next round of selection. Between 2 and 5 rounds of selection were carried out for each library, with phage ELISA carried out for each round beginning at Round 2.
From each selection, individual colonies were picked into 600 μL 2×TY media supplemented with 50 μg/mL carbenicillin and 2% w/v glucose in 96-deepwell culture plates and incubated at 37° C. (with shaking) at 200 rpm overnight. For each culture, 50 μL was transferred to a fresh 96-deepwell plate containing 200 μL/well of the same medium and grown for 3 hours. Approximately 10⁸ kanamycin resistance units (k.r.u.) of M13K07 kanamycin-resistant helper phage was added to each well, and plates were incubated at 37° C. for 1 h. Expression medium (800 μL/well 2×TY media supplemented with 0.2M sucrose, 100 μg/mL carbenicillin, 25 μg/mL kanamycin, and 20 μM IPTG) was added to each well and amplification continued overnight at 30° C.
Culture plates were centrifuged at 2000 g for 10 mins at 4° C., and 25 μL of culture supernatant per well was used for ELISA. Half-area Costar ELISA plates were coated overnight at 4° C. with 50 μL/well RBD or Spike target protein at 1 μg/mL in PBS, blocked for 1 hour at room temperature with 100 μL/well of 2% milk powder dissolved in PBS, and then washed twice with 100 μL/well PBS. Approximately 25 μL phage culture supernatant per well was added to each target plate or negative control plate containing 25 μL/well 4% milk powder/PBS, and allowed to bind for 1 hour at room temperature. Each plate was washed two times with 200 μL/well PBS with 0.1% Tween 20, then two times with 200 μL/well PBS. Bound phage were detected with 50 μL/well of 1:5000 diluted anti-M13-HRP conjugate (Sinobiologicals) in 2% milk powder/PBS for 1 hour at room temperature. The plates were washed and developed for 5-10 minutes at room temperature with 50 μL/well TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermofisher). The reaction was stopped with 100 μL/well 0.5N H₂SO₄ per the manufacturer's protocol and optical density was read at 450 nm.
Positive clones from screening the scFv libraries were sequenced and both short and ultra-long VH sequences were transferred to the pFUSE human IgG1 Fc heavy chain expression vector for co-expression in mammalian HEK293 cells with chimeric BLV1H12 lambda light chain-human lambda light chain constant region. Positive clones from screening the knob-CDR3 only libraries were synthesized as full VH gene fragments and cloned into pFUSE human IgG1 Fc vector, and similarly expressed with the chimeric BLV1H12 lambda light chain as described above. Specifically, each VH was PCR-amplified, from 10 ng phage plasmid miniprep (Qiagen), in a 50 μL reaction with 2X Phusion Hot Start II High-Fidelity PCR Master Mix (Thermo Scientific) and primers specific for V_H framework 1 (forward) and J_H framework 4 (reverse). The PCR-generated insert was cloned into pFUSE mammalian expression vector at a 5′ EcoRI and 3′ NheI site on the 5′ end of a human IgG1 Fc gene. This was paired with a second pFUSE plasmid, containing bovine VL (BLV1H12) and human λ C_L sequences, for transfection in HEK 293F cells. Cells were seeded at a density of 1×10⁶ cells/mL in 30-60 mL Freestyle 293 Expression Medium (Gibco), then incubated in a humidified environment at 37° C. and 8% CO₂. Heavy and light chain plasmids were combined 1:1 to a total amount of 1 μg DNA per mL of 293F culture, then diluted in Opti MEM I media (Gibco) to a final volume of 1 mL per 30 mL of 293F culture. Approximately 60 μL 293fectin Transfection Reagent (Gibco) and 940 μL Opti MEM I were combined, for each 30 mL of 293F culture, then gently mixed and incubated for 5 minutes at room temperature before addition to diluted DNA. This mixture was incubated at room temperature for 30 minutes and then transferred to the 293F culture.
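For illustration only, the per-volume quantities in the transfection described above can be scaled to other culture volumes with a small calculator. This is a minimal sketch and not part of the disclosed method: the function name `transfection_mix` and its dictionary keys are hypothetical, and the 1:1 heavy:light plasmid ratio is assumed here to be a mass ratio, which the text does not specify.

```python
def transfection_mix(culture_ml: float) -> dict:
    """Scale the per-30-mL 293F transfection quantities to a given culture volume.

    Assumptions (not stated in the text): the 1:1 plasmid ratio is by mass,
    and all reagent volumes scale linearly with culture volume.
    """
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    scale = culture_ml / 30.0            # reagent volumes are quoted per 30 mL of culture
    total_dna_ug = 1.0 * culture_ml      # 1 ug total DNA per mL of culture
    return {
        "heavy_chain_plasmid_ug": total_dna_ug / 2.0,
        "light_chain_plasmid_ug": total_dna_ug / 2.0,
        "diluted_dna_final_volume_ml": 1.0 * scale,   # DNA brought to 1 mL in Opti MEM I per 30 mL
        "fectin_ul": 60.0 * scale,                    # 293fectin per 30 mL
        "fectin_optimem_ul": 940.0 * scale,           # Opti MEM I mixed with 293fectin per 30 mL
    }


mix = transfection_mix(60)
for name, value in mix.items():
    print(f"{name}: {value:g}")
```

For a 60 mL culture, the sketch doubles every per-30-mL quantity and splits 60 μg of total DNA evenly between the two plasmids.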
Medium was harvested 5 days after transfection and expressed chimeric bovine human IgG1 antibodies were purified by immobilized Protein A Sepharose (Cytiva Life Sciences) chromatography, then tested for antigen binding and neutralization of live and pseudovirus.
Selected candidate antibodies from the library screening were identified and sequenced (Table E1). A number of selected antibodies contained an ultralong CDR3 domain. Thus, despite ultralong CDR3 antibodies representing only about 10% of naturally occurring cow antibodies, candidate antibodies from the immunization described in Example 1 that were generated and screened by the above phage display approach were highly enriched for cow antibodies with an ultralong CDR3 (i.e., over 40% of candidates feature a CDR3 of at least 50 amino acids).
Exemplary antibodies SA-R2C3 and SA-R2D9 were derived from the ultra-long scFv library (immunization with parental Wuhan-Hu-1 S protein) and identified by a screen involving selection on South African variant Spike protein. Exemplary SKM and SKD antibodies were identified from a screen of a phage library derived directly from CDR3-knob libraries as described.
Sequence alignments for exemplary ultralong antibodies SKD (SEQ ID NO: 68), SKM (SEQ ID NO: 69), R4C1 (SEQ ID NO: 70), R5C1 (SEQ ID NO: 71), SR3A3 (SEQ ID NO: 72), R2F12 (SEQ ID NO: 73), and R2G3 (SEQ ID NO: 74) are shown in FIG. 4 along with a germline reference sequence (SEQ ID NO: 75). The length of the CDR3 and number of cysteine residues are also shown for each.

TABLE E1

Exemplary Candidate SARS CoV-2 Antibodies

Name	CDR3 Length	VH Sequence Amino Acid	VH Sequence Nucleic Acid	CDR3 Sequence Amino Acid	CDR3 Sequence Nucleic Acid

Antibody (VH) Candidates

RBD A2	18	49	30	—	—
RBD C6	18	48	29	—	—
RBD F4	18	47	28	—	—
R2B1	25	44	25	—	—
R2D6	25	43	24	—	—
R2G1	20	42	23	—	—
R4A10	31	41	22	—	—
R4E5	31	39	20	—	—
R4G3	—	38	19	—	—
R4G11	24	37	18	—	—
R5A3	27	36	17	—	—

Ultralong CDR3 Antibody (VH) Candidates

R4C1	61	40	21	63	55
R2C3	61	50	31	66	58
SKD	61	46	27	65	57
SKM	60	45	26	64	56
R2G3	61	33	14	60	52
R2F12	58	35	16	62	54
SR3A3	61	34	15	61	53
R2D9	52	51	32	67	59
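
The enrichment figure quoted before Table E1 (over 40% of candidates with a CDR3 of at least 50 amino acids) can be checked against the CDR3 lengths listed in the table. The following is a minimal sketch: the lengths are transcribed from Table E1, the entry with no listed length (R4G3) is kept in the denominator as an assumption, and the function name `ultralong_fraction` is hypothetical.

```python
# CDR3 lengths transcribed from Table E1; None marks the entry with no length listed (R4G3).
cdr3_lengths = {
    "RBD A2": 18, "RBD C6": 18, "RBD F4": 18, "R2B1": 25, "R2D6": 25,
    "R2G1": 20, "R4A10": 31, "R4E5": 31, "R4G3": None, "R4G11": 24, "R5A3": 27,
    "R4C1": 61, "R2C3": 61, "SKD": 61, "SKM": 60, "R2G3": 61,
    "R2F12": 58, "SR3A3": 61, "R2D9": 52,
}


def ultralong_fraction(lengths: dict, threshold: int = 50) -> float:
    """Fraction of all listed candidates whose CDR3 length is at least `threshold`."""
    n_ultralong = sum(1 for v in lengths.values() if v is not None and v >= threshold)
    return n_ultralong / len(lengths)


fraction = ultralong_fraction(cdr3_lengths)
print(f"{fraction:.1%} of {len(cdr3_lengths)} candidates have a CDR3 of at least 50 residues")
```

Eight of the 19 listed candidates meet the threshold, which is consistent with the "over 40%" statement. Excluding the entry without a listed length would raise the fraction slightly.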

Example 4: Assessment of Binding to Spike Protein and RBD

Selected clones, expressed and purified as chimeric bovine-human IgG1 antibodies as described in Example 3, were then assayed for their ability to bind RBD and Spike protein.

A. SARS CoV-2

RBD and spike binding of chimeric bovine-human IgG1 antibodies was assessed by ELISA. Approximately 50 μL of RBD or Spike protein, at 1 μg/mL in PBS, was added to each well of a half-area Costar ELISA plate (Corning) and coated overnight at 4° C. The plate was blocked with 180 μL/well 2% milk powder/TBS/0.1% Tween20 at room temperature for 2 hours. Purified chimeric bovine-human IgG1 antibodies were diluted 5-fold from 20 nM-0.00129 nM in 2% milk powder/TBS/0.1% Tween20, and 50 μL/well of each dilution was added in duplicate to coated/uncoated wells. The plate was incubated at room temperature for 1 hour, then washed four times with 180 μL of TBS/0.1% Tween20, and bound IgG was detected with 50 μL/well of anti-human Fc-HRP (Jackson ImmunoResearch Laboratories, Inc.) diluted 1:5000 in 2% milk powder/TBS/0.1% Tween20 at room temperature for 30 minutes. The plate was then washed five times with 180 μL of TBS/0.1% Tween20 before 50 μL/well of TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermo Scientific) was added. After 1-2 minutes at room temperature, the reaction was stopped with 50 μL/well 1N H₂SO₄, and OD 450 nm values were recorded.
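As an illustration of the antibody titration described above, a 5-fold series starting at 20 nM reaches approximately the stated lower limit after six dilution steps. The sketch below is not taken from the disclosure: the seven-point count is inferred from the stated range, and the function name `dilution_series` is hypothetical.

```python
def dilution_series(top_nm: float, fold: float, n_points: int) -> list:
    """Concentrations (nM) for a serial dilution starting at `top_nm`."""
    return [top_nm / fold**i for i in range(n_points)]


# Seven points is an inference from the stated 20 nM to ~0.0013 nM range.
series = dilution_series(20.0, 5, 7)
for conc in series:
    print(f"{conc:.5f} nM")
```

The seventh point is 20/5⁶ = 0.00128 nM, which agrees with the stated lower bound of about 0.00129 nM to within rounding.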
Representative results for three tested clones are shown in FIG. 5A and FIG. 5B. As shown in FIG. 5A, each of the purified chimeric bovine-human IgG1 antibodies (R2G3, R2F12, and R4C1) showed binding to the spike protein. An unrelated bovine-human IgG1 (136S IgG) did not show binding to the spike protein. As shown in FIG. 5B, purified chimeric bovine-human IgG1 antibodies with the VH of clones R2G3 and R2F12 showed binding to the RBD. The unrelated bovine-human IgG1 (136S IgG), as well as the chimeric antibody with the VH of clone R4C1, did not show binding to the RBD protein. These results are consistent with a finding that antibody R4C1 binds to a non-RBD epitope in the Spike protein, whereas R2G3 and R2F12 bind to an RBD epitope.

TABLE E2

Binding Activity of Exemplary Candidate SARS CoV-2 Antibodies

Name	Bind RBD	RBD Binding EC50 (nM)	Bind Spike Protein	Spike Protein Binding EC50 (nM)

Antibody (VH) Candidates

RBD A2	Yes	0.03	Yes	0.024
RBD C6	Yes	0.03	Yes	0.03
RBD F4	Yes	0.03	Yes	0.03
R2B1	Yes	0.52	Yes	0.37
R2D6	Yes	0.57	Yes	0.41

Ultralong CDR3 Antibody (VH) Candidates

R4C1	Yes	—	Yes	0.20
R2C3 (R5C1)	Yes	—	Yes	0.39
SKD	Yes	0.19	Yes	0.16
SKM	Yes	0.24	Yes	0.19
R2G3	Yes	0.056	Yes	0.032
R2F12	Yes	0.085	Yes	0.050
SR3A3	Yes	—	Yes	0.037

B. SARS CoV-2 Variants

RBD and spike binding of chimeric bovine-human IgG1 antibodies was assessed by ELISA against further isolates of SARS CoV-2, including variants from the beta, delta, and omicron lineages as well as a SARS CoV-1 virus. As described in section A of this Example, approximately 50 μL of RBD or Spike protein, at 1 μg/mL in PBS, was added to each well and coated overnight at 4° C. The plate was blocked at room temperature for 2 hours. Purified chimeric bovine-human IgG1 antibodies were diluted 5-fold from 20 nM-0.00129 nM, and 50 μL/well of each dilution was added in duplicate to coated/uncoated wells. The plate was incubated at room temperature for 1 hour, then washed four times, and bound IgG was detected with anti-human Fc-HRP (Jackson ImmunoResearch Laboratories, Inc.). The plate was then washed five times before TMB substrate buffer was added. After 1-2 minutes at room temperature, the reaction was stopped with H₂SO₄, and OD 450 nm values were recorded.
FIG. 5C shows ELISA binding of IgG antibodies to recombinant stabilized spike proteins derived from the wild-type (WT) Wuhan-Hu-1 strain, beta strain (formerly described as the South African strain), or delta strain. It was observed that exemplary antibodies SKD and SKM appear to lose detectable binding to beta, but maintain binding to WT and delta SARS CoV-2. The other antibodies are shown to bind across the range of concentrations tested for each S protein.
In a complementary set of experiments performed with RBD, FIG. 5D shows ELISA binding curves of select IgG antibodies against the omicron variant RBD (left) or recombinant stabilized spike trimer (right). Of the exemplary RBD binders tested, only R2D9 was observed to maintain binding to an omicron variant spike RBD. R4C1, R5C1 and R2D9 were also observed to bind to full-length omicron spike with EC50s in the subnanomolar range.
FIG. 5E reflects exemplary ELISA data of R4C1 and R2D9 on SARS-CoV-2 compared to SARS-CoV-1. P1B4, also known as NC-Cow1, was used as a negative control (see Sok et al. Nature 2017). These data show that R4C1 maintains complete binding activity to SARS-CoV-1, whereas alternative exemplary antibody R2D9 loses >10× binding. However, it was observed that R2D9 still maintains some binding activity in the low nanomolar range to SARS-CoV-1.
Finally, FIG. 5F shows ELISA binding activity (top) for three different exemplary antibody knob candidates against WT (Wuhan) SARS CoV-2 spike protein. For this experiment, each exemplary knob was expressed with a DO1 epitope tag, which was detected with an anti-DO1 antibody reflected on the X axis. FIG. 5G further depicts a modified western blot. Here, the indicated exemplary antibody knobs were heated to 70° C. in the presence of SDS, then resolved by SDS-PAGE before being transferred to a nitrocellulose membrane and detected with biotinylated RBD. RBD was biotinylated using EZ-Link NHS-LC-LC-biotin (Thermo Fisher). The NHS-LC-LC-biotin was reconstituted in DMF and combined with purified RBD at a 1:5 (RBD:biotin) molar ratio, then incubated at room temperature for 30 minutes. The reaction was then applied to a Pierce polyacrylamide spin desalting column (7K MWCO) equilibrated in PBS. Aprotinin was selected as a similar size control. It was observed that the R2G3 knob maintained binding to RBD despite heat and SDS treatment.

Example 5: Virus Neutralization

In some aspects, binding of an antibody to a viral antigenic protein is insufficient to mitigate cell entry or infectious propagation. Some antibodies, however, known as neutralizing antibodies, have the ability to inhibit virus in vitro and/or in vivo and are thus considered more relevant for therapeutic applications. Therefore, candidate antibodies as described above were tested for their ability to neutralize infection of cells with a SARS CoV-2 pseudovirus, a model virus to assay neutralization capacity of candidate antibodies. Compared with naturally occurring isolates of SARS virus, the pseudovirus can be handled with BSL-2 considerations at high titer and is therefore appropriate for screening, such as in a pseudovirus luciferase assay (PVLA).
A pseudovirus expressing the SARS CoV-2 S protein of the parental Wuhan-Hu-1 Spike protein sequence in its viral envelope was engineered such that the gene for luciferase expression was carried as its cargo. Upon successful penetration into the cell, luciferase is expressed such that pseudovirus neutralization is inversely related to luciferase activity, expressed as relative light units (RLUs). These pseudotyped viruses were used in a neutralizing assay performed in CRFK-hACE2 cells. As the receptor for SARS-CoV-2 entry, ACE2 overexpression is considered a mechanism by which cell lines can be produced that display “high infectability”. Conversely, a cell line with minimal or lower ACE2 expression can be considered to display “low infectability”.
Specifically, mock-medium or serially diluted (5-fold) antibody Fab was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT) and incubated at 37° C. for 1 h. Then, the mixtures were transduced into CRFK-hACE2 or CRFK-hDPP4 cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL). Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLUs were measured.
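One common way to convert the measured RLUs into a neutralization percentage is to normalize each antibody-treated well to the mock-medium well. The sketch below illustrates that convention only; the formula, the optional background term, the function name `percent_neutralization`, and the example RLU values are assumptions and are not taken from the disclosure.

```python
def percent_neutralization(rlu_sample: float, rlu_mock: float,
                           rlu_background: float = 0.0) -> float:
    """Percent neutralization relative to a mock-medium (virus only) control.

    Assumed convention: 100 * (1 - sample / mock), after subtracting an
    optional cells-only background from both readings.
    """
    signal = rlu_sample - rlu_background
    reference = rlu_mock - rlu_background
    if reference <= 0:
        raise ValueError("mock control must exceed background")
    return 100.0 * (1.0 - signal / reference)


# Hypothetical readings: the antibody-treated well gives a quarter of the mock signal.
example = percent_neutralization(rlu_sample=2500, rlu_mock=10000)
print(f"{example:.1f}% neutralization")
```

Under this convention, lower luciferase activity corresponds to higher neutralization, matching the inverse relationship described above.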
A summary of pseudovirus neutralization of identified antibodies is set forth in Table E3. The cow ultralong CDR3 antibodies are highly potent and neutralize variant strains, with a half maximal inhibition at concentrations less than 1-5 ng/mL for some antibodies. In general, the ultralong CDR3 antibodies exhibited more potent neutralization than the antibodies with a standard CDR3 length.

TABLE E3

Pseudovirus Neutralization of Exemplary
Candidate SARS CoV-2 Antibodies

	Name	IC50 (ng/mL)

	RBD A2	69
	RBD C6	278
	RBD F4	39.7
	R4C1	520

	R2C3 (R5C1)	2.12-3.2
	SKD	0.05-0.33
	SKM	0.07-0.29
	R2G3	>1
	R2F12	0.15-2
	SR3A3	1.45-160

Example 6: Bacterial Expression and Purification of CDR3-Knob Only Antibodies

A system was developed to express and purify CDR3-knobs, which are small peptide sequences of 25-50 amino acids with 1-6 disulfide bonds derived from an ultralong CDR3 cow antibody as described above. The expression system included fusion with the bacterial chaperone TrxA. CDR3-knobs as well as trxA-CDR3-knob fusions were tested for spike and RBD binding.
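The length and disulfide criteria stated above can be expressed as a simple sequence check. The following is a minimal sketch under stated assumptions: the function name `knob_summary` is hypothetical, the cysteine range of 2-12 follows from 1-6 disulfide bonds needing two cysteines each, and the example peptide is an artificial placeholder that does not appear in the Sequence Listing. The embodiments above recite a broader 25-70 amino acid range, which can be passed via `max_len`.

```python
def knob_summary(seq: str, min_len: int = 25, max_len: int = 50) -> dict:
    """Summarize whether a peptide fits the stated CDR3-knob length and cysteine criteria."""
    seq = seq.upper()
    n_cys = seq.count("C")
    return {
        "length": len(seq),
        "cysteines": n_cys,
        "max_disulfides": n_cys // 2,          # each disulfide bond uses two cysteines
        "length_ok": min_len <= len(seq) <= max_len,
        "cysteine_ok": 2 <= n_cys <= 12,       # enough for 1 to 6 disulfide bonds
    }


# Artificial placeholder peptide, not a sequence from the disclosure.
summary = knob_summary("GSC" * 10)
print(summary)
```

The check counts only potential disulfide bonds. It says nothing about which cysteines actually pair or whether the peptide folds.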

A. TrxA-CDR3-Knob Fusion and CDR3-Knob Expression and Purification

CDR3-knobs from candidate ultralong CDR3 antibodies described in Examples 2-5 were cloned into pET32b vectors (EMD-Millipore) as KpnI-XhoI (or NcoI-XhoI as appropriate) fragments (FIG. 6A), transformed into Origami 2 DE3 bacteria, and expressed as described below. These CDR3-knobs had sequences set forth in SEQ ID NO: 60-67, and were encoded by DNA sequences set forth in SEQ ID NO: 52-59, respectively.
A trxA-CDR3-knob fusion clone was grown overnight at 37° C. in 20 mL of 2×TY/50 μg/mL carbenicillin/10 μg/mL tetracycline/2% glucose, transferred to 200 mL of the same medium, and grown at 37° C. to an OD600 nm of approximately 1.0, after which the bacteria were spun down and resuspended in 200 mL of 2×TY/50 μg/mL carbenicillin/0.5 mM IPTG and grown overnight at 22° C. The bacteria were again pelleted, resuspended in 10 mL of Bugbuster HT (EMD-Millipore), rotated for 30 minutes at room temperature, and debris pelleted for 20 minutes at 14,000 g at 4° C. The supernatant was added to an equilibrated Talon resin column (1 mL resin, TaKaRa), rotated at 4° C. for 2 hours, washed with five column volumes of wash buffer (5 mM imidazole), then 1 column volume of wash buffer (10 mM imidazole), eluted with 2.5 mL of 300 mM imidazole elution buffer, and then buffer exchanged to PBS/saline with a PD10 spin column (GE Healthcare). The trxA-CDR3-knob was adjusted to 50 mM Tris pH 7.4, 150 mM NaCl, and 2.5 mM CaCl2 (1× enterokinase (EK) reaction buffer), and 400 U of recombinant his-tagged Enterokinase (Genscript) was added and incubated overnight at room temperature. Digested trxA and enterokinase were removed by incubation on a fresh equilibrated Talon resin column (1.2 mL resin) for 2 hours at 4° C., and purified CDR3-knob was collected in the flowthrough. Again, the sample was buffer exchanged to saline/PBS. In some cases, endotoxin removal may be carried out by anion exchange chromatography prior to use or testing, such as testing in a viral neutralization assay. CDR3-knobs cloned and expressed in E. coli as independent domains are set forth in SEQ ID NO: 60-67.
The stepwise purification is depicted in FIG. 6B. As shown in FIG. 6C, stepwise purification, as monitored by SDS-PAGE, efficiently purified both trxA-CDR3-knob fusion proteins as well as soluble CDR3-knobs from E. coli lysates. FIG. 6D depicts an exemplary SDS-PAGE gel of several purified ultralong CDR H3 knob peptides. The samples were treated with reducing agent DTT, which in some aspects is sufficient to break disulfide bonds. The similarly sized protein aprotinin was included as a size control.
IMAC-Purified trxA-CDR3-Knob Fusion Spike or RBD Binding
In order to assess CDR3-knob binding as trxA fusions, prior to enterokinase cleavage from trxA, half-area Costar ELISA plates were coated overnight at 4° C. with serial dilutions of IMAC purified trxA-knob fusions from 25 μL of trxA fusion in 50 μl/well PBS. RBD-binding clones R2G3, R2F12, SKM, and SKD (nucleic acid sequences set forth in SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, and SEQ ID NO: 57, respectively; and amino acid sequences set forth in SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, and SEQ ID NO: 65, respectively), and spike-binding clone R4C1 (nucleic acid sequence set forth in SEQ ID NO: 55, and amino acid sequence set forth in SEQ ID NO: 63), were tested.
Plates were then blocked for 1 hour at room temperature with 100 μL/well of 2% milk powder/PBS, and then washed twice with 100 μL/well of PBS. Approximately 50 μL/well of 1 μg/mL Wuhan-Hu-1 spike protein in 2% milk powder/PBS was incubated for 1 hour, and wells were then washed three times with 100 μL/well of PBS. To detect bound spike protein, 1 μg/mL of full length IgG chimeric ultralong CDR3 was added, either anti-RBD R2G3 IgG1 (for R4C1), or anti-R4C1 IgG1 antibody (for R2F12, R2G3, SKD and SKM fusions), in 2% milk powder/PBS, incubated for 1 hour, and then wells were washed three times with 100 μL/well of PBS. Bound IgG was then detected by incubation with 1:5000 diluted anti-human IgG-Fc-HRP conjugate in 2% milk powder/PBS for 1 hour, and wells were then washed three times with 100 μL/well of PBS. The plate was then washed and developed for 5-10 minutes at room temperature with 50 μL/well TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermofisher). The reaction was stopped with 100 μL/well of 0.5 N H₂SO₄ and read at 450 nm.
As shown in FIG. 7A (in which R2F12 is denoted as “F12”, and R2G3 is denoted as “G3”), the tested trxA-knob fusion proteins showed spike protein binding. Control conditions in which fusion proteins R3C1 and R2G3 were incubated in the absence of spike protein (denoted “R3C1 NO Spike” and “G3 NO Spike”) did not show binding. Binding for the TrxA-R2G3 fusion protein is also shown separately in FIG. 7B, relative to uncoated plates.

B. Purified R2G3 CDR3-Knob Binding to Wuhan-Hu-1 RBD

Binding of purified R2G3 CDR3-knob (after enterokinase cleavage from trxA as described above) to RBD was evaluated by ELISA. The nucleic acid sequence encoding R2G3 CDR3-knob is set forth in SEQ ID NO: 52, and the amino acid sequence set forth in SEQ ID NO: 60.
Wells in a half-area Costar ELISA plate (Corning) were coated, in duplicate, with 50 μL/well of purified CDR3-knob diluted 2-fold from 84-0.082031 nM in PBS. The plate was incubated at 37° C. for 1 hour, then blocked with 180 μL/well of 2% milk powder/TBS/0.1% Tween20 at room temperature for 2 hours. Next, biotinylated RBD was diluted to 0.5 ng/μL in 2% milk/TBS/0.1% Tween20, and 50 μL/well was added to coated/uncoated wells. After 1 hour at room temperature, wells were washed four times with 180 μL/well of TBS/0.1% Tween20, and bound biotinylated RBD was detected with 50 μL/well of streptavidin-HRP (Invitrogen) diluted 1:5000 in 2% milk/TBS/0.1% Tween20 for 30 minutes at room temperature. The wells were then washed five times with 180 μL/well TBS/0.1% Tween20 before addition of 50 μL/well TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermo Scientific). After 1-2 minutes at room temperature, the reaction was stopped with 50 μL/well 1N H₂SO₄, and OD 450 nm values were recorded. The average OD450 of uncoated wells was subtracted from the OD450 in each coated well. Background-subtracted OD450 values were plotted in GraphPad Prism (GraphPad Software LLC) against Log(CDR3-knob nM).
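The coating dilution series and the background subtraction described above can be sketched as follows. This is an illustrative sketch only: a 2-fold series starting at 84 nM reaches 0.08203125 nM at the eleventh point, which matches the stated range, but the number of points is inferred rather than stated, and the function names and OD values are hypothetical.

```python
import math


def twofold_series(top_nm, n_points):
    """Concentrations of a 2-fold serial dilution, highest first."""
    return [top_nm / (2 ** i) for i in range(n_points)]


def background_subtract(coated_ods, uncoated_ods):
    """Subtract the mean OD450 of uncoated wells from each coated-well OD450."""
    background = sum(uncoated_ods) / len(uncoated_ods)
    return [od - background for od in coated_ods]


# 84 nM down to 84 / 2**10 = 0.08203125 nM (11 points, inferred).
concentrations = twofold_series(84.0, 11)
log_concentrations = [math.log10(c) for c in concentrations]

# Hypothetical OD450 readings, for illustration only.
corrected = background_subtract([0.50, 0.30], [0.10, 0.10])
```

The `log_concentrations` list corresponds to the Log(CDR3-knob nM) axis against which the background-subtracted OD450 values are plotted.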
As shown in FIG. 8A, the soluble R2G3 knob showed binding to the RBD. As shown in FIG. 8B, soluble R2G3 knob binding was increased relative to that of a reference anti-spike protein antibody, CR3022.

C. Binding of Truncated R2G3 CDR3-Knobs to Wuhan-Hu-1 RBD

Truncated R2G3 CDR3-knobs were cloned and produced as described above using pET32b vectors encoding an R2G3 truncated mutant followed by an enterokinase cleavage site. Amino acid sequences of the truncated R2G3 mutants are shown in FIG. 8C. As shown in FIG. 8D, Truncations 1-3 showed compact bands following enterokinase cleavage and gel electrophoresis (0.75 μg of truncated knob protein per lane, 250 mM DTT).
The truncated R2G3 CDR3-knobs were also tested for RBD binding as described above. As shown in FIG. 8E, Truncations 1-3 had preserved RBD binding ability, whereas Truncations 4 and 5 lacked RBD binding.

D. Defining the Minimal CDR3-Knob C-Terminal Requirement

In order to define the C-terminal requirements (i.e., C-terminal minimal sequence) of a prototypical CDR3-knob, a series of R2G3 truncations were cloned into pET32b and expressed and purified as described in Example 6 above. These truncations were as set forth in Table E4 below.

TABLE E4

Exemplary R2G3 Truncations

	CLONE	SEQ ID NO	Mature amino acid sequence after Enterokinase cleavage

	G3 Parental	86	GGGGAMGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSDGETYT
	G3 TRUNC1	87	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSDGETYT
	G3 TRUNC2	88	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSDGE
	G3 TRUNC3	89	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLS
	G3 TRUNC3A	90	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWL
	G3 TRUNC3B	91	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGW
	G3 TRUNC4	92	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGG
	G3 TRUNC5	93	~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQAS

The quality of expressed material was assessed by SDS-PAGE and RBD ELISA as described in Example 6D above. Only Truncations 4 (G3 TRUNC4) and 5 (G3 TRUNC5) were observed to exhibit no RBD binding capability. Truncations 3A (G3 TRUNC3A) and 3B (G3 TRUNC3B) demonstrated reduced binding in an ELISA and increased band diffuseness in SDS-PAGE as depicted in FIG. 8A. ELISAs performed with truncations 1-3 yielded no observed loss in binding activity relative to parental R2G3 CDR3-knob as shown in FIG. 8B. These data support that a minimum of 9 amino acids is required after the last non-canonical Cys residue for R2G3 binding.

E. CDR3-Knob Purification by Size Exclusion Chromatography

Size exclusion chromatography (SEC) was used to determine whether soluble CDR3-knobs that were purified following bacterial expression were present in multiple forms. Soluble R4C1 and R2G3 knobs were produced as described above and subjected to SEC.
As shown in FIG. 9A, SEC revealed at least two distinct elution fractions (fractions A4 and A7) for purified R4C1 knobs, indicating that purified R4C1 knobs were present in multiple forms following bacterial expression. Gel electrophoresis was performed on fractions A4 and A7. As shown in FIG. 9B, fraction A4 contained a larger soluble aggregate as well as smaller, active soluble CDR3-knobs. Fraction A7 contained only the smaller, active soluble CDR3-knobs.
As shown in FIG. 9C, SEC revealed only one distinct elution fraction (fraction A6) for purified R2G3 knobs. This result was corroborated by gel electrophoresis performed on fraction A6 (FIG. 9D).

Example 7: Comparison of SARS-CoV-2 Virus Neutralization of Chimeric Fab Ultralong CDR3 and CDR3-Knob

To assess virus neutralization of a CDR3-knob only antibody, assays to assess neutralization of pseudovirus or live WT SARS-CoV2 virus were carried out. In this example, purified R2G3 CDR3-knob (“G3-Knob”) or a Fab of the chimeric R2G3 ultralong CDR3 antibody (“G3-Fab”), or a full length IgG chimeric R2G3 ultralong CDR3 antibody (“G3”) were tested, as indicated.
A pseudovirus luciferase assay (PLSA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, or the S variants (E484K/N507Y; B.1.1.7 or “UK” variant; and K417N/E484K/N501Y; B.1.351 or “SA” variant). Mock-medium or serially diluted (5-fold) antibody G3-Knob, G3-Fab or G3 was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT), the S variants (484K, B.1.1.7 and B.1.351) and incubated at 37° C. for 1 h. Then, the mixtures were transduced into CRFK-hACE or CRFK-hDDP4 cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL). As the receptor for SARS-CoV-2 entry, ACE2 overexpression is considered a mechanism by which cell lines can be produced that display “high infectability”. In the converse, a cell line with minimal or lower ACE2 expression can be considered to display “low infectability”.
Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured. Inhibition curves of serial dilutions of each antibody, G3-Fab or G3-Knob, against mock treatment were generated, and the 50% effective concentration (EC50) values were determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA). The results are summarized in Table E5.
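For orientation, the variable-slope model used for such inhibition curves is a four-parameter logistic (4PL) function. The sketch below shows that function together with a rough log-linear interpolation of the 50% point. It is not the nonlinear regression performed by GraphPad Prism; the function names and the example curve parameters are hypothetical and chosen only for illustration.

```python
import math


def four_parameter_logistic(conc, bottom, top, ec50, hill_slope):
    """Response (e.g. percent neutralization) of a variable-slope curve."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill_slope)


def interpolate_ec50(concs, responses, half=50.0):
    """Rough EC50: log-linear interpolation between the bracketing points."""
    pairs = sorted(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo <= half <= r_hi:
            if r_hi == r_lo:
                return c_lo
            frac = (half - r_lo) / (r_hi - r_lo)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    return None  # the 50% point is not bracketed by the tested range


# Hypothetical 5-fold dilution series (ng/mL) and an idealized curve.
series = [0.2, 1.0, 5.0, 25.0, 125.0]
curve = [four_parameter_logistic(c, 0.0, 100.0, 5.0, 1.0) for c in series]
estimate = interpolate_ec50(series, curve)
```

At a concentration equal to the EC50, the 4PL response lies midway between the bottom and top plateaus, which is why the estimate recovers the assumed value of 5 in this idealized case.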
To assess neutralizing activity against live SARS-CoV-2, selected antibodies of G3, G3-Fab or G3-Knob were investigated for their neutralizing activity against the replication of SARS-CoV-2 or the B.1.1.7 or B.1.351 variants in Vero E6 cells. Briefly, 50-100 plaque forming units of SARS-CoV-2 hCoV/USA-WA1/2020 (wild type), SARS-CoV-2 hCoV-19/England/204820464/2020 (B.1.1.7 variant), or SARS-CoV-2 hCoV-19/South Africa/KRISP-EC-K005321/2020 (B.1.351 variant) were mixed with mock-medium or serially diluted (5-fold) G3-Fab or G3-Knob. Following incubation at 37° C. for 1 h, the mixtures were inoculated onto confluent Vero E6 cells in 24 well plates. After 2 hr incubation, medium containing agar (1% final concentration) and neutral red was added to the cells. After 48-72 hr, plaques in each well were counted. The EC50 values were determined as described above and are shown in Table E5 below.
Together, the results shown in Table E5 demonstrate that the exemplary cow ultralong CDR3 R2G3, in either a standard IgG Fab format or as a CDR3-knob only format, exhibited potent neutralizing activity against WT SARS-CoV-2 as well as the tested variants. The cow ultralong CDR3 antibody is highly potent and neutralizes variant strains, with a half maximal inhibition at concentrations less than 1-5 ng/mL, depending on the antibody format. Remarkably, despite being a short sequence of only 51 amino acids in length, the CDR3-knob only antibody retained subnanomolar potency. Due to the small size of the CDR3-knob antibodies, this example supports utility of the CDR3-knob antibodies as novel therapeutic antibody candidates for an inhalation formulation for respiratory targets, including other viruses, bacteria, other infectious diseases, asthma or lung cancer.

TABLE E5

Neutralization of pseudovirus and live virus by R2G3 IgG, Fab and CDR3-knob

	Pseudotype EC50 (ng/mL)		Pseudotype EC50 (pM)		SARS-CoV-2 VeroE6 EC50 (ng/mL)			SARS-CoV-2 VeroE6 EC50 (pM)
	WT	SA variant	WT	SA variant	WT	UK variant	SA variant	WT	UK variant	SA variant

G3	0.59	5.28	3.55	32.2	0.5	0.51	2.6	3.15	3.2	19.14
G3 Fab	1.23	1.19	22.8	22.39	3.67	1.35	5.03	72.58	26.19	103.31
G3 Knob	5.49	52.23	904.5	8830.1	0.92	1.94	333.28	712	535	49,968
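
Table E5 reports each EC50 in both mass (ng/mL) and molar (pM) units. The conversion between the two can be sketched as follows. The molecular masses used to prepare the table are not stated in this example, so the nominal IgG mass below is an assumption, and the function names are hypothetical.

```python
def ng_per_ml_to_pm(conc_ng_per_ml, mol_weight_da):
    """Convert ng/mL to pM for a molecule of the given mass (g/mol).

    1 ng/mL = 1e-6 g/L; dividing by g/mol gives mol/L; 1 mol/L = 1e12 pM.
    """
    return conc_ng_per_ml * 1e6 / mol_weight_da


def implied_mol_weight_da(conc_ng_per_ml, conc_pm):
    """Molecular mass implied by a paired ng/mL and pM value."""
    return conc_ng_per_ml * 1e6 / conc_pm


ASSUMED_IGG_DA = 150_000.0  # assumption: nominal mass of a full length IgG

# G3 IgG, pseudotype WT column of Table E5: 0.59 ng/mL reported as 3.55 pM.
igg_pm = ng_per_ml_to_pm(0.59, ASSUMED_IGG_DA)
implied_mass = implied_mol_weight_da(0.59, 3.55)
```

Under the nominal 150 kDa assumption the converted value is close to, but not identical with, the tabulated 3.55 pM, which suggests that a somewhat different mass was used when the table was prepared.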

A. SARS CoV-2 Variants

In a further assessment of virus neutralization of ultralong CDR3 antibodies, assays to assess neutralization of live WT SARS-CoV2 virus or several variant SARS CoV-2 viruses were carried out. In this example, full length IgG chimeric ultralong CDR3 antibodies F12, G3, SKD, and SKM were tested, as indicated.
A pseudovirus luciferase assay (PLSA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, the S variants (E484K/N507Y; B.1.1.7 or “UK” variant; and K417N/E484K/N501Y; B.1.351 or “SA” variant) or 484K. Mock-medium or serially diluted (5-fold) antibody was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT), the S variants (484K, B.1.1.7 and B.1.351) and incubated at 37° C. for 1 h. Then, the mixtures were transduced into Vero, CRFK-hACE or CRFK-hDDP4 cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL). As the receptor for SARS-CoV-2 entry, ACE2 overexpression is considered a mechanism by which cell lines can be produced that display “high infectability”. In the converse, a cell line with minimal or lower ACE2 expression can be considered to display “low infectability”.
Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured. As shown in FIG. 10A-10D, each exemplary ultralong CDR3 antibody exhibited activity against more than one variant SARS CoV-2 S protein. Inhibition curves of serial dilutions of each antibody against mock treatment were generated, and the 50% effective concentration (EC50) values were determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA). The results are summarized in Table E6.

TABLE E6

Neutralization pseudovirus and live virus
for Exemplary Ultralong CDR3 Antibodies

EC₅₀ (pM)

	WT	UK	484K	SA

F12	6.15	4.64	24.19	200.85
G3	4.03	3.79	10.60	80.235
SKD	4.47	7.71	>1000	>1000
SKM	5.77	9.74	>1000	>1000

Together, the results shown in Table E6 demonstrate that the exemplary cow ultralong CDR3 antibodies, F12, G3, SKD, and SKM, exhibited potent neutralizing activity against WT SARS-CoV-2 as well as the tested variants. The cow ultralong CDR3 antibody is highly potent and neutralizes variant strains, with a half maximal inhibition at concentrations less than 1-5 ng/mL, depending on the antibody format. Remarkably, despite being a short sequence of only 51 amino acids in length, the CDR3-knob only antibody retained subnanomolar potency. Due to the small size of the CDR3-knob antibodies, this example supports utility of the CDR3-knob antibodies as novel therapeutic antibody candidates for an inhalation formulation for respiratory targets, including other viruses, bacteria, other infectious diseases, asthma or lung cancer.

Example 8: SARS CoV-1 Cross Reactivity

To assess possible cross reactivity and broad neutralization of exemplary Ultralong CDR3 antibodies, assays to assess neutralization of pseudovirus were carried out. In this example, exemplary R4C1 and R2D9 ultralong CDR3 antibodies were tested, as indicated.
A pseudovirus luciferase assay (PLSA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, the S protein of a SARS-CoV-1 virus, or a VSV-G control. Mock-medium or serially diluted (5-fold) antibody G3-Knob, G3-Fab or G3 was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT), SARS-CoV-1 wild-type, or VSV-G, and incubated at 37° C. for 1 h. Then, the mixtures were transduced into cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL).
Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and percent neutralization was measured. Inhibition curves of serial dilutions of each antibody against mock treatment were generated, and the maximum percent neutralization (MPN), i.e. the percent at which the neutralization curve plateaus for those viruses neutralized, was determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA).
FIG. 11A shows the IC50 values of different IgG antibodies against pseudoviruses from various coronavirus strains. Note that R4C1 and R2D9 maintain activity against the omicron variant of SARS-CoV-2. All of the antibodies exhibit subnanomolar potency, with several in the low picomolar range.

Example 9: Neutralization of Live Variant Virus

To assess additional cross reactivity and potential broad neutralization of exemplary antibodies, assays to assess neutralization of pseudovirus in addition to live virus were carried out. In this example, exemplary SKM, SKD, R4C1 (IgG, Fab, and Knob), G3 (IgG, Fab, and Knob) and R2D9 (IgG and knob) as described above were tested as indicated.
A pseudovirus luciferase assay (PLSA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, the S protein of a SARS-CoV-2 beta lineage virus, or a SARS-CoV-2 delta lineage virus. Mock-medium or serially diluted (5-fold) antibody, knob, or fab was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 spike protein, and incubated at 37° C. for 1 h. Then, the mixtures were transduced into cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/ml). Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured.
Neutralization was also assayed using live virus in BSL-3 conditions. Similarly as described above, serially diluted (5-fold) antibody, knob, or fab was mixed with the same amount of wildtype SARS-CoV-2 virus (Wuhan-Hu-1), or either of an alpha (United Kingdom) or beta (South Africa) lineage variant, and incubated at 37° C. for 1 h. The cells were washed, and then plaque forming units (PFU) measured following incubation of the cells at 37° C. for 48 h.
In experiments with pseudo- or live virus, percent neutralization was measured. Inhibition curves of serial dilutions of each antibody against mock treatment were generated, and the maximum percent neutralization (MPN), i.e. the percent at which the neutralization curve plateaus for those viruses neutralized, was determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA). For example, results for exemplary antibody candidate R2G3 (IgG, Fab, and Knob) are shown in FIG. 11B. The results are summarized in Table E7 in ng/mL, with standard deviations of three independent replicates to the right.

TABLE E7

Neutralization of Pseudovirus and Live Virus by Ultralong CDR
H3 IgG, Fabs, and Knobs Against Different SARS-CoV-2 Strains

	Pseudovirus			Live SARS CoV-2
	WT	Beta	Delta	WT	Alpha	Beta

SKM (IgG)	0.48 ± 0.02	>1000	9.05 ± 1.20	3.07 ± 1.57
SKD (IgG)	0.41 ± 0.01	>1000	9.85 ± 1.90	4.87 ± 3.65
R4C1 (IgG)	99.93 ± 31.08	375.15 ± 71.63	109.8 ± 43.3
R4C1 (Fab)	184.25 ± 37.12	377.95 ± 120.1	244.5 ± 67.74
R4C1 (knob)	641.95 ± 84.22	1024.9 ± 297.1	401.3 ± 97.30
G3 (IgG)	0.20 ± 0.03	5.70 ± 0.14	0.56 ± 0.09	0.50 ± 0.05	0.51 ± 0.04	3.08 ± 1.73

Example 10: Bi- and Multispecific Antibodies with Ultralong CDR3s

Knobs derived from bovine ultralong CDRH3 antibodies are expressed as fusion proteins or as part of dimeric or multimeric molecules, creating bivalent, bispecific, multivalent, or multispecific proteins (FIG. 12). Two or more knobs are expressed as a fusion protein, for example with a flexible linker (e.g., Gly-Gly-Gly-Ser, or the like) between the C-terminus of one knob and the N-terminus of another knob. Additionally, bispecific molecules are made wherein one knob is in its wild-type conformation as a bovine, or humanized bovine, VH region and expressed with a light chain as an IgG, while a second knob is fused to the C-terminus of the heavy chain constant region. In this situation, the two VH regions are identical and have the specificity of knob 1, but the C-terminus has a new specificity as determined by knob 2.
In another approach, ‘knobs into holes’ technology is employed where two heavy chains are co-expressed where one heavy chain contains a VH region with one knob (knob 1) within its CDRH3 and a second heavy chain has a VH region with a second knob within its CDRH3 (knob 2). The two heavy chains also differ by having constant region mutations such that only the heterologous heavy chains effectively pair with one another to form a dimer. In this case, the homodimers are not formed to an appreciable extent. Such ‘knobs-into-holes’ mutations include T22Y (on one chain) and Y86T (on the other chain) in the CH3 domain of Fc.
DNA vectors encoding such molecules are generated by standard molecular biology techniques and expressed and purified as described above in previous Examples. Additionally, individual knobs are chemically covalently linked together using small molecule linkers, or polyethylene glycol (PEG) linkers, including heterobifunctional or heteromultifunctional linkers (e.g., Pierce). In this case, individual knobs are expressed and purified and then added together in the presence of linker and the appropriate reaction conditions to covalently couple the linkers to the knob proteins. Amine, carboxyl, maleimide, NHS ester, and hydrazide chemistries are commonly used in these cross-linking approaches. Furthermore, the knobs are used in the context of a nanoparticle to provide specificity or activity to the nanoparticle. In this regard, the nanoparticle can be a protein-based nanoparticle, including particles formed from viral proteins, albumin nanoparticles, and the like. The nanoparticles can also be derived from non-protein molecules including lipids (e.g., lipoparticles), carbohydrates, etc.

Example 11: Bioinformatic Identification of Bovine Ultralong CDR H3 Knob Domain Ends

An algorithm was developed to identify bovine ultralong CDR H3 knob domain boundaries by amino acid sequence. By sequence, the bovine ultralong CDR H3 region ranges from “the third residue following the conserved cysteine in framework 3 to the residue immediately preceding the conserved tryptophan in framework 4” (Wang et al. Cell 2013, 153(6):1379-1393). Structurally, the knob domain is defined as the small disulfide-rich domain located upon the distal end of the anti-parallel β-ribbon stalk domain (FIGS. 13A and 13B).
Crystal structures of exemplary bovine ultralong antibodies (Table E8) were analyzed in conjunction with sequences (FIG. 14) to formulate a precise definition of the knob boundaries by both sequence and structure. In the analysis, the first residue of the knob domain was defined as the first conserved D_H cysteine, or other residue at this position in rare exceptions such as A01, preceding the conserved “PDG” motif. For the purpose of locating the final knob domain residue, the stalk domain was then also defined. By crystal structure analysis, symmetry was observed in the length of the ascending and descending stalk β-ribbon strands. The conserved framework 3 cysteine, preceding the first CDR H3 residue (Wang et al. 2013) by 3 amino acid positions, is located proximal to the base of the ascending stalk strand and is situated directly across from the conserved framework 4 tryptophan which is one residue downstream of the final CDR H3 residue (Wang et al. 2013). In the analysis, the first ascending stalk residue was defined as the conserved framework 3 cysteine and the final descending stalk residue was defined as the conserved framework 4 tryptophan. The C-terminal knob boundary position was located by subtracting the number of ascending stalk residues from the framework 4 tryptophan position (Table E8).
In summary, our algorithm (below) defines the knob region N-terminal boundary as the first D_H cysteine in the “CPDG” motif and the C-terminal boundary as the position located by subtracting the number of ascending stalk residues from the framework 4 tryptophan position (FIG. 15). The algorithm serves as a general rule that can be applied to bovine ultralong CDR H3 antibody sequences.
The algorithm is described as follows: L=number of amino acids encompassing stalk and knob domains, starting at canonical framework 3 cysteine and ending at canonical framework 4 tryptophan. X=number of amino acids, starting at the framework 3 canonical cysteine that defines the ascending stalk, and ending at the amino acid preceding the conserved first D region cysteine in the “CPDG” motif.
Position of conserved framework 4 tryptophan−X=knob boundary position (C-terminal end); Number of residues in the knob (K)=L−2X; K position=(X+1) to (X+K)

TABLE E8

	Length	Number of	Number of
	encompassing	residues	amino acids
	stalk and knob	in knob	in each stalk	PDB
Antibody	domains (L)	domain (K)	strand (X)	ID

A01	65	43	11	5ilt
B11	67	41	13	5ihu
BLV1H12	65	39	13	4k3d
BLV5B8	60	38	11	4k3e
E03	48	22	13	5ijv
BOV1	65	43	11	6e8v
BOV2	63	41	11	6e9g
BOV3	67	41	13	6e9h
BOV4	64	38	13	6e9i
BOV5	58	32	13	6e9k
BOV6	54	32	11	6e9q
BOV7	67	41	13	6e9u

Bovine ultralong antibodies with published crystal structures that were analyzed, with X number of amino acids in the ascending and descending strands. Total number of amino acids comprising the stalk and knob domain (L) and knob domain alone (K) for each antibody are also noted.
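The boundary rule of this Example (K = L − 2X, with the knob occupying positions X+1 to X+K) can be sketched in a few lines and checked against the values of Table E8. Positions are counted from the conserved framework 3 cysteine as position 1, so the conserved framework 4 tryptophan is at position L. The function and type names are hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class KnobBoundaries(NamedTuple):
    knob_length: int      # K
    first_position: int   # X + 1
    last_position: int    # X + K, equal to (tryptophan position L) - X


def knob_boundaries(total_length, stalk_length):
    """Apply K = L - 2X; positions are 1-indexed from the framework 3 cysteine."""
    knob_length = total_length - 2 * stalk_length
    if knob_length <= 0:
        raise ValueError("stalk strands leave no residues for the knob")
    return KnobBoundaries(knob_length, stalk_length + 1, stalk_length + knob_length)


# (L, K, X) values copied from Table E8.
TABLE_E8 = {
    "A01": (65, 43, 11), "B11": (67, 41, 13), "BLV1H12": (65, 39, 13),
    "BLV5B8": (60, 38, 11), "E03": (48, 22, 13), "BOV1": (65, 43, 11),
    "BOV2": (63, 41, 11), "BOV3": (67, 41, 13), "BOV4": (64, 38, 13),
    "BOV5": (58, 32, 13), "BOV6": (54, 32, 11), "BOV7": (67, 41, 13),
}

# Antibodies whose tabulated K disagrees with L - 2X (expected to be empty).
mismatches = [
    name for name, (length, knob, stalk) in TABLE_E8.items()
    if knob_boundaries(length, stalk).knob_length != knob
]
```

For A01, for example, L = 65 and X = 11 give a 43-residue knob spanning positions 12 to 54.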

Example 12: Defining the Minimal CDR3-Knob C-Terminus and Minimal CDR3-Knob N-Terminus

The algorithm described in Example 11 was validated experimentally by expressing and testing C-terminal truncations (subsection A below) and N-terminal truncations (subsection B below) of a stalk and knob region from an antibody with an unknown structure. In some cases, 1, 2, 3, 4 or 5 amino acids may be added to the knob ends for improved expression or stability.

A. Defining Minimal CDR3-Knob C-Terminus

In order to define the C-terminal requirements of a prototypical CDR3-knob, a series of R2G3 truncations were cloned into pET32b and expressed as described in Example 6 above. The quality of expressed material was assessed by SDS-PAGE and RBD ELISA also as described in Example 6. Exemplary tested R2G3 truncations are set forth below in Table E9; each truncation was made with a reduced Terminal linker.

TABLE E9

R2G3 CDR3-knob truncations-
C-Terminus

	CLONE	SEQ ID NO	Mature amino acid sequence after Enterokinase cleavage

	G3 Parental	86	GGGGAMGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSDGETYT
	G3 TRUNC1	87	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSDGETYT
	G3 TRUNC2	88	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSDGE
	G3 TRUNC3	89	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLS
	G3 TRUNC3A	90	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWL
	G3 TRUNC3B	91	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGW
	G3 TRUNC4	92	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGG
	G3 TRUNC5	93	~~~~~GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQAS

As shown in FIG. 16A, only Truncations 4 and 5 resulted in no observed RBD binding. Truncations 3A and 3B demonstrated reduced binding in ELISA and increased band diffuseness in SDS-PAGE (FIG. 16B). Truncations 1-3 had no loss in binding activity relative to parental R2G3 CDR3-knob. Taken together, these results support a minimum of 9 amino acids after the last non-canonical Cys residue for R2G3 binding.
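The 9-residue figure can be checked directly from the truncation sequences by counting the residues that follow the final cysteine. The sketch below does this for the truncations of this Example. It assumes that the final cysteine in each fragment is the last non-canonical Cys referred to above; the helper name is hypothetical.

```python
def residues_after_last_cys(sequence):
    """Count residues C-terminal to the final cysteine ('~' marks a deleted position)."""
    residues = sequence.replace("~", "")
    last_cys = residues.rfind("C")
    if last_cys < 0:
        raise ValueError("sequence contains no cysteine")
    return len(residues) - last_cys - 1


CORE = "GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCC"  # shared by all truncations

TRUNCATIONS = {
    "G3 TRUNC1": CORE + "QASLGGWLSDGETYT",
    "G3 TRUNC2": CORE + "QASLGGWLSDGE",
    "G3 TRUNC3": CORE + "QASLGGWLS",
    "G3 TRUNC3A": CORE + "QASLGGWL",
    "G3 TRUNC3B": CORE + "QASLGGW",
    "G3 TRUNC4": CORE + "QASLGG",
    "G3 TRUNC5": CORE + "QAS",
}

tails = {name: residues_after_last_cys(seq) for name, seq in TRUNCATIONS.items()}
```

Truncations 1-3, which retained full binding, keep 15, 12 and 9 residues after the final cysteine. Truncations 3A and 3B (reduced binding) keep 8 and 7, and Truncations 4 and 5 (no binding) keep 6 and 3.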

B. Defining the Minimal CDR3-Knob N-Terminus

Similarly as described in Example 11, a series of R2G3 truncations were cloned into pET32b to define the N-terminal requirements of a prototypical CDR3-knob and expressed as described in Example 6 above. The quality of expressed material was assessed by SDS-PAGE and RBD ELISA as described in Example 6. Exemplary tested R2G3 truncations are set forth below in Table E10.

TABLE E10

R2G3 CDR3-knob truncations-
N-Terminus

	CLONE	SEQ ID NO	Mature amino acid sequence after Enterokinase cleavage

	G3 Parental	131	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSD
	G3 NTRUNC1	132	GGS~GDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSD
	G3 NTRUNC2	133	GGS~~DKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSD
	G3 NTRUNC3	134	GGS~~~KTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSD
	G3 NTRUNC4	135	GGS~~~~TCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSD
	G3 NTRUNC5	136	GGS~~~~~CPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLSD

Each of the exemplary N-terminal truncations tested was observed to display similar binding profiles to biotinylated RBD by ELISA and band diffuseness in SDS-PAGE (FIGS. 17A and 17B, respectively). It was noted that truncation 5 resulted in two bands via SDS-PAGE; however, this did not correlate with any reduction in binding activity. These results suggest that none of the amino acids deleted in these exemplary truncated R2G3 sequences are part of the knob domain.

Example 13: Selective Amplification of Ultralong CDR3-Knob Domains

Ultralong CDR3-knob domains were selectively amplified from a cow VH template library. The cow VH template library was prepared substantially as described in Example 2.
Specifically, RNA was isolated from 5×10⁶-10⁷ bovine PBMCs using an RNeasy kit (Qiagen). Immune cow antibody CDR3-knob repertoires were obtained through cDNA synthesis from 5 μg total RNA using Superscript IV First-Strand cDNA synthesis kit (ThermoFisher), followed by PCR amplification. To generate the VH template library, the cDNA template for CDR3-knobs was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5), and IgG-specific (SEQ ID NO: 3 and 6) primers.
Primary stalk-knob CDR3 were amplified from first-strand cDNA with IgHV1-7 family-specific primers targeting either side of the stalk domain of the CDR3 region. Each amplification used a pool of primers containing all of the primers set forth in SEQ ID NO: 8-11 as well as one of the primers set forth in SEQ ID NO: 122-130. The amplified sequences were then analyzed for the prevalence of ultralong CDR3-knob domains by gel electrophoresis on a 2% agarose gel.
An alignment of the primers set forth in SEQ ID NO: 122-130 (primers p1-p9) to sequences of exemplary standard short CDR3 antibodies (antibodies 028-030) and ultralong CDR3 antibodies (antibodies 01-026) is shown in FIG. 18A. Sequence identifiers (SEQ ID NO) of the sequences shown in FIG. 18A are shown in Table E11.

TABLE E11

Sequence Identifiers (SEQ ID NO)
for Sequences Shown in FIG. 18A

	SEQUENCE	SEQ ID NO

	014	137
	015	138
	032	139
	016	140
	031	141
	027	142
	021	143
	026	144
	p1	122
	p2	123
	p3	124
	p4	125
	p5	126
	p6	127
	p7	128
	p8	129
	p9	130
	028	145
	018	146
	019	147
	020	148
	022	149
	023	150
	024	151
	025	152
	029	153
	030	154

Results of gel electrophoresis indicated that amplification with the pools of primers containing the primers set forth in SEQ ID NO: 123, 127, and 128 resulted in enrichment for ultralong CDR3-knob domains (FIG. 18B), especially with annealing between 65-68° C. Specifically, while two bands were apparent for PCR products obtained using some of the primers, indicating the amplification of standard short as well as ultralong CDR3-knob domains, only one band corresponding to sequences of ultralong CDR3-knob domains (expected PCR product size of approximately 300-350 bp) was obtained using the primers set forth in SEQ ID NO: 123, 127, and 128.
A stalk-knob CDR3 library was constructed from DNA amplified using the primers set forth in SEQ ID NO: 8-11, 123, 127, and 128. The library was constructed substantially as described in Example 2 and was selected against Spike protein for two rounds of selection as described in Example 3. Over 90% of screened clones were Spike-binding clones, and all binding clones were ultralong CDR3 antibodies.
These results indicate that ultralong CDR3-knob domains can be selectively amplified from a VH template library using particular primers specific for the stalk domain of the CDR3 region.
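The relationship between a reverse primer and the template strands listed for FIG. 18A can be explored computationally. The sketch below, offered only as an illustration, takes the portion of primer ULp2 (SEQ ID NO: 123) that follows its ttttttgcggccgc 5' tail, computes the reverse complement, and scans a template for the ungapped site with the fewest mismatches. Treating the sequence after the tail as the template-binding portion, the helper names, and the use of a plain mismatch count are all assumptions made here; the sketch does not model annealing temperature and is not the analysis used in this Example.

```python
from typing import Tuple

# Complement table for unambiguous DNA bases (both cases).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

# Assumed template-binding portion of ULp2 (SEQ ID NO: 123), i.e. the
# primer with its ttttttgcggccgc 5' tail removed.
ULP2_ANNEAL = "ccaggcatcgacgtagaattc"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_site(primer_anneal: str, template: str) -> Tuple[int, int]:
    """Return (offset, mismatches) of the best ungapped match of a reverse
    primer on the coding strand of a template (hypothetical helper)."""
    site = reverse_complement(primer_anneal).lower()
    template = template.lower()
    best = (-1, len(site) + 1)
    for i in range(len(template) - len(site) + 1):
        window = template[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(site, window))
        if mismatches < best[1]:  # keep the first site with the fewest mismatches
            best = (i, mismatches)
    return best


if __name__ == "__main__":
    # Coding-strand sequence that ULp2 would pair with perfectly.
    print(reverse_complement(ULP2_ANNEAL))
```

A template from Table E11 can be passed to `best_site` to compare how closely each primer matches ultralong versus standard short CDR3 sequences.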
The present invention is not intended to be limited in scope to the particular disclosed embodiments, which are provided, for example, to illustrate various aspects of the invention. Various modifications to the compositions and methods described will become apparent from the description and teachings herein. Such variations may be practiced without departing from the true scope and spirit of the disclosure and are intended to fall within the scope of the present disclosure.

Sequences

SEQ ID NO	Sequence	Name

1	CAGGCCGTCCTGAACCAGCCAAGCAGCGTCTCCGGGTCTC	BLV1H12 Light
	TGGGGCAGCGGGTCTCAATCACCTGTAGCGGGTCTTCCTC	chain DNA
	CAATGTCGGCAACGGCTACGTGTCTTGGTATCAGCTGATC
	CCTGGCAGTGCCCCACGAACCCTGATCTACGGCGACACAT
	CCAGAGCTTCTGGGGTCCCCGATCGGTTCTCAGGGAGCAG
	ATCCGGAAACACAGCTACTCTGACCATCAGCTCCCTCCAG
	GCTGAGGACGAAGCAGATTATTTCTGCGCGTCTGCCGAGG
	ACTCTAGTTCAAATGCCGTGTTTGGAAGCGGCACCACACT
	GACAGTCCTA

2	QAVLNQPSSVSGSLGQRVSITCSGSSSNVGNGYVSWYQLI	BLV1H12 variable
	PGSAPRTLIYGDTSRASGVPDRFSGSRSGNTATLTISSLQ	Light chain
	AEDEADYFCASAEDSSSNAVFGSGTTLTVL

3	CCGCTCTTCAGGGCACCCGAGTTCC	igGCDNA2REV

4	CTGACTGTGCTGTTGTTGAACTTCC	igMCDNA2REV

5	GACACGCTGTCGCCATTCTGGTTCC	igACDNA2REV

6	CGGGCACGGTCACCATGCTGCTGAGAGAGTAG	igGCDNA1.7REV

7	TTACCTGCGGCCGCTGAGGAGACGGTGACCAGGAGTCCAA	BOVVHFR4REV
	CTGGAGCTCCATCAAG

8	CAGCCGGCCATGGCCACATACTACAGTACTACTGTACACC	BOVSTALKFOR1

9	CAGCCGGCCATGGCCACATACTACAGTACTACTGTATACC	BOVSTALKFOR2

10	CAGCCGGCCATGGCCACATACTACAGTACTACTGTGCTCC	BOVSTALKFOR3

11	CAGCCGGCCATGGCCACATACTACAGTGGTACTGTGCACC	BOVSTALKFOR4

12	AAAAAG CCATGG TGCAGGTGCAGCTGCGGGAGTCGGG	BOVVHNCOFOR2
		NcoI restriction
		enzyme site
		(bold/underline)

13	TTACCT CTCGAG TGAGGAGACGGTGACCAGGAGTCC	BOVVHFR4XHOREV
		XhoI restriction
		enzyme site
		(bold/underline)

14	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R2G3
	CCTCACAGACCCTCTCGCTCACCTGCGCGGCCTCTGGATT
	CTCATTGAGCGACAAGGCTGTAGGCTGGGTCCGCCGGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTAGTATAGACACTG
	GTGGAAGCACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCCCTG
	TCAATTAGCAGCGTAACGTCTGAGGACTCGGCCACATACT
	ACTGTGCAACTGTACACCAGAAAACAGCTGAAGGAGACAA
	AACGTGTCCTGATGGTTACGAGCATACTTGTGGTTGCATT
	GGGGGTTGTGGTTGCAAAAGGTCTGCCTGTATAGGTGCAC
	TTTGTTGCCAAGCGTCGTTGGGTGGTTGGCTTAGTGACGG
	TGAAACCTACACTTACGAGTTCCACGTCGATACCTGGGGC
	CAAGGACTCGTGGTCACCGTCTCCTCA

15	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	SR3A3
	CCTCACAGACCCTCTCCCTCACCTGCACAATCTCTGGATT
	CTCATTGAGTAGCTATGCTGTACTCTGGGTCCGCCAGGCT
	CCAGGGAAGCCGCTGGAGTGGCTCGGTAGTATAGACACTG
	CGGAAAACACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACCGAGGACTCGGCCACATACT
	ACTGTGCTACTGTACACCAGAAAACGCGAAAAGAAAAAAA
	TTGTCCTGATGGCTATATCTATAGTTCTAATATCACTAGC
	GGTTTTGATTGTGGTGTCTGGATTTGTCGTCGCGTCGGTA
	GTGCCTTCTGTAGTCGTACTGGTGATTATACTAGTCCTAC
	TGAACTTGACATTTACGAGTTCTACGTCGAAGGGTGGGGC
	CAGGGAGTCCCGGTCACCGTCTCCTCA

16	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R2F12
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGATT
	CTCATTGAGCGACAAGGCTGTAGGCTGGGTCCGCCGGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTAGTATAGACACTG
	GTGGAATGACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAAAGCCAAGTCTCTCTA
	TCAGTGAATAGCGTGACAACTGAGGACTCGGCCACGTACT
	ACTGTGCCACTGTAGACCAGAAAACGAAAAATGCTTGCCC
	TGATGATTTCGATTATCGTTGTTCGTGTATCGGTGGTTGT
	GGCTGCGCCCGTAAAGGATGCGTTGGTCCTCTTTGTTGTC
	GTTCTGATTTGGGTGGCTATCTTACTGATAGTCCTGCTTA
	CATTTACGAATGGTATATTGATCTTTGGGGCCAAGGACTC
	CTGGTCACCGTCTCCTCA

17	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R5A3
	CGTCACAGACCCTCTCGCTCACCTGCACGGCCTCTGGATT
	CTCATTGAGCGACAAGGCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTAGTATAGACACTG
	GTGGAAGCACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGACTCGGCCACATACT
	ACTGTACTACTGTGCACTGTAGTGATGGTGGTTATGTTGA
	GGCGGGTTTTGGTTGTTGGCCTTGGGATTATGGTTATCCT
	TACGTCGATGCCTGGGGCCAAGGACTCCTGGTCACCGTCT
	CCTCA

18	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R4G11
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGATT
	CTCATTGAGCAGCTATGGTATAACCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGCCTCGGTAGTATAAGCAGTG
	GTGGAACCACAGACTACAACCCAGCCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACACCTGAGGACACGGCCACATACT
	ACTGTTCGAAGTGGAATTTAGAATATACTTGGGGTGGTGT
	TGGTTGCGCTAGTTTTGCTGATGAGGACACCCACGTTGAT
	GCCTGGGGCCAAGGACTCCTGGTCACCGTCTCCTCA

19	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGATGAAGC	R4G3
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGGTT
	CTCATTGAGCGACTATGCTGTAGGCTGGGTCCGCCAGGCC
	CCAGGGAAGGCGCTGGAGTGGCTCGGTGGTATAGACACTG
	GTGGAAGCACAGGCTATAACCCAGGCCTGGAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGACTCGGCCACATACT
	ACTGTACTACTGTGGTCCTTTGTTATTTTAATTATGTTGT
	TCGTCGTTATAATTGTGGTGGTCTTGGTTATGGGCATGGC
	TTTAATAGTTTCTACGTCGATGCCTGGGGCCAAGGACTCC
	TGGTCACCGTCTCCTCA

20	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R4E5
	CCTCACAGACCCTCTCCCTCACCTGCACGACCTCTGGATT
	CTCACTGAGAAACTATGCTGTAGGCTGGGTCCGCCAGGCT
	CCGGGGAAGGCGCTGGAGTGGCTCGGTGGTATAGACACTG
	GTGGAAGCACAGGCTATAACCCAGGCCTGGAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGACTCGGCCACATACT
	ACTGTACTACTGTGGTCCTTTGTTATTTTAATTATGTTGT
	TCGTCGTTATAATTGTGGTGGTCTTGGTTATGGGCATGGC
	TTTAATAGTTTCTACGTCGATGCCTGGGGCCAAGGACTCC
	TGGTCACCGTCTCCTCA

21	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R4C1
	CGTCACAGACCCTCTCGCTCACCTGCACGGCCTCTGGATT
	CTCATTGAGCGATAAGGCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGCCGCTGGAGTGGCTCGGTAGTATAGACACTG
	CGGAAAACACAGGCTATAACCCAGGCCTGAAATCTCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGACTCGGCCACATACT
	ACTGTGCTACTGTACACCAGAAAACGCGAAAAGAAAAAAA
	TTGTCCTGATGGCTATATCTATAGTTCTAATACCGCcAGC
	GGTTATGATTGTGGTGTCTGGATTTGTCGTCGCGTCGGTA
	GTGCCTTCTGTAGTCGTACTGGTGATTATACTAGTCCTAG
	TGAATTTGACATTTACGAgTTCTACGTCGAAGGGTGGGGC
	CAGGGAcTCCTGGTCACCGTCTCCTCA

22	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R4A10
	CCTCACAGACCCTCTCCCTCACCTGCACGACCTCTGGATT
	CTCATTGAGCGACTATGCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTGGTATAGACACTG
	GTGGAAGCACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGTCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGATTCGGCCACATACT
	ACTGTACTGCCGTGGTCCTCTGTTATTACAATCGGGTTGT
	GCGTCGTAATAATTGTGGTGGGCTTGGTTATGATTATGGT
	TTTGATCATTTCTACGTCGATGCCTGGGGCCAAGGACTCC
	TGGTCACCGTCTCCTCA

23	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R2G1
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGATT
	CTCATTGAGCAACTATGCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGCCTCGGTGATGTAGACAGTA
	GTGGAGGCACAGCCTATAACCCAGCCCTGAAATCCCGGTT
	CATCATCGCCAAGGACAACTCCAAGAACCAAGTCTCTCTG
	TCAGTCCGCAGCGTGACACCTGAGGACACGGCCACATACT
	ACTGTGCGAAGTTTGCTAAGGGTACTACGAGTGCTGGTGC
	TTGTGATTATTCAGAAAGCTACGTCGATGCCTGGGGCCAG
	GGACTCCTGGTCACCGTCTCCTCA

24	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R2D6
	CCTCACAGACCCTCTCCCTCACCTGCACGACCTCTGGATT
	CTCACTGAGCAGCTATGCTGTAGGCTGGGTCCGCCAGGCT
	CCGGGGAAGGCGCTGGAGTGGGTTGGTGATATAGATTATG
	TCGGAAACACAGACTATAACCCAGCCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	GTAGTGAGCAGCGTGACAGCTGAGGACGCGGCCACATACT
	ACTGTGCGAAATATTCCGGTGCTTATGCTTATGCTGCTTG
	CAATTATTATGGTTGGCGTTGTGCTTGGGAAAGCTACATC
	GATGCCTGGGGCCAAGGACTCCTGGTCACCGTCTCCTCA

25	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	R2B1
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGATT
	TTCATTAAGCGATAATAATGTAGGCTGGGTCCGCCAGGCT
	CCAGGAAAGGCGCTGGAGTGGCTCGGTGTAATGCATAATG
	ATGGGAACAAAGGCTATAACCCAGCCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAGCTCCAAGAGCCAAGTCTCTCTA
	TCACTAAGCAGCGTGACAAGTGAGGACACGGCCACATACT
	ACTGTACAAGAGACAATGCACGTTGTGATAGTTGGACGTA
	TGACAGCTGTGATACTTGGTATCGCAATTCGTGGCACGTT
	GATGCCTGGGGCCAAGGACTCCTgGTCACCGTCTCCTCA

26	CAGGTGCAGCTGCGCGAGTCGGGCCCCAGCCTGGTGAAGC	SKM-BLV1H12
	CGTCACAGACCCTCTCGCTCACCTGCACGGCCTCTGGATT
	CTCATTGAGCGACAAGGCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTAGTATAGACACTG
	GTGGAAACACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGTCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGACTCGGCCACATACT
	ACTGTACTACTGTGCACCAAGAGACCTTACGTAGTTGTCC
	TGATGGTTATATTGATAATTCTGGATGCACGGCTGATTGG
	GGTTGTGCAGCTCTTGATTGTTGGCGGCGTCGTTTTGGTT
	ACCACAGCACTGATCCTTCTCATTATACTGGTGCGACGTA
	TATTTACACGTACAGCTTGCACATCGATGCCTGGGGCCAA
	GGACTCCTGGTCACCGTCTCCTCA

27	CAGGTGCAGCTGCGCGAGTCGGGCCCCAGCCTGGTGAAGC	SKD-BLV1H12
	CGTCACAGACCCTCTCGCTCACCTGCACGGCCTCTGGATT
	CTCATTGAGCGACAAGGCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTAGTATAGACACTG
	GTGGAAACACAGGCTATAACCCAGGCCTGAAATCCCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGTCAAGTCTCTCTG
	TCAGTGAGCAGCGTGACAACTGAGGACTCGGCCACATACT
	ACTGTACTACTGTGCACCAGCGTACAAGCGAAAAAAGAAG
	TTGTCCTGGCGGTAGTAGTAGACGTTATCCTAGTGGCGCC
	AGTTGTGACGTTAGTGGGGGCGCTTGTGCGTGTTATGTTT
	CTAATTGTAGAGGCGTTTTGTGTCCTACTCTTAACGAAAT
	CGTTGCTTATACCTACGAATGGCACGTCGACGCCTGGGGC
	CAAGGACTCCTGGTCACCGTCTCCTCA

28	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	RBD F4
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGATT
	CTCATTGAGCAGCAATGGTGTGGTCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTGATATATGCAGTA
	CTGGAGGCACAAGCTTTAACCCAGCCCTGAAATCCCGGCT
	CAGCATCGCCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGAAGCGTGACACCTGAGGACACGGCCACATATT
	ACTGTGCAAGAAGTCGTGGTTATGATTGTTATGCTAATGT
	GGATGCTTTGGACTACGTCGATGCCTGGGGCCAAGGACTC
	CTGGTCACCGTCTCCTCA

29	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	RBD C6
	CCTCACAGACCCTCTCCCTCACCTGCACGGTCTCTGGATT
	CTCATTGAGCAGCAATGGTGTAGTCTGGGTCCGCCAGGCT
	CCAGGGAGACCACTGGAGTGGCTCGGTGATATATGCAGTA
	ATGGAGGCACAAGCTTTAACCCAGCCCTGAAATCCCGGCT
	CAGCATCGCCAAGGACAACTCCGAGAGCCAAGTCTCTCTG
	ACCGTGAGAAGCGTGACACCTGAGGACACAGCCACATATT
	ACTGTGCAAGAAGTCGTGGTTATGATTGTTATGCTTATGT
	TTATGCTTTGGACACCGTCGATGCCTGGGGCCAAGGACTC
	CTGGTCACCGTCTCCTCA

30	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	RBD A2
	CCCTACAGATCCTCTCCCTCACCTGCACGGTCTCTGGATT
	CTCATTGAGCAGCAATGGTGTGGTCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTGATATATGCAGTA
	CTGGAGGCACAAGCTTTAACCCAGCCCTGAAATCCCGGCT
	CAGCATCGCCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAGAAGCGTGACACCTGAGGACACGGCCACATATT
	ACTGTGCAAGAAGTCGTGGTTATGATTGTTATGCTAATGT
	GGATGCTTTGGACTACGTCGATGCCTGGGGCCAAGGACTC
	CTGGTCACCGTCTCCTCA

31	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	SA-R2C3
	CGTCACAGACCCTCTCGCTCACCTGCACGGCCTCTGGATT
	CTCATTGAGCGATAAGCCTGTAGGCTGGGTCCGCCAGGCT
	CCAGGGAAGCCACTGGAGTGGCTCGGTAGTATAGACACTG
	CGGAAAACACAGGCTATAACCCAGGCCTGAAATCTCGGCT
	CAGCATCACCAAGGACAACTCCAAGAGCCAAGTCTCTCTG
	TCACTGAGCAGCGTGACGACTGAGGACTCGGCCACATACT
	ACTGTGCTACTGTACACCAGAAAACGCGGAAGGAAAAAAG
	TTGTCCTGATGGCTATCTCTATAGTTCTAATACCGGCCGC
	GGTTATGATTGTGGTGTCTGGACTTGTCGTCGCGTCGGTG
	GTGAATTCTGTAGTGCTACTGGTGATTGGACTAGTCCTAG
	TGAAGAAGACTTTTACGAATTCTACGTCGATACGTGGGGC
	CAGGGAGCCCCGGTCACCGTCTCCTCA

32	CAGGTGCAGCTGCGGGAGTCGGGCCCCAGCCTGGTGAAGC	SA-R2D9
	CGTCACAGACCCTCTCGCTCACCTGCACGGCCTCTGGATT
	CTCATTAAGCGACAAGGCTATTGGCTGGGTCCGCCAGGCT
	CCAGGGAAGGCGCTGGAGTGGCTCGGTAGTATAGACACCC
	GTGGAAACACAGGCTATAACCCAGGCCTGAAATCCCGACT
	CAGCATCACCAAGGACAGCTCCAAGAGCCAAGTCTCTCTG
	TCAGTGAACAGCGTGACAACTGAAGACTCGGCCACGTACC
	TCTGTGCTATTGTGCAGCAGATCACACACAAAACTTGTCC
	TAATGGTTACAATTGGTTTGATCGTTGTTGTTCTTGGGAT
	GGTACCTGTGGTGATGGTTGTTGCAGTAATCGTGCTTGGC
	CTAGTGGTAATGGTAGAGCCGACAGTAGTATTGGTGAAAC
	TTATGGTTACGAATTTCACGTGGCTGCCTGGGGCCAAGGA
	CTCCTGGTCACCGTCTCCTCA

33	QVQLRESGPSLVKPSQTLSLTCAASGFSLSDKAVGWVRRA	R2G3
	PGKALEWLGSIDTGGSTGYNPGLKSRLSITKDNSKSQVSL
	SISSVTSEDSATYYCATVHQKTAEGDKTCPDGYEHTCGCI
	GGCGCKRSACIGALCCQASLGGWLSDGETYTYEFHVDTWG
	QGLVVTVSS

34	QVQLRESGPSLVKPSQTLSLTCTISGFSLSSYAVLWVRQA	SR3A3
	PGKPLEWLGSIDTAENTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCATVHQKTRKEKNCPDGYIYSSNITS
	GFDCGVWICRRVGSAFCSRTGDYTSPTELDIYEFYVEGWG
	QGVPVTVSS

35	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSDKAVGWVRRA	R2F12
	PGKALEWLGSIDTGGMTGYNPGLKSRLSITKDNSKSQVSL
	SVNSVTTEDSATYYCATVDQKTKNACPDDFDYRCSCIGGC
	GCARKGCVGPLCCRSDLGGYLTDSPAYIYEWYIDLWGQGL
	LVTVSS

36	QVQLRESGPSLVKPSQTLSLTCTASGFSLSDKAVGWVRQA	R5A3
	PGKALEWLGSIDTGGSTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTTVHCSDGGYVEAGFGCWPWDYGYP
	YVDAWGQGLLVTVSS

37	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSSYGITWVRQA	R4G11
	PGKALECLGSISSGGTTDYNPALKSRLSITKDNSKSQVSL
	SVSSVTPEDTATYYCSKWNLEYTWGGVGCASFADEDTHVD
	AWGQGLLVTVSS

38	QVQLRESGPSLMKPSQTLSLTCTVSGFSLSDYAVGWVRQA	R4G3
	PGKALEWLGGIDTGGSTGYNPGLESRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTTVVLCYFNYVVRRYNCGGLGYGHG
	FNSFYVDAWGQGLLVTVSS

39	QVQLRESGPSLVKPSQTLSLTCTTSGFSLRNYAVGWVRQA	R4E5
	PGKALEWLGGIDTGGSTGYNPGLESRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTTVVLCYFNYVVRRYNCGGLGYGHG
	FNSFYVDAWGQGLLVTVSS

40	QVQLRESGPSLVKPSQTLSLTCTASGFSLSDKAVGWVRQA	R4C1
	PGKPLEWLGSIDTAENTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCATVHQKTRKEKNCPDGYIYSSNTAS
	GYDCGVWICRRVGSAFCSRTGDYTSPSEFDIYEFYVEGWG
	QGLLVTVSS

41	QVQLRESGPSLVKPSQTLSLTCTTSGFSLSDYAVGWVRQA	R4A10
	PGKALEWLGGIDTGGSTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTAVVLCYYNRVVRRNNCGGLGYDYG
	FDHFYVDAWGQGLLVTVSS

42	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSNYAVGWVRQA	R2G1
	PGKALECLGDVDSSGGTAYNPALKSRFIIAKDNSKNQVSL
	SVRSVTPEDTATYYCAKFAKGTTSAGACDYSESYVDAWGQ
	GLLVTVSS

43	QVQLRESGPSLVKPSQTLSLTCTTSGFSLSSYAVGWVRQA	R2D6
	PGKALEWVGDIDYVGNTDYNPALKSRLSITKDNSKSQVSL
	VVSSVTAEDAATYYCAKYSGAYAYAACNYYGWRCAWESYI
	DAWGQGLLVTVSS

44	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSDNNVGWVRQA	R2B1
	PGKALEWLGVMHNDGNKGYNPALKSRLSITKDSSKSQVSL
	SLSSVTSEDTATYYCTRDNARCDSWTYDSCDTWYRNSWHV
	DAWGQGLLVTVSS

45	QVQLRESGPSLVKPSQTLSLTCTASGFSLSDKAVGWVRQA	SKM-BLV1H12
	PGKALEWLGSIDTGGNTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTTVHQETLRSCPDGYIDNSGCTADW
	GCAALDCWRRRFGYHSTDPSHYTGATYIYTYSLHIDAWGQ
	GLLVTVSS

46	QVQLRESGPSLVKPSQTLSLTCTASGFSLSDKAVGWVRQA	SKD-BLV1H12
	PGKALEWLGSIDTGGNTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTTVHQRTSEKRSCPGGSSRRYPSGA
	SCDVSGGACACYVSNCRGVLCPTLNEIVAYTYEWHVDAWG
	QGLLVTVSS

47	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSSNGVVWVRQA	RBD F4
	PGKALEWLGDICSTGGTSFNPALKSRLSIAKDNSKSQVSL
	SVRSVTPEDTATYYCARSRGYDCYANVDALDYVDAWGQGL
	LVTVSS

48	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSSNGVVWVRQA	RBD C6
	PGRPLEWLGDICSNGGTSFNPALKSRLSIAKDNSESQVSL
	TVRSVTPEDTATYYCARSRGYDCYAYVYALDTVDAWGQGL
	LVTVSS

49	QVQLRESGPSLVKPLQILSLTCTVSGFSLSSNGVVWVRQA	RBD A2
	PGKALEWLGDICSTGGTSFNPALKSRLSIAKDNSKSQVSL
	SVRSVTPEDTATYYCARSRGYDCYANVDALDYVDAWGQGL
	LVTVSS

50	QVQLRESGPSLVKPSQTLSLTCTASGFSLSDKPVGWVRQA	SA-R2C3
	PGKPLEWLGSIDTAENTGYNPGLKSRLSITKDNSKSQVSL
	SLSSVTTEDSATYYCATVHQKTRKEKSCPDGYLYSSNTGR
	GYDCGVWTCRRVGGEFCSATGDWTSPSEEDFYEFYVDTWG
	QGAPVTVSS

51	QVQLRESGPSLVKPSQTLSLTCTASGFSLSDKAIGWVRQA	SA-R2D9
	PGKALEWLGSIDTRGNTGYNPGLKSRLSITKDSSKSQVSL
	SVNSVTTEDSATYLCAIVQQITHKTCPNGYNWFDRCCSWD
	GTCGDGCCSNRAWPSGNGRADSSIGETYGYEFHVAAWGQG
	LLVTVSS

52	GAAGGAGACAAAACGTGTCCTGATGGTTACGAGCATACTT	R2G3
	GTGGTTGCATTGGGGGTTGTGGTTGCAAAAGGTCTGCCTG
	TATAGGTGCACTTTGTTGCCAAGCGTCGTTGGGTGGTTGG
	CTTAGTGACGGTGAAACCTACACT

53	AAAGAAAAAAATTGTCCTGATGGCTATATCTATAGTTCTA	SR3A3
	ATATCACTAGCGGTTTTGATTGTGGTGTCTGGATTTGTCG
	TCGCGTCGGTAGTGCCTTCTGTAGTCGTACTGGTGATTAT
	ACTAGTCCTACTGAACTTGACATTTACGAGTTC

54	AAAACGAAAAATGCTTGCCCTGATGATTTCGATTATCGTT	R2F12
	GTTCGTGTATCGGTGGTTGTGGCTGCGCCCGTAAAGGATG
	CGTTGGTCCTCTTTGTTGTCGTTCTGATTTGGGTGGCTAT
	CTTACTGATAGTCCTGCTTACATTTACGAA

55	AAAGAAAAAAATTGTCCTGATGGCTATATCTATAGTTCTA	R4C1
	ATACCGCCAGCGGTTATGATTGTGGTGTCTGGATTTGTCG
	TCGCGTCGGTAGTGCCTTCTGTAGTCGTACTGGTGATTAT
	ACTAGTCCTAGTGAATTTGACATTTAC

56	CTGCGTAGTTGTCCTGATGGTTATATTGATAATTCTGGAT	SKM-BLV1H12
	GCACGGCTGATTGGGGTTGTGCAGCTCTTGATTGTTGGCG
	GCGTCGTTTTGGTTACCACAGCACTGATCCTTCTCATTAT
	ACTGGTGCGACGTATATTTACACGTAC

57	AGCGAAAAAAGAAGTTGTCCTGGCGGTAGTAGTAGACGTT	SKD-BLV1H12
	ATCCTAGTGGCGCCAGTTGTGACGTTAGTGGGGGCGCTTG
	TGCGTGTTATGTTTCTAATTGTAGAGGCGTTTTGTGTCCT
	ACTCTTAACGAAATCGTTGCTTATACCTAC

58	CGGAAGGAAAAAAGTTGTCCTGATGGCTATCTCTATAGTT	SA-R2C3
	CTAATACCGGCCGCGGTTATGATTGTGGTGTCTGGACTTG
	TCGTCGCGTCGGTGGTGAATTCTGTAGTGCTACTGGTGAT
	TGGACTAGTCCTAGTGAAGAAGACTTTTACGAATTC

59	ATCACACACAAAACTTGTCCTAATGGTTACAATTGGTTTG	SA-R2D9
	ATCGTTGTTGTTCTTGGGATGGTACCTGTGGTGATGGTTG
	TTGCAGTAATCGTGCTTGGCCTAGTGGTAATGGTAGAGCC
	GACAGTAGTATTGGTGAAACTTATGGTTACGAATTT

60	EGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGW	R2G3
	LSDGETYTYEF

61	KEKNCPDGYIYSSNITSGFDCGVWICRRVGSAFCSRTGDY	SR3A3
	TSPTELDIYEF

62	KTKNACPDDFDYRCSCIGGCGCARKGCVGPLCCRSDLGGY	R2F12
	LTDSPAYIYE

63	RKEKNCPDGYIYSSNTASGYDCGVWICRRVGSAFCSRTGD	R4C1
	YTSPSEFDIY

64	LRSCPDGYIDNSGCTADWGCAALDCWRRRFGYHSTDPSHY	SKM-BLV1H12
	TGATYIYTY

65	SEKRSCPGGSSRRYPSGASCDVSGGACACYVSNCRGVLCP	SKD-BLV1H12
	TLNEIVAYTY

66	RKEKSCPDGYLYSSNTGRGYDCGVWTCRRVGGEFCSATGD	SA-R2C3
	WTSPSEEDFYEF

67	ITHKTCPNGYNWFDRCCSWDGTCGDGCCSNRAWPSGNGRA	SA-R2D9
	DSSIGETYGYEF

68	CTTVHQRTSEKRSCPGGSSRRYPSGASCDVSGGACACYVS	SKD
	NCRGVLCPTLNEIVAYTYEWHVDAWGQGLLVTVSS

69	CTTVHQETLRSCPDGYIDNSGCTADWGCAALDCWRRRFGY	SKM
	HSTDPSHYTGATYIYTYSLHIDAWGQGLLVTVSS

70	CATVHQKTRKEKNCPDGYIYSSNTASGYDCGVWICRRVGS	R4C1
	AFCSRTGDYTSPSEFDIYEFYVEGWGQGLLVTVSS

71	CATVHQKTRKEKSCPDGYLYSSNTGRGYDCGVWTCRRVGG	R5C1
	EFCSATGDWTSPSEEDFYEFYVDTWGQGLLVTVSS

72	CATVHQKTRKEKNCPDGYIYSSNITSGFDCGVWICRRVGS	SR3A3
	AFCSRTGDYTSPTELDIYEFYVEGWGQGVPVTVSS

73	CATVDQKTKNACPDDFDYRCSCIGGCGCARKGCVGPLCCR	RR2F12
	SDLGGYLTDSPAYIYEWYIDLWGQGLLVTVSS

74	CATVHQKTAEGDKTCPDGYEHTCGCIGGCGCKRSACIGAL	RR2G3
	CCQASLGGWLSDGETYTYEFHVDTWGQGLVVTVSS

75	CTTVHQSCPDGYSYGYGCGYGYGCSGYDCYGYGGYGGYGG	Germ
	YGYSSYSYSYTYEYYVDAWGQGLLVTVSS

76	MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPD	WT Wuhan-Hu-1
	KVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFD	S protein
	NPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV	NCBI Reference
	NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY	Sequence:
	SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGY	YP_009724390.1
	FKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT	(RBD shown in
	LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN	bold,
	ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRV	intravirion in
	QPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN	underline)
	CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF
	VIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNN
	LDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC
	NGVEGENCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHA
	PATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
	PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP
	GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGS
	NVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNS
	PRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTI
	SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFC
	TQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGF
	NFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC
	LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAG
	TITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQ
	KLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN
	TLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGR
	LQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRV
	DFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPA
	ICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
	FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHT
	SPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL
	QELKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCC
	SCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT

77	RVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRI	Wuhan-Hu-1 S
	SNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYAD	protein
	SFVIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNS	NCBI Reference
	NNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGST	Sequence:
	PCNGVEGENCYFPLQSYGFQPTNGVGYQPYRVVVLSFELL	YP_009724390.1
	HAPATVCGPKKSTNLVKNKCVNF	RBD AA 319-541

78	MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPD	Wuhan-Hu-1 S
	KVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFD	protein with furin
	NPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV	site removed
	NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY	(AA685-686) and
	SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGY
	FKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT
	LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN
	ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRV
	QPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN
	CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF
	VIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNN
	LDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC
	NGVEGENCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHA
	PATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
	PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP
	GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGS
	NVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNS
	PRRAVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISV
	TTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQ
	LNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFN
	FSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCL
	GDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGT
	ITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQK
	LIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNT
	LVKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGRL
	QSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVD
	FCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAI
	CHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTF
	VSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTS
	PDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQ
	ELKYEQYIKWPWYIWLGFIAGLIAIVMVTI

79	MFVFLVLLPLVSSQCVNFTTRTQLPPAYTNSFTRGVYYPD	K986P and V987P
	KVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFA	stabilizing
	NPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV	mutations (bold)
	NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY	Extracellular
	SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGY	domain only
	FKIYSKHTPINLVRGLPQGFSALEPLVDLPIGINITRFQT	(AA1233-1273
	LLALHISYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN	removed)
	ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRV	NCBI Reference
	QPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN	Sequence:
	CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF	YP_009724390.1
	VIRGDEVRQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNN	7LYN
	LDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC	South African
	NGVKGFNCYFPLQSYGFQPTYGVGYQPYRVVVLSFELLHA	(B.1.351) SARS-
	PATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL	CoV-2 spike
	PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP	protein variant (S-
	GTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGS	GSAS-B.1.351)
	NVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNS
	PGSASSVASQSIIAYTMSLGVENSVAYSNNSIAIPTNFTI
	SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFC
	TQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGF
	NFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC
	LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAG
	TITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQ
	KLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN
	TLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITG
	RLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKR
	VDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAP
	AICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDN
	TFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNH
	TSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLID
	LQELGKYEQGSGYIPEAPRDGQAYVRKDGEWVLLSTFLGR
	SLEVLFQGPGHHHHHHHHSAWSHPQFEKGGGSGGGGSGGS
	AWSHPQFEK

80	MFVFLVLLPLVSSQCVNFTTRTQLPPAYTNSFTRGVYYPD	7LYN
	KVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFA	South African
	NPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV	(B.1.351) SARS-
	NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVY	CoV-2 spike
	SSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGY	protein variant (S-
	FKIYSKHTPINLVRGLPQGFSALEPLVDLPIGINITRFQT	GSAS-B.1.351)
	LLALHISYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN	with furin site
	ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRV	removed and
	QPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN	K986P and V987P
	CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF	stabilizing
	VIRGDEVRQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNN	mutations (bold)
	LDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC	Extracellular
	NGVKGFNCYFPLQSYGFQPTYGVGYQPYRVVVLSFELLHA	domain only
	PATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
	PFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP
	GTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGS
	NVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNS
	PGSASSVASQSIIAYTMSLGVENSVAYSNNSIAIPTNFTI
	SVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFC
	TQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGF
	NFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC
	LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAG
	TITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQ
	KLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALN
	TLVKQLSSNFGAISSVLNDILSRLDPPEAEVQIDRLITGR
	LQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRV
	DFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPA
	ICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
	FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHT
	SPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL
	QELGKYEQGSGYIPEAPRDGQAYVRKDGEWVLLSTFLGRS
	LEVLFQGPGHHHHHHHHSAWSHPQFEKGGGSGGGGSGGSA
	WSHPQFEK

81	QVQLRESGPSLVKPSQTLSLTCTVSGFSLSDKAVGWVRQA	IgHV1-7 V gene
	PGKALEWLGGIDTGGSTGYNPGLKSRLSITKDNSKSQVSL
	SVSSVTTEDSATYYCTTVHQ

82	SCPDGYSYGYGCGYGYGCSGYDCYGYGGYGGYGGYGYSSY	IGHD8-2
	SYSYTYEY

83	YVDAWGQGLLVTVSS	IGHJ2-4

84	TGCAGGTGCAGCTGCGGGAGTCGGG	Minimal
		BOVVHNCOFOR2
		primer

85	TGAGGAGACGGTGACCAGGAGTCC	Minimal
		BOVVHFR4XHOREV
		primer

86	GGGGAMGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALC	R2G3 Parental
	CQASLGGWLSDGETYT

87	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	R2G3 TRUNC1
	GGWLSDGETYT

88	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	R2G3 TRUNC2
	GGWLSDGE

89	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	R2G3 TRUNC3
	GGWLS

90	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	R2G3 TRUNC3A
	GGWL

91	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	R2G3 TRUNC3B
	GGW

92	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	R2G3 TRUNC4
	GG

93	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQAS	R2G3 TRUNC5

94	GlyGlyGlyGlySerGlyGlyGlyGlySerGlyGlyGlyG	Flexible Linker
	lySer

95	GGVCPKILQRCRRDSDSPGACICRGNGYCGSGSD	Mcoti-I

96	GGVCPKILKKCRRDSDSPGACICRGNGYCGSGSD	Mcoti-II

97	ERACPRILKKCRRDSDSPGACICRGNGYCG	Mcoti-III

98	CTTVHQ	Base of Stalk A

99	CATVHQ	Base of Stalk A

100	CAIVQQ	Base of Stalk A

101	CATVDQ	Base of Stalk A

102	YX₁YX₂Y	Stalk B
		X₁ and X₂ are any
		amino acid

103	CX₂TVX₅Q	Ascending Stalk
		Domain
		X₂ and X₅ are any
		amino acid

104	CX₂TVX₅Q	Ascending Stalk
		Domain
		X₂ is Ser, Thr, Gly,
		Asn, Ala, or Pro,
		and X₅ is His, Gln,
		Arg, Lys, Gly, Thr,
		Tyr, Phe, Trp, Met,
		Ile, Val, or Leu

105	CX₂TVX₅Q	Ascending Stalk
		Domain
		X₂ is Ser, Ala, or
		Thr, and X₅ is His
		or Tyr

106	DDDDK	Enterokinase
		Cleavage Tag

107	QAVLNQPSSVSGSLGQKVTISCSGSSSNIGNNYVSWYQQL	Humanized
	PGTAPKLLIYGDTKRPSGIPDRFSGSKSGTSATLGITGLQ	BLV1H12
	TGDEADYYCASAEDSSSNAVFGSGTTLTVLGQP	Variable Light

108	GGGGAMGS	Flexible Linker

109	GGS	Flexible Linker

110	QAVLNQPSSVSGSLGQRVSITCSGSSSNVGNGYVSWYQLI	BLV5B8 Variable
	PGSAPRTLIYGDTSRASGVPDRFSGSRSGNTATLTISSLQ	Light Region
	AEDEADYFCASAEDSSSNAVFGSGTTLTVLGQP

111	QVQLREWGAGLLKPSETLSLTCAVYGGSFSDKYWSWIRQP	Humanized VI
	PGKGLEWIGSINHSGSTNYNPSLKSRVTISVDTSKNQFSL	Region
	KLSSVTAADTAVYY

112	WGQGLLVTVSS	V2 Region

113	QAVLNQPSSVSGSLGQRVSITCSGSSSNVGNGYVSWYQLI	BLV1H12 Light
	PGSAPRTLIYGDTSRASGVPDRFSGSRSGNTATLTISSLQ	Chain
	AEDEADYFCASAEDSSSNAVFGSGTTLTVLGQPKSPPSVT
	LFPPSTEELNGNKATLVCLISDFYPGSVTVVWKADGSTIT
	RNVETTRASKQSNSKYAASSYLSLTSSDWKSKGSYSCEVT
	HEGSTVTKTVKPSECS

114	QAVLNQPSSVSGSLGQKVTISCSGSSSNIGNNYVSWYQQL	B15 Humanized
	PGTAPKLLIYGDTKRPSGIPDRFSGSKSGTSATLGITGLQ	Light Chain
	TGDEADYYCASAEDSSSNAVFGSGTTLTVLGQPKAAPSVT
	LFPPSSEELQANKATLVCLISDFYPGAVTVAWKADSSPVK
	AGVETTTPSKQSNNKYAASSYLSLTPEQWKSHRSYSCQVT
	HEGSTVEKTVAPTECS

115	QAVLNQPSSVSGSLGQRVSITCSGSSSNVGNGYVSWYQLI	BLV5B8 light
	PGSAPRTLIYGDTSRASGVPDRFSGSRSGNTATLTISSLQ	chain
	AEDEADYFCASAEDSSSNAVFGSGTTLTVLGQPKSPPSVT
	LFPPSTEELNGNKATLVCLISDFYPGSVTVVWKADGSTIT
	RNVETTRASKQSNSKYAASSYLSLTSSDWKSKGSYSCEVT
	HEGSTVTKTVKPSECS

116	QSVLTQPPSVSAAPGQKVTISCSGSSSNIGNNYVSWYQQL	human VL1-51
	PGTAPKLLIYDNNKRPSGIPDRFSGSKSGTSATLGITGLQ
	TGDEADYYCASAEDSSSNAVFGSGTTLTVLGQPKAAPSVT
	LFPPSSEELQANKATLVCLISDFYPGAVTVAWKADSSPVK
	AGVETTTPSKQSNNKYAASSYLSLTPEQWKSHRSYSCQVT
	HEGSTVEKTVAPTECS

117	QSVLTQPPSASGTPGQRVTISCSGSSSNIGSNYVYWYQQL	Human germline
	PGTAPKLLIYRNNQRPSGVPDRFSGSKSGTSASLAISGLR	light chain variable
	SEDEADYYCAAWDDSLSG	region sequence
		VL1-47

118	QSVLTQPPSVSGAPGQRVTISCTGSSSNIGAGYDVHWYQQ	Human germline
	LPGTAPKLLIYGNSNRPSGVPDRFSGSKSGTSASLAITGL	light chain variable
	QAEDEADYYCQSYDSSLSG	region sequence
		VL1-40*1

119	QSVLTQPPSVSAAPGQKVTISCSGSSSNIGNNYVSWYQQL	Human germline
	PGTAPKLLIYDNNKRPSGIPDRFSGSKSGTSATLGITGLQ	light chain variable
	TGDEADYYCGTWDSSLSA	region sequence
		VL1-51*01

120	QSALTQPPSVSGSPGQSVTISCTGTSSDVGSYNRVSWYQQ	Human germline
	PPGTAPKLMIYEVSNRPSGVPDRFSGSKSGNTASLTISGL	light chain variable
	QAEDEADYYCSSYTSSSTF	region sequence
		VL2-18*02

121	ttacctgcggccgctgaggagacggtgaccaggagtcc	BOVVHFR4REV

122	ttttttgcggccgcccaggcgctgacgtaccattc	ULp1

123	ttttttgcggccgcccaggcatcgacgtagaattc	ULp2

124	ttttttgcggccgcccagacatcgacgaaaaattc	ULp3

125	ttttttgggccgcccaggcatggacgtaaaattg	ULp4

126	ttttttgcggccgcccaagtctcgacataaaattc	ULp5

127	ttttttgcggccgcccaggcatcgacgagccattg	ULp6

128	ttttttgcggccgcccaggcatcgacgtgccattc	ULp7

129	ttttttgcggccgcccaggcatcgacgtggaattc	ULp8

130	ttttttgcggccgcccaggcatcgacgtggaagct	ULp9

131	GGSEGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASL	G3 Parental
	GGWLSD

132	GGSGDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLG	G3 NTRUNC1
	GWLSD

133	GGSDKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGG	G3 NTRUNC2
	WLSD

134	GGSKTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGW	G3 NTRUNC3
	LSD

135	GGSTCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWL	G3 NTRUNC4
	SD

136	GGSCPDGYEHTCGCIGGCGCKRSACIGALCCQASLGGWLS	G3 NTRUNC5
	D

137	tttgttgccaagcgtcgttgggtggttggcttagtgacgg	Ultralong CDR3
	tgaaacctacacttacgagttccacgtcgatacctggggc	Antibody 014
	caaggactcgtggtcaccgtctcctca

138	gtgccttctgtagtcgtactggtgattatactagtcctac	Ultralong CDR3
	tgaacttgacatttacgagttctacgtcgaagggtggggc	Antibody 015
	cagggagtcccggtcaccgtctcctca

139	cttggcctagtggtaatggtagagccgacagtagtattgg	Ultralong CDR3
	tgaaacttatggttacgaatttcacgtggctgcctggggc	Antibody 032
	caaggactcctggtcaccgtctcctca

140	tttgttgtcgttctgatttgggtggctatcttactgatag	Ultralong CDR3
	tcctgcttacatttacgaatggtatattgatctttggggc	Antibody 016
	caaggactcctggtcaccgtctcctca

141	gtgaattctgtagtgctactggtgattggactagtcctag	Ultralong CDR3
	tgaagaagacttttacgaattctacgtcgatacgtggggc	Antibody 031
	cagggagccccggtcaccgtctcctca

142	ctaattgtagaggcgttttgtgtcctactcttaacgaaat	Ultralong CDR3
	cgttgcttatacctacgaatggcacgtcgacgcctggggc	Antibody 027
	caaggactcctggtcaccgtctcctca

143	gtgccttctgtagtcgtactggtgattatactagtcctag	Ultralong CDR3
	tgaatttgacatttacgagttctacgtcgaagggtggggc	Antibody 021
	cagggactcctggtcaccgtctcctca

144	gttaccacagcactgatccttctcattatactggtgcgac	Ultralong CDR3
	gtatatttacacgtacagcttgcacatcgatgcctggggc	Antibody 026
	caaggactcctggtcaccgtctcctca

145	gcaagaagtcgtggttatgattgttatgctaatgtggatg	Standard Short
	ctttggactacgtcgatgcctggggccaaggactcctggt	CDR3 Antibody
	caccgtctcctca	028

146	agtggaatttagaatatacttggggtggtgttggttgcgc	Ultralong CDR3
	tagttttgctgatgaggacacccacgttgatgcctggggc	Antibody 018
	caaggactcctggtcaccgtctcctca

147	attatgttgttcgtcgttataattgtggtggtcttggtta	Ultralong CDR3
	tgggcatggctttaatagtttctacgtcgatgcctggggc	Antibody 019
	caaggactcctggtcaccgtctcctca

148	attatgttgttcgtcgttataattgtggtggtcttggtta	Ultralong CDR3
	tgggcatggctttaatagtttctacgtcgatgcctggggc	Antibody 020
	caaggactcctggtcaccgtctcctca

149	atcgggttgtgcgtcgtaataattgtggtgggcttggtta	Ultralong CDR3
	tgattatggttttgatcatttctacgtcgatgcctggggc	Antibody 022
	caaggactcctggtcaccgtctcctca

150	gcgaagtttgctaagggtactacgagtgctggtgcttgtg	Ultralong CDR3
	attattcagaaagctacgtcgatgcctggggccagggact	Antibody 023
	cctggtcaccgtctcctca

151	attccggtgcttatgcttatgctgcttgcaattattatgg	Ultralong CDR3
	ttggcgttgtgcttgggaaagctacatcgatgcctggggc	Antibody 024
	caaggactcctggtcaccgt

152	acaatgcacgttgtgatagttggacgtatgacagctgtga	Ultralong CDR3
	tacttggtatcgcaattcgtggcacgttgatgcctggggc	Antibody 025
	caaggactcctggtcaccgtctcctca

153	gcaagaagtcgtggttatgattgttatgcttatgtttatg	Standard Short
	ctttggacaccgtcgatgcctggggccaaggactcctggt	CDR3 Antibody
	caccgtctcctca	029

154	gcaagaagtcgtggttatgattgttatgctaatgtggatg	Standard Short
	ctttggactacgtcgatgcctggggccaaggactcctggt	CDR3 Antibody
	caccgtctcctca	030

Claims

What is claimed is:

1. A method of preparing a cow ultralong CDR3 antibody display library, the method comprising:

(a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library;

(b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to a variable lambda light (VL) region selected from the group consisting of VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof;

(c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and

(d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.

2. The method of claim 1, wherein the VL region is the BLV1H12 VL region.

3. A method of preparing a cow ultralong CDR3 antibody display library, the method comprising:

(a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library;

(b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof;

(c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and

(d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.

4. The method of any of claims 1-3, wherein the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.

5. The method of any of claims 1-4, further comprising preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.

6. The method of claim 4 or claim 5, further comprising immunizing the cow with a target antigen.

7. The method of any of claims 1-6, wherein the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles.

8. The method of any of claims 1-7, wherein the amplified display particles are phage display particles.

9. The method of any of claims 1-8, wherein the amplified display particles are phagemid particles.

10. The method of claim 9, wherein the nucleic acid sequence is a first nucleic acid sequence, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.

11. A method of preparing a cow ultralong CDR3 antibody phage display library, the method comprising:

(a) immunizing a cow with a target antigen;

(b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow;

(c) amplifying sequences encoding a plurality of VH regions of the IgHV1-7 family from the cDNA template library;

(d) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof, and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein;

(e) transforming suitable host cells with the plurality of replicable expression vectors;

(f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and

(g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an scFv.

12. The method of any of claims 1-11, wherein the BLV1H12 lambda VL region is set forth in SEQ ID NO: 2.

13. The method of any of claims 1-11, wherein the BLV1H12 lambda VL region is a humanized variant of the lambda VL region of BLV1H12.

14. The method of claim 13, wherein the humanized variant comprises one or more of amino acid replacements S2A, T5N, P8S, A12G, A13S, and P14L based on Kabat numbering, amino acid replacements I29V and N32G in the CDR1 region, and/or amino acid substitution of DNN to GDT in the CDR2 region.

15. The method of claim 13 or claim 14, wherein the humanized variant comprises the sequence set forth in SEQ ID NO: 107.

16. The method of any of claims 1-15, wherein the amplified VH region is joined to the BLV1H12 lambda VL region indirectly via a peptide linker.

17. The method of claim 16, wherein the peptide linker is (Gly₄Ser)₃ (SEQ ID NO: 94).

18. The method of any of claims 1-17, wherein the plurality of VH regions of the IgHV1-7 family from the cDNA template library are amplified with a forward primer comprising the sequence set forth in SEQ ID NO: 84 and a reverse primer comprising the sequence set forth in SEQ ID NO: 85.

19. The method of any of claims 1-18, wherein prior to the constructing, the method further comprises performing a size separation on the sequences encoding the plurality of amplified VH regions to enrich for VH regions with an ultralong CDR3.

20. The method of claim 19, wherein the size separation is performed by gel electrophoresis.

21. The method of claim 20, wherein the gel electrophoresis is performed using a 1.2%, 1.5%, or 2% agarose gel, optionally using a 2% agarose gel.

22. The method of any of claims 19-21, wherein the size separation comprises separating sequences of, of about, or greater than 550 base pairs in length from the sequences encoding the plurality of amplified VH regions, wherein the sequences of, of about, or greater than 550 base pairs in length comprise sequences encoding VH regions with an ultralong CDR3.

23. The method of any of claims 1-22, wherein at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.

24. The method of any of claims 1-23, wherein at least or at least about 30% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.

25. The method of any of claims 1-24, wherein at least or at least about 40% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.

26. The method of any of claims 1-25, wherein at least or at least about 50% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.

27. The method of any of claims 1-26, wherein the ultralong CDR3 is a peptide sequence of 25-70 amino acids comprising a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
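
As an illustrative aside, the sequence criteria recited in claim 27 (a peptide of 25-70 amino acids containing 2-12 cysteine residues) can be checked in silico. The Python sketch below is not part of the claimed method. The function names `meets_ultralong_cdr3_criteria` and `max_disulfide_bonds` are hypothetical, the toy peptide is made up, and the disulfide figure is only the upper bound implied by pairing cysteines, not a prediction of which bonds actually form.

```python
def meets_ultralong_cdr3_criteria(peptide: str) -> bool:
    """Return True if the peptide is 25-70 residues long and has 2-12 cysteines."""
    seq = peptide.strip().upper()
    n_cys = seq.count("C")
    return 25 <= len(seq) <= 70 and 2 <= n_cys <= 12


def max_disulfide_bonds(peptide: str) -> int:
    """Upper bound on disulfide bonds: each bond consumes two cysteines."""
    return peptide.strip().upper().count("C") // 2


if __name__ == "__main__":
    # Made-up toy peptide (34 residues, 4 cysteines); not a sequence from the disclosure.
    toy = "C" + "A" * 10 + "C" + "G" * 10 + "C" + "S" * 10 + "C"
    print(meets_ultralong_cdr3_criteria(toy), max_disulfide_bonds(toy))
```

A sequence with 4 cysteines can form at most 2 disulfide bonds under this bound, which is consistent with the 2-12 cysteine and 1-6 disulfide ranges recited in the claim.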

28. The method of any of claims 1-27, wherein the ultralong CDR3 is 40 to 60 amino acids in length.

29. The method of any of claims 1-28, wherein the ultralong CDR3 is at least 42 amino acids in length.

30. The method of any of claims 1-29, wherein the ultralong CDR3 is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.

31. The method of any of claims 1-30, wherein the ultralong CDR3 comprises at least 4 cysteine residues.

32. The method of any of claims 1-31, wherein the ultralong CDR3 contains 4 cysteine residues.

33. The method of any of claims 1-31, wherein the ultralong CDR3 contains 6, 8, 10, or 12 cysteine residues.

34. The method of any of claims 1-33, wherein the ultralong CDR3 has at least 2 disulfide bonds.

35. The method of any of claims 1-34, wherein the ultralong CDR3 has 2 disulfide bonds.

36. The method of any of claims 1-34, wherein the ultralong CDR3 has 3, 4 or 5 disulfide bonds.

37. The method of any of claims 1-36, wherein the method further comprises identifying the CDR3-knob sequence in the scFv sequence.

38. A method of preparing an ultralong CDR3-knob display library, the method comprising:

(a) amplifying sequences encoding a plurality of CDR3-knob only antibodies from a cow antibody variable heavy (VH) chain complementary DNA (cDNA) template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region;

(b) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a nucleic acid sequence encoding an amplified CDR3 knob;

(c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and

(d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an amplified CDR3 knob.

39. The method of claim 38, wherein the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.

40. The method of claim 38 or claim 39, further comprising preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.

41. The method of claim 39 or claim 40, further comprising immunizing the cow with a target antigen.

42. The method of any of claims 38-41, wherein the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles.

43. The method of any of claims 38-42, wherein the amplified display particles are phage display particles.

44. The method of any of claims 38-43, wherein the amplified display particles are phagemid particles.

45. The method of claim 44, wherein the nucleic acid sequence is a first nucleic acid sequence, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.

46. A method of preparing an ultralong CDR3-knob phage display library, the method comprising:

(a) immunizing a cow with a target antigen;

(b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow;

(c) amplifying sequences encoding a plurality of CDR3-knob only antibodies from the cDNA template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region;

(d) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding an amplified CDR3 knob and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein;

(e) transforming suitable host cells with the plurality of replicable expression vectors;

(f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and

(g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an amplified CDR3 knob.

47. The method of any of claims 38-46, wherein the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 7-11 and 121-130, optionally comprise or consist of any of the sequences set forth in SEQ ID NO: 123, 127, and 128.

48. The method of any of claims 38-47, wherein the method further comprises identifying the CDR3-knob from the cow antibody variable heavy (VH) chain template sequences.

49. The method of claim 37 or claim 48, wherein the CDR3-knob is identified from an antibody sequence by an algorithm comprising:

identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and

determining the sequence of the CDR-3 knob, in which:

the CDR-3 knob has the amino acid sequence length K;

the sequence begins at position X+1 and ends at X+K; and

K=L−2X;

wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the D_H region in CDR H3.
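
For illustration only, the arithmetic of the algorithm recited in claim 49 can be sketched in Python. The sketch rests on several assumptions that the claim does not spell out: the input `region` already spans the conserved framework 3 cysteine through the conserved framework 4 tryptophan (inclusive); X counts the residues from the framework 3 cysteine up to, but not including, the first D_H-encoded cysteine, so that position X+1 is that cysteine; and, when no index is supplied, the next cysteine after the framework 3 cysteine is taken to be the first D_H-encoded cysteine. The function name `identify_cdr3_knob` and the toy sequence are hypothetical.

```python
from typing import Optional


def identify_cdr3_knob(region: str, dh_cys_index: Optional[int] = None) -> str:
    """Return the knob as positions X+1..X+K of `region`, with K = L - 2X.

    `region` must start at the conserved FR3 cysteine and end at the
    conserved FR4 tryptophan. `dh_cys_index` is the 0-based index of the
    first D_H-encoded cysteine within `region` (assumption: if omitted,
    the next cysteine after the FR3 cysteine is used).
    """
    seq = region.strip().upper()
    if len(seq) < 3 or seq[0] != "C" or seq[-1] != "W":
        raise ValueError("region must run from the FR3 Cys to the FR4 Trp")

    if dh_cys_index is None:
        dh_cys_index = seq.find("C", 1)
        if dh_cys_index == -1:
            raise ValueError("no cysteine found after the FR3 cysteine")

    L = len(seq)       # residues from FR3 Cys through FR4 Trp
    X = dh_cys_index   # residues preceding the first D_H cysteine (assumed convention)
    K = L - 2 * X      # knob length per the claimed formula
    if K <= 0:
        raise ValueError("computed knob length is not positive")

    # 1-based positions X+1..X+K correspond to the 0-based slice [X, X+K).
    return seq[X:X + K]


if __name__ == "__main__":
    # Made-up toy region: 5 flanking residues, a 6-residue "knob", 5 flanking residues.
    print(identify_cdr3_knob("CAAAACGGGGCAAAAW"))
```

The formula removes X residues from each end of the region, which treats the stretch before the first D_H cysteine and the stretch after the knob as equal in length. Claim 51 allows the identified sequence to be extended by one to five residues at either terminus.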

50. The method of claim 49, wherein the antibody sequence is a bovine antibody.

51. The method of claim 49 or 50, wherein the identified CDR3-knob is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.

52. The method of any of claims 38-51, wherein each of the plurality of CDR3-knob only antibodies comprises a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.

53. The method of claim 52, wherein the peptide sequence is 40 to 60 amino acids in length.

54. The method of claim 52 or claim 53, wherein the peptide sequence is at least 42 amino acids in length.

55. The method of any of claims 52-54, wherein the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.

56. The method of any of claims 52-55, wherein the peptide sequence comprises at least 4 cysteine residues.

57. The method of any of claims 52-56, wherein the peptide sequence contains 4 cysteine residues.

58. The method of any of claims 52-56, wherein the peptide sequence contains 6, 8, 10, or 12 cysteine residues.

59. The method of any of claims 52-58, wherein the peptide sequence has at least 2 disulfide bonds.

60. The method of any of claims 52-59, wherein the peptide sequence has 2 disulfide bonds.

61. The method of any of claims 52-59, wherein the peptide sequence has 3, 4 or 5 disulfide bonds.

62. The method of any of claims 6-37 and 41-61, wherein the target antigen is a nonvirulent bacteria, a virus, a viral protein, an immunomodulatory protein, a cancer antigen, a human IgG, or a recombinant protein thereof.

63. The method of any of claims 1-62, wherein the cDNA template library was synthesized using a pool of IgM, IgA, and IgG-specific primers comprising a primer comprising or consisting of the sequence set forth in SEQ ID NO: 4, a primer comprising or consisting of the sequence set forth in SEQ ID NO: 5, a primer comprising or consisting of the sequence set forth in SEQ ID NO: 3, and a primer comprising or consisting of the sequence set forth in SEQ ID NO: 6.

64. A method of preparing an ultralong CDR3-knob display library, the method comprising:

(a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds;

(b) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and

(c) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising a CDR3 knob.

65. The method of claim 64, wherein the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles.

66. The method of claim 64 or claim 65, wherein the amplified display particles are phage display particles.

67. The method of any of claims 64-66, wherein the amplified display particles are phagemid particles.

68. The method of claim 67, wherein the nucleic acid sequence is a first nucleic acid sequence, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.

69. A method of preparing an ultralong CDR3-knob phage display library, the method comprising:

(a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein;

(b) transforming suitable host cells with a plurality of replicable expression vectors;

(c) infecting the transformed host cells with a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and

(d) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and a CDR3 knob.

70. The method of any of claims 64-69, wherein at least one of the plurality of CDR3-knob only antibodies is identified from an antibody sequence by an algorithm comprising:

identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and

determining the sequence of the CDR-3 knob, in which:

the CDR-3 knob has the amino acid sequence length K;

the sequence begins at position X+1 and ends at X+K; and

K=L−2X;

wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the D_H region in CDR H3.

71. The method of claim 70, wherein the antibody sequence is a bovine antibody.

72. The method of claim 70 or claim 71, wherein the at least one CDR3-knob antibody has a sequence that is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.

73. The method of any of claims 27-37 and 52-72, wherein the peptide sequence comprises an ascending stalk domain and a descending stalk domain, wherein the cysteine motif is between the ascending and descending stalk domains.

74. The method of any of claims 64-73, wherein the peptide sequence is amplified from DNA from a cow immunized with a target antigen.

75. The method of claim 74, wherein the peptide sequence is amplified from a variable heavy chain cDNA library from the immunized cow using primers specific for either side of the stalk domain of a cow ultralong CDR3 region.

76. The method of any of claims 27-37, 52-72, 74, and 75, wherein the peptide sequence does not comprise an ascending stalk domain N-terminal to the cysteine motif.

77. The method of any of claims 27-37, 52-72, and 74-76, wherein the peptide sequence does not comprise a descending stalk domain C-terminal to the cysteine motif.

78. The method of any of claims 73-75 and 77, wherein the ascending stalk domain comprises the sequence CX₂TVX₅Q, wherein X₂ and X₅ are any amino acid.

79. The method of claim 78, wherein X₂ is Ser, Thr, Gly, Asn, Ala, or Pro, and X₅ is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu.

80. The method of claim 78 or claim 79, wherein X₂ is Ser, Ala, or Thr, and X₅ is His or Tyr.
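
As an illustrative aside, the ascending stalk motif of claims 78-80 maps naturally onto regular expressions. The Python sketch below reads the subscripts in CX₂TVX₅Q as residue positions (a six-residue motif C-X-T-V-X-Q), which is an interpretation rather than something the claims state explicitly. The pattern names and the helper `find_ascending_stalk` are hypothetical, and "any amino acid" is approximated by the 20 standard one-letter codes.

```python
import re

ANY_AA = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (assumed meaning of "any")

# Claim 78: X2 and X5 are any amino acid.
STALK_CLAIM_78 = re.compile(f"C[{ANY_AA}]TV[{ANY_AA}]Q")
# Claim 79: X2 in {S,T,G,N,A,P}; X5 in {H,Q,R,K,G,T,Y,F,W,M,I,V,L}.
STALK_CLAIM_79 = re.compile(r"C[STGNAP]TV[HQRKGTYFWMIVL]Q")
# Claim 80: X2 in {S,A,T}; X5 in {H,Y}.
STALK_CLAIM_80 = re.compile(r"C[SAT]TV[HY]Q")


def find_ascending_stalk(sequence: str, pattern: re.Pattern = STALK_CLAIM_78) -> int:
    """Return the 0-based start of the first motif match, or -1 if absent."""
    match = pattern.search(sequence.strip().upper())
    return match.start() if match else -1


if __name__ == "__main__":
    # Short made-up fragments used only to exercise the patterns.
    for fragment in ("AAACTTVHQETK", "CPTVRQ", "CTTAHQ"):
        print(fragment,
              find_ascending_stalk(fragment, STALK_CLAIM_78),
              find_ascending_stalk(fragment, STALK_CLAIM_79),
              find_ascending_stalk(fragment, STALK_CLAIM_80))
```

The three patterns are nested: any match to the claim 80 pattern also matches the claim 79 and claim 78 patterns, mirroring the dependency of the claims.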

81. The method of any of claims 64-73 and 76-80, wherein the peptide sequence is a synthetic CDR3-knob.

82. The method of any of claims 64-73 and 76-81, wherein the peptide sequence is a cyclotide or modified cyclotide.

83. The method of any of claims 64-73 and 76-81, wherein the peptide sequence is a semisynthetic CDR3-knob derived from a bovine CDR3-knob.

84. The method of any of claims 64-83, wherein the peptide sequence is 40 to 60 amino acids in length.

85. The method of any of claims 64-84, wherein the peptide sequence is at least 42 amino acids in length.

86. The method of any of claims 64-85, wherein the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.

87. The method of any of claims 64-86, wherein the peptide sequence comprises at least 4 cysteine residues.

88. The method of any of claims 64-87, wherein the peptide sequence contains 4 cysteine residues.

89. The method of any of claims 64-87, wherein the peptide sequence contains 6, 8, 10, or 12 cysteine residues.

90. The method of any of claims 64-89, wherein the peptide sequence has at least 2 disulfide bonds.

91. The method of any of claims 64-90, wherein the peptide sequence has 2 disulfide bonds.

92. The method of any of claims 64-90, wherein the peptide sequence has 3, 4 or 5 disulfide bonds.

93. The method of any of claims 64-73 and 76-92, wherein the plurality of CDR3 knobs are mutated at one or more selected positions within the nucleic acid sequence encoding the peptide sequence, wherein the plurality of replicable expression vectors are a family of mutated vectors.

94. The method of any of claims 1-93, wherein the expression vector further comprises a secretory signal sequence.

95. The method of claim 94, wherein the secretory signal sequence is a pelB signal sequence.

96. The method of any of claims 1-95, wherein the suitable host cells are E. coli cells.

97. The method of any of claims 1-96, wherein the suitable host cells are TG1 electrocompetent cells.

98. The method of any of claims 9-37, 44-63, and 67-97, wherein the phagemid particles are derived from M13 phage.

99. The method of any of claims 10-37, 45-63, and 68-98, wherein the coat protein is the M13 phage gene III coat protein (pIII).

100. The method of any of claims 10-37, 45-63, and 68-99, wherein the helper phage is selected from the group consisting of M13K07, M13R408, M13-VCS, and Phi X 174.

101. The method of any of claims 10-37, 45-63, and 68-100, wherein the helper phage is M13K07.

102. The method of any of claims 1-101, wherein the display particles on average display one copy of the fusion protein on the surface of the particle.

103. A library of display particles produced by the method of any of claims 1-102.

104. A replicable expression vector comprising a gene fusion encoding a fusion protein comprising a nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a variable lambda light (VL) region selected from VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof.

105. A replicable expression vector comprising a gene fusion encoding a fusion protein comprising a nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a BLV1H12 lambda variable light (VL) region or a humanized variant thereof.

106. The replicable expression vector of claim 104 or claim 105, wherein the nucleic acid sequence is a first nucleic acid sequence, and the replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein.

107. A display particle encoded by the replicable expression vector of any of claims 104-106.

108. A library of display particles comprising a plurality of the display particle of claim 107.

109. The library of claim 103 or claim 108, wherein at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.

110. The library of any of claims 103, 108, and 109, wherein at least or at least about 30% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.

111. The library of any of claims 103 and 108-110, wherein at least or at least about 40% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.

112. The library of any of claims 103 and 108-111, wherein at least or at least about 50% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.

113. A replicable expression vector comprising a gene fusion encoding a fusion protein that comprises a nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form disulfide bonds.

114. The replicable expression vector of claim 113, wherein the nucleic acid sequence is a first nucleic acid sequence, and the replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein.

115. A display particle encoded by the replicable expression vector of claim 113 or claim 114.

116. A library of display particles comprising a plurality of the display particle of claim 115.

117. The library of any of claims 103, 108-112, and 116, wherein the display particles are phage display particles.

118. The library of any of claims 103, 108-112, 116, and 117, wherein the display particles are phagemid particles.

119. A method for selecting an antibody binding protein, the method comprising:

(1) contacting the library of display particles of any of claims 103, 108-112, and 116-118 with a target molecule under conditions to allow binding of a display particle to the target molecule; and

(2) separating the display particles that bind from those that do not, thereby selecting display particles comprising an antibody binding protein that binds to the target molecule.

120. The method of claim 119, wherein the display particles are phage display particles.

121. The method of claim 119 or claim 120, wherein the display particles are phagemid particles.

122. The method of any of claims 119-121, wherein the target molecule is a nonvirulent bacteria, a virus, a viral protein, an immunomodulatory protein, a cancer antigen, a human IgG, or a recombinant protein thereof.

123. The method of any of claims 119-122, wherein the target molecule is a coronavirus, a coronavirus pseudovirus, a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein.

124. The method of claim 123, wherein the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2.

125. The method of claim 123 or claim 124, wherein the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant.

126. The method of any of claims 119-125, further comprising:

(i) infecting suitable host cells with replicable expression vectors encoding the selected display particles that bind in (2);

(ii) collecting the amplified display particles; and

(iii) repeating steps (1) and (2) using the amplified display particles as the library of display particles.

127. The method of claim 126, wherein the display particles are phagemid particles, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles.

128. The method of claim 126 or claim 127, wherein the steps are repeated one or more times.

129. The method of any of claims 126-128, wherein the steps are repeated with the same target molecule or a different target molecule.

130. The method of claim 129, wherein the steps are repeated with a different target molecule and the different target molecule is related to the target molecule.

131. The method of claim 129 or claim 130, wherein the different target molecule is the same type of pathogen as, in the same group of pathogen as, or a variant of the target molecule.

132. The method of any of claims 119-131, further comprising sequencing the fusion gene in the selected display particles to identify the antibody binding protein.

133. The method of claim 132, further comprising producing a full-length IgG or a Fab from the selected antibody binding protein.

134. The method of claim 132 or claim 133, wherein the antibody binding protein is a scFv, and the method comprises constructing a heavy chain or a portion thereof comprising joining the VH region of the scFv with a constant region or a portion thereof.

135. The method of claim 132 or claim 133, wherein the method comprises constructing a humanized VH region by replacing a knob region of the ultralong CDR3 region of a humanized bovine VH region with an ultralong CDR3 region of a selected antibody binding protein.

136. The method of claim 135, wherein the ultralong CDR3 region of a selected antibody binding protein is replaced between an ascending stalk strand and a descending stalk strand of a humanized bovine VH region.

137. The method of claim 136, wherein the VH region comprises the formula V1-X-V2, wherein the V1 region of the heavy chain comprises the sequence set forth in SEQ ID NO: 111; the X region comprises the ultralong CDR3 of a selected antibody binding protein; and the V2 region comprises the sequence set forth in SEQ ID NO: 112.

138. The method of any of claims 135-137, wherein the method further comprises constructing a heavy chain or a portion thereof comprising joining the humanized VH region with a constant region or a portion thereof.

139. The method of claim 134 or claim 138, wherein the heavy chain or the portion thereof is a human IgG1 heavy chain or portion thereof.

140. The method of any of claims 134, 138, and 139, further comprising co-expressing the heavy chain or portion thereof with a light chain.

141. The method of claim 140, wherein the light chain is a bovine light chain of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, or F18, or is a humanized variant thereof.

142. The method of claim 140 or claim 141, wherein the light chain is a BLV1H12 light chain comprising the sequence set forth in SEQ ID NO: 113 or a humanized variant thereof.

143. The method of any of claims 140-142, wherein the light chain is a humanized light chain set forth in SEQ ID NO: 114.

144. The method of claim 140 or claim 141, wherein the light chain is a BLV5B8 light chain comprising the sequence set forth in SEQ ID NO: 115 or a humanized variant thereof.

145. The method of claim 140, wherein the light chain is a human light chain.

146. The method of claim 140 or claim 145, wherein the light chain is selected from the group consisting of VL1-47, VL1-40, VL1-51, and VL2-18.

147. The method of any of claims 140, 145, and 146, wherein the light chain is set forth in any one of SEQ ID NO: 116-120.

148. A method for producing a soluble ultralong CDR3 knob, comprising:

(a) transforming E. coli with an expression vector encoding a fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds;

(b) culturing the bacteria under conditions permissive of expression of the fusion protein;

(c) isolating the fusion protein from supernatant of a bacterial cell lysate; and

(d) cleaving the cleavable linker of the fusion protein, thereby producing a soluble ultralong CDR3 knob comprising 1-6 disulfide bonds free of the bacterial chaperone.

149. The method of claim 148, wherein the ultralong CDR3 knob is an antibody binding protein selected by the method of any of claims 119-132.

150. The method of claim 148 or claim 149, wherein the fusion protein has increased solubility relative to the ultralong CDR3 knob alone.

151. The method of any of claims 148-150, wherein the bacterial chaperone is thioredoxin A (TrxA).

152. The method of any of claims 148-151, wherein the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106).

153. The method of any of claims 148-152, wherein cleaving the cleavable linker comprises adding enterokinase to the supernatant.

154. The method of any of claims 148-153, wherein the soluble ultralong CDR3 knob comprises a further linker to allow for cyclizing the soluble ultralong CDR3 knob via chemical or enzymatic methods, optionally wherein the further linker allows for sortase-mediated cyclization.

155. The method of claim 154, further comprising cyclizing the soluble ultralong CDR3 knob.

156. The method of any of claims 148-155, further comprising (e) removing the enterokinase and/or the bacterial chaperone from the solution comprising the soluble ultralong CDR3 knob.

157. The method of any of claims 148-156, further comprising enriching for the soluble ultralong CDR3 knob from the solution comprising the soluble ultralong CDR3 knob, optionally wherein the enriching comprises size exclusion chromatography.

158. The method of any of claims 148-157, further comprising producing a multispecific binding molecule comprising the soluble ultralong CDR3 knob.

159. The method of any of claims 148-158, wherein the ultralong CDR3 knob is 3-8 kDa or 4-5 kDa in size.

160. A fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.

161. The fusion protein of claim 160, wherein the bacterial chaperone is thioredoxin A (TrxA).

162. The fusion protein of claim 160 or claim 161, wherein the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106).

163. The fusion protein of any of claims 160-162, wherein the ultralong CDR3 knob comprises 1-6 disulfide bonds.

164. A composition comprising the fusion protein of any of claims 160-163.

165. A method of identifying a CDR3 knob sequence from an antibody sequence, the method comprising:

determining the sequence of the CDR3 knob, in which:

the CDR3 knob has the amino acid sequence length K;

the sequence begins at position X+1 and ends at X+K; and

K=L−2X;

166. The method of claim 165, wherein the antibody sequence is a bovine antibody.

167. The method of claim 165 or claim 166, wherein the CDR3-knob antibody has a sequence that is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.

168. A purified soluble ultralong CDR3 knob produced by the method of any of claims 148-159, wherein the soluble ultralong CDR3 is 25-75 amino acids in length and comprises 1-6 disulfide bonds.

169. The purified soluble ultralong CDR3 knob of claim 168, wherein the ultralong CDR3 knob is 3-8 kDa in size.

170. The purified soluble ultralong CDR3 knob of claim 168 or claim 169, wherein the ultralong CDR3 knob is 4-5 kDa in size.

171. The purified soluble ultralong CDR3 knob of any of claims 168-170, wherein:

the knob has an amino acid sequence length K;

the sequence begins at position X+1 and ends at X+K; and

K=L−2X;

wherein L is the number of amino acids in an amino acid sequence of an antibody starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the D_H region in CDR H3.
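By way of illustration only, the relationship K=L−2X recited above can be sketched in Python. The sketch assumes the caller has already located the region running from the conserved framework 3 cysteine to the conserved framework 4 tryptophan (so that L is the length of that region) and has already determined X; locating those landmarks is outside the sketch. The function name `knob_from_region`, its parameters, and the toy sequence are hypothetical and are not taken from the disclosure. The optional `n_ext` and `c_ext` parameters model an extension of up to five amino acids at the N and/or C termini.

```python
def knob_from_region(region: str, x: int, n_ext: int = 0, c_ext: int = 0) -> str:
    """Return the knob spanning 1-indexed positions X+1 .. X+K of `region`.

    region : residues from the conserved FR3 cysteine through the conserved
             FR4 tryptophan (assumed to be supplied by the caller); L = len(region)
    x      : X as defined in the claim (assumed to be supplied by the caller)
    n_ext, c_ext : optional extension (0-5 residues) at the N / C terminus
    """
    length = len(region)      # L
    k = length - 2 * x        # K = L - 2X
    if x < 0 or k <= 0:
        raise ValueError("X must be non-negative and leave K = L - 2X > 0")
    if not (0 <= n_ext <= 5 and 0 <= c_ext <= 5):
        raise ValueError("extensions are limited to 0-5 residues per terminus")

    # 1-indexed positions X+1 .. X+K map to the 0-indexed slice [X, X+K).
    start = x - n_ext
    end = x + k + c_ext
    if start < 0 or end > length:
        raise ValueError("extension runs past the ends of the region")
    return region[start:end]


# Toy, non-biological sequence: 6 flanking residues, a 5-residue "knob",
# and 6 flanking residues, so L = 17, X = 6, and K = 17 - 12 = 5.
toy_region = "CAAAAA" + "CPDGC" + "BBBBBW"
print(knob_from_region(toy_region, 6))                    # CPDGC
print(knob_from_region(toy_region, 6, n_ext=1, c_ext=2))  # ACPDGCBB
```

Because the same X is subtracted from both ends of the region, the sketch treats the residues flanking the knob as equal in number on each side, which is what K=L−2X implies.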

172. The purified soluble ultralong CDR3 knob of claim 171, wherein the antibody sequence is a bovine antibody.

173. The purified soluble ultralong CDR3 knob of claim 171 or claim 172, wherein the knob sequence has a sequence that is further extended by one, two, three, four, or five amino acids at the N and/or C termini.

174. A peptide knob sequence of length K, wherein:

the knob has an amino acid sequence length K;

the sequence begins at position X+1 and ends at X+K; and

K=L−2X;

175. The peptide knob sequence of claim 174, wherein the antibody sequence is a bovine antibody.

176. The peptide knob sequence of claim 174 or claim 175, wherein the knob sequence has a sequence that is further extended by one, two, three, four, or five amino acids at the N and/or C termini.

177. A composition comprising the purified soluble ultralong CDR3 of any of claims 168-173.

178. The composition of claim 177, further comprising a pharmaceutically acceptable carrier.

179. The composition of claim 177 or claim 178 that is formulated for parenteral administration.

180. The composition of any of claims 177-179 that is formulated for intravenous, intramuscular, topical, otic, conjunctival, nasal, inhalation, or subcutaneous administration.

181. The composition of any of claims 177-180 that is formulated for administration by inhalation.