WO1997026001A1 - Proteins and peptides for contraceptive vaccines and fertility diagnosis
- Publication number
- WO1997026001A1 (application PCT/US1997/000908)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- lys
- glu
- ser
- peptide
- val
- Prior art date
Links
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 177
- 102000004169 proteins and genes Human genes 0.000 title claims abstract description 165
- 108090000765 processed proteins & peptides Proteins 0.000 title claims abstract description 164
- 229960005486 vaccine Drugs 0.000 title claims abstract description 33
- 102000004196 processed proteins & peptides Human genes 0.000 title abstract description 47
- 239000003433 contraceptive agent Substances 0.000 title description 9
- 230000002254 contraceptive effect Effects 0.000 title description 9
- 230000035558 fertility Effects 0.000 title description 3
- 238000003745 diagnosis Methods 0.000 title description 2
- 210000001550 testis Anatomy 0.000 claims abstract description 61
- 230000014509 gene expression Effects 0.000 claims abstract description 25
- 241000124008 Mammalia Species 0.000 claims abstract description 16
- 208000000509 infertility Diseases 0.000 claims abstract description 15
- 230000036512 infertility Effects 0.000 claims abstract description 15
- 231100000535 infertility Toxicity 0.000 claims abstract description 15
- 238000003556 assay Methods 0.000 claims abstract description 14
- 150000001413 amino acids Chemical class 0.000 claims description 85
- 102100035037 Calpastatin Human genes 0.000 claims description 49
- 108010044208 calpastatin Proteins 0.000 claims description 49
- ZXJCOYBPXOBJMU-HSQGJUDPSA-N calpastatin peptide Ac 184-210 Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@H](CCSC)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CC(O)=O)NC(C)=O)[C@@H](C)O)C1=CC=C(O)C=C1 ZXJCOYBPXOBJMU-HSQGJUDPSA-N 0.000 claims description 49
- 210000003719 b-lymphocyte Anatomy 0.000 claims description 43
- 230000002163 immunogen Effects 0.000 claims description 43
- 210000004027 cell Anatomy 0.000 claims description 39
- 238000000034 method Methods 0.000 claims description 39
- 241000880493 Leptailurus serval Species 0.000 claims description 37
- 108010029485 Protein Isoforms Proteins 0.000 claims description 34
- 102000001708 Protein Isoforms Human genes 0.000 claims description 34
- 108010049041 glutamylalanine Proteins 0.000 claims description 30
- JZDHUJAFXGNDSB-WHFBIAKZSA-N Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O JZDHUJAFXGNDSB-WHFBIAKZSA-N 0.000 claims description 28
- 210000001744 T-lymphocyte Anatomy 0.000 claims description 27
- 108020004414 DNA Proteins 0.000 claims description 23
- NJJBATPLUQHRBM-IHRRRGAJSA-N Phe-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CO)C(=O)O NJJBATPLUQHRBM-IHRRRGAJSA-N 0.000 claims description 23
- MDXLPNRXCFOBTL-BZSNNMDCSA-N Tyr-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MDXLPNRXCFOBTL-BZSNNMDCSA-N 0.000 claims description 23
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 claims description 22
- 230000000392 somatic effect Effects 0.000 claims description 22
- HNXWVVHIGTZTBO-LKXGYXEUSA-N Asn-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O HNXWVVHIGTZTBO-LKXGYXEUSA-N 0.000 claims description 21
- BBBXWRGITSUJPB-YUMQZZPRSA-N Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O BBBXWRGITSUJPB-YUMQZZPRSA-N 0.000 claims description 20
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 claims description 19
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 claims description 18
- 108010064235 lysylglycine Proteins 0.000 claims description 17
- 210000001124 body fluid Anatomy 0.000 claims description 14
- 239000010839 body fluid Substances 0.000 claims description 14
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 claims description 13
- UPJODPVSKKWGDQ-KLHWPWHYSA-N His-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O UPJODPVSKKWGDQ-KLHWPWHYSA-N 0.000 claims description 13
- HGNRJCINZYHNOU-LURJTMIESA-N Lys-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(O)=O HGNRJCINZYHNOU-LURJTMIESA-N 0.000 claims description 13
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 claims description 13
- BKIOKSLLAAZYTC-KKHAAJSZSA-N Thr-Val-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O BKIOKSLLAAZYTC-KKHAAJSZSA-N 0.000 claims description 13
- 108010055341 glutamyl-glutamic acid Proteins 0.000 claims description 13
- 108010004914 prolylarginine Proteins 0.000 claims description 13
- IEIFEYBAYFSRBQ-IHRRRGAJSA-N Phe-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N IEIFEYBAYFSRBQ-IHRRRGAJSA-N 0.000 claims description 12
- HZYOWMGWKKRMBZ-BYULHYEWSA-N Val-Asp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZYOWMGWKKRMBZ-BYULHYEWSA-N 0.000 claims description 12
- XPSGESXVBSQZPL-SRVKXCTJSA-N Arg-Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XPSGESXVBSQZPL-SRVKXCTJSA-N 0.000 claims description 11
- 108010070643 prolylglutamic acid Proteins 0.000 claims description 11
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 claims description 10
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 claims description 10
- LGSANCBHSMDFDY-GARJFASQSA-N Pro-Glu-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O LGSANCBHSMDFDY-GARJFASQSA-N 0.000 claims description 10
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 claims description 9
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 claims description 9
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 claims description 9
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 claims description 9
- ZYPWIUFLYMQZBS-SRVKXCTJSA-N Asn-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZYPWIUFLYMQZBS-SRVKXCTJSA-N 0.000 claims description 8
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 claims description 8
- 102000053602 DNA Human genes 0.000 claims description 8
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 claims description 8
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 claims description 8
- CGHXMODRYJISSK-NHCYSSNCSA-N Leu-Val-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O CGHXMODRYJISSK-NHCYSSNCSA-N 0.000 claims description 8
- YEIYAQQKADPIBJ-GARJFASQSA-N Lys-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O YEIYAQQKADPIBJ-GARJFASQSA-N 0.000 claims description 8
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 claims description 8
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 claims description 8
- 108010040443 aspartyl-aspartic acid Proteins 0.000 claims description 8
- 230000004720 fertilization Effects 0.000 claims description 8
- 101100289888 Caenorhabditis elegans lys-5 gene Proteins 0.000 claims description 7
- UQTNIFUCMBFWEJ-IWGUZYHVSA-N Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CC(N)=O UQTNIFUCMBFWEJ-IWGUZYHVSA-N 0.000 claims description 7
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 claims description 6
- QXOHLNCNYLGICT-YFKPBYRVSA-N Met-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(O)=O QXOHLNCNYLGICT-YFKPBYRVSA-N 0.000 claims description 6
- LNICFEXCAHIJOR-DCAQKATOSA-N Pro-Ser-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LNICFEXCAHIJOR-DCAQKATOSA-N 0.000 claims description 6
- DYEGLQRVMBWQLD-IXOXFDKPSA-N Ser-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CO)N)O DYEGLQRVMBWQLD-IXOXFDKPSA-N 0.000 claims description 6
- 108010005942 methionylglycine Proteins 0.000 claims description 6
- 108010026333 seryl-proline Proteins 0.000 claims description 6
- DDBMKOCQWNFDBH-RHYQMDGZSA-N Arg-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O DDBMKOCQWNFDBH-RHYQMDGZSA-N 0.000 claims description 5
- LEFKSBYHUGUWLP-ACZMJKKPSA-N Asn-Ala-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LEFKSBYHUGUWLP-ACZMJKKPSA-N 0.000 claims description 5
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 claims description 5
- CKOFNWCLWRYUHK-XHNCKOQMSA-N Glu-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O CKOFNWCLWRYUHK-XHNCKOQMSA-N 0.000 claims description 5
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 claims description 4
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 claims description 4
- YKRQRPFODDJQTC-CSMHCCOUSA-N Thr-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN YKRQRPFODDJQTC-CSMHCCOUSA-N 0.000 claims description 4
- 108010005233 alanylglutamic acid Proteins 0.000 claims description 4
- 108010077515 glycylproline Proteins 0.000 claims description 4
- GJFYPBDMUGGLFR-NKWVEPMBSA-N Asn-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC(=O)N)N)C(=O)O GJFYPBDMUGGLFR-NKWVEPMBSA-N 0.000 claims description 3
- NAPNAGZWHQHZLG-ZLUOBGJFSA-N Asp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N NAPNAGZWHQHZLG-ZLUOBGJFSA-N 0.000 claims description 3
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 claims description 3
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 claims description 3
- 238000012258 culturing Methods 0.000 claims description 3
- HZPSDHRYYIORKR-WHFBIAKZSA-N Asn-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O HZPSDHRYYIORKR-WHFBIAKZSA-N 0.000 claims description 2
- SONUFGRSSMFHFN-IMJSIDKUSA-N Asn-Ser Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(O)=O SONUFGRSSMFHFN-IMJSIDKUSA-N 0.000 claims description 2
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 claims description 2
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 claims description 2
- 230000002401 inhibitory effect Effects 0.000 claims 5
- 238000012360 testing method Methods 0.000 abstract description 6
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 212
- 235000018102 proteins Nutrition 0.000 description 118
- 235000001014 amino acid Nutrition 0.000 description 81
- 239000002299 complementary DNA Substances 0.000 description 26
- 239000013598 vector Substances 0.000 description 16
- 210000001519 tissue Anatomy 0.000 description 15
- 108010092854 aspartyllysine Proteins 0.000 description 12
- 108020001507 fusion proteins Proteins 0.000 description 12
- 102000037865 fusion proteins Human genes 0.000 description 12
- 108010054155 lysyllysine Proteins 0.000 description 12
- 210000002966 serum Anatomy 0.000 description 12
- 108020004999 messenger RNA Proteins 0.000 description 11
- 238000001262 western blot Methods 0.000 description 11
- 238000012217 deletion Methods 0.000 description 10
- 230000037430 deletion Effects 0.000 description 10
- 239000000284 extract Substances 0.000 description 10
- 108020004705 Codon Proteins 0.000 description 9
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 9
- 108091028043 Nucleic acid sequence Proteins 0.000 description 9
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 9
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 9
- 125000003275 alpha amino acid group Chemical group 0.000 description 9
- 108010037850 glycylvaline Proteins 0.000 description 9
- 108010034529 leucyl-lysine Proteins 0.000 description 9
- 108010009298 lysylglutamic acid Proteins 0.000 description 9
- 239000000523 sample Substances 0.000 description 9
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 8
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 8
- 238000000636 Northern blotting Methods 0.000 description 8
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 8
- 239000000969 carrier Substances 0.000 description 8
- 108010073969 valyllysine Proteins 0.000 description 8
- 239000000427 antigen Substances 0.000 description 7
- 108091007433 antigens Proteins 0.000 description 7
- 102000036639 antigens Human genes 0.000 description 7
- 239000012634 fragment Substances 0.000 description 7
- 208000021267 infertility disease Diseases 0.000 description 7
- 108020004707 nucleic acids Proteins 0.000 description 7
- 102000039446 nucleic acids Human genes 0.000 description 7
- 150000007523 nucleic acids Chemical class 0.000 description 7
- 229920001184 polypeptide Polymers 0.000 description 7
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 6
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 6
- 108010065920 Insulin Lispro Proteins 0.000 description 6
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 6
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 6
- 241000282553 Macaca Species 0.000 description 6
- 230000000469 anti-sperm effect Effects 0.000 description 6
- 229940098773 bovine serum albumin Drugs 0.000 description 6
- 239000003153 chemical reaction reagent Substances 0.000 description 6
- 238000012216 screening Methods 0.000 description 6
- 238000013519 translation Methods 0.000 description 6
- 102000005720 Glutathione transferase Human genes 0.000 description 5
- 108010070675 Glutathione transferase Proteins 0.000 description 5
- -1 LDH-C4 Proteins 0.000 description 5
- 241000283973 Oryctolagus cuniculus Species 0.000 description 5
- 102000003923 Protein Kinase C Human genes 0.000 description 5
- 108090000315 Protein Kinase C Proteins 0.000 description 5
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 5
- 239000000463 material Substances 0.000 description 5
- 238000002360 preparation method Methods 0.000 description 5
- 125000006850 spacer group Chemical group 0.000 description 5
- 238000013518 transcription Methods 0.000 description 5
- 230000035897 transcription Effects 0.000 description 5
- 235000014393 valine Nutrition 0.000 description 5
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 4
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 4
- UZSQXCMNUPKLCC-FJXKBIBVSA-N Arg-Thr-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UZSQXCMNUPKLCC-FJXKBIBVSA-N 0.000 description 4
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 4
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 4
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 4
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 4
- BFEZQZKEPRKKHV-SRVKXCTJSA-N Glu-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O BFEZQZKEPRKKHV-SRVKXCTJSA-N 0.000 description 4
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 4
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 4
- SLFSYFJKSIVSON-SRVKXCTJSA-N His-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N SLFSYFJKSIVSON-SRVKXCTJSA-N 0.000 description 4
- PDQDCFBVYXEFSD-SRVKXCTJSA-N Leu-Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 4
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 4
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 4
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 4
- 108700015872 N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine Proteins 0.000 description 4
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 4
- KWMZPPWYBVZIER-XGEHTFHBSA-N Pro-Ser-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWMZPPWYBVZIER-XGEHTFHBSA-N 0.000 description 4
- HJEBZBMOTCQYDN-ACZMJKKPSA-N Ser-Glu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJEBZBMOTCQYDN-ACZMJKKPSA-N 0.000 description 4
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 4
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 4
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 4
- VBMOVTMNHWPZJR-SUSMZKCASA-N Thr-Thr-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VBMOVTMNHWPZJR-SUSMZKCASA-N 0.000 description 4
- BGXVHVMJZCSOCA-AVGNSLFASA-N Val-Pro-Lys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N BGXVHVMJZCSOCA-AVGNSLFASA-N 0.000 description 4
- 108010087924 alanylproline Proteins 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 108010062796 arginyllysine Proteins 0.000 description 4
- 108010093581 aspartyl-proline Proteins 0.000 description 4
- 108010047857 aspartylglycine Proteins 0.000 description 4
- 108010068265 aspartyltyrosine Proteins 0.000 description 4
- 230000008878 coupling Effects 0.000 description 4
- 238000010168 coupling process Methods 0.000 description 4
- 238000005859 coupling reaction Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 4
- 108010050848 glycylleucine Proteins 0.000 description 4
- 108010015792 glycyllysine Proteins 0.000 description 4
- 238000002169 hydrotherapy Methods 0.000 description 4
- 230000001900 immune effect Effects 0.000 description 4
- 230000003053 immunization Effects 0.000 description 4
- 238000003018 immunoassay Methods 0.000 description 4
- 238000010166 immunofluorescence Methods 0.000 description 4
- 210000004185 liver Anatomy 0.000 description 4
- 108010003700 lysyl aspartic acid Proteins 0.000 description 4
- 230000026731 phosphorylation Effects 0.000 description 4
- 238000006366 phosphorylation reaction Methods 0.000 description 4
- 108010077112 prolyl-proline Proteins 0.000 description 4
- 239000007787 solid Substances 0.000 description 4
- 241000894007 species Species 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- 102000007590 Calpain Human genes 0.000 description 3
- 108010032088 Calpain Proteins 0.000 description 3
- 108091026890 Coding region Proteins 0.000 description 3
- 108091035707 Consensus sequence Proteins 0.000 description 3
- 238000002965 ELISA Methods 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- LSPKYLAFTPBWIL-BYPYZUCNSA-N Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(O)=O LSPKYLAFTPBWIL-BYPYZUCNSA-N 0.000 description 3
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 3
- SITLTJHOQZFJGG-XPUUQOCRSA-N Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 3
- 108090000144 Human Proteins Proteins 0.000 description 3
- 102000003839 Human Proteins Human genes 0.000 description 3
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 3
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 3
- GRSLLFZTTLBOQX-CIUDSAMLSA-N Ser-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N GRSLLFZTTLBOQX-CIUDSAMLSA-N 0.000 description 3
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 3
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 3
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 3
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 3
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 3
- 238000002835 absorbance Methods 0.000 description 3
- 239000002671 adjuvant Substances 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 238000002649 immunization Methods 0.000 description 3
- 238000002347 injection Methods 0.000 description 3
- 239000007924 injection Substances 0.000 description 3
- 150000002605 large molecules Chemical class 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 239000013612 plasmid Substances 0.000 description 3
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 3
- 230000010076 replication Effects 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 229960000814 tetanus toxoid Drugs 0.000 description 3
- 108010061238 threonyl-glycine Proteins 0.000 description 3
- 239000004474 valine Substances 0.000 description 3
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 2
- TTXMOJWKNRJWQJ-FXQIFTODSA-N Ala-Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N TTXMOJWKNRJWQJ-FXQIFTODSA-N 0.000 description 2
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 2
- BVSGPHDECMJBDE-HGNGGELXSA-N Ala-Glu-His Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N BVSGPHDECMJBDE-HGNGGELXSA-N 0.000 description 2
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 2
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 2
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 2
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 2
- FQNILRVJOJBFFC-FXQIFTODSA-N Ala-Pro-Asp Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N FQNILRVJOJBFFC-FXQIFTODSA-N 0.000 description 2
- NHWYNIZWLJYZAG-XVYDVKMFSA-N Ala-Ser-His Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N NHWYNIZWLJYZAG-XVYDVKMFSA-N 0.000 description 2
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 2
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 2
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 2
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 2
- VBFJESQBIWCWRL-DCAQKATOSA-N Arg-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCNC(N)=N VBFJESQBIWCWRL-DCAQKATOSA-N 0.000 description 2
- VWVPYNGMOCSSGK-GUBZILKMSA-N Arg-Arg-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O VWVPYNGMOCSSGK-GUBZILKMSA-N 0.000 description 2
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 2
- OVVUNXXROOFSIM-SDDRHHMPSA-N Arg-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O OVVUNXXROOFSIM-SDDRHHMPSA-N 0.000 description 2
- FBLMOFHNVQBKRR-IHRRRGAJSA-N Arg-Asp-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FBLMOFHNVQBKRR-IHRRRGAJSA-N 0.000 description 2
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 2
- SKTGPBFTMNLIHQ-KKUMJFAQSA-N Arg-Glu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SKTGPBFTMNLIHQ-KKUMJFAQSA-N 0.000 description 2
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 2
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 2
- DIIGDGJKTMLQQW-IHRRRGAJSA-N Arg-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N DIIGDGJKTMLQQW-IHRRRGAJSA-N 0.000 description 2
- AMIQZQAAYGYKOP-FXQIFTODSA-N Arg-Ser-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O AMIQZQAAYGYKOP-FXQIFTODSA-N 0.000 description 2
- JOTRDIXZHNQYGP-DCAQKATOSA-N Arg-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N JOTRDIXZHNQYGP-DCAQKATOSA-N 0.000 description 2
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 2
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 2
- DRDWXKWUSIKKOB-PJODQICGSA-N Arg-Trp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O DRDWXKWUSIKKOB-PJODQICGSA-N 0.000 description 2
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 2
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 2
- IARGXWMWRFOQPG-GCJQMDKQSA-N Asn-Ala-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IARGXWMWRFOQPG-GCJQMDKQSA-N 0.000 description 2
- KSBHCUSPLWRVEK-ZLUOBGJFSA-N Asn-Asn-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KSBHCUSPLWRVEK-ZLUOBGJFSA-N 0.000 description 2
- HXWUJJADFMXNKA-BQBZGAKWSA-N Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(N)=O HXWUJJADFMXNKA-BQBZGAKWSA-N 0.000 description 2
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 2
- OSZBYGVKAFZWKC-FXQIFTODSA-N Asn-Pro-Cys Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(O)=O OSZBYGVKAFZWKC-FXQIFTODSA-N 0.000 description 2
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 2
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 2
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 2
- MJIJBEYEHBKTIM-BYULHYEWSA-N Asn-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MJIJBEYEHBKTIM-BYULHYEWSA-N 0.000 description 2
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 2
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 2
- QOVWVLLHMMCFFY-ZLUOBGJFSA-N Asp-Asp-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QOVWVLLHMMCFFY-ZLUOBGJFSA-N 0.000 description 2
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 2
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 2
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 2
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 2
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 2
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 2
- WSXDIZFNQYTUJB-SRVKXCTJSA-N Asp-His-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O WSXDIZFNQYTUJB-SRVKXCTJSA-N 0.000 description 2
- SWTQDYFZVOJVLL-KKUMJFAQSA-N Asp-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)O SWTQDYFZVOJVLL-KKUMJFAQSA-N 0.000 description 2
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 2
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 2
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 2
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 2
- OAMLVOVXNKILLQ-BQBZGAKWSA-N Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC(O)=O OAMLVOVXNKILLQ-BQBZGAKWSA-N 0.000 description 2
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 2
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 2
- YZQCXOFQZKCETR-UWVGGRQHSA-N Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YZQCXOFQZKCETR-UWVGGRQHSA-N 0.000 description 2
- UKGGPJNBONZZCM-WDSKDSINSA-N Asp-Pro Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(O)=O UKGGPJNBONZZCM-WDSKDSINSA-N 0.000 description 2
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 2
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 2
- RVMXMLSYBTXCAV-VEVYYDQMSA-N Asp-Pro-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMXMLSYBTXCAV-VEVYYDQMSA-N 0.000 description 2
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 2
- FIAKNCXQFFKSSI-ZLUOBGJFSA-N Asp-Ser-Cys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O FIAKNCXQFFKSSI-ZLUOBGJFSA-N 0.000 description 2
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 2
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 2
- GCDLPNRHPWBKJJ-WDSKDSINSA-N Cys-Gly-Glu Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GCDLPNRHPWBKJJ-WDSKDSINSA-N 0.000 description 2
- NXTYATMDWQYLGJ-BQBZGAKWSA-N Cys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CS NXTYATMDWQYLGJ-BQBZGAKWSA-N 0.000 description 2
- UCSXXFRXHGUXCQ-SRVKXCTJSA-N Cys-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N UCSXXFRXHGUXCQ-SRVKXCTJSA-N 0.000 description 2
- MKVKKORBPTUSNX-LPEHRKFASA-N Cys-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N MKVKKORBPTUSNX-LPEHRKFASA-N 0.000 description 2
- WZJLBUPPZRZNTO-CIUDSAMLSA-N Cys-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N WZJLBUPPZRZNTO-CIUDSAMLSA-N 0.000 description 2
- IRKLTAKLAFUTLA-KATARQTJSA-N Cys-Thr-Lys Chemical compound C[C@@H](O)[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CCCCN)C(O)=O IRKLTAKLAFUTLA-KATARQTJSA-N 0.000 description 2
- OELDIVRKHTYFNG-WDSKDSINSA-N Cys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CS OELDIVRKHTYFNG-WDSKDSINSA-N 0.000 description 2
- MHYHLWUGWUBUHF-GUBZILKMSA-N Cys-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CS)N MHYHLWUGWUBUHF-GUBZILKMSA-N 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 241000206602 Eukaryota Species 0.000 description 2
- WZZSKAJIHTUUSG-ACZMJKKPSA-N Glu-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O WZZSKAJIHTUUSG-ACZMJKKPSA-N 0.000 description 2
- DIXKFOPPGWKZLY-CIUDSAMLSA-N Glu-Arg-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O DIXKFOPPGWKZLY-CIUDSAMLSA-N 0.000 description 2
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 2
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 2
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 2
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 2
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 2
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 2
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 2
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 2
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 2
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 2
- LYCDZGLXQBPNQU-WDSKDSINSA-N Glu-Gly-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CS)C(O)=O LYCDZGLXQBPNQU-WDSKDSINSA-N 0.000 description 2
- RAUDKMVXNOWDLS-WDSKDSINSA-N Glu-Gly-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O RAUDKMVXNOWDLS-WDSKDSINSA-N 0.000 description 2
- BRKUZSLQMPNVFN-SRVKXCTJSA-N Glu-His-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BRKUZSLQMPNVFN-SRVKXCTJSA-N 0.000 description 2
- DRLVXRQFROIYTD-GUBZILKMSA-N Glu-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N DRLVXRQFROIYTD-GUBZILKMSA-N 0.000 description 2
- YBAFDPFAUTYYRW-YUMQZZPRSA-N Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O YBAFDPFAUTYYRW-YUMQZZPRSA-N 0.000 description 2
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 2
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 2
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 2
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 2
- BCYGDJXHAGZNPQ-DCAQKATOSA-N Glu-Lys-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O BCYGDJXHAGZNPQ-DCAQKATOSA-N 0.000 description 2
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 2
- FMBWLLMUPXTXFC-SDDRHHMPSA-N Glu-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N)C(=O)O FMBWLLMUPXTXFC-SDDRHHMPSA-N 0.000 description 2
- QMOSCLNJVKSHHU-YUMQZZPRSA-N Glu-Met-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QMOSCLNJVKSHHU-YUMQZZPRSA-N 0.000 description 2
- YBTCBQBIJKGSJP-BQBZGAKWSA-N Glu-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(O)=O YBTCBQBIJKGSJP-BQBZGAKWSA-N 0.000 description 2
- DXVOKNVIKORTHQ-GUBZILKMSA-N Glu-Pro-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O DXVOKNVIKORTHQ-GUBZILKMSA-N 0.000 description 2
- DCBSZJJHOTXMHY-DCAQKATOSA-N Glu-Pro-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DCBSZJJHOTXMHY-DCAQKATOSA-N 0.000 description 2
- BIYNPVYAZOUVFQ-CIUDSAMLSA-N Glu-Pro-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O BIYNPVYAZOUVFQ-CIUDSAMLSA-N 0.000 description 2
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 2
- JSIQVRIXMINMTA-ZDLURKLDSA-N Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O JSIQVRIXMINMTA-ZDLURKLDSA-N 0.000 description 2
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 2
- NTHIHAUEXVTXQG-KKUMJFAQSA-N Glu-Tyr-Arg Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O NTHIHAUEXVTXQG-KKUMJFAQSA-N 0.000 description 2
- UUTGYDAKPISJAO-JYJNAYRXSA-N Glu-Tyr-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 UUTGYDAKPISJAO-JYJNAYRXSA-N 0.000 description 2
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 2
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 2
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 2
- FZQLXNIMCPJVJE-YUMQZZPRSA-N Gly-Asp-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FZQLXNIMCPJVJE-YUMQZZPRSA-N 0.000 description 2
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 2
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 2
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 2
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 2
- IKAIKUBBJHFNBZ-LURJTMIESA-N Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CN IKAIKUBBJHFNBZ-LURJTMIESA-N 0.000 description 2
- SJLKKOZFHSJJAW-YUMQZZPRSA-N Gly-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN SJLKKOZFHSJJAW-YUMQZZPRSA-N 0.000 description 2
- JJGBXTYGTKWGAT-YUMQZZPRSA-N Gly-Pro-Glu Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O JJGBXTYGTKWGAT-YUMQZZPRSA-N 0.000 description 2
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 2
- YGHSQRJSHKYUJY-SCZZXKLOSA-N Gly-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN YGHSQRJSHKYUJY-SCZZXKLOSA-N 0.000 description 2
- WSDOHRLQDGAOGU-BQBZGAKWSA-N His-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 WSDOHRLQDGAOGU-BQBZGAKWSA-N 0.000 description 2
- MMFKFJORZBJVNF-UWVGGRQHSA-N His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 MMFKFJORZBJVNF-UWVGGRQHSA-N 0.000 description 2
- YAALVYQFVJNXIV-KKUMJFAQSA-N His-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 YAALVYQFVJNXIV-KKUMJFAQSA-N 0.000 description 2
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 2
- TWROVBNEHJSXDG-IHRRRGAJSA-N His-Leu-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O TWROVBNEHJSXDG-IHRRRGAJSA-N 0.000 description 2
- TVMNTHXFRSXZGR-IHRRRGAJSA-N His-Lys-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O TVMNTHXFRSXZGR-IHRRRGAJSA-N 0.000 description 2
- SOYCWSKCUVDLMC-AVGNSLFASA-N His-Pro-Arg Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCNC(=N)N)C(=O)O SOYCWSKCUVDLMC-AVGNSLFASA-N 0.000 description 2
- IAYPZSHNZQHQNO-KKUMJFAQSA-N His-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC2=CN=CN2)N IAYPZSHNZQHQNO-KKUMJFAQSA-N 0.000 description 2
- HFKJBCPRWWGPEY-BQBZGAKWSA-N L-arginyl-L-glutamic acid Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HFKJBCPRWWGPEY-BQBZGAKWSA-N 0.000 description 2
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 2
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 2
- DLFAACQHIRSQGG-CIUDSAMLSA-N Leu-Asp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O DLFAACQHIRSQGG-CIUDSAMLSA-N 0.000 description 2
- ULXYQAJWJGLCNR-YUMQZZPRSA-N Leu-Asp-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O ULXYQAJWJGLCNR-YUMQZZPRSA-N 0.000 description 2
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 2
- NHHKSOGJYNQENP-SRVKXCTJSA-N Leu-Cys-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N NHHKSOGJYNQENP-SRVKXCTJSA-N 0.000 description 2
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 2
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 2
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 2
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 2
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 2
- OTXBNHIUIHNGAO-UWVGGRQHSA-N Leu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN OTXBNHIUIHNGAO-UWVGGRQHSA-N 0.000 description 2
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 2
- DCGXHWINSHEPIR-SRVKXCTJSA-N Leu-Lys-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)O)N DCGXHWINSHEPIR-SRVKXCTJSA-N 0.000 description 2
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 2
- FIICHHJDINDXKG-IHPCNDPISA-N Leu-Lys-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O FIICHHJDINDXKG-IHPCNDPISA-N 0.000 description 2
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 2
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 2
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 2
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 2
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 2
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 2
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 2
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 2
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 2
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 2
- QUCDKEKDPYISNX-HJGDQZAQSA-N Lys-Asn-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QUCDKEKDPYISNX-HJGDQZAQSA-N 0.000 description 2
- IBQMEXQYZMVIFU-SRVKXCTJSA-N Lys-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N IBQMEXQYZMVIFU-SRVKXCTJSA-N 0.000 description 2
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 2
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 2
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 2
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 2
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 2
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 2
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 2
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 2
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 2
- YXPJCVNIDDKGOE-MELADBBJSA-N Lys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N)C(=O)O YXPJCVNIDDKGOE-MELADBBJSA-N 0.000 description 2
- AFLBTVGQCQLOFJ-AVGNSLFASA-N Lys-Pro-Arg Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AFLBTVGQCQLOFJ-AVGNSLFASA-N 0.000 description 2
- JCVOHUKUYSYBAD-DCAQKATOSA-N Lys-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCCCN)N)C(=O)N[C@@H](CS)C(=O)O JCVOHUKUYSYBAD-DCAQKATOSA-N 0.000 description 2
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 2
- CRIODIGWCUPXKU-AVGNSLFASA-N Lys-Pro-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O CRIODIGWCUPXKU-AVGNSLFASA-N 0.000 description 2
- UQJOKDAYFULYIX-AVGNSLFASA-N Lys-Pro-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 UQJOKDAYFULYIX-AVGNSLFASA-N 0.000 description 2
- YSPZCHGIWAQVKQ-AVGNSLFASA-N Lys-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN YSPZCHGIWAQVKQ-AVGNSLFASA-N 0.000 description 2
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 2
- DYJOORGDQIGZAS-DCAQKATOSA-N Lys-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N DYJOORGDQIGZAS-DCAQKATOSA-N 0.000 description 2
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 2
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 2
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 2
- YUTZYVTZDVZBJJ-IHPCNDPISA-N Lys-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 YUTZYVTZDVZBJJ-IHPCNDPISA-N 0.000 description 2
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 2
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 2
- TZLYIHDABYBOCJ-FXQIFTODSA-N Met-Asp-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O TZLYIHDABYBOCJ-FXQIFTODSA-N 0.000 description 2
- RMHHNLKYPOOKQN-FXQIFTODSA-N Met-Cys-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O RMHHNLKYPOOKQN-FXQIFTODSA-N 0.000 description 2
- MHQXIBRPDKXDGZ-ZFWWWQNUSA-N Met-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 MHQXIBRPDKXDGZ-ZFWWWQNUSA-N 0.000 description 2
- ZBLSZPYQQRIHQU-RCWTZXSCSA-N Met-Thr-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O ZBLSZPYQQRIHQU-RCWTZXSCSA-N 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 2
- 108700026244 Open Reading Frames Proteins 0.000 description 2
- 108091005804 Peptidases Proteins 0.000 description 2
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 2
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 2
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 2
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 2
- GMWNQSGWWGKTSF-LFSVMHDDSA-N Phe-Thr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMWNQSGWWGKTSF-LFSVMHDDSA-N 0.000 description 2
- JHSRGEODDALISP-XVSYOHENSA-N Phe-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O JHSRGEODDALISP-XVSYOHENSA-N 0.000 description 2
- NJONQBYLTANINY-IHPCNDPISA-N Phe-Trp-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(N)=O)C(O)=O NJONQBYLTANINY-IHPCNDPISA-N 0.000 description 2
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 2
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 2
- ZSKJPKFTPQCPIH-RCWTZXSCSA-N Pro-Arg-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSKJPKFTPQCPIH-RCWTZXSCSA-N 0.000 description 2
- ILMLVTGTUJPQFP-FXQIFTODSA-N Pro-Asp-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ILMLVTGTUJPQFP-FXQIFTODSA-N 0.000 description 2
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 2
- ZCXQTRXYZOSGJR-FXQIFTODSA-N Pro-Asp-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZCXQTRXYZOSGJR-FXQIFTODSA-N 0.000 description 2
- XUSDDSLCRPUKLP-QXEWZRGKSA-N Pro-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 XUSDDSLCRPUKLP-QXEWZRGKSA-N 0.000 description 2
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 2
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 2
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 2
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 2
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 2
- XQPHBAKJJJZOBX-SRVKXCTJSA-N Pro-Lys-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O XQPHBAKJJJZOBX-SRVKXCTJSA-N 0.000 description 2
- RWCOTTLHDJWHRS-YUMQZZPRSA-N Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 RWCOTTLHDJWHRS-YUMQZZPRSA-N 0.000 description 2
- JLMZKEQFMVORMA-SRVKXCTJSA-N Pro-Pro-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 JLMZKEQFMVORMA-SRVKXCTJSA-N 0.000 description 2
- FHZJRBVMLGOHBX-GUBZILKMSA-N Pro-Pro-Asp Chemical compound OC(=O)C[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1)C(O)=O FHZJRBVMLGOHBX-GUBZILKMSA-N 0.000 description 2
- IURWWZYKYPEANQ-HJGDQZAQSA-N Pro-Thr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IURWWZYKYPEANQ-HJGDQZAQSA-N 0.000 description 2
- VVAWNPIOYXAMAL-KJEVXHAQSA-N Pro-Thr-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VVAWNPIOYXAMAL-KJEVXHAQSA-N 0.000 description 2
- FHJQROWZEJFZPO-SRVKXCTJSA-N Pro-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FHJQROWZEJFZPO-SRVKXCTJSA-N 0.000 description 2
- 239000004365 Protease Substances 0.000 description 2
- FIXILCYTSAUERA-FXQIFTODSA-N Ser-Ala-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FIXILCYTSAUERA-FXQIFTODSA-N 0.000 description 2
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 2
- BRKHVZNDAOMAHX-BIIVOSGPSA-N Ser-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N BRKHVZNDAOMAHX-BIIVOSGPSA-N 0.000 description 2
- YUSRGTQIPCJNHQ-CIUDSAMLSA-N Ser-Arg-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O YUSRGTQIPCJNHQ-CIUDSAMLSA-N 0.000 description 2
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 2
- WXUBSIDKNMFAGS-IHRRRGAJSA-N Ser-Arg-Tyr Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXUBSIDKNMFAGS-IHRRRGAJSA-N 0.000 description 2
- CNIIKZQXBBQHCX-FXQIFTODSA-N Ser-Asp-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O CNIIKZQXBBQHCX-FXQIFTODSA-N 0.000 description 2
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 2
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 2
- GYXVUTAOICLGKJ-ACZMJKKPSA-N Ser-Glu-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N GYXVUTAOICLGKJ-ACZMJKKPSA-N 0.000 description 2
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 2
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 2
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 2
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 2
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 2
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 2
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 2
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 2
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 2
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 2
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 2
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 2
- OVQZAFXWIWNYKA-GUBZILKMSA-N Ser-Pro-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CO)N OVQZAFXWIWNYKA-GUBZILKMSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 2
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 2
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 2
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 2
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 2
- PQLXHSACXPGWPD-GSSVUCPTSA-N Thr-Asn-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PQLXHSACXPGWPD-GSSVUCPTSA-N 0.000 description 2
- FHDLKMFZKRUQCE-HJGDQZAQSA-N Thr-Glu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHDLKMFZKRUQCE-HJGDQZAQSA-N 0.000 description 2
- LGNBRHZANHMZHK-NUMRIWBASA-N Thr-Glu-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O LGNBRHZANHMZHK-NUMRIWBASA-N 0.000 description 2
- BIYXEUAFGLTAEM-WUJLRWPWSA-N Thr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(O)=O BIYXEUAFGLTAEM-WUJLRWPWSA-N 0.000 description 2
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 2
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 2
- BQBCIBCLXBKYHW-CSMHCCOUSA-N Thr-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@@H]([NH3+])[C@@H](C)O BQBCIBCLXBKYHW-CSMHCCOUSA-N 0.000 description 2
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 2
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 2
- XIULAFZYEKSGAJ-IXOXFDKPSA-N Thr-Leu-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 XIULAFZYEKSGAJ-IXOXFDKPSA-N 0.000 description 2
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 2
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- WTMPKZWHRCMMMT-KZVJFYERSA-N Thr-Pro-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WTMPKZWHRCMMMT-KZVJFYERSA-N 0.000 description 2
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 2
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 2
- CSNBWOJOEOPYIJ-UVOCVTCTSA-N Thr-Thr-Lys Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O CSNBWOJOEOPYIJ-UVOCVTCTSA-N 0.000 description 2
- MZDJYWGXAIEYEP-BPUTZDHNSA-N Trp-Cys-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N MZDJYWGXAIEYEP-BPUTZDHNSA-N 0.000 description 2
- YTHWAWACWGWBLE-MNSWYVGCSA-N Trp-Tyr-Thr Chemical compound C([C@@H](C(=O)N[C@@H]([C@H](O)C)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CC=C(O)C=C1 YTHWAWACWGWBLE-MNSWYVGCSA-N 0.000 description 2
- HKYTWJOWZTWBQB-AVGNSLFASA-N Tyr-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 HKYTWJOWZTWBQB-AVGNSLFASA-N 0.000 description 2
- YSGAPESOXHFTQY-IHRRRGAJSA-N Tyr-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N YSGAPESOXHFTQY-IHRRRGAJSA-N 0.000 description 2
- TYFLVOUZHQUBGM-IHRRRGAJSA-N Tyr-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TYFLVOUZHQUBGM-IHRRRGAJSA-N 0.000 description 2
- KLQPIEVIKOQRAW-IZPVPAKOSA-N Tyr-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O KLQPIEVIKOQRAW-IZPVPAKOSA-N 0.000 description 2
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 2
- ZIGZPYJXIWLQFC-QTKMDUPCSA-N Val-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C(C)C)N)O ZIGZPYJXIWLQFC-QTKMDUPCSA-N 0.000 description 2
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 2
- CXWJFWAZIVWBOS-XQQFMLRXSA-N Val-Lys-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CXWJFWAZIVWBOS-XQQFMLRXSA-N 0.000 description 2
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 2
- OJOMXGVLFKYDKP-QXEWZRGKSA-N Val-Met-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OJOMXGVLFKYDKP-QXEWZRGKSA-N 0.000 description 2
- YTNGABPUXFEOGU-SRVKXCTJSA-N Val-Pro-Arg Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O YTNGABPUXFEOGU-SRVKXCTJSA-N 0.000 description 2
- DOFAQXCYFQKSHT-SRVKXCTJSA-N Val-Pro-Pro Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DOFAQXCYFQKSHT-SRVKXCTJSA-N 0.000 description 2
- VIKZGAUAKQZDOF-NRPADANISA-N Val-Ser-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O VIKZGAUAKQZDOF-NRPADANISA-N 0.000 description 2
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 2
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 2
- JXCOEPXCBVCTRD-JYJNAYRXSA-N Val-Tyr-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JXCOEPXCBVCTRD-JYJNAYRXSA-N 0.000 description 2
- 230000030120 acrosome reaction Effects 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- 108010070944 alanylhistidine Proteins 0.000 description 2
- NWMHDZMRVUOQGL-CZEIJOLGSA-N almurtide Chemical compound OC(=O)CC[C@H](C(N)=O)NC(=O)[C@H](C)NC(=O)CO[C@@H]([C@H](O)[C@H](O)CO)[C@@H](NC(C)=O)C=O NWMHDZMRVUOQGL-CZEIJOLGSA-N 0.000 description 2
- 210000004102 animal cell Anatomy 0.000 description 2
- 108010013835 arginine glutamate Proteins 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 230000000903 blocking effect Effects 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 244000309464 bull Species 0.000 description 2
- 210000003756 cervix mucus Anatomy 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- 108010060199 cysteinylproline Proteins 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 235000013601 eggs Nutrition 0.000 description 2
- 239000002158 endotoxin Substances 0.000 description 2
- 239000013604 expression vector Substances 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 2
- 108010037389 glutamyl-cysteinyl-lysine Proteins 0.000 description 2
- 108010079413 glycyl-prolyl-glutamic acid Proteins 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 108010040030 histidinoalanine Proteins 0.000 description 2
- 108010025306 histidylleucine Proteins 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 230000005847 immunogenicity Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 2
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 2
- 229920006008 lipopolysaccharide Polymers 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 108010085203 methionylmethionine Proteins 0.000 description 2
- 239000004005 microsphere Substances 0.000 description 2
- 210000004681 ovum Anatomy 0.000 description 2
- 108010084572 phenylalanyl-valine Proteins 0.000 description 2
- 108010025488 pinealon Proteins 0.000 description 2
- 230000009257 reactivity Effects 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 230000028327 secretion Effects 0.000 description 2
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 2
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 2
- 239000007790 solid phase Substances 0.000 description 2
- 238000010532 solid phase synthesis reaction Methods 0.000 description 2
- 238000004659 sterilization and disinfection Methods 0.000 description 2
- 230000004936 stimulating effect Effects 0.000 description 2
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 2
- 239000003053 toxin Substances 0.000 description 2
- 231100000765 toxin Toxicity 0.000 description 2
- 108700012359 toxins Proteins 0.000 description 2
- 230000005030 transcription termination Effects 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- 150000003680 valines Chemical class 0.000 description 2
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 2
- 238000005406 washing Methods 0.000 description 2
- LYLMANSTAHZWBY-UHFFFAOYSA-N 1-[6-(1-hydroxy-2,5-dioxopyrrolidin-3-yl)-6-oxohexyl]pyrrole-2,5-dione Chemical compound O=C1N(O)C(=O)CC1C(=O)CCCCCN1C(=O)C=CC1=O LYLMANSTAHZWBY-UHFFFAOYSA-N 0.000 description 1
- PJUPKRYGDFTMTM-UHFFFAOYSA-N 1-hydroxybenzotriazole;hydrate Chemical compound O.C1=CC=C2N(O)N=NC2=C1 PJUPKRYGDFTMTM-UHFFFAOYSA-N 0.000 description 1
- SINBGNJPYWNUQI-UHFFFAOYSA-N 2,2,2-trifluoro-1-imidazol-1-ylethanone Chemical compound FC(F)(F)C(=O)N1C=CN=C1 SINBGNJPYWNUQI-UHFFFAOYSA-N 0.000 description 1
- FPQQSJJWHUJYPU-UHFFFAOYSA-N 3-(dimethylamino)propyliminomethylidene-ethylazanium;chloride Chemical compound Cl.CCN=C=NCCCN(C)C FPQQSJJWHUJYPU-UHFFFAOYSA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 1
- 108010088751 Albumins Proteins 0.000 description 1
- 102000009027 Albumins Human genes 0.000 description 1
- 239000004382 Amylase Substances 0.000 description 1
- 102000013142 Amylases Human genes 0.000 description 1
- 108010065511 Amylases Proteins 0.000 description 1
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 1
- 101000800130 Bos taurus Thyroglobulin Proteins 0.000 description 1
- 102100025905 C-Jun-amino-terminal kinase-interacting protein 4 Human genes 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 108010078791 Carrier Proteins Proteins 0.000 description 1
- 102000005575 Cellulases Human genes 0.000 description 1
- 108010084185 Cellulases Proteins 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M Chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 102000005927 Cysteine Proteases Human genes 0.000 description 1
- 108010005843 Cysteine Proteases Proteins 0.000 description 1
- 239000003298 DNA probe Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 101150066516 GST gene Proteins 0.000 description 1
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- SXRSQZLOMIGNAQ-UHFFFAOYSA-N Glutaraldehyde Chemical compound O=CCCCC=O SXRSQZLOMIGNAQ-UHFFFAOYSA-N 0.000 description 1
- AEMRFAOFKBGASW-UHFFFAOYSA-N Glycolic acid Polymers OCC(O)=O AEMRFAOFKBGASW-UHFFFAOYSA-N 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101001076862 Homo sapiens C-Jun-amino-terminal kinase-interacting protein 4 Proteins 0.000 description 1
- 101001019513 Homo sapiens Calpastatin Proteins 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 108091006905 Human Serum Albumin Proteins 0.000 description 1
- 102000008100 Human Serum Albumin Human genes 0.000 description 1
- 108060003951 Immunoglobulin Proteins 0.000 description 1
- 108010044467 Isoenzymes Proteins 0.000 description 1
- 102000003855 L-lactate dehydrogenase Human genes 0.000 description 1
- 108700023483 L-lactate dehydrogenases Proteins 0.000 description 1
- 102000004882 Lipase Human genes 0.000 description 1
- 108090001060 Lipase Proteins 0.000 description 1
- 239000004367 Lipase Substances 0.000 description 1
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 1
- UGTZHPSKYRIGRJ-YUMQZZPRSA-N Lys-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UGTZHPSKYRIGRJ-YUMQZZPRSA-N 0.000 description 1
- RFQATBGBLDAKGI-VHSXEESVSA-N Lys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCCN)N)C(=O)O RFQATBGBLDAKGI-VHSXEESVSA-N 0.000 description 1
- JQSIGLHQNSZZRL-KKUMJFAQSA-N Lys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N JQSIGLHQNSZZRL-KKUMJFAQSA-N 0.000 description 1
- IHITVQKJXQQGLJ-LPEHRKFASA-N Met-Asn-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N IHITVQKJXQQGLJ-LPEHRKFASA-N 0.000 description 1
- YORIKIDJCPKBON-YUMQZZPRSA-N Met-Glu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YORIKIDJCPKBON-YUMQZZPRSA-N 0.000 description 1
- NQTADLQHYWFPDB-UHFFFAOYSA-N N-Hydroxysuccinimide Chemical compound ON1C(=O)CCC1=O NQTADLQHYWFPDB-UHFFFAOYSA-N 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- 239000004677 Nylon Substances 0.000 description 1
- 108010058846 Ovalbumin Proteins 0.000 description 1
- 241000237988 Patellidae Species 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 239000004698 Polyethylene Substances 0.000 description 1
- 229920000954 Polyglycolide Polymers 0.000 description 1
- 108010039918 Polylysine Proteins 0.000 description 1
- 239000004743 Polypropylene Substances 0.000 description 1
- 239000004793 Polystyrene Substances 0.000 description 1
- MTHRMUXESFIAMS-DCAQKATOSA-N Pro-Asn-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O MTHRMUXESFIAMS-DCAQKATOSA-N 0.000 description 1
- STASJMBVVHNWCG-IHRRRGAJSA-N Pro-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 STASJMBVVHNWCG-IHRRRGAJSA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 1
- OHKFXGKHSJKKAL-NRPADANISA-N Ser-Glu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OHKFXGKHSJKKAL-NRPADANISA-N 0.000 description 1
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- VGYBYGQXZJDZJU-XQXXSGGOSA-N Thr-Glu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VGYBYGQXZJDZJU-XQXXSGGOSA-N 0.000 description 1
- KBLYJPQSNGTDIU-LOKLDPHHSA-N Thr-Glu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O KBLYJPQSNGTDIU-LOKLDPHHSA-N 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 108010034949 Thyroglobulin Proteins 0.000 description 1
- 102000009843 Thyroglobulin Human genes 0.000 description 1
- 241000209140 Triticum Species 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 108091023045 Untranslated Region Proteins 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- AZDRQVAHHNSJOQ-UHFFFAOYSA-N alumane Chemical class [AlH3] AZDRQVAHHNSJOQ-UHFFFAOYSA-N 0.000 description 1
- 235000019418 amylase Nutrition 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 229920001400 block copolymer Polymers 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 150000001720 carbohydrates Chemical class 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 239000012876 carrier material Substances 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 210000000170 cell membrane Anatomy 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 230000001268 conjugating effect Effects 0.000 description 1
- 229940124462 contraceptive vaccine Drugs 0.000 description 1
- 239000007822 coupling agent Substances 0.000 description 1
- 230000037029 cross reaction Effects 0.000 description 1
- ATDGTVJJHBUTRL-UHFFFAOYSA-N cyanogen bromide Chemical compound BrC#N ATDGTVJJHBUTRL-UHFFFAOYSA-N 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000002939 deleterious effect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 230000000463 effect on translation Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 210000002472 endoplasmic reticulum Anatomy 0.000 description 1
- 230000030583 endoplasmic reticulum localization Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 239000011152 fibreglass Substances 0.000 description 1
- ZZUFCTLCJUWOSV-UHFFFAOYSA-N furosemide Chemical compound C1=C(Cl)C(S(=O)(=O)N)=CC(C(O)=O)=C1NCC1=CC=CO1 ZZUFCTLCJUWOSV-UHFFFAOYSA-N 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 230000002414 glycolytic effect Effects 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 102000045597 human CAST Human genes 0.000 description 1
- 229940042795 hydrazides for tuberculosis treatment Drugs 0.000 description 1
- 230000028993 immune response Effects 0.000 description 1
- 102000018358 immunoglobulin Human genes 0.000 description 1
- 229940072221 immunoglobulins Drugs 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 230000004941 influx Effects 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 210000000265 leukocyte Anatomy 0.000 description 1
- AIHDCSAXVMAMJH-GFBKWZILSA-N levan Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)OC[C@@H]1[C@@H](O)[C@H](O)[C@](CO)(CO[C@@H]2[C@H]([C@H](O)[C@@](O)(CO)O2)O)O1 AIHDCSAXVMAMJH-GFBKWZILSA-N 0.000 description 1
- 235000019421 lipase Nutrition 0.000 description 1
- 239000002502 liposome Substances 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 230000034217 membrane fusion Effects 0.000 description 1
- MYWUZJCMWCOHBA-VIFPVBQESA-N methamphetamine Chemical compound CN[C@@H](C)CC1=CC=CC=C1 MYWUZJCMWCOHBA-VIFPVBQESA-N 0.000 description 1
- 239000006151 minimal media Substances 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 239000002991 molded plastic Substances 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 238000011587 new zealand white rabbit Methods 0.000 description 1
- 229920001778 nylon Polymers 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 229940092253 ovalbumin Drugs 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 239000000123 paper Substances 0.000 description 1
- 230000007170 pathology Effects 0.000 description 1
- 230000035515 penetration Effects 0.000 description 1
- 238000010647 peptide synthesis reaction Methods 0.000 description 1
- 210000002381 plasma Anatomy 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920000573 polyethylene Polymers 0.000 description 1
- 229920000656 polylysine Polymers 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 229920001155 polypropylene Polymers 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 229920002223 polystyrene Polymers 0.000 description 1
- 230000035935 pregnancy Effects 0.000 description 1
- 210000002307 prostate Anatomy 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 150000003212 purines Chemical class 0.000 description 1
- 238000003127 radioimmunoassay Methods 0.000 description 1
- 230000001850 reproductive effect Effects 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 210000003705 ribosome Anatomy 0.000 description 1
- 229930182490 saponin Natural products 0.000 description 1
- 150000007949 saponins Chemical class 0.000 description 1
- 235000017709 saponins Nutrition 0.000 description 1
- 210000000582 semen Anatomy 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 210000000813 small intestine Anatomy 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 230000021595 spermatogenesis Effects 0.000 description 1
- 210000000952 spleen Anatomy 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000013268 sustained release Methods 0.000 description 1
- 239000012730 sustained-release form Substances 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 210000001541 thymus gland Anatomy 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
- KMIOJWCYOHBUJS-HAKPAVFJSA-N vorolanib Chemical compound C1N(C(=O)N(C)C)CC[C@@H]1NC(=O)C1=C(C)NC(\C=C/2C3=CC(F)=CC=C3NC\2=O)=C1C KMIOJWCYOHBUJS-HAKPAVFJSA-N 0.000 description 1
- 210000004340 zona pellucida Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/81—Protease inhibitors
- C07K14/8107—Endopeptidase (E.C. 3.4.21-99) inhibitors
- C07K14/8139—Cysteine protease (E.C. 3.4.22) inhibitors, e.g. cystatin
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
Definitions
- NICHD Network-to-Network Interface
- This invention relates to novel proteins and peptides and to their use in contraceptive vaccines and in assessing infertility.
- The invention also relates to DNA molecules coding for the proteins and peptides, and to host cells containing the DNA molecules linked to expression control sequences for producing the proteins and peptides.
- Mammalian spermatozoa are highly specialized both in structure and function. These cells are the product of a developmental program that involves the expression of genes unique to the testes and of testis-specific variants of common somatic genes. Why testis and sperm should need specialized isoforms of common proteins or genes that are expressed only during spermatogenesis remains to be established.
- Idiopathic infertility is characterized clinically as the inability of cohabiting couples with no apparent anatomical or functional reproductive pathology to achieve a pregnancy. In about 10% of such cases, the cause is attributed to immunological phenomena, including circulating antisperm antibodies in one or both partners. Presumably, such antibodies target spermatozoa and, as a consequence, conception is blocked or fails. Additionally, there is indirect evidence of an association between infertility and antisperm antibodies in both male and female patients. With respect to the subject of immunologic infertility, see Witkin et al., Am. J. Obstet. Gynecol., 158, 59-62 (1988); Clarke et al., Fertil. Steril..
- The invention provides purified proteins and peptides whose sequences comprise the sequence of an epitope of one of these proteins.
- The proteins and peptides are described in detail below.
- The proteins are unique to sperm and testis, and the proteins and peptides can be used in vaccines for contraception in mammals. Accordingly, the invention further provides: (1) immunogens comprising a peptide linked to a carrier, the peptide being capable of producing an antibody that reacts specifically with one of the proteins of the invention and having a sequence comprising a sequence which forms a B-cell epitope of the protein; and
- (2) vaccines comprising the proteins (or immunogenic portions thereof), peptides and immunogens in a delivery system.
- proteins and peptides can be used in diagnostic assays for assessing infertility.
- the assays and kits for performing the assays are also part of the invention.
- the invention provides DNA molecules coding for the proteins and peptides, and host cells containing the DNA molecules linked to expression control sequences, for producing the proteins and peptides.
- Figure 1 Diagram comparing the sequences of somatic and testis-specific isoforms of calpastatin.
- Figure 2 Computer-generated hydropathy plot comparing the first forty-one amino acids of somatic (solid bars) and testis-specific (open bars) isoforms of calpastatin.
- Figure 3 Western blot of human tissue extracts (lane 1 - testis, lane 2 - sperm, lane 3 - liver) probed with affinity-purified rabbit antiserum to a peptide having the sequence of a B-cell epitope found only on the testis-specific isoform of calpastatin.
- Figure 4 Graph of ELISA results. In particular, absorbance at 405 nm is plotted versus weeks post primary immunization of macaques with a peptide having the sequence of a B-cell epitope found only on testis-specific isoform of calpastatin linked to a universal T-cell epitope by a four-amino acid linker.
- Figure 5 Diagram of the technique of epitope mapping by nested deletions for clone C-2 and photograph of Coomassie blue-stained PAGE gel after separation of the resultant truncated proteins.
- Figure 6 Western blots of truncated proteins produced by nested deletions performed to identify B-cell epitopes on the protein produced by clone C-2.
- Figure 7 Diagram illustrating epitope identification for clone C-2.
- Figure 8 Computer-generated plot of the occurrence of the amino acid valine along the length of the clone L-7 protein.
- Figure 9 Western blots of truncated proteins produced by nested deletions performed to identify B-cell epitopes on the protein produced by clone L-7.
- Figure 10 Diagram illustrating epitope identification for clone L-7.
- The invention provides a purified protein which is a testis-specific isoform of calpastatin.
- "Testis-specific" is used herein to mean that the isoform is found in the testes and sperm, but is not found in other tissues.
- Somatic isoforms are those found in one or more, generally several, types of tissues.
- The somatic isoforms may be found in testes and sperm but, if so, will also be found in at least one other type of tissue.
- Clone Y-19, coding for a human testis-specific isoform of calpastatin, was identified by screening a human testis cDNA library with sera from infertile patients positive for antisperm antibodies (see Example 1 below).
- The complete sequence of this human testis-specific isoform of calpastatin is given in Chart A below.
- Affinity-purified antiserum specific for this testis-specific isoform of calpastatin was used to localize the isoform on human sperm by immunofluorescence. Diffuse, granular fluorescence was observed throughout the acrosome, and intense fluorescence was observed in the equatorial segment of the sperm (see Example 4).
- Calpastatin is the peptide inhibitor of calpain, a cysteine protease. Calpain has been localized to the sperm head and appears to be involved in the acrosome reaction. See, Schollmeyer, Biol. Reprod., 34, 721-731 (1986). Although not wishing to be bound by any particular theory, it is believed that infertility in individuals having antibodies directed to testis-specific calpastatin occurs as follows. The acrosome reaction, which must occur in order for the sperm to penetrate the zona pellucida of the egg, is triggered by an influx of Ca²⁺. Wasserman, Annu. Rev. Cell Biol., 2, 109-142 (1987).
- Calpain then, in the presence of the Ca²⁺, would hydrolyze calpastatin, thereby releasing protease inhibition and permitting proteolytic activity in membrane fusion phenomena. Goll et al., Bioessays, 14, 549-556 (1992). Perturbation of this sequence of events by antibodies directed to testis-specific calpastatin would compromise fertilization and concomitantly cause infertility. Preliminary studies have demonstrated loss of calpastatin immunoreactivity from acrosome-reacted sperm, a result predicted from this theory. Also, the immunofluorescence studies described above show that testis-specific calpastatin is found on the surface of sperm and would, therefore, be accessible to antibodies.
- The invention further provides a protein which is the protein produced by clone C-2.
- Clone C-2 is a human cDNA clone that was identified by screening a human testis cDNA library with sera from infertile patients positive for antisperm antibodies (see Example 1 below).
- The C-2 protein is found in testis and sperm, but it is not found in other tissues.
- The complete amino acid sequence of the C-2 protein is set forth in Chart B below.
- The invention also provides a protein which is the protein produced by clone L-7.
- Clone L-7 is a human cDNA clone that was identified by screening a human testis cDNA library with sera from infertile patients positive for antisperm antibodies (see Example 1 below).
- The L-7 protein is found in testis and sperm, but it is not found in other tissues. Affinity-purified antiserum specific for the L-7 protein was used to localize the L-7 protein on human sperm by immunofluorescence. Fluorescence was observed throughout the acrosome.
- The complete amino acid sequence of the L-7 protein is set forth in Chart C below.
- The Y-19, C-2 and L-7 proteins are human proteins. Corresponding proteins in other mammals would be expected to be at least 70% homologous to these human proteins.
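- As a rough illustration of how the 70% figure might be checked, the following Python sketch computes percent identity between two sequences that have already been aligned. Treating "homologous" as simple residue identity over non-gap columns is an assumption made here for illustration; the text does not define the measure, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the two aligned sequences reach the stated percent threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

- The alignment itself would come from a separate tool; this sketch only scores an alignment that is supplied to it.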
- The corresponding proteins in other mammals can be obtained by the method described in Example 1 or by using the sequences given in Charts A, B and C to design DNA probes which can be used to screen testis gene libraries, preferably cDNA libraries, of other mammals. Methods of making gene (e.g., cDNA) libraries, designing probes for screening them, identifying and isolating a desired clone, producing protein from the clone, etc., are well known in the art. See, e.g., Ausubel et al., Current Protocols In Molecular Biology, Volumes 1 and 2 (John Wiley and Sons, New York 1989) and Sambrook et al., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor
- Testis cDNA libraries can also be purchased from Clontech Laboratories, Inc., 1020 E. Meadow Circle, Palo Alto, CA 94303-4230.
- The proteins of the invention can be used in contraceptive vaccines in mammals. Preferably a protein from the same species of mammal that is to be immunized is used in the vaccine. However, given the expected close homology of the proteins from different mammalian species, it is expected that proteins from other species, especially closely-related species, can be used.
- Immunogenic portions of the proteins can also be used in the vaccines. Immunogenic portions of the proteins must include at least a B-cell epitope. In choosing an immunogenic portion of testis-specific calpastatin, a portion must be chosen which includes sequences found on the testis-specific isoform but not found on the somatic isoforms.
- testis-specific calpastatin or an immunogenic portion thereof, since somatic isoforms exist, and cross-reaction with these somatic isoforms may occur if the complete protein or an immunogenic portion containing an immunogenic somatic sequence is used in the vaccine. This may cause deleterious side effects and should be avoided except when the vaccine is to be used for contraception in pest species (e.g., rodents).
- Peptides derived from the proteins of the invention are used in the vaccines.
- The peptides must comprise at least a B-cell epitope of the protein.
- A peptide derived from testis-specific calpastatin must include a B-cell epitope from the sequences found on the testis-specific isoform but not found on the somatic isoforms.
- The peptide may include other sequences besides those which form the B-cell epitope, but these sequences must be chosen so that the antibody produced as a result of immunization with the vaccine containing the peptide will react specifically with the protein found in testis and sperm.
- This sequence of 41 amino acids is unique to the testis-specific isoform of calpastatin.
- Peptides having this sequence, or a portion of it that includes the sequence from amino acid 26 through amino acid 41, can be used to elicit antibodies that react with the testis-specific isoform of calpastatin, but do not react with somatic isoforms of calpastatin.
- Amino acids 26-41 in the above sequence have been identified as a B-cell epitope.
- The protein coded for by clone C-2 contains the following sequence:
- Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu Asp Pro Thr Phe 15
- Peptides having this sequence, or a portion of it that includes the sequence from amino acid 4 through amino acid 17, can be used to elicit antibodies that react specifically with the C-2 protein.
- Amino acids 4-17 in the above sequence have been identified as a B-cell epitope.
- The protein coded for by clone L-7 contains the following sequence:
- SEQ ID NO:11 and SEQ ID NO:12 have been identified as B-cell epitopes, and peptides having these sequences can be used to elicit antibodies that react specifically with the protein.
- The peptides comprising a B-cell epitope of one of the proteins of the invention are preferably used in the vaccines in the form of an immunogen comprising the peptide linked to a carrier.
- Suitable carriers are compounds capable of stimulating the production of antibodies to haptens coupled to them in a host animal. Many such carriers are well-known.
- The carrier may be a high molecular weight compound.
- Suitable high molecular weight compounds include proteins, polypeptides, carbohydrates, polysaccharides, lipopolysaccharides, nucleic acids, and the like of sufficient size and immunogenicity.
- Preferred high molecular weight compounds are proteins and polypeptides.
- Suitable immunogenic carrier proteins and polypeptides will generally have molecular weights between 4,000 and 10,000,000, and preferably greater than 15,000.
- Suitable carriers include proteins such as albumins (e.g., bovine serum albumin, ovalbumin, human serum albumin), immunoglobulins, thyroglobulins (e.g., bovine thyroglobulin), hemocyanins (e.g., keyhole limpet hemocyanin), toxins (e.g., diphtheria toxoid, tetanus toxoid) and polypeptides such as polylysine or polyalaninelysine. Preferred are diphtheria toxoid and tetanus toxoid.
- The peptide may be coupled to the carrier with conjugating reagents such as glutaraldehyde, a water soluble carbodiimide such as 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide hydrochloride (ECDI), N,N'-carbonyldiimidazole, 1-hydroxybenzotriazole monohydrate, N-hydroxysuccinimide, 6-maleimidocaproyl-N-hydroxysuccinimide, N-trifluoroacetylimidazole, cyanogen bromide, 3-(2'-benzothiazolyl-dithio)propionate succinimide ester, hydrazides or affinity labeling methods.
- The number of peptides attached to the high molecular weight carrier is called the "epitopic density."
- The epitopic density can range from 1 to the number of available coupling groups on the carrier molecule.
- The epitopic density on a particular carrier will depend upon the molecular weight of the carrier and the density and availability of coupling sites.
- The carrier may also be a peptide which has a sequence comprising the sequence of a T-cell epitope of one of the proteins of the invention or of another protein.
- Methods of identifying T-cell epitopes are known. See, O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); O'Hern and Goldberg, Proceed. Intern. Symp. Control Rel. Bioact. Mater., 20, 394-395 (1993).
- The three criteria for selection of a T-cell epitope are: a size of 8-12 amino acids; hypervariability; and one or more representations of the tetrapeptide motif previously reported to be associated with T-cell epitopes. O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); O'Hern and Goldberg, Proceed. Intern. Symp. Control Rel. Bioact. Mater., 20, 394-395 (1993).
- The carrier is a peptide which has a sequence comprising the sequence of a promiscuous T-cell epitope.
- A promiscuous T-cell epitope is a T-cell epitope that is recognized by individuals of several different major histocompatibility (MHC) types. Promiscuous T-cell epitopes are known. See, Ho et al., Eur. J. Immunol., 20, 477-483 (1990); Kaumaya et al., J. Molec. Recog., 6, 81-94 (1993).
- A preferred promiscuous T-cell epitope has the following sequence:
- A peptide carrier which has a sequence comprising the sequence of a T-cell epitope may include other sequences linked to the N-terminal or C-terminal of the T-cell epitope.
- Additional amino acids may be provided to link the B-cell epitope on the peptide to the T-cell epitope on the carrier. These linking amino acids should form a four-residue β-turn based on examination of 33 patterns in native proteins that code for ⁇ corners. Efimov, FEBS Lett., 166, 33 (1984); Kaumaya et al., Biochemistry, 29, 13-23 (1990).
- Peptides comprising a B-cell epitope may be coupled to a peptide carrier comprising a T-cell epitope in the same manner as described above for high molecular weight proteins and polypeptides to form the immunogen.
- Immunogens are preferably synthesized as a single peptide in the ways described below for the synthesis of peptides.
- The vaccines contain one or more of the proteins (or an immunogenic portion thereof), peptides and immunogens of the invention in a delivery system. Suitable delivery systems are well known. For instance, the delivery system may simply be a solvent (such as saline and buffers) or other liquid (such as an oil). However, the delivery system preferably enhances the immune response.
- Such delivery systems include aluminum salts, water-oil emulsions (such as incomplete Freund's adjuvant), saponins, liposomes, immune stimulating complexes, lipopolysaccharides, mycobacterial adjuvants such as Freund's complete adjuvant, Squalene-Arlacel A containing the synthetic muramyl dipeptide N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP11637; Ciba-Geigy Pharmaceuticals, Basel, Switzerland), live vectors, antigen immunotargeting materials, and polymers (e.g., biodegradable microspheres such as polylactide-polyglycolide microspheres, and block copolymers for sustained release). See Goldberg, in Gamete Interaction: Prospects For Immunocontraception, pages 63-73 (1990); Alexander et al., Reprod. Fertil. Dev., 6, 273-80 (1994); O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
- The vaccines may be administered in any conventional manner, including orally, intradermally, subcutaneously, intramuscularly, etc. to male or female mammals to inhibit fertilization of eggs by sperm.
- Suitable routes of administration and effective amounts (effective dosages and number of doses) necessary to inhibit conception can be determined empirically as is known in the art.
- By "inhibit" is meant at least a 50% reduction in the number of female mammals becoming pregnant as a result of the administration of the vaccine. Preferably at least a 75%, most preferably at least a 90%, reduction is achieved.
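- The thresholds above can be expressed as a short calculation. The following Python sketch assumes the reduction is measured against an unvaccinated control group, which the text does not specify; the function names and tier labels are hypothetical.

```python
def pregnancy_reduction(control_pregnant: int, control_total: int,
                        vaccinated_pregnant: int, vaccinated_total: int) -> float:
    """Percent reduction in pregnancy rate of a vaccinated group relative to a control group."""
    if control_total <= 0 or vaccinated_total <= 0:
        raise ValueError("group sizes must be positive")
    control_rate = control_pregnant / control_total
    if control_rate == 0:
        raise ValueError("control group has no pregnancies; reduction is undefined")
    vaccinated_rate = vaccinated_pregnant / vaccinated_total
    return 100.0 * (1.0 - vaccinated_rate / control_rate)


def classify_reduction(reduction_pct: float) -> str:
    """Map a percent reduction onto the 50% / 75% / 90% tiers stated in the text."""
    if reduction_pct >= 90.0:
        return "most preferred (at least 90%)"
    if reduction_pct >= 75.0:
        return "preferred (at least 75%)"
    if reduction_pct >= 50.0:
        return "inhibits (at least 50%)"
    return "below the 50% threshold"
```

- With these assumptions, 8 of 10 control animals and 2 of 10 vaccinated animals becoming pregnant would correspond to a 75% reduction.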
- The proteins and peptides comprising a B-cell epitope can also be used in assays to assess infertility.
- The peptides may be used as such or may be linked to a carrier.
- The carriers (e.g., large molecular weight and T-cell epitope carriers) and the methods of linking the peptides to the carriers are the same as described above for the immunogens.
- The protein, peptide or peptide linked to a carrier is contacted with a body fluid of a patient under conditions that permit antibodies in the body fluid to bind to it.
- The assays are immunoassays that allow for the determination of whether the body fluid of a patient contains antibodies that bind to the protein, peptide or peptide linked to a carrier.
- Suitable immunoassays and reagents for use therein are well known in the art, and those skilled in the art will be able to determine operative and optimal assay conditions using only ordinary skill in the art.
- The protein, peptide or peptide linked to a carrier will be immobilized on a solid surface.
- Suitable solid surfaces are well-known and include glass, polystyrene, polypropylene, polyethylene, nylon, paper, fiberglass, polyacrylamide and agaroses.
- The immobilized material is contacted with the body fluid so that antibodies present in the body fluid can bind to the protein, peptide or peptide linked to a carrier.
- A labeled secondary antibody or other material which binds specifically to the antibody in the body fluid is added as a means to detect and quantitate the antibody bound to the protein, peptide or peptide linked to a carrier.
- Suitable labels are well known in the art. They include enzymes, fluorophores, radionucleotides, bioluminescent labels, chemiluminescent labels, and particulate labels. The binding and detection of these labels can be accomplished using standard techniques well known to those skilled in the art.
- the body fluid may be any body fluid that contains antibodies. Suitable body fluids include serum, plasma, cervical mucus and seminal plasma.
- The assays may be used to assess infertility in patients unable to conceive. If the patient has antibodies specific for one of the proteins of the invention, then this may be the cause, or one of the causes, of the infertility. The assays may also be used to evaluate whether administration of the vaccines of the invention has been effective in immunizing recipients of the vaccines.
- The invention also comprises a kit.
- The kit is a packaged combination of one or more containers holding reagents useful in performing the immunoassays. Suitable containers for the reagents include bottles, vials, test tubes, microtiter plates, a solid phase (see listing above) held in a molded plastic device, and other containers known in the art.
- The kit will contain at least one container holding a protein, peptide comprising a B-cell epitope or such a peptide linked to a carrier.
- The kit may also comprise a container of a labeled component useful for detecting or quantitating the antibodies in the body fluids that bind to the protein, peptide or peptide linked to a carrier.
- The kit may also contain other materials which are known in the art and which may be desirable from a commercial and user standpoint, such as buffers, enzyme substrates, diluents, standards, etc.
- The kit may include containers, such as test tubes and microtiter plates, for performing the immunoassay.
- The peptides of the invention may be made in a variety of ways. For instance, solid phase synthesis techniques may be used. Suitable techniques are well known in the art, and include those described in Merrifield, in Chem. Polypeptides, pp. 335-61 (Katsoyannis and Panayotis eds. 1973); Merrifield, J. Am. Chem. Soc., 85, 2149 (1963); Davis et al., Biochem. Int'l, 10, 394-414 (1985); Stewart and Young, Solid Phase Peptide Synthesis (1969); U.S. Patents Nos. 3,941,763, 4,782,136, 4,990,596; Finn et al., in The Proteins, 3rd ed.
- Solid phase synthesis is the preferred method of making the peptides of the invention.
- The peptides may also be produced by culturing a host cell comprising a DNA molecule coding for the peptide operatively linked to expression control sequences under conditions permitting expression of the peptide.
- The proteins of the invention may also be produced in this manner.
- The proteins and peptides can be produced in transformed host cells using recombinant DNA techniques. Such techniques and suitable host cells and other reagents for use therein are well known in the art. For instance, the selection of a particular host cell is dependent upon a number of factors recognized by the art. These include, for example, compatibility with the chosen expression vector, use and toxicity of the protein or peptide encoded by the expression vector, rate of transformation, expression characteristics, bio-safety, and costs.
- Useful host cells include bacteria, yeast and other fungi, animal cell lines, animal cells in an intact animal, or other host cells known in the art.
- The host cells may be transformed with a vector comprising DNA encoding the peptide or protein.
- The coding sequence must be operatively linked to a promoter.
- The promoter used in the vector may be any sequence which shows transcriptional activity in the host cell and may be derived from genes encoding homologous or heterologous proteins and either extracellular or intracellular proteins, such as amylase, glycoamylases, proteases, lipases, cellulases, and glycolytic enzymes.
- The promoter need not be identical to any naturally-occurring promoter. It may be composed of portions of various promoters or may be partially or totally synthetic.
- The promoter may be inducible or constitutive, and is preferably a strong promoter.
- By "strong" it is meant that the promoter provides for a high rate of transcription in the host cell.
- In the vector, the coding sequences must be operatively linked to transcription termination sequences, as well as to the promoter.
- The coding sequence may also be operatively linked to expression control sequences other than the promoters and transcription termination sequences. These additional expression control sequences include activators, enhancers, operators, stop signals, cap signals, polyadenylation signals, ribosome binding sites, and other signals involved with the control of transcription and translation.
- The site at which the ribosome binds to the messenger includes a sequence of 3-9 purines.
- The consensus sequence of this stretch is 5'-AGGAGG-3', and it is frequently referred to as the Shine-Dalgarno sequence.
- The sequence of the ribosome binding site may be modified to alter expression. See Hui and DeBoer, Proc. Natl. Acad. Sci. USA, 84, 4762-66 (1987). Comparative studies of ribosomal binding sites, such as the study of Scherer et al., Nucleic Acids Res., 8, 3895-3907 (1987), may provide guidance as to suitable base changes.
- The ribosome binding site lies 3-12 bases upstream of the start (AUG) codon.
- A ribosome binding site and spacer that provide for efficient translation in the prokaryotic host cell should be provided.
- A preferred ribosome binding site and spacer sequence for optimal translation in E. coli are described in Springer and Sligar, Proc. Nat'l Acad. Sci. USA, 84, 8961-65 (1987) and von Bodman et al., Proc. Nat'l Acad. Sci. USA, 83, 9443-47 (1986).
- The sequence of this ribosome binding site and spacer is: AGGAGAACAA CAACC [SEQ ID NO:28].
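- The positional rules above can be illustrated with a minimal Python sketch that looks for a Shine-Dalgarno-like site upstream of a given AUG. It assumes that "3-12 bases upstream" refers to the spacer between the end of the site and the start codon, and the `min_matches` cutoff and function name are hypothetical choices, not values taken from the text.

```python
SD_CONSENSUS = "AGGAGG"  # consensus Shine-Dalgarno sequence given in the text


def find_shine_dalgarno(mrna, start_index, min_spacer=3, max_spacer=12, min_matches=4):
    """Return (position, spacer, matches) for the best consensus-like site, or None.

    position : 0-based index where the candidate site begins
    spacer   : number of bases between the end of the site and the AUG
    matches  : number of positions agreeing with AGGAGG
    """
    mrna = mrna.upper().replace("T", "U")
    if mrna[start_index:start_index + 3] != "AUG":
        raise ValueError("start_index must point at an AUG codon")
    site_len = len(SD_CONSENSUS)
    best = None
    for spacer in range(min_spacer, max_spacer + 1):
        pos = start_index - spacer - site_len
        if pos < 0:
            break  # ran off the 5' end of the sequence
        site = mrna[pos:pos + site_len]
        matches = sum(s == c for s, c in zip(site, SD_CONSENSUS))
        if matches >= min_matches and (best is None or matches > best[2]):
            best = (pos, spacer, matches)
    return best
```

- Applied to the site and spacer of SEQ ID NO:28 followed by a start codon (`"AGGAGAACAACAACCAUGGCU"`, with the AUG at index 15), this sketch reports a site at position 0 that matches the consensus at five of six positions and leaves a nine-base spacer.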
- The consensus sequence for the translation start sequence of eukaryotes has been defined by Kozak (Cell, 44, 283-292 (1986)) to be: C(A/G)CCAUGG. Deviations from this sequence, particularly at the -3 position (A or G), have a large effect on translation of a particular mRNA. Virtually all highly expressed mammalian genes use this sequence. Highly expressed yeast mRNAs, on the other hand, differ from this sequence and instead use the sequence (A/Y)A(A/U)AAUGUCU (Cigan and Donahue, Gene, 59, 1-18 (1987)). These sequences may be altered empirically to determine the optimal sequence for use in a particular host cell.
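- As an illustration of the mammalian consensus quoted above, the following Python sketch compares the eight-base window around a start codon with C(A/G)CCAUGG and reports the -3 and +4 positions separately. The function name and the shape of the returned dictionary are hypothetical; only the consensus itself comes from the text.

```python
import re

KOZAK = re.compile(r"C[AG]CCAUGG")  # C(A/G)CCAUGG as quoted in the text


def kozak_context(mrna, start_index):
    """Describe how the context of the AUG at start_index compares with the consensus."""
    mrna = mrna.upper().replace("T", "U")
    if mrna[start_index:start_index + 3] != "AUG":
        raise ValueError("start_index must point at an AUG codon")
    if start_index < 4 or start_index + 4 > len(mrna):
        raise ValueError("need 4 bases upstream and 1 base downstream of the AUG")
    window = mrna[start_index - 4:start_index + 4]  # positions -4 .. +4
    return {
        "window": window,
        "full_match": KOZAK.fullmatch(window) is not None,
        "minus3_purine": window[1] in "AG",  # the -3 position singled out in the text
        "plus4_G": window[7] == "G",
    }
```

- A window such as `CACCAUGG` matches in full, whereas `CUCCAUGG` fails at the -3 position even though the +4 G is present.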
- DNA molecules encoding for the protein or peptide could be excised from genes or cDNA clones by methods well known in the art.
- The DNA molecules encoding a protein or peptide of the invention are preferably chemically synthesized. Methods of chemically synthesizing DNA are well known in the art. Chemical synthesis is preferable for several reasons. First, chemical synthesis is desirable because codons preferred by the host in which the DNA sequence will be expressed may be used to optimize expression. Not all of the codons need to be altered to obtain improved expression, but greater than 50%, most preferably at least about 80%, of the codons should be changed to host-preferred codons. The codon preferences of many host cells, including E.
- Chemically synthesized DNA also allows for the selection of codons with a view to providing unique or nearly unique restriction sites at convenient points in the sequence. The use of these sites provides a convenient means of constructing the synthetic coding sequences. In addition, if secondary structures formed by the messenger RNA transcript interfere with transcription or translation, they may be eliminated by altering the codon selections.
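- The idea of swapping codons for host-preferred synonyms can be sketched in Python as follows. The `PREFERRED` table is an illustrative, assumed set of preferences for a bacterial host and is not taken from the text; a real design would use measured codon-usage data for the chosen host. The sketch recodes every codon and also reports what fraction of an input sequence already uses the preferred codons.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host-preferred codon per amino acid (illustrative assumption only)
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC", "Q": "CAG",
    "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT", "L": "CTG", "K": "AAA",
    "M": "ATG", "F": "TTT", "P": "CCG", "S": "AGC", "T": "ACC", "W": "TGG",
    "Y": "TAT", "V": "GTG", "*": "TAA",
}
# Every preferred codon must encode the amino acid it is listed under
assert all(CODON_TABLE[codon] == aa for aa, codon in PREFERRED.items())


def split_codons(coding_dna):
    dna = coding_dna.upper()
    if not dna or len(dna) % 3:
        raise ValueError("coding sequence must be a non-empty multiple of three bases")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(coding_dna):
    return "".join(CODON_TABLE[c] for c in split_codons(coding_dna))


def fraction_preferred(coding_dna):
    """Fraction of codons that already equal the preferred codon for their amino acid."""
    codons = split_codons(coding_dna)
    return sum(PREFERRED[CODON_TABLE[c]] == c for c in codons) / len(codons)


def recode(coding_dna):
    """Replace every codon with its preferred synonym; return (new_dna, fraction_changed)."""
    codons = split_codons(coding_dna)
    recoded = [PREFERRED[CODON_TABLE[c]] for c in codons]
    changed = sum(a != b for a, b in zip(codons, recoded))
    return "".join(recoded), changed / len(codons)
```

- Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged; a partial recoding that stops at a chosen fraction would need an additional selection rule that this sketch does not attempt.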
- Chemical synthesis also allows for the use of optimized expression control sequences with the DNA sequence coding for a protein or peptide. In this manner, optimal expression of the protein or peptide can be obtained. For instance, as noted above, promoters can be chemically synthesized and their location relative to the transcription start optimized. Similarly an optimized ribosome binding site and spacer can be chemically synthesized and used with coding sequences that are to be expressed in prokaryotes.
- DNA coding for a signal or signal-leader sequence may be located upstream of the DNA sequence encoding the protein or peptide.
- A signal or signal-leader sequence is an amino acid sequence at the amino terminus of a protein which allows the protein to which it is attached to be secreted from the cell in which it is produced. Suitable signal and signal-leader sequences are well known. Although secreted proteins are often easier to purify, secretion is generally not preferred since expression levels are much lower than those that can be obtained in the absence of secretion.
- The vector used to transform the host cells may have one or more replication systems which allow it to replicate in the host cells.
- The vector should contain the yeast 2μ replication genes REP 1-3 and origin of replication. Many bacterial replicons are known.
- An integrating vector may be used which allows the integration into the host cell's chromosome of the sequence coding for the protein or peptide. Although the copy number of the coding sequence in the host cells would be lower than when self-replicating vectors are used, transformants having sequences integrated into their chromosomes are generally quite stable.
- When the vector is a self-replicating vector, it is preferably a high copy number plasmid so that high levels of expression are obtained.
- A "high copy number plasmid" is one which is present at about 100 copies or more per cell. Many suitable high copy number plasmids are known.
- The vector desirably also has unique restriction sites for the insertion of DNA sequences and a sequence coding for a selectable or identifiable phenotypic trait which is manifested when the vector is present in the host cell ("a selection marker"). If a vector does not have unique restriction sites, it may be modified to introduce or eliminate restriction sites to make it more suitable for further manipulations.
- After the vector comprising the sequence coding for the protein or peptide is prepared, it is used to transform the host cells. Methods of transforming host cells are well known in the art, and any of these methods may be used. Transformed host cells are selected in known ways and then cultured to produce the protein or peptide. The methods of culture are those well known in the art for the chosen host cell, but the use of enriched media (rather than minimal media) is preferred since higher yields are obtained. The expressed protein or peptide may be recovered using methods of recovering and purifying proteins from cell cultures which are well known in the art.
- A human testis cDNA library was screened with sera from infertile patients positive for antisperm antibodies. This screening was performed as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994). It is interesting to note that these patients, although infertile, were otherwise healthy. A total of 43 unique cDNA inserts were detected by the screening, of which four were testis-specific by Northern blot analysis (performed as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994); see below). One of the four clones turned out to encode a truncated mRNA for a somatic peptide and was not evaluated further. The remaining three clones were designated Y-19, C-2 and L-7.
- Figure 1 shows the relationship between the published sequence of DNA coding for somatic calpastatin (solid) and the testis-specific region of clone Y-19 (diagonal stripes).
- Clone Y-19 appears to be a product of alternative splicing whereby DNA coding for somatic calpastatin domains L and 1 has been deleted and replaced with DNA coding for a unique, testis-specific L domain of approximately 65 amino acids (stripes).
- The rest of the cDNA sequence of clone Y-19 is virtually identical to the published sequence of somatic calpastatin.
- DNA coding for testis-specific calpastatin contains 2 unique restriction sites (arrows).
- A 1 kb fragment of clone Y-19 was used to probe a Northern blot of human poly A+ RNA from eight different human tissues (leukocytes, colon, small intestine, ovary, testis, prostate, thymus and spleen; Multiple Tissue Northern blots purchased from Clontech, Palo Alto, CA). Two mRNAs of 4.3 and 2.8 kb were detected by the probe in all tissues. A third mRNA of 1.9 kb was detected only in testis.
- The serum that identified clone Y-19 (serum YM) agglutinates human sperm in a head-to-head orientation and completely inhibits cervical mucus penetration.
- Figure 2 shows a computer-generated hydropathy plot of the first 41 residues of somatic calpastatin (solid lines) versus the first 41 residues of testis-specific calpastatin (open bars).
- This hydropathy plot was generated using algorithms described in Hopp and Woods, Proc. Natl. Acad. Sci. USA, 78, 3824-28 (1981) and Kyte and Doolittle, J. Mol. Biol., 157, 105 (1982).
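- For readers unfamiliar with hydropathy plots, the following Python sketch shows a sliding-window average over the Kyte and Doolittle scale, one of the two methods cited above. The window length of 7 and the cutoff of 0 are assumptions chosen for illustration; the parameters actually used for Figure 2 are not stated. On this scale positive values are hydrophobic, whereas the Hopp and Woods scale assigns positive values to hydrophilic residues.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window of the given length (one-letter codes)."""
    residues = sequence.upper()
    if not 1 <= window <= len(residues):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[r] for r in residues]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophilic_windows(sequence, window=7, cutoff=0.0):
    """1-based start positions of windows whose mean score falls below the cutoff."""
    profile = hydropathy_profile(sequence, window)
    return [i + 1 for i, score in enumerate(profile) if score < cutoff]
```

- Comparing the profiles of two isoforms over the same residue range, and keeping only windows that are hydrophilic in one isoform and absent from the other, mirrors the kind of reasoning used to pick a candidate surface-exposed segment.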
- Only residues 26-41 of testis-specific calpastatin are both hydrophilic and unique to the testis isoform. Therefore, this segment was chosen as a testis-specific B-cell epitope.
- This segment has the sequence:
- Testis-specific calpastatin has a hydrophobic tail. This hydrophobic tail could serve as a membrane anchor for the protein.
- A peptide immunogen was prepared containing the testis-specific calpastatin B-cell epitope identified in Example 3 linked to a carrier comprising a universal T-cell epitope derived from tetanus toxoid.
- The T-cell epitope had the following sequence: Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr
- This immunogen [SEQ ID NO:7] was synthesized at the Salk Institute (under Contract N01-HD-0-2906 with the NIH) and made available by the Contraceptive Development Branch, Center for Population Research, NICHD (Bethesda, MD).
- The affinity-purified antiserum was used to probe a Western blot of human tissue extracts.
- The tissue extracts were made and the Western blots were performed as described in Diekman and Goldberg, Biol. Reprod., 50, 1087-1093 (1994).
- The antiserum recognized a single protein of approximately 65 Kd in human testis extracts (lane 1) and a slightly larger protein of approximately 68 Kd in human sperm extracts (lane 2).
- There was no reactivity with human liver extracts (lane 3) although liver is known to be rich in the somatic isoforms of calpastatin.
- The affinity-purified antiserum was also used to localize testis-specific calpastatin on human sperm by immunofluorescence, performed as described in Wright et al., Biol. Reprod., 42, 693-701 (1990). Diffuse, granular fluorescence was observed throughout the acrosome, and intense fluorescence was observed in the equatorial segment of the sperm.
- Female cynomolgus macaques (three per group) were immunized with either 100 μg or 300 μg of the peptide immunogen [SEQ ID NO:7] prepared in Example 4.
- The immunogen was administered intramuscularly in Squalene-Arlacel A containing the synthetic muramyl dipeptide N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP11637; Ciba-Geigy Pharmaceuticals, Basel, Switzerland).
- A single booster injection consisting of the same dose in the same delivery system was administered intramuscularly ten days after the initial injection.
- ELISA titers were determined on microtiter plates coated with the testis-specific calpastatin B-cell epitope peptide (SEQ ID NO:2; see Example 3) conjugated to bovine serum albumin (BSA).
- the B-cell epitope peptide was synthesized with a non-natural cysteine at the amino terminus and conjugated to BSA as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
- the ELISA was performed as described in Lairmore et al., J. Virol., 69, 6077-6089 (1995).
- the microtiter plate was coated with peptide-conjugated BSA or BSA alone.
- the cDNA insert of clone C-2 was used to probe a Northern blot of human poly A+ RNA from eight different human tissues as described above in Example 2. A single mRNA of 2.1 kb was detected in testis only.
- clone C-2 cDNA encodes a unique and previously undescribed protein.
- the mRNA is approximately 2.1 kb. It has an open reading frame (ORF) of 1.4 kb translating to a peptide of 65-70 Kd. There are no significant sequence motifs or unusual properties.
- the original antiserum that detected clone C-2 (number 629) is 100% effective in blocking fertilization in vitro of human ova by human sperm (see table below) .
- Serum 629 which has been absorbed with sperm no longer blocks binding of sperm to zona (see table below) .
- the peptide coded for by a 900 bp fragment from the 3' end of the C-2 cDNA was expressed as a glutathione-S-transferase (GST) fusion protein using cloning methods well known in the art. See, e.g., Smith and Johnson, Gene, 67, 31-40 (1988); Johnson et al., Nature, 338, 585-587 (1989); Kemp et al., Gene, 94, 223-28 (1990); Kaelin Jr. et al., Cell, 64, 521-532 (1991); Chittenden Jr. et al., Cell, 65, 1073-1082 (1991); Kaelin Jr. et al., Cell, 70, 351-364 (1992).
- the clone encoding this fusion protein was designated clone GST-C2.
- Each of the truncated GST-C2 fusion proteins was partially purified and used as the target for Western blots (all as described in Example 6) probed with the original patient 629 serum.
- the results are shown in Figure 6.
- the full-length fusion protein and the first 4 deletions were strongly positive for the antibody.
- Time points 5-10 were negative, as was GST alone. Therefore, the C2 epitope recognized by the original human serum resides within time point 4.
- Each of the 10 nested deletions was sequenced using an oligo primer specific for the pGEX vector (see Pharmacia Biotech GST Gene Fusion Manual) .
- the results are shown in Figure 7.
- the first 3 time points showed deletion of the 3' untranslated region (UTR) .
- Time point 4 from which the 9 carboxy terminal amino acids were deleted, was still antibody positive.
- Time point 5 with deletion of an additional 26 amino acids, was antibody negative. Therefore, the relevant B-cell epitope (cross-hatched box) resides within the region of amino acids 426-454.
- the sequence of amino acids 426-454 is as follows:
- EXAMPLE 8 Preparation of C-2 Immunogen An immunogen comprising the B-cell epitopes identified in Example 7 was prepared as described in Example 4. The sequence of this immunogen is:
- the cDNA insert of clone L-7 was used to probe a Northern blot of human poly A+ RNA from eight different human tissues as described above in Example 2. A single mRNA of 2.5kb was detected in testis only.
- the sequence of the cDNA insert of clone L-7 was determined as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994).
- the DNA sequence of the insert and the corresponding amino acid sequence are set forth in Chart C below. Homology searches of the GenEMBL databases found that the sequence of the cDNA insert of clone L-7 was not represented. Thus, clone L-7 cDNA encodes a unique and previously undescribed protein.
- This protein is relatively large (66 kD) and consists of several domains of as yet unknown functional significance.
- the protein contains an endoplasmic reticulum signal sequence and appears to be anchored in the sperm plasma membrane at its amino terminus, but with surface accessible epitopes.
- This plot was generated using PC/Gene software from Intelligenetics, Inc., 700 E. El Camino Rd., Mountain View, CA 94047.
- This computer analysis revealed the following features.
- Residues 88-328 contain very little valine and 9 potential protein kinase C (PKC) phosphorylation sites (P) .
- Residues 329-493 contain many valines and no PKC phosphorylation sites.
- Residues 329-493 also contain 11 repeats of a 15 amino acid motif (see below) .
- the consensus sequence of the motif is KgqEaQVKKsesgVp [SEQ ID NO:16].
- Residues 494-568 contain few valines and 3 potential PKC phosphorylation sites. From the computer analysis and the protein's sequence, the following domain organization of the L-7 protein is proposed:
- Domain I contains a consensus endoplasmic reticulum localization signal (p>0.85)
- Domain IV again has a high isoelectric point and contains 2 bipartite nuclear translocation signals (see Robbins et al., Cell. 64. 615-623 (1991)).
- L-7 was expressed and purified as a GST fusion protein as described in Example 6 above. This clone was designated GST-L7. Sera from three infertile patients (numbers 44, 65 and 66) recognized the fusion protein on Western blots (performed as described in Example 6) .
- Epitope 1 is amino acids 500-517
- epitope 2 is amino acids 389-408. These epitopes have the following sequences: Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val
- Immunogens comprising the two B-cell epitopes identified in Example 10 were prepared as described in Example 4. The sequences of these two immunogens are:
- Example 14 was used to immunize rabbits as described in Example 4.
- the rabbit antiserum was affinity purified, and the affinity-purified rabbit antiserum was used to probe a Western blot of human tissue extracts, all as described in Example 4.
- the affinity-purified antiserum recognized a single protein of approximately 58 Kd in human testis extracts and a protein of approximately 68 Kd in human sperm extracts. There was no reactivity with human liver extracts.
- a B-cell epitope of macaque testis-specific calpastatin was identified and has the following sequence:
- This B-cell epitope is 85% homologous to the B-cell epitope identified above for human testis-specific calpastatin [SEQ ID NO:2] .
- the B-cell epitope of the macaque protein corresponding to the human protein produced by clone C-2 has a sequence identical to that of the B-cell epitope of the C-2 protein [SEQ ID NO:8]. Thus, in this case, there was 100% homology between the sequences.
- Peptides having the sequences of the B-cell epitopes identified in Examples 3, 7 and 10 can be synthesized and coupled to diphtheria toxin to produce immunogens that can be used to immunize mammals, all as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
- EXAMPLE 16 Sequencing Of Clones Y-19, C-2 and L-7
- DNA fragments of clones Y-19, C-2 and L-7 were subcloned into the pBluescript II SK+ phagemid (Stratagene, Palo Alto, CA) and sequenced by a modification of the method of Kraft et al., Biotechniques, 6, 544-547 (1988) as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
- the DNA sequences and deduced amino acid sequences are presented in Charts A (Y-19) , B (C-2) and C (L-7) .
- GAG AAG GCC AAA GAA GAA GAC CGT GAA AAG CTT GGT 657 Glu Lys Ala Lys Glu Glu Asp Arg Glu Lys Leu Gly 185 190
- GGA GGT AAA GCG AAG GAT TCA GCA AAG ACA ACA GAG 1341 Gly Gly Lys Ala Lys Asp Ser Ala Lys Thr Thr Glu
- Glu Leu Ser Ser Ile Lys Asn Leu Gln His Asn Ile 105 110 CAT CTG AAG GAG CTC TTT CTC ATG GGG AAC CCA TGT 432 His Leu Lys Glu Leu Phe Leu Met Gly Asn Pro Cys 115 120 125
- TGG TAC ACA GAC ATC AAT GCT ACT CTT TCC TCT TTA 720
- Arg Arg Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu 440 445 GAC CCA ACC TTT GAA GAC AAC CCT GAA GTG CCT CCG 1440 Asp Pro Thr Phe Glu Asp Asn Pro Glu Val Pro Pro 450 455 460
- GTA CTA AAA GGA CAG GAA GCC CAA GAA AAG AAG GAG 1694 Val Leu Lys Gly Gln Glu Ala Gln Glu Lys Lys Glu
- GGT GAA AAA TCA AAA GGC TCG AAA AGG CGA AGG CAA 1874 Gly Glu Lys Ser Lys Gly Ser Lys Arg Arg Arg Gln
- GAG AGC AAA GAC CAC CTA CAG GCA CCA GAC ATA GAG 756
- GCT GTC ATC CTG ACT CTA CTG GGA CTT GCC
- GCT ATT TTG TTA ACA AGA TGG GCA CGA CGT
- CAC CTT CAG CAC CCG CGG TCA CCC ATG GCA CCC ATA 758 His Leu Gln His Pro Arg Ser Pro Met Ala Pro Ile 165 170 175
- GCA AGG TCT CAG ATA GCC GAG AAG AAA ACA AGG AAG 974 Ala Arg Ser Gln Ile Ala Glu Lys Lys Thr Arg Lys 240 245
- GTA CCA AAA GGA CAA GAA GGC CAA GTA GAG AAG ACT 1514
- GGT GAA AAA TCA AAA GGC TCG AAA AGG CGA AGG CAA 1874 Gly Glu Lys Ser Lys Gly Ser Lys Arg Arg Arg Gln 540 545 ATA CAG GAA GGA AGT ACA ACA AAA AAG TGG AAG AGT 1910 Ile Gln Glu Gly Ser Thr Thr Lys Lys Trp Lys Ser 550 555 560
Abstract
The invention comprises novel proteins and peptides derived from these proteins. The proteins are unique to sperm and testes, and the proteins and peptides are useful in vaccines for contraception in mammals. The proteins and peptides are also useful in diagnostic assays for assessing infertility. The invention also provides DNA molecules coding for the proteins and peptides and host cells containing the DNA molecules linked to expression control sequences for producing the proteins and peptides.
Description
PROTEINS AND PEPTIDES FOR CONTRACEPTIVE VACCINES AND FERTILITY DIAGNOSIS
This invention was developed in part by a subcontract under grant U54 HD 29099 from the National Institutes of Health (NIH) and a grant from the Contraceptive Research and Development Program (CSA-92-099) under a Cooperative Agreement with the U.S. Agency for International Development (DPE-3044-A-00-6063-00), which in turn receives funds for AIDS research from an interagency agreement with the National Institute of Child Health and Human Development (NICHD). The U.S. government may have rights in the invention.
FIELD OF THE INVENTION
This invention relates to novel proteins and peptides and their use in contraceptive vaccines and to assess infertility. The invention also relates to DNA molecules coding for the proteins and peptides and host cells containing the DNA molecules linked to expression control sequences for producing the proteins and peptides.
BACKGROUND OF THE INVENTION
Mammalian spermatozoa are highly specialized both in structure and function. These cells are the product of a developmental program that involves the expression of genes unique to the testes and of testis-specific variants of common somatic genes. Why testis and sperm should need specialized isoforms of common proteins or genes that are expressed only during spermatogenesis remains to be established.
Idiopathic infertility is characterized clinically as the inability to achieve a pregnancy by cohabiting couples with no apparent anatomical or functional reproductive pathology. In about 10% of such cases, the cause is attributed to immunological phenomena, including circulating antisperm antibodies in one or both partners. Presumably, such antibodies target spermatozoa and, as a consequence, conception is blocked or fails. Additionally, there is indirect evidence of an association between infertility and antisperm antibodies in both male and female patients. With respect to the subject of immunologic infertility, see Witkin et al., Am. J. Obstet. Gynecol., 158, 59-62 (1988); Clarke et al., Fertil. Steril., 49, 1018-1025 (1988); Mathur et al., Fertil. Steril., 36, 486-495 (1981); Menge, in Immunological Aspects Of Infertility And Fertility Regulation, pages 205-224 (Dhindsa and Schumacher eds. 1981); and Isojima et al., Am. J. Obstet. Gynecol., 101, 677-683 (1968).
These observations regarding immunologic infertility led to the suggestion that a vaccine based on a sperm antigen could provide an effective and innovative contraceptive technology. A number of sperm-specific proteins and peptides have been evaluated for use in contraceptive vaccines. See generally, Alexander et al., Reprod. Fertil. Dev., 6, 273-280 (1994) and Aitken et al., Brit. Med. Bull., 49, 88-99 (1993). For a recent review of sperm antigens, see Diekman and Goldberg, in Immunology Of Human Reproduction, Chapter 1 (1995).
The testis-specific isoform of lactate dehydrogenase, LDH-C4, and peptides derived from it are perhaps the most extensively characterized sperm antigens. See U.S. Patents Nos. 4,290,944, 4,310,456, 4,353,822, 4,354,967, 4,377,516, 4,392,997, 4,578,219, 4,585,587, 4,782,136, and 4,990,496; Wheat and Goldberg, in Isozymes: Current Topics In Biological and Medical Research, Volume 7: Molecular Structure and Regulation, pages 113-130 (1983); Millan et al., Proc. Natl. Acad. Sci. USA, 84, 5311-5315 (1987); Goldberg, in Gamete Interaction: Prospects For Immunocontraception, pages 63-73 (Alexander et al. eds. 1990); LeVan and Goldberg, Biochem. J., 273, 587-592 (1991); O'Hern and Goldberg, Proceed. Intern. Symp. Control. Rel. Bioact. Mater., 20, 394-395 (1993); O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); Kaumaya et al., J. Molec. Recog., 6, 81-94 (1993); and O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
Even though several sperm antigens have been identified, there remains a need to identify additional such antigens. In particular, it may be necessary to use a contraceptive vaccine containing several sperm antigens in genetically diverse populations of mammals, such as humans, to obtain effective contraception.
SUMMARY OF THE INVENTION
The invention provides purified proteins and peptides whose sequences comprise the sequence of an epitope of one of these proteins. The proteins and peptides are described in detail below.
The proteins are unique to sperm and testis, and the proteins and peptides can be used in vaccines for contraception in mammals. Accordingly, the invention further provides:
(1) immunogens comprising a peptide linked to a carrier, the peptide being capable of producing an antibody that reacts specifically with one of the proteins of the invention and having a sequence comprising a sequence which forms a B-cell epitope of the protein; and
(2) vaccines comprising the proteins (or immunogenic portions thereof), peptides and immunogens in a delivery system.
In addition, the proteins and peptides can be used in diagnostic assays for assessing infertility. The assays and kits for performing the assays are also part of the invention.
Finally, the invention provides DNA molecules coding for the proteins and peptides, and host cells containing the DNA molecules linked to expression control sequences, for producing the proteins and peptides.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1: Diagram comparing the sequences of somatic and testis-specific isoforms of calpastatin.
Figure 2: Computer-generated hydropathy plot comparing the first forty-one amino acids of somatic (solid bars) and testis-specific (open bars) isoforms of calpastatin.
Figure 3: Western blot of human tissue extracts (lane 1 - testis, lane 2 - sperm, lane 3 - liver) probed with affinity-purified rabbit antiserum to a peptide having the sequence of a B-cell epitope found only on the testis-specific isoform of calpastatin.
Figure 4: Graph of ELISA results. In particular, absorbance at 405 nm is plotted versus weeks post primary immunization of macaques with a peptide having the sequence of a B-cell epitope found only on the testis-specific isoform of calpastatin linked to a universal T-cell epitope by a four-amino acid linker.
Figure 5: Diagram of the technique of epitope mapping by nested deletions for clone C-2 and photograph of Coomassie blue-stained PAGE gel after separation of the resultant truncated proteins.
Figure 6: Western blots of truncated proteins produced by nested deletions performed to identify B-cell epitopes on the protein produced by clone C-2.
Figure 7: Diagram illustrating epitope identification for clone C-2.
Figure 8: Computer-generated plot of the occurrence of the amino acid valine along the length of the clone L-7 protein.
Figure 9: Western blots of truncated proteins produced by nested deletions performed to identify B-cell epitopes on the protein produced by clone L-7.
Figure 10: Diagram illustrating epitope identification for clone L-7.
DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EMBODIMENTS
In a first aspect, the invention provides a purified protein which is a testis-specific isoform of calpastatin.
"Testis-specific" is used herein to mean that the isoform is found in the testes and sperm, but is not found in other tissues. In contrast to the testis-specific isoform are the somatic isoforms of calpastatin. The somatic isoforms are those found in one or more, generally several, types of tissues. The somatic isoforms may be found in testes and sperm but, if so, will also be found in at least one other type of tissue.
Clone Y-19, coding for a human testis-specific isoform of calpastatin, was identified by screening a human testis cDNA library with sera from infertile patients positive for antisperm antibodies (see Example 1 below). The complete sequence of this human testis-specific isoform of calpastatin is given in Chart A below.
Affinity-purified antiserum specific for this testis-specific isoform of calpastatin was used to localize the isoform on human sperm by immunofluorescence. Diffuse, granular fluorescence was observed throughout the acrosome, and intense fluorescence was observed in the equatorial segment of the sperm (see Example 4).
Calpastatin is the peptide inhibitor of calpain, a cysteine protease. Calpain has been localized to the sperm head and appears to be involved in the acrosome reaction. See, Schollmeyer, Biol. Reprod., 34, 721-731 (1986). Although not wishing to be bound by any particular theory, it is believed that infertility in individuals having antibodies directed to testis-specific calpastatin occurs as follows. The acrosome reaction, which must occur in order for the sperm to penetrate the zona pellucida of the egg, is triggered by an influx of Ca+2. Wasserman, Annu. Rev. Cell Biol., 2, 109-142 (1987). Calpain, then, in the presence of the Ca+2 would hydrolyze calpastatin, thereby releasing protease inhibition and permitting proteolytic activity in membrane fusion phenomena. Goll et al., Bioessays, 14, 549-556 (1992). Perturbation of this sequence of events by antibodies directed to testis-specific calpastatin would compromise fertilization and concomitantly cause infertility. Preliminary studies have demonstrated loss of calpastatin immunoreactivity from acrosome-reacted sperm, a result predicted from this theory. Also, the immunofluorescence studies described above show that testis-specific calpastatin is found on the surface of sperm and would, therefore, be accessible to antibodies.
The invention further provides a protein which is the protein produced by clone C-2. Clone C-2 is a human cDNA clone that was identified by screening a human testis cDNA library with sera from infertile patients positive for antisperm antibodies (see Example 1 below). The C-2 protein is found in testis and sperm, but it is not found in other tissues. The complete amino acid sequence of the C-2 protein is set forth in Chart B below.
The invention also provides a protein which is the protein produced by clone L-7. Clone L-7 is a human cDNA clone that was identified by screening a human testis cDNA library with sera from infertile patients positive for antisperm antibodies (see Example 1 below) . The L-7 protein is found in testis and sperm, but it is not found in other tissues. Affinity-purified antiserum specific for the L-7 protein was used to localize the L-7 protein on human sperm by immunofluorescence. Fluorescence was observed throughout the acrosome. The complete amino acid sequence of the L-7 protein is set forth in Chart C below.
As noted above, the Y-19, C-2 and L-7 proteins are human proteins. Corresponding proteins in other mammals would be expected to be at least 70% homologous to these human proteins. The corresponding proteins in other mammals can be obtained by the method described in Example 1 or by using the sequences given in Charts A, B and C to design DNA probes which can be used to screen testis gene libraries, preferably cDNA libraries, of other mammals. Methods of making gene (e.g., cDNA) libraries, designing probes for screening them, identifying and isolating a desired clone, producing protein from the clone, etc., are well known in the art. See, e.g., Ausubel et al., Current Protocols In Molecular Biology, Volumes 1 and 2 (John Wiley and Sons, New York 1989) and Sambrook et al., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, New York 1989). Testis cDNA libraries can also be purchased from ClonTech Laboratories, Inc., 1020 E. Meadow Circle, Palo Alto, CA 94303-4230.
The proteins of the invention can be used in contraceptive vaccines in mammals. Preferably a protein from the same species of mammal that is to be immunized is used in the vaccine. However, given the expected close homology of the proteins from different mammalian species, it is expected that proteins from other species, especially closely-related species, can be used.
Immunogenic portions of the proteins can also be used in the vaccines. Immunogenic portions of the proteins must include at least a B-cell epitope. In choosing an immunogenic portion of testis-specific calpastatin, a portion must be chosen which includes sequences found on the testis-specific isoform but not found on the somatic isoforms.
Further, care should be taken in using testis-specific calpastatin, or an immunogenic portion thereof, since somatic isoforms exist, and cross-reaction with these somatic isoforms may occur if the complete protein or an immunogenic portion containing an immunogenic somatic sequence is used in the vaccine. This may cause deleterious side effects and should be avoided except when the vaccine is to be used for contraception in pest species (e.g., rodents).
Preferably peptides derived from the proteins of the invention are used in the vaccines. To produce antibodies that react specifically with one of the proteins of the invention, the peptides must comprise at least a B-cell epitope of the protein. A peptide derived from testis-specific calpastatin must include a B-cell epitope from the sequences found on the testis-specific isoform but not found on the somatic isoforms. The peptide may include other sequences besides those which form the B-cell epitope, but these sequences must be chosen so that the antibody produced as a result of immunization with the vaccine containing the peptide will react specifically with the protein found in testis and sperm.
Methods of identifying B-cell epitopes of a protein are known. See O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); O'Hern and Goldberg, Proceed. Intern. Symp. Control Rel. Bioact. Mater., 20, 394-395 (1993). Three criteria are essential for immunogenicity: a size greater than 10 amino acids; surface accessibility of the sequence; and hypervariability (degree of foreignness). See O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); O'Hern and Goldberg, Proceed. Intern. Symp. Control Rel. Bioact. Mater., 20, 394-395 (1993).
The human testis-specific isoform of calpastatin has the following sequence at its N-terminal:
Met Gly Gln Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser Pro (1-14)
Ala Thr Val Ser Thr Ile Ser Phe Val Thr Val Asn Ala Glu (15-28)
Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys Gln (29-41)
SEQ ID NO:1.
This sequence of 41 amino acids is unique to the testis-specific isoform of calpastatin. Peptides having this sequence, or a portion of it that includes the sequence from amino acid 26 through amino acid 41, can be used to elicit antibodies that react with the testis-specific isoform of calpastatin, but do not react with somatic isoforms of calpastatin. Amino acids 26-41 in the above sequence have been identified as a B-cell epitope.
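To illustrate why residues 26-41 stand out as a surface-accessible candidate, the following minimal Python sketch scores the 41-residue N-terminal sequence (SEQ ID NO:1, written in one-letter code) with the published Kyte and Doolittle hydropathy scale. It is a sketch under stated assumptions, not the analysis used to produce Figure 2: only the Kyte-Doolittle scale is used (the Hopp-Woods scale is omitted), the 7-residue window is an arbitrary choice, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# SEQ ID NO:1 in one-letter code (41 residues).
SEQ_ID_NO_1 = "MGQFLSSTFLEGSPATVSTISFVTVNAEEQEKQFVSSRTKQ"


def mean_hydropathy(seq: str, first: int, last: int) -> float:
    """Mean hydropathy over residues first..last (1-based, inclusive)."""
    segment = seq[first - 1:last]
    return sum(KYTE_DOOLITTLE[r] for r in segment) / len(segment)


def window_profile(seq: str, window: int = 7) -> list:
    """Sliding-window mean hydropathy; window size is an assumption."""
    return [
        mean_hydropathy(seq, i + 1, i + window)
        for i in range(len(seq) - window + 1)
    ]


print("residues 1-25 :", round(mean_hydropathy(SEQ_ID_NO_1, 1, 25), 2))
print("residues 26-41:", round(mean_hydropathy(SEQ_ID_NO_1, 26, 41), 2))
```

On this scale the first 25 residues average out as hydrophobic and residues 26-41 as hydrophilic, which is consistent with the choice of the 26-41 segment as the epitope.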
The protein coded for by clone C-2 contains the following sequence:
Thr Asn Ile Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg (1-14)
Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu Asp Pro Thr Phe (15-28)
Glu (29)
SEQ ID NO:8.
Peptides having this sequence, or a portion of it that includes the sequence from amino acid 4 through amino acid 17, can be used to elicit antibodies that react specifically with the C-2 protein. Amino acids 4-17 in the above sequence have been identified as a B-cell epitope.
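The residue ranges in this description are 1-based and inclusive. As a small illustration, the following Python sketch extracts amino acids 4 through 17 from SEQ ID NO:8, written with standard three-letter codes. The helper name `residues` is hypothetical and the sketch only demonstrates the numbering convention.

```python
# SEQ ID NO:8 as a list of three-letter residues (29 amino acids).
SEQ_ID_NO_8 = (
    "Thr Asn Ile Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg "
    "Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu Asp Pro Thr Phe Glu"
).split()


def residues(seq: list, first: int, last: int) -> list:
    """Return residues first..last using 1-based, inclusive numbering."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("residue range is outside the sequence")
    return seq[first - 1:last]


epitope = residues(SEQ_ID_NO_8, 4, 17)
print(len(epitope), "residues:", " ".join(epitope))
```

The 29 residues of SEQ ID NO:8 match the length of the region of amino acids 426-454 mapped in the nested-deletion experiments.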
The protein coded for by clone L-7 contains the following sequence:
Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val (1-14)
Leu Lys Gly Gln Glu Ala (15-20)
SEQ ID NO:11
and the following sequence:
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys (1-14)
Gly Asp Lys Asn (15-18)
SEQ ID NO:12.
Both of these sequences of amino acids (SEQ ID NO:11 and SEQ ID NO:12) have been identified as B-cell epitopes, and peptides having these sequences can be used to elicit antibodies that react specifically with the protein.
The peptides comprising a B-cell epitope of one of the proteins of the invention are preferably used in the vaccines in the form of an immunogen comprising the peptide linked to a carrier. Suitable carriers are compounds capable of stimulating the production of antibodies to haptens coupled to them in a host animal. Many such carriers are well-known. For instance, the carrier may be a high molecular weight compound. Suitable high molecular weight compounds include proteins, polypeptides, carbohydrates, polysaccharides, lipopolysaccharides, nucleic acids, and the like of sufficient size and immunogenicity. Preferred high molecular weight compounds are proteins and polypeptides. Suitable immunogenic carrier proteins and polypeptides will generally have molecular weights between 4,000 and 10,000,000, and preferably greater than 15,000. Such suitable carriers include proteins such as albumins (e.g., bovine serum albumin, ovalbumin, human serum albumin), immunoglobulins, thyroglobulins (e.g., bovine thyroglobulin), hemocyanins (e.g., Keyhole Limpet hemocyanin), toxins (e.g., diphtheria toxoid, tetanus toxoid) and polypeptides such as polylysine or polyalaninelysine. Preferred are diphtheria toxoid and tetanus toxoid.
Methods of coupling the peptides to high molecular weight carriers are well-known. For instance, the peptide may be coupled to the carrier with conjugating reagents such as glutaraldehyde, a water soluble carbodiimide such as 1-(3-dimethylaminopropyl)-3-ethylcarbodiimide hydrochloride (ECDI), N,N'-carbonyldiimidazole, 1-hydroxybenzotriazole monohydrate, N-hydroxysuccinimide, 6-maleimidocaproyl-N-hydroxysuccinimide, N-trifluoroacetylimidazole, cyanogen bromide, 3-(2'-benzothiazolyl-dithio)propionate succinimide ester, hydrazides or affinity labeling methods. See also Pierce Handbook and General Catalog (1989) for a list of possible coupling agents.
Additional references concerning conventional high molecular weight immunogenic carrier materials and techniques for coupling haptens thereto are: Erlanger, Methods In Enzymology, 70, 85-104 (1980); Makela and Seppala, Handbook of Experimental Immunology (Blackwell 1986); Parker, Radioimmunoassay of Biologically Active Compounds (Prentice-Hall 1976); Butler, J. Immunol. Meth., 7, 1-24 (1974); Weinryb and Shroff, Drug. Metab. Rev., 10, 271-83 (1979); Broughton and Strong, Clin. Chem., 22, 726-32 (1976); Playfair et al., Br. Med. Bull., 30, 24-31 (1974); U.S. Patents Nos. 4,990,596 and 4,782,136.
The number of peptides attached to the high molecular weight carrier is called the "epitopic density." The epitopic density can range from 1 to the number of available coupling groups on the carrier molecule. The epitopic density on a particular carrier will depend upon the molecular weight of the carrier and the density and availability of coupling sites. Preferably, only high molecular weight carriers having an epitopic density of at least 15 peptides per molecule are used in the vaccines of the invention.
The carrier may also be a peptide which has a sequence comprising the sequence of a T-cell epitope of one of the proteins of the invention or of another protein. Methods of identifying T-cell epitopes are known. See, O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); O'Hern and Goldberg, Proceed. Intern. Symp. Control Rel. Bioact. Mater., 20, 394-395 (1993). The three criteria for selection of a T-cell epitope are: a size of 8-12 amino acids; hypervariability; and one or more representations of the tetrapeptide motif previously reported to be associated with T-cell epitopes. O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993); O'Hern and Goldberg, Proceed. Intern. Symp. Control Rel. Bioact. Mater., 20, 394-395 (1993).
Most preferably the carrier is a peptide which has a sequence comprising the sequence of a promiscuous T-cell epitope. A promiscuous T-cell epitope is a T-cell epitope that is recognized by individuals of several different major histocompatibility (MHC) types. Promiscuous T-cell epitopes are known. See, Ho et al., Eur. J. Immunol., 20, 477-483 (1990); Kaumaya et al., J. Molec. Recog., 6, 81-94 (1993). A preferred promiscuous T-cell epitope has the following sequence:
Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr (1-14)
Phe Pro Ser Val (15-18)
SEQ ID NO:5.
A peptide carrier which has a sequence comprising the sequence of a T-cell epitope may include other sequences linked to the N-terminal or C-terminal of the T-cell epitope. In particular, additional amino acids may be provided to link the B-cell epitope on the peptide to the T-cell epitope on the carrier. These linking amino acids should form a four-residue β-turn based on examination of 33 patterns in native proteins that code for αα corners. Efimov, FEBS Lett., 166, 33 (1984); Kaumaya et al., Biochemistry, 29, 13-23 (1990).
Peptides comprising a B-cell epitope may be coupled to a peptide carrier comprising a T-cell epitope in the same manner as described above for high molecular weight proteins and polypeptides to form the immunogen. However, such immunogens are preferably synthesized as a single peptide in the ways described below for the synthesis of peptides.
The vaccines contain one or more of the proteins (or an immunogenic portion thereof), peptides and immunogens of the invention in a delivery system. Suitable delivery systems are well known. For instance, the delivery system may simply be a solvent (such as saline and buffers) or other liquid (such as an oil). However, the delivery system preferably enhances the immune response. Such delivery systems include aluminum salts, water-oil emulsions (such as incomplete Freund's adjuvant), saponins, liposomes, immune stimulating complex, lipopolysaccharides, mycobacterial adjuvants (such as Freund's complete adjuvant), Squalene-Arlacel A containing the synthetic muramyl dipeptide N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP11637; Ciba-Geigy Pharmaceuticals, Basel, Switzerland), live vectors, antigen immunotargeting materials, and polymers (e.g., biodegradable microspheres, such as polylactide-polyglycolide microspheres, and block copolymers for sustained release). See Goldberg, in Gamete Interaction: Prospects For Immunocontraception, pages 63-73 (1990); Alexander et al., Reprod. Fertil. Dev., 6, 273-80 (1994); O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
The vaccines may be administered in any conventional manner, including orally, intradermally, subcutaneously, intramuscularly, etc. to male or female mammals to inhibit fertilization of eggs by sperm. Suitable routes of administration and effective amounts (effective dosages and number of doses) necessary to inhibit conception can be determined empirically as is known in the art. By "inhibit" is meant at least a 50% reduction in the number of female mammals becoming pregnant as a result of the administration of the vaccine. Preferably at least a 75%, most preferably at least a 90%, reduction is achieved.
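The definition of "inhibit" given above can be illustrated with a short calculation. The following is only a sketch: it assumes the reduction is measured as the relative drop in pregnancy rate between an untreated control group and a vaccinated group, and the function names and animal counts are hypothetical.

```python
def percent_reduction(control_pregnant, control_total, treated_pregnant, treated_total):
    """Relative reduction (in percent) of the pregnancy rate in treated vs. control females."""
    control_rate = control_pregnant / control_total
    treated_rate = treated_pregnant / treated_total
    return 100.0 * (control_rate - treated_rate) / control_rate


def classify(reduction):
    """Map a percent reduction onto the thresholds stated in the text (50%, 75%, 90%)."""
    if reduction >= 90.0:
        return "most preferred (at least 90%)"
    if reduction >= 75.0:
        return "preferred (at least 75%)"
    if reduction >= 50.0:
        return "inhibits (at least 50%)"
    return "does not meet the definition of inhibit"


# Hypothetical counts: 18 of 20 control females pregnant, 4 of 20 vaccinated females pregnant.
reduction = percent_reduction(18, 20, 4, 20)
label = classify(reduction)
print(f"{reduction:.1f}% reduction: {label}")
```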
The proteins and peptides comprising a B-cell epitope can also be used in assays to assess infertility. The peptides may be used as such or may be linked to a carrier. The carriers (e.g., large molecular weight and T-cell epitope carriers) and methods of linking the peptides to the carriers are the same as described above for the immunogens. To perform the assay, the protein, peptide or peptide linked to a carrier is contacted with a body fluid of a patient under conditions that permit antibodies in the body fluid to bind to it. Thus, the assays are immunoassays that allow for the determination of whether the body fluid of a patient contains antibodies that bind to the protein, peptide or peptide linked to a carrier. Suitable immunoassays and reagents for use therein are well known in the art, and those skilled in the art will be able to determine operative and optimal assay conditions using only ordinary skill in the art.
Preferably the protein, peptide or peptide linked to a carrier will be immobilized on a solid surface. Suitable solid surfaces are well-known and include glass, polystyrene, polypropylene, polyethylene, nylon, paper, fiberglass, polyacrylamide and agaroses. The immobilized material is contacted with the body fluid so that antibodies present in the body fluid can bind to the protein, peptide or peptide linked to a carrier. After washing away unbound materials, a labeled secondary antibody or other material which binds specifically to the antibody in the body fluid is added as a means to detect and quantitate the antibody bound to the protein, peptide or peptide linked to a carrier. Suitable labels are well known in the art. They include enzymes, fluorophores, radionucleotides, bioluminescent labels, chemiluminescent labels, and particulate labels. The binding and detection of these labels can be accomplished using standard techniques well known to those skilled in the art.
The body fluid may be any body fluid that contains antibodies. Suitable body fluids include serum, plasma, cervical mucus and seminal plasma. The assays may be used to assess infertility in patients unable to conceive. If the patient has antibodies specific for one of the proteins of the invention, then this may be the cause, or one of the causes, of the infertility. The assays may also be used to evaluate whether administration of the vaccines of the invention has been effective in immunizing recipients of the vaccines.
The invention also comprises a kit. The kit is a packaged combination of one or more containers holding reagents useful in performing the immunoassays. Suitable containers for the reagents include bottles, vials, test tubes, microtiter plates, a solid phase (see listing above) held in a molded plastic device, and other containers known in the art. The kit will contain at least one container holding a protein, peptide comprising a B-cell epitope or such a peptide linked to a carrier. The kit may also comprise a container of a labeled component useful for detecting or quantitating the antibodies in the body fluids that bind to the protein, peptide or peptide linked to a carrier. The kit may also contain other materials which are known in the art and which may be desirable from a commercial and user standpoint, such as buffers, enzyme substrates, diluents, standards, etc. Finally, the kit may include containers, such as test tubes and microtiter plates, for performing the immunoassay.
The peptides of the invention may be made in a variety of ways. For instance, solid phase synthesis techniques may be used. Suitable techniques are well known in the art, and include those described in Merrifield, in Chem. Polypeptides, pp. 335-61 (Katsoyannis and Panayotis eds. 1973); Merrifield, J. Am. Chem. Soc., 85, 2149 (1963); Davis et al., Biochem. Int'l, 10, 394-414 (1985); Stewart and Young, Solid Phase Peptide Synthesis (1969); U.S. Patents Nos. 3,941,763, 4,782,136, 4,990,596; Finn et al., in The Proteins, 3rd ed., vol. 2, pp. 105-253 (1976); and Erickson et al. in The Proteins, 3rd ed., vol. 2, pp. 257-527 (1976). Solid phase synthesis is the preferred method of making the peptides of the invention.
The peptides may also be produced by culturing a host cell comprising a DNA molecule coding for the peptide operatively linked to expression control sequences under conditions permitting expression of the peptide. The proteins of the invention may also be produced in this manner. In particular, the proteins and peptides can be produced in transformed host cells using recombinant DNA techniques. Such techniques and suitable host cells and other reagents for use therein are well known in the art. For instance, the selection of a particular host cell is dependent upon a number of factors recognized by the art. These include, for example, compatibility with the chosen expression vector, use and toxicity of the protein or peptide encoded by the expression vector, rate of transformation, expression characteristics, bio-safety, and costs. A balance of these factors must be struck with the understanding that not all hosts may be equally effective for the expression of a particular protein or peptide. Within the above guidelines, useful host cells include bacteria, yeast and other fungi, animal cell lines, animal cells in an intact animal, or other host cells known in the art.
The host cells may be transformed with a vector comprising DNA encoding the peptide or protein. On the vector, the coding sequence must be operatively linked to a promoter. The promoter used in the vector may be any sequence which shows transcriptional activity in the host cell and may be derived from genes encoding homologous or heterologous proteins and either extracellular or intracellular proteins, such as amylase, glycoamylases, proteases, lipases, cellulases, and glycolytic enzymes. However, the promoter need not be identical to any naturally-occurring promoter. It may be composed of portions of various promoters or may be partially or totally synthetic. Guidance for the design of promoters is provided by studies of promoter structure such as that of Harley and Reynolds, Nucleic Acids Res., 15, 2343-61 (1987). Also, the location of the promoter relative to the transcription start may be optimized. See Roberts, et al., Proc. Natl Acad. Sci. USA, 76, 760-4 (1979).
The promoter may be inducible or constitutive, and is preferably a strong promoter. By "strong," it is meant that the promoter provides for a high rate of transcription in the host cell.
In the vector, the coding sequences must be operatively linked to transcription termination sequences, as well as to the promoter. The coding sequence may also be operatively linked to expression control sequences other than the promoters and transcription termination sequences. These additional expression control sequences include activators, enhancers, operators, stop signals, cap signals, polyadenylation signals, ribosome binding sites, and other signals involved with the control of transcription and translation.
In prokaryotic mRNA, the site at which the ribosome binds to the messenger includes a sequence of 3-9 purines. The consensus sequence of this stretch is 5'-AGGAGG-3', and it is frequently referred to as the Shine-Dalgarno sequence. The sequence of the ribosome binding site may be modified to alter expression. See Hui and DeBoer, Proc. Natl. Acad. Sci. USA, 84, 4762-66 (1987). Comparative studies of ribosomal binding sites, such as the study of Scherer, et al., Nucleic Acids Res., 8, 3895-3907 (1987), may provide guidance as to suitable base changes.
The ribosome binding site lies 3-12 bases upstream of the start (AUG) codon. The exact distance between the ribosome binding site and the translational start codon, and the base sequence of this "spacer" region, affect the efficiency of translation and may be optimized empirically.
To achieve optimal expression of a protein or peptide in prokaryotes, a ribosome binding site and spacer that provide for efficient translation in the prokaryotic host cell should be provided. A preferred ribosome binding site and spacer sequence for optimal translation in E. coli are described in Springer and Sligar, Proc. Nat'l Acad. Sci. USA, 84, 8961-65 (1987) and von Bodman et al., Proc. Nat'l Acad. Sci. USA, 83, 9443-47 (1986). The sequence of this ribosome binding site and spacer is: AGGAGAACAACAACC [SEQ ID NO:28].
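The spacing rule described above (a ribosome binding site lying 3-12 bases upstream of the start codon) can be checked on a candidate construct with a few lines of code. This is a minimal sketch under stated assumptions: the purine-rich core is taken to be the first five bases (AGGAG) of SEQ ID NO:28, the spacer is counted from the end of that core to the start codon, and the coding bases appended after the spacer are hypothetical.

```python
def spacer_length(seq, rbs_core="AGGAG", start_codon="ATG"):
    """Number of bases between the end of the RBS core and the first downstream start codon.

    Returns None if either element is not found.
    """
    rbs_pos = seq.find(rbs_core)
    if rbs_pos < 0:
        return None
    rbs_end = rbs_pos + len(rbs_core)
    start_pos = seq.find(start_codon, rbs_end)
    if start_pos < 0:
        return None
    return start_pos - rbs_end


RBS_AND_SPACER = "AGGAGAACAACAACC"      # SEQ ID NO:28 as given in the text (DNA letters)
construct = RBS_AND_SPACER + "ATGGCT"   # hypothetical start codon plus one more codon

gap = spacer_length(construct)
in_window = gap is not None and 3 <= gap <= 12   # the 3-12 base window stated in the text
print(gap, in_window)
```

Varying the spacer length or composition in such a script is only a bookkeeping aid; as the text notes, the effect on translation efficiency has to be determined empirically.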
The consensus sequence for the translation start sequence of eukaryotes has been defined by Kozak (Cell, 44, 283-292 (1986)) to be: C(A/G)CCAUGG. Deviations from this sequence, particularly at the -3 position (A or G), have a large effect on translation of a particular mRNA. Virtually all highly expressed mammalian genes use this sequence. Highly expressed yeast mRNAs, on the other hand, differ from this sequence and instead use the sequence (A/Y)A(A/U)AAUGUCU (Cigan and Donahue, Gene, 59, 1-18 (1987)). These sequences may be altered empirically to determine the optimal sequence for use in a particular host cell.
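The two start-site consensus sequences quoted above translate directly into regular expressions. The sketch below assumes mRNA written in RNA letters (A, C, G, U) and reads Y as "C or U"; the example mRNA fragments and the helper name are hypothetical.

```python
import re

# C(A/G)CCAUGG: the A of AUG is position +1, so the (A/G) sits at position -3.
MAMMALIAN_START = re.compile(r"C[AG]CCAUGG")
# (A/Y)A(A/U)AAUGUCU, with Y = pyrimidine (C or U).
YEAST_START = re.compile(r"[ACU]A[AU]AAUGUCU")


def has_minus3_purine(mrna, aug_index):
    """True if the base three positions upstream of the AUG is A or G."""
    return aug_index >= 3 and mrna[aug_index - 3] in "AG"


strong = "GGCACCAUGGCU"   # hypothetical fragment matching the mammalian consensus
weak = "GGCUCCAUGGCU"     # same fragment with U at the -3 position

strong_match = MAMMALIAN_START.search(strong) is not None
weak_match = MAMMALIAN_START.search(weak) is not None
strong_minus3 = has_minus3_purine(strong, strong.find("AUG"))
weak_minus3 = has_minus3_purine(weak, weak.find("AUG"))
yeast_match = YEAST_START.search("AAAAAUGUCU") is not None

print(strong_match, weak_match, strong_minus3, weak_minus3, yeast_match)
```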
Methods of preparing DNA molecules are well known in the art. For instance, sequences coding for the protein or peptide could be excised from genes or cDNA clones by methods well known in the art. However, the DNA molecules encoding a protein or peptide of the invention are preferably chemically synthesized. Methods of chemically synthesizing DNA are well known in the art. Chemical synthesis is preferable for several reasons. First, chemical synthesis is desirable because codons preferred by the host in which the DNA sequence will be expressed may be used to optimize expression. Not all of the codons need to be altered to obtain improved expression, but greater than 50%, most preferably at least about 80%, of the codons should be changed to host-preferred codons. The codon preferences of many host cells, including E. coli, yeast, and other prokaryotes and eukaryotes, are known. See Maximizing Gene Expression, pages 225-85 (Reznikoff & Gold, eds., 1986). The codon preferences of other host cells can be deduced by methods known in the art.
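The "greater than 50%, most preferably at least about 80%" guideline above can be tracked while designing a synthetic coding sequence by counting how many codons differ from the native sequence. The following sketch shows only that bookkeeping. The two short coding sequences are hypothetical, and the synonymous substitutions in them were chosen arbitrarily for illustration; they are not a statement about the codon preferences of any host.

```python
def codons(cds):
    """Split a coding sequence into triplets."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fraction_changed(native, redesigned):
    """Fraction of codon positions at which the redesigned sequence differs from the native one."""
    a, b = codons(native), codons(redesigned)
    if len(a) != len(b):
        raise ValueError("sequences must encode the same number of codons")
    return sum(x != y for x, y in zip(a, b)) / len(a)


native = "ATGAAAGAACAATTT"       # hypothetical: Met Lys Glu Gln Phe
redesigned = "ATGAAGGAGCAGTTC"   # same peptide, four synonymous codons swapped

changed = fraction_changed(native, redesigned)
print(f"{changed:.0%} of codons changed")
```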
The use of chemically synthesized DNA also allows for the selection of codons with a view to providing unique or nearly unique restriction sites at convenient points in the sequence. The use of these sites provides a convenient means of constructing the synthetic coding sequences. In addition, if secondary structures formed by the messenger RNA transcript interfere with transcription or translation, they may be eliminated by altering the codon selections. Chemical synthesis also allows for the use of optimized expression control sequences with the DNA sequence coding for a protein or peptide. In this manner, optimal expression of the protein or peptide can be obtained. For instance, as noted above, promoters can be chemically synthesized and their location relative to the transcription start optimized. Similarly an optimized ribosome binding site and spacer can be chemically synthesized and used with coding sequences that are to be expressed in prokaryotes.
DNA coding for a signal or signal-leader sequence may be located upstream of the DNA sequence encoding the protein or peptide. A signal or signal-leader sequence is an amino acid sequence at the amino terminus of a protein which allows the protein to which it is attached to be secreted from the cell in which it is produced. Suitable signal and signal-leader sequences are well known. Although secreted proteins are often easier to purify, secretion is generally not preferred since expression levels are much lower than those that can be obtained in the absence of secretion.
The vector used to transform the host cells may have one or more replication systems which allow it to replicate in the host cells. In particular, when the host is a yeast, the vector should contain the yeast 2μ replication genes REP 1-3 and origin of replication. Many bacterial replicons are known. Alternatively, an integrating vector may be used which allows the integration into the host cell's chromosome of the sequence coding for the protein or peptide. Although the copy number of the coding sequence in the host cells would be lower than when self-replicating vectors are used, transformants having sequences integrated into their chromosomes are generally quite stable.
When the vector is a self-replicating vector, it is preferably a high copy number plasmid so that high levels of expression are obtained. As used herein, a "high copy number plasmid" is one which is present at about 100 copies or more per cell. Many suitable high copy number plasmids are known.
The vector desirably also has unique restriction sites for the insertion of DNA sequences and a sequence coding for a selectable or identifiable phenotypic trait which is manifested when the vector is present in the host cell ("a selection marker") . If a vector does not have unique restriction sites, it may be modified to introduce or eliminate restriction sites to make it more suitable for further manipulations.
After the vector comprising the sequence coding for the protein or peptide is prepared, it is used to transform the host cells. Methods of transforming host cells are well known in the art, and any of these methods may be used. Transformed host cells are selected in known ways and then cultured to produce the protein or peptide. The methods of culture are those well known in the art for the chosen host cell, but the use of enriched media (rather than minimal media) is preferred since higher yields are obtained. The expressed protein or peptide may be recovered using methods of recovering and purifying proteins from cell cultures which are well known in the art.
EXAMPLES
EXAMPLE 1: Identification Of Testis-Specific Clones
A human testis cDNA library was screened with sera from infertile patients positive for antisperm antibodies. This screening was performed as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994). It is interesting to note that these patients, although infertile, were otherwise healthy. A total of 43 unique cDNA inserts were detected by the screening, of which four were testis-specific by Northern blot analysis (performed as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994); see below). One of the four clones turned out to encode a truncated mRNA for a somatic peptide and was not evaluated further. The remaining three clones were designated Y-19, C-2 and L-7.
EXAMPLE 2: Characterization Of Clone Y-19
1. DNA Sequence
The sequence of the cDNA insert of clone Y-19 was determined as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994). The DNA sequence of the insert and the deduced corresponding amino acid sequence are set forth in Chart A below. Homology searches of the GenEMBL databases (performed as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994)) indicated that clone Y-19 codes for a testis-specific isoform of human calpastatin.
Figure 1 shows the relationship between the published sequence of DNA coding for somatic calpastatin (solid) and the testis-specific region of clone Y-19 (diagonal stripes). Clone Y-19 appears to be a product of alternative splicing whereby DNA coding for somatic calpastatin domains L and 1 has been deleted and replaced with DNA coding for a unique, testis-specific L domain of approximately 65 amino acids (stripes). The rest of the cDNA sequence of clone Y-19 is virtually identical to the published sequence of somatic calpastatin. However, DNA coding for testis-specific calpastatin contains 2 unique restriction sites (arrows).
2. Northern Blots
Northern blots were performed as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994).
A 1 kb fragment of clone Y-19 was used to probe a Northern blot of human poly A+ RNA from eight different human tissues (leukocytes, colon, small intestine, ovary, testis, prostate, thymus and spleen; Multiple Tissue Northern blots purchased from Clontech, Palo Alto, CA). Two mRNAs of 4.3 and 2.8 kb were detected by the probe in all tissues. A third mRNA of 1.9 kb was detected only in testis.
The Multiple Tissue Northern blots probed with the 1 kb Y-19 fragment were stripped as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994) and re-probed with a 135 bp fragment of the unique 5' sequence of Y-19. Only the 1.9 kb mRNA in testis was detected with this probe.
3. Serum YM
The serum that identified clone Y-19 (serum YM) agglutinates human sperm in a head-to-head orientation and completely inhibits cervical mucus penetration. These assays were performed as described in Schulman et al., Am. J. Obstet. Gynecol., 123, 139-144 (1975) and Ansbacher et al., Fertil. Steril., 24, 305-308 (1973).
EXAMPLE 3: Identification Of B-Cell Epitope Of Testis-Specific Calpastatin
The complete amino acid sequence of human testis-specific calpastatin coded for by clone Y-19 is set forth in Chart A below. A comparison of the first 41 amino acids of human somatic calpastatin with the first 41 residues of human testis-specific calpastatin showed no sequence homology between them:
Somatic:         MNPTETKAIPVSQQMEGPHLPNKKKHKKQAVKTEPEKKSQS  SEQ ID NO:15
Testis-Specific: MGQFLSSTFLEGSPATVSTISFVTVNAEEQEKQFVSSRTKQ  SEQ ID NO:1
Beginning at residue 42 of testis-specific calpastatin (residue 387 of somatic calpastatin) , the two sequences are virtually identical.
Figure 2 shows a computer-generated hydropathy plot of the first 41 residues of somatic calpastatin (solid lines) versus the first 41 residues of testis-specific calpastatin (open bars). This hydropathy plot was generated using algorithms described in Hopp and Woods, Proc. Natl. Acad. Sci. USA, 78, 3824-28 (1981) and Kyte and Doolittle, J. Mol. Biol., 157, 105 (1982). Only residues 26-41 of testis-specific calpastatin are both hydrophilic and unique to the testis isoform. Therefore, this segment was chosen as a testis-specific B-cell epitope. This segment has the sequence:
Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr
                5                   10
Lys Gln
15
SEQ ID NO:2.
The hydropathy plot also shows that testis-specific calpastatin has a hydrophobic tail. This hydrophobic tail could serve as a membrane anchor for the protein.
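The hydrophilicity argument used above to select residues 26-41 can be reproduced approximately with the published Kyte-Doolittle hydropathy scale, on which negative values indicate hydrophilic residues. This is a sketch, not the procedure actually used to generate Figure 2: it uses the Kyte-Doolittle scale only (not the Hopp-Woods scale), and the 7-residue window and the function names are assumptions.

```python
# Kyte-Doolittle hydropathy values (J. Mol. Biol., 157, 105 (1982)); negative = hydrophilic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Average Kyte-Doolittle value over a peptide."""
    return sum(KD[aa] for aa in seq) / len(seq)


def window_profile(seq, window=7):
    """Sliding-window average hydropathy; one value per full window."""
    return [mean_hydropathy(seq[i:i + window]) for i in range(len(seq) - window + 1)]


# First 41 residues of testis-specific calpastatin, as given in Example 3.
TESTIS = "MGQFLSSTFLEGSPATVSTISFVTVNAEEQEKQFVSSRTKQ"

epitope = TESTIS[25:41]                  # residues 26-41 (1-based numbering)
epitope_score = mean_hydropathy(epitope)
n_terminal_score = mean_hydropathy(TESTIS[:25])
profile = window_profile(TESTIS)

print(epitope, round(epitope_score, 2), round(n_terminal_score, 2))
```

On this scale the 26-41 segment averages well below zero while residues 1-25 average above zero, which is in line with the choice of that segment as the hydrophilic, testis-specific epitope.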
EXAMPLE 4: Preparation Of Immunogen Containing B-Cell Epitope Of Testis-Specific Calpastatin And Uses Thereof
A peptide immunogen was prepared containing the testis-specific calpastatin B-cell epitope identified in Example 3 linked to a carrier comprising a universal T-cell epitope derived from tetanus toxoid. The T-cell epitope had the following sequence:
Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr
                5                   10
Phe Pro Ser Val
15
SEQ ID NO:5.
Four amino acids (Gly Pro Ser Leu) were used to link the B-cell epitope to the T-cell epitope. Thus, the complete carrier sequence was:
Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser Thr Lys
                5                   10
Ile Tyr Ser Tyr Phe Pro Ser Val
15                  20
SEQ ID NO:6,
and the complete immunogen had the following sequence:
Thr Val Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser
                5                   10
Arg Thr Lys Gln Gly Pro Ser Leu Val Asp Asp Ala Leu Ile
15                  20                  25
Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
    30                  35                  40
SEQ ID NO:7.
This immunogen [SEQ ID NO:7] was synthesized at the Salk Institute (under Contract N01-HD-0-2906 with the NIH) and made available by the Contraceptive Development Branch, Center for Population Research, NICHD (Bethesda, MD).
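The modular layout of the immunogen (B-cell epitope segment, four-residue linker, T-cell epitope) can be written out as a simple concatenation. The sketch below uses the residues as listed for SEQ ID NO:5 through SEQ ID NO:7; note that the B-cell segment in SEQ ID NO:7 begins two residues (Thr Val) upstream of SEQ ID NO:2. The helper name and variable names are hypothetical.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


# B-cell segment as it appears at the start of SEQ ID NO:7.
B_CELL = "Thr Val Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys Gln"
LINKER = "Gly Pro Ser Leu"
# Tetanus toxoid T-cell epitope, SEQ ID NO:5.
T_CELL = "Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val"

carrier = to_one_letter(LINKER + " " + T_CELL)        # corresponds to SEQ ID NO:6
immunogen = to_one_letter(B_CELL) + carrier           # corresponds to SEQ ID NO:7

print(carrier, len(carrier))
print(immunogen, len(immunogen))
```

The same three-part pattern (epitope, Gly Pro Ser Leu, SEQ ID NO:5) recurs in the other immunogens described in the Examples, so only the first argument would change.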
Female New Zealand White rabbits were immunized with the immunogen [SEQ ID NO:7] as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995). The rabbit antiserum was affinity purified by epitope selection as described in Snyder et al., Methods Enzymol., 154, 107-128 (1987).
The affinity-purified antiserum was used to probe a Western blot of human tissue extracts. The tissue extracts were made and the Western blots were performed as described in Diekman and Goldberg, Biol. Reprod., 50, 1087-1093 (1994). As shown in Figure 3, the antiserum recognized a single protein of approximately 65 Kd in human testis extracts (lane 1) and a slightly larger protein of approximately 68 Kd in human sperm extracts (lane 2). There was no reactivity with human liver extracts (lane 3), although liver is known to be rich in the somatic isoforms of calpastatin.
The affinity-purified antiserum was also used to localize testis-specific calpastatin on human sperm by immunofluorescence, performed as described in Wright et al., Biol. Reprod., 42, 693-701 (1990). Diffuse, granular fluorescence was observed throughout the acrosome, and intense fluorescence was observed in the equatorial segment of the sperm.
EXAMPLE 5: Immunization With Immunogen Containing B-Cell Epitope Of Testis-Specific Calpastatin
Female cynomolgus macaques (three per group) were immunized with either 100 μg or 300 μg of the peptide immunogen [SEQ ID NO:7] prepared in Example 4. The immunogen was administered intramuscularly in Squalene-Arlacel A containing the synthetic muramyl dipeptide N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP11637; Ciba-Geigy Pharmaceuticals, Basel, Switzerland). A single booster injection consisting of the same dose in the same delivery system was administered intramuscularly ten days after the initial injection.
ELISA titers were determined on microtiter plates coated with the testis-specific calpastatin B-cell epitope peptide (SEQ ID NO:2; see Example 3) conjugated to bovine serum albumin (BSA). The B-cell epitope peptide was synthesized with a non-natural cysteine at the amino terminus and conjugated to BSA as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995). The ELISA was performed as described in Lairmore et al., J. Virol., 69, 6077-6089 (1995). The microtiter plate was coated with peptide-conjugated BSA or BSA alone. After standard washing and blocking procedures, goat anti-human IgG conjugated to horseradish peroxidase was added to detect bound antibody. The results were recorded as absorbance of duplicate wells minus background absorbance. The results are shown in Figure 4 where open symbols denote the low dose group (100 μg), closed symbols denote the high dose group (300 μg), and the arrows show the time of the booster injections.
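The way the ELISA readings are reported ("absorbance of duplicate wells minus background absorbance") amounts to a small calculation per serum sample. The sketch below assumes that the duplicate wells are averaged and that background is the averaged reading from wells coated with BSA alone; the text does not spell out either detail, and all absorbance values shown are hypothetical.

```python
from statistics import mean


def corrected_absorbance(peptide_bsa_wells, bsa_only_wells):
    """Mean absorbance of the duplicate peptide-BSA wells minus mean background (BSA-only) absorbance."""
    return mean(peptide_bsa_wells) - mean(bsa_only_wells)


# Hypothetical readings for one serum sample at one time point.
peptide_bsa = (0.80, 0.84)   # duplicate wells coated with peptide-conjugated BSA
bsa_only = (0.10, 0.12)      # wells coated with BSA alone (assumed background)

signal = corrected_absorbance(peptide_bsa, bsa_only)
print(f"background-corrected absorbance: {signal:.2f}")
```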
EXAMPLE 6: Characterization Of Clone C-2
The cDNA insert of clone C-2 was used to probe a Northern blot of human poly A+ RNA from eight different human tissues as described above in Example 2. A single mRNA of 2.1 kb was detected in testis only.
The sequence of the cDNA insert of clone C-2 was determined as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994). The DNA sequence of the insert and the deduced corresponding amino acid sequence are set forth in Chart B below.
Homology searches of the GenEMBL databases found that the sequence of the cDNA insert of clone C-2 was not represented. Thus, clone C-2 cDNA encodes a unique and previously undescribed protein. As noted above, the mRNA is approximately 2.1 kb. It has an open reading frame (ORF) of 1.4 kb translating to a peptide of 65-70 Kd. There are no significant sequence motifs or unusual properties.
The original antiserum that detected clone C-2 (number 629) is 100% effective in blocking fertilization in vitro of human ova by human sperm (see table below). Serum 629 which has been absorbed with sperm no longer blocks binding of sperm to zona (see table below). These assays were performed by Gary Clarke, The Royal Women's Hospital, Melbourne, Australia, using procedures described in Clarke et al., Arch. Androl., 35, 21-27 (1995).
Serum Treatment               Number Ova Fertilized    Number Sperm Bound To Zona
Normal Serum                  5/6                      62
629                           0/10                     1.5
629 Preabsorbed With Sperm    ND                       67
The peptide coded for by a 900 bp fragment from the 3' end of the C-2 cDNA was expressed as a glutathione-S-transferase (GST) fusion protein using cloning methods well known in the art. See, e.g., Smith and Johnson, Gene, 67, 31-40 (1988); Johnson et al., Nature, 338, 585-587 (1989); Kemp et al., Gene, 94, 223-28 (1990); Kaelin Jr. et al., Cell, 64, 521-532 (1991); Chittenden Jr. et al., Cell, 65, 1073-1082 (1991); Kaelin Jr. et al., Cell, 70, 351-364 (1992). The clone encoding this fusion protein was designated clone GST-C2.
Western blots (performed as described above in Example 4) showed that the fusion protein was recognized by the 629 serum. It was not recognized by the 629 serum which had been absorbed with human sperm. Furthermore, the sera from four other infertile patients recognized this fusion protein on Western blots. One of these sera inhibited sperm-zona binding.
EXAMPLE 7: Identification Of B-Cell Epitope Of Clone C-2 Protein
Unidirectional nested deletions were prepared from the 3' end of clone GST-C2 (see Figure 5, upper portion) using the protocol and reagents provided in the Stratagene instruction manual (pBluescript II exo/mung DNA sequencing system). Each time point was religated, and the truncated GST-C2 fusion proteins were expressed and assayed by PAGE as described in the previous example. The lower half of Figure 5 shows the Coomassie blue-stained PAGE gel (lanes 1 and 7 - GST, lane 2 - full-length GST-C2 fusion protein, lanes 3-6 and 8-11 - truncated GST-C2 fusion proteins).
Each of the truncated GST-C2 fusion proteins was partially purified and used as the target for Western blots (all as described in Example 6) probed with the original patient 629 serum. The results are shown in Figure 6. The full-length fusion protein and the first 4 deletions were strongly positive for the antibody. Time points 5-10 were negative, as was GST alone. Therefore, the C2 epitope recognized by the original human serum resides within time point 4.
Each of the 10 nested deletions was sequenced using an oligo primer specific for the pGEX vector (see Pharmacia Biotech GST Gene Fusion Manual). The results are shown in Figure 7. The first 3 time points showed deletion of the 3' untranslated region (UTR). Time point 4, from which the 9 carboxy terminal amino acids were deleted, was still antibody positive. Time point 5, with deletion of an additional 26 amino acids, was antibody negative. Therefore, the relevant B-cell epitope (cross-hatched box) resides within the region of amino acids 426-454. The sequence of amino acids 426-454 is as follows:
Thr Asn Ile Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg
                5                   10
Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu Asp Pro Thr Phe
15                  20                  25
Glu
SEQ ID NO:8
Computer-assisted sequence analysis was performed as described in O'Hern and Goldberg, in Techniques In Protein Chemistry IV, pages 481-490 (1993) to calculate the surface accessibility of amino acids 426-454. Residues 430-443 were determined to be highly surface accessible and likely to represent the B-cell epitope. This epitope has the following sequence:
Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg Pro Glu Pro
                5                   10
Lys
15
SEQ ID NO:9.
EXAMPLE 8: Preparation of C-2 Immunogen
An immunogen comprising the B-cell epitopes identified in Example 7 was prepared as described in Example 4. The sequence of this immunogen is:
Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg Pro Glu Pro
                5                   10
Lys Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser Thr
15                  20                  25
Lys Ile Tyr Ser Tyr Phe Pro Ser Val
    30                  35
SEQ ID NO:10.
EXAMPLE 9: Characterization Of Clone L-7
The cDNA insert of clone L-7 was used to probe a Northern blot of human poly A+ RNA from eight different human tissues as described above in Example 2. A single mRNA of 2.5 kb was detected in testis only.
The sequence of the cDNA insert of clone L-7 was determined as described in Liang et al., Reprod. Fertil. Dev., 6, 297-305 (1994). The DNA sequence of the insert and the corresponding amino acid sequence are set forth in Chart C below.
Homology searches of the GenEMBL databases found that the sequence of the cDNA insert of clone L-7 was not represented. Thus, clone L-7 cDNA encodes a unique and previously undescribed protein. This protein is relatively large (66 kD) and consists of several domains of as yet unknown functional significance. The protein contains an endoplasmic reticulum signal sequence and appears to be anchored in the sperm plasma membrane at its amino terminus, but with surface accessible epitopes.
A computer-generated plot (Figure 8) of the occurrence of the amino acid valine along the length of the polypeptide chain revealed a distinct domain structure for the protein. This plot was generated using PC/Gene software from Intelligenetics, Inc., 700 E. El Camino Rd., Mountain View, CA 94047. This computer analysis revealed the following features. Residues 88-328 contain very little valine and 9 potential protein kinase C (PKC) phosphorylation sites (P). Residues 329 to 493 contain many valines and no PKC phosphorylation sites. Residues 329-493 also contain 11 repeats of a 15 amino acid motif (see below). The consensus sequence of the motif is KgqEaQVKKsesgVp [SEQ ID NO:16].
329- KRTGVQVKKSESGVP SEQ ID NO:17
344- KGQEAQVTKSGLVVL SEQ ID NO:18
359- KGQEAQVEKSEMGVP SEQ ID NO:19
374- RRQESQVKKSQSGVS SEQ ID NO:20
389- KGQEAQVKKRESVVL SEQ ID NO:21
404- KGQEAQVEKSELKVP SEQ ID NO:22
419- KGQEGQVEKTEAECP SEQ ID NO:23
434- KEQEVQEKKSEAGVL SEQ ID NO:24
449- KGPEFQVKNTEVSVP SEQ ID NO:25
464- ETLESQVKKSESGVL SEQ ID NO:26
479- KGQEAQEKKESFEDK SEQ ID NO:27
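The consensus of the 11 repeats listed above can be recomputed by taking the most frequent residue at each of the 15 positions. The sketch below is not the PC/Gene analysis referred to in the text. It assumes that the entries starting at residues 344 and 389 end in VVL, giving all repeats 15 residues (for 389 this agrees with SEQ ID NO:11), and it ignores the upper/lower case distinction used in the stated consensus.

```python
from collections import Counter

# Start residue -> 15-residue repeat, as listed for SEQ ID NOS:17-27.
REPEATS = {
    329: "KRTGVQVKKSESGVP",
    344: "KGQEAQVTKSGLVVL",   # assumed to end in VVL (15 residues)
    359: "KGQEAQVEKSEMGVP",
    374: "RRQESQVKKSQSGVS",
    389: "KGQEAQVKKRESVVL",   # ends in VVL, consistent with SEQ ID NO:11
    404: "KGQEAQVEKSELKVP",
    419: "KGQEGQVEKTEAECP",
    434: "KEQEVQEKKSEAGVL",
    449: "KGPEFQVKNTEVSVP",
    464: "ETLESQVKKSESGVL",
    479: "KGQEAQEKKESFEDK",
}


def consensus(seqs):
    """Most common residue in each aligned column."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


motif = consensus(REPEATS.values())
starts = sorted(REPEATS)
last_residue = starts[-1] + len(REPEATS[starts[-1]]) - 1

print(motif)          # compare, ignoring case, with KgqEaQVKKsesgVp [SEQ ID NO:16]
print(last_residue)   # last residue covered by the repeats
```

Under these assumptions the repeats tile residues 329 through 493 without gaps, matching the boundaries given for this region.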
Residues 494-568 contain few valines and 3 potential PKC phosphorylation sites. From the computer analysis and the protein's sequence, the following domain organization of the L-7 protein is proposed:
Domain I (residues 1-90) contains a consensus endoplasmic reticulum localization signal (p>0.85) (see von Heijne, J. Memb. Biol., 115, 195-201 (1990));
Domain II (residues 91-328) has a high isoelectric point and contains the 9 potential PKC phosphorylation sites;
Domain III (residues 329 to 493) has a neutral pI and contains the 11 repeat motifs; and
Domain IV (residues 494 to 568) again has a high isoelectric point and contains 2 bipartite nuclear translocation signals (see Robbins et al., Cell, 64, 615-623 (1991)).
This structure is unique in the databases.
EXAMPLE 10: Identification Of B-Cell Epitope Of Clone L-7 Protein
A 900 bp fragment from the 3' end of the cDNA of clone L-7 was expressed and purified as a GST fusion protein as described in Example 6 above. This clone was designated GST-L7. Sera from three infertile patients (numbers 44, 65 and 66) recognized the fusion protein on Western blots (performed as described in Example 6).
Nested deletions of the 900 bp fragment were prepared, and the truncated fusion proteins were expressed and purified, all as described in Example 7. Western blots were probed with serum from patient 44. The results are shown in Figure 9. Signal intensity decreased markedly between time points 2 and 3 (arrows) and disappeared between time points 8 and 9 (arrows), indicating the presence of two B-cell epitopes in this region of the L-7 protein.
The two epitopes identified by nested deletion analysis of clone L-7 are indicated by cross-hatched boxes in Figure 10. Epitope 1 is amino acids 500-517, and epitope 2 is amino acids 389-408. These epitopes have the following sequences:
Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val
                5                   10
Leu Lys Gly Gln Glu Ala
15                  20
SEQ ID NO:11
and
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
                5                   10
Gly Asp Lys Asn
15
SEQ ID NO:12.
EXAMPLE 11: Preparation of J.-7 Tmτnunogens
Immunogens comprising the two B-cell epitopes identified in Example 10 were prepared as described in Example 4. The sequences of these two immunogens are:
Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val
                5                   10
Leu Lys Gly Gln Glu Ala Gly Pro Ser Leu Val Asp Asp Ala
15                  20                  25
Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
    30                  35                  40
SEQ ID NO:13, and
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
                5                   10
Gly Asp Lys Asn Gly Pro Ser Leu Val Asp Asp Ala Leu Ile
15                  20                  25
Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
    30                  35                  40
SEQ ID NO:14.
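Both immunogens end in the same 22 residues, which match SEQ ID NO:6 of the Sequence Listing, so each can be viewed as a B-cell epitope followed by that shared segment. The sketch below illustrates this composition in one-letter code; the variable and function names are hypothetical, and the code only concatenates strings and compares lengths with the 42 and 40 residues recorded for SEQ ID NO:13 and SEQ ID NO:14.

```python
# One-letter forms of the peptides given in the text.
EPITOPE_SEQ_ID_11 = "KGQEAQVKKRESVVLKGQEA"      # 20 residues
EPITOPE_SEQ_ID_12 = "KERDAEKDPNKKEKGDKN"        # 18 residues
SHARED_SEGMENT_SEQ_ID_6 = "GPSLVDDALINSTKIYSYFPSV"  # 22 residues


def build_immunogen(b_cell_epitope: str,
                    shared_segment: str = SHARED_SEGMENT_SEQ_ID_6) -> str:
    """Append the shared C-terminal segment to a B-cell epitope."""
    return b_cell_epitope + shared_segment


if __name__ == "__main__":
    immunogen_13 = build_immunogen(EPITOPE_SEQ_ID_11)
    immunogen_14 = build_immunogen(EPITOPE_SEQ_ID_12)
    print(immunogen_13, len(immunogen_13))
    print(immunogen_14, len(immunogen_14))
```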
EXAMPLE 12: Preparation Of Antiserum To L-7 Protein
One of the immunogens prepared in Example 11 [SEQ ID NO:14] was used to immunize rabbits as described in Example 4. The rabbit antiserum was affinity purified, and the affinity-purified rabbit antiserum was used to probe a Western blot of human tissue extracts, all as described in Example 4. The affinity-purified antiserum recognized a single protein of approximately 58 Kd in human testis extracts and a protein of approximately 68 Kd in human sperm extracts. There was no reactivity with human liver extracts.
EXAMPLE 13: Isolation Of Macaque cDNA Clones Corresponding To Human cDNA Clones And Identification Of B-Cell Epitopes
A macaque testis cDNA library (obtained from Dr. John Herr, University of Virginia) was screened with the human cDNAs as probes (see Examples 1 and 2), and B-cell epitopes were identified by comparison to the B-cell epitopes identified in Examples 7 and 10.
A B-cell epitope of macaque testis-specific calpastatin was identified and has the following sequence:
Asn Ala Glu Gly Gln Glu Lys Gln Phe Leu Ser Ser Arg Thr
                5                   10
Lys Gln
15
SEQ ID NO:29.
This B-cell epitope is 85% homologous to the B-cell epitope identified above for human testis-specific calpastatin [SEQ ID NO:2].
The B-cell epitope of the macaque protein corresponding to the human protein produced by clone C-2 has a sequence identical to that of the B-cell epitope of the C-2 protein [SEQ ID NO:8]. Thus, in this case, there was 100% homology between the sequences.
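One simple way to compare two epitopes of equal length is position-by-position identity. The sketch below applies it to the human (SEQ ID NO:2) and macaque (SEQ ID NO:29) calpastatin epitopes in one-letter code. It is an illustration under that assumption only: the text does not say how its homology figures were computed, so the value returned here need not equal the quoted percentage. The function name `percent_identity` is hypothetical.

```python
HUMAN_SEQ_ID_2 = "NAEEQEKQFVSSRTKQ"     # Asn Ala Glu Glu Gln Glu Lys Gln Phe Val ...
MACAQUE_SEQ_ID_29 = "NAEGQEKQFLSSRTKQ"  # Asn Ala Glu Gly Gln Glu Lys Gln Phe Leu ...


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions with identical residues (no gaps)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for ungapped comparison")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def mismatches(a: str, b: str) -> list:
    """Return (1-based position, residue in a, residue in b) for each difference."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


if __name__ == "__main__":
    print(percent_identity(HUMAN_SEQ_ID_2, MACAQUE_SEQ_ID_29))
    print(mismatches(HUMAN_SEQ_ID_2, MACAQUE_SEQ_ID_29))
```

Scoring conservative substitutions (for example Val versus Leu) as partial matches would give a different figure, which is one reason homology percentages depend on the method used.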
EXAMPLE 15: Preparation Of Immunogens Containing Testis-Specific B-Cell Epitopes
Peptides having the sequences of the B-cell epitopes identified in Examples 3, 7 and 10 can be synthesized and coupled to diphtheria toxin to produce immunogens that can be used to immunize mammals, all as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995).
EXAMPLE 16: Sequencing Of Clones Y-19, C-2 And L-7
DNA fragments of clones Y-19, C-2 and L-7 were subcloned into the pBluescriptII SK+ phagemid (Stratagene, Palo Alto, CA) and sequenced by a modification of the method of Kraft et al., Biotechniques, 6, 544-547 (1988), as described in O'Hern et al., Biol. Reprod., 52, 331-339 (1995). The DNA sequences and deduced amino acid sequences are presented in Charts A (Y-19), B (C-2) and C (L-7).
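The deduced amino acid sequences in the charts follow from the DNA sequences by codon-by-codon translation. The sketch below shows one way to reproduce that step, assuming the standard genetic code and an in-frame start; it is not the procedure used by the inventors, and the names `CODON_TABLE` and `translate` are hypothetical. The example input is the first 14 codons of the Chart A open reading frame.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string to one-letter protein code.

    Whitespace is ignored; translation stops at the first stop codon.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # First 14 codons of the open reading frame shown in Chart A.
    orf_start = "ATG GGC CAG TTT CTA TCT TCG ACT TTC TTG GAG GGC TCA CCG"
    print(translate(orf_start))
```

Running the same function over a full chart is a quick way to flag codons whose printed residue does not match the standard code, which can then be checked against the original sequencing data.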
CHART A
CTTGATATCG AATTCGGGGGG AGTCTCCCT GACTTCCAGC 40
AACAATCCTT GAGTCTGAGA CTGCCCTGGC CTAAG ATG GGC 81
Met Gly
CAG TTT CTA TCT TCG ACT TTC TTG GAG GGC TCA CCG 117 Gin Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser Pro 5 10
GCC ACA GTG TCG ACG ATA AGC TTT GTG ACG GTG AAC 153 Ala Thr Val Ser Thr lie Ser Phe Val Thr Val Asn 15 20 25 GCA GAG GAG CAA GAG AAG CAG TTC GTA TCT TCC AGG 189 Ala Glu Glu Gin Glu Lys Gin Phe Val Ser Ser Arg 30 35
ACC AAG CAA AAA GCT AAA GAA GAA AAA CTA GAG AAG 225 Thr Lys Gin Lys Ala Lys Glu Glu Lys Leu Glu Lys 40 45 50
TGT GGT GAG GAT GAT GAA ACA ATC CCA TCT GAG TAC 261 Cys Gly Glu Asp Asp Glu Thr lie Pro Ser Glu Tyr 55 60
AGA TTA AAA CCA GCC ACG GAT AAA GAT GGA AAA CCA 297 Arg Leu Lys Pro Ala Thr Asp Lys Asp Gly Lys Pro 65 70
CTA TTG CCA GAG CCT GAA GAA AAA CCC AAG CCT CGG 333 Leu Leu Pro Glu Pro Glu Glu Lys Pro Lys Pro Arg 75 80 85 AGT GAA TCA GAA CTC ATT GAT GAA CTT TCA GAA GAT 369 Ser Glu Ser Glu Leu lie Asp Glu Leu Ser Glu Asp 90 95
TTC GAC CTG TCT GAA TGT AAA GAG AAA CCA TCT AAG 405 Phe Asp Leu Ser Glu Cys Lys Glu Lys Pro Ser Lys 100 105 110
CCA ACT GAA AAG ACA GAA GAA TCT AAG GCC GCT GCT 441 Pro Thr Glu Lys Thr Glu Glu Ser Lys Ala Ala Ala 115 120
CCA GCT CCT GTG TCG GAG GCT GTG TCT CGG ACC TCC 477 Pro Ala Pro Val Ser Glu Ala Val Ser Arg Thr Ser 125 130
ATG TGT AGT ATA CAG TCA GCA CCC CCT GAG CCG GCT 513 Met Cys Ser lie Gin Ser Ala Pro Pro Glu Pro Ala 135 140 145
ACC TTG AAG GTC ACA GTG CCA GAT GAT GCT GTA GAA 549 Thr Leu Lys Val Thr Val Pro Asp Asp Ala Val Glu 150 155 GCC TTG GCT GAT AGC CTG GGG AAA AAG GAA GCA GAT 585 Ala Leu Ala Asp Ser Leu Gly Lys Lys Glu Ala Asp 160 165 170
CCA GAA GAT GGA AAA CCT GTG ATG GAT AAA GCT AAG 621 Pro Glu Asp Gly Lys Pro Val Met Asp Lys Val Lys
175 180
GAG AAG GCC AAA GAA GAA GAC CGT GAA AAG CTT GGT 657 Glu Lys Ala Lys Glu Glu Asp Arg Glu Lys Leu Gly 185 190
GAA AAA GAA GAA ACA ATT CCT CCT GAT TAT ATA TTA 693 Glu Lys Glu Glu Thr lie Pro Pro Asp Tyr lie Leu 195 200 205
GAA GAG GTC AAG GAT AAA GAT GGA AAG CCA CTC CTG 729 Glu Glu Val Lys Asp Lys Asp Gly Lys Pro Leu Leu 210 215 CCA AAA GAG TCT AAG GAA CAG CTT CCA CCC ATG AGT 765 Pro Lys Glu Ser Lys Glu Gin Leu Pro Pro Met Ser 220 225 230
GAA GAC TTC CTT CTG GAT GCT TTG TCT GAG GAC TTC 801 Glu Asp Phe Leu Leu Asp Ala Leu Ser Glu Asp Phe
235 240
TCT GGT CCA CAA AAT GCT TCA TCT CTT AAA TTT GAA 837 Ser Gly Pro Gin Asn Ala Ser Ser Leu Lys Phe Glu 240 245
GAT GCT AAA CTT GCT GCT GCC ATC TCT GAA GTG GTT 873
Asp Ala Lys Leu Ala Ala Ala lie Ser Glu Val Val
250 255 260
TCC CAA ACC CCA GCT TCA ACG ACC CAA GCT GGA GCC 909
Ser Gin Thr Pro Ala Ser Thr Thr Gin Ala Gly Ala
265 270 CCA CCC CGT GAT ACC TCG AGT GAC AAA GAC CTC GAT 945 Pro Pro Arg Asp Thr Ser Ser Asp Lys Asp Leu Asp 275 280 285
GAT GCC TTG GAT AAA CTC TCT GAC AGT CTA GGA CAA 981 Asp Ala Leu Asp Lys Leu Ser Asp Ser Leu Gly Gin
290 300
AGG CAG CCT GAC CCA GAT GAG AAC AAA CCA ATG GAA 1017 Arg Gin Pro Asp Pro Asp Glu Asn Lys Pro Met Glu 305 310
GAT AAA GTA AAG GAA AAA GCT AAA GCT GAA CAT AGA 1053 Asp Lys Val Lys Glu Lys Ala Lys Ala Glu His Arg 315 320 325
GAC AAG CTT GGA GAG AGA GAT GAC ACT ATC CCA CCT 1089 Asp Lys Leu Gly Glu Arg Asp Asp Thr lie Pro Pro 330 335
GAA TAC AGA CAT CTC CTG GAT GAT AAT GGA CAG GAC 1125 Glu Tyr Arg His Leu Leu Asp Asp Asn Gly Gin Asp 340 345 350
AAA CCA GTG AAG CCA CCT ACA AAG AAA TCA GAG GAT 1161 Lys Pro Val Lys Pro Pro Thr Lys Lys Ser Glu Asp
355 360
TCA AAG AAA CCT GCA GAT GAC CAA GAC CCC ATT GAT 1197 Ser Lys Lys Pro Ala Asp Asp Gin Asp Pro lie Asp 365 370
GCT CTC TCA GGA GAT CTG GAC AGC TGT CCC TCC ACT 1233 Ala Leu Ser Gly Asp Leu Asp Ser Cys Pro Ser Thr 375 380 385
ACA GAA ACC TCA CAG AAC ACA GCA AAG GAT AAG TGC 1269 Thr Glu Thr Ser Gin Asn Thr Ala Lys Asp Lys Cys 390 395
AAG AAG GCT GCT TCC AGC TCC AAA GCA CCT AAG AAT 1305 Lys Lys Ala Ala Ser Ser Ser Lys Ala Pro Lys Asn 400 405 410
GGA GGT AAA GCG AAG GAT TCA GCA AAG ACA ACA GAG 1341 Gly Gly Lys Ala Lys Asp Ser Ala Lys Thr Thr Glu
415 420
GAA ACT TCC AAG CCA AAA GAT GAC TAA AGAAATACAAG 1377 Glu Thr Ser Lys Pro Lys Asp Asp 425 430
TTAAGGTATC TGGTATCTGC ATTTAAAATC TTCAGCTGGT 1417
GGATTGTGAC TTTTGAAGAA CAAAAGGCTT TGGCAACAGA 1457
AAACAATTGT TCTGGGTGAT TTCTAGAATG TTTTTTGTTG 1497
AGTCTCTGAA CATCCTAAAT ATTTGTTTGT TATTCTTTTC 1537
CAGAAAGAAA ATGAATTTGA CTGGTTCACC TGTGTACTGA 1577
GTATTGATAA ACTTCGAATT TTTTAAATTT CCTTCAAGGG 1617
AGAGAAAGCT TATATTGGTT TGTTATTCTT TTCCAGAAAG 1657
AAAATGAATT TGACTGGGTT CACTGTGTTA CTGAGTATTG 1697
ATAAACTTTG AATTTTTGCA ATTGCCTTCA ATTTTTAGAG 1737
GAAAAGCTTT ATATTTGTGT TATTACTTCT TCATCTTACA 1777
GTCATCACAG AACACACTGA GACTTGAATC AAGTCAGCAA 1817
CAGAGCAAAA TAAAGGTTAG ATAAGTCCTT GTGTAGCAAA 1857
TTTCGAGCAT AAGAAATAAA ATCTAATTAA TTCTTAGGGT 1897
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1927
SEQ ID NO:30
CHART B
AAAGCGTCAT TCGAGGTCCG GGTCCGGCTT GCGGGGTCAG 40 CGAACTGGAG AGGCGCC ATG GGC TGG ATC ACA 72
Met Gly Trp lie Thr
5
GAA GAT CTT ATT AGA CGG AAT GCT GAA CAC AAC GAC 108 Glu Asp Leu lie Arg Arg Asn Ala Glu His Asn Asp
10 15
TGT GTC ATT TTT TCC CTG GAG GAA CTC TCG TTG CAT 144 Cys Val lie Phe Ser Leu Glu Glu Leu Ser Leu His 20 25
CAG CAA GAA ATA GAA AGA CTA GAA CAC ATT GAT AAA 180 Gin Gin Glu lie Glu Arg Leu Glu His lie Asp Lys 30 35 40
TGG TGC CGG GAT TTA AAA ATT CTC TAT CTT CAA AAT 216 Trp Cys Arg Asp Leu Lys lie Leu Tyr Leu Gin Asn 45 50 AAT CTT ATT GGG AAA ATT GAA AAT GTT AGC AAA CTC 252 Asn Leu lie Gly Lys lie Glu Asn Val Ser Lys Leu 55 60 65
AAG AAA CTT GAA TAT TTG AAT TTA GCT TTA AAC AAC 288 Lys Lys Leu Glu Tyr Leu Asn Leu Ala Leu Asn Asn
70 75
ATT GAA AAA ATA GAA AAC TTG GAA GGA TGT GAA GAG 324 lie Glu Lys lie Glu Asn Leu Glu Gly Cys Glu Glu 80 85
CTG GCA AAA CTT GAC CTG ACT GTG AAT TTC ATT GGA 360
Leu Ala Lys Leu Asp Leu Thr Val Asn Phe lie Gly
90 95 100
GAG CTG AGC AGC ATT AAA AAC TTG CAG CAC AAT ATC 396
Glu Leu Ser Ser lie Lys Asn Leu Gin His Asn lie 105 110 CAT CTG AAG GAG CTC TTT CTC ATG GGG AAC CCA TGT 432 His Leu Lys Glu Leu Phe Leu Met Gly Asn Pro Cys 115 120 125
GCT TCC TTT GAC CAC TAT AGG GAG TTC GTG GTA GCA 468 Ala Ser Phe Asp His Tyr Arg Glu Phe Val Val Ala
130 135
ACT CTT CCA CAA TTA AAG TGG TTG GAT GGT AAA GAA 504 Thr Leu Pro Gin Leu Lys Trp Leu Asp Gly Lys Glu 140 145
ATA GAG CCT TCA GAA AGG ATT AAG GCA TTG CAG GAC 540 lie Glu Pro Ser Glu Arg lie Lys Ala Leu Gin Asp 150 155 160 TAT TCA GTA ATT GAA CCA CAA ATC AGA GAG CAG GAA 576 Tyr Ser Val lie Glu Pro Gin lie Arg Glu Gin Glu 165 170
AAA GAT CAC TGT CTT AAA CGA GCC AAA CTC AAG GAA 612 Lys Asp His Cys Leu Lys Arg Ala Lys Leu Lys Glu 175 180 185
GAG GCT CAG AGG AAA CAC CAA GAA GAG GAT AAA AAT 648 Glu Ala Gin Arg Lys His Gin Glu Glu Asp Lys Asn 190 195
GAA GAC AAG AGA AGT AAC GCA GGC TTT GAT GGA CGT 684
Glu Asp Lys Arg Ser Asn Ala Gly Phe Asp Gly Arg 200 205
TGG TAC ACA GAC ATC AAT GCT ACT CTT TCC TCT TTA 720
Trp Tyr Thr Asp lie Asn Ala Thr Leu Ser Ser Leu 210 215 220 GAG AGC AAA GAC CAC CTA CAG GCA CCA GAC ATA GAG 756 Glu Ser Lys Asp His Leu Gin Ala Pro Asp lie Glu 225 230
GAA CAC AAC ACA AAG AAA TTA GAC GAT GAC TTG GAA 792 Glu His Asn Thr Lys Lys Leu Asp Asp Asp Leu Glu 235 240 245
TTC TGG AAT AAG CCC TGT TTG TTT ACT CCT GAA TCA 828 Phe Trp Asn Lys Pro Cys Leu Phe Thr Pro Glu Ser 250 255
AGA TTG GAA ACT CTT AGA CAC ATG GAA AAA CAA CGG 864 Arg Leu Glu Thr Leu Arg His Met Glu Lys Gin Arg 260 265
AAG AAA CAG GAA AAA TTA AGT GAA AAA AAG AAG AAA 900 Lys Lys Gin Glu Lys Leu Ser Glu Lys Lys Lys Lys 270 275 280 GTG AAA CCA CCC AGG ACT TTG ATC ACT GAA GAT GGG 936 Val Lys Pro Pro Arg Thr Leu lie Thr Glu Asp Gly 285 290
AAA GCC CTA AAT GTG AAT GAG CCC AAA ATT GAC TTC 972 Lys Ala Leu Asn Val Asn Glu Pro Lys lie Asp Phe 295 300 305
TCT TTG AAA GAT AAC GAA AAG CAG ATC ATC CTG GAC 1008 Ser Leu Lys Asp Asn Glu Lys Gin lie lie Leu Asp 310 315
CTT GCT GTC TAT AGG TAT ATG GAT ACC TCT TTA ATC 1044 Leu Ala Val Tyr Arg Tyr Met Asp Thr Ser Leu lie 320 325 GAT GTT GAT GTG CAA CCA ACT TAC GTG CGA GTA ATG 1080 Asp Val Asp Val Gin Pro Thr Tyr Val Arg Val Met 330 335 340
ATC AAA GGA AAG CCA TTT CAG CTT GTC CTT CCT GCA 1116 lie Lys Gly Lys Pro Phe Gin Leu Val Leu Pro Ala
345 350
GAA GTG AAA CCC GAT AGT AGT TCT GCT AAA AGA TCT 1152 Glu Val Lys Pro Asp Ser Ser Ser Ala Lys Arg Ser 355 360 365
CAG ACA ACG GGT CAT TTG GTC ATC TGC ATG CCC AAG 1188 Gin Thr Thr Gly His Leu Val lie Cys Met Pro Lys
370 375
GTA GGA GAA GTA ATC ACA GGT GGT CAG CGA GCA TTC 1224 Val Gly Glu Val lie Thr Gly Gly Gin Arg Ala Phe 380 385 AAA TCT ATG AAA ACT ACC TCG GAC AGG AGC AGA GAA 1260 Lys Ser Met Lys Thr Thr Ser Asp Arg Ser Arg Glu 390 395 400
CAA ACA AAT ACA AGA AGC AAG CAC ATG GAG AAA CTA 1296 Gin Thr Asn Thr Arg Ser Lys His Met Glu Lys Leu
405 410
GAA GTA GAC CCT AGC AAG CAC TCA TTC CCT GAT GTG 1332 Glu Val Asp Pro Ser Lys His Ser Phe Pro Asp Val 415 420 425
ACT AAC ATA GTT CAA GAG AAA AAA CAC ACA CCC AGA 1368
Thr Asn lie Val Gin Glu Lys Lys His Thr Pro Arg
430 435
AGA CGA CCT GAA CCC AAA ATT ATA CCA AGT GAG GAA 1404
Arg Arg Pro Glu Pro Lys lie lie Pro Ser Glu Glu 440 445 GAC CCA ACC TTT GAA GAC AAC CCT GAA GTG CCT CCG 1440 Asp Pro Thr Phe Glu Asp Asn Pro Glu Val Pro Pro 450 455 460
CTG ATT TGA 1446 Leu lie
SEQ ID NO:31
CHART C
AGCTGGGAGC GCAGAGGCTC ACGCCTGTAA TCCATCATTT 40
GCTTAGGTCT GATCAATCTG CTCCACACAA TTTCTCAGTG 80
ATCCTCTGCA TCTCTGCCTA CAAGGGCCTC CCTGACACCC 120
AAGTTCATAT TGCTCAGAAA CAGTGAACTT GAGTTTTTCG 160
TTTTACCTTG ATCTCTCTCT GACAAAGAAA TCCAGATGAT 200
GCAACACCTG ATGAAGACAA TACATGGAAA 230 ATG ACA GTC TTG GAA ATA ACT TTG 254
Met Thr Val Leu Glu He Thr Leu
5
GCT GTC ATC CTG ACT CTA CTG GGA CTT GCC ATC CTG 290 Ala Val He Leu Thr Leu Leu Gly Leu Ala He Leu 10 15 20
GCT ATT TTG TTA ACA AGA TGG GCA CGA CGT AAG CAA 326 Ala He Leu Leu Thr Arg Trp Ala Arg Arg Lys Gin 25 30
AGT GAA ATG TAT ATC TCC AGA TAC AGT TCA GAA CAA 362
Ser Glu Met Tyr He Ser Arg Tyr Ser Ser Glu Gin 35 40
AGT GCT AGA CTT CTG GAC TAT GAG GAT GGT AGA GGA 398
Ser Ala Arg Leu Leu Asp Tyr Glu Asp Gly Arg Gly 45 50 55 TCC CGA CAT GCA TAT CAA CAC AAA GTG ACA CTT CAT 434 Ser Arg His Ala Tyr Gin His Lys Val Thr Leu His 60 65
ATG ATA ACC GAG AGA GAT CCA AAA AGA GAT TAC ACA 470 Met He Thr Glu Arg Asp Pro Lys Arg Asp Tyr Thr 70 75 80
CCA TCA ACC AAC TCT CTA GCA CTG TCT CGA TCA AGT 506 Pro Ser Thr Asn Ser Leu Ala Leu Ser Arg Ser Ser 85 90
ATT GCT TTA CCT CAA GGA TCC ATG AGT AGT ATA AAA 542 He Ala Leu Pro Gin Gly Ser Met Ser Ser He Lys 95 100
TGT TTA CAA ACA ACT GAA GAA CCT CCT TCC AGA ACT 578 Cys Leu Gin Thr Thr Glu Glu Pro Pro Ser Arg Thr 105 110 115
GCA GGA GCC ATG ATG CAA TTC ACA GCC CTA TTC CCG 614 Ala Gly Ala Met Met Gin Phe Thr Ala Leu Phe Pro 120 125 GAG CTA CAG GAC CTA TCA AGC TCT CTC AAA AAA CCA 650 Glu Leu Gin Asp Leu Ser Ser Ser Leu Lys Lys Pro 130 135 140
TTG TGC AAA CTC CAG GAC CTA TTG TAC AAT ATC TGG 686 Leu Cys Lys Leu Gin Asp Leu Leu Tyr Asn He Trp
145 150
ATC CAA TGT CAG ATC GCA TCT CAC ACA ATC ACT GGT 722 He Gin Cys Gin He Ala Ser His Thr He Thr Gly 155 160
CAC CTT CAG CAC CCG CGG TCA CCC ATG GCA CCC ATA 758
His Leu Gin His Pro Arg Ser Pro Met Ala Pro He
165 170 175
ATA ATT TCA CAG AGA ACC GCA AGT CAG CTG GCA GCA 794
He He Ser Gly Arg Thr Ala Ser Gin Leu Ala Ala 180 185 CCT ATA AGA ATA CCT CAA GTT CAC ACT ATG GAC AGT 830 Pro He Arg He Pro Gin Val His Thr Met Asp Ser 190 195 200
TCT GGA AAA ATC ACA CTG ACT CCT GTG GTT ATA TTA 866 Ser Gly Lys He Thr Leu Thr Pro Val Val He Leu
205 210
ACA GGT TAC ATG GAC GAA GAA CTT CGA AAA AAA TCT 902 Thr Gly Tyr Met Asp Glu Glu Leu Arg Lys Lys Ser 215 220
TGT TCC AAA ATC CAG ATT CTA AAA TGT GGA GGC ACT 938
Cys Ser Lys He Gin He Leu Lys Cys Gly Gly Thr 225 230 235
GCA AGG TCT CAG ATA GCC GAG AAG AAA ACA AGG AAG 974
Ala Arg Ser Gin He Ala Glu Lys Lys Thr Arg Lys
240 245 CAA CTA AAG AAT GAC ATC ATA TTT ACG AAT TCT GTA 1010 Gin Leu Lys Asn Asp He He Phe Thr Asn Ser Val 250 255 260
GAA TCC TTG AAA TCA GCA CAC ATA AAG GAG CCA GAA 1046 Glu Ser Leu Lys Ser Ala His He Lys Glu Pro Glu
265 270
AGA GAA GGA AAA GGC ACT GAT TTA GAG AAA GAC AAA 1082 Arg Glu Gly Lys Gly Thr Asp Leu Glu Lys Asp Lys 275 280
ATA GGA ATG GAG GTC AAG GTA GAC AGT GAC GCT GGA 1118 He Gly Met Glu Val Lys Val Asp Ser Asp Ala Gly 285 290 295 ATA CCA AAA AGA CAG GAA ACC CAA CTA AAA ATC AGT 1154 He Pro Lys Arg Gin Glu Thr Gin Leu Lys He Ser 300 305
GAA GAT GAG TAT ACC ACA AGG ACA GGG AGC CCA AAT 1190 Glu Asp Glu Tyr Thr Thr Arg Thr Gly Ser Pro Gin 310 315 320
AAA GAA AAG TGT GTC AGA TGT ACC AAG AGG ACA GGA 1226 Lys Glu Lys Cys Val Arg Cys Thr Lys Arg Thr Gly 325 330
GTC CAA GTA AAG AAG AGT GAG TCA GGT GTC CCA AAA 1262 Val Gin Val Lys Lys Ser Glu Ser Gly Val Pro Lys 335 340
GGA CAA GAA GCC CAA GTA ACG AAG AGT GGG TTG GTT 1298 Gly Gin Glu Ala Gin Val Thr Lys Ser Gly Leu Val 345 350 355 GTA CTG AAA GGA CAG GAA GCC CAG GTA GAG AAG AGT 1334 Val Leu Lys Gly Gin Glu Ala Gin Val Glu Lys Ser 360 365
GAG ATG GGT GTG CCA AGA AGA CAG GAA TCC CAA GTA 1370 Glu Met Gly Val Pro Arg Arg Gin Glu Ser Gin Val 370 375 380
AAG AAG AGT CAG TCT GGT GTC TCA AAG GGA CAG GAA 1406 Lys Lys Ser Gin Ser Gly Val Ser Lys Gly Gin Glu 385 390
GCC CAG GTA AAG AAG AGG GAG TCA GTT GTA CTG AAA 1442 Ala Gin Val Lys Lys Arg Glu Ser Val Val Leu Lys 395 400
GGA CAG GAA GCC CAG GTA GAG AAG AGT GAG TTG AAG 1478 Gly Gin Glu Ala Gin Val Glu Lys Ser Glu Leu Lys 405 410 415 GTA CCA AAA GGA CAA GAA GGC CAA GTA GAG AAG ACT 1514 Val Pro Lys Gly Gin Glu Gly Gin Val Glu Lys Thr 420 425
GAG GCA GAT GTG CCA AAG GAA CAA GAG GTC CAA GAA 1550 Glu Ala Asp Val Pro Lys Glu Gin Glu Val Gin Glu 430 435 440
AAG AAG AGT GAG GCA GGT GTA CTG AAA GGA CCA GAA 1586 Lys Lys Ser Glu Ala Gly Val Leu Lys Gly Pro Glu 445 450
TCC CAA GTA AAG AAC ACT GAG GTG AGT GTA CCA GAA 1622 Ser Gin Val Lys Asn Thr Glu Val Ser Val Pro Glu 455 460 ACA CTG GAA TCC CAA GTA AAG AAG AGT GAG TCA GGT 1658 Thr Leu Glu Ser Gin Val Lys Lys Ser Glu Ser Gly 465 470 475
GTA CTA AAA GGA CAG GAA GCC CAA GAA AAG AAG GAG 1694 Val Leu Lys Gly Gin Glu Ala Gin Glu Lys Lys Glu
480 485
AGT TTT GAG GAT AAA GGA AAT AAT GAT AAA GAA AAG 1730 Ser Phe Glu Asp Lys Gly Asn Asn Asp Lys Glu Lys 490 495 500
GAG AGA GAT GCA GAG AAA GAT CCA AAT AAA AAA GAA 1766 Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu
505 510
AAA GGT GAC AAA AAC ACA AAA GGT GAC AAA GGA AAG 1802 Lys Gly Asp Lys Asn Thr Lys Gly Asp Lys Gly Lys 515 520 GAC AAA GTT AAA GGA AAG AGA GAA TCA GAA ATC AAT 1838 Asp Lys Val Lys Gly Lys Arg Glu Ser Glu He Asn 525 530 535
GGT GAA AAA TCA AAA GGC TCG AAA AGG CGA AGG CAA 1874 Gly Glu Lys Ser Lys Gly Ser Lys Arg Arg Arg Gin
540 545
ATA CAG GAA GGA AGT ACA ACA AAA AAG TGG AAG AGT 1910 He Gin Glu Gly Ser Thr Thr Lys Lys Trp Lys Ser 550 555 560
AAG GAT AAA TTT TTT AAA GGC CCA TAA GACAAGTGAT 1946 Lys Asp Lys Phe Phe Lys Gly Pro
565
TATTATGATT CCCATACTCC AGATACAAAC CATATCCCAG 1986
CCATTGCCTA AACAGATTAC AATTATAAAA TCCCTTTCAT 2026 CTTCATATCA CAGTTTCTGC TCTTCAGAAG TTTCACCCTT 2066
TTTAATCTCT CAGCCACAAA CCTCAGTTCC AATATTGTTA 2106
TAAGTTAAGA CGTATATGAT TCCGTCAAGA AAGACTGGAT 2146
ACTTTCTGAA GTAAAACATT TTAATTAAAG AAAAAAAA 2184
SEQ ID NO:32
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Goldberg, Erwin
(i) APPLICANT: O'Hern, Patricia A.
(ii) TITLE OF INVENTION: Proteins And Peptides For Contraceptive Vaccines And Fertility Diagnostics
(iii) NUMBER OF SEQUENCES: 32
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Willian Brinks Hofer Gilson & Lione
(B) STREET: P.O. BOX 10395
(C) CITY: Chicago
(D) STATE: Illinois
(E) COUNTRY: USA
(F) ZIP: 60610
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 2 Mb storage
(B) COMPUTER: IBM XT compatible
(C) OPERATING SYSTEM: MS-DOS
(D) SOFTWARE: WordPerfect 5.1
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 11-JAN-1996
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Crook, Wannell M.
(B) REGISTRATION NUMBER: 31071
(C) REFERENCE/DOCKET NUMBER: 6793/9
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (312)321-4229
(B) TELEFAX: (312)321-4299
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Gly Gln Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser
                5                   10
Pro Ala Thr Val Ser Thr Ile Ser Phe Val Thr Val Asn
    15                  20                  25
Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys
            30                  35                  40
Gln
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr
                5                   10
Lys Gln
15
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Thr Val Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser
                5                   10
Arg Thr Lys Gln
15
(2) INFORMATION FOR SEQ ID NO:4: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:4:
Ser Phe Val Thr Val Asn Ala Glu Glu Gin Glu Lys Gin Phe
5 10 Val Ser Ser Arg Thr Lys Gin 15 20
(2) INFORMATION FOR SEQ ID NO:5: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 18 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:5:
Val Asp Asp Ala Leu He Asn Ser Thr Lys He Tyr Ser Tyr
5 10
Phe Pro Ser Val 15
(2) INFORMATION FOR SEQ ID NO:6: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:6:
Gly Pro Ser Leu Val Asp Asp Ala Leu He Asn Ser Thr Lys 5 10
He Tyr Ser Tyr Phe Pro Ser Val 15 20 (2) INFORMATION FOR SEQ ID NO:7: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: (D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Asn Ala Gly Glu Gin Glu Lys Gin Phe Leu Ser Ser Arg Thr
5 10
Lys Gin Gly Pro Ser Leu Val Asp Asp Ala Leu He Asn Ser 15 20 25
Thr Lys He Tyr Ser Tyr Phe Pro Ser Val 30 35
(2) INFORMATION FOR SEQ ID NO:8: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 29 amino acids (B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:8: Thr Asn He Val Gin Glu Lys Lys His Thr Pro Arg Arg Arg
5 10
Pro Glu Pro Lys He He Pro Ser Glu Glu Asp Pro Thr Phe 15
20 25
Glu
(2) INFORMATION FOR SEQ ID NO:9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:9:
Val Gin Glu Lys Lys His Thr Pro Arg Arg Arg
5 10
Pro Glu Pro Lys 15
(2) INFORMATION FOR SEQ ID NO:10: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 37 amino acids (B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:10: Val Gin Glu Lys Lys His Thr Pro Arg Arg Arg Pro Glu
5 10
Pro Lys Gly Pro Ser Leu Val Asp Asp Ala Leu He 15 20 25
Asn Ser Thr Lys He Tyr Ser Tyr Phe Pro Ser Val
30 35
(2) INFORMATION FOR SEQ ID NO:11: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:11:
Lys Gly Gin Glu Ala Gin Val Lys Lys Arg Glu Ser Val Val
5 10 Leu Lys Gly Gin Glu Ala 15 20
(2) INFORMATION FOR SEQ ID NO:12: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 18 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:12:
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
5 10
Gly Asp Lys Asn 15
(2) INFORMATION FOR SEQ ID NO:13: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:13:
Lys Gly Gin Glu Ala Gin Val Lys Lys Arg Glu Ser Val Val 5 10
Leu Lys Gly Gin Glu Ala Gly Pro Ser Leu Val Asp Asp Ala 15 20 25 Leu He Asn Ser Thr Lys He Tyr Ser Tyr Phe Pro Ser Val 30 35 40
(2) INFORMATION FOR SEQ ID NO:14: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 40 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:14:
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
5 10
Gly Asp Lys Asn Gly Pro Ser Leu Val Asp Asp Ala Leu He 15 20 25
Asn Ser Thr Lys He Tyr Ser Tyr Phe Pro Ser Val 30 35 40 (2) INFORMATION FOR SEQ ID NO:15: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: (D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:15:
Met Asn Pro Thr Glu Thr Lys Ala He Pro Val Ser Gin Gin
5 10
Met Glu Gly Pro His Leu Pro Asn Lys Lys Lys His Lys Lys 15 20 25
Gin Ala Val Lys Thr Glu Pro Glu Lys Lys Ser Gin Ser 30 35 40
(2) INFORMATION FOR SEQ ID NO:16: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:16:
Lys Gin Gin Glu Ala Gin Val Lys Lys Ser Glu Ser Gly Val
5 10
Pro 15
(2) INFORMATION FOR SEQ ID NO:17: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (Xi) SEQUENCE DESCRIPTION:SEQ ID NO:17:
Lys Arg Thr Gly Val Gin Val Lys Lys Ser Glu Ser Gly Val 5 10
Pro 15 (2) INFORMATION FOR SEQ ID NO:18: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: (D) TOPOLOGY:
(Xi) SEQUENCE DESCRIPTION:SEQ ID NO:18:
Lys Gly Gin Glu Ala Gin Val Thr Lys Ser Gly Leu Val Val
5 10
Leu 15
(2) INFORMATION FOR SEQ ID NO:19: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: (Xi) SEQUENCE DESCRIPTION:SEQ ID NO:19:
Lys Gly Gin Glu Ala Gin Val Glu Lys Ser Glu Met Gly Val
5 10 Pro 15
(2) INFORMATION FOR SEQ ID NO:20: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:20:
Arg Arg Gin Glu Ser Gin Val Lys Lys Ser Gin Ser Gly Val 5 10
Ser
15 (2) INFORMATION FOR SEQ ID NO:21: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: (D) TOPOLOGY:
(Xi) SEQUENCE DESCRIPTION:SEQ ID NO:21:
Lys Gly Gin Glu Ala Gin Val Lys Lys Arg Glu Ser Val Val
5 10
Leu 15
(2) INFORMATION FOR SEQ ID NO:22: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:22:
Lys Gly Gin Glu Ala Gin Val Glu Lys Ser Glu Leu Lys Val
5 10 Pro 14
(2) INFORMATION FOR SEQ ID NO:23: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(Xi) SEQUENCE DESCRIPTION:SEQ ID NO:23:
Lys Gly Gin Glu Gly Gin Val Glu Lys Thr Glu Ala Glu Cys
5 10
Pro 15
(2) INFORMATION FOR SEQ ID NO:24: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:24:
Lys Glu Gin Glu Val Gin Glu Lys Lys Ser Glu Ala Gly Val 5 10
Leu 15 (2) INFORMATION FOR SEQ ID NO:25: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: (D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:25:
Lys Gly Pro Glu Phe Gin Val Lys Asn Thr Glu Val Ser Val
5 10
Pro 15
(2) INFORMATION FOR SEQ ID NO:26: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: (Xi) SEQUENCE DESCRIPTION:SEQ ID NO:26:
Glu Thr Leu Glu Ser Gin Val Lys Lys Ser Glu Ser Gly Val
5 10 Leu 15
(2) INFORMATION FOR SEQ ID NO:27: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY:
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:27:
Lys Gly Gin Glu Ala Gin Glu Lys Lys Glu Ser Phe Glu Asp
5 10
Lys 15
(2) INFORMATION FOR SEQ ID NO:28: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:28:
AGGAGAACAA CAACC 15
(2) INFORMATION FOR SEQ ID NO:29: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid (C) STRANDEDNESS:
(D) TOPOLOGY: (xi) SEQUENCE DESCRIPTION:SEQ ID NO:29:
Asn Ala Glu Gly Gin Glu Lys Gin Phe Leu Ser Ser Arg Thr 5 10
Lys Gin 15 (2) INFORMATION FOR SEQ ID NO:30: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1927 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double (D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:30:
CTTGATATCG AATTCGGGGGG AGTCTCCCT GACTTCCAGC 40 AACAATCCTT GAGTCTGAGA CTGCCCTGGC CTAAG ATG GGC 81
Met Gly
CAG TTT CTA TCT TCG ACT TTC TTG GAG GGC TCA CCG 117 Gin Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser Pro 5 10
GCC ACA GTG TCG ACG ATA AGC TTT GTG ACG GTG AAC 153
Ala Thr Val Ser Thr He Ser Phe Val Thr Val Asn 15 20 25
GCA GAG GAG CAA GAG AAG CAG TTC GTA TCT TCC AGG 189
Ala Glu Glu Gin Glu Lys Gin Phe Val Ser Ser Arg 30 35 ACC AAG CAA AAA GCT AAA GAA GAA AAA CTA GAG AAG 225 Thr Lys Gin Lys Ala Lys Glu Glu Lys Leu Glu Lys 40 45 50
TGT GGT GAG GAT GAT GAA ACA ATC CCA TCT GAG TAC 261 Cys Gly Glu Asp Asp Glu Thr He Pro Ser Glu Tyr
55 60
AGA TTA AAA CCA GCC ACG GAT AAA GAT GGA AAA CCA 297 Arg Leu Lys Pro Ala Thr Asp Lys Asp Gly Lys Pro 65 70 CTA TTG CCA GAG CCT GAA GAA AAA CCC AAG CCT CGG 333 Leu Leu Pro Glu Pro Glu Glu Lys Pro Lys Pro Arg 75 80 85
AGT GAA TCA GAA CTC ATT GAT GAA CTT TCA GAA GAT 369 Ser Glu Ser Glu Leu He Asp Glu Leu Ser Glu Asp
90 95
TTC GAC CTG TCT GAA TGT AAA GAG AAA CCA TCT AAG 405 Phe Asp Leu Ser Glu Cys Lys Glu Lys Pro Ser Lys 100 105 110
CCA ACT GAA AAG ACA GAA GAA TCT AAG GCC GCT GCT 441 Pro Thr Glu Lys Thr Glu Glu Ser Lys Ala Ala Ala
115 120
CCA GCT CCT GTG TCG GAG GCT GTG TCT CGG ACC TCC 477 Pro Ala Pro Val Ser Glu Ala Val Ser Arg Thr Ser 125 130 ATG TGT AGT ATA CAG TCA GCA CCC CCT GAG CCG GCT 513 Met Cys Ser He Gin Ser Ala Pro Pro Glu Pro Ala 135 140 145
ACC TTG AAG GTC ACA GTG CCA GAT GAT GCT GTA GAA 549 Thr Leu Lys Val Thr Val Pro Asp Asp Ala Val Glu
150 155
GCC TTG GCT GAT AGC CTG GGG AAA AAG GAA GCA GAT 585 Ala Leu Ala Asp Ser Leu Gly Lys Lys Glu Ala Asp 160 165 170
CCA GAA GAT GGA AAA CCT GTG ATG GAT AAA GCT AAG 621
Pro Glu Asp Gly Lys Pro Val Met Asp Lys Val Lys
175 180
GAG AAG GCC AAA GAA GAA GAC CGT GAA AAG CTT GGT 657
Glu Lys Ala Lys Glu Glu Asp Arg Glu Lys Leu Gly 185 190 GAA AAA GAA GAA ACA ATT CCT CCT GAT TAT ATA TTA 693 Glu Lys Glu Glu Thr He Pro Pro Asp Tyr He Leu 195 200 205
GAA GAG GTC AAG GAT AAA GAT GGA AAG CCA CTC CTG 729 Glu Glu Val Lys Asp Lys Asp Gly Lys Pro Leu Leu
210 215
CCA AAA GAG TCT AAG GAA CAG CTT CCA CCC ATG AGT 765 Pro Lys Glu Ser Lys Glu Gin Leu Pro Pro Met Ser 220 225 230
GAA GAC TTC CTT CTG GAT GCT TTG TCT GAG GAC TTC 801 Glu Asp Phe Leu Leu Asp Ala Leu Ser Glu Asp Phe
235 240 TCT GGT CCA CAA AAT GCT TCA TCT CTT AAA TTT GAA 837 Ser Gly Pro Gin Asn Ala Ser Ser Leu Lys Phe Glu 240 245
GAT GCT AAA CTT GCT GCT GCC ATC TCT GAA GTG GTT 873 Asp Ala Lys Leu Ala Ala Ala He Ser Glu Val Val 250 255 260
TCC CAA ACC CCA GCT TCA ACG ACC CAA GCT GGA GCC 909 Ser Gin Thr Pro Ala Ser Thr Thr Gin Ala Gly Ala 265 270
CCA CCC CGT GAT ACC TCG AGT GAC AAA GAC CTC GAT 945
Pro Pro Arg Asp Thr Ser Ser Asp Lys Asp Leu Asp
275 280 285
GAT GCC TTG GAT AAA CTC TCT GAC AGT CTA GGA CAA 981
Asp Ala Leu Asp Lys Leu Ser Asp Ser Leu Gly Gin
290 300 AGG CAG CCT GAC CCA GAT GAG AAC AAA CCA ATG GAA 1017 Arg Gin Pro Asp Pro Asp Glu Asn Lys Pro Met Glu 305 310
GAT AAA GTA AAG GAA AAA GCT AAA GCT GAA CAT AGA 1053 Asp Lys Val Lys Glu Lys Ala Lys Ala Glu His Arg 315 320 325
GAC AAG CTT GGA GAG AGA GAT GAC ACT ATC CCA CCT 1089 Asp Lys Leu Gly Glu Arg Asp Asp Thr He Pro Pro 330 335
GAA TAC AGA CAT CTC CTG GAT GAT AAT GGA CAG GAC 1125 Glu Tyr Arg His Leu Leu Asp Asp Asn Gly Gin Asp 340 345 350
AAA CCA GTG AAG CCA CCT ACA AAG AAA TCA GAG GAT 1161 Lys Pro Val Lys Pro Pro Thr Lys Lys Ser Glu Asp
355 360 TCA AAG AAA CCT GCA GAT GAC CAA GAC CCC ATT GAT 1197 Ser Lys Lys Pro Ala Asp Asp Gin Asp Pro He Asp 365 370
GCT CTC TCA GGA GAT CTG GAC AGC TGT CCC TCC ACT 1233 Ala Leu Ser Gly Asp Leu Asp Ser Cys Pro Ser Thr 375 380 385
ACA GAA ACC TCA CAG AAC ACA GCA AAG GAT AAG TGC 1269 Thr Glu Thr Ser Gin Asn Thr Ala Lys Asp Lys Cys 390 395
AAG AAG GCT GCT TCC AGC TCC AAA GCA CCT AAG AAT 1305 Lys Lys Ala Ala Ser Ser Ser Lys Ala Pro Lys Asn 400 405 410
GGA GGT AAA GCG AAG GAT TCA GCA AAG ACA ACA GAG 1341 Gly Gly Lys Ala Lys Asp Ser Ala Lys Thr Thr Glu
415 420
GAA ACT TCC AAG CCA AAA GAT GAC TAA AGAAATACAAG 1377 Glu Thr Ser Lys Pro Lys Asp Asp 425 430
TTAAGGTATC TGGTATCTGC ATTTAAAATC TTCAGCTGGT 1417
GGATTGTGAC TTTTGAAGAA CAAAAGGCTT TGGCAACAGA 1457
AAACAATTGT TCTGGGTGAT TTCTAGAATG TTTTTTGTTG 1497
AGTCTCTGAA CATCCTAAAT ATTTGTTTGT TATTCTTTTC 1537
CAGAAAGAAA ATGAATTTGA CTGGTTCACC TGTGTACTGA 1577
GTATTGATAA ACTTCGAATT TTTTAAATTT CCTTCAAGGG 1617
AGAGAAAGCT TATATTGGTT TGTTATTCTT TTCCAGAAAG 1657
AAAATGAATT TGACTGGGTT CACTGTGTTA CTGAGTATTG 1697
ATAAACTTTG AATTTTTGCA ATTGCCTTCA ATTTTTAGAG 1737
GAAAAGCTTT ATATTTGTGT TATTACTTCT TCATCTTACA 1777
GTCATCACAG AACACACTGA GACTTGAATC AAGTCAGCAA 1817
CAGAGCAAAA TAAAGGTTAG ATAAGTCCTT GTGTAGCAAA 1857
TTTCGAGCAT AAGAAATAAA ATCTAATTAA TTCTTAGGGT 1897
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1927
(2) INFORMATION FOR SEQ ID NO:31: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1446 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:31:
AAAGCGTCAT TCGAGGTCCG GGTCCGGCTT GCGGGGTCAG 40
CGAACTGGAG AGGCGCC ATG GGC TGG ATC ACA 72
Met Gly Trp Ile Thr
5
GAA GAT CTT ATT AGA CGG AAT GCT GAA CAC AAC GAC 108
Glu Asp Leu Ile Arg Arg Asn Ala Glu His Asn Asp
10 15
TGT GTC ATT TTT TCC CTG GAG GAA CTC TCG TTG CAT 144
Cys Val Ile Phe Ser Leu Glu Glu Leu Ser Leu His
20 25
CAG CAA GAA ATA GAA AGA CTA GAA CAC ATT GAT AAA 180
Gln Gln Glu Ile Glu Arg Leu Glu His Ile Asp Lys
30 35 40
TGG TGC CGG GAT TTA AAA ATT CTC TAT CTT CAA AAT 216
Trp Cys Arg Asp Leu Lys Ile Leu Tyr Leu Gln Asn
45 50
AAT CTT ATT GGG AAA ATT GAA AAT GTT AGC AAA CTC 252
Asn Leu Ile Gly Lys Ile Glu Asn Val Ser Lys Leu
55 60 65
AAG AAA CTT GAA TAT TTG AAT TTA GCT TTA AAC AAC 288
Lys Lys Leu Glu Tyr Leu Asn Leu Ala Leu Asn Asn
70 75
ATT GAA AAA ATA GAA AAC TTG GAA GGA TGT GAA GAG 324
Ile Glu Lys Ile Glu Asn Leu Glu Gly Cys Glu Glu
80 85
CTG GCA AAA CTT GAC CTG ACT GTG AAT TTC ATT GGA 360
Leu Ala Lys Leu Asp Leu Thr Val Asn Phe Ile Gly
90 95 100
GAG CTG AGC AGC ATT AAA AAC TTG CAG CAC AAT ATC 396
Glu Leu Ser Ser Ile Lys Asn Leu Gln His Asn Ile
105 110
CAT CTG AAG GAG CTC TTT CTC ATG GGG AAC CCA TGT 432
His Leu Lys Glu Leu Phe Leu Met Gly Asn Pro Cys
115 120 125
GCT TCC TTT GAC CAC TAT AGG GAG TTC GTG GTA GCA 468
Ala Ser Phe Asp His Tyr Arg Glu Phe Val Val Ala
130 135
ACT CTT CCA CAA TTA AAG TGG TTG GAT GGT AAA GAA 504
Thr Leu Pro Gln Leu Lys Trp Leu Asp Gly Lys Glu
140 145
ATA GAG CCT TCA GAA AGG ATT AAG GCA TTG CAG GAC 540
Ile Glu Pro Ser Glu Arg Ile Lys Ala Leu Gln Asp
150 155 160
TAT TCA GTA ATT GAA CCA CAA ATC AGA GAG CAG GAA 576
Tyr Ser Val Ile Glu Pro Gln Ile Arg Glu Gln Glu
165 170
AAA GAT CAC TGT CTT AAA CGA GCC AAA CTC AAG GAA 612
Lys Asp His Cys Leu Lys Arg Ala Lys Leu Lys Glu
175 180 185
GAG GCT CAG AGG AAA CAC CAA GAA GAG GAT AAA AAT 648
Glu Ala Gln Arg Lys His Gln Glu Glu Asp Lys Asn
190 195
GAA GAC AAG AGA AGT AAC GCA GGC TTT GAT GGA CGT 684
Glu Asp Lys Arg Ser Asn Ala Gly Phe Asp Gly Arg
200 205
TGG TAC ACA GAC ATC AAT GCT ACT CTT TCC TCT TTA 720
Trp Tyr Thr Asp Ile Asn Ala Thr Leu Ser Ser Leu
210 215 220
GAG AGC AAA GAC CAC CTA CAG GCA CCA GAC ATA GAG 756
Glu Ser Lys Asp His Leu Gln Ala Pro Asp Ile Glu
225 230
GAA CAC AAC ACA AAG AAA TTA GAC GAT GAC TTG GAA 792
Glu His Asn Thr Lys Lys Leu Asp Asp Asp Leu Glu
235 240 245
TTC TGG AAT AAG CCC TGT TTG TTT ACT CCT GAA TCA 828
Phe Trp Asn Lys Pro Cys Leu Phe Thr Pro Glu Ser
250 255
AGA TTG GAA ACT CTT AGA CAC ATG GAA AAA CAA CGG 864
Arg Leu Glu Thr Leu Arg His Met Glu Lys Gln Arg
260 265
AAG AAA CAG GAA AAA TTA AGT GAA AAA AAG AAG AAA 900
Lys Lys Gln Glu Lys Leu Ser Glu Lys Lys Lys Lys
270 275 280
GTG AAA CCA CCC AGG ACT TTG ATC ACT GAA GAT GGG 936
Val Lys Pro Pro Arg Thr Leu Ile Thr Glu Asp Gly
285 290
AAA GCC CTA AAT GTG AAT GAG CCC AAA ATT GAC TTC 972
Lys Ala Leu Asn Val Asn Glu Pro Lys Ile Asp Phe
295 300 305
TCT TTG AAA GAT AAC GAA AAG CAG ATC ATC CTG GAC 1008
Ser Leu Lys Asp Asn Glu Lys Gln Ile Ile Leu Asp
310 315
CTT GCT GTC TAT AGG TAT ATG GAT ACC TCT TTA ATC 1044
Leu Ala Val Tyr Arg Tyr Met Asp Thr Ser Leu Ile
320 325
GAT GTT GAT GTG CAA CCA ACT TAC GTG CGA GTA ATG 1080
Asp Val Asp Val Gln Pro Thr Tyr Val Arg Val Met
330 335 340
ATC AAA GGA AAG CCA TTT CAG CTT GTC CTT CCT GCA 1116
Ile Lys Gly Lys Pro Phe Gln Leu Val Leu Pro Ala
345 350
GAA GTG AAA CCC GAT AGT AGT TCT GCT AAA AGA TCT 1152
Glu Val Lys Pro Asp Ser Ser Ser Ala Lys Arg Ser
355 360 365
CAG ACA ACG GGT CAT TTG GTC ATC TGC ATG CCC AAG 1188
Gln Thr Thr Gly His Leu Val Ile Cys Met Pro Lys
370 375
GTA GGA GAA GTA ATC ACA GGT GGT CAG CGA GCA TTC 1224
Val Gly Glu Val Ile Thr Gly Gly Gln Arg Ala Phe
380 385
AAA TCT ATG AAA ACT ACC TCG GAC AGG AGC AGA GAA 1260
Lys Ser Met Lys Thr Thr Ser Asp Arg Ser Arg Glu
390 395 400
CAA ACA AAT ACA AGA AGC AAG CAC ATG GAG AAA CTA 1296
Gln Thr Asn Thr Arg Ser Lys His Met Glu Lys Leu
405 410
GAA GTA GAC CCT AGC AAG CAC TCA TTC CCT GAT GTG 1332
Glu Val Asp Pro Ser Lys His Ser Phe Pro Asp Val
415 420 425
ACT AAC ATA GTT CAA GAG AAA AAA CAC ACA CCC AGA 1368
Thr Asn Ile Val Gln Glu Lys Lys His Thr Pro Arg
430 435
AGA CGA CCT GAA CCC AAA ATT ATA CCA AGT GAG GAA 1404
Arg Arg Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu
440 445
GAC CCA ACC TTT GAA GAC AAC CCT GAA GTG CCT CCG 1440
Asp Pro Thr Phe Glu Asp Asn Pro Glu Val Pro Pro
450 455 460
CTG ATT TGA 1446
Leu Ile
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2184 bases
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
AGCTGGGAGC GCAGAGGCTC ACGCCTGTAA TCCATCATTT 40
GCTTAGGTCT GATCAATCTG CTCCACACAA TTTCTCAGTG 80
ATCCTCTGCA TCTCTGCCTA CAAGGGCCTC CCTGACACCC 120
AAGTTCATAT TGCTCAGAAA CAGTGAACTT GAGTTTTTCG 160
TTTTACCTTG ATCTCTCTCT GACAAAGAAA TCCAGATGAT 200
GCAACACCTG ATGAAGACAA TACATGGAAA 230
ATG ACA GTC TTG GAA ATA ACT TTG 254
Met Thr Val Leu Glu Ile Thr Leu
5
GCT GTC ATC CTG ACT CTA CTG GGA CTT GCC ATC CTG 290
Ala Val Ile Leu Thr Leu Leu Gly Leu Ala Ile Leu
10 15 20
GCT ATT TTG TTA ACA AGA TGG GCA CGA CGT AAG CAA 326
Ala Ile Leu Leu Thr Arg Trp Ala Arg Arg Lys Gln
25 30
AGT GAA ATG TAT ATC TCC AGA TAC AGT TCA GAA CAA 362
Ser Glu Met Tyr Ile Ser Arg Tyr Ser Ser Glu Gln
35 40
AGT GCT AGA CTT CTG GAC TAT GAG GAT GGT AGA GGA 398
Ser Ala Arg Leu Leu Asp Tyr Glu Asp Gly Arg Gly
45 50 55
TCC CGA CAT GCA TAT CAA CAC AAA GTG ACA CTT CAT 434
Ser Arg His Ala Tyr Gln His Lys Val Thr Leu His
60 65
ATG ATA ACC GAG AGA GAT CCA AAA AGA GAT TAC ACA 470
Met Ile Thr Glu Arg Asp Pro Lys Arg Asp Tyr Thr
70 75 80
CCA TCA ACC AAC TCT CTA GCA CTG TCT CGA TCA AGT 506
Pro Ser Thr Asn Ser Leu Ala Leu Ser Arg Ser Ser
85 90
ATT GCT TTA CCT CAA GGA TCC ATG AGT AGT ATA AAA 542
Ile Ala Leu Pro Gln Gly Ser Met Ser Ser Ile Lys
95 100
TGT TTA CAA ACA ACT GAA GAA CCT CCT TCC AGA ACT 578
Cys Leu Gln Thr Thr Glu Glu Pro Pro Ser Arg Thr
105 110 115
GCA GGA GCC ATG ATG CAA TTC ACA GCC CTA TTC CCG 614
Ala Gly Ala Met Met Gln Phe Thr Ala Leu Phe Pro
120 125
GAG CTA CAG GAC CTA TCA AGC TCT CTC AAA AAA CCA 650
Glu Leu Gln Asp Leu Ser Ser Ser Leu Lys Lys Pro
130 135 140
TTG TGC AAA CTC CAG GAC CTA TTG TAC AAT ATC TGG 686
Leu Cys Lys Leu Gln Asp Leu Leu Tyr Asn Ile Trp
145 150
ATC CAA TGT CAG ATC GCA TCT CAC ACA ATC ACT GGT 722
Ile Gln Cys Gln Ile Ala Ser His Thr Ile Thr Gly
155 160
CAC CTT CAG CAC CCG CGG TCA CCC ATG GCA CCC ATA 758
His Leu Gln His Pro Arg Ser Pro Met Ala Pro Ile
165 170 175
ATA ATT TCA CAG AGA ACC GCA AGT CAG CTG GCA GCA 794
Ile Ile Ser Gly Arg Thr Ala Ser Gln Leu Ala Ala
180 185
CCT ATA AGA ATA CCT CAA GTT CAC ACT ATG GAC AGT 830
Pro Ile Arg Ile Pro Gln Val His Thr Met Asp Ser
190 195 200
TCT GGA AAA ATC ACA CTG ACT CCT GTG GTT ATA TTA 866
Ser Gly Lys Ile Thr Leu Thr Pro Val Val Ile Leu
205 210
ACA GGT TAC ATG GAC GAA GAA CTT CGA AAA AAA TCT 902
Thr Gly Tyr Met Asp Glu Glu Leu Arg Lys Lys Ser
215 220
TGT TCC AAA ATC CAG ATT CTA AAA TGT GGA GGC ACT 938
Cys Ser Lys Ile Gln Ile Leu Lys Cys Gly Gly Thr
225 230 235
GCA AGG TCT CAG ATA GCC GAG AAG AAA ACA AGG AAG 974
Ala Arg Ser Gln Ile Ala Glu Lys Lys Thr Arg Lys
240 245
CAA CTA AAG AAT GAC ATC ATA TTT ACG AAT TCT GTA 1010
Gln Leu Lys Asn Asp Ile Ile Phe Thr Asn Ser Val
250 255 260
GAA TCC TTG AAA TCA GCA CAC ATA AAG GAG CCA GAA 1046
Glu Ser Leu Lys Ser Ala His Ile Lys Glu Pro Glu
265 270
AGA GAA GGA AAA GGC ACT GAT TTA GAG AAA GAC AAA 1082
Arg Glu Gly Lys Gly Thr Asp Leu Glu Lys Asp Lys
275 280
ATA GGA ATG GAG GTC AAG GTA GAC AGT GAC GCT GGA 1118
Ile Gly Met Glu Val Lys Val Asp Ser Asp Ala Gly
285 290 295
ATA CCA AAA AGA CAG GAA ACC CAA CTA AAA ATC AGT 1154
Ile Pro Lys Arg Gln Glu Thr Gln Leu Lys Ile Ser
300 305
GAA GAT GAG TAT ACC ACA AGG ACA GGG AGC CCA AAT 1190
Glu Asp Glu Tyr Thr Thr Arg Thr Gly Ser Pro Gln
310 315 320
AAA GAA AAG TGT GTC AGA TGT ACC AAG AGG ACA GGA 1226
Lys Glu Lys Cys Val Arg Cys Thr Lys Arg Thr Gly
325 330
GTC CAA GTA AAG AAG AGT GAG TCA GGT GTC CCA AAA 1262
Val Gln Val Lys Lys Ser Glu Ser Gly Val Pro Lys
335 340
GGA CAA GAA GCC CAA GTA ACG AAG AGT GGG TTG GTT 1298
Gly Gln Glu Ala Gln Val Thr Lys Ser Gly Leu Val
345 350 355
GTA CTG AAA GGA CAG GAA GCC CAG GTA GAG AAG AGT 1334
Val Leu Lys Gly Gln Glu Ala Gln Val Glu Lys Ser
360 365
GAG ATG GGT GTG CCA AGA AGA CAG GAA TCC CAA GTA 1370
Glu Met Gly Val Pro Arg Arg Gln Glu Ser Gln Val
370 375 380
AAG AAG AGT CAG TCT GGT GTC TCA AAG GGA CAG GAA 1406
Lys Lys Ser Gln Ser Gly Val Ser Lys Gly Gln Glu
385 390
GCC CAG GTA AAG AAG AGG GAG TCA GTT GTA CTG AAA 1442
Ala Gln Val Lys Lys Arg Glu Ser Val Val Leu Lys
395 400
GGA CAG GAA GCC CAG GTA GAG AAG AGT GAG TTG AAG 1478
Gly Gln Glu Ala Gln Val Glu Lys Ser Glu Leu Lys
405 410 415
GTA CCA AAA GGA CAA GAA GGC CAA GTA GAG AAG ACT 1514
Val Pro Lys Gly Gln Glu Gly Gln Val Glu Lys Thr
420 425
GAG GCA GAT GTG CCA AAG GAA CAA GAG GTC CAA GAA 1550
Glu Ala Asp Val Pro Lys Glu Gln Glu Val Gln Glu
430 435 440
AAG AAG AGT GAG GCA GGT GTA CTG AAA GGA CCA GAA 1586
Lys Lys Ser Glu Ala Gly Val Leu Lys Gly Pro Glu
445 450
TCC CAA GTA AAG AAC ACT GAG GTG AGT GTA CCA GAA 1622
Ser Gln Val Lys Asn Thr Glu Val Ser Val Pro Glu
455 460
ACA CTG GAA TCC CAA GTA AAG AAG AGT GAG TCA GGT 1658
Thr Leu Glu Ser Gln Val Lys Lys Ser Glu Ser Gly
465 470 475
GTA CTA AAA GGA CAG GAA GCC CAA GAA AAG AAG GAG 1694
Val Leu Lys Gly Gln Glu Ala Gln Glu Lys Lys Glu
480 485
AGT TTT GAG GAT AAA GGA AAT AAT GAT AAA GAA AAG 1730
Ser Phe Glu Asp Lys Gly Asn Asn Asp Lys Glu Lys
490 495 500
GAG AGA GAT GCA GAG AAA GAT CCA AAT AAA AAA GAA 1766
Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu
505 510
AAA GGT GAC AAA AAC ACA AAA GGT GAC AAA GGA AAG 1802
Lys Gly Asp Lys Asn Thr Lys Gly Asp Lys Gly Lys
515 520
GAC AAA GTT AAA GGA AAG AGA GAA TCA GAA ATC AAT 1838
Asp Lys Val Lys Gly Lys Arg Glu Ser Glu Ile Asn
525 530 535
GGT GAA AAA TCA AAA GGC TCG AAA AGG CGA AGG CAA 1874
Gly Glu Lys Ser Lys Gly Ser Lys Arg Arg Arg Gln
540 545
ATA CAG GAA GGA AGT ACA ACA AAA AAG TGG AAG AGT 1910
Ile Gln Glu Gly Ser Thr Thr Lys Lys Trp Lys Ser
550 555 560
AAG GAT AAA TTT TTT AAA GGC CCA TAA GACAAGTGAT 1946
Lys Asp Lys Phe Phe Lys Gly Pro
565
TATTATGATT CCCATACTCC AGATACAAAC CATATCCCAG 1986
CCATTGCCTA AACAGATTAC AATTATAAAA TCCCTTTCAT 2026
CTTCATATCA CAGTTTCTGC TCTTCAGAAG TTTCACCCTT 2066
TTTAATCTCT CAGCCACAAA CCTCAGTTCC AATATTGTTA 2106
TAAGTTAAGA CGTATATGAT TCCGTCAAGA AAGACTGGAT 2146
ACTTTCTGAA GTAAAACATT TTAATTAAAG AAAAAAAA 2184
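The sequence descriptions above pair each codon of the open reading frame with the residue it encodes. The following is a minimal sketch of how that pairing could be checked, assuming the standard genetic code; the function name `translate` and the lookup tables are illustrative and are not part of the specification. The two test fragments are the first eight codons and the final codons (with the stop codon and the start of the 3' untranslated region) of SEQ ID NO:32 as printed.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Whitespace in the input (the spacing used in the listing) is ignored.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for pos in range(0, len(seq) - 2, 3):
        amino = CODON_TABLE[seq[pos:pos + 3]]
        if amino == "*":  # TAA, TAG or TGA ends the reading frame
            break
        residues.append(THREE_LETTER[amino])
    return residues


# First eight codons of the SEQ ID NO:32 reading frame (bases 231-254).
orf_start = translate("ATG ACA GTC TTG GAA ATA ACT TTG")
# Last codons of the reading frame, the stop codon, and ten bases of 3' UTR.
orf_end = translate("AAG GAT AAA TTT TTT AAA GGC CCA TAA GACAAGTGAT")

print(" ".join(orf_start))
print(" ".join(orf_end))
```

Running the same function over a whole reading frame and comparing the result row by row with the printed residues is one way to spot transcription errors in a listing of this kind.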
Claims
1. A purified protein which is a testis-specific isoform of calpastatin.
2. The protein of Claim 1 which has the following sequence at its N-terminal:
Met Gly Gln Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser
5 10
Pro Ala Thr Val Ser Thr Ile Ser Phe Val Thr Val Asn
15 20 25
Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys
30 35 40
Gln
SEQ ID NO:1.
3. A peptide capable of producing an antibody that reacts specifically with a testis-specific isoform of calpastatin, said peptide having a sequence comprising a sequence which forms a B-cell epitope found on the testis-specific isoform of calpastatin and not on somatic isoforms of calpastatin.
4. The peptide of Claim 3 having the following sequence:
Met Gly Gln Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser Pro
5 10
Ala Thr Val Ser Thr Ile Ser Phe Val Thr Val Asn Ala Glu
15 20 25
Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys Gln,
30 35 40
SEQ ID NO:1
or a portion thereof that includes the sequence from amino acid 26 through amino acid 41.
5. The peptide of Claim 4 which has the following sequence:
Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr
5 10
Lys Gln
15
SEQ ID NO:2.
6. The peptide of Claim 4 which has the following sequence:
Thr Val Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser
5 10
Arg Thr Lys Gln
15
SEQ ID NO:3.
7. The peptide of Claim 4 which has the following sequence:
Ser Phe Val Thr Val Asn Ala Glu Glu Gln Glu Lys Gln Phe
5 10
Val Ser Ser Arg Thr Lys Gln
15 20
SEQ ID NO:4.
8. A peptide having a sequence which comprises the sequence of a T-cell epitope found on a testis-specific isoform of calpastatin.
9. An immunogen comprising the peptide of any one of Claims 3-7 linked to a carrier.
10. The immunogen of Claim 9 wherein the carrier is a peptide having a sequence comprising the sequence of a promiscuous T-cell epitope.
11. The immunogen of Claim 10 wherein the T-cell epitope has the following sequence:
Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr
5 10
Phe Pro Ser Val
15
SEQ ID NO:5.
12. The immunogen of Claim 11 wherein the carrier has the following sequence:
Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser Thr Lys
5 10
Ile Tyr Ser Tyr Phe Pro Ser Val
15 20
SEQ ID NO:6.
13. The immunogen of Claim 12 which has the following sequence:
Asn Ala Gly Glu Gln Glu Lys Gln Phe Leu Ser Ser Arg Thr
5 10
Lys Gln Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser
15 20 25
Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
30 35
SEQ ID NO:7.
14. A purified protein which is the protein produced by clone C-2 or a protein at least 70% homologous to the protein produced by clone C-2.
15. The protein of Claim 14 which contains the following sequence:
Thr Asn Ile Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg
5 10
Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu Asp Pro Thr Phe
15 20 25
Glu
SEQ ID NO:8.
16. A peptide capable of producing an antibody that reacts specifically with the protein of Claim 14, said peptide having a sequence comprising a sequence which forms a B-cell epitope of the protein of Claim 14.
17. The peptide of Claim 16 having the following sequence:
Thr Asn Ile Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg
5 10
Pro Glu Pro Lys Ile Ile Pro Ser Glu Glu Asp Pro Thr Phe
15 20 25
Glu,
SEQ ID NO:8
or a portion thereof that includes the sequence from amino acid 4 through amino acid 17.
18. The peptide of Claim 17 having the following sequence:
Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg
5 10
Pro Glu Pro Lys
15
SEQ ID NO:9.
19. A peptide having a sequence which comprises the sequence of a T-cell epitope of the protein of Claim 14.
20. An immunogen comprising the peptide of any one of Claims 15-18 linked to a carrier.
21. The immunogen of Claim 20 wherein the carrier is a peptide having a sequence comprising the sequence of a promiscuous T-cell epitope.
22. The immunogen of Claim 21 wherein the T-cell epitope has the following sequence:
Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr
5 10
Phe Pro Ser Val
15
SEQ ID NO:5.
23. The immunogen of Claim 22 wherein the carrier has the following sequence:
Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser Thr Lys
5 10
Ile Tyr Ser Tyr Phe Pro Ser Val
15 20
SEQ ID NO:6.
24. The immunogen of Claim 23 which has the following sequence:
Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg Pro Glu
5 10
Pro Lys Gly Pro Ser Leu Val Asp Asp Ala Leu Ile
15 20 25
Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
30 35
SEQ ID NO:10.
25. A purified protein which is the protein produced by clone L-7 or a protein at least 70% homologous to the protein produced by clone L-7.
26. The protein of Claim 25 which contains the following sequence:
Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val
5 10
Leu Lys Gly Gln Glu Ala
15 20
SEQ ID NO:11
and the following sequence:
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
5 10
Gly Asp Lys Asn
15
SEQ ID NO:12.
27. A peptide capable of producing an antibody that reacts specifically with the protein of Claim 24, said peptide having a sequence comprising a sequence which forms a B-cell epitope of the protein of Claim 24.
28. The peptide of Claim 27 having the following sequence:
Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val
5 10
Leu Lys Gly Gln Glu Ala
15 20
SEQ ID NO:11.
29. The peptide of Claim 27 having the following sequence:
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
5 10
Gly Asp Lys Asn
15
SEQ ID NO:12.
30. A peptide having a sequence which comprises the sequence of a T-cell epitope of the protein of Claim 24.
31. An immunogen comprising the peptide of any one of Claims 26-29 linked to a carrier.
32. The immunogen of Claim 31 wherein the carrier is a peptide having a sequence comprising the sequence of a promiscuous T-cell epitope.
33. The immunogen of Claim 32 wherein the T-cell epitope has the following sequence:
Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr
5 10
Phe Pro Ser Val
15
SEQ ID NO:5.
34. The immunogen of Claim 33 wherein the carrier has the following sequence:
Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser Thr Lys
5 10
Ile Tyr Ser Tyr Phe Pro Ser Val
15 20
SEQ ID NO:6.
35. The immunogen of Claim 34 which has the following sequence:
Lys Gly Gln Glu Ala Gln Val Lys Lys Arg Glu Ser Val Val
5 10
Leu Lys Gly Gln Glu Ala Gly Pro Ser Leu Val Asp Asp Ala
15 20 25
Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
30 35 40
SEQ ID NO:13.
36. The immunogen of Claim 34 which has the following sequence:
Lys Glu Arg Asp Ala Glu Lys Asp Pro Asn Lys Lys Glu Lys
5 10
Gly Asp Lys Asn Gly Pro Ser Leu Val Asp Asp Ala Leu Ile
15 20 25
Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val
30 35 40
SEQ ID NO:14.
37. A vaccine comprising a protein of any one of Claims 1-2, 14-15 and 25-26, or an immunogenic portion thereof, in a delivery system.
38. A vaccine comprising a peptide of any one of Claims 3-8, 16-19 and 27-30 in a delivery system.
39. A vaccine comprising an immunogen of Claim 9 in a delivery system.
40. A vaccine comprising an immunogen of Claim 20 in a delivery system.
41. A vaccine comprising an immunogen of Claim 31 in a delivery system.
42. A method of inhibiting fertilization of an egg by sperm comprising administering an effective amount of the vaccine of Claim 37 to a male or female mammal.
43. A method of inhibiting fertilization of an egg by sperm comprising administering an effective amount of the vaccine of Claim 38 to a male or female mammal.
44. A method of inhibiting fertilization of an egg by sperm comprising administering an effective amount of the vaccine of Claim 39 to a male or female mammal.
45. A method of inhibiting fertilization of an egg by sperm comprising administering an effective amount of the vaccine of Claim 40 to a male or female mammal.
46. A method of inhibiting fertilization of an egg by sperm comprising administering an effective amount of the vaccine of Claim 41 to a male or female mammal.
47. An assay for assessing infertility in a patient comprising:
(a) providing one or more of the following:
(i) a protein of Claim 1;
(ii) a protein of Claim 14;
(iii) a protein of Claim 25;
(iv) a peptide of Claim 3;
(v) a peptide of Claim 16;
(vi) a peptide of Claim 27;
(vii) a peptide of Claim 3 linked to a carrier;
(viii) a peptide of Claim 16 linked to a carrier;
(ix) a peptide of Claim 27 linked to a carrier;
(b) contacting the protein, peptide or peptide linked to a carrier with a body fluid of the patient; and
(c) determining if the body fluid of the patient contains antibodies that bind to the protein, peptide or peptide linked to a carrier.
48. An assay for assessing infertility in a patient comprising:
(a) providing one or more of the following:
(i) a protein of Claim 2;
(ii) a protein of Claim 15;
(iii) a protein of Claim 26;
(iv) a peptide of Claim 4;
(v) a peptide of Claim 17;
(vi) a peptide of Claim 28;
(vii) a peptide of Claim 29;
(viii) a peptide of Claim 4 linked to a carrier;
(ix) a peptide of Claim 17 linked to a carrier;
(x) a peptide of Claim 28 linked to a carrier;
(xi) a peptide of Claim 29 linked to a carrier;
(b) contacting the protein, peptide or peptide linked to a carrier with a body fluid of the patient; and
(c) determining if the body fluid of the patient contains antibodies that bind to the protein, peptide or peptide linked to a carrier.
49. A kit comprising at least one container, said container containing one or more of the following: (i) a protein of Claim 1; (ii) a protein of Claim 14; (iii) a protein of Claim 25; (iv) a peptide of Claim 3; (v) a peptide of Claim 16; (vi) a peptide of Claim 27; (vii) a peptide of Claim 3 linked to a carrier; (viii) a peptide of Claim 16 linked to a carrier; (ix) a peptide of Claim 27 linked to a carrier.
50. A kit comprising at least one container, said container containing one or more of the following: (i) a protein of Claim 2; (ii) a protein of Claim 15; (iii) a protein of Claim 26; (iv) a peptide of Claim 4; (v) a peptide of Claim 17; (vi) a peptide of Claim 28; (vii) a peptide of Claim 29; (viii) a peptide of Claim 4 linked to a carrier; (ix) a peptide of Claim 17 linked to a carrier; (x) a peptide of Claim 28 linked to a carrier; (xi) a peptide of Claim 29 linked to a carrier.
51. An isolated DNA molecule coding for the protein of Claim 1, 14 or 25.
52. The DNA molecule of Claim 51 operatively linked to expression control sequences.
53. A host cell comprising the DNA molecule of Claim 51 operatively linked to expression control sequences.
54. A method of producing a protein comprising culturing the host cell of Claim 53 under conditions permitting expression of the protein.
55. A DNA molecule coding for the peptide of Claim 3, 16 or 17.
56. The DNA molecule of Claim 55 wherein the peptide sequence further comprises the sequence of a promiscuous T-cell epitope.
57. The DNA molecule of Claim 55 or 56 operatively linked to expression control sequences.
58. A host cell comprising the DNA molecule of Claim 55 operatively linked to expression control sequences.
59. A method of producing a peptide comprising culturing the host cell of Claim 58 under conditions permitting expression of the peptide.
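Several of the claimed sequences are defined relative to one another: Claim 4 recites a portion spanning amino acids 26 through 41 of SEQ ID NO:1, and the immunogens of Claims 13 and 24 join a B-cell epitope peptide to the carrier of SEQ ID NO:6. The following is a minimal sketch of how those relationships could be checked from the sequences as printed in the claims, with Gln and Ile written in standard three-letter spelling. The helper names `residues`, `portion` and `mismatches` are illustrative assumptions, not terms from the claims.

```python
def residues(text: str) -> list[str]:
    """Split a three-letter-code sequence into a list of residues."""
    return text.split()


def portion(seq: list[str], first: int, last: int) -> list[str]:
    """Return residues first..last, using the claims' 1-based inclusive numbering."""
    return seq[first - 1:last]


def mismatches(a: list[str], b: list[str]) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Sequences transcribed from the claims as printed.
SEQ1 = residues(
    "Met Gly Gln Phe Leu Ser Ser Thr Phe Leu Glu Gly Ser Pro "
    "Ala Thr Val Ser Thr Ile Ser Phe Val Thr Val Asn Ala Glu "
    "Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys Gln")
SEQ2 = residues(
    "Asn Ala Glu Glu Gln Glu Lys Gln Phe Val Ser Ser Arg Thr Lys Gln")
SEQ5 = residues(
    "Val Asp Asp Ala Leu Ile Asn Ser Thr Lys Ile Tyr Ser Tyr "
    "Phe Pro Ser Val")
SEQ6 = residues(
    "Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser Thr Lys "
    "Ile Tyr Ser Tyr Phe Pro Ser Val")
SEQ7 = residues(
    "Asn Ala Gly Glu Gln Glu Lys Gln Phe Leu Ser Ser Arg Thr "
    "Lys Gln Gly Pro Ser Leu Val Asp Asp Ala Leu Ile Asn Ser "
    "Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val")
SEQ9 = residues(
    "Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg Pro Glu Pro Lys")
SEQ10 = residues(
    "Val Gln Glu Lys Lys His Thr Pro Arg Arg Arg Pro Glu "
    "Pro Lys Gly Pro Ser Leu Val Asp Asp Ala Leu Ile "
    "Asn Ser Thr Lys Ile Tyr Ser Tyr Phe Pro Ser Val")

# Claim 5: SEQ ID NO:2 is amino acids 26 through 41 of SEQ ID NO:1.
claim5_is_portion = portion(SEQ1, 26, 41) == SEQ2

# Claim 12: the carrier is Gly-Pro-Ser-Leu followed by the T-cell epitope.
carrier_is_linker_plus_epitope = residues("Gly Pro Ser Leu") + SEQ5 == SEQ6

# Claim 24: SEQ ID NO:10 is SEQ ID NO:9 followed by the carrier.
claim24_is_fusion = SEQ9 + SEQ6 == SEQ10

# Claim 13: the carrier part of SEQ ID NO:7 matches SEQ ID NO:6, while the
# first 16 residues as printed differ from SEQ ID NO:2 at two positions.
claim13_carrier_matches = SEQ7[16:] == SEQ6
claim13_epitope_differences = mismatches(SEQ7[:16], SEQ2)

print(claim5_is_portion, carrier_is_linker_plus_epitope, claim24_is_fusion)
print(claim13_carrier_matches, claim13_epitope_differences)
```

The same `portion` helper applies to the "amino acid 4 through amino acid 17" language of Claim 17, and the concatenation check applies to SEQ ID NO:13 and SEQ ID NO:14 in the same way.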
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU17053/97A AU1705397A (en) | 1996-01-16 | 1997-01-16 | Proteins and peptides for contraceptive vaccines and fertility diagnosis |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US58659296A | 1996-01-16 | 1996-01-16 | |
US586,592 | 1996-01-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1997026001A1 true WO1997026001A1 (en) | 1997-07-24 |
Family
ID=24346372
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1997/000908 WO1997026001A1 (en) | 1996-01-16 | 1997-01-16 | Proteins and peptides for contraceptive vaccines and fertility diagnosis |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU1705397A (en) |
WO (1) | WO1997026001A1 (en) |
-
1997
- 1997-01-16 WO PCT/US1997/000908 patent/WO1997026001A1/en active Application Filing
- 1997-01-16 AU AU17053/97A patent/AU1705397A/en not_active Abandoned
Non-Patent Citations (4)
Title |
---|
BIOCHEMISTRY AND MOLECULAR BIOLOGY INTERNATIONAL, May 1994, Vol. 33, No. 2, WANG et al., "Calpastatin in Human Testis", pages 245-252. * |
JOURNAL OF MOLECULAR RECOGNITION, 1993, Vol. 6, KAUMAYA et al., "Peptide Vaccines Incorporating a 'Promiscuous' T-cell Epitope Bypass Certain Haplotype Restricted Immune Responses and Provide Broad Spectrum Immunogenicity", pages 81-94. * |
REPRODUCTION FERTILITY AND DEVELOPMENT, 1994, Vol. 6, LIANG et al., "Human Testis cDNAs Identified by Sera from Infertile Patients: a Molecular Biological Approach to Immunocontraceptive Development", pages 297-305. * |
TECHNIQUES IN PROTEIN CHEMISTRY, 1993, Vol. IV, O'HEARN et al., "The Use of Molecular Modelling to Delineate B-cell and T-cell Epitopes of Human Sperm-specific LDH-C4", pages 481-490. * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999003490A1 (en) * | 1997-07-21 | 1999-01-28 | Northwestern University | Proteins and peptides for contraceptive vaccines and fertility diagnosis |
WO2000026360A1 (en) * | 1998-11-03 | 2000-05-11 | Adherex Technologies Inc. | Compounds and methods for modulating claudin-mediated functions |
US6723700B1 (en) | 1998-11-03 | 2004-04-20 | Adherex Technologies, Inc. | Compounds and methods for modulating claudin-mediated functions |
US6756356B2 (en) | 1998-11-03 | 2004-06-29 | Adherex Technologies, Inc. | Compounds and methods for modulating claudin-mediated functions |
US6830894B1 (en) | 1998-11-03 | 2004-12-14 | Adherex Technologies, Inc. | Compounds and methods for modulating claudin-mediated functions |
EP2277905A3 (en) * | 2002-03-13 | 2011-05-25 | Ganymed Pharmaceuticals AG | Differential in tumour gene products and use of same |
WO2004055185A1 (en) * | 2002-12-18 | 2004-07-01 | Takeda Pharmaceutical Company Limited | Novel protein and dna thereof |
Also Published As
Publication number | Publication date |
---|---|
AU1705397A (en) | 1997-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5616322A (en) | Sperm antigen corresponding to a sperm zona binding protein autoantigenic epitope | |
EP0671926B1 (en) | Immunomodulatory peptides | |
CA2124953C (en) | Peptides related to prion proteins | |
US5721348A (en) | DNA encoding PH-20 proteins | |
WO1994004171A9 (en) | Immunomodulatory peptides | |
Kaul et al. | Expression of bonnet monkey (Macaca radiata) zona pellucida‐3 (ZP3) in a prokaryotic system and its immunogenicity | |
CA2181590C (en) | Peptomers with enhanced immunogenicity | |
US5693496A (en) | DNA encoding the mouse and human PH30 beta chain protein | |
WO1997026001A1 (en) | Proteins and peptides for contraceptive vaccines and fertility diagnosis | |
CA2058999A1 (en) | Contraceptive vaccine based on cloned zona pellucida gene | |
NZ239518A (en) | Human zona pellucida protein-3, (zp3), fragments, recombinant products and immunocontraceptive vaccine | |
Hogrefe et al. | Immunogenicity of synthetic peptides corresponding to flexible and antibody-accessible segments of mouse lactate dehydrogenase (LDH)-C4 | |
Afzalpurkar et al. | Induction of native protein reactive antibodies by immunization with peptides containing linear B-cell epitopes defined by anti-porcine ZP3β monoclonal antibodies | |
AU652611B2 (en) | Analogs of piscine LHRH | |
Sivapurapu et al. | Efficacy of antibodies against Escherichia coli expressed chimeric recombinant protein encompassing multiple epitopes of zona pellucida glycoproteins to inhibit in vitro human sperm–egg binding | |
NO854167L (en) | ANTIQUE PEPTID RELATIONS. | |
HUT77262A (en) | New immunocontraceptive peptides | |
US6455041B1 (en) | Immunogenic epitopes of the human zona pellucida protein (ZP1) | |
WO1999003490A1 (en) | Proteins and peptides for contraceptive vaccines and fertility diagnosis | |
EP0646015B1 (en) | Contraceptive vaccine | |
EP1162880A1 (en) | Human sperm surface antigens | |
Manfra et al. | Expression and purification of two recombinant sterol-carrier proteins: SCPX and SCP2 | |
WO1997039020A2 (en) | Antigenic sequences of a sperm protein and immunocontraceptive methods | |
Naz | Cloning and Sequencing of Sperm Antigens Involved in Oocyte Interaction: Clinical applications in infertility and immunocontraception | |
JP2006516896A (en) | ePAD, oocyte-specific protein |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
| AK | Designated states | Kind code of ref document: A1; Designated state(s): AU CA CN IL JP |
| AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| NENP | Non-entry into the national phase | Ref country code: JP; Ref document number: 97526273; Format of ref document f/p: F |
| NENP | Non-entry into the national phase | Ref country code: CA |
| 122 | Ep: pct application non-entry in european phase | |