Digital watermarking for deep neural networks

  • Regular Paper
  • International Journal of Multimedia Information Retrieval

Abstract

Although deep neural networks have made tremendous progress in the area of multimedia representation, training neural models requires a large amount of data and time. It is well known that using trained models as initial weights often achieves lower training error than training networks that are not pre-trained; such a fine-tuning step helps to both reduce the computational cost and improve the performance. Sharing trained models has therefore been very important for the rapid progress of research and development. In addition, trained models can be important assets for the owner(s) who trained them; hence, we regard trained models as intellectual property. In this paper, we propose a digital watermarking technology for ownership authorization of deep neural networks. First, we formulate a new problem: embedding watermarks into deep neural networks. We also define requirements, embedding situations, and attack types for watermarking deep neural networks. Second, we propose a general framework for embedding a watermark in model parameters using a parameter regularizer. Our approach does not impair the performance of the host network, because the watermark is embedded while the host network is trained. Finally, we perform comprehensive experiments to reveal the potential of watermarking deep neural networks as the basis of this new research effort. We show that our framework can embed a watermark during training of a deep neural network from scratch, during fine-tuning, and during distilling, without impairing performance. The embedded watermark does not disappear even after fine-tuning or parameter pruning; the watermark remains intact even after 65% of the parameters are pruned.
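To make the regularizer-based embedding concrete, the following is a minimal NumPy sketch in the spirit of the framework above: a secret random projection matrix X maps the mean convolutional kernel w to T logits, and a binary cross-entropy term pushes \(\sigma(Xw)\) toward the watermark bits b during training. The bit length, layer shape, and regularization weight lam are illustrative assumptions rather than the paper's settings; the authors' actual implementation is linked in Note 5 below.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 64                            # watermark length in bits (illustrative)
S, D, L = 3, 16, 32               # kernel size, input channels, filters (illustrative)
M = S * S * D                     # dimension of the flattened mean kernel

X = rng.standard_normal((T, M))   # secret projection matrix: the embedding key
b = rng.integers(0, 2, size=T)    # watermark bits to embed

def embedding_regularizer(W, lam=0.01):
    """Binary cross-entropy term that pushes sigmoid(X @ w) toward the bits b.

    W has shape (S, S, D, L). Averaging over the L filters makes the embedded
    watermark invariant to a permutation of the filters.
    """
    w = W.mean(axis=3).reshape(-1)             # mean kernel, shape (M,)
    y = 1.0 / (1.0 + np.exp(-(X @ w)))         # sigmoid of the projection
    eps = 1e-12                                # guard against log(0)
    bce = -(b * np.log(y + eps) + (1 - b) * np.log(1 - y + eps)).sum()
    return lam * bce                           # added to the task loss each step

W = 0.05 * rng.standard_normal((S, S, D, L))   # stand-in for conv weights
print(embedding_regularizer(W))                # extra loss minimized during training
```

Because the gradient of this term flows into W together with the task loss, embedding needs no separate training stage, which is why it does not impair the host network's performance.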

Notes

  1. https://github.com/BVLC/caffe/wiki/Model-Zoo.

  2. https://www.amazon.com/skills/.

  3. Fully connected layers can also be used, but we focus on convolutional layers here, because fully connected layers are often discarded in fine-tuning.

  4. Although this single-layer perceptron can be deepened into a multilayer perceptron, we focus on the simplest form in this paper; a detection sketch follows these notes.

  5. https://github.com/yu4u/dnn-watermark.

  6. Note that the learning rate was re-initialized to 0.1 at the beginning of the second training, whereas it had decayed to \(8.0 \times 10^{-4}\) by the end of the first training.

  7. This size is extremely small compared with their original sizes (roughly \(300 \times 200\)).
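The detector in Note 4 is a single linear map followed by a sigmoid and a threshold, so extraction reduces to checking the sign of \(Xw\). Below is a self-contained sketch that also imitates the pruning experiment from the abstract by zeroing the 65% of weights with the smallest magnitude before re-extracting the bits. The shapes and the key are illustrative assumptions, and the stand-in weights are random rather than a trained, watermarked layer; with a properly embedded watermark the projections sit far from the decision boundary, which is why the bits survive pruning.

```python
import numpy as np

rng = np.random.default_rng(1)
T, M, L = 64, 144, 32                   # bits, mean-kernel dim, filters (illustrative)
X = rng.standard_normal((T, M))         # the same secret key used at embedding time

def extract_bits(W):
    """Single-layer perceptron detector: bit j is 1 iff sigmoid((Xw)_j) >= 0.5,
    which is equivalent to (Xw)_j >= 0, so the sigmoid can be skipped."""
    w = W.mean(axis=1)                  # mean kernel over the L filters, shape (M,)
    return (X @ w >= 0.0).astype(int)

# Stand-in for a trained layer; in practice W would carry an embedded watermark
# with large margins |Xw|, so take the bits extracted here as the ground truth.
W = rng.standard_normal((M, L))
b = extract_bits(W)

# Magnitude pruning: zero the 65% of weights with the smallest absolute value.
pruned = W.copy()
pruned[np.abs(pruned) < np.quantile(np.abs(pruned), 0.65)] = 0.0

print((extract_bits(pruned) == b).mean())   # fraction of watermark bits that survive
```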

Author information

Correspondence to Yuki Nagai.

Additional information

Y. Uchida and S. Sakazawa: This work was done when the authors were at KDDI Research, Inc.

About this article

Cite this article

Nagai, Y., Uchida, Y., Sakazawa, S. et al. Digital watermarking for deep neural networks. Int J Multimed Info Retr 7, 3–16 (2018). https://doi.org/10.1007/s13735-018-0147-1
