Abstract
Although deep neural networks have made tremendous progress in the area of multimedia representation, training neural models requires a large amount of data and time. It is well known that using trained models as initial weights often achieves lower training error than starting from randomly initialized weights, and such a fine-tuning step helps to both reduce the computational cost and improve performance. Sharing trained models has therefore been very important for the rapid progress of research and development. In addition, trained models can be valuable assets to the owner(s) who trained them, so we regard trained models as intellectual property. In this paper, we propose a digital watermarking technology for ownership authorization of deep neural networks. First, we formulate a new problem: embedding watermarks into deep neural networks. We also define requirements, embedding situations, and attack types for watermarking deep neural networks. Second, we propose a general framework for embedding a watermark into model parameters using a parameter regularizer. Our approach does not impair the performance of the host network because the watermark is embedded while the network is being trained. Finally, we perform comprehensive experiments to reveal the potential of watermarking deep neural networks as the basis of this new research effort. We show that our framework can embed a watermark during training of a deep neural network from scratch, during fine-tuning, and during distillation, without impairing the network's performance. The embedded watermark does not disappear even after fine-tuning or parameter pruning; it remains intact even after 65% of the parameters are pruned.
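The framework's core mechanism, adding a watermark regularizer to the ordinary training loss, can be illustrated with a short sketch. The following is a minimal, hedged PyTorch example, not the authors' implementation; the helper names (`carrier`, `watermark_loss`, `extract_watermark`), the choice of layer, and `embed_strength` are assumptions made for illustration. A secret random matrix projects the averaged, flattened convolution kernels onto logits, and a binary cross-entropy term pulls those logits toward a secret bit string while the host task is trained as usual.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

T = 256                                  # watermark length in bits
b = torch.randint(0, 2, (T,)).float()    # secret watermark bits (kept by the owner)

# Host layer assumed to carry the watermark; a 3x3 conv layer is used here.
conv = torch.nn.Conv2d(64, 64, kernel_size=3)

def carrier(conv):
    # Average the kernels over the output-filter axis and flatten,
    # giving a vector of length in_channels * k * k (64 * 3 * 3 = 576).
    return conv.weight.mean(dim=0).flatten()

# Secret embedding (projection) matrix, also kept by the owner.
X = torch.randn(T, carrier(conv).numel())

def watermark_loss(conv):
    # Regularizer: binary cross-entropy between the projected
    # carrier weights and the secret bits.
    logits = X @ carrier(conv)
    return F.binary_cross_entropy_with_logits(logits, b)

# Inside the normal training loop (embed_strength is a hyperparameter):
#   loss = task_loss + embed_strength * watermark_loss(conv)

def extract_watermark(conv):
    # Ownership check: project the suspect model's weights and threshold.
    return (torch.sigmoid(X @ carrier(conv)) > 0.5).float()
```

Because the regularizer is differentiable, the watermark is embedded by ordinary gradient descent alongside the host task; detection afterwards requires only the secret matrix and the bit string.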
Notes
Fully connected layers can also be used, but we focus on convolutional layers here, because fully connected layers are often discarded in fine-tuning.
Although this single-layer perceptron can be deepened into a multilayer perceptron, we focus on the simplest case in this paper.
Note that the learning rate was re-initialized to 0.1 at the beginning of the second training, whereas it had been reduced to \(8.0 \times 10^{-4}\) by the end of the first training.
This size is extremely small compared with their original sizes (roughly \(300 \times 200\)).
Additional information
Y. Uchida and S. Sakazawa: This work was done when the authors were at KDDI Research, Inc.
Cite this article
Nagai, Y., Uchida, Y., Sakazawa, S. et al. Digital watermarking for deep neural networks. Int J Multimed Info Retr 7, 3–16 (2018). https://doi.org/10.1007/s13735-018-0147-1