DOI: 10.5555/3698900.3699285

False claims against model ownership resolution

Published: 12 August 2024

Abstract

Deep neural network (DNN) models are valuable intellectual property of model owners, constituting a competitive advantage. Therefore, it is crucial to develop techniques to protect against model theft. Model ownership resolution (MOR) is a class of techniques that can deter model theft. A MOR scheme enables an accuser to assert an ownership claim for a suspect model by presenting evidence, such as a watermark or fingerprint, to show that the suspect model was stolen or derived from a source model owned by the accuser. Most of the existing MOR schemes prioritize robustness against malicious suspects, ensuring that the accuser will win if the suspect model is indeed a stolen model.
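In a trigger-set MOR scheme, the verification step reduces to checking agreement between the suspect model and the accuser's claimed labels. Below is a minimal sketch of that generic check, an editorial illustration rather than any specific scheme from the paper; verify_claim, the 0.9 threshold, and the tensor layout are hypothetical placeholders.

# Minimal sketch (illustrative, not from the paper): a judge accepts an
# ownership claim if the suspect model reproduces the accuser's claimed
# labels on the presented trigger set often enough.
import torch

def verify_claim(suspect_model, triggers, labels, threshold=0.9):
    # All names and the threshold are placeholders, not a real scheme.
    suspect_model.eval()
    with torch.no_grad():
        preds = suspect_model(triggers).argmax(dim=1)  # predicted classes
    match_rate = (preds == labels).float().mean().item()
    return match_rate >= threshold  # claim accepted above threshold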
In this paper, we show that common MOR schemes in the literature are vulnerable to a different, equally important, but insufficiently explored robustness concern: a malicious accuser. We show how malicious accusers can successfully make false claims against independent suspect models that were not stolen. Our core idea is that a malicious accuser can deviate (without detection) from the specified MOR process by finding (transferable) adversarial examples that successfully serve as evidence against independent suspect models. To this end, we first generalize the procedures of common MOR schemes and show that, under this generalization, defending against false claims is as challenging as preventing (transferable) adversarial examples. Via systematic empirical evaluation, we show that our false claim attacks always succeed in MOR schemes that follow our generalization, including against a real-world model: Amazon's Rekognition API.
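To make the core idea concrete, here is a minimal, hypothetical sketch (not the authors' code) of how a malicious accuser could manufacture such evidence: targeted projected gradient descent (PGD) on a locally trained surrogate model produces perturbed inputs that, when the perturbations transfer, an independent suspect model also classifies as the accuser's chosen labels, so the (input, label) pairs pass a check like verify_claim above. craft_false_evidence and all hyperparameters are assumptions for illustration.

# Minimal sketch (illustrative, not from the paper): targeted PGD on a
# local surrogate; transferable perturbations often fool independently
# trained models too, yielding false "watermark" evidence (x_adv, y_target).
import torch
import torch.nn.functional as F

def craft_false_evidence(surrogate, x, y_target,
                         eps=8/255, alpha=2/255, steps=40):
    surrogate.eval()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(surrogate(x_adv), y_target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()                    # step toward y_target
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project to l_inf ball
            x_adv = x_adv.clamp(0.0, 1.0)                          # keep valid pixel range
    return x_adv.detach()

Note that under this sketch the accuser never needs the suspect model's weights: transferability alone does the work, which is why defending against false claims is as hard as preventing transferable adversarial examples.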



Published In

SEC '24: Proceedings of the 33rd USENIX Conference on Security Symposium
August 2024
7480 pages
ISBN: 978-1-939133-44-1

Sponsors

  • Bloomberg Engineering
  • Google Inc.
  • NSF
  • Futurewei Technologies
  • IBM

Publisher

USENIX Association

United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 40 of 100 submissions, 40%
