short-paper

Towards building robust DNN applications: an industrial case study of evolutionary data augmentation

Authors:

Haruki Yokoyama,

Shinji Kikuchi

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering

Pages 1184 - 1188

https://doi.org/10.1145/3324884.3421841

Published: 27 January 2021

Abstract

Data augmentation techniques, which enlarge the training set by applying realistic transformations to existing samples, are used in machine learning to improve accuracy. Recent studies have demonstrated that data augmentation techniques improve the robustness of image classification models on open datasets; however, whether these techniques are effective for industrial datasets has yet to be investigated. In this study, we investigate the feasibility of data augmentation techniques for industrial use. We evaluate data augmentation techniques in image classification and object detection tasks using an industrial in-house graphical user interface dataset. The results indicate that the genetic algorithm-based data augmentation technique outperforms two random-based methods in terms of the robustness of the image classification model. In addition, through this evaluation and interviews with the developers, we learned the following two lessons: data augmentation techniques should (1) maintain the training speed to avoid slowing down development and (2) be extensible to a variety of tasks.
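
To make the contrast between genetic algorithm-based and random-based augmentation concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes each candidate augmentation is a parameter vector (rotation, shift, brightness), and that fitness is the model's loss on the transformed sample, so the search favors variants the model currently handles poorly. The bounds, the function names, and `toy_loss` (a stand-in for a real model loss) are all hypothetical.

```python
import random

# Hypothetical search space: (rotation in degrees, shift in pixels, brightness factor).
BOUNDS = [(-30.0, 30.0), (-3.0, 3.0), (0.8, 1.2)]


def random_chromosome(rng):
    """Sample one transformation parameter vector uniformly within BOUNDS."""
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def mutate(chrom, rng, scale=0.1):
    """Perturb one randomly chosen gene and clip it back into its bounds."""
    child = list(chrom)
    i = rng.randrange(len(child))
    lo, hi = BOUNDS[i]
    child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, scale * (hi - lo))))
    return child


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def evolve(fitness, rng, pop_size=8, generations=10):
    """Return the highest-fitness chromosome found by a simple elitist GA."""
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # keep the fitter half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def random_search(fitness, rng, trials):
    """Random-based baseline: best of `trials` independent samples."""
    return max((random_chromosome(rng) for _ in range(trials)), key=fitness)


def toy_loss(chrom):
    """Stand-in for a model's loss on the transformed image (assumption)."""
    rotation, shift, brightness = chrom
    return abs(rotation) / 30.0 + abs(shift) / 3.0 + abs(brightness - 1.0) / 0.2


if __name__ == "__main__":
    rng = random.Random(0)
    print("GA pick:    ", evolve(toy_loss, rng))
    print("random pick:", random_search(toy_loss, rng, trials=8))
```

In a real training loop, `toy_loss` would be replaced by the current model's loss on the transformed sample, and the selected variant would be added to the training batch. Because every generation requires extra loss evaluations, a search of this kind adds training cost, which is the tension behind lesson (1) above.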

Index Terms

Towards building robust DNN applications: an industrial case study of evolutionary data augmentation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Fault tree analysis

Information & Contributors

Information

Published In

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering

December 2020

1449 pages

ISBN:9781450367684

DOI:10.1145/3324884

General Chair:
John Grundy,
Program Chairs:
Claire Le Goues,
David Lo

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Sponsors

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

ASE '20

ASE '20: 35th IEEE/ACM International Conference on Automated Software Engineering

December 21 - 25, 2020

Virtual Event, Australia

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 5
Total Downloads: 98
Downloads (Last 12 months): 13
Downloads (Last 6 weeks): 2

Reflects downloads up to 18 Nov 2024

Citations

Cited By

Jiang J, Yang J, Zhang Y, Wang Z, You H, Chen J (2024). A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation. ACM Transactions on Software Engineering and Methodology 33:3 (1-41). Online publication date: 15-Mar-2024.
https://dl.acm.org/doi/10.1145/3630011
Yoosefi A, Kargahi M (2024). Resource-aware in-edge distributed real-time deep learning. Internet of Things 27 (101263). Online publication date: Oct-2024.
https://doi.org/10.1016/j.iot.2024.101263
Li N, Ma L, Xing T, Yu G, Wang C, Wen Y, Cheng S, Gao S (2023). Automatic design of machine learning via evolutionary computation: A survey. Applied Soft Computing 143 (110412). Online publication date: Aug-2023.
https://doi.org/10.1016/j.asoc.2023.110412
Zhang Y, Wang Z, Jiang J, You H, Chen J (2022). Toward Improving the Robustness of Deep Learning Models via Model Transformation. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-13). Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3551349.3556920
Shivashankar K, Martini A (2022). Maintainability Challenges in ML: A Systematic Literature Review. 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) (60-67). Online publication date: Aug-2022.
https://doi.org/10.1109/SEAA56994.2022.00018
