research-article
Open access

Incremental and Approximate Computations for Accelerating Deep CNN Inference

Published: 06 December 2020

Abstract

Deep learning now offers state-of-the-art accuracy for many prediction tasks. Deep convolutional neural networks (CNNs), one form of deep learning, are especially popular for image, video, and time series data. Because of its high computational cost, CNN inference is often a bottleneck in analytics tasks on such data. Thus, much work in the computer architecture, systems, and compilers communities studies how to make CNN inference faster. In this work, we show that by elevating the abstraction level and re-imagining CNN inference as queries, we can bring database-style query optimization techniques to bear on CNN inference efficiency. We focus on tasks that perform CNN inference repeatedly on inputs that differ only slightly. We identify two popular CNN tasks with this behavior: occlusion-based explanations (OBE) and object recognition in videos (ORV). OBE is a popular method for “explaining” CNN predictions. It outputs a heatmap over the input to show which regions (e.g., image pixels) mattered most for a given prediction, and it leads to many re-inference requests on locally modified inputs. ORV uses CNNs to identify and track objects across video frames, and it also leads to many re-inference requests. We cast such tasks in a unified manner as a novel instance of the incremental view maintenance problem and create a comprehensive algebraic framework for incremental CNN inference that reduces computational costs. We produce materialized views of features produced inside a CNN and connect them with a novel multi-query optimization scheme for CNN re-inference. Finally, we devise novel OBE-specific and ORV-specific approximate inference optimizations that exploit the semantics of these tasks. We prototype our ideas in Python to create a tool called Krypton that supports both CPUs and GPUs. Experiments with real data and CNNs show that Krypton reduces runtimes by up to 5× (respectively, 35×) to produce exact (respectively, high-quality approximate) results without raising resource requirements.
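
To make the idea of incremental re-inference concrete, the following is a minimal sketch for a single stride-1, unpadded convolution layer in plain Python. It is not Krypton's implementation: the function names (`conv2d_full`, `affected_range`, `conv2d_incremental`, `occlude`), the 16×16 toy image, and the 3×3 kernel are all assumptions chosen for illustration. The sketch treats the layer output on the original image as a materialized view, occludes a patch, and recomputes only the output cells whose receptive field overlaps that patch.

```python
import random


def conv_cell(image, kernel, i, j):
    """One output cell of a stride-1, unpadded 2D convolution (cross-correlation)."""
    k = len(kernel)
    return sum(image[i + a][j + b] * kernel[a][b]
               for a in range(k) for b in range(k))


def conv2d_full(image, kernel):
    """Full (non-incremental) convolution over the whole image."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[conv_cell(image, kernel, i, j) for j in range(out_w)]
            for i in range(out_h)]


def affected_range(start, size, k, out_len):
    """Output indices whose k-wide window overlaps input indices
    start .. start + size - 1, clamped to the valid output range."""
    lo = max(0, start - k + 1)
    hi = min(out_len - 1, start + size - 1)
    return range(lo, hi + 1)


def conv2d_incremental(old_out, new_image, kernel, top, left, ph, pw):
    """Reuse old_out (the materialized view) and recompute only the cells
    affected by a modified ph x pw patch whose corner is at (top, left)."""
    k = len(kernel)
    out = [row[:] for row in old_out]
    rows = affected_range(top, ph, k, len(out))
    cols = affected_range(left, pw, k, len(out[0]))
    for i in rows:
        for j in cols:
            out[i][j] = conv_cell(new_image, kernel, i, j)
    return out, len(rows) * len(cols)


def occlude(image, top, left, ph, pw, fill=0):
    """Return a copy of image with a ph x pw patch overwritten by fill."""
    out = [row[:] for row in image]
    for r in range(top, top + ph):
        for c in range(left, left + pw):
            out[r][c] = fill
    return out


random.seed(0)
image = [[random.randint(0, 9) for _ in range(16)] for _ in range(16)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]  # illustrative 3x3 filter

base = conv2d_full(image, kernel)              # computed once, then reused
occluded = occlude(image, top=5, left=6, ph=4, pw=4)

incremental, recomputed = conv2d_incremental(base, occluded, kernel, 5, 6, 4, 4)
full = conv2d_full(occluded, kernel)           # baseline: redo everything
total_cells = len(full) * len(full[0])

print(incremental == full)                     # True: the result is exact
print(recomputed, "of", total_cells)           # 36 of 196 cells recomputed
```

In this toy setting the incremental path touches 36 of 196 output cells for one occlusion position. In a multi-layer CNN the affected region generally grows from layer to layer, so the savings depend on the architecture and patch size.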

Information & Contributors

Information

Published In

ACM Transactions on Database Systems, Volume 45, Issue 4
SIGMOD 2019 Best Paper, PODS 2019 Best Paper, and Regular Papers
December 2020
170 pages
ISSN: 0362-5915
EISSN: 1557-4644
DOI: 10.1145/3441631
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 December 2020
Online AM: 07 May 2020
Accepted: 01 April 2020
Received: 01 January 2020
Published in TODS Volume 45, Issue 4

Author Tags

  1. Incremental view maintenance
  2. convolutional neural network explainability
  3. multi-query optimization
  4. systems for machine learning

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Hellman Fellowship and the NIDDK of the NIH

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 257
  • Downloads (Last 6 weeks): 43
Reflects downloads up to 19 Nov 2024

Citations

Cited By

  • (2024) Hybrid Evaluation for Occlusion-based Explanations on CNN Inference Queries. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 953-966. DOI: 10.1109/ICDE60146.2024.00078. Online publication date: 13-May-2024
  • (2024) A Lightweight High-Resolution Remote Sensing Image Cultivated Land Extraction Method Integrating Transfer Learning and SENet. IEEE Access 12, 113694-113704. DOI: 10.1109/ACCESS.2024.3441850. Online publication date: 2024
  • (2024) An improved semantic segmentation algorithm for high-resolution remote sensing images based on DeepLabv3+. Scientific Reports 14:1. DOI: 10.1038/s41598-024-60375-1. Online publication date: 27-Apr-2024
  • (2024) Toward Large-Scale Plenoptic Reconstruction. Plenoptic Imaging and Processing, 191-325. DOI: 10.1007/978-981-97-6915-5_5. Online publication date: 16-Oct-2024
  • (2024) BAFFLE: A Baseline of Backpropagation-Free Federated Learning. Computer Vision – ECCV 2024, 89-109. DOI: 10.1007/978-3-031-73226-3_6. Online publication date: 29-Sep-2024
  • (2023) A Lightweight Automatic Wildlife Recognition Model Design Method Mitigating Shortcut Learning. Animals 13:5, 838. DOI: 10.3390/ani13050838. Online publication date: 25-Feb-2023
  • (2023) InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models. Proceedings of the 17th ACM Conference on Recommender Systems, 430-442. DOI: 10.1145/3604915.3608778. Online publication date: 14-Sep-2023
  • (2022) Explainable AI: Foundations, Applications, Opportunities for Data Management Research. Proceedings of the 2022 International Conference on Management of Data, 2452-2457. DOI: 10.1145/3514221.3522564. Online publication date: 10-Jun-2022
  • (2022) INS-Conv: Incremental Sparse Convolution for Online 3D Segmentation. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18953-18962. DOI: 10.1109/CVPR52688.2022.01840. Online publication date: Jun-2022
  • (2021) Edge Intelligence: Empowering Intelligence to the Edge of Network. Proceedings of the IEEE 109:11, 1778-1837. DOI: 10.1109/JPROC.2021.3119950. Online publication date: Nov-2021
