DOI: 10.5555/3635637.3663252
MAGNets: Micro-Architectured Group Neural Networks

Published: 06 May 2024

Abstract

Reinforcement Learning (RL) algorithms have achieved human-level performance in complex domains such as games, autonomous vehicles, and industrial robots. However, the Deep Neural Networks (DNNs) used to approximate large Deep Reinforcement Learning (DRL) policies are resource-hungry and opaque, which limits the applicability of DRL in safety-critical applications running on resource-constrained platforms. On the other hand, inspecting the design of most multi-output safety-critical embedded control applications reveals that such systems often derive each output from intermediate artifacts, which are in turn derived from the input variables. Given these dependencies of internal system states on inputs, each derived artifact can plausibly be approximated by a smaller network in a multi-network DRL setting. In this work, we propose Micro-Architectured Group Neural Networks (MAGNets), which distill the learning of a large DRL network into multiple small neural networks. Using several OpenAI Gym environments, we show that existing verification tools can verify the outputs of MAGNets while preserving the performance of the large neural policy. We also report gains in network compactness, which directly improve execution latency and applicability on edge devices.
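The decomposition described in the abstract can be illustrated with a minimal sketch: rather than one large policy mapping all inputs to all outputs, each output is approximated by its own small "micro" network that reads only the inputs it depends on, trained by distillation against the large teacher. The teacher, the input-dependency groups, and the linear students below are illustrative assumptions, not the paper's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher: a fixed random "large" policy mapping 8 inputs -> 3 outputs
# (stands in for a trained DRL policy network).
W1 = rng.normal(size=(8, 32))
W2 = rng.normal(size=(32, 3))

def teacher(x):
    return np.tanh(x @ W1) @ W2

# Assumed dependency structure: output i depends only on a subset of inputs.
groups = [[0, 1, 2], [2, 3, 4, 5], [5, 6, 7]]

# Students: one tiny linear model per output group, fitted by gradient
# descent on the distillation (mean-squared) loss against the teacher.
X = rng.normal(size=(512, 8))
Y = teacher(X)
students = [np.zeros(len(g)) for g in groups]

def distill_loss():
    preds = np.stack([X[:, g] @ w for g, w in zip(groups, students)], axis=1)
    return float(np.mean((preds - Y) ** 2))

before = distill_loss()
for _ in range(200):
    for i, (g, w) in enumerate(zip(groups, students)):
        err = X[:, g] @ w - Y[:, i]             # per-output residual vs. teacher
        w -= 0.01 * (X[:, g].T @ err) / len(X)  # gradient step on MSE
after = distill_loss()
print(before, after)
```

Because each student is small and reads only a few inputs, each one is far more amenable to the kinds of neural-network verification tools cited by the paper than the monolithic teacher would be.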



Published In

AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems
May 2024
2898 pages
ISBN:9798400704864

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC


Author Tags

  1. edge computation
  2. edge machine learning
  3. knowledge distillation
  4. reinforcement learning

Qualifiers

  • Research-article

Conference

AAMAS '24

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
