
A Self-Organizing Neuro-Fuzzy Q-Network: Systematic Design with Offline Hybrid Learning

Authors:

John Wesley Hostetter,

Mark Abdelshiheed,

Tiffany Barnes,

Min Chi

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems

Pages 1248 - 1257

Published: 30 May 2023

Abstract

In this paper, we propose a systematic design process for automatically generating self-organizing neuro-fuzzy Q-networks by leveraging unsupervised learning and an offline, model-free fuzzy reinforcement learning algorithm called Fuzzy Conservative Q-learning (FCQL). Our FCQL offers more effective and interpretable policies than deep neural networks, facilitating human-in-the-loop design and explainability.
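
The abstract does not spell out the FCQL construction, so the following is only a minimal sketch of the general idea behind a conservative objective applied to a fuzzy Q-function, not the authors' implementation. It assumes Gaussian membership functions, a product t-norm for rule firing strength, constant rule consequents per action, and the discrete-action Conservative Q-learning regularizer; all function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_memberships(x, centers, widths):
    """Membership of each input dimension in each rule's fuzzy set.

    x: (d,) state; centers, widths: (n_rules, d). Returns (n_rules, d).
    Gaussian sets are an assumption made for this sketch.
    """
    return np.exp(-((x - centers) ** 2) / (2.0 * widths ** 2))

def fuzzy_q_values(x, centers, widths, consequents):
    """Q(s, .) as a normalized, firing-strength-weighted sum of rule consequents.

    consequents: (n_rules, n_actions). Returns (n_actions,).
    """
    # Product t-norm across input dimensions (assumed, not from the source).
    firing = gaussian_memberships(x, centers, widths).prod(axis=1)
    weights = firing / (firing.sum() + 1e-12)
    return weights @ consequents

def cql_penalty(q_values, dataset_action):
    """Discrete-action conservative regularizer:
    logsumexp_a Q(s, a) - Q(s, a_data), which is always non-negative."""
    m = q_values.max()
    logsumexp = m + np.log(np.exp(q_values - m).sum())
    return logsumexp - q_values[dataset_action]

if __name__ == "__main__":
    centers = np.array([[0.0], [1.0]])   # two rules over a 1-D state
    widths = np.ones((2, 1))
    consequents = np.eye(2)              # rule i prefers action i
    q = fuzzy_q_values(np.array([0.0]), centers, widths, consequents)
    print(q, cql_penalty(q, 0))
```

In an offline setting this penalty would typically be added, scaled by a coefficient, to the usual temporal-difference loss so that actions absent from the dataset are not overvalued; because each rule's centers, widths, and consequents are explicit, the learned Q-function stays readable as a set of IF-THEN rules.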


Cited By

M. Abdelshiheed, J. Jacobs, and S. D'Mello. 2024. Not a Team but Learning as One: The Impact of Consistent Attendance on Discourse Diversification in Math Group Modeling. In Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization. 120-131. Online publication date: 22-Jun-2024.
https://dl.acm.org/doi/10.1145/3627043.3659554

Index Terms

A Self-Organizing Neuro-Fuzzy Q-Network: Systematic Design with Offline Hybrid Learning
1. Applied computing
  1. Education
    1. Distance learning
    2. E-learning
2. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Vagueness and fuzzy logic
  2. Machine learning
    1. Learning paradigms
      1. Reinforcement learning
      2. Unsupervised learning


Information & Contributors

Information

Published In


AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems

May 2023

3131 pages

ISBN:9781450394321

General Chairs: Noa Agmon (Bar-Ilan University, Israel), Bo An (Nanyang Technological University, Singapore)

Program Chairs: Alessandro Ricci (University of Bologna, Italy), William Yeoh (Washington University in St. Louis, USA)

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

AAMAS '23

Sponsor:

SIGAI

AAMAS '23: International Conference on Autonomous Agents and Multiagent Systems

May 29 - June 2, 2023

London, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

Bibliometrics & Citations

Article Metrics

Total Citations: 1
Total Downloads: 53
Downloads (Last 12 months): 16
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Mar 2025
