
DOI: 10.1145/3531146.3533224
Research Article | Public Access

NeuroView-RNN: It’s About Time

Published: 20 June 2022

Abstract

Recurrent Neural Networks (RNNs) are important tools for processing sequential data such as time series or video. Interpretability, the ability of a model to be understood by a person, differs from explainability, the ability to be explained in a mathematical formulation. A key interpretability issue with RNNs is that it is unclear how the hidden state at each time step contributes quantitatively to the decision-making process. We propose NeuroView-RNN, a family of new RNN architectures that explains how all the time steps are used in the decision-making process. Each member of the family is derived from a standard RNN architecture by concatenating the hidden states of all time steps into the input of a global linear classifier. Because the classifier takes all the hidden states as input, its weights map linearly to the hidden states; hence, from the weights, NeuroView-RNN can quantify how important each time step is to a particular decision. As a bonus, NeuroView-RNN often achieves higher accuracy than standard RNNs and their variants. We showcase the benefits of NeuroView-RNN by evaluating it on a multitude of diverse time-series datasets.
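The architecture described above can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: `rnn_hidden_states`, `neuroview_logit`, and `per_step_contribution` are hypothetical names, the weights here are fixed rather than learned, and a real model would train both the RNN and the classifier jointly. The key property the sketch exhibits is that the classifier's logit decomposes additively over time steps, so each step's contribution is simply the inner product of its weight slice with its hidden state.

```python
import math

def rnn_hidden_states(xs, W_x, W_h, hidden_dim):
    """Run a vanilla RNN (h_t = tanh(W_x x_t + W_h h_{t-1})) and return every hidden state."""
    h = [0.0] * hidden_dim
    states = []
    for x in xs:
        # The comprehension reads the previous h before rebinding it.
        h = [math.tanh(sum(W_x[i][j] * x[j] for j in range(len(x)))
                       + sum(W_h[i][k] * h[k] for k in range(hidden_dim)))
             for i in range(hidden_dim)]
        states.append(h)
    return states

def neuroview_logit(states, w, b):
    """Global linear classifier over the concatenation of ALL hidden states."""
    flat = [v for h in states for v in h]
    return b + sum(wi * vi for wi, vi in zip(w, flat))

def per_step_contribution(states, w):
    """Importance of time step t = <w_t, h_t>, where w_t is the weight slice for h_t."""
    d = len(states[0])
    return [sum(w[t * d + i] * h[i] for i in range(d)) for t, h in enumerate(states)]
```

Since the logit equals the bias plus the sum of per-step contributions, ranking `per_step_contribution` directly answers "which time steps drove this decision" without any post-hoc attribution method.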




Published In

FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022
2351 pages
ISBN:9781450393522
DOI:10.1145/3531146


Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Recurrent neural networks
  2. interpretability
  3. time series

Qualifiers

  • Research-article
  • Research
  • Refereed limited


Article Metrics

  • Total Citations: 0
  • Total Downloads: 173
  • Downloads (last 12 months): 84
  • Downloads (last 6 weeks): 19

Reflects downloads up to 16 Nov 2024
