Abstract
Software aging refers to the performance degradation and eventual crash failures observed in long-running systems. As a proactive remedy, software rejuvenation can be scheduled in time to mitigate aging effects. Accurately predicting the time to aging failure (TTAF) of software is therefore a prerequisite for effective rejuvenation. However, software aging is complicated to characterize, so aging indicators tend to be selected case by case, and prediction models that only fit the trend of a single indicator may be of limited use for formulating a rejuvenation schedule. To fill this gap, this paper proposes a novel framework called TTAFPred, which constructs a direct mapping between the software aging process, described by multiple system indicators, and the TTAF. The framework comprises three modules: data preprocessing, software degradation feature extraction, and TTAF prediction. First, the raw data are processed into the input form required by the network. Second, a temporal relationship extraction stream, which integrates a bidirectional gated recurrent unit (BiGRU) with an attention mechanism, extracts temporal features from the raw inputs. In parallel, a spatial relationship extraction stream uses a multi-scale one-dimensional convolutional neural network (1DCNN) with a residual connection to extract spatial features, enhancing the representation ability of the degradation features. The temporal and spatial features extracted by the two streams are then fused. Finally, two fully connected layers estimate the TTAF. Experiments are performed on two mainstream software systems (OpenStack and Android), with four sets of real run-to-failure data collected for each system. The effectiveness of TTAFPred is verified through extensive experiments against seven competing models: compared with the best baseline model, prediction performance improves by 9.1%, 8.0%, and 8.0% on the three evaluation metrics, respectively.
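The abstract does not say how the raw data are turned into network inputs. As a rough illustration only, the sketch below shows one common way to prepare run-to-failure traces for this kind of prediction: min-max scaling of each indicator followed by a sliding window, with each window labelled by the time remaining until the end of the run. The window length, the sampling interval, the scaling scheme, and the function names `min_max_normalize` and `make_windows` are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

Sample = List[float]  # one monitoring sample: one value per system indicator


def min_max_normalize(run: List[Sample]) -> List[Sample]:
    """Scale every indicator (column) of one run to the range [0, 1]."""
    columns = list(zip(*run))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # A constant indicator carries no trend, so it is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(sample, lows, highs)]
        for sample in run
    ]


def make_windows(run: List[Sample], window: int,
                 interval: float = 1.0) -> List[Tuple[List[Sample], float]]:
    """Cut one run-to-failure trace into (window, TTAF label) pairs.

    Assumption: the last sample of the run marks the failure instant, so
    the label of a window is the time between its last sample and the end
    of the run, expressed in the units of `interval`.
    """
    if window < 1 or window > len(run):
        raise ValueError("window must be between 1 and the run length")
    scaled = min_max_normalize(run)
    last = len(run) - 1
    pairs = []
    for end in range(window - 1, len(run)):
        features = scaled[end - window + 1:end + 1]  # `window` consecutive samples
        ttaf = (last - end) * interval               # time left until failure
        pairs.append((features, ttaf))
    return pairs


if __name__ == "__main__":
    # Toy run with two hypothetical indicators (say, memory use and latency).
    toy_run = [[100.0, 1.0], [120.0, 1.5], [150.0, 1.2],
               [190.0, 2.0], [200.0, 3.0]]
    for features, ttaf in make_windows(toy_run, window=3, interval=10.0):
        print(len(features), "samples -> TTAF", ttaf)
```

Each pair produced this way holds a multi-indicator window that a temporal stream and a convolutional stream could both consume, plus the scalar target for the regression head. Scaling with the minimum and maximum of the whole run, as done here for brevity, uses information from the future of that run; in practice the scaling statistics would more likely be fitted on training runs only.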
Data availability
The data for this study are available on GitHub at https://github.com/agingprediction/TTAFPred.
Funding
This work is partially supported by the National Key Research and Development Program (Grant No. 2022YFB3104001), the National Natural Science Foundation of China (Grant No. 62202350), the Key Research and Development Program of Hubei Province (Grant No. 2022BAA050), and the Natural Science Foundation of Chongqing (Grant No. cstc2021jcyj-msxmX1146).
Author information
Contributions
Kai Jia: writing original draft, methodology, formal analysis, visualization, and experiment analysis. Xiao Yu: review & editing. Chen Zhang: review & editing. Wenzhi Xie: data curation. Dongdong Zhao: review & editing. Jianwen Xiang: supervision, project administration, funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Jia, K., Yu, X., Zhang, C. et al. TTAFPred: Prediction of time to aging failure for software systems based on a two-stream multi-scale features fusion network. Software Qual J 32, 1481–1513 (2024). https://doi.org/10.1007/s11219-024-09692-2
DOI: https://doi.org/10.1007/s11219-024-09692-2