research-article

A Novel Classification Technique based on Formal Methods

Published: 28 June 2023

Abstract

In recent years, interest has grown in applying supervised machine learning techniques to the most disparate fields. One winning factor of machine learning is its ability to create models easily, since it does not require prior knowledge of the application domain. Complementary to machine learning are formal methods, which intrinsically offer safety checks and mechanisms for reasoning about failures. Given the weaknesses of machine learning, the use of formal methods represents a new challenge. However, formal methods require domain expertise, knowledge of a modeling language and its semantics, and mathematical rigour to specify properties. In this article, we propose a novel learning technique that adopts formal methods for classification by automatically generating both the formula and the model. As a result, the proposed method requires no human intervention and can therefore also be applied to complex or large datasets. This reduces the effort needed to use formal methods and improves the explainability of, and reasoning about, the obtained results. Through a set of case studies from different real-world domains (i.e., driver detection, SCADA attack identification, arrhythmia characterization, mobile malware detection, and radiomics for lung cancer analysis), we demonstrate the usefulness of the proposed method by showing that it outperforms widely used classification algorithms.

Cited By

  • (2024) Optimizing Traveler Behavior Between MADINA and JEDDA Using UPPAAL Stratego: A Stochastic Priced Timed Games Approach. Mathematics 12:21 (3421). DOI: 10.3390/math12213421. Online publication date: 31-Oct-2024
  • (2024) A statistical approach to coronavirus classification based on nucleotide distributions. Mathematical Modeling and Computing 11:4 (987-994). DOI: 10.23939/mmc2024.04.987. Online publication date: 2024
  • (2023) Early Detection of Earthquakes Using IoT and Cloud Infrastructure: A Survey. Sustainability 15:15 (11713). DOI: 10.3390/su151511713. Online publication date: 28-Jul-2023
  • (2023) Computational cost of CT Radiomics workflow: a case study on COVID-19. 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC) (1539-1544). DOI: 10.1109/COMPSAC57700.2023.00237. Online publication date: Jun-2023

    Published In

    ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 8
    September 2023
    348 pages
    ISSN: 1556-4681
    EISSN: 1556-472X
    DOI: 10.1145/3596449

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 June 2023
    Online AM: 14 April 2023
    Accepted: 10 April 2023
    Revised: 06 April 2023
    Received: 12 July 2022
    Published in TKDD Volume 17, Issue 8

    Author Tags

    1. Model checking
    2. formal methods
    3. classification

    Qualifiers

    • Research-article
