DOI: 10.1145/1416729.1416741
research-article

Automatic classification of security messages based on text categorization

Published: 23 June 2008

Abstract

The messages generated by security devices are the data needed to detect malicious activity in an information system. The heterogeneity of these devices and the lack of a standard for security messages make automatic processing of the messages difficult. The messages are short, use a very wide vocabulary, and come in different formats. In this article, we propose applying text categorization techniques to automatically classify security log messages into categories defined by an ontology. We develop a module that extracts message attributes in order to reduce the vocabulary size. We then apply two learning algorithms, k-nearest neighbours and naive Bayes, to two corpora of security log messages.
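The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The attribute patterns, the placeholder tokens, the two category names and the toy messages are all hypothetical, chosen only to show the idea: variable fields such as IP addresses and numbers are replaced by placeholders to shrink the vocabulary, and the normalized messages are then classified with a multinomial naive Bayes model (Laplace smoothing) and with k-nearest neighbours (cosine similarity over token counts).

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical attribute patterns: variable fields become placeholders so that
# every distinct address or number does not add a new word to the vocabulary.
ATTRIBUTE_PATTERNS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def tokenize(message):
    """Lowercase a message, replace attributes by placeholders, split into tokens."""
    text = message.lower()
    for pattern, placeholder in ATTRIBUTE_PATTERNS:
        text = pattern.sub(placeholder, text)
    return re.findall(r"<\w+>|[a-z]+", text)


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for message, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(message))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, message):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(count / total)  # log prior
            for token in tokenize(message):
                if token in self.vocabulary:  # ignore words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_predict(messages, labels, message, k=3):
    """Majority vote among the k training messages most similar to the query."""
    query = Counter(tokenize(message))
    ranked = sorted(
        zip(messages, labels),
        key=lambda pair: cosine(query, Counter(tokenize(pair[0]))),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy corpus with hypothetical categories (not the ontology used in the paper).
TRAIN = [
    ("Failed password for root from 10.0.0.5 port 22", "authentication_failure"),
    ("Invalid password for user admin from 192.168.1.9", "authentication_failure"),
    ("Login failed for user guest", "authentication_failure"),
    ("Connection denied by firewall rule 12 from 172.16.0.3", "access_denied"),
    ("Packet dropped by firewall from 10.1.1.1 port 445", "access_denied"),
    ("Firewall denied inbound connection on port 3389", "access_denied"),
]
MESSAGES = [m for m, _ in TRAIN]
LABELS = [l for _, l in TRAIN]

if __name__ == "__main__":
    model = NaiveBayes().fit(MESSAGES, LABELS)
    query = "Failed login for user root from 10.9.9.9"
    print(tokenize(query))
    print(model.predict(query))
    print(knn_predict(MESSAGES, LABELS, query))
```

In this sketch the placeholder substitution plays the role of the attribute extraction step: two messages that differ only in their addresses or port numbers map to the same token sequence, so both classifiers see a much smaller vocabulary.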

Published In

NOTERE '08: Proceedings of the 8th international conference on New technologies in distributed systems
June 2008
399 pages
ISBN: 9781595939371
DOI: 10.1145/1416729

Sponsors

  • Lyon 1 University
  • SIGAPP: ACM Special Interest Group on Applied Computing
  • Mairie de Villeurbanne
  • Conseil Général du Rhône
  • INSA Lyon: Institut National des Sciences Appliquées de Lyon
  • Conseil Régional Rhône-Alpes
  • Mutuelle d'assurance MAIF
  • I.U.T.A LYON 1: Institute of Technology Lyon 1
  • Ministère de l'Enseignement Supérieur et de la Recherche
  • Lyon 2 University
  • ISTASE: High-Level Engineering School in Telecommunication
  • France Telecom
  • LIRIS: Lyon Research Center for Images and Intelligent Information Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. automatic classification
  2. heterogeneous probes
  3. management of security information
  4. ontology
  5. text categorization

Qualifiers

  • Research-article
