research-article

Identification of atypical behavior of bank employees when using e-mail to prevent information leakages

Published: 01 January 2022

Abstract

The article presents the results of applying machine learning methods to identify atypical behavior of bank employees when using e-mail. First, a feature space is formed that characterizes the behavior of e-mail users. Because no labeled data were available, the objects were first clustered using density-based spatial clustering of applications with noise (DBSCAN) together with elements of fuzzy logic, and then labeled using built-in business rules to form the training sample. The most informative features were selected, and a model that classifies e-mail users by type of behavior was constructed. Second, a feature space is formed that characterizes individual messages, in order to identify messages that constitute information security incidents. The data were preprocessed by removing duplicates and encoding categorical variables, and a message classification model was constructed. The best combination of machine learning method and feature selection algorithm was determined using quality metrics. The constructed models allow specialists in bank cybersecurity departments to identify employees whose behavior is abnormal and who may be involved in information leaks. A software tool was developed in Python that makes it easier to determine the final status of a message by partially replacing manual detection with automatic detection.
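The clustering step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' features, parameters, or code, so everything here is an assumption made for illustration: the two per-user features (already scaled to the range 0 to 1), the user IDs, and the values of `eps` and `min_samples` are all hypothetical, and the fuzzy-logic and business-rule steps are omitted. The sketch uses only the Python standard library; in practice `sklearn.cluster.DBSCAN` would be the usual choice. Users that DBSCAN labels as noise are treated as candidates for atypical behavior.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE          # may be relabeled later as a border point
            continue
        cluster += 1                   # points[i] is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: reachable but not core
            if labels[j] is not None:
                continue               # already assigned to some cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

# Hypothetical, pre-scaled features per user:
# (share of messages sent outside working hours, share sent to external domains)
users = {
    "u01": (0.10, 0.05), "u02": (0.12, 0.07), "u03": (0.11, 0.04),
    "u04": (0.13, 0.06), "u05": (0.50, 0.30), "u06": (0.52, 0.31),
    "u07": (0.49, 0.28), "u08": (0.51, 0.33), "u09": (0.95, 0.90),
}

labels = dbscan(list(users.values()), eps=0.1, min_samples=3)
atypical = [uid for uid, label in zip(users, labels) if label == NOISE]
print(atypical)
```

With these made-up values, the first eight users fall into two dense groups and the last one lies far from both, so it is the only one flagged. Because DBSCAN compares raw distances, the features need to be on comparable scales before clustering, and the choice of `eps` and `min_samples` strongly affects which users end up as noise.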

    Published In

    Procedia Computer Science  Volume 213, Issue C
    2022
    846 pages
    ISSN:1877-0509
    EISSN:1877-0509

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands

    Author Tags

    1. information leakage
    2. e-mail
    3. atypical behavior
    4. information security incidents
    5. machine learning
    6. fuzzy logic
    7. clustering
    8. classification
