Abstract
Adaptive algorithms have been developed for real-time, proactive detection of network and service anomalies, i.e., soft performance degradations, in transaction-oriented wide area networks (WANs). These algorithms (i) adaptively sample and aggregate raw transaction records to compute service-class-based traffic intensities in which potential network anomalies are highlighted; (ii) construct dynamic, service-class-based performance thresholds for detecting network and service anomalies; and (iii) perform service-class-based network anomaly detection in real time. The algorithms are implemented in a real-time software system called TRISTAN (Transaction Instantaneous Anomaly Notification), which is deployed in the AT&T Transaction Access Services (TAS) network. The TAS network is a commercially important, high-volume (millions of transactions per day) hybrid telecom and data WAN with tens of service classes; it carries transaction traffic such as credit card transactions in the US and neighboring countries. TRISTAN is demonstrated to detect network/service anomalies automatically and adaptively and to correctly identify the corresponding "guilty" service classes in TAS. TRISTAN can detect network/service faults that elude traditional alarm-based network monitoring systems.
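The three steps listed in the abstract (aggregate records into per-class traffic intensities, maintain a dynamic threshold per service class, flag deviations as they arrive) can be illustrated with a minimal Python sketch. The abstract does not specify TRISTAN's actual statistics, so everything here is an assumption made for illustration: the `(timestamp, service_class)` record layout, the fixed aggregation window (the abstract describes adaptive sampling, which this sketch does not attempt), the exponentially weighted mean and variance, and the parameters `alpha`, `k`, and `warmup`.

```python
from collections import defaultdict
from dataclasses import dataclass
import math


@dataclass
class ClassBaseline:
    """Running estimate of one service class's normal traffic intensity."""
    mean: float = 0.0
    var: float = 0.0
    samples: int = 0


def aggregate(records, window_seconds):
    """Count transactions per (window index, service class).

    `records` is an iterable of (timestamp_seconds, service_class) pairs,
    a hypothetical stand-in for raw transaction records.
    """
    counts = defaultdict(int)
    for timestamp, service_class in records:
        window = int(timestamp // window_seconds)
        counts[(window, service_class)] += 1
    return dict(counts)


class ThresholdDetector:
    """Per-class dynamic threshold: mean +/- k standard deviations.

    Assumption: an exponentially weighted baseline is used here only as
    an example of an adaptive threshold; it is not the paper's method.
    """

    def __init__(self, alpha=0.2, k=3.0, warmup=5):
        self.alpha = alpha      # smoothing factor for the baseline
        self.k = k              # width of the tolerance band
        self.warmup = warmup    # observations needed before flagging
        self.baselines = defaultdict(ClassBaseline)

    def observe(self, service_class, intensity):
        """Return True if `intensity` is anomalous for this class."""
        b = self.baselines[service_class]
        anomalous = False
        if b.samples >= self.warmup:
            band = self.k * math.sqrt(b.var)
            anomalous = abs(intensity - b.mean) > band
        if not anomalous:
            # Update the baseline only with normal observations.
            if b.samples == 0:
                b.mean = intensity
            else:
                delta = intensity - b.mean
                b.mean += self.alpha * delta
                b.var = (1 - self.alpha) * (b.var + self.alpha * delta * delta)
            b.samples += 1
        return anomalous
```

Keeping one baseline per service class is what lets a detector of this kind name the affected class rather than only reporting that aggregate traffic looks unusual. Skipping the baseline update for flagged observations is one possible design choice: it keeps a sustained degradation from being absorbed into the definition of normal, at the cost of slower adaptation to a genuine shift in traffic level.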
Cite this article
Ho, L.L., Cavuto, D.J., Papavassiliou, S. et al. Adaptive Anomaly Detection in Transaction-Oriented Networks. Journal of Network and Systems Management 9, 139–159 (2001). https://doi.org/10.1023/A:1011311024699
DOI: https://doi.org/10.1023/A:1011311024699