IDSV: Intrusion Detection Algorithm Based On Statistics Variance Method in User Transmission Behavior
normal distribution. The larger the number of data points, the smaller the average variance of srcM. Hence, if the srcM of the current network connection lies in the interval given by formula (1), the behavior is normal; otherwise it is anomalous.

IV. IDSV ALGORITHM

In the following, an intrusion detection algorithm based on the statistics variance method, IDSV, is presented. The confidence interval is weighted to improve the detection rate according to the real detection effect of the IDSV algorithm.

A. Framework of IDSV Algorithm

Setting the threshold value in the IDSV algorithm is difficult. If the threshold interval is too short, there will be numerous false alarms; if it is too long, the miss rate will be high. As the confidence interval according to formula (1) is too strict, we introduce the weighted factor θ to change the size of the interval elastically as follows.

$$\left( T_n - \theta \cdot z_{\sigma/2} \cdot \frac{s_n}{\sqrt{n}},\; T_n + \theta \cdot z_{\sigma/2} \cdot \frac{s_n}{\sqrt{n}} \right) \tag{2}$$

where $z_{\sigma/2} = 1.96$ and θ is the weighted factor. The range of the confidence interval is {Min, Max}. The IDSV algorithm is then as follows:

Input: srcM and user connection records selected by n, n ≥ 2, Service features, weighted factor θ.

Output: normal/anomaly

IsNormal ( ) {
    for i = 1, …, n {
        Sum += t_i ; M += t_i^2 ; // sum and square sum of audit data
    }
    Avg = Sum / n ; // the average of srcM
    $\sum t_i^2 - n T_n^2$

address. Here we present an ARP spoofing detection method that considers the mapping between IP and MAC addresses. As ARP spoofing modifies the mapping between IP and MAC address, the location feature of transmission, the two-tuple (IP, MAC), is used to select the data. Three cases are studied:

1) Fixed IP and MAC address. This case covers most users with a normal behavior record. In the IDSV algorithm, the average of the IP and MAC address is kept unchanged and the threshold value is 0, namely the IP and MAC addresses are mapped one-to-one.
2) One-to-many mapping of IP and MAC. In this case, a white list, to which the IP and MAC address mappings are added, is applied to filter the audit data.
3) Dynamic IP allocation by DHCP. The IP and MAC address mapping table is maintained by periodically deleting outdated address mappings.

The procedure of ARP spoofing detection with the IDSV algorithm is shown in Fig. 2. The ARP reply packets are captured by the data collecting component. The source IP and MAC address are extracted from the packets and compared with the mapping table. If there exist multiple MAC addresses mapping to one IP and one IP mapping to multiple MAC addresses, ARP spoofing alarms are reported with the source and destination address of the attack.

V. SIMULATIONS

The simulations are carried out to detect anomalous behaviors. The influence of the weighted confidence interval on the IDSV algorithm is assessed in terms of detection accuracy and false rate. The simulations are based on the audit data set of KDDCUP 1999 [8]. There are around 5,000,000 records in the record set. The connection records consist of behavior features. A training data subset of 494,020 records is selected from the audit data, where the weighted factors of undifferentiated application data and WEB application data are 23 and 197, respectively.

The detection performance of the IDSV algorithm on undifferentiated and WEB applications is compared in the following figures. With different sizes of audit data, the false rate of
undifferentiated application and WEB differs little, while the detection rate of WEB is clearly better than that of the undifferentiated application. The reason is that audit data selected according to the transmission feature become more structured. Moreover, WEB applications are not sensitive to the size of the audit data. To some extent, we can infer that the adaptability of an IDS could be enhanced by classifying audit records according to the Service feature.

Figure 3. Comparison of IDSV Detection Rate

Figure 4. Comparison of IDSV False Rate

Furthermore, we analyze the reason that the detection rate is too low with 2,500 audit records in Fig. 4. The detection rates with different sets of 2,500 audit records are shown in Table I.

TABLE I. DETECTION RATE WITH DIFFERENT 2,500 RECORDS SET

                  Records 1    Records 2    Records in Fig. 6
Detection Rate    1.0          0.915593     0.446237

We find that the records in Fig. 4 include many attacks that the statistics model is not good at detecting, so the detection rate in Fig. 4 is low. This also shows that a specific detection model and specific features are effective for a specific kind of intrusion. The detection rate does not depend on the number of audit records but on whether the detection feature is suitable and whether the detection model is good at detecting the attacks present.

A detection model based on a weighted multi-random decision tree is presented in [9] with a detection rate of 0.9929. In this paper, the average detection rate of the IDSV algorithm is 0.9975, which is higher than that of [9]. Consequently, the IDSV algorithm, which classifies audit records by features, has higher detection performance.

VI. CONCLUSIONS

In this paper, user behavior features are studied to create a model of user behavior. An intrusion detection algorithm based on the statistics variance method in user transmission behavior (IDSV) is provided and applied to intrusion detection. The simulation results show that the IDSV algorithm achieves good detection performance with different behavior features. The results therefore suggest that the IDSV algorithm is feasible and effective.

ACKNOWLEDGMENT

This paper is supported by the National Natural Science Foundation of China (70872046, 70671054) and the National Basic Research Program of China (2009CB320501).

REFERENCES

[1] Alpcan T, Basar T. A game theoretic approach to decision and analysis in network intrusion detection [C]. In: Proc. of 43rd IEEE Conference on Decision and Control (CDC), Paradise Island, Bahamas: IEEE Computer Society Press, 2006. 2595-2600
[2] Todd H, Gihan D, Karl Levitt. A network security monitor [C]. In: Proceedings of the 1990 IEEE Symposium on Research in Security and Privacy, USA: IEEE Computer Society Press, 1990: 296-304
[3] Kalle Burbeck. Current Research and Use of Anomaly Detection [C]. Proceedings of the 14th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprise, 2005.
[4] LTC Bruce D. Caulkins USA, Joohan Lee, Morgan Wang. A Dynamic Data Mining Technique for Intrusion Detection Systems [C]. 43rd ACM Southeast Conference, March 18-20, 2005.
[5] Yu-Fang Zhang, Gui-Hua Sun, Zhong-Yang Xiong. A Novel Method of Intrusion Detection based on Artificial Immune System [C]. Proc. of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 13-16 August, 2006.
[6] Baoyi WANG, Shaomin ZHANG. A New Intrusion Detection Method Based on Artificial Immune System [C]. IFIP International Conference on Network and Parallel Computing-Workshops, 2007
[7] Chen You, Cheng Xueqi, Li Yang, Dai Lei. Lightweight Intrusion Detection System Based on Feature Selection [J]. Journal of Software, 2007, 18(7): 1639-1651.
[8] Salvatore J. Stolfo, Wei Fan, Wenke Lee etc. Task description of Kddcup'99 [OL]. http://kdd.ics.uci.edu/databases/kddcup99/task.html, 1999
[9] Zhao Xiaofeng, Ye Zhen. Research on weighted multi-random decision tree and its application to intrusion detection [J]. COMPUTER ENGINEERING AND APPLICATIONS, 2007.5 (18).
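As an illustration of the IsNormal procedure and the weighted confidence interval of formula (2) in Section IV.A, a minimal Python sketch is given here. It is not the authors' implementation. It assumes that s_n is the sample standard deviation obtained from the running sum and square sum (with an n − 1 denominator), and that a connection counts as normal when its srcM value lies inside the interval {Min, Max}. The function names `weighted_interval` and `is_normal` and the sample values are hypothetical.

```python
import math

Z_HALF_SIGMA = 1.96  # z_{sigma/2} as stated in the text


def weighted_interval(samples, theta):
    """Return (Min, Max) of the theta-weighted confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("the algorithm requires n >= 2")
    total = 0.0       # "Sum" in the pseudocode: running sum of t_i
    square_sum = 0.0  # "M" in the pseudocode: running sum of t_i^2
    for t in samples:
        total += t
        square_sum += t * t
    mean = total / n  # "Avg", written T_n in formula (2)
    # Assumption: sample variance with an (n - 1) denominator.
    # max(..., 0.0) guards against tiny negative rounding errors.
    variance = max((square_sum - n * mean * mean) / (n - 1), 0.0)
    half_width = theta * Z_HALF_SIGMA * math.sqrt(variance) / math.sqrt(n)
    return mean - half_width, mean + half_width


def is_normal(samples, current, theta=1.0):
    """True if `current` lies inside the weighted interval built from `samples`."""
    low, high = weighted_interval(samples, theta)
    return low <= current <= high


if __name__ == "__main__":
    history = [10.0, 12.0, 11.0, 13.0, 9.0]  # hypothetical srcM values
    print(is_normal(history, 13.0, theta=1.0))  # False
    print(is_normal(history, 13.0, theta=2.0))  # True
```

In this example the value 13 falls outside the unweighted interval but inside the interval widened by θ = 2, which shows how a larger weighted factor reduces false alarms at the cost of a higher miss rate.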