Facultés Universitaires Notre-Dame de la Paix, Namur, Belgium
Institut d'Informatique
Rue Grandgagnage, 21, B-5000 Namur, BELGIUM

RP-94-007, November 1994
Phone: +32 81 72.49.66
Fax: +32 81 72.49.67
E-mail: cleroy@info.fundp.ac.be

Distributed Audit Trail Analysis

Abdelaziz Mounji, Baudouin Le Charlier, Denis Zampunieris, Naji Habra

Institut d'Informatique, FUNDP,
rue Grandgagnage 21, B-5000 Namur, Belgium
E-mail: {amo, ble, dza, nha}@info.fundp.ac.be

15 November 1994
Abstract

An implemented system for on-line analysis of multiple distributed data streams is presented. The system is conceptually universal since it does not rely on any particular platform feature and uses format adaptors to translate data streams into its own standard format. The system is as powerful as possible (from a theoretical standpoint) but still efficient enough for on-line analysis, thanks to its novel rule-based language (RUSSEL), which is specifically designed for efficient processing of sequential unstructured data streams.

In this paper, the generic concepts are applied to security audit trail analysis. The resulting system provides powerful network security monitoring and sophisticated tools for intrusion/anomaly detection. The rule-based and command languages are described, as well as the distributed architecture and the implementation. Performance measurements are reported, showing the effectiveness of the approach.

1 Introduction

A software architecture and a rule-based language for universal audit trail analysis were developed in the first phase of the ASAX project [10, 11, 12]. The distributed system presented here uses this rule-based language to filter audit data at each monitored host and to analyze filtered data gathered at a central host. The analysis language is exactly the same at both local and central levels. This provides a tool for flexible and gradual granularity control at different levels: users, hosts, subnets, domains, etc.

Auditing distributed environments is useful for understanding the behavior of software components. For instance, this is useful for testing new applications: an execution trace can be analyzed to check correctness with respect to the requirements. In the area of real-time process control, critical hardware or software components are supervised by generating log data describing their behavior. The collection and analysis of these log files often has to be done in real time, in parallel with the audited process. This analysis can be conducted for various purposes such as investigation, recovery and prevention, production optimization, and alarm and statistics reporting. In addition, correlating results obtained at different nodes can be useful to achieve a more comprehensive view of the whole system.

Computer and network security is currently an active research area. The rising complexity of today's networks leads to more elaborate patterns of attacks. Previous work on stand-alone computer security has established basic concepts and models [3, 4, 5, 7, 8] and described a few operational systems [1, 6, 9, 12, 18]. However, distributed analysis of audit trails is needed for network security because of the two following facts. First, the correlation of user actions taking place at different hosts could reveal malicious behavior, while the same actions may seem legitimate if considered at the level of a single host. Second, the monitoring of network security can potentially provide a more coherent and flexible enforcement of a given security policy. For instance, the security officer can set up a common security policy for all monitored hosts but choose to tighten the security measures for critical hosts such as firewalls [2] or for suspicious users.

The rest of this paper is organized as follows. Section 2 briefly describes the system for single audit trail analysis and its rule-based language. Section 3 details the functionalities offered by the distributed system. Section 4 presents the distributed architecture. Section 5 describes the command interface of the security officer. In section 6, the implementation of the main components is outlined. Performance measurements are reported in section 7. Finally, section 8 contains the conclusion and indicates possible improvements of this work.

To appear in the ISOC'95 Symposium on Network and Distributed System Security.
2 Single Audit Trail Analysis
In this section, the main features of the stand-alone version of ASAX for single audit trail analysis are explained. We only emphasize the most interesting functionalities; the reader is referred to [12] for a more detailed description of these functionalities.(1) A comprehensive description of ASAX is presented in [10, 11].
2.1 A motivating example
The use of the RUSSEL language for single audit trail analysis is best introduced by a typical example: detecting repeated failed login attempts by a single user during a specified time period. This example uses the SunOS 4.1 auditing mechanism. Native audit trails are translated into a standard format (called NADF). The translation can be applied on-line or off-line. Hence, the description below is based on the NADF format of the audit trail records.

Assuming that login events are pre-selected for auditing, every time a user attempts to log in, an audit record describing this event is written into the audit trail. Audit record fields (or audit data) found in a failed login record include the time stamp (au_time), the user id (au_text_3) and a field indicating success or failure of the attempted login (au_text_4). Notice that audit records representing login events are not necessarily consecutive, since other audit records can be inserted for other events generated by other users of the system.

In the example (see Figure 1), RUSSEL keywords are noted in bold face, words in italic identify fields in the current audit record, and rule parameters are noted in roman style. Two rules are needed to detect a sequence of failed logins. The first one (failed_login) detects the first occurrence of a login failure. If such a record is found, this rule triggers off the rule count_rule, which remains active until it detects count_down failed logins among the subsequent records or until its expiration time arrives. The parameter target_uid of rule count_rule is needed to count only failed logins that are issued by the same user (target_uid). If the current audit record does not correspond to a login attempt by the same user, count_rule simply retriggers itself for the next record. If the user id in the current record is the same as its argument and the time stamp is lower than the expiration argument, it retriggers itself for the next record after decrementing the count_down argument. When the latter drops to zero, count_rule writes an alarm message to the screen indicating that a given user has performed max_times unsuccessful logins within a period of duration seconds. In addition, count_rule retriggers the failed_login rule in order to search for other similar patterns in the rest of the audit trail.
To initialize the analysis process, the special rule init_action makes the failed_login rule active for the first record and also makes the print_results rule active at completion of the analysis.
(1) Notice, however, that [12] is a preliminary description of a system under implementation. The examples in the present paper have actually been run on the implemented system.
global v: integer;

rule failed_login(max_times, duration: integer);
if
    event = 'login_logout'
    and au_text_4 = 'incorrect password'
    --> trigger off for next
            count_rule(au_text_3,
                       strToInt(au_time) + duration,
                       max_times - 1)
;

rule count_rule(target_uid: string;
                expiration, count_down: integer);
if
    event = 'login_logout'
    and au_text_4 = 'incorrect password'
    and au_text_3 = target_uid
    and strToInt(au_time) < expiration
    --> if count_down > 1
            --> trigger off for next
                    count_rule(target_uid,
                               expiration,
                               count_down - 1);
            count_down = 1
            --> begin
                    v := v + 1;
                    println(gettime(au_time),
                            ': 3 FAILED LOGINS ON ',
                            target_uid);
                    trigger off for next
                        failed_login(3, 120)
                end
        ;
    strToInt(au_time) > expiration
    --> trigger off for next
            failed_login(3, 120);
    true
    --> trigger off for next
            count_rule(target_uid,
                       expiration,
                       count_down)
;

rule print_results;
begin
    println(v, ' sequence(s) of bad logins found')
end;

init_action;
begin
    v := 0;
    trigger off for next failed_login(3, 120);
    trigger off at completion print_results
end.

Figure 1: RUSSEL module for failed login detection on SunOS 4.1
The latter rule is used to print results accumulated during the analysis, such as the total number of detected sequences.
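The single-pass behavior of the failed_login / count_rule pair can also be sketched outside RUSSEL. The following Python fragment is ours, not part of ASAX: the dict-based record format and function name are assumptions for illustration, while the field names (event, au_text_3, au_text_4, au_time) follow the NADF fields used in the text. It keeps at most one active count_rule instance at a time, mirroring the rule logic above.

```python
def detect_failed_logins(records, max_times=3, duration=120):
    """Single left-to-right scan counting sequences of max_times
    failed logins by the same user within duration seconds."""
    sequences = 0
    counter = None  # active count_rule instance: (target_uid, expiration, count_down)
    for r in records:
        failed = (r["event"] == "login_logout"
                  and r["au_text_4"] == "incorrect password")
        now = int(r["au_time"])
        if counter is not None:
            uid, expiration, count_down = counter
            if failed and r["au_text_3"] == uid and now < expiration:
                if count_down > 1:
                    counter = (uid, expiration, count_down - 1)
                else:
                    sequences += 1      # max_times failures within duration
                    counter = None      # count_rule retriggers failed_login
            elif now > expiration:
                counter = None          # count_rule expires; failed_login resumes
            continue                    # other records leave count_rule active
        if failed:                      # failed_login rule: start counting
            counter = (r["au_text_3"], now + duration, max_times - 1)
    return sequences
```

As in the RUSSEL module, records from other users interleaved in the trail do not disturb an active count, and an expired count simply re-arms the initial rule.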
2.2 Salient features of ASAX
2.2.1 Universality
This feature means that ASAX is theoretically able to analyze arbitrary sequential files. This is achieved by translating the native file into a format called NADF (Normalized Audit Data Format). According to this format, a native record is abstracted to a sequence of audit data fields. All data fields are considered as untyped strings of bytes. Therefore, an audit data item in the native record is converted to three fields:(2)

- an identifier (a 2-byte integer) which identifies the data field among all possible data fields;
- a length (a 2-byte integer);
- a value, i.e., a string of bytes.

A native record is encoded in NADF format as the sequence of encodings of each data field, with a leading 4-byte integer representing the length of the whole NADF record. Note that the NADF format is similar to the TLV (Tag, Length, Value) encoding used for the BER (Basic Encoding Rules), which are part of the Abstract Syntax Notation ASN.1 [14]. However, the TLV encoding is more complex since it supports typed primitive data values such as boolean, real, etc., as well as constructed data types. Nevertheless, any data value can in principle be represented as a string of bytes. As a result, the flexibility of the NADF format allows a straightforward translation of native files and fast processing of NADF records by the universal evaluator.
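The encoding just described can be sketched as follows. This is an illustrative Python fragment of ours, not ASAX code; the field identifiers in the example are made up, and we take the plausible reading that the leading 4-byte length counts the whole record, including the length field itself.

```python
import struct

def encode_nadf(fields):
    """fields: list of (identifier, value-bytes) pairs.
    Each field becomes a (2-byte id, 2-byte length, value) triple."""
    body = b"".join(struct.pack(">HH", ident, len(value)) + value
                    for ident, value in fields)
    # Leading 4-byte integer: length of the whole NADF record
    # (assumed here to include the length field itself).
    return struct.pack(">I", 4 + len(body)) + body

def decode_nadf(record):
    """Inverse of encode_nadf: recover the (identifier, value) pairs."""
    total = struct.unpack(">I", record[:4])[0]
    fields, offset = [], 4
    while offset < total:
        ident, length = struct.unpack(">HH", record[offset:offset + 4])
        fields.append((ident, record[offset + 4:offset + 4 + length]))
        offset += 4 + length
    return fields
```

Because values are untyped byte strings, a format adaptor only has to slice the native record into fields and tag them; no type information needs to be preserved.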
2.2.2 The RUSSEL language
RUSSEL (RUle-baSed Sequence Evaluation Language) is a novel language specifically tailored to the problem of searching for arbitrary patterns of records in sequential files. The built-in mechanism of rule triggering allows a single-pass analysis of the sequential file from left to right.

The language provides common control structures such as conditional, repetitive, and compound actions. Primitive actions include assignment, external routine call and rule triggering. A RUSSEL program simply consists of a set of rule declarations, each made of a rule name, a list of formal parameters and local variables, and an action part. RUSSEL also supports modules sharing global variables and exported rule declarations.
The operational semantics of RUSSEL can be sketched as follows:

- records are analyzed sequentially; the analysis of the current record consists in executing all active rules. The execution of an active rule may trigger off new rules, raise alarms, write report messages or alter global variables;
(2) In fact, native files can be translated to NADF format in many different ways depending on the problem at hand. The standard method proposed here was, however, sufficient for the applications we have encountered so far.
- rule triggering is a special mechanism by which a rule is made active either for the current or for the next record. In general, a rule is active for the current record because a prefix of a particular sequence of audit records has been detected (the rest of this sequence may still be found in the rest of the file). Actual parameters in the set of active rules represent knowledge about the already found subsequence and are useful for selecting further records in the sequence;
- when all the rules active for the current record have been executed, the next record is read and the rules triggered for it in the previous step are executed in turn;
- to initialize the process, a set of so-called init rules are made active for the first record.
User-defined and built-in C routines can be called from a rule body. A simple and clearly specified interface with C allows the RUSSEL language to be extended with any desirable feature. This includes simulation of complex data structures, sending an alarm message to the security officer, locking an account in case of an outright security violation, etc.
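The evaluation loop sketched by the bullet points above can be expressed compactly. The following Python fragment is our own minimal model, not the ASAX implementation: rules are modeled as callables, the trigger table and all names are assumptions, and parameterized rules would be closures over their actual parameters.

```python
def analyze(records, init_rules, completion_rules=()):
    """Single-pass evaluation: execute all rules active for each record;
    rules may trigger further rules for the current or the next record."""
    next_active = list(init_rules)      # init rules are active for the first record
    for record in records:
        current, next_active = next_active, []
        trig = {
            "current": current.append,  # trigger off for current
            "next": next_active.append, # trigger off for next
        }
        i = 0
        # A rule may extend `current` while we iterate, so index explicitly.
        while i < len(current):
            current[i](record, trig)
            i += 1
    for rule in completion_rules:       # rules triggered at completion
        rule()
```

A trivial rule that stays active and counts matching records would be used as:

```python
hits = []
def match_a(record, trig):
    if record == "a":
        hits.append(record)
    trig["next"](match_a)               # remain active for the next record

analyze(["a", "b", "a"], [match_a], [lambda: hits.append("done")])
```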
2.2.3 Eciency
Is a critical requirement for the analysis of large
sequential les, especially when on-line monitoring is
involved. RUSSEL is ecient thanks to its operational semantics which exhibits a bottom-up approach
in constructing the searched record patterns. Furthermore, optimization issues are carefully addressed
in the implementation of RUSSEL: for instance, the
internal code generated by the compiler ensures a
fast evaluation of boolean expressions and the current record is pre-processed before evaluation by all
the current rules, in order to provide a direct access
to its elds.
3 Administrator-Minded Functionalities
3.1 Introduction
The previous sections showed that ASAX is a universal, powerful and efficient tool for analyzing sequential files in general, and audit trails in particular. In this section, the functionalities of a distributed version of ASAX are presented in the context of distributed security monitoring of networked computers. The implemented system applies to a network of SUN workstations using the C2 security feature and uses PVM (Parallel Virtual Machine) [15] as its message passing system. However, the architecture design makes no assumption about the communication protocol, the auditing mechanism or the operating system of the involved hosts.
3.2 Single point administration
In a network of computers, and in the context of security auditing, it is desirable that the security officer control the whole system from a single machine. The distributed on-line system must be manageable from a central point, where global knowledge about the status of the monitoring system can be maintained and administered in a flexible fashion. Management of the monitoring system involves various tasks such as activation of distributed evaluators and auditing granularity control. Therefore, monitored nodes are, in a sense, considered as local objects on which administration tasks can be applied transparently, as if they were local to the central machine.
3.3 The local and global analyses

The local analysis requirement corresponds to the ability to analyze any audit trail associated with a monitored host. This is achieved by applying an appropriate RUSSEL module to a given audit trail of a given host. The analysis is considered local in the sense that the analyzed audit data represent events taking place at the involved host. No assumption is otherwise made about which host actually performs the analysis. Local analysis is also called filtering since, at the network level, it serves as a pre-selection of relevant events. In fact, pre-selected events may correspond to arbitrarily complex patterns of subject behavior.

Audit records filtered at the various nodes are communicated to a central host where a global (network level) analysis takes place. In its most interesting use, global analysis aims at detecting patterns related to the global network security status rather than host security status. In this regard, global analysis encompasses a higher level and a more elaborate notion of security event.

The concerted local and global analysis approach lends itself naturally to a hierarchical model of security events, in which components of a pattern are detected at a lower level, a more aggregate pattern is derived at the next higher level, and so on. Note that an aggregate pattern could exhibit a malicious security event while the corresponding sub-patterns do not at all. For instance, a login failure by a user is not an outright security violation, but the fact that this same user is trying to connect to an abnormally high number of hosts may indicate that a network attack is under way. Organizations often use networks of interconnected LANs corresponding to departments. The hierarchical model can be mapped onto the organization hierarchy by applying a distributed analysis on each of the LANs and an organization-wide analysis on audit data filtered at each LAN. Thus, concerted filtering and global analysis can lead to the detection of very complex patterns.

In the following, the node performing the global analysis is called the central or master machine, while filtering takes place at slave machines. Correspondingly, we will also refer to master and slave evaluators. A distributed evaluator is a master evaluator together with its associated slave evaluators.

3.4 Availability

This requirement means that a distributed evaluator must survive the failure of any of its slave evaluators and must be easily recoverable in case of a failure of the master evaluator. The availability of a distributed evaluator ensures that if for some reason a given slave is lost (broken connection, fatal error in the slave code itself, node crash, etc.), the distributed analysis can still be carried on over the rest of the monitored hosts. On the other hand, if the master evaluator fails, the distributed analysis can be resumed from another available host. In all cases, and especially for on-line analysis, all generated audit records must remain available for analysis (no records are lost). Distributed analysis recovery must also be done in a flexible way and require minimum effort.

3.5 Logging control

This functionality involves control of the granularity of security events at the network, host and user levels. Typically, the security officer must be able to set up a standard granularity for most audited hosts and to require a finer granularity for a particular user or for all users of a particular host. According to the single point administration requirement, this also means that logging control is carried out from the central machine, without the need to log in to each remote host.
4 Architecture
The architecture of the distributed system is addressed at two different levels. At the host level, a number of processes cooperate to achieve logging control and filtering. The global architecture supports the network level analysis. This section aims at giving an intuitive view of the overall distributed system.

4.1 Host level

Processes in the local architecture are involved in the generation of audit data, control of its granularity level, conversion of audit data to NADF format, analysis of audit records and, finally, transmission of filtered sequences to the central evaluator. At the master host, a network level analysis subsequently takes place on the stream of records resulting from merging the records incoming from the slave machines. Both global and local analyses are performed by a slightly modified version of the analysis tool outlined in the previous section.
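The merge of the incoming slave streams can be illustrated as follows. This Python sketch is ours, not the ASAX implementation; the paper does not specify the merge policy, so we assume the plausible one of ordering by the au_time time stamp, which yields a globally ordered stream whenever each slave sends its filtered records in time-stamp order.

```python
import heapq

def merge_slave_streams(streams, timestamp=lambda r: int(r["au_time"])):
    """k-way merge of per-slave record streams, each already sorted by
    time stamp, into one chronologically ordered stream for the master."""
    return heapq.merge(*streams, key=timestamp)
```

For example, two slave streams interleave by time stamp:

```python
poireau = [{"au_time": "100", "host": "poireau"},
           {"au_time": "130", "host": "poireau"}]
epinard = [{"au_time": "110", "host": "epinard"}]
merged = list(merge_slave_streams([poireau, epinard]))
```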
4.1.1 Audit trail generation
This mechanism is operating system dependent. It generates audit records representing events such as operations on files, administrative actions, etc. It is assumed that all monitored hosts provide auditing capabilities and a mechanism for controlling the granularity level. The process generating audit records is called the audit daemon (auditd for short).
4.1.2 Logging controller
This process communicates with auditd in order to alter the granularity. It is able to change the set of pre-selected events. This can be done on a per-user, per-host or network-wide basis. Furthermore, we distinguish between a temporary change, which applies to the current login session, and a permanent change, which also affects all subsequent sessions.
4.1.3 Format adaptor
This process translates audit trails generated by auditd into the NADF format. Native files can be erased after being converted, since they are semantically redundant with the NADF files. Keeping converted files instead of native files has several advantages: the files are converted only once and can be re-analyzed several times without requiring a new conversion. Moreover,
[Figure 2 appears here.]

Figure 2: System Architecture. On each monitored host, a format adaptor produces NADF files that are analyzed by a local (filtering) evaluator; the filtered audit records are sent over the network to the central evaluator, which performs the global analysis under the control of the console.
in the context of a heterogeneous network, they provide a standard and unique format.
4.1.4 Local evaluator
It analyzes the NADF files generated by the format adaptor. Note that several instances of the evaluator can be active at the same time to perform analyses on different NADF files, or possibly on the same file. Off-line and on-line analyses are implemented in the same way; the only difference is that in on-line mode the evaluator analyzes the file currently being generated. These processes will be further described in section 6.

Audit records filtered by slave evaluators on the various monitored slave machines are sent to the central machine for global analysis.
4.2 Network level
At the network level, the system consists of one or more slave machines running the processes previously described and a master machine running the master evaluator (see Figure 2). The latter performs global analysis on the audit record stream resulting from local filtering. The result of the central analysis can be network security status reports, alarms, statistics, etc. In addition, a console process runs on the master machine. It provides an interactive command interface to the distributed monitoring system. This command interface is briefly described in the next section.
5 The Command Language of
the Distributed System
5.1 Preliminaries
This section presents the command interface used by the security officer. In the following, evaluator instances are identified by their PVM instance numbers, which are similar to process ids in UNIX systems. Auditable events are determined by a comma-separated list of audit flags, which are borrowed from the SunOS 4.1 C2 security notation for event classes. (The SunOS 4.1 C2 security features are described in detail in [16].) These audit flags are listed in Table 1.
flags   short description           example
dr      data read                   stat(2)
dw      data write                  utimes(2)
dc      object create/delete        mkdir(2)
da      object access change        chmod(2)
lo      Login, Logout               login(1)
ad      Administrative operation    su(1)
p0      Privileged operation        quota(1)
p1      Unusual operation           reboot(2)

Table 1: SunOS C2 security audit flags
Audit flags can optionally be preceded by + (resp. -) to select only successful (resp. failed) events. For instance, the list of audit flags +dr,-dw,lo,p0,p1 specifies that successful data reads, failed data writes, all logins and all privileged operations are selected. Under SunOS, the file /etc/security/audit/audit_control contains (among other things) a list of audit flags determining the set of auditable events for all users of the system. The file /etc/security/passwd.adjunct is called the shadow file and contains a line per user indicating the events to be audited for that particular user. The actual set of auditable events for a given user is derived from the system audit value and the user audit value according to some priority rules.

Finally, audit trails in NADF format respect naming conventions based on creation and closing times. This makes it easy to select files generated during a given time interval. For instance, the file named time_i.time_f.NADF contains events generated by auditd in the time interval [time_i, time_f]. Supported commands fall into two categories: analysis control commands and logging control commands.
5.2 Analysis control commands
The commands for distributed analysis allow the security officer to start, stop and modify a distributed analysis. To start a new distributed analysis on a set of monitored hosts, one first prepares a text file specifying the involved hosts, the RUSSEL modules to be applied on those hosts and, optionally, an auditing period (a time interval which is the same for each node). By default, the analysis is performed on-line. This file is given as an argument to the run command.

Using the rerun command, the security officer can change the attributes of an active distributed evaluator, either by changing the rule modules on some hosts (master or slave) or by changing the time interval used by the whole distributed evaluator. The rerun command is parameterized by an evaluator instance number and a rule module or a time interval.

The kill command stops an evaluator identified by its instance number. ps reports the attributes of all active distributed evaluators. The attributes of an evaluator include its instance number, the instance number of the corresponding master evaluator, the host name, the rule module and the time interval.

It is possible to activate several distributed evaluators which run independently of each other. The command reset stops all current distributed evaluators.
5.3 Logging control commands
The command logcntl implements the logging control functionality (see 3.5). It allows the security officer to alter the granularity level for any monitored user or host, i.e., to change the auditable events for a particular user on a particular host according to the list of audit flags supplied to logcntl. With the option -t, the change takes effect immediately, but the settings remain in effect only during the current login session. With the option -p, the change takes effect the next time the user logs in and for every subsequent login session, until this command is invoked again. On a host basis, the security officer can alter the system audit value of a specified host by supplying a host name and a list of audit flags.

Although the logcntl command relies closely on the SunOS formalism for specifying auditable events and altering the set of events currently audited, it would be possible to develop a system-independent event classification as well as a portable auditing configuration. Nevertheless, as SunOS 4.1 uses an event classification and auditing configuration similar to those of most operating systems, the current solution is sufficient for a prototype system.
5.4 Example
In this section, the failed login detection example introduced in section 2.1 is reconsidered in the context of a distributed analysis. The purpose is still to detect repeated failed login attempts, but now failed login events can occur at any of the monitored hosts (here we consider two hosts, viz. poireau and epinard). According to the filtering/global analysis principle, a slave evaluator is activated on each host (poireau and epinard) and a master evaluator is initiated on poireau. Each slave evaluator only filters failed login records from its local host and sends them to the master evaluator, which then analyzes the filtered record stream to detect sequences of failed logins. As indicated in the evaluator description file shown in Figure 3, filtering is implemented in RUSSEL by the rule module badlogin.asa, while the sequence of failed logins is detected using the rule module nbbadlogin.asa. This file also contains the time interval to which the analysis is applied. Figures 4 and 5 depict the content of badlogin.asa and nbbadlogin.asa respectively.

Notice that the master evaluator does not check that records correspond to login failure events, since
master poireau:
nbbadlogin:[19940531170431,19940601173829];
slaves poireau, epinard: badlogin.
Figure 3: Distributed Analysis Description File
rule failed_login;
begin
    if
        event = 'login_logout'
        and au_text_4 = 'incorrect password'
        --> send current
    ;
    trigger off for next failed_login
end;

init_action;
begin
    trigger off for next failed_login
end.

Figure 4: Slave evaluator module: badlogin.asa
this is already done by the associated slave evaluators. Figure 6 shows how the distributed evaluator is activated using the interactive console window. The lower window contains the distributed analysis interactive console: the security officer has just invoked the run command with the name of the evaluator description file as argument. The upper window is the Unix console where the outputs of the master evaluator are printed.
6 Overview of the Implementation
The implementation of the rule-based language RUSSEL is outside the scope of this paper and is fully explained in [10, 11]. We only consider the implementation of the distributed aspects. It is worth noticing, however, that very few modifications were necessary to handle record streams instead of ordinary audit trails. In addition to the auditd process, the following concurrent processes are attached to each monitored host (see Figure 7):
6.1 Distributed format adaptor (FA)
The distributed format adaptor fadapter translates SunOS audit files into NADF format. It also observes date- and time-based naming conventions for NADF files: a NADF file consisting of the chronological sequence of audit records R0, ..., Rn-1 is named time0.timen.NADF, where time0 is the time and date found in R0 and timen is the time stamp in Rn-1 plus one second. Both time0 and timen consist of 4 decimal digits for the year and 2 decimal digits for each of the month, day, hour, minute and second. The current NADF file has a name of the form time0.not_terminated.NADF, where time0 is the time stamp of its first record.
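The naming convention makes time-interval selection a pure string operation, since the 14-digit YYYYMMDDhhmmss stamps sort lexicographically. The following Python sketch illustrates this; the helper names are ours, not part of ASAX.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H%M%S"  # 14-digit YYYYMMDDhhmmss stamp

def nadf_name(first_ts, last_ts):
    """Build the NADF file name from the stamps of records R0 and Rn-1;
    the closing stamp is the last time stamp plus one second."""
    tn = datetime.strptime(last_ts, FMT) + timedelta(seconds=1)
    return f"{first_ts}.{tn.strftime(FMT)}.NADF"

def covers_interval(name, begin, end):
    """True if file `name` may contain records in [begin, end].
    Fixed-width digit strings compare correctly as strings."""
    t0, tn, _ = name.split(".")
    return t0 <= end and begin < tn
```

For instance, the file of Figure 3's interval, 19940531170431.19940601173829.NADF, is selected for any query interval overlapping [19940531170431, 19940601173829).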
The current native and NADF files are limited to a maximum size, which is recorded in the file nadf_data.
Figure 6: Console windows

The process sizer sends a signal to auditd (resp. fadapter) if the maximum size for the current native (resp. NADF) file is reached. When auditd or fadapter receives such a signal, it closes the current file and continues on a new one. The maximum size can be changed at any time by a simple RPC (Remote Procedure Call) server, d_size_svc, upon request from the console process. d_size_svc updates the file nadf_data accordingly.

The distributed FA is automatically started at boot time on each monitored host from /etc/rc.local.
6.2 Logging control
Changing the granularity level for a user or a host is performed remotely from the security officer's console by a remote update of the auditd configuration of the involved host. Logging control is therefore implemented by means of RPC. For this purpose, a server process logcntl_svc is attached to each monitored host, accepting requests from the console process running on the master machine. Depending on the option used for the command logcntl, the console process calls an appropriate procedure offered by the logcntl_svc server on the involved host. Following the RPC model, logcntl_svc transfers control to the appropriate service procedure and then sends back a reply to the console process indicating the outcome of the call.

It was not possible to implement this communication using PVM, since all processes participating in the Parallel Virtual Machine must belong to the same user, while the logcntl_svc server requires root privileges to access the shadow password file. Moreover, the security officer should not necessarily own root privileges.
6.3 Supplier process

This process runs on each monitored host. It sends to its evaluator a record stream corresponding to a given time interval. It receives from the console process on the master machine the instance number of its associated evaluator and a time interval. It retrieves the corresponding records from the NADF files and sends them in sequence, using a PVM message for each record. It is interesting to note that slave and master evaluators are implemented by exactly the same code. This is possible at the cost of providing the additional supplier process, which hides the details of how audit records are retrieved: for slave evaluators, the records are received from the supplier process, while a master evaluator receives them from its slave evaluators.

6.4 Evaluator process
The evaluator process (on master and slave machines) is the heart of the distributed system. It analyzes record streams according to a rule module. If the evaluator is a master evaluator, the record stream originates from a set of slave evaluators and the result of the analysis may be reports, alarms, statistics, etc. If the evaluator is a slave evaluator, there is only one sending process (the supplier process) and, in this case, the result is a filtered sequence of audit records which is sent to the master evaluator. The console process can change the rule module used by an evaluator by sending it the name of the new module
global v: integer;

rule failed_login(max_times, duration: integer);
begin
    trigger off for_next
        count_rule(au_text_3,
                   strToInt(au_time) + duration,
                   max_times - 1)
end;

rule count_rule(target_uid: string;
                expiration, count_down: integer);
if
    au_text_3 = target_uid
    and strToInt(au_time) < expiration
    --> if count_down > 1
        --> trigger off for_next
                count_rule(target_uid,
                           expiration,
                           count_down - 1);
        count_down = 1
        --> begin
                v := v + 1;
                println(gettime(au_time),
                        ': 3 FAILED LOGINS ON ',
                        target_uid);
                trigger off for_next
                    failed_login(3, 120)
            end
        fi;
    strToInt(au_time) > expiration
    --> trigger off for_next
            failed_login(3, 120);
    true
    --> trigger off for_next
            count_rule(target_uid,
                       expiration,
                       count_down)
fi;

rule print_results;
begin
    println(v, ' sequence(s) of bad logins found')
end;

init_action;
begin
    v := 0;
    trigger off for_next failed_login(3, 120);
    trigger off at_completion print_results
end.
Figure 5: Master evaluator module: nbbadlogin.asa
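The detection logic of Figure 5 can be approximated outside RUSSEL. The following Python sketch keeps a single counter chain active at a time, as the failed_login/count_rule pair does; the (time, uid) event format is assumed and the re-arming on window expiry is slightly simplified:

```python
def count_bad_login_sequences(events, max_times=3, duration=120):
    """Count sequences of max_times failed logins on one uid within
    duration seconds, approximating failed_login/count_rule: one
    counter chain is active at a time and is re-armed after a hit."""
    count = 0
    target = expiration = remaining = None
    for t, uid in events:  # time-ordered (time, uid) failed-login records
        if target is None:
            # failed_login: arm a counter on the first failure seen
            target, expiration, remaining = uid, t + duration, max_times - 1
        elif uid == target and t < expiration:
            remaining -= 1
            if remaining == 0:
                count += 1     # "3 FAILED LOGINS ON <uid>"
                target = None  # re-arm failed_login for the next record
        elif t > expiration:
            # The window expired: start a fresh chain on this record.
            target, expiration, remaining = uid, t + duration, max_times - 1
    return count

hits = count_bad_login_sequences([(0, "guest"), (30, "guest"), (60, "guest")])
```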
[Figure 7 (Local Architecture) is a diagram in the original: on each monitored host, the NATIVE FILE of audit data is translated by the format adaptor (fadapter) into NADF files, which the supplier reads and streams to the evaluator; auditd, audit control, the d_size_svc sizer, and the logcntl_svc server (which reads passwd.adjunct) complete the host. Legend: PVM communication; read/write; Unix signal.]
Figure 7: Local Architecture
to be applied. At reception, the evaluator executes the completion rules, compiles the new module, executes the resulting init-actions, and then waits for audit records. The time interval can also be changed for all evaluators participating in a distributed analysis. For this purpose, the console process sends the new time interval to all involved supplier processes and notifies the evaluators of the change. Upon reception, the supplier processes send to the evaluators a record stream determined by the new time interval. The evaluator processes only execute completion rules and init-actions. Completion rules report the results of the previous analysis before the current rule module or time interval is changed.
6.5 Console process
This process was already partially described in previous sections. Only a single instance of the console process exists; it is active on the master machine under the control of the security officer, through the command interface described in Section 5. It maintains the status of all active distributed evaluators and coordinates all processes of the distributed system. Under interactive control of the security officer, the console process can also invoke the remote logcntl_svc RPC server to change the current granularity level on a given host.
To activate a distributed analysis as specified in a distributed evaluator description file, the console process initiates an evaluator-supplier pair on each slave host and a master evaluator on the master host. It then sends the time interval to all supplier instances and the appropriate rule module to each evaluator instance. When all suppliers are positioned in the time interval and all evaluators have successfully compiled their modules, the console process starts the analysis by triggering record stream transmissions from suppliers to slave evaluators.
7 Performance Measurements
7.1 Introduction
This section reports some performance tests of our system. These measurements aim at showing the feasibility and effectiveness of the distributed system in terms of response time and network load. It will also follow from these measurements that on-line monitoring is feasible.
The experiments were carried out on two SUN SPARCstation 1 workstations running the C2 security level of SunOS 4.1 and connected to a 10 Mbit/s Ethernet. Each machine has 16 Mbytes of random access memory. In addition, a third machine on the Ethernet is used as a file server where the NADF files generated at each host are stored using NFS (Network File System).
The first experiment measures the overhead due to the distributed architecture with respect to the same analysis performed on a single audit trail. The second one compares the performance of a distributed audit trail analysis with that of a centralized audit trail analysis. The last experiment shows the benefits of executing several analyses in parallel.
7.2 Overhead of the distributed architecture
In order to measure the overhead introduced by the distributed architecture, we analyzed a single audit file of 500 Kbytes using, on the one hand, the single audit trail analysis version and, on the other hand, the distributed version. The analyzed file represents two days of system usage by two users. Audited events are file operations as well as normal administrative operations such as the su and login commands. In the first case, audit records are simply retrieved from the audit file using input/output routines. The second case corresponds to a degenerate distributed evaluator composed of a single slave evaluator. The overhead introduced is mainly due to network communication (using PVM) between the slave and the master. [17] describes experiments comparing the communication times of a number of different network programming environments on isolated two- and four-node networks. Since the messages exchanged in the distributed system are around 300 bytes in size, it follows from the measurements conducted in [17] that the average data transfer rate is around 0.049 Mbytes/sec.
The slave evaluator applies the badlogin.asa module as explained earlier, and the master evaluator runs the nbbadlogin.asa module. Table 2 gives the mean values of the CPU and elapsed times (in seconds) for the stand-alone analysis (SAA) and the distributed analysis (DA).
The results suggest that distributed audit trail analysis is feasible, since the elapsed time of the analysis is negligible with respect to the time spent generating the audit data (2 days). However, the overhead due to the distributed architecture is significant: most of the elapsed time is spent in communication between nodes. Consequently, improving the distributed system response time involves optimizing the network communication.
type   usr    sys    total   elapsed
SAA    1.13   0.68   1.81      5.3
DA     3.43   3.73   7.20     55.7

Table 2: Stand-Alone vs. Distributed Analysis
7.3 Centralized vs. distributed audit trail analysis
This section reports the performance benefits of distributed network security monitoring over centralized network security monitoring. In the latter approach, monitored nodes do not perform any intelligent(3) filtering of audit data: all audit records generated at a node are sent to a central host where the analysis takes place. As shown in Table 3, the distributed analysis has the advantage of drastically reducing the network traffic in comparison with the centralized analysis (CA). It also achieves a balancing of the CPU time over several machines. The CPU time of the master evaluator is smaller, since part of the analysis is carried out by the slave evaluators on the slave machines. A system using a centralized architecture for network audit trail analysis is presented in [18].
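The traffic difference shown in Table 3 comes from where the filtering happens. The sketch below contrasts the two approaches on a made-up record set; the record format and the selection predicate are illustrative, not the system's actual rule module:

```python
# Made-up audit stream: 3 failed logins among 97 routine file accesses.
records = ([{"event": "login", "ok": False}] * 3
           + [{"event": "read", "ok": True}] * 97)

def slave_filter(recs):
    """A slave evaluator forwards only the records its rules select
    (here: failed logins)."""
    return [r for r in recs if r["event"] == "login" and not r["ok"]]

sent_distributed = len(slave_filter(records))  # only filtered records cross the network
sent_centralized = len(records)                # every record does
```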
type   usr     sys     total   elapsed   traffic(a)
CA     11.90   13.60   25.56   265.78       2,661
DA      1.15    7.46    8.61   188.56          39

(a) in Kbytes

Table 3: Distributed vs. Centralized Analysis
7.4 Parallel vs. sequential analysis
The RUSSEL language makes it possible to execute more than one analysis at the same time, i.e., during a single analysis of a given audit file, several independent rule modules can be executed. For instance, we can search in parallel for repeated failed logins as well as for repeated attempts to corrupt system files.
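Running several rule modules during one analysis amounts to applying several independent detectors in a single pass over the record stream. A Python sketch, with made-up record fields and detector predicates standing in for compiled RUSSEL modules:

```python
def run_parallel(records, detectors):
    """Apply several independent detectors in one pass over the
    record stream, as RUSSEL runs several rule modules at once."""
    hits = {name: 0 for name in detectors}
    for rec in records:  # a single pass over the stream
        for name, predicate in detectors.items():
            if predicate(rec):
                hits[name] += 1
    return hits

stream = [
    {"ev": "su",     "ok": False},
    {"ev": "unlink", "path": "/etc/passwd"},
]
detectors = {
    "badsu":   lambda r: r.get("ev") == "su" and not r.get("ok", True),
    "corrupt": lambda r: (r.get("ev") == "unlink"
                          and r.get("path", "").startswith("/etc/")),
}
hits = run_parallel(stream, detectors)
```

The point of the single pass is that the stream is read (and, in the distributed case, transmitted) only once, however many analyses run.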
We used 4 distributed evaluators described by their distributed evaluator description files. All analyses are limited to a specified time interval, as shown in Figure 8.
The purpose of the first one is to detect 3 repeated failures to break a given account using the su command. Each of the hosts poireau and epinard runs a slave evaluator which detects unsuccessful su commands. The master evaluator detects sequences of 3 failed su commands invoked at any of the two monitored hosts.
The purpose of the second distributed analysis is to detect attempts to corrupt system files on either of the two hosts.
(3) Assuming that a simple pre-selection of auditable events cannot be considered as intelligent filtering.
bad su commands:
    master poireau:
        nbbadsu:[19940531170524,19940606173854];
    slaves poireau, epinard: badsu.

System corruption:
    master poireau:
        fscorrupt:[19940531170524,19940606173854];
    slaves poireau, epinard: corrupt.

Set user id files:
    master poireau:
        setuid:[19940531170524,19940606173854];
    slaves poireau, epinard: create.

trojan su:
    master poireau:
        trojan:[19940531170524,19940606173854];
    slaves poireau, epinard: exec.

Figure 8: Multiple Distributed Analyses: RUSSEL modules for the master and the slaves

master_module.asa:
    uses nbbadsu, fscorrupt, setuid, trojan.
slave_module.asa:
    uses badsu, corrupt, create, exec.

Figure 9: Parallel analysis: RUSSEL modules for the master and slaves
type        usr     sys     total   elapsed
nbbadsu     2.43   10.18   12.61    159.48
fscorrupt   2.33   12.51   14.84    176.90
setuid      3.10   11.98   15.08    182.46
trojan      2.83   11.83   14.66    184.09
total      10.69   46.50   57.19    702.92
parallel    8.03   15.03   23.06    209.83

Table 4: Multiple Distributed Analysis vs. Parallel Analysis
System file corruption may involve deletion, creation, or attribute modification of any system file or directory. Each slave evaluator applies
the RUSSEL module corrupt.asa, which detects deletion, creation, or attribute modification of files. The master evaluator uses the module fscorrupt.asa to check that such operations involve a system file.
The third analysis aims at detecting new set-user-id files. For this purpose, the slave evaluators on epinard and poireau detect the creation of files in directories such as /tmp or /usr/tmp and the modification of their access flags. At the master evaluator, the module setuid.asa is used to detect the creation of a file in such a directory followed by a modification of the access flags of that same file such that it becomes a set-user-id file.
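The create-then-chmod correlation performed by setuid.asa can be sketched as follows. This is a Python sketch with an assumed (op, path, mode) event format; the module's actual rules are richer:

```python
import stat

def find_new_setuid(events, watched=("/tmp/", "/usr/tmp/")):
    """Flag files first created under a watched directory and later
    given the set-user-id bit: the create-then-chmod pattern."""
    created = set()
    hits = []
    for op, path, mode in events:  # assumed (op, path, mode) audit events
        if op == "create" and path.startswith(watched):
            created.add(path)
        elif op == "chmod" and path in created and mode & stat.S_ISUID:
            hits.append(path)
    return hits

events = [
    ("create", "/tmp/sh",   0o755),
    ("chmod",  "/tmp/sh",   0o4755),   # becomes set-user-id: flagged
    ("create", "/home/u/a", 0o644),
]
flagged = find_new_setuid(events)
```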
The last analysis searches for trojan system programs such as su. The slaves detect the execution of any command using the module exec.asa, while the master applies the module trojan.asa to check whether such an execution involves a trojan program.
The multiple distributed analysis amounts to executing these distributed analyses one after the other, using the run command with the appropriate distributed evaluator description file as argument. The corresponding execution times are reported in Table 4.
In the case of parallel execution, we activate a single distributed evaluator which performs the four analyses at the same time. For this purpose, the master evaluator uses a rule module (master_module.asa) which includes all the modules applied by each of the above 4 masters (see Figure 9). Similarly, the slave evaluators (on epinard and poireau) run a single module (slave_module.asa) which includes the 4 modules applied by the previous slave evaluators.
The distributed evaluator description file for the parallel execution is depicted in Figure 10.

parallel distributed evaluator:
    master poireau:
        master_module:[19940531170524,19940606173854];
    slaves poireau, epinard: slave_module.

Figure 10: Parallel Analysis Description File
The execution times for the parallel analysis are found in the last line of Table 4. It follows from this table that the performance gain is substantial. Note that the elapsed time of the parallel analysis is not significantly different from the elapsed time of a single analysis. This suggests that complex on-line analyses (combining many single analyses in parallel) are feasible.
8 Conclusions and Future Work
This paper presented an implemented system for on-line analysis of multiple distributed data streams. The universality of the system makes it conceptually independent from any architecture or operating system. This is achieved by means of format adaptors which translate data streams to a canonical format. The rule-based language (RUSSEL) is specifically designed for analyzing unstructured data streams. This makes the presented system (theoretically) as powerful as possible, yet still efficient enough for solving complex queries on the data streams. We also presented the distributed architecture of the system and its implementation.
The effectiveness of the distributed system was demonstrated by performance measurements conducted on real network attack examples. These measurements also showed that on-line distributed analysis is feasible even for complex problems. Further work will tackle the problem of reducing the overhead due to network communication. In the present version of the system, audit records are transmitted using one PVM message per record. A first improvement is to buffer audit records before packing them into a single PVM message. Another improvement involves a direct use of standard communication protocols such as TCP/IP instead of PVM. More standard protocols will increase the portability and the robustness of our system.
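The buffering improvement mentioned above, packing several audit records into one message instead of sending one message per record, can be sketched as follows; the batch size and names are illustrative:

```python
def batch(records, max_batch=64):
    """Group records so that one network message carries up to
    max_batch of them instead of a single record per message."""
    return [records[i:i + max_batch]
            for i in range(0, len(records), max_batch)]

messages = batch(list(range(300)), max_batch=64)  # 300 records -> 5 messages
```

With ~300-byte records, batching by 64 turns 300 sends into 5, amortizing the per-message latency that dominates the elapsed times reported in Section 7.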
References
[1] A. Baur, W. Weiss. Audit Analysis Tool for Systems with High Demands Regarding Security and Access Control. Research Report ZFE F2 SOF 42, Siemens Nixdorf Software, Munich, November 1988.
[2] W.R. Cheswick, S.M. Bellovin. Firewalls and Internet Security: Repelling the Wily Hacker. Addison-Wesley, 1994, 306 pages. ISBN 0-201-63357-4.
[3] D.E. Denning. An Intrusion-Detection Model. IEEE Transactions on Software Engineering, Vol. 13, No. 2, February 1987.
[4] Th.D. Garvey, T.F. Lunt. Model-Based Intrusion Detection. Proceedings of the 14th National Security Conference, Washington DC, October 1991.
[5] T. Lunt, J. van Horne, L. Halme. Automated Analysis of Computer System Audit Trails. Proceedings of the 9th DOE Computer Security Group Conference, May 1986.
[6] T.F. Lunt, R. Jagannathan. A Prototype Real-Time Intrusion Detection Expert System. Proceedings of the 1988 IEEE Symposium on Security and Privacy, April 1988.
[7] T.F. Lunt. Automated Audit Trail Analysis and Intrusion Detection: A Survey. Proceedings of the 11th National Security Conference, Baltimore, MD, October 1988.
[8] T.F. Lunt. Real Time Intrusion Detection. Proceedings of COMPCON Spring '89, San Francisco, CA, February 1989.
[9] T.F. Lunt et al. A Real-Time Intrusion Detection Expert System. Interim Progress Report, Computer Science Laboratory, SRI International, Menlo Park, CA, May 1990.
[10] N. Habra, B. Le Charlier, A. Mounji. Preliminary Report on Advanced Security Audit Trail Analysis on Unix. 15.12.91, 34 pages.
[11] N. Habra, B. Le Charlier, A. Mounji. Advanced Security Audit Trail Analysis on Unix: Implementation Design of the NADF Evaluator. March 1993, 62 pages.
[12] N. Habra, B. Le Charlier, I. Mathieu, A. Mounji. ASAX: Software Architecture and Rule-based Language for Universal Audit Trail Analysis. Proceedings of the Second European Symposium on Research in Computer Security (ESORICS), Toulouse, France, November 1992.
[13] A. Mounji, B. Le Charlier, D. Zampunieris, N. Habra. Preliminary Report on Advanced Security Audit Trail Analysis on Unix. 15.12.91, 34 pages.
[14] Marshall T. Rose. The Open Book: A Practical Perspective on OSI. Prentice-Hall, 1990, 651 pages. ISBN 0-13-643016-3.
[15] A. Beguelin, J. Dongarra, A. Geist, R. Manchek, V. Sunderam. A User's Guide to PVM (Parallel Virtual Machine). ORNL/TM-11826, July 1991, 13 pages.
[16] Sun Microsystems. Network Programming Guide. Part Number 800-3850-10, Revision A, 27 March 1990.
[17] Craig C. Douglas, Timothy G. Mattson, Martin H. Schultz. Parallel Programming Systems for Workstation Clusters. Yale University Department of Computer Science, Research Report YALEU/DCS/TR-975, August 1993, 36 pages.
[18] J.R. Winkler. A Unix Prototype for Intrusion and Anomaly Detection in Secure Networks. Planning Research Corporation, R&D, 1990.