CN109347827B

CN109347827B - Method, device, equipment and storage medium for predicting network attack behavior

Info

Publication number: CN109347827B
Application number: CN201811229471.4A
Authority: CN
Inventors: 阎俊达
Original assignee: Neusoft Corp
Current assignee: Neusoft Corp
Priority date: 2018-10-22
Filing date: 2018-10-22
Publication date: 2021-06-22
Anticipated expiration: 2038-10-22
Also published as: CN109347827A

Abstract

Embodiments of the invention provide a method, an apparatus, a device and a storage medium for predicting network attack behavior. In the method, feature extraction and identification are performed on a log to be processed according to a log analysis model, so as to obtain feature data of the log to be processed and information of the device to which the log belongs; security event data is determined according to the feature data of the log to be processed and the device information; and whether a network attack behavior will occur is predicted according to the security event data and an attack prediction model. This improves the efficiency of log analysis and identification, allows an impending network attack behavior to be predicted before it occurs, and provides a basis for effectively avoiding the attack, so that the security of the network device can be effectively guaranteed.

Description

Method, device, equipment and storage medium for predicting network attack behavior

Technical Field

The present invention relates to the field of network security technologies, and in particular, to a method, an apparatus, a device, and a storage medium for predicting network attack behavior.

Background

Network security means that the hardware, software and data of a network system are protected from damage, alteration and leakage caused by accidental or malicious factors, so that the system operates continuously, reliably and normally and network services are not interrupted.

Currently, whether a network device is under a network attack can be identified by analyzing the logs generated by the network device. Logs are unstructured data with non-uniform formats, network devices are of many types, and there is no uniform log parsing format across different network devices. In the prior art, a regular expression matching the log format is set through a regular matching algorithm to identify whether a network attack has been received.

However, in the prior-art method for recognizing network attack behavior, a regular expression must be selected before each recognition and only then is recognition performed. As a result, recognition efficiency is low, network attack behavior cannot be predicted, and the security of the network device cannot be effectively ensured.

Disclosure of Invention

Embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for predicting a network attack behavior, so as to solve the problems that, in a network attack behavior recognition method in the prior art, since a regular expression is selected and then recognized before each recognition, recognition efficiency is low, a network attack behavior cannot be predicted, and security of a network device cannot be effectively guaranteed.

One aspect of the embodiments of the present invention is to provide a method for predicting network attack behavior, including:

performing feature extraction and identification on a log to be processed according to a log analysis model to obtain feature data of the log to be processed and information of equipment to which the log belongs;

determining security event data according to the feature data of the log to be processed and the device information;

and predicting whether network attack behaviors occur or not according to the security event data and the attack prediction model.

Another aspect of the embodiments of the present invention is to provide a device for predicting network attack behavior, including:

the log analysis module is used for extracting and identifying the characteristics of the log to be processed according to the log analysis model so as to obtain the characteristic data of the log to be processed and the information of the equipment to which the log belongs;

the aggregation filling module is used for determining security event data according to the feature data of the log to be processed and the device information;

and the prediction processing module is used for predicting whether the network attack behavior occurs according to the security event data and the attack prediction model.

Another aspect of the embodiments of the present invention is to provide a network attack behavior prediction apparatus, including:

a memory, a processor, and a computer program stored on the memory and executable on the processor,

and when the processor runs the computer program, the network attack behavior prediction method is realized.

It is another aspect of an embodiment of the present invention to provide a computer-readable storage medium, storing a computer program,

the computer program realizes the network attack behavior prediction method when being executed by a processor.

According to the method, the apparatus, the device and the storage medium for predicting network attack behavior provided by the embodiments of the invention, a log to be processed is obtained; feature extraction and identification are performed on the log to be processed according to a log analysis model, so as to obtain feature data of the log to be processed and information of the device to which the log belongs; security event data is determined according to the feature data of the log to be processed and the device information; and whether a network attack behavior will occur is predicted according to the security event data and an attack prediction model. This improves the efficiency of log analysis and identification, allows an impending network attack behavior to be predicted before it occurs, and provides a basis for effectively avoiding the attack, so that the security of the network device can be effectively guaranteed.

Drawings

Fig. 1 is a flowchart of a method for predicting network attack behavior according to a first embodiment of the present invention;

fig. 2 is a flowchart of a method for predicting network attack behavior according to a second embodiment of the present invention;

fig. 3 is a schematic structural diagram of a device for predicting network attack behavior according to a third embodiment of the present invention;

fig. 4 is a schematic structural diagram of a device for predicting network attack behavior according to a fourth embodiment of the present invention;

fig. 5 is a schematic structural diagram of a network attack behavior prediction device according to a fifth embodiment of the present invention.

The above drawings illustrate certain embodiments of the invention, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concept in any way, but rather to explain the inventive concept to those skilled in the art by reference to specific embodiments.

Detailed Description

Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present invention. It should be understood that the drawings and embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure.

The terms "first," "second," "third," "fourth," and the like (if any) in embodiments of the present invention are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that terms so used are interchangeable under appropriate circumstances, such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. In the description of the following embodiments, "plurality" means two or more unless specifically limited otherwise.

The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.

Example one

Fig. 1 is a flowchart of a method for predicting network attack behavior according to a first embodiment of the present invention. This embodiment provides a method for predicting network attack behavior that addresses the following problems of the prior-art recognition method: because a regular expression is selected before each recognition and only then is recognition performed, recognition efficiency is low, network attack behavior cannot be predicted, and the security of the network device cannot be effectively ensured.

As shown in fig. 1, the method comprises the following specific steps:

Step S101, performing feature extraction and identification on the log to be processed according to the log analysis model, to obtain the feature data of the log to be processed and the information of the device to which the log belongs.

In this embodiment, when the security of a network device needs to be checked, the logs of the network device may be acquired as the logs to be processed.

In this embodiment, a network device is a physical device that has a logging function and is connected to a network. Network devices include computer devices (such as personal computers or servers), hubs, switches, bridges, routers, gateways, Network Interface Cards (NICs), Wireless Access Points (WAPs), printers, and so on. The variety of network devices is wide and still growing, and this embodiment does not limit the specific type of network device.

Optionally, multiple logs in a preset time period of the network device may be obtained as the logs to be processed. The preset time period may be set by a technician according to actual needs, and this embodiment is not specifically limited herein.

The device information to which the log to be processed belongs may be information that can distinguish different network devices. For example, the device information to which the log belongs may include: the brand of the device to which it belongs, the device type, device identification information, etc.

In this embodiment, the log analysis model may be obtained by training through the first training sample set based on a machine learning method, the accuracy of the trained model may be tested through the first testing sample set after the training is completed, and if the accuracy of the trained model meets a preset condition, the final log analysis model is obtained.
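As a rough illustration of the train-then-test procedure described above, the following minimal sketch trains a stand-in classifier on a first training sample set and accepts it only if its accuracy on a first test sample set meets a preset condition. The word-voting classifier, the function names and the 0.9 accuracy threshold are all assumptions made for illustration; the embodiment does not specify the machine learning algorithm or the preset condition.

```python
from collections import Counter, defaultdict


def train(samples):
    """Build a word -> label-count table from (log_text, device_label) pairs.

    This word-voting table is a hypothetical stand-in for the log analysis
    model; it is not the algorithm used by the embodiment.
    """
    model = defaultdict(Counter)
    for text, label in samples:
        for word in text.lower().split():
            model[word][label] += 1
    return model


def predict(model, text):
    """Return the label with the most word votes, or None if no word is known."""
    votes = Counter()
    for word in text.lower().split():
        votes.update(model.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


def accuracy(model, test_samples):
    """Fraction of test samples whose predicted label equals the true label."""
    correct = sum(1 for text, label in test_samples if predict(model, text) == label)
    return correct / len(test_samples)


def train_and_validate(train_set, test_set, min_accuracy=0.9):
    """Train on train_set and keep the model only if it passes the test set.

    min_accuracy plays the role of the "preset condition" (assumed value).
    Returns (model or None, measured accuracy).
    """
    model = train(train_set)
    acc = accuracy(model, test_set)
    return (model if acc >= min_accuracy else None), acc
```

If the measured accuracy falls below the preset condition, the caller would typically enlarge or relabel the training set and train again.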

The first training sample set comprises a plurality of training samples. Each training sample corresponds to one historical log and comprises the historical log and the label data of that log. The label data of a log comprises the feature data of the log and the information of the device to which the log belongs.

The feature data of the log to be processed may be the category, frequency, or number of occurrences of the feature words contained in the log to be processed. In this embodiment, the feature data of the log to be processed may be set by a technician according to actual needs, and this embodiment is not specifically limited herein.

For example, for the contents of the following history logs:

[ip＝192.168.174.145code＝44243622type＝0dev＝130008:[hillstone fw syslog parser]prefix＝Format message:<190>Jun 20 16:23:362206401140002123(root)44243622Traffic@FLOW:SESSION:10.235.251.35:26763->10.235.117.97:902(UDP),interface aggregate2,vr trust-vr,policy 22,user-@-,host-,session start

The label data corresponding to this log includes the feature data of the log and the information of the device to which the log belongs. The feature data of the log may include feature words such as "ip", "type", "format" and "message", together with the number of times each feature word occurs in the log. The device information of the log may include the brand of the device, the device type, device identification information, and so on. For example, for an Intrusion Detection System (IDS) device of a certain brand (e.g., "Green Alliance"), the device information may be "Green Alliance IDS device".
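The structure of such a training sample can be sketched as follows. The sketch counts how often each feature word occurs in a log and pairs the counts with the device information as label data. The `TrainingSample` class, the tokenization rule, the shortened sample log and the device string are hypothetical; only the feature words "ip", "type", "format" and "message" come from the example above.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Feature words named in the example; a real vocabulary would be larger.
FEATURE_WORDS = frozenset({"ip", "type", "format", "message"})


@dataclass
class TrainingSample:
    """One historical log together with its label data (hypothetical layout)."""
    log: str
    feature_counts: dict  # feature word -> number of occurrences in the log
    device_info: str      # e.g. brand and device type


def count_feature_words(log, feature_words=FEATURE_WORDS):
    """Count occurrences of each feature word, case-insensitively."""
    tokens = re.findall(r"[A-Za-z]+", log.lower())
    return dict(Counter(t for t in tokens if t in feature_words))


def build_sample(log, device_info):
    """Attach label data (feature counts plus device information) to a log."""
    return TrainingSample(log, count_feature_words(log), device_info)


sample = build_sample(
    "ip=10.0.0.1 type=0 prefix=Format message: session start",
    "example-brand firewall",
)
```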

The first training sample set and the first test sample set may be obtained by sorting a preset first knowledge base according to a preset first machine learning algorithm. The first knowledge base is an accumulation of logs from network devices of various manufacturers and contains a large number of historical logs.

The log to be processed is input into the log analysis model, and the log to be processed can be rapidly subjected to feature extraction and identification through the log analysis model to obtain feature data of the log to be processed and the equipment information of the log to be processed, so that the log to be processed can be rapidly analyzed.

Step S102, determining security event data according to the feature data of the log to be processed and the device information.

After the log to be processed has been analyzed to obtain its feature data and device information, aggregation processing and filling processing are performed on the logs to be processed, and each log to be processed that is retained after the processing is used as security event data.

The aggregation conditions of the aggregation processing include the generation time of the log and/or the aggregation count.

The aggregation processing of the logs to be processed may merge, according to the aggregation conditions, repeated logs that have the same feature data and the same device information into one log. It specifically includes the following possible implementations:

One possible implementation: among the logs to be processed, repeated logs that have the same feature data and device information and whose generation time falls within a preset time range are merged into one piece of data. The preset time range may be set by a technician according to actual needs, and this embodiment is not specifically limited herein. For example, the preset time range may be the last 5 minutes.

Another possible implementation: among the logs to be processed, repeated logs that have the same feature data and device information and whose number does not exceed the aggregation count are merged into one piece of data. The aggregation count may be set by a technician according to actual needs, and this embodiment is not specifically limited herein. For example, the aggregation count may be 1000.

Another possible implementation: among the logs to be processed, repeated logs that have the same feature data and device information, whose generation time falls within the preset time range, and whose number does not exceed the aggregation count are merged into one piece of data. The preset time range and the aggregation count may be set by a technician according to actual needs, and this embodiment is not specifically limited herein.

The filling processing of the log to be processed may fill in basic information of the log to be processed. According to the types of feature data that the log to be processed should include, if one or more items of feature data are missing from a certain log to be processed, the missing feature data is filled in according to the other feature data of that log, thereby completing the feature data of the log to be processed.
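The three aggregation variants described above (time range only, count only, or both) can be sketched with a single function. This is a minimal sketch under stated assumptions: each parsed log is a dict with hypothetical keys `features` (hashable), `device` and `time` (seconds), the time range is measured from the first log of each merged group, and a `count` field is added to record how many logs were merged. None of these details is fixed by the embodiment.

```python
def aggregate(logs, window_seconds=None, max_count=None):
    """Merge repeated logs that share feature data and device information.

    window_seconds: preset time range, measured from the first log of a group
                    (None disables the time condition).
    max_count:      aggregation count, the most logs merged into one record
                    (None disables the count condition).
    Returns merged records in order of first occurrence, each with a 'count'.
    """
    merged = []
    open_groups = {}  # (features, device) -> record currently accepting duplicates
    for log in sorted(logs, key=lambda item: item["time"]):
        key = (log["features"], log["device"])
        group = open_groups.get(key)
        fits = (
            group is not None
            and (window_seconds is None or log["time"] - group["time"] <= window_seconds)
            and (max_count is None or group["count"] < max_count)
        )
        if fits:
            group["count"] += 1  # duplicate absorbed into the existing record
        else:
            group = {**log, "count": 1}  # start a new merged record
            open_groups[key] = group
            merged.append(group)
    return merged
```

For instance, `aggregate(logs, window_seconds=300)` corresponds to the 5-minute example, and `aggregate(logs, max_count=1000)` to the count-based example.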

For example, geographical location information corresponding to the log to be processed may be contained in other feature data of that log. Optionally, if an item of feature data is missing from the log to be processed, a third-party interface program corresponding to the type of the missing feature data may be called to obtain filling data, and the obtained filling data is added to the feature data of the log to be processed.

For example, if the feature data of a certain log is recognized to include an IP address but the region information of the log is not recognized, the third-party interface program corresponding to region information is called to query which region the IP address of the log belongs to, so as to obtain the region information corresponding to the log; this region information is then filled into the feature data of the log to complete it. As another example, if the feature data missing from a certain log is the position coordinates of a place, the third-party interface program is called to obtain the longitude and latitude of the place, which give its position coordinates; the position coordinates are then added to the feature data of the log, and thus to the security event data corresponding to the log.
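The filling step can be sketched as a lookup table that maps each type of missing feature data to a provider function. In the sketch below, `fake_region_lookup` is a purely hypothetical stand-in for a third-party interface program, and its prefix table and region names are invented for illustration; a real deployment would call an actual geolocation service instead.

```python
def fake_region_lookup(features):
    """Hypothetical stand-in for a third-party IP-to-region interface."""
    prefix_table = {"10.235.": "internal-network"}  # invented sample data
    ip = features.get("ip", "")
    for prefix, region in prefix_table.items():
        if ip.startswith(prefix):
            return region
    return "unknown"


def fill_missing(features, required, providers):
    """Return a copy of features with missing required items filled in.

    required:  names of the feature data a log should contain.
    providers: feature name -> function deriving the value from the other
               feature data (for example by calling a third-party interface).
    Items that are already present are never overwritten.
    """
    filled = dict(features)
    for name in required:
        if filled.get(name) is None and name in providers:
            filled[name] = providers[name](filled)
    return filled


event = fill_missing(
    {"ip": "10.235.251.35"},
    required=["ip", "region"],
    providers={"region": fake_region_lookup},
)
```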

Step S103, predicting whether a network attack behavior will occur according to the security event data and the attack prediction model.

Because attack behaviors follow rules and unfold over time, and the logs generated by an attack are likewise ordered in time, in this embodiment, after the security event data corresponding to the logs to be processed is obtained, the security event data may be input into the attack prediction model in order of generation time. The attack prediction model predicts the probability that a network attack behavior will occur, from which it can further be predicted whether a network attack behavior will occur.

In this embodiment, the attack prediction model may be obtained, based on a machine learning method, by training with a second training sample set and testing with a second test sample set until the model converges. The attack prediction model is used for predicting whether a network attack behavior will occur.

The second training sample set and the second test sample set may be obtained by sorting a preset second knowledge base according to a preset second machine learning algorithm. The content of the second knowledge base is basic network attack rules; network attack rules may also be formulated through third-party network security devices, and configuration is supported. The network attack rule data in the second knowledge base is cleaned to obtain the second training sample set and the second test sample set.

That is, once the security event data corresponding to the logs to be processed is obtained, it is input into the attack prediction model in order of generation time, and the probability output by the model is used to predict whether a network attack behavior will occur.
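Why the order of generation time matters can be shown with a small sketch. The scoring function below is only a toy stand-in for the attack prediction model: it measures how much of one assumed attack rule (a port scan, then a login failure, then a privilege change) has been matched, in order, by the event sequence. The rule, the event types and the scoring are hypothetical and are not the trained model of the embodiment.

```python
def order_by_generation_time(events):
    """Sort security event data by generation time, earliest first."""
    return sorted(events, key=lambda event: event["time"])


def toy_attack_probability(
    ordered_events,
    rule=("port_scan", "login_failure", "privilege_change"),
):
    """Toy stand-in for the attack prediction model (assumption).

    Returns the fraction of the rule's steps matched in order by the events.
    """
    matched = 0
    for event in ordered_events:
        if matched < len(rule) and event["type"] == rule[matched]:
            matched += 1
    return matched / len(rule)


events = [
    {"time": 30, "type": "login_failure"},
    {"time": 10, "type": "port_scan"},
    {"time": 20, "type": "heartbeat"},
]
probability = toy_attack_probability(order_by_generation_time(events))
```

Feeding the same events in arrival order instead of generation order would match only one step of the rule, which is why the events are sorted first.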

In this embodiment of the invention, a log to be processed is obtained; feature extraction and identification are performed on the log to be processed according to the log analysis model, so as to obtain feature data of the log to be processed and information of the device to which the log belongs; security event data is determined according to the feature data of the log to be processed and the device information; and whether a network attack behavior will occur is predicted according to the security event data and the attack prediction model. This improves the efficiency of log analysis and identification, allows an impending network attack behavior to be predicted before it occurs, and provides a basis for effectively avoiding the attack, so that the security of the network device can be effectively guaranteed.

Example two

Fig. 2 is a flowchart of a method for predicting network attack behavior according to a second embodiment of the present invention. On the basis of the first embodiment, in this embodiment the feature data of the log to be processed includes the generation time, and predicting whether a network attack behavior will occur according to the security event data and the attack prediction model specifically includes: inputting each piece of security event data into the attack prediction model in order of generation time, so that the attack prediction model determines the probability that a network attack behavior will occur according to the association relationships among the multiple pieces of security event data; comparing the probability with a preset attack threshold; if the probability is greater than the preset attack threshold, outputting a prediction result indicating that a network attack behavior will occur; and if the probability is less than or equal to the preset attack threshold, outputting a prediction result indicating that no network attack behavior will occur. As shown in fig. 2, the method comprises the following specific steps:

Step S201, obtaining the log to be processed.

Step S202, performing feature extraction and identification on the log to be processed according to the log analysis model, to obtain the feature data of the log to be processed and the information of the device to which the log belongs.

The device information to which the log to be processed belongs may be information that can distinguish different network devices.

The log analysis model is a machine learning model. The log analysis model can be trained by a first training sample set based on a machine learning method, and the converged log analysis model is obtained by testing through a first testing sample set.

In this embodiment, a first knowledge base is obtained in advance. The first knowledge base is an accumulation of logs from network devices of various manufacturers and contains a large number of historical logs. By sorting the logs in the first knowledge base, a first training sample set for training and optimizing a preset log analysis model and a first test sample set for testing the optimized log analysis model are obtained.

Each first training sample in the first training sample set and each first test sample in the first test sample set is a log whose generating network device has been determined.

After the first training sample set and the first testing sample set are obtained, the log analysis model is trained by the first training sample, and the log analysis model is tested by the first testing sample until the log analysis model is converged, so that the optimized log analysis model is obtained.

In the step, the log to be processed is input into the log analysis model, and the log to be processed can be rapidly subjected to feature extraction and identification through the log analysis model to obtain feature data of the log to be processed and the equipment information of the log to be processed, so that the log to be processed can be rapidly identified, and the log identification rate and accuracy are improved.

Optionally, if identification of the log to be processed according to the log analysis model fails, an unknown-device-type log is generated. The unknown-device-type log records the event that identification of the log to be processed failed, so that a technician can manually identify the feature data and device information of the log to be processed according to the unknown-device-type log and send them to the network attack behavior prediction device through a user terminal. The network attack behavior prediction device receives the feature data and device information of the log to be processed, uses them as the label data of the log to be processed, and stores the log to be processed together with its label data in the first training sample set as a first training sample, thereby updating the first training sample set. The log analysis model is then updated according to the updated first training sample set, so as to further optimize the log analysis model.

Specifically, the process of storing a log to be processed whose device information has been determined into the first training sample set is consistent with the process of obtaining, by sorting the logs in the first knowledge base, the first training sample set used to train the preset log analysis model.
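The feedback loop for logs that cannot be identified can be sketched as follows. The class and method names are hypothetical, `parse_fn` stands in for the log analysis model, and the retraining step itself is left out because the embodiment does not specify the learning algorithm.

```python
class LogAnalyzer:
    """Sketch of the unknown-device-type feedback loop (hypothetical names)."""

    def __init__(self, parse_fn):
        # parse_fn stands in for the log analysis model: it returns
        # (feature_data, device_info), or None when identification fails.
        self.parse_fn = parse_fn
        self.unknown_device_logs = []  # records of failed identifications
        self.training_set = []         # first training sample set

    def analyze(self, log):
        """Identify a log; record it as an unknown-device-type log on failure."""
        result = self.parse_fn(log)
        if result is None:
            self.unknown_device_logs.append(log)
        return result

    def add_manual_label(self, log, feature_data, device_info):
        """Store a technician's manual identification as a new training sample."""
        label = {"feature_data": feature_data, "device_info": device_info}
        self.training_set.append((log, label))
        if log in self.unknown_device_logs:
            self.unknown_device_logs.remove(log)
        # The model would be updated from self.training_set at this point.
```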

Step S203, determining security event data according to the feature data of the log to be processed and the device information.

In this embodiment, the security event data is determined according to the feature data of the log to be processed and the device information, which may be specifically implemented in the following manner:

deleting repeated logs to be processed with the same characteristic data and the equipment information to which the logs belong; if some characteristic data is absent in some log to be processed, filling the absent characteristic data according to other characteristic data of the log to be processed; and determining each log to be processed which is reserved after the deletion and filling processing as security event data.

Specifically, the aggregation condition of the aggregation processing is the generation time of the log or the number of logs. The aggregating process of the logs to be processed may be filtering repeated logs having the same feature data and the device information to which the logs belong.

The padding processing of the log to be processed may be padding of basic information of the log to be processed. According to the type of the characteristic data which the log to be processed should include, if one or more characteristic data is absent in a certain log to be processed, filling processing is carried out on the absent characteristic data according to other characteristic data of the log to be processed, and therefore the characteristic data of the log to be processed is perfected.

For example, if the feature data of a certain log is recognized to include an IP address but the region information of the log is not recognized, the region to which the IP address of the log belongs may be queried to obtain the region information corresponding to the log, and this region information is filled into the feature data of the log so as to complete it.

Step S204, inputting each piece of security event data into the attack prediction model in order of generation time, so that the attack prediction model determines the probability that a network attack behavior will occur according to the association relationships among the multiple pieces of security event data.

The attack prediction model is a machine learning model. Based on a machine learning method, it may be obtained by training with a second training sample set and testing with a second test sample set until the model converges. The attack prediction model is used for predicting whether a network attack behavior will occur.

In this embodiment, the content of the second knowledge base is basic network attack rules; network attack rules may also be formulated through third-party network security devices, and configuration is supported. The network attack rule data in the second knowledge base is cleaned to obtain a second training sample set used for training a preset attack prediction model and a second test sample set used for testing the attack prediction model.

Each second training sample in the second training sample set and each second test sample in the second test sample set is a determined association relationship among the pieces of security event data in a network attack behavior.

After the second training sample set and the second testing sample set are obtained, the second training sample is adopted to train the attack prediction model, and the second testing sample is adopted to test the attack prediction model until the attack prediction model is converged, so that the attack prediction model is obtained.

Step S205, comparing the probability of the network attack behavior with a preset attack threshold.

After the probability of the network attack behavior is obtained, the probability of the network attack behavior is compared with a preset attack threshold value, whether the network attack behavior occurs is predicted according to the comparison result, and a prediction result is obtained.

The preset attack threshold may be set by a technician according to actual needs, and this embodiment is not specifically limited herein.

Step S206, if the probability of the network attack behavior is greater than the preset attack threshold, outputting a prediction result indicating that a network attack behavior will occur.

If the probability of the network attack behavior is greater than the preset attack threshold, the probability is considered high, and the prediction result is determined to be that a network attack behavior will occur.

Optionally, after the occurrence of the network attack behavior is predicted, a preset prevention process corresponding to the predicted network attack behavior may be performed. The preset processing may be set by a technician according to an actual application scenario, and this embodiment is not specifically limited herein.

For example, warning information may be sent to a technician in a preset manner, or measures such as disconnecting the network may be taken directly.

Step S207, if the probability of the network attack behavior is less than or equal to the preset attack threshold, outputting a prediction result indicating that no network attack behavior will occur.

If the probability of the network attack behavior is less than or equal to the preset attack threshold, the probability is considered not high enough, and it is predicted that no network attack behavior will occur.

The above steps S205-S207 are one possible implementation of predicting, in step S103, whether a network attack behavior will occur according to the security event data and the attack prediction model.
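The comparison in steps S205 to S207 reduces to a strict greater-than test, which the following minimal sketch makes explicit. The function name and the range check are assumptions; the threshold value is chosen by the technician and is not fixed by the embodiment.

```python
def predict_attack(probability, attack_threshold):
    """Return True if a network attack behavior is predicted to occur.

    An attack is predicted only when the probability is strictly greater
    than the preset attack threshold (S206); a probability less than or
    equal to the threshold yields "no attack" (S207).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return probability > attack_threshold
```

Note that a probability exactly equal to the threshold falls on the "no attack" side, matching the wording of step S207.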

Example three

Fig. 3 is a schematic structural diagram of a device for predicting network attack behavior according to a third embodiment of the present invention. The device provided by this embodiment can execute the processing flow provided by the embodiments of the method for predicting network attack behavior. As shown in Fig. 3, the device 30 for predicting network attack behavior includes: a log parsing module 302, an aggregation filling module 303 and a prediction processing module 304.

Specifically, the log parsing module 302 is configured to perform feature extraction and identification on the log to be processed according to the log parsing model, so as to obtain the feature data of the log to be processed and the information of the device to which the log belongs.

The aggregation filling module 303 is configured to determine security event data according to the feature data of the log to be processed and the device information.

The prediction processing module 304 is configured to predict whether a network attack behavior will occur according to the security event data and the attack prediction model.
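By way of a hedged example, the work of the aggregation filling module 303 could be sketched in Python as below, following the deletion and filling steps recited in claim 2. The field names (`event_type`, `severity`, `device`) and the lookup table used to fill a missing value are hypothetical; the specification does not state which features exist or how a missing feature is derived from the others.

```python
from typing import Dict, List, Optional

Log = Dict[str, Optional[str]]  # parsed feature data plus device information

# Hypothetical filling rule: a missing "severity" is inferred from "event_type".
SEVERITY_BY_EVENT = {"port_scan": "low", "brute_force": "high"}


def to_security_events(parsed_logs: List[Log]) -> List[Log]:
    """Delete duplicate logs, fill missing features, return security event data."""
    seen = set()
    events: List[Log] = []
    for log in parsed_logs:
        # Logs with identical feature data and device information are duplicates.
        key = tuple(sorted(log.items(), key=lambda kv: kv[0]))
        if key in seen:
            continue
        seen.add(key)
        filled = dict(log)
        if filled.get("severity") is None:
            # Fill the missing feature from another feature of the same log.
            filled["severity"] = SEVERITY_BY_EVENT.get(
                filled.get("event_type"), "unknown")
        events.append(filled)
    return events


logs = [
    {"event_type": "port_scan", "severity": "low", "device": "fw-01"},
    {"event_type": "port_scan", "severity": "low", "device": "fw-01"},
    {"event_type": "brute_force", "severity": None, "device": "fw-01"},
]
events = to_security_events(logs)
```

In this toy input the second log is dropped as a duplicate, and the third log has its missing `severity` filled from its `event_type`.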

The apparatus provided in the embodiment of the present invention may be specifically configured to execute the method embodiment provided in the first embodiment, and specific functions are not described herein again.

Example four

Fig. 4 is a schematic structural diagram of a device for predicting network attack behavior according to a fourth embodiment of the present invention. On the basis of the third embodiment, in this embodiment, the feature data of the log to be processed includes generation time, and the prediction processing module is specifically configured to:

inputting each piece of security event data into the attack prediction model in order of generation time, so that the attack prediction model determines the probability of a network attack behavior according to the association relationship among a plurality of pieces of security event data; comparing the probability of the network attack behavior with a preset attack threshold; if the probability of the network attack behavior is greater than the preset attack threshold, outputting a prediction result indicating that a network attack behavior will occur; and if the probability of the network attack behavior is less than or equal to the preset attack threshold, outputting a prediction result indicating that no network attack behavior will occur.
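The role of the generation time can be illustrated with a small Python sketch. `ToyChainModel` is only a stand-in invented for this example: it tracks how far an assumed attack chain has been matched in order, and it is not the attack prediction model of the embodiment. The event types and timestamps are likewise hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence


@dataclass(frozen=True)
class SecurityEvent:
    generated_at: datetime  # generation time from the feature data
    event_type: str


class ToyChainModel:
    """Stand-in model (assumption): probability is the matched share of a chain."""

    def __init__(self, chain: Sequence[str]):
        self.chain = list(chain)
        self.matched = 0

    def observe(self, event: SecurityEvent) -> float:
        # Advance only when the next expected event type arrives in order.
        if (self.matched < len(self.chain)
                and event.event_type == self.chain[self.matched]):
            self.matched += 1
        return self.matched / len(self.chain)


def attack_probability(events: List[SecurityEvent], model: ToyChainModel) -> float:
    probability = 0.0
    # Feed events to the model in order of generation time.
    for event in sorted(events, key=lambda e: e.generated_at):
        probability = model.observe(event)
    return probability


events = [
    SecurityEvent(datetime(2018, 10, 22, 9, 5), "brute_force"),
    SecurityEvent(datetime(2018, 10, 22, 9, 0), "port_scan"),
    SecurityEvent(datetime(2018, 10, 22, 9, 9), "login_ok"),
]
chain = ["port_scan", "brute_force", "privilege_escalation"]
p = attack_probability(events, ToyChainModel(chain))
```

Here the events arrive out of order; after sorting, two of the three chain steps are matched, so `p` is 2/3, which would exceed an assumed threshold of 0.5. Fed in arrival order, the same toy model would match only one step, which is why the ordering matters for any model that relies on the association among events.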

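Optionally, the aggregation filling module 303 is specifically configured to:

deleting duplicate logs to be processed that have the same feature data and the same device information; if a log to be processed lacks some feature data, filling in the missing feature data according to the other feature data of that log; and determining each log to be processed that remains after the deletion and filling as security event data.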

Optionally, as shown in Fig. 4, the device 30 for predicting network attack behavior may further include a log parsing model training module 305.

The log parsing model training module 305 is configured to:

acquiring a first training sample set and a first testing sample set, wherein the sample data of each first training sample in the first training sample set and of each first testing sample in the first testing sample set comprises a log generated by a known network device and label data of the log, and the label data comprises the feature data of the log and the information of the device to which the log belongs; and training the log parsing model using the first training samples and testing the log parsing model using the first testing samples until the log parsing model converges, so as to obtain an optimized log parsing model.

Optionally, the log parsing model training module 305 is further configured to:

if identification of the log to be processed according to the log parsing model fails, receiving feature data of the log to be processed and information of the device to which the log belongs; taking the received feature data and device information as label data of the log to be processed, and storing the log to be processed together with its label data into the first training sample set so as to update the first training sample set; and updating the log parsing model according to the updated first training sample set.
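The update flow described for the log parsing model training module 305 might look like the following Python sketch. All names here (`parse_with_feedback`, `request_label`, `retrain`) are hypothetical, and the "model" is reduced to a dictionary lookup purely so the example runs; how labels are received and how the model is actually updated are not specified at this level of detail.

```python
from typing import Callable, Dict, List, Optional, Tuple

Label = Dict[str, str]       # feature data plus device information
Sample = Tuple[str, Label]   # (raw log, label data)


def parse_with_feedback(
    raw_log: str,
    parse: Callable[[str], Optional[Label]],
    request_label: Callable[[str], Label],
    training_set: List[Sample],
    retrain: Callable[[List[Sample]], None],
) -> Label:
    label = parse(raw_log)
    if label is not None:
        return label                       # identification succeeded
    label = request_label(raw_log)         # e.g. supplied by a technician
    training_set.append((raw_log, label))  # update the first training sample set
    retrain(training_set)                  # update the model from the updated set
    return label


# Toy stand-ins so the sketch is runnable.
known: Dict[str, Label] = {}
training_set: List[Sample] = []
requests: List[str] = []


def request_label(raw_log: str) -> Label:
    requests.append(raw_log)
    return {"event_type": "brute_force", "device": "firewall-01"}


def retrain(samples: List[Sample]) -> None:
    known.update(dict(samples))


line = "fw: 5 failed logins"
first = parse_with_feedback(line, known.get, request_label, training_set, retrain)
second = parse_with_feedback(line, known.get, request_label, training_set, retrain)
```

The first call fails to identify the log and asks for a label; the second call is answered by the updated stand-in model without a further request.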

Optionally, as shown in Fig. 4, the device 30 for predicting network attack behavior may further include an attack prediction model training module 306.

The attack prediction model training module 306 is configured to:

acquiring a second training sample set and a second testing sample set, wherein each second training sample in the second training sample set and each second testing sample in the second testing sample set is a known association relationship among the security event data involved in a network attack behavior;

and training the attack prediction model using the second training samples and testing the attack prediction model using the second testing samples until the attack prediction model converges, so as to obtain the trained attack prediction model.
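A train-and-test loop of this kind can be sketched in Python as follows. The specification does not define the convergence criterion; the sketch assumes the model has converged when the test error changes by less than a tolerance between consecutive epochs. The `Trainable` interface and `HalvingModel` are hypothetical and exist only to make the loop runnable.

```python
from typing import Any, Protocol, Sequence


class Trainable(Protocol):
    """Hypothetical interface for a model that can be trained and tested."""

    def train_one_epoch(self, samples: Sequence[Any]) -> None: ...

    def evaluate(self, samples: Sequence[Any]) -> float: ...


def train_until_converged(model: Trainable, train_set: Sequence[Any],
                          test_set: Sequence[Any], tolerance: float = 1e-3,
                          max_epochs: int = 100) -> int:
    """Alternate training and testing; return the number of epochs run."""
    previous_error = float("inf")
    for epoch in range(1, max_epochs + 1):
        model.train_one_epoch(train_set)
        error = model.evaluate(test_set)
        # Assumed criterion: stop once the test error has stabilized.
        if abs(previous_error - error) < tolerance:
            return epoch
        previous_error = error
    return max_epochs


class HalvingModel:
    """Toy model whose test error halves on every epoch."""

    def __init__(self) -> None:
        self.error = 1.0

    def train_one_epoch(self, samples: Sequence[Any]) -> None:
        self.error /= 2

    def evaluate(self, samples: Sequence[Any]) -> float:
        return self.error


epochs = train_until_converged(HalvingModel(), train_set=[], test_set=[])
```

The `max_epochs` cap is a practical safeguard added for the sketch so that the loop terminates even if the assumed criterion is never met.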

The apparatus provided in the embodiment of the present invention may be specifically configured to execute the method embodiment provided in the second embodiment, and specific functions are not described herein again.

Example five

Fig. 5 is a schematic structural diagram of a network attack behavior prediction device according to a fifth embodiment of the present invention. As shown in Fig. 5, the device 50 includes: a processor 501, a memory 502, and a computer program stored on the memory 502 and executable by the processor 501.

The processor 501, when executing a computer program stored on the memory 502, implements the method for network attack behavior prediction provided by any of the above-described method embodiments.

In addition, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method for predicting network attack behavior provided in any of the above method embodiments is implemented.

In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical functional division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.

The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. A method for predicting network attack behavior, comprising:

predicting whether network attack behaviors occur or not according to the security event data and an attack prediction model;

before extracting and identifying the features of the log to be processed according to the log analysis model to obtain the feature data of the log to be processed and the information of the equipment, the method further comprises the following steps:

acquiring a first training sample set and a first testing sample set, wherein sample data in the first training sample set and the first testing sample set comprises a log generated by the determined network equipment and label data of the log, and the label data comprises characteristic data of the log and information of the equipment;

and training a log analysis model by using the first training sample set, and testing the log analysis model by using the first testing sample set until the log analysis model is converged to obtain the log analysis model.

2. The method according to claim 1, wherein the determining of the security event data according to the feature data of the log to be processed and the device information includes:

deleting duplicate logs to be processed that have the same feature data and the same device information;

if a log to be processed lacks some feature data, filling in the missing feature data according to the other feature data of that log;

and determining each log to be processed which is reserved after the deletion and filling processing as security event data.

3. The method according to claim 1, wherein the feature data of the log to be processed includes generation time, and predicting whether a network attack behavior occurs according to the security event data and an attack prediction model specifically includes:

inputting each piece of security event data into the attack prediction model in order of generation time, so that the attack prediction model determines the probability of a network attack behavior according to the association relationship among a plurality of pieces of security event data;

comparing the probability of the network attack behavior with a preset attack threshold;

if the probability of the network attack behavior is greater than the preset attack threshold, outputting a prediction result indicating that a network attack behavior will occur;

and if the probability of the network attack behavior is less than or equal to the preset attack threshold, outputting a prediction result indicating that no network attack behavior will occur.

4. The method of claim 1, wherein before predicting whether a network attack behavior occurs according to the security event data and an attack prediction model, the method further comprises:

and training an attack prediction model by using the second training sample, and testing the attack prediction model by using the second test sample until the attack prediction model is converged to obtain the attack prediction model.

5. The method of claim 1, further comprising:

if identification of the log to be processed according to the log analysis model fails, receiving feature data of the log to be processed and information of the device to which the log belongs;

taking the received feature data and the information of the device as label data of the log to be processed, and storing the log to be processed and the label data of the log to be processed into the first training sample set so as to update the first training sample set;

and updating the log analysis model according to the updated first training sample set.

6. An apparatus for predicting network attack behavior, comprising:

the prediction processing module is used for predicting whether network attack behaviors occur or not according to the security event data and the attack prediction model;

the device for predicting the network attack behavior further comprises: a log analysis model training module;

the log analysis model training module is used for:

acquiring a first training sample set and a first testing sample set, wherein sample data of each first training sample in the first training sample set and each first testing sample in the first testing sample set comprise a log generated by the determined network device and label data of the log, and the label data comprises feature data of the log and information of the device; and training the log analysis model by adopting the first training sample, and testing the log analysis model by adopting the first testing sample until the log analysis model is converged to obtain an optimized log analysis model.

7. The apparatus according to claim 6, wherein the feature data of the log to be processed includes a generation time, and the prediction processing module is specifically configured to:

8. A network attack behavior prediction device, comprising:

the processor, when executing the computer program, implements the method of any of claims 1-5.

9. A computer-readable storage medium, in which a computer program is stored,

the computer program, when executed by a processor, implementing the method of any one of claims 1-5.