CN114844691B - Data processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114844691B
Authority
CN
China
Prior art keywords
data
label
processed
primary
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210419635.XA
Other languages
Chinese (zh)
Other versions
CN114844691A (en)
Inventor
任洪伟
沈长伟
肖新光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Antiy Technology Group Co Ltd
Original Assignee
Antiy Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Antiy Technology Group Co Ltd filed Critical Antiy Technology Group Co Ltd
Priority to CN202210419635.XA priority Critical patent/CN114844691B/en
Publication of CN114844691A publication Critical patent/CN114844691A/en
Application granted granted Critical
Publication of CN114844691B publication Critical patent/CN114844691B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0807 Network architectures or network communication protocols for network security for authentication of entities using tickets, e.g. Kerberos
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring data to be processed; extracting features of the data to be processed to obtain at least one variable feature; inputting each variable feature into a corresponding primary label determining module to obtain a primary label; determining, according to the primary labels corresponding to the data to be processed, at least one secondary label determining module corresponding to the data to be processed; and inputting the primary labels of the data to be processed into each secondary label determining module to obtain at least one secondary label. With this method, the primary labels, which describe basic properties of the data to be processed, are determined from its variable features. A secondary label determining module is then selected according to those primary labels and analyzes them to determine the secondary label of the data to be processed. In this way, a deeper analysis is built on the basic variable features of the data to be processed, and its threat event type is determined.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of information security, and in particular, to a data processing method, apparatus, electronic device, and storage medium.
Background
With the development of computer networks, internet technology has come into wide use and has greatly changed people's work and lives. Networks have become an integral part of daily life. However, while people enjoy the conveniences brought by information technology, they also face threats from various malicious network events, which may even affect important matters such as social stability.
In recent years, attackers have increasingly used more covert methods for information theft or network sabotage, so such activity is difficult to discover and handle effectively with conventional detection methods or a single security device.
Disclosure of Invention
In view of the foregoing, the present application provides a data processing method, apparatus, electronic device, and storage medium, which at least partially solve the problems in the prior art.
According to one aspect of the present application, there is provided a data processing method, comprising:
acquiring data to be processed;
extracting features of the data to be processed to obtain at least one variable feature;
inputting each variable feature into a corresponding primary label determining module respectively to obtain a primary label corresponding to each variable feature; the primary label is used for representing the feature type of the corresponding variable feature;
determining at least one secondary label determining module corresponding to the data to be processed according to the primary label corresponding to the data to be processed;
respectively inputting the primary labels corresponding to the data to be processed into each secondary label determining module to obtain at least one secondary label corresponding to the data to be processed; the secondary label is used for representing the threat event type of the data to be processed.
In an exemplary embodiment of the present application, extracting features of the data to be processed to obtain at least one variable feature includes:
determining the data type of the data to be processed;
acquiring a corresponding feature extraction rule according to the data type;
and carrying out feature processing on the data to be processed according to the feature extraction rule to obtain at least one variable feature.
In an exemplary embodiment of the present application, the method further comprises:
determining at least one tertiary label determining module corresponding to the data to be processed according to the primary label and the secondary label corresponding to the data to be processed;
respectively inputting the primary label and the secondary label corresponding to the data to be processed into each tertiary label determining module to obtain at least one tertiary label corresponding to the data to be processed; the tertiary label is used for representing the threat event type of the data to be processed, and the threat degree corresponding to the threat event type represented by the tertiary label is higher than the threat degree corresponding to the threat event type represented by the secondary label.
In an exemplary embodiment of the present application, the determining, according to the primary label corresponding to the data to be processed, at least one secondary label determining module corresponding to the data to be processed includes:
acquiring a label relation graph; the label relation graph comprises a plurality of primary labels and a plurality of secondary labels, and is used for representing the correspondence between the primary labels and the secondary labels;
determining at least one target secondary label according to the label relation graph and the primary label corresponding to the data to be processed;
obtaining a combination relation corresponding to each target secondary label;
and determining and combining the corresponding primary label determining modules according to each combination relation in turn to obtain at least one secondary label determining module.
In an exemplary embodiment of the present application, the method further comprises:
obtaining a candidate combination relation; the candidate combination relation has a corresponding candidate secondary label;
acquiring processed data corresponding to the candidate secondary labels; the processed data has the candidate secondary labels;
determining and combining corresponding primary label determining modules according to the candidate combination relation to obtain candidate secondary label determining modules;
inputting the processed data into the candidate secondary label determination module;
and if the candidate secondary label determining module can output the candidate secondary label, establishing an association relationship between the candidate combination relation and the candidate secondary label.
In an exemplary embodiment of the present application, after the establishing the association relationship between the candidate combination relation and the candidate secondary label, the method further includes:
determining at least two target primary labels corresponding to the candidate combination relation;
determining, in sequence and according to the label relation graph, whether each target primary label has an established association relationship with the candidate secondary label;
if not, establishing the association relationship between the target primary label and the candidate secondary label so as to update the label relation graph.
In an exemplary embodiment of the present application, the method further comprises:
displaying a data name and a secondary label corresponding to the data to be processed;
when the secondary label is selected, at least part of the primary label and/or at least part of variable characteristics corresponding to the data to be processed can be displayed.
According to an aspect of the present application, there is provided a data processing apparatus comprising:
the acquisition module is used for acquiring data to be processed;
the extraction module is used for extracting features of the data to be processed to obtain at least one variable feature;
the first processing module is used for inputting each variable feature into the corresponding primary label determining module respectively so as to obtain a primary label corresponding to each variable feature; the primary label is used for representing the feature type of the corresponding variable feature;
the determining module is used for determining at least one secondary label determining module corresponding to the data to be processed according to the primary label corresponding to the data to be processed;
the second processing module is used for respectively inputting the primary labels corresponding to the data to be processed into each secondary label determining module so as to obtain at least one secondary label corresponding to the data to be processed; the secondary label is used for representing the threat event type of the data to be processed.
According to one aspect of the present application, there is provided an electronic device comprising a processor and a memory;
the processor is configured to perform the steps of any of the methods described above by invoking a program or instruction stored in the memory.
According to one aspect of the present application, there is provided a computer-readable storage medium storing a program or instructions that cause a computer to perform the steps of any one of the methods described above.
According to the data processing method, after the data to be processed is obtained, the primary labels, which describe basic properties of the data to be processed, can be determined from its variable features. After the primary labels are obtained, a secondary label determining module corresponding to the data to be processed is determined according to those primary labels, and finally the primary labels are analyzed by the secondary label determining module to determine the secondary label corresponding to the data to be processed. In this way, a deeper analysis is built on the basic variable features of the data to be processed, and its threat event type is determined.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data processing method according to the present embodiment;
FIG. 2 is a block diagram of a data processing apparatus according to the present embodiment.
Detailed Description
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
It should be noted that, without conflict, the following embodiments and features in the embodiments may be combined with each other; and, based on the embodiments in this disclosure, all other embodiments that may be made by one of ordinary skill in the art without inventive effort are within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
Referring to fig. 1, according to an aspect of the present application, there is provided a data processing method, including the steps of:
Step S100, acquiring data to be processed.
Step S200, extracting features of the data to be processed to obtain at least one variable feature.
Step S300, inputting each variable feature into a corresponding primary label determining module to obtain a primary label corresponding to each variable feature; the primary label is used for representing the feature type of the corresponding variable feature.
Step S400, determining at least one secondary label determining module corresponding to the data to be processed according to the primary label corresponding to the data to be processed.
Step S500, inputting the primary labels corresponding to the data to be processed into each secondary label determining module respectively to obtain at least one secondary label corresponding to the data to be processed; the secondary label determining module comprises at least two primary label determining modules; the secondary label is used for representing the threat event type of the data to be processed.
The data to be processed may be obtained from the work logs or traffic data of a controlled electronic device, and may include network traffic data, endpoint data, file data, and the like. After the data is obtained, it may be standardized according to the detection scenario or object. Specifically, a unified standard format may be defined for each combination of detection scenario and data object; for example, network traffic data, endpoints, files, and other objects are standardized through data preprocessing, fault-tolerance checks, rejection of malformed records, and the like. This facilitates the subsequent extraction of variable features.
A variable feature is an element of the data whose value can differ between records of the same type. For example, in ping data, the source address and the destination address differ from record to record, so each can be regarded as a variable feature. Because the actual values of variable features are so diverse, a threat event cannot be determined effectively from the raw variable features themselves; likewise, a threat event is difficult to determine comprehensively or in depth from invariant data features alone.
In this embodiment, the primary label determining module may take the form of a rule expression, that is, a meaningful arrangement, for detection and analysis, of numbers, operators, numeric grouping symbols, free variables, bound variables, and the like. Bound variables have been assigned values within the expression, while free variables may be assigned values from outside the expression. Each rule expression can analyze variable features of at least one feature type and obtain the primary label corresponding to those variable features. For example, the feature type corresponding to rule expression A (primary label determining module A) is "extension", the input variable feature is ".doc", and the output primary label is "word document". In some cases, a primary label determining module may instead make a yes/no determination, for example, whether or not the data is a Word document.
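As a minimal sketch of the ".doc" example above, a primary label determining module can be modeled as a small rule object that maps one variable feature to one primary label. The class name `RuleExpression`, the helper `make_extension_rule`, and the case-insensitive matching are assumptions made for illustration; the source does not prescribe any concrete data structure.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class RuleExpression:
    """Hypothetical shape of a primary label determining module."""
    feature_type: str                  # e.g. "extension"
    predicate: Callable[[str], bool]   # the rule applied to the variable feature
    label: str                         # primary label emitted when the rule matches

    def evaluate(self, value: str) -> Optional[str]:
        # Return the primary label when the rule matches, otherwise None.
        return self.label if self.predicate(value) else None


def make_extension_rule(extensions: Iterable[str], label: str) -> RuleExpression:
    # `allowed` is assigned here, outside the predicate itself, which loosely
    # mirrors a variable whose value is supplied from outside the expression.
    allowed = frozenset(e.lower() for e in extensions)
    return RuleExpression("extension", lambda v: v.lower() in allowed, label)


# Rule expression A from the text: feature type "extension", ".doc" -> "word document".
rule_a = make_extension_rule([".doc"], "word document")
print(rule_a.evaluate(".doc"))
```

A rule that returns `None` simply contributes no primary label, which keeps later label combination logic free of special cases.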
In this embodiment, there may be a plurality of primary label determining modules. After the variable features are obtained, each variable feature may be input into the corresponding primary label determining module to obtain the corresponding primary label. In some embodiments, some primary label determining modules may take multiple variable features as input. As can be seen from the above, a primary label is obtained directly from explicit, inherent information (the variable features) of the data to be processed. It is therefore a fact label, that is, an attribute that can be determined unambiguously, and subsequent analysis can be based on the corresponding primary labels.
In this embodiment, the secondary label determining module is composed of a plurality of primary label determining modules and, in some cases, may also include individual ungraded label determining modules, which analyze and identify certain fixed features (i.e., non-variable features). Here, "composed of" means that a plurality of primary label determining modules are combined through logical relationships and an input order to form a judgment logic that determines the secondary label.
The secondary label is used for representing threat event types of the data to be processed, such as APT attack, phishing attack and the like.
It can be appreciated that a primary label characterizes inherent information of the data to be processed; having any particular primary label does not by itself mean that the data to be processed is malicious.
According to the data processing method, after the data to be processed is obtained, the primary labels, which describe basic properties of the data to be processed, can be determined from its variable features. After the primary labels are obtained, a secondary label determining module corresponding to the data to be processed is determined according to those primary labels, and finally the primary labels are analyzed by the secondary label determining module to determine the secondary label corresponding to the data to be processed. In this way, a deeper analysis is built on the basic variable features of the data to be processed, and its threat event type is determined.
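The flow of steps S100 to S500 can be sketched end to end as follows. This is only an illustrative outline under stated assumptions: the record fields (`file_name`, `has_macro`), the primary label "contains macro", and the rule that ties these labels to "phishing attack" are all hypothetical and are not taken from the source.

```python
from typing import Callable, Dict, FrozenSet, Optional, Set, Tuple


def extract_features(record: Dict[str, str]) -> Dict[str, str]:
    # Step S200 (hypothetical rule): derive variable features from a raw record.
    name = record.get("file_name", "")
    ext = "." + name.rsplit(".", 1)[1].lower() if "." in name else ""
    return {"extension": ext, "has_macro": record.get("has_macro", "no")}


# Step S300: one primary label determining module per feature type.
PRIMARY_MODULES: Dict[str, Callable[[str], Optional[str]]] = {
    "extension": lambda v: "word document" if v in (".doc", ".docx") else None,
    "has_macro": lambda v: "contains macro" if v == "yes" else None,
}

# Steps S400/S500: each secondary label lists the primary labels it requires.
# The pairing below is an assumption made for the example only.
SECONDARY_RULES: Dict[str, FrozenSet[str]] = {
    "phishing attack": frozenset({"word document", "contains macro"}),
}


def process(record: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    features = extract_features(record)
    primary: Set[str] = set()
    for feature_type, value in features.items():
        module = PRIMARY_MODULES.get(feature_type)
        label = module(value) if module else None
        if label is not None:
            primary.add(label)
    # A secondary label is produced only when all its primary labels are present.
    secondary = {name for name, required in SECONDARY_RULES.items()
                 if required <= primary}
    return primary, secondary


primary_labels, secondary_labels = process(
    {"file_name": "invoice.DOC", "has_macro": "yes"})
print(primary_labels, secondary_labels)
```

Note that neither primary label alone marks the record as malicious here; only their combination yields a threat event type, in line with the remark above about primary labels.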
In an exemplary embodiment of the present application, extracting features of the data to be processed to obtain at least one variable feature includes:
determining the data type of the data to be processed;
acquiring a corresponding feature extraction rule according to the data type;
and carrying out feature processing on the data to be processed according to the feature extraction rule to obtain at least one variable feature.
In this embodiment, the data type may be network traffic data, endpoint data, file data, and the like. Each data type has at least one feature extraction rule, which specifies which variable features should be extracted from the corresponding data to be processed, as well as the extraction method and/or standardization method for each variable feature. This enables targeted feature extraction for different data to be processed and avoids unnecessary feature extraction.
For example, variable identification and extraction for network traffic data covers the source IP, destination IP, source port, destination port, protocol, initiation time, packet size, packet type, transmission direction, domain name, URL, user_agent, request status code, transmitted files, and the like. If a mail protocol is used, mail metadata such as the sender, recipient, sender name, recipient name, mail body content, links in the mail body, and mail attachments must be further analyzed and identified, and the resulting variable values are used to support the subsequent construction of rule expressions.
Variable identification and extraction for endpoint data covers elements such as processes, services, the registry, scheduled tasks, and system account information. The process elements include the process name, process ID, actions (creation, suspension, activation, and the like), parameter values, resource usage, thread count, child process names, child process IDs, parent process name, parent process ID, network behavior, and the like; the service elements include the service name, process ID, description, start type, login identity, group, and the like; the scheduled task elements include the name, location, trigger, action, last run time, next run time, run result, creating account, and the like; the system account elements include the account, creation time, earliest login time, latest login time, and the like.
Variable identification and extraction for file data covers the file name, file extension, file format, file hash, file size, file location, whether the file contains macros, mutexes, version information, and the like.
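One way to picture "a feature extraction rule per data type" is a registry keyed by data type, as in the sketch below. The record keys (`src`, `dst`, `dport`, `path`, `size`), the type detection heuristic, and the function names are assumptions chosen for the example; only a few of the features listed above are shown.

```python
from pathlib import PurePosixPath
from typing import Any, Callable, Dict

Record = Dict[str, Any]

# Hypothetical registry: data type -> {variable feature name: extractor}.
EXTRACTION_RULES: Dict[str, Dict[str, Callable[[Record], Any]]] = {
    "network_traffic": {
        "source_ip": lambda r: r["src"],
        "destination_ip": lambda r: r["dst"],
        "destination_port": lambda r: int(r["dport"]),
    },
    "file": {
        "file_name": lambda r: PurePosixPath(r["path"]).name,
        # Lower-casing is a simple standardization step applied during extraction.
        "file_extension": lambda r: PurePosixPath(r["path"]).suffix.lower(),
        "file_size": lambda r: int(r["size"]),
    },
}


def determine_data_type(record: Record) -> str:
    # Assumed heuristic: infer the data type from which keys are present.
    if "src" in record and "dst" in record:
        return "network_traffic"
    if "path" in record:
        return "file"
    raise ValueError("unknown data type")


def extract_variable_features(record: Record) -> Dict[str, Any]:
    # Only the extractors registered for this data type are run.
    rules = EXTRACTION_RULES[determine_data_type(record)]
    return {name: extractor(record) for name, extractor in rules.items()}


print(extract_variable_features({"path": "/tmp/Report.DOC", "size": "2048"}))
```

Because each data type carries its own rule set, file records never pay for network parsing and vice versa, which is the "targeted extraction" the embodiment describes.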
In an exemplary embodiment of the present application, the method further comprises:
determining at least one tertiary label determining module corresponding to the data to be processed according to the primary label and the secondary label corresponding to the data to be processed;
respectively inputting the primary label and the secondary label corresponding to the data to be processed into each tertiary label determining module to obtain at least one tertiary label corresponding to the data to be processed; the tertiary label determining module comprises at least one primary label determining module and at least one secondary label determining module;
the tertiary label is used for representing the threat event type of the data to be processed, and the threat degree corresponding to the threat event type represented by the tertiary label is higher than the threat degree corresponding to the threat event type represented by the secondary label. The threat degree may be determined based on the scope of influence, the degree of influence, and the like.
In some cases, an advanced malicious event (i.e., one with a higher threat degree) cannot be determined directly from the primary labels. If analysis is attempted before the corresponding secondary label has been determined, the analysis failure rate may be too high, or it may be impossible to decide whether the analysis is needed at all. A policy of always analyzing whenever it is unclear whether analysis is needed could be adopted, but it would place excessive processing pressure on the system.
Therefore, this embodiment does not adopt the analyze-when-in-doubt policy. Instead, the secondary label is determined first. Then, according to the primary and secondary labels determined for the data to be processed, it is determined whether a corresponding tertiary label determining module exists; if so, that module is obtained and the primary and secondary labels are input into it to determine the tertiary label. Analysis thus proceeds level by level and in a targeted manner, which avoids adding processing pressure through unnecessary analysis.
It will be appreciated that in some embodiments there may be fourth-level or higher-level labels. These higher-level labels may be determined in the same way as tertiary labels, which is not described in detail in this embodiment.
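The gating idea described above (run a tertiary label determining module only when the labels already obtained call for it) might be sketched as follows. The label names `P1`, `S1`, `T1`, and so on are placeholders, and the `trigger_secondary` field is an assumed way of recording which secondary label makes a tertiary module relevant.

```python
from typing import Callable, Dict, NamedTuple, Set


class TertiaryModule(NamedTuple):
    # Secondary label that must already be present before this module is run.
    trigger_secondary: str
    # Judgment logic over the primary and secondary labels.
    evaluate: Callable[[Set[str], Set[str]], bool]


# Hypothetical tertiary label determining modules, keyed by tertiary label.
TERTIARY_MODULES: Dict[str, TertiaryModule] = {
    "T1": TertiaryModule("S1", lambda p, s: "P1" in p and "S1" in s),
    "T2": TertiaryModule("S2", lambda p, s: {"P2", "P3"} <= p and "S2" in s),
}


def tertiary_labels(primary: Set[str], secondary: Set[str]) -> Set[str]:
    # Select only the modules whose trigger is among the determined secondary
    # labels; with no secondary labels, nothing is evaluated at all.
    selected = {name: module for name, module in TERTIARY_MODULES.items()
                if module.trigger_secondary in secondary}
    return {name for name, module in selected.items()
            if module.evaluate(primary, secondary)}


print(tertiary_labels({"P1"}, {"S1"}))
```

The same pattern extends to fourth-level or higher labels by keying each level's modules on the labels of the level below.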
In an exemplary embodiment of the present application, the determining, according to the primary label corresponding to the data to be processed, at least one secondary label determining module corresponding to the data to be processed includes:
acquiring a label relation graph; the label relation graph comprises a plurality of primary labels and a plurality of secondary labels, and is used for representing the correspondence between the primary labels and the secondary labels;
determining at least one target secondary label according to the label relation graph and the primary label corresponding to the data to be processed;
obtaining a combination relation corresponding to each target secondary label;
and determining and combining the corresponding primary label determining modules according to each combination relation in turn to obtain at least one secondary label determining module.
In this embodiment, the label relation graph is used to characterize the relationships among all determined labels (primary, secondary, and higher). It may take the form of a knowledge graph, a chart, or the like, and may indicate that at least two low-level labels correspond to one high-level label. It is thus possible to determine which low-level labels a high-level label requires by consulting the label relation graph. The combination relation specifies, for the corresponding high-level label, the logical operation relations or operation order over the corresponding low-level labels by which the high-level label is determined. Each secondary label (high-level label) has a corresponding combination relation, which may be recorded in a mapping table or embedded directly in the label relation graph. Logical operation relations include AND, OR, NOT, exclusive OR, and the like.
For example, the secondary label D corresponds to three primary labels D1, D2, and D3, and its combination relation specifies that if D1 and D2 are both satisfied, and D3 is then also satisfied, the secondary label D is determined to be satisfied.
In this embodiment, the target secondary label is determined by comparing the plurality of primary labels corresponding to the data to be processed against the label relation graph to determine which secondary labels the current primary labels can satisfy. For example, suppose the primary labels of the data to be processed include D1, D2, D3, D4, and D5. According to the label relation graph, these include the three primary labels D1, D2, and D3 required by the secondary label D, so the secondary label D is determined to be a target secondary label.
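The D1 to D5 example can be reproduced with a very small lookup. Here the label relation graph is reduced to a dictionary from each secondary label to the primary labels it requires; this flat representation, and the extra secondary label `E`, are assumptions for illustration (the source allows a knowledge graph or chart).

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical label relation graph: secondary label -> required primary labels.
LABEL_GRAPH: Dict[str, FrozenSet[str]] = {
    "D": frozenset({"D1", "D2", "D3"}),
    "E": frozenset({"D4", "D6"}),  # not satisfiable below because D6 is missing
}


def target_secondary_labels(primary: Iterable[str],
                            graph: Dict[str, FrozenSet[str]]) -> List[str]:
    # A secondary label is a target when every primary label it requires
    # is among the primary labels of the data to be processed.
    present = set(primary)
    return sorted(name for name, required in graph.items() if required <= present)


print(target_secondary_labels(["D1", "D2", "D3", "D4", "D5"], LABEL_GRAPH))
```

This step only selects which secondary labels are worth evaluating; the logical relation and order among D1, D2, and D3 are applied afterwards by the combination relation.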
After the target secondary labels are determined, the primary label determining modules to be used can be determined from the corresponding combination relation and combined according to its logical operation relations and order to obtain the corresponding secondary label determining module. Therefore, in this embodiment, the secondary label determining module is not stored in the system in advance; it is generated on demand and can be released after the corresponding work is done.
Because a secondary label determining module includes a plurality of primary label determining modules, the same primary label determining module is often used in different secondary label determining modules. If the secondary label determining module for every secondary label were built in advance and stored in the system, the size of the system would grow considerably (because a large number of primary label determining modules would be duplicated), increasing the storage pressure on the system. In this embodiment, in the normal state, the system stores only the combination relation corresponding to each secondary label and generates the module on demand, which reduces the storage pressure on the system.
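A sketch of "store only the combination relation, build the module when needed" is given below. The nested-tuple encoding of a combination relation and the function names are assumptions; the leaf checks (is a given primary label present?) stand in for the primary label determining modules that the real module would combine.

```python
from typing import Callable, Dict, Set, Union

# Assumed encoding: a primary label name, or (operator, operand, operand, ...).
Relation = Union[str, tuple]

# Only the compact combination relations are kept in storage.
# "D" encodes: D1 and D2 both satisfied, then D3 also satisfied.
COMBINATIONS: Dict[str, Relation] = {
    "D": ("and", ("and", "D1", "D2"), "D3"),
}


def build_module(relation: Relation) -> Callable[[Set[str]], bool]:
    # Leaf: check that one primary label is present.
    if isinstance(relation, str):
        return lambda labels: relation in labels
    op, *operands = relation
    parts = [build_module(operand) for operand in operands]
    if op == "and":
        return lambda labels: all(part(labels) for part in parts)
    if op == "or":
        return lambda labels: any(part(labels) for part in parts)
    if op == "not":
        return lambda labels: not parts[0](labels)
    if op == "xor":
        return lambda labels: sum(part(labels) for part in parts) % 2 == 1
    raise ValueError(f"unsupported operator: {op}")


def evaluate_secondary(name: str, primary: Set[str]) -> bool:
    # The module exists only for the duration of this call and is then released.
    module = build_module(COMBINATIONS[name])
    return module(primary)


print(evaluate_secondary("D", {"D1", "D2", "D3"}))
```

Storing one short tuple per secondary label instead of a prebuilt module is what avoids duplicating shared primary label modules across many secondary modules.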
In an exemplary embodiment of the present application, the method further comprises:
obtaining a candidate combination relation; the candidate combination relation has a corresponding candidate secondary label;
acquiring processed data corresponding to the candidate secondary labels; the processed data has the candidate secondary labels;
determining and combining corresponding primary label determining modules according to the candidate combination relation to obtain candidate secondary label determining modules;
inputting the processed data into the candidate secondary label determination module;
and if the candidate secondary label determining module can output the candidate secondary label, establishing an association relationship between the candidate combination relation and the candidate secondary label.
The candidate combination relation is a combination relation that has not yet been confirmed to work normally. It may be entered manually by a worker, or obtained by a preset self-learning module/artificial intelligence module from an analysis of existing data. The candidate secondary label may be an existing secondary label, or a secondary label that has not yet been formally put into use.
After the candidate combination relation is obtained, its feasibility needs to be determined. In this embodiment, the candidate secondary label corresponding to the candidate combination relation is obtained, and processed data (i.e., data that has already been processed and labeled) having the candidate secondary label is acquired. The corresponding primary label determining modules are then selected according to the candidate combination relation and combined according to the logical operation relation, the operation sequence, and the like, to generate the candidate secondary label determining module. After the candidate secondary label determining module is generated, the processed data is input into it to determine whether it can normally generate the corresponding candidate secondary label. If so, the association relationship between the candidate combination relation and the candidate secondary label is established, and the candidate secondary label can be determined by means of the candidate combination relation in subsequent work.
It will be appreciated that the "candidate" designation above is only temporary: once its feasibility has been determined, a candidate combination relation may be treated as a normal combination relation.
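The feasibility check for a candidate combination relation can be sketched as below. This is a hedged illustration, not the applicant's implementation: it assumes that primary label determining modules are predicates over a feature dict, that a combination relation is an operator plus module names, and that "can output the candidate secondary label" means the composed module fires on every already-labeled sample. That acceptance policy, the registry name `ASSOCIATIONS`, and the example labels are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, object]
Relation = Tuple[str, List[str]]  # (logical operator, primary module names)

# Hypothetical primary label determining modules.
PRIMARY_MODULES: Dict[str, Callable[[Features], bool]] = {
    "port_scan": lambda f: int(f.get("distinct_ports", 0)) >= 100,
    "failed_logins": lambda f: int(f.get("failed_logins", 0)) >= 5,
}

# Confirmed associations: secondary label -> combination relation.
ASSOCIATIONS: Dict[str, Relation] = {}


def build_module(relation: Relation) -> Callable[[Features], bool]:
    """Combine the selected primary modules according to the relation."""
    operator, names = relation
    modules = [PRIMARY_MODULES[name] for name in names]
    combine = all if operator == "and" else any
    return lambda features: combine(module(features) for module in modules)


def validate_candidate(relation: Relation, candidate_label: str,
                       processed_data: List[Features]) -> bool:
    """Register the relation only if it reproduces the known label.

    `processed_data` holds samples already known to carry `candidate_label`.
    """
    module = build_module(relation)
    feasible = bool(processed_data) and all(module(s) for s in processed_data)
    if feasible:
        ASSOCIATIONS[candidate_label] = relation
    return feasible


labeled_samples = [
    {"distinct_ports": 250, "failed_logins": 9},
    {"distinct_ports": 120, "failed_logins": 6},
]
accepted = validate_candidate(("and", ["port_scan", "failed_logins"]),
                              "brute_force_recon", labeled_samples)
```

A stricter or looser acceptance rule (for example, a minimum hit ratio) could replace the `all(...)` check without changing the overall flow.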
In an exemplary embodiment of the present application, after establishing the association relationship between the candidate combination relation and the candidate secondary label, the method further includes:
determining at least two target primary labels corresponding to the candidate combination relation;
determining in turn, according to the label relation graph, whether an association relationship has been established between each target primary label and the candidate secondary label;
if not, establishing the association relationship between the target primary label and the candidate secondary label so as to update the label relation graph.
After the feasibility of the candidate combination relation has been determined, this embodiment checks, for each target primary label corresponding to the candidate combination relation, whether the label relation graph already records an association relationship (an influence relation and a hierarchical relation) between that target primary label and the candidate secondary label. If not, the relationship is established, so that the label relation graph can later be used to determine whether an association relationship exists between any two labels, regardless of their level. By consulting the label relation graph, staff can determine the association relationships among different types of data, which facilitates research on malicious network events.
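The graph update described above can be sketched as follows. The sketch assumes, for illustration only, that the label relation graph is held as a dict mapping each primary label to the set of secondary labels it is associated with; the function name `update_label_graph` and the example labels are hypothetical.

```python
from typing import Dict, List, Set


def update_label_graph(graph: Dict[str, Set[str]],
                       target_primary_labels: List[str],
                       candidate_secondary_label: str) -> List[str]:
    """Add any missing primary-to-secondary associations.

    Returns the primary labels for which a new association was created.
    """
    added = []
    for primary in target_primary_labels:
        linked = graph.setdefault(primary, set())
        if candidate_secondary_label not in linked:
            linked.add(candidate_secondary_label)
            added.append(primary)
    return added


graph = {"port_scan": {"reconnaissance"}, "failed_logins": set()}
new_edges = update_label_graph(graph, ["port_scan", "failed_logins"],
                               "brute_force_recon")
```

Calling the function a second time with the same arguments adds nothing, since each association is checked before it is created.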
In an exemplary embodiment of the present application, the method further comprises:
displaying a data name and a secondary label corresponding to the data to be processed;
when the secondary label is selected, at least part of the primary label and/or at least part of variable characteristics corresponding to the data to be processed can be displayed.
Specifically, the data name characterizes the identity of the event to be processed and may take the form of an ID or a person's name. "Display" refers to presentation on a display screen or another device with a display function, in the form of icons or the like, so that a worker can determine the status of the data to be processed and its corresponding labels from the displayed content. In this embodiment, each event to be processed may have many labels. For the worker's convenience, only the highest-level label (which may be a secondary, tertiary, or higher-level label) may be shown in the initial display; when the worker wants to view detailed information, the worker can click the secondary label to expand the lower-level information, such as the primary labels or variable features used to determine that secondary label. In some cases, all primary labels or variable features may also be displayed.
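A text-only sketch of this collapsed/expanded display is given below. It is an assumption-laden illustration: the application does not specify a user interface, so the `LabeledRecord` structure, the `render` function, and the `>`/`v` markers are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LabeledRecord:
    data_name: str
    # secondary label -> primary labels that determined it
    secondary_labels: Dict[str, List[str]]
    # primary label -> variable feature it was derived from (optional)
    variable_features: Dict[str, str] = field(default_factory=dict)


def render(record: LabeledRecord, selected: str = "") -> List[str]:
    """Return display lines; only the selected secondary label is expanded."""
    lines = [record.data_name]
    for secondary, primaries in record.secondary_labels.items():
        marker = "v" if secondary == selected else ">"
        lines.append(f"  {marker} {secondary}")
        if secondary == selected:
            for primary in primaries:
                feature = record.variable_features.get(primary, "")
                detail = f"{primary}: {feature}" if feature else primary
                lines.append(f"      {detail}")
    return lines


record = LabeledRecord(
    data_name="event-001",
    secondary_labels={"data_exfiltration": ["suspicious_domain", "large_upload"]},
    variable_features={"large_upload": "bytes_out=5000000"},
)
collapsed = render(record)
expanded = render(record, selected="data_exfiltration")
```

Here `collapsed` shows only the data name and the secondary label, while `expanded` additionally lists the primary labels and any recorded variable features beneath the selected secondary label.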
Referring to fig. 2, according to an aspect of the present application, there is provided a data processing apparatus, including:
the acquisition module is used for acquiring data to be processed;
the extraction module is used for extracting the characteristics of the data to be processed to obtain at least one variable characteristic;
the first processing module is used for inputting each variable feature into the corresponding primary label determining module respectively, so as to obtain a primary label corresponding to each variable feature; the primary label is used for representing the feature type of the corresponding variable feature;
the determining module is used for determining at least one secondary label determining module corresponding to the data to be processed according to the primary label corresponding to the data to be processed;
the second processing module is used for respectively inputting the primary labels corresponding to the data to be processed into each secondary label determining module so as to obtain at least one secondary label corresponding to the data to be processed; the secondary label determining module comprises at least two primary label determining modules; the secondary label is used for representing the threat event type of the data to be processed.
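The cooperation of these five modules can be sketched end to end as below. This is a minimal sketch under several assumptions that are not taken from the application: data arrives as a dict, each variable feature has exactly one primary label determining module keyed by feature name, the label relation graph is a dict from primary label to related secondary labels, and a generated secondary label determining module is simplified to a logical check over the primary labels it receives. All names, labels, and thresholds are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

RawData = Dict[str, object]


def extract_features(data: RawData) -> Dict[str, object]:
    """Extraction module: here, variable features are selected fields."""
    return {k: data[k] for k in ("domain", "bytes_out") if k in data}


# First processing module: one primary label determining module per feature.
PRIMARY_MODULES: Dict[str, Callable[[object], Optional[str]]] = {
    "domain": lambda v: "suspicious_domain" if str(v).endswith(".example-bad") else None,
    "bytes_out": lambda v: "large_upload" if int(v) > 1_000_000 else None,
}

# Label relation graph and stored combination relations.
LABEL_GRAPH: Dict[str, Set[str]] = {
    "suspicious_domain": {"data_exfiltration"},
    "large_upload": {"data_exfiltration"},
}
COMBINATIONS: Dict[str, Tuple[str, List[str]]] = {
    "data_exfiltration": ("and", ["suspicious_domain", "large_upload"]),
}


def primary_labels(features: Dict[str, object]) -> Set[str]:
    labels = set()
    for name, value in features.items():
        label = PRIMARY_MODULES[name](value)
        if label is not None:
            labels.add(label)
    return labels


def determine_secondary_modules(primaries: Set[str]) -> Dict[str, Callable[[Set[str]], bool]]:
    """Determining module: find target secondary labels, build one module each."""
    targets = sorted({s for p in primaries for s in LABEL_GRAPH.get(p, set())})
    modules = {}
    for target in targets:
        operator, required = COMBINATIONS[target]
        combine = all if operator == "and" else any
        # Default arguments bind the current loop values to each lambda.
        modules[target] = lambda labels, c=combine, r=required: c(x in labels for x in r)
    return modules


def process(data: RawData) -> Tuple[List[str], List[str]]:
    """Acquire -> extract -> primary labels -> secondary modules -> secondary labels."""
    primaries = primary_labels(extract_features(data))
    modules = determine_secondary_modules(primaries)
    secondaries = sorted(name for name, module in modules.items() if module(primaries))
    return sorted(primaries), secondaries


result = process({"domain": "cdn.example-bad", "bytes_out": 5_000_000})
```

With the example input, both primary labels are produced, the graph points to one target secondary label, and the generated module confirms it; an input that yields only one of the two primary labels still selects the target but the generated module rejects it.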
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
Those skilled in the art will appreciate that the various aspects of the present application may be implemented as a system, method, or program product. Accordingly, aspects of the present application may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein collectively as a "circuit," "module," or "system."
An electronic device according to this embodiment of the present application is described below. The electronic device is only one example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
The electronic device takes the form of a general-purpose computing device. Components of the electronic device may include, but are not limited to: at least one processor, at least one memory, and a bus connecting the various system components (including the memory and the processor).
The memory stores program code that is executable by the processor, causing the processor to perform the steps according to the various exemplary embodiments of the present application described in the "exemplary method" section of this specification.
The memory may include readable media in the form of volatile memory, such as random access memory (RAM) and/or cache memory, and may further include read-only memory (ROM).
The memory may also include a program/utility having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus may be one or more of several types of bus structures including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device may also communicate with one or more external devices (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., a router, a modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface. In addition, the electronic device may communicate with one or more networks, such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet, through a network adapter. The network adapter communicates with other modules of the electronic device via the bus. It should be appreciated that, although not shown, other hardware and/or software modules may be used in connection with the electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible implementations, the various aspects of the present application may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the present application as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
Furthermore, the above-described figures are only illustrative of the processes involved in the method according to exemplary embodiments of the present application, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily conceivable by those skilled in the art within the technical scope of the present application should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A method of data processing, comprising:
acquiring data to be processed;
extracting features of the data to be processed to obtain at least one variable feature;
inputting each variable feature into a corresponding primary label determining module respectively to obtain a primary label corresponding to each variable feature; the primary label is used for representing the feature type of the corresponding variable feature;
determining at least one secondary label determining module corresponding to the data to be processed according to the primary label corresponding to the data to be processed;
respectively inputting the primary labels corresponding to the data to be processed into each secondary label determining module to obtain at least one secondary label corresponding to the data to be processed; the secondary label is used for representing the threat event type of the data to be processed;
the determining, according to the primary label corresponding to the data to be processed, at least one secondary label determining module corresponding to the data to be processed includes:
acquiring a label relation diagram; the label relation graph comprises a plurality of primary labels and a plurality of secondary labels, and the label relation graph is used for representing the corresponding relation between the primary labels and the secondary labels;
determining at least one target secondary label according to the label relation diagram and the primary label corresponding to the data to be processed;
obtaining a combination relation corresponding to each target secondary label;
and determining and combining the corresponding primary label determining modules according to each combination relation in turn to obtain at least one secondary label determining module.
2. The data processing method of claim 1, wherein the extracting features of the data to be processed to obtain at least one variable feature comprises:
determining the data type of the data to be processed;
acquiring a corresponding feature extraction rule according to the data type;
and carrying out feature processing on the data to be processed according to the feature extraction rule to obtain at least one variable feature.
3. The data processing method of claim 1, wherein the method further comprises:
determining at least one tertiary tag determining module corresponding to the data to be processed according to the primary tag and the secondary tag corresponding to the data to be processed;
respectively inputting a primary label and a secondary label corresponding to the data to be processed into each tertiary label determining module to obtain at least one tertiary label corresponding to the data to be processed;
the tertiary label is used for representing the threat event type of the data to be processed, and the threat degree corresponding to the threat event type represented by the tertiary label is higher than the threat degree corresponding to the threat event type represented by the secondary label.
4. The data processing method of claim 1, wherein the method further comprises:
obtaining a candidate combination relation; the candidate combination relation has a corresponding candidate secondary label;
acquiring processed data corresponding to the candidate secondary labels;
determining and combining corresponding primary label determining modules according to the candidate combination relation to obtain candidate secondary label determining modules;
inputting the processed data into the candidate secondary label determination module;
and if the candidate secondary label determining module can output the candidate secondary label, establishing an association relationship between the candidate combination relation and the candidate secondary label.
5. The data processing method according to claim 4, wherein after said establishing an association of said candidate combination relation with said candidate secondary label, said method further comprises:
determining at least two target primary labels corresponding to the candidate combination relation;
determining in turn, according to the label relation graph, whether an association relationship has been established between each target primary label and the candidate secondary label;
if not, establishing the association relationship between the target primary label and the candidate secondary label so as to update the label relation graph.
6. The data processing method of claim 1, wherein the method further comprises:
displaying a data name and a secondary label corresponding to the data to be processed;
when the secondary label is selected, at least part of the primary label and/or at least part of variable characteristics corresponding to the data to be processed can be displayed.
7. A data processing apparatus, comprising:
the acquisition module is used for acquiring data to be processed;
the extraction module is used for extracting the characteristics of the data to be processed to obtain at least one variable characteristic;
the first processing module is used for inputting each variable feature into the corresponding primary label determining module respectively, so as to obtain a primary label corresponding to each variable feature; the primary label is used for representing the feature type of the corresponding variable feature;
the determining module is used for determining at least one secondary label determining module corresponding to the data to be processed according to the primary label corresponding to the data to be processed;
the second processing module is used for respectively inputting the primary labels corresponding to the data to be processed into each secondary label determining module so as to obtain at least one secondary label corresponding to the data to be processed; the secondary label is used for representing the threat event type of the data to be processed;
the determining at least one secondary label determining module corresponding to the data to be processed comprises:
acquiring a label relation diagram; the label relation graph comprises a plurality of primary labels and a plurality of secondary labels, and the label relation graph is used for representing the corresponding relation between the primary labels and the secondary labels;
determining at least one target secondary label according to the label relation diagram and the primary label corresponding to the data to be processed;
obtaining a combination relation corresponding to each target secondary label;
and determining and combining the corresponding primary label determining modules according to each combination relation in turn to obtain at least one secondary label determining module.
8. An electronic device comprising a processor and a memory;
the processor is adapted to perform the steps of the method according to any of claims 1 to 6 by invoking a program or instruction stored in the memory.
9. A computer readable storage medium storing a program or instructions for causing a computer to perform the steps of the method according to any one of claims 1 to 6.
CN202210419635.XA 2022-04-20 2022-04-20 Data processing method and device, electronic equipment and storage medium Active CN114844691B (en)

Publications (2)

Publication Number Publication Date
CN114844691A (en) 2022-08-02
CN114844691B (en) 2023-07-14

Family

ID=82565738

Country Status (1)

Country Link
CN (1) CN114844691B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3462698A1 (en) * 2017-09-29 2019-04-03 AO Kaspersky Lab System and method of cloud detection, investigation and elimination of targeted attacks
EP3531325A1 (en) * 2018-02-23 2019-08-28 Crowdstrike, Inc. Computer security event analysis
US10440059B1 (en) * 2017-03-22 2019-10-08 Verisign, Inc. Embedding contexts for on-line threats into response policy zones
CN110875920A (en) * 2018-12-24 2020-03-10 哈尔滨安天科技集团股份有限公司 Network threat analysis method and device, electronic equipment and storage medium
US10673880B1 (en) * 2016-09-26 2020-06-02 Splunk Inc. Anomaly detection to identify security threats
CN111988341A (en) * 2020-09-10 2020-11-24 奇安信科技集团股份有限公司 Data processing method, device, computer system and storage medium
CN112165462A (en) * 2020-09-11 2021-01-01 哈尔滨安天科技集团股份有限公司 Attack prediction method and device based on portrait, electronic equipment and storage medium
US10986117B1 (en) * 2018-08-07 2021-04-20 Ca, Inc. Systems and methods for providing an integrated cyber threat defense exchange platform
WO2021169730A1 (en) * 2020-02-25 2021-09-02 深信服科技股份有限公司 Method and device for data processing, and storage medium
CN113992371A (en) * 2021-10-18 2022-01-28 安天科技集团股份有限公司 Method and device for generating threat tag of flow log and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
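Tianyi Wang; Kam Pui Chow. Automatic tagging of cyber threat intelligence unstructured data using semantics extraction. 2019 IEEE International Conference on Intelligence and Security Informatics (ISI). 2019, pp. 197-199. *
Research and Implementation of Defense Methods against Attacks on Convolutional Neural Network Models; Gong Zicheng (龚子成); China Master's Theses Full-text Database, Information Science and Technology; pp. I138-212 *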

Also Published As

Publication number Publication date
CN114844691A (en) 2022-08-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant