CN107562618A - Shellcode detection method and device
- Publication number
- CN107562618A (application CN201710667103.7A)
- Authority
- CN
- China
- Prior art keywords
- shellcode
- samples
- detected
- characteristic
- command information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application provide a shellcode detection method and device. The method includes: obtaining instruction information of a shellcode to be detected; generating feature data of the shellcode to be detected from the instruction information; calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library; and determining the category of the shellcode to be detected according to the similarity, so that corresponding preventive measures can be deployed against the shellcode and security incidents avoided. The embodiments neither execute the code of the shellcode to be detected line by line nor require security experts to predefine feature sequences, which reduces the consumption of resources such as labor cost and improves the efficiency of shellcode detection.
Description
Technical field
The present application relates to the field of information security technology, and in particular to a shellcode detection method, a shellcode detection device, a method for generating a shellcode sample library, and a device for generating a shellcode sample library.
Background
Shellcode is a piece of binary code that performs a specific task; depending on the task, it may invoke or create a shell with higher privileges. Shellcode is the core of overflow exploits and worms. It is usually sent as data to the server under attack: by sending a piece of binary code to an unpatched, vulnerable host and having it executed there, an attacker can gain control of the target machine and carry out further attacks. However, if the type of a shellcode can be detected in advance, corresponding risk-prevention measures can be deployed and security incidents avoided.
In the prior art, shellcode detection mainly takes two forms: dynamic detection and static detection. Dynamic detection executes the code step by step and checks whether it behaves abnormally. Static detection generates feature sequences for known shellcode and compares them with the feature sequence of the code to be detected, thereby determining whether that code is shellcode.
However, dynamic detection does not work at all for some special code. For example, shellcode that uses anti-virtualization techniques cannot be detected dynamically. In addition, to improve efficiency, dynamic detection usually sets a time limit for each detection run, so code that hides its behavior for longer than that limit also escapes dynamic detection. Static detection, on the other hand, requires security experts to define the feature sequences in advance; whenever the features of shellcode change, manpower and material resources must again be invested in analyzing samples, at a large cost in time and money.
Summary of the invention
In view of the above problems, the present application is proposed to provide a shellcode detection method, a shellcode detection device, a method for generating a shellcode sample library, and a corresponding device for generating a shellcode sample library that overcome the above problems or at least partially solve them.
According to one aspect of the present application, a shellcode detection method is provided, including:
Obtaining instruction information of a shellcode to be detected;
Generating feature data of the shellcode to be detected from the instruction information;
Calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library;
Determining the category of the shellcode to be detected according to the similarity.
Optionally, the step of obtaining instruction information of the shellcode to be detected includes:
Obtaining text information of the shellcode to be detected, the text information including a shellcode code segment;
Extracting multiple pieces of instruction information from the shellcode code segment.
Optionally, the step of generating feature data of the shellcode to be detected from the instruction information includes:
Extracting sub-instruction information of a preset length, in sequence, from the multiple pieces of instruction information;
Counting the number of occurrences of each piece of sub-instruction information;
Generating the feature data of the shellcode to be detected from the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
Optionally, the step of generating the feature data of the shellcode to be detected from the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold includes:
Calculating the total number of occurrences of the multiple pieces of target sub-instruction information;
Normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of the shellcode to be detected.
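The feature-generation steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary representation of the feature vector, and the example instruction sequence are assumptions introduced here, and "normalization" is read as dividing each retained count by the total count of the retained sub-instruction sequences.

```python
from collections import Counter

def ngrams(instructions, n):
    """Return every contiguous run of n instructions as a tuple."""
    return [tuple(instructions[i:i + n]) for i in range(len(instructions) - n + 1)]

def feature_vector(instructions, n=2, threshold=1):
    """Keep n-grams seen more than `threshold` times; map each to its share of the total."""
    counts = Counter(ngrams(instructions, n))
    kept = {gram: c for gram, c in counts.items() if c > threshold}
    total = sum(kept.values())
    if total == 0:
        return {}  # nothing exceeded the threshold
    return {gram: c / total for gram, c in kept.items()}

# Hypothetical mnemonic sequence, for illustration only.
sample = ["mov", "xchg", "mov", "xchg", "mov", "imul", "mov", "xchg"]
vec = feature_vector(sample, n=2, threshold=1)
```

Storing the vector as a dictionary keyed by sub-instruction sequence lets vectors from different samples be compared without fixing a vocabulary in advance; a missing key is simply treated as zero.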
Optionally, the preset shellcode sample library is generated as follows:
Obtaining instruction information of each of multiple shellcode samples;
Generating feature data of each shellcode sample from the instruction information;
Clustering the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
Optionally, the step of obtaining instruction information of each of the multiple shellcode samples includes:
Obtaining text information of each shellcode sample, the text information including a shellcode code segment;
Extracting multiple pieces of instruction information from the shellcode code segment.
Optionally, the step of generating feature data of each shellcode sample from the instruction information includes:
Extracting sub-instruction information of a preset length, in sequence, from the multiple pieces of instruction information;
Counting the number of occurrences of each piece of sub-instruction information;
Generating the feature data of each shellcode sample from the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
Optionally, the step of generating the feature data of each shellcode sample from the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold includes:
Calculating the total number of occurrences of the multiple pieces of target sub-instruction information;
Normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
Optionally, the step of clustering the multiple shellcode samples according to the feature data to generate the shellcode sample library includes:
Randomly selecting k target shellcode samples from the multiple shellcode samples;
Calculating, according to the feature data, the distance between each remaining shellcode sample and the k target shellcode samples;
Clustering the multiple shellcode samples according to the distance.
Optionally, the step of clustering the multiple shellcode samples according to the distance includes:
Dividing the multiple shellcode samples into k shellcode categories according to the distance;
Counting the number of shellcode samples in each shellcode category;
Clustering separately each of the m shellcode categories that contain the most shellcode samples, to obtain m*n shellcode categories;
Using the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
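The two-stage clustering above can be sketched as follows. This is a simplified illustration under stated assumptions: feature vectors are dictionaries, distance is Euclidean, each sample is attached to its nearest randomly chosen seed in a single pass (no iterative centre updates), and each of the m largest categories holds at least n samples. The names `seed_cluster` and `build_library` are hypothetical.

```python
import math
import random

def distance(a, b):
    """Euclidean distance between two sparse feature vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def seed_cluster(samples, k, rng):
    """Pick k random seed samples and attach every other sample to its nearest seed."""
    seed_ids = rng.sample(range(len(samples)), k)
    clusters = [[samples[i]] for i in seed_ids]
    for i, s in enumerate(samples):
        if i in seed_ids:
            continue  # seeds already sit in their own cluster
        nearest = min(range(k), key=lambda j: distance(s, samples[seed_ids[j]]))
        clusters[nearest].append(s)
    return clusters

def build_library(samples, k, m, n, seed=0):
    """Split into k categories, then re-cluster the m largest into n each."""
    rng = random.Random(seed)
    clusters = seed_cluster(samples, k, rng)
    clusters.sort(key=len, reverse=True)
    refined = []
    for big in clusters[:m]:
        refined.extend(seed_cluster(big, n, rng))
    return clusters[m:] + refined  # k - m + m*n categories in total

# Hypothetical two-dimensional feature vectors, for illustration only.
samples = [{"a": i / 12, "b": 1 - i / 12} for i in range(12)]
library = build_library(samples, k=3, m=1, n=2)
```

With k=3, m=1 and n=2 the library ends up with 3-1+1*2 = 4 categories, and every input sample belongs to exactly one of them.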
Optionally, the step of calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library includes:
Determining the center of each shellcode category in the preset shellcode sample library;
Calculating the distance between the shellcode to be detected and the center of each shellcode category.
Optionally, the step of determining the center of each shellcode category in the preset shellcode sample library includes:
Determining the feature vectors of the shellcode samples in each shellcode category;
Calculating the average of the feature vectors of all shellcode samples in each shellcode category, as the center of that shellcode category.
Optionally, the step of determining the category of the shellcode to be detected according to the similarity includes:
Determining the minimum of the distances between the shellcode to be detected and the shellcode categories;
Taking the shellcode category corresponding to the minimum distance as the category of the shellcode to be detected.
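The centre-based variant can be sketched as follows. The dictionary-based vectors, the Euclidean distance, and the names `centroid` and `classify_by_center` are assumptions made for illustration; the text above only specifies that the centre is the average of the category's feature vectors and that the nearest centre wins.

```python
import math

def distance(a, b):
    """Euclidean distance between two sparse feature vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def centroid(vectors):
    """Component-wise average of a category's feature vectors."""
    keys = set().union(*vectors)
    return {k: sum(v.get(k, 0.0) for v in vectors) / len(vectors) for k in keys}

def classify_by_center(x, library):
    """Return the label of the category whose centre is closest to x."""
    centers = {label: centroid(vectors) for label, vectors in library.items()}
    return min(centers, key=lambda label: distance(x, centers[label]))

# Hypothetical sample library with two categories, for illustration only.
library = {
    "A": [{"x": 1.0}, {"x": 0.8, "y": 0.2}],
    "B": [{"y": 1.0}, {"x": 0.1, "y": 0.9}],
}
label = classify_by_center({"x": 0.7, "y": 0.3}, library)
```

In practice the centres would be computed once when the sample library is built rather than on every query; they are recomputed here only to keep the sketch short.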
Optionally, the step of calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library includes:
Calculating the distance between the shellcode to be detected and the shellcode samples in each preset shellcode category;
Calculating, from the distances, the similarity between the shellcode to be detected and each shellcode category.
Optionally, the step of calculating, from the distances, the similarity between the shellcode to be detected and each shellcode category includes:
Summing the reciprocals of the distances between the shellcode to be detected and each shellcode sample in a shellcode category, as the similarity between the shellcode to be detected and that shellcode category.
Optionally, the step of determining the category of the shellcode to be detected according to the similarity includes:
Determining the maximum of the similarities between the shellcode to be detected and the shellcode categories;
Taking the shellcode category corresponding to the maximum similarity as the category of the shellcode to be detected.
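The reciprocal-distance variant can be sketched as follows. The Euclidean distance, the dictionary-based vectors, the function names, and the small `eps` term (added here so that a distance of zero does not cause a division by zero) are assumptions for illustration and are not taken from the text above.

```python
import math

def distance(a, b):
    """Euclidean distance between two sparse feature vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def similarity(x, members, eps=1e-9):
    """Sum of reciprocal distances from x to every sample of one category."""
    return sum(1.0 / (distance(x, m) + eps) for m in members)

def classify_by_similarity(x, library):
    """Return the label of the category with the largest similarity to x."""
    return max(library, key=lambda label: similarity(x, library[label]))

# Hypothetical sample library with two categories, for illustration only.
library = {
    "A": [{"x": 1.0}, {"x": 0.8, "y": 0.2}],
    "B": [{"y": 1.0}, {"x": 0.1, "y": 0.9}],
}
label = classify_by_similarity({"x": 0.7, "y": 0.3}, library)
```

Because the similarity is a sum over all samples of a category, a category with many samples accumulates more terms than a small one; this follows directly from the formula as described.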
According to another aspect of the present application, a method for generating a shellcode sample library is provided, including:
Obtaining instruction information of each of multiple shellcode samples;
Generating feature data of each shellcode sample from the instruction information;
Clustering the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
Optionally, the step of obtaining instruction information of each of the multiple shellcode samples includes:
Obtaining text information of each shellcode sample, the text information including a shellcode code segment;
Extracting multiple pieces of instruction information from the shellcode code segment.
Optionally, the step of generating feature data of each shellcode sample from the instruction information includes:
Extracting sub-instruction information of a preset length, in sequence, from the multiple pieces of instruction information;
Counting the number of occurrences of each piece of sub-instruction information;
Generating the feature data of each shellcode sample from the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
Optionally, the step of generating the feature data of each shellcode sample from the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold includes:
Calculating the total number of occurrences of the multiple pieces of target sub-instruction information;
Normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
Optionally, the step of clustering the multiple shellcode samples according to the feature data to generate the shellcode sample library includes:
Randomly selecting k target shellcode samples from the multiple shellcode samples;
Calculating, according to the feature data, the distance between each remaining shellcode sample and the k target shellcode samples;
Clustering the multiple shellcode samples according to the distance.
Optionally, the step of clustering the multiple shellcode samples according to the distance includes:
Dividing the multiple shellcode samples into k shellcode categories according to the distance;
Counting the number of shellcode samples in each shellcode category;
Clustering separately each of the m shellcode categories that contain the most shellcode samples, to obtain m*n shellcode categories;
Using the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
According to another aspect of the present application, a shellcode detection device is provided, including:
An acquisition module, configured to obtain instruction information of a shellcode to be detected;
A generation module, configured to generate feature data of the shellcode to be detected from the instruction information;
A calculation module, configured to calculate, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library;
A determination module, configured to determine the category of the shellcode to be detected according to the similarity.
Optionally, the acquisition module includes:
A first text information acquisition submodule, configured to obtain text information of the shellcode to be detected, the text information including a shellcode code segment;
A first instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
Optionally, the generation module includes:
A first sub-instruction information extraction submodule, configured to extract sub-instruction information of a preset length, in sequence, from the multiple pieces of instruction information;
A first sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A first feature data generation submodule, configured to generate the feature data of the shellcode to be detected from the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
Optionally, the first feature data generation submodule includes:
A first target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A first target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of the shellcode to be detected.
Optionally, the preset shellcode sample library is generated by calling the following modules:
An instruction information acquisition module, configured to obtain instruction information of each of multiple shellcode samples;
A feature data generation module, configured to generate feature data of each shellcode sample from the instruction information;
A sample clustering module, configured to cluster the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
Optionally, the instruction information acquisition module includes:
A second text information acquisition submodule, configured to obtain text information of each shellcode sample, the text information including a shellcode code segment;
A second instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
Optionally, the feature data generation module includes:
A second sub-instruction information extraction submodule, configured to extract sub-instruction information of a preset length, in sequence, from the multiple pieces of instruction information;
A second sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A second feature data generation submodule, configured to generate the feature data of each shellcode sample from the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
Optionally, the second feature data generation submodule includes:
A second target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A second target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
Optionally, the sample clustering module includes:
A selection submodule, configured to randomly select k target shellcode samples from the multiple shellcode samples;
A distance calculation submodule, configured to calculate, according to the feature data, the distance between each remaining shellcode sample and the k target shellcode samples;
A sample clustering submodule, configured to cluster the multiple shellcode samples according to the distance.
Optionally, the sample clustering submodule includes:
A category division unit, configured to divide the multiple shellcode samples into k shellcode categories according to the distance;
A sample quantity statistics unit, configured to count the number of shellcode samples in each shellcode category;
A sample clustering unit, configured to cluster separately each of the m shellcode categories that contain the most shellcode samples, to obtain m*n shellcode categories, and to use the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
Optionally, the calculation module includes:
A category center determination submodule, configured to determine the center of each shellcode category in the preset shellcode sample library;
A category center distance calculation submodule, configured to calculate the distance between the shellcode to be detected and the center of each shellcode category.
Optionally, the category center determination submodule includes:
A sample feature vector determination unit, configured to determine the feature vectors of the shellcode samples in each shellcode category;
A sample feature vector average calculation unit, configured to calculate the average of the feature vectors of all shellcode samples in each shellcode category, as the center of that shellcode category.
Optionally, the determination module includes:
A minimum distance determination submodule, configured to determine the minimum of the distances between the shellcode to be detected and the shellcode categories, and to take the shellcode category corresponding to the minimum distance as the category of the shellcode to be detected.
Optionally, the calculation module includes:
A sample distance calculation submodule, configured to calculate the distance between the shellcode to be detected and the shellcode samples in each preset shellcode category;
A similarity calculation submodule, configured to calculate, from the distances, the similarity between the shellcode to be detected and each shellcode category.
Optionally, the similarity calculation submodule includes:
A summation unit, configured to sum the reciprocals of the distances between the shellcode to be detected and each shellcode sample in a shellcode category, as the similarity between the shellcode to be detected and that shellcode category.
Optionally, the determination module includes:
A maximum similarity determination submodule, configured to determine the maximum of the similarities between the shellcode to be detected and the shellcode categories, and to take the shellcode category corresponding to the maximum similarity as the category of the shellcode to be detected.
According to yet another aspect of the present application, a device for generating a shellcode sample library is provided, including:
An instruction information acquisition module, configured to obtain instruction information of each of multiple shellcode samples;
A feature data generation module, configured to generate feature data of each shellcode sample from the instruction information;
A sample clustering module, configured to cluster the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
Optionally, the instruction information acquisition module includes:
A text information acquisition submodule, configured to obtain text information of each shellcode sample, the text information including a shellcode code segment;
An instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
Optionally, the feature data generation module includes:
A sub-instruction information extraction submodule, configured to extract sub-instruction information of a preset length, in sequence, from the multiple pieces of instruction information;
A sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A feature data generation submodule, configured to generate the feature data of each shellcode sample from the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
Optionally, the feature data generation submodule includes:
A target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
Optionally, the sample clustering module includes:
A selection submodule, configured to randomly select k target shellcode samples from the multiple shellcode samples;
A distance calculation submodule, configured to calculate, according to the feature data, the distance between each remaining shellcode sample and the k target shellcode samples;
A sample clustering submodule, configured to cluster the multiple shellcode samples according to the distance.
Optionally, the sample clustering submodule includes:
A category division unit, configured to divide the multiple shellcode samples into k shellcode categories according to the distance;
A sample quantity statistics unit, configured to count the number of shellcode samples in each shellcode category;
A sample clustering unit, configured to cluster separately each of the m shellcode categories that contain the most shellcode samples, to obtain m*n shellcode categories, and to use the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
In the embodiments of the present application, feature data is generated from the instruction information of the shellcode to be detected, and the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library is calculated from the feature data. The category of the shellcode to be detected is then determined from the calculated similarity, so that corresponding preventive measures can be deployed against the shellcode and security incidents avoided. By analyzing shellcode instruction information, the embodiments automatically generate shellcode categories and perform similarity detection. The code of the shellcode to be detected does not have to be executed line by line, and security experts do not have to predefine feature sequences, which reduces the consumption of resources such as labor cost and improves the efficiency of shellcode detection.
The above is only an overview of the technical solution of the present application. So that the technical means of the application can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features, and advantages of the application become more apparent, specific embodiments of the application are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present application. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a flowchart of the steps of a shellcode detection method according to one embodiment of the present application;
Fig. 2 is a flowchart of the steps of a shellcode detection method according to another embodiment of the present application;
Fig. 3 is a flowchart of the steps of a shellcode detection method according to yet another embodiment of the present application;
Fig. 4 is a system flowchart of a shellcode detection method of the present application;
Fig. 5 is a flowchart of the steps of a method for generating a shellcode sample library according to one embodiment of the present application;
Fig. 6 is a structural block diagram of a shellcode detection device according to one embodiment of the present application; and
Fig. 7 is a structural block diagram of a device for generating a shellcode sample library according to one embodiment of the present application.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Referring to Fig. 1, a flowchart of the steps of a shellcode detection method according to one embodiment of the present application is shown. The method may specifically include the following steps:
Step 101: obtain instruction information of a shellcode to be detected;
In the embodiments of the present application, the shellcode to be detected may be a new shellcode sample captured by security analysts, or a new shellcode sample obtained by other means. The embodiments place no limitation on the specific source of the shellcode to be detected.
Generally, after the shellcode to be detected is obtained, IDA (Interactive Disassembler, a disassembly tool) can be used to convert the shellcode into an .asm file, from which the corresponding instruction information is then obtained.
IDA is described here as one of the most intelligent and full-featured disassemblers currently available. It is written entirely in C++ and runs on the three mainstream operating systems Microsoft Windows, Mac OS X, and Linux. One of IDA's main goals is to present code that is as close as possible to the source code, annotating the generated disassembly wherever possible with derived variable and function names. Its core algorithm is fast and extensible.
The following is an example of a partial fragment of an .asm file generated with IDA:
Here, imul, mov, adc, aad, xchg, and so on are machine instructions, that is, the instruction information of the shellcode to be detected that needs to be obtained. Depending on its function, a machine instruction may be followed by information such as operands or offset addresses.
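Extracting the machine instructions from such a file can be sketched as follows. The line layout handled here (an optional label ending in a colon, then a mnemonic, then operands, then an optional comment introduced by a semicolon) is an assumption about the .asm text rather than a description of IDA's exact output, and `KNOWN_MNEMONICS` is a deliberately tiny hypothetical whitelist built from the instructions named above.

```python
import re

# Hypothetical whitelist; a real one would cover the full instruction set.
KNOWN_MNEMONICS = {"imul", "mov", "adc", "aad", "xchg"}

# Optional "label:" prefix, then the first identifier on the line.
LINE = re.compile(r"^\s*(?:[\w.$@?]+:)?\s*([A-Za-z]\w*)")

def extract_mnemonics(asm_text, mnemonics=KNOWN_MNEMONICS):
    """Return the mnemonics found in asm_text, in order of appearance."""
    found = []
    for line in asm_text.splitlines():
        code = line.split(";", 1)[0]  # drop any trailing comment
        match = LINE.match(code)
        if match and match.group(1).lower() in mnemonics:
            found.append(match.group(1).lower())
    return found

# Hand-written fragment for illustration; not taken from the patent.
sample = """
loc_0:
        imul    eax, [ebx], 5   ; comment mentioning mov
        mov     ecx, eax
loc_1:  xchg    eax, ebx
        db 90h
"""
instructions = extract_mnemonics(sample)
```

Filtering against a whitelist means that labels, data definitions, and assembler directives are skipped without having to enumerate them, at the cost of ignoring any instruction missing from the list.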
Step 102: generate feature data of the shellcode to be detected from the instruction information;
In the embodiments of the present application, the feature data of the shellcode to be detected may be a feature vector.
In a specific implementation, after the instruction information of the shellcode to be detected is obtained, the N-gram feature method can be used to extract features, and the feature vector of the shellcode to be detected is generated from the extracted features.
Multiple continuous subsequences that the length that N-gram characterization methods refer to take out from a sequence is N, are a kind of meters
Calculate the concept in linguistics.N-gram characterization methods are commonly used to the neck such as text analyzing, speech recognition and biological sequence analysis
Domain.When using N-gram characterization methods, according to the difference of N value, the feature set extracted correspondingly also can be different.
For example, for text " he/be/mono-/name/outstanding// student ":If N=1, feature set for he, is one,
Name ... };If N=2, feature set is { (he/be), (being/mono-), (one/name), (name/outstanding) ... };It is special if N=3
Collect as { (he/be/mono-), (being/mono-/name), (one/name/outstanding), (name/outstanding /) ... }.In practical application, considering
Avoid openness while to the validity for improving feature, usual N can take any value between 1~4, and can also mix makes
With.
Step 103, according to the feature data, calculate the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library;
In the embodiment of the present application, the shellcode sample library may be a shellcode sample set composed of multiple shellcode samples, and each shellcode sample may be an .asm file generated using IDA.
In a specific implementation, users can actively report samples through terminal security software. The received shellcode samples are then consolidated and classified by combining cloud-side algorithms with manual review by security analysts, thereby forming the shellcode sample library. In addition, after a shellcode to be detected has been received and has passed through the cloud-side algorithms and the manual review by security analysts, the shellcode whose detection is complete can also be placed in the shellcode sample library, expanding the number of samples in the library. Of course, those skilled in the art may also generate the shellcode sample library in other ways, which is not limited by the embodiment of the present application.
Further, after the shellcode sample library is generated, the multiple shellcode samples in the library can be clustered so that shellcode samples with high similarity are placed in the same category, which facilitates subsequent shellcode detection.
In the embodiment of the present application, after the feature data of the shellcode to be detected is generated, the similarity between the shellcode to be detected and each shellcode category in the shellcode sample library can be calculated according to the feature data.
Step 104, determine the category of the shellcode to be detected according to the similarity.
In the embodiment of the present application, after the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library is calculated, the category of the shellcode to be detected can be determined according to the similarity. Once the category is determined, corresponding precautionary measures can be deployed against the shellcode to avoid security incidents.
In the embodiment of the present application, feature data is generated from the instruction information of the shellcode to be detected, and the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library is calculated according to the feature data. The category of the shellcode to be detected is then determined from the calculated similarity, so that corresponding precautionary measures can be deployed against the shellcode and security incidents can be avoided. By analyzing the instruction information of shellcode, this embodiment automatically generates shellcode categories and performs similarity detection. It neither executes the code of the shellcode to be detected line by line nor requires security experts to predefine feature sequences, which reduces the consumption of resources such as labor cost and improves the efficiency of shellcode detection.
Referring to Fig. 2, a flow chart of the steps of a shellcode detection method according to another embodiment of the present application is shown. The method may specifically include the following steps:
Step 201, generate a shellcode sample library;
In the embodiment of the present application, the shellcode sample library may be a shellcode sample set composed of multiple shellcode samples, and each shellcode sample may be an .asm file generated using IDA.
In a specific implementation, the shellcode sample library can be generated through the following sub-steps:
Sub-step 2011, obtain the instruction information of multiple shellcode samples respectively;
In the embodiment of the present application, the text information of each shellcode sample can be obtained respectively. The text information may be an .asm file that is generated using the disassembly tool IDA and corresponds uniquely to each shellcode sample.
The following is an example of a partial fragment of an .asm file generated using IDA:
The .asm file includes code segments and non-code segments, where the non-code segments are paragraphs generated automatically by IDA, such as the file header and separator lines. A code segment can be further divided into a machine-instruction part and a non-instruction part.
In the shellcode code fragment described above, imul, mov, adc, aad, xchg and the like are machine instructions, while dd lines, dw lines and the like are the non-instruction part.
Therefore, after the .asm file of each shellcode sample is obtained, the multiple pieces of instruction information in the shellcode code segment can be extracted, while the non-code segments and the non-instruction part can be discarded.
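The extraction described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the keyword lists, the regular expression and the sample listing (including the `dd` line and the segment lines) are hypothetical and are not taken from the application, and a real IDA listing contains further constructs that this heuristic does not handle, such as a label and an instruction on the same line.

```python
import re

# Assumed, non-exhaustive lists: data directives and assembler keywords that
# mark a line as something other than a machine instruction.
DATA_DIRECTIVES = {"db", "dw", "dd", "dq"}
ASSEMBLER_KEYWORDS = {"assume", "org", "align", "end", "public", "include"}
STRUCTURAL_KEYWORDS = {"segment", "ends", "proc", "endp"}
MNEMONIC = re.compile(r"[a-z][a-z0-9]*")


def extract_mnemonics(asm_text):
    """Return the instruction mnemonics of an .asm listing, in file order."""
    mnemonics = []
    for raw_line in asm_text.splitlines():
        line = raw_line.split(";", 1)[0].strip().lower()  # drop comments
        if not line:
            continue
        tokens = line.split()
        first = tokens[0]
        # Lines such as "seg000 segment ..." or "seg000 ends" are structural.
        if len(tokens) > 1 and tokens[1] in STRUCTURAL_KEYWORDS:
            continue
        if first in DATA_DIRECTIVES or first in ASSEMBLER_KEYWORDS:
            continue
        # Labels ("loc_10:") and other non-mnemonic tokens fail this match.
        if MNEMONIC.fullmatch(first) is None:
            continue
        mnemonics.append(first)
    return mnemonics


# Hypothetical listing used only to exercise the function.
listing = """\
; header line emitted by the disassembler
seg000 segment byte public 'CODE' use32
imul edi, [esi+694A0B76h], -1Dh
mov ds:516F2816h, bh
dd 0A1B2C3D4h
adc [edi+72h], ah
aad 0FDh
xchg eax, ebp
seg000 ends
"""
print(extract_mnemonics(listing))
```

Only the mnemonic is kept and the operands are dropped, which matches the feature sets in this description, where features are built from instruction names alone.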
Sub-step 2012, use the instruction information to generate the feature data of each shellcode sample;
In the embodiment of the present application, the N-gram feature method can be used for feature extraction, and the feature vector of each shellcode sample is generated from the extracted features.
In a specific implementation, sub-instruction information of a preset length can first be extracted in sequence from the multiple pieces of instruction information. It should be noted that when the N-gram feature method is used, the preset length is the value taken by N.
For example, for the shellcode fragment shown below:
imul edi,[esi+694A0B76h],-1Dh
mov ds:516F2816h,bh
adc [edi+72h],ah
aad 0FDh
xchg eax,ebp
If N=1, the sub-instruction information of length 1 can be extracted in sequence to form the feature set, that is, the feature set for N=1 is {imul, mov, adc, aad, xchg}. If N=2, the sub-instruction information of length 2 can be extracted in sequence to form the feature set, that is, the feature set for N=2 is {(imul/mov), (mov/adc), (adc/aad), (aad/xchg)}. If N=3, the sub-instruction information of length 3 can be extracted in sequence to form the feature set, that is, the feature set for N=3 is {(imul/mov/adc), (mov/adc/aad), (adc/aad/xchg)}.
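The sliding-window extraction in this example can be written in a few lines of Python. The function name `ngrams` is an illustrative choice; the input is the mnemonic sequence of the five-instruction fragment above.

```python
def ngrams(tokens, n):
    """Return every contiguous length-n subsequence of tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


mnemonics = ["imul", "mov", "adc", "aad", "xchg"]
for n in (1, 2, 3):
    # Join each n-gram with "/" to mirror the notation used in the text.
    print(n, ["/".join(gram) for gram in ngrams(mnemonics, n)])
```

A sequence of length L yields L - N + 1 features for a given N, which is why the example produces 5, 4 and 3 features for N = 1, 2 and 3.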
Then, the number of occurrences of each piece of sub-instruction information can be counted, that is, the number of times each feature in the feature set occurs, and the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold are used to generate the feature data of each shellcode sample.
It should be noted that the count is taken over the .asm files of all shellcode samples: a feature is retained only if its number of occurrences exceeds the preset threshold; otherwise it is discarded.
For example, suppose that in the .asm files of 6000 shellcode samples, (mov/adc) occurs 200 times in total and (adc/aad/xchg) occurs 50 times. If the preset threshold is set to 100, the feature (mov/adc), which occurs more than 100 times, is retained, and the feature (adc/aad/xchg), which occurs fewer than 100 times, is discarded.
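A minimal sketch of this corpus-wide filtering is shown below. The function names, the toy corpus and the threshold value are hypothetical; the sketch assumes that counts for all chosen values of N are accumulated over every sample before the threshold is applied, as the paragraph above describes.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous length-n subsequence of tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def select_features(samples, n_values, threshold):
    """Keep the n-grams whose total count over all samples exceeds threshold."""
    totals = Counter()
    for tokens in samples:
        for n in n_values:
            totals.update(ngrams(tokens, n))
    return sorted(gram for gram, count in totals.items() if count > threshold)


# Toy corpus of three mnemonic sequences (hypothetical data).
corpus = [["mov", "adc", "aad"], ["mov", "adc"], ["mov", "mov"]]
print(select_features(corpus, n_values=(1, 2), threshold=1))
```

The retained features define the dimensions of every feature vector, so the same selected list has to be reused when a shellcode to be detected is vectorized later.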
When the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold are used to generate the feature data of each shellcode sample, the total number of occurrences of the target sub-instruction information can be calculated first, and the target sub-instruction information is then normalized by that total to generate the feature vector of each shellcode sample.
For example, Table 1 below shows the number of occurrences of each feature for two shellcode samples.
Table 1:
Sample | mov | (sub/mov) | (mov/mov) | (mov/adc/aad) |
shellcode1 | 10 | 3 | 3 | 1 |
shellcode2 | 15 | 5 | 2 | 0 |
In this example, for sample shellcode1, mov occurs 10 times, (sub/mov) occurs 3 times, (mov/mov) occurs 3 times and (mov/adc/aad) occurs 1 time; for sample shellcode2, mov occurs 15 times, (sub/mov) occurs 5 times, (mov/mov) occurs 2 times and (mov/adc/aad) occurs 0 times. In sample shellcode1 the features occur 17 times in total, so the count of each feature is divided by 17; in sample shellcode2 the features occur 22 times in total, so the count of each feature is divided by 22. This gives the normalized values shown in Table 2 below.
Table 2:
Sample | mov | (sub/mov) | (mov/mov) | (mov/adc/aad) |
shellcode1 | 0.5882 | 0.1765 | 0.1765 | 0.0588 |
shellcode2 | 0.6818 | 0.2273 | 0.0909 | 0 |
The feature vector of sample shellcode1 can then be expressed as (0.5882, 0.1765, 0.1765, 0.0588), and the feature vector of sample shellcode2 can be expressed as (0.6818, 0.2273, 0.0909, 0).
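The normalization in this example can be checked with a short Python sketch. The function name `normalize` is illustrative, and returning an all-zero vector when no feature occurs is an assumption of this sketch, since the description does not cover that case.

```python
def normalize(counts):
    """Divide each feature count by the total number of feature occurrences."""
    total = sum(counts)
    if total == 0:
        # Assumed behaviour: a sample with no retained feature maps to zeros.
        return [0.0] * len(counts)
    return [count / total for count in counts]


# Counts from the table above, in the order mov, (sub/mov), (mov/mov), (mov/adc/aad).
shellcode1 = normalize([10, 3, 3, 1])
shellcode2 = normalize([15, 5, 2, 0])
print([round(value, 4) for value in shellcode1])
print([round(value, 4) for value in shellcode2])
```

Because each vector sums to 1, samples of different lengths become comparable: the vector describes the relative frequency of each feature rather than its raw count.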
Sub-step 2013, cluster the multiple shellcode samples according to the feature data to generate the shellcode sample library.
In the embodiment of the present application, after the feature vector of each shellcode sample is obtained, all the shellcode samples in the shellcode sample library can be clustered according to the feature vectors so as to form several categories.
In the embodiment of the present application, a two-stage k-means clustering algorithm can be used to cluster all the shellcode samples in the shellcode sample library.
First, k target shellcode samples can be randomly selected from the multiple shellcode samples, for example k=50. Then, according to the feature data of each shellcode sample, the distance between each remaining shellcode sample and each of the k target shellcode samples is calculated, and the multiple shellcode samples are clustered according to that distance.
In a specific implementation, the multiple shellcode samples can be divided into k shellcode categories according to the distance, and the number of shellcode samples in each shellcode category is then counted. The m shellcode categories containing the most shellcode samples are each clustered again into n categories, giving m*n shellcode categories. The resulting k-m+m*n shellcode categories are used as the shellcode categories in the shellcode sample library, where k, m and n are positive integers.
Taking k=50, m=4 and n=15 as an example, all the shellcode samples in the shellcode sample library are first clustered into 50 categories. The 4 categories that contain the most shellcode samples among these 50 are then clustered again, each into 15 categories, so the second clustering yields 4*15=60 categories. After the two rounds of clustering, 50-4+60=106 shellcode categories are finally obtained, and these 106 categories are used as the shellcode categories in the shellcode sample library.
At this point, the shellcode sample library has been generated.
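The two-stage procedure can be sketched as follows. This is a sketch under assumptions rather than the method of the application: the description only states that k target samples are chosen at random and that samples are divided by distance, so the iterative center update (standard Lloyd-style k-means), the Euclidean distance, the fixed iteration count and the fixed random seed are all choices made here for illustration, as is the toy one-dimensional data.

```python
import math
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Lloyd-style k-means; returns the non-empty clusters as lists of vectors."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)  # k randomly selected target samples
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for vector in vectors:
            nearest = min(range(k), key=lambda i: math.dist(vector, centers[i]))
            clusters[nearest].append(vector)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        centers = [
            [sum(column) / len(members) for column in zip(*members)] if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return [members for members in clusters if members]


def two_stage_kmeans(vectors, k, m, n):
    """Cluster into k categories, then split the m largest into n categories each."""
    first_stage = kmeans(vectors, k)
    first_stage.sort(key=len, reverse=True)
    largest, rest = first_stage[:m], first_stage[m:]
    second_stage = []
    for members in largest:
        # A category cannot be split into more parts than it has members.
        second_stage.extend(kmeans(members, min(n, len(members))))
    return rest + second_stage


# Toy one-dimensional feature vectors (hypothetical data).
samples = [[0.0], [0.1], [1.0], [1.1], [10.0], [10.1]]
categories = two_stage_kmeans(samples, k=2, m=1, n=2)
print(len(categories))
```

With k=2, m=1 and n=2 the toy run yields k-m+m*n = 3 categories, the same bookkeeping that gives 106 categories for k=50, m=4 and n=15. Splitting only the largest categories refines the crowded regions of the feature space without a second pass over the whole library.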
It should be noted that after the initial clustering is complete, that is, once the shellcode sample library already has categories, whenever a new shellcode sample is added it can be determined whether a new shellcode category needs to be created.
Specifically, the distance between the newly added shellcode sample and each known shellcode category can be calculated first, and it is then judged whether the distance satisfies a corresponding constraint. Based on the result, the newly added shellcode sample is either added to an existing shellcode category or marked as a new shellcode category and added to the shellcode sample library. Those skilled in the art can determine the specific content of the constraint and the manner of judgment according to actual needs, which is not limited by the embodiment of the present application.
Step 202, obtain the text information of the shellcode to be detected, the text information including a shellcode code segment;
In the embodiment of the present application, the text information of the shellcode to be detected may be an .asm file that is generated using the disassembly tool IDA and corresponds uniquely to each shellcode to be detected.
The .asm file includes code segments and non-code segments, where the non-code segments are paragraphs generated automatically by IDA, such as the file header and separator lines. A code segment can be further divided into a machine-instruction part and a non-instruction part.
Step 203, extract the multiple pieces of instruction information in the shellcode code segment;
In the embodiment of the present application, only the corresponding instruction information, that is, the machine instructions in each shellcode code segment, needs to be extracted from the shellcode code segment, while the non-code segments and the non-instruction part can be discarded.
Step 204, extract sub-instruction information of a preset length in sequence from the multiple pieces of instruction information;
In the embodiment of the present application, the N-gram feature method can be used for feature extraction, and the feature vector of the shellcode to be detected is generated from the extracted features.
When the N-gram feature method is used, the preset length is the value taken by N.
For example, for the shellcode fragment shown below:
imul edi,[esi+694A0B76h],-1Dh
mov ds:516F2816h,bh
adc [edi+72h],ah
aad 0FDh
xchg eax,ebp
If N=1, the sub-instruction information of length 1 can be extracted in sequence to form the feature set, that is, the feature set for N=1 is {imul, mov, adc, aad, xchg}. If N=2, the sub-instruction information of length 2 can be extracted in sequence to form the feature set, that is, the feature set for N=2 is {(imul/mov), (mov/adc), (adc/aad), (aad/xchg)}. If N=3, the sub-instruction information of length 3 can be extracted in sequence to form the feature set, that is, the feature set for N=3 is {(imul/mov/adc), (mov/adc/aad), (adc/aad/xchg)}.
Step 205, count the number of occurrences of each piece of sub-instruction information;
Table 3 below shows an example of the number of occurrences of each feature for the shellcode to be detected.
Table 3:
Sample | mov | (sub/mov) | (mov/mov) | (mov/adc/aad) |
Shellcode to be detected | 12 | 5 | 4 | 2 |
That is, counting gives the number of occurrences of each feature in the feature set: mov occurs 12 times, (sub/mov) occurs 5 times, (mov/mov) occurs 4 times and (mov/adc/aad) occurs 2 times.
Step 206, use the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold to generate the feature data of the shellcode to be detected;
In the embodiment of the present application, the feature data of the shellcode to be detected may refer to the feature vector of the shellcode to be detected.
When the feature vector of the shellcode to be detected is generated, the number of occurrences of each feature in the feature set can first be normalized.
For example, summing the occurrence counts of the features of the shellcode to be detected shown in Table 3 gives a total of 23 occurrences. The count of each feature can then be divided by 23, giving the normalized values for the shellcode to be detected shown in Table 4 below.
Table 4:
Sample | mov | (sub/mov) | (mov/mov) | (mov/adc/aad) |
Shellcode to be detected | 0.5217 | 0.2174 | 0.1739 | 0.0870 |
The feature vector of the shellcode to be detected can then be expressed as (0.5217, 0.2174, 0.1739, 0.0870).
Step 207, determine the center of each shellcode category in the preset shellcode sample library;
In the embodiment of the present application, in order to calculate the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library, the center of each shellcode category in the library can be determined first.
In a specific implementation, the feature vectors of the shellcode samples in each shellcode category can be determined first, and the mean of the feature vectors of all the shellcode samples in that category is then calculated and used as the center of the category.
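The center computation is a component-wise mean, as the following sketch shows. Placing shellcode1 and shellcode2 from Table 2 in the same category is purely hypothetical and serves only to provide input data; the function name is also illustrative.

```python
def class_center(vectors):
    """Component-wise mean of the feature vectors in one category."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical category containing the two normalized sample vectors.
members = [
    [0.5882, 0.1765, 0.1765, 0.0588],  # shellcode1
    [0.6818, 0.2273, 0.0909, 0.0],     # shellcode2
]
center = class_center(members)
print([round(value, 4) for value in center])
```

Representing a category by one center means a new shellcode needs only one distance computation per category instead of one per stored sample.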
Step 208, calculate the distance between the shellcode to be detected and the center of each shellcode category;
The distance between the shellcode to be detected and the center of each shellcode category is then calculated. This distance can be the distance between the feature vector of the shellcode to be detected and the vector given by the mean of the feature vectors of all the shellcode samples in the category.
Step 209, determine the minimum of the distances between the shellcode to be detected and the shellcode categories, and take the shellcode category corresponding to the minimum distance as the category of the shellcode to be detected.
After the distance between the shellcode to be detected and each shellcode category is determined, the distances can be sorted in ascending or descending order to identify the category corresponding to the minimum distance, and that shellcode category is taken as the category of the shellcode to be detected.
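Steps 208 and 209 amount to a nearest-center lookup, sketched below. The description does not name a distance measure, so Euclidean distance is an assumption here, and the two category centers (labelled C1 and C2 and set to the shellcode1 and shellcode2 vectors of Table 2) are hypothetical. The query vector is the one from Table 4.

```python
import math


def nearest_category(vector, centers):
    """Return (category, distance) for the center closest to vector."""
    distances = ((name, math.dist(vector, center)) for name, center in centers.items())
    return min(distances, key=lambda pair: pair[1])


# Hypothetical centers: each category is represented by one vector from Table 2.
centers = {
    "C1": [0.5882, 0.1765, 0.1765, 0.0588],
    "C2": [0.6818, 0.2273, 0.0909, 0.0],
}
to_detect = [0.5217, 0.2174, 0.1739, 0.0870]
category, distance = nearest_category(to_detect, centers)
print(category, round(distance, 4))
```

Taking the minimum directly gives the same result as sorting all distances and reading off the first entry, without the cost of the sort.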
Referring to Fig. 3, a flow chart of the steps of a shellcode detection method according to yet another embodiment of the present application is shown. The method may specifically include the following steps:
Step 301, generate a shellcode sample library;
Step 302, obtain the text information of the shellcode to be detected, the text information including a shellcode code segment;
Step 303, extract the multiple pieces of instruction information in the shellcode code segment;
Step 304, extract sub-instruction information of a preset length in sequence from the multiple pieces of instruction information;
Step 305, count the number of occurrences of each piece of sub-instruction information;
Step 306, use the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold to generate the feature data of the shellcode to be detected;
Since steps 301 to 306 in this embodiment are similar to steps 201 to 206 in the above embodiment, they can be read with reference to each other and are not repeated here.
Step 307, calculate the distance between the shellcode to be detected and the shellcode samples in each preset shellcode category;
In a specific implementation, the distance between the shellcode to be detected and each shellcode sample in each preset shellcode category can be calculated. The distance can be the distance between the feature vector of the shellcode to be detected and the feature vector of each shellcode sample.
Step 308, use the distances to calculate the similarity between the shellcode to be detected and each shellcode category;
In a specific implementation, the reciprocals of the distances between the shellcode to be detected and each shellcode sample in a shellcode category can be summed, and the sum is taken as the similarity between the shellcode to be detected and that shellcode category.
Step 309, determine the maximum of the similarities between the shellcode to be detected and the shellcode categories, and take the shellcode category corresponding to the maximum similarity as the category of the shellcode to be detected.
After the similarity between the shellcode to be detected and each shellcode category is determined, the similarities can be sorted in ascending or descending order to identify the category corresponding to the maximum similarity, and that shellcode category is taken as the category of the shellcode to be detected.
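Steps 307 to 309 can be sketched as follows. Euclidean distance, the small floor `eps` that avoids division by zero when a sample coincides with the query, and the toy two-dimensional library are all assumptions of this sketch and are not stated in the description.

```python
import math


def category_similarity(vector, members, eps=1e-9):
    """Sum of reciprocal distances from vector to every sample of a category."""
    return sum(1.0 / max(math.dist(vector, member), eps) for member in members)


def most_similar_category(vector, library):
    """Return the category with the largest similarity, plus all the scores."""
    scores = {name: category_similarity(vector, members) for name, members in library.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy library: category name -> feature vectors of its samples (hypothetical data).
library = {
    "C1": [[0.0, 0.0]],
    "C2": [[1.0, 0.0], [1.0, 1.0]],
}
best, scores = most_similar_category([0.9, 0.0], library)
print(best, {name: round(score, 3) for name, score in scores.items()})
```

Because the reciprocals are summed rather than averaged, a category with many samples accumulates more terms, so nearby samples dominate the score while the size of a category also contributes to it.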
For ease of understanding, the shellcode detection method of the present application is described below with a complete example.
Referring to Fig. 4, a system flow chart of a shellcode detection method of the present application is shown. When the shellcode sample library is generated, the features of each shellcode sample are extracted and all the shellcode samples are divided into multiple categories. When a shellcode to be detected is received, its features can be extracted in the same way, and similarity detection is performed between the shellcode to be detected and each category in the shellcode sample library, so that a corresponding detection result is output. At the same time, once detection of the shellcode to be detected is complete, the shellcode can be placed in one of the categories in the shellcode sample library, or it can be marked as a category of its own and then placed in the shellcode sample library.
Specifically, similarity detection essentially comprises two tasks: first, determining which category in the shellcode sample library the shellcode to be detected belongs to; second, finding in the shellcode sample library the M shellcode samples most similar to the shellcode to be detected.
Table 5 shows an example of a shellcode sample library.
Table 5:
Shellcode sample | Category |
A | C1 |
B | C2 |
C | C2 |
D | C2 |
…… | …… |
For a shellcode to be detected, it is first necessary to determine which category the shellcode belongs to (task one), that is, to complete the category determination for the shellcode to be detected shown in Table 6.
Table 6:
Shellcode to be detected | Category |
X1 | |
X2 | |
X3 | |
After calculation and analysis, the category determination results shown in Table 7 can be obtained.
Table 7:
Here, the constraint may be: among the multiple shellcode samples, at least one shellcode sample is at a distance of less than D from the shellcode to be detected (D is a preset value).
When the constraint is not satisfied, the shellcode to be detected can be handed over to analysts for handling.
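One possible reading of this constraint is sketched below: the candidate category is accepted only if at least one of its samples lies closer than D to the shellcode to be detected, and otherwise `None` is returned to signal that the case goes to an analyst. The function names, the Euclidean distance, the choice to test the samples of the candidate category and the toy values are assumptions of this sketch.

```python
import math


def satisfies_constraint(vector, members, max_distance):
    """True if at least one sample lies closer than max_distance (the preset D)."""
    return any(math.dist(vector, member) < max_distance for member in members)


def classify_or_escalate(vector, category, members, max_distance):
    """Return the candidate category, or None when an analyst should decide."""
    if satisfies_constraint(vector, members, max_distance):
        return category
    return None


# Toy samples of a candidate category C2 (hypothetical data).
c2_members = [[1.0, 0.0], [1.0, 1.0]]
print(classify_or_escalate([0.9, 0.0], "C2", c2_members, max_distance=0.5))
print(classify_or_escalate([5.0, 5.0], "C2", c2_members, max_distance=0.5))
```

The check guards against forcing a shellcode into the least distant category when it is in fact far from every known sample.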
For task two, after calculation and analysis, the M shellcode samples most similar to the shellcode to be detected can be found in the shellcode sample library and arranged in ascending order of distance, as shown in Table 8.
Table 8:
It should be noted that task two does not need to be subject to the constraint, and the corresponding detection result can be output directly to analysts.
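Task two is a nearest-neighbour query over individual samples, which can be sketched as follows. Euclidean distance and the two-dimensional feature vectors assigned to samples A to D are hypothetical; only the sample names follow Table 5.

```python
import heapq
import math


def top_m_similar(vector, samples, m):
    """Return the m closest samples as (name, distance), in ascending distance."""
    scored = ((math.dist(vector, features), name) for name, features in samples.items())
    return [(name, distance) for distance, name in heapq.nsmallest(m, scored)]


# Hypothetical feature vectors for the samples of the library.
samples = {
    "A": [0.0, 0.0],
    "B": [1.0, 0.0],
    "C": [3.0, 0.0],
    "D": [0.5, 0.0],
}
for name, distance in top_m_similar([0.9, 0.0], samples, m=2):
    print(name, round(distance, 2))
```

`heapq.nsmallest` returns its results already sorted from smallest to largest, which matches the ascending order by distance described for Table 8.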
Referring to Fig. 5, a flow chart of the steps of a method for generating a shellcode sample library according to one embodiment of the present application is shown. The method may specifically include the following steps:
Step 501, obtain the instruction information of multiple shellcode samples respectively;
Step 502, use the instruction information to generate the feature data of each shellcode sample;
Step 503, cluster the multiple shellcode samples according to the feature data to generate the shellcode sample library.
Since steps 501 to 503 in this embodiment are similar to step 201 in the above embodiment, they can be read with reference to each other and are not repeated here.
For simplicity of description, the method embodiments are all expressed as a series of combined actions. However, those skilled in the art should know that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present application.
Referring to Fig. 6, a structural block diagram of a shellcode detection device according to one embodiment of the present application is shown. The device may specifically include the following modules:
An acquisition module 601, configured to obtain the instruction information of the shellcode to be detected;
A generation module 602, configured to use the instruction information to generate the feature data of the shellcode to be detected;
A computing module 603, configured to calculate, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library;
A determining module 604, configured to determine the category of the shellcode to be detected according to the similarity.
In the embodiment of the present application, the acquisition module 601 may specifically include the following submodules:
A first text information acquisition submodule, configured to obtain the text information of the shellcode to be detected, the text information including a shellcode code segment;
A first instruction information extraction submodule, configured to extract the multiple pieces of instruction information in the shellcode code segment.
In the embodiment of the present application, the generation module 602 may specifically include the following submodules:
A first sub-instruction information extraction submodule, configured to extract sub-instruction information of a preset length in sequence from the multiple pieces of instruction information;
A first sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A first feature data generation submodule, configured to use the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold to generate the feature data of the shellcode to be detected.
In the embodiment of the present application, the first feature data generation submodule may specifically include the following units:
A first target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A first target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, so as to generate the feature vector of the shellcode to be detected.
In the embodiment of the present application, the preset shellcode sample library may be generated by calling the following modules:
An instruction information acquisition module, configured to obtain the instruction information of multiple shellcode samples respectively;
A feature data generation module, configured to use the instruction information to generate the feature data of each shellcode sample;
A sample clustering module, configured to cluster the multiple shellcode samples according to the feature data to generate the shellcode sample library.
In the embodiment of the present application, the instruction information acquisition module may specifically include the following submodules:
A second text information acquisition submodule, configured to obtain the text information of each shellcode sample respectively, the text information including a shellcode code segment;
A second instruction information extraction submodule, configured to extract the multiple pieces of instruction information in the shellcode code segment.
In the embodiment of the present application, the feature data generation module may specifically include the following submodules:
A second sub-instruction information extraction submodule, configured to extract sub-instruction information of a preset length in sequence from the multiple pieces of instruction information;
A second sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A second feature data generation submodule, configured to use the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold to generate the feature data of each shellcode sample.
In the embodiment of the present application, the second feature data generation submodule may specifically include the following units:
A second target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A second target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, so as to generate the feature vector of each shellcode sample.
In the embodiment of the present application, the sample clustering module may specifically include the following submodules:
A selection submodule, configured to randomly select k target shellcode samples from the multiple shellcode samples;
A distance calculation submodule, configured to calculate, according to the feature data, the distance between each remaining shellcode sample and the k target shellcode samples;
A sample clustering submodule, configured to cluster the multiple shellcode samples according to the distance.
In the embodiment of the present application, the sample clustering submodule may specifically include the following units:
A category division unit, configured to divide the multiple shellcode samples into k shellcode categories according to the distance;
A sample quantity statistics unit, configured to count the number of shellcode samples in each shellcode category;
A sample clustering unit, configured to cluster each of the m shellcode categories containing the most shellcode samples, so as to obtain m*n shellcode categories, and to use the k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
In the embodiment of the present application, the computing module 603 may specifically include the following submodules:
A category center determination submodule, configured to determine the center of each shellcode category in the preset shellcode sample library;
A category center distance calculation submodule, configured to calculate the distance between the shellcode to be detected and the center of each shellcode category.
In the embodiment of the present application, the category center determination submodule may specifically include the following units:
A sample feature vector determination unit, configured to determine the feature vectors of the shellcode samples in each shellcode category;
A sample feature vector mean calculation unit, configured to calculate the mean of the feature vectors of all shellcode samples in each shellcode category and take it as the center of that shellcode category.
In the embodiment of the present application, the determining module 604 may specifically include the following submodule:
A minimum distance determination submodule, configured to determine the minimum of the distances between the shellcode to be detected and the shellcode categories, and to take the shellcode category corresponding to the minimum distance as the category of the shellcode to be detected.
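The center-based variant described above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the Euclidean distance, the function names `category_center` and `nearest_category`, and the toy `library` data are all assumptions introduced here, since this section does not fix a particular distance measure.

```python
from math import dist
from statistics import fmean


def category_center(vectors):
    """Component-wise mean of the feature vectors in one category."""
    return [fmean(component) for component in zip(*vectors)]


def nearest_category(query, library):
    """Return the label of the category whose center is closest to `query`.

    `library` maps a category label to a list of feature vectors.
    Euclidean distance is an assumption; any distance measure could be used.
    """
    centers = {label: category_center(vs) for label, vs in library.items()}
    return min(centers, key=lambda label: dist(query, centers[label]))


# Hypothetical two-category library with two-dimensional feature vectors.
library = {
    "A": [[0.0, 1.0], [0.2, 0.8]],
    "B": [[1.0, 0.0], [0.8, 0.2]],
}
label = nearest_category([0.1, 0.7], library)
```

Because only one distance per category is computed, the cost of classifying one shellcode grows with the number of categories rather than with the number of stored samples.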
In the embodiment of the present application, the computing module 603 may also include the following submodules:
A sample distance calculation submodule, configured to calculate the distance between the shellcode to be detected and the shellcode samples in each preset shellcode category;
A similarity calculation submodule, configured to calculate, using the distance, the similarity between the shellcode to be detected and each shellcode category.
In the embodiment of the present application, the similarity calculation submodule may specifically include the following unit:
A summation unit, configured to sum the reciprocals of the distances between the shellcode to be detected and each shellcode sample in a shellcode category, and to take the sum as the similarity between the shellcode to be detected and that shellcode category.
In the embodiment of the present application, the determining module 604 may specifically include the following submodule:
A maximum similarity determination submodule, configured to determine the maximum of the similarities between the shellcode to be detected and the shellcode categories, and to take the shellcode category corresponding to the maximum similarity as the category of the shellcode to be detected.
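The reciprocal-distance similarity described above can be sketched as follows. The Euclidean distance, the small constant `eps` that guards against division by zero when the shellcode to be detected coincides with a stored sample, and all names and data are assumptions made for illustration only.

```python
from math import dist


def category_similarity(query, samples, eps=1e-9):
    """Sum of reciprocal distances from `query` to every sample in a category.

    `eps` is a hypothetical safeguard, not part of the described method.
    """
    return sum(1.0 / (dist(query, s) + eps) for s in samples)


def most_similar_category(query, library):
    """Return the label of the category with the largest similarity."""
    return max(library, key=lambda label: category_similarity(query, library[label]))


# Hypothetical two-category library with two-dimensional feature vectors.
library = {
    "A": [[0.0, 1.0], [0.2, 0.8]],
    "B": [[1.0, 0.0], [0.8, 0.2]],
}
label = most_similar_category([0.1, 0.9], library)
```

Since the similarity is a sum over samples, a category with many samples accumulates more terms than a small one, which is worth keeping in mind when category sizes differ widely.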
Referring to FIG. 7, a structural block diagram of an embodiment of a device for generating a shellcode sample library according to one embodiment of the present application is shown. The device may specifically include the following modules:
An instruction information acquisition module 701, configured to obtain the instruction information of each of multiple shellcode samples;
A feature data generation module 702, configured to generate the feature data of each shellcode sample using the instruction information;
A sample clustering module 703, configured to cluster the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
In the embodiment of the present application, the instruction information acquisition module 701 may specifically include the following submodules:
A text information acquisition submodule, configured to obtain the text information of each shellcode sample, the text information including a shellcode code segment;
An instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
In the embodiment of the present application, the feature data generation module 702 may specifically include the following submodules:
A sub-instruction information extraction submodule, configured to successively extract sub-instruction information of a preset length from the multiple pieces of instruction information;
A sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A feature data generation submodule, configured to generate the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
In the embodiment of the present application, the feature data generation submodule may specifically include the following units:
A target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
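The feature generation steps above (extract fixed-length sub-instruction sequences, count them, keep those above a threshold, normalize by the total) can be sketched as follows. The sketch assumes that the instruction information is a list of instruction mnemonics and that "sub-instruction information of a preset length" means a sliding window of `n` consecutive instructions; both are interpretations, and the function name, parameters, and sample data are hypothetical.

```python
from collections import Counter


def feature_vector(instructions, n=2, threshold=1):
    """Normalized frequencies of the n-instruction windows seen more than `threshold` times."""
    # Slide a window of length n over the instruction sequence.
    grams = [tuple(instructions[i:i + n]) for i in range(len(instructions) - n + 1)]
    counts = Counter(grams)
    # Keep only the "target" windows whose count exceeds the threshold.
    targets = {g: c for g, c in counts.items() if c > threshold}
    total = sum(targets.values())
    # Divide each count by the total so that the components sum to 1.
    return {g: c / total for g, c in targets.items()} if total else {}


# Hypothetical instruction sequence extracted from a shellcode code segment.
instrs = ["push", "pop", "push", "pop", "xor", "push", "pop"]
vec = feature_vector(instrs, n=2, threshold=0)
```

The result is kept as a mapping from window to weight. To compare two shellcodes, both mappings would need to be projected onto a shared, ordered set of windows, with a weight of 0 for windows that are absent.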
In the embodiment of the present application, the sample clustering module 703 may specifically include the following submodules:
A selection submodule, configured to randomly select k target shellcode samples from the multiple shellcode samples;
A distance calculation submodule, configured to calculate, according to the feature data, the distance between each remaining shellcode sample and each of the k target shellcode samples;
A sample clustering submodule, configured to cluster the multiple shellcode samples according to the distance.
In the embodiment of the present application, the sample clustering submodule may specifically include the following units:
A category division unit, configured to divide the multiple shellcode samples into k shellcode categories according to the distance;
A sample count statistics unit, configured to count the number of shellcode samples in each shellcode category;
A sample clustering unit, configured to separately cluster each of the m shellcode categories containing the most shellcode samples, to obtain m*n shellcode categories, and to take the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
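The two-stage clustering above can be sketched as follows. This is a simplified single-pass illustration under several assumptions: samples are numeric feature vectors, the distance is Euclidean, each sample is assigned to its nearest randomly chosen target sample without further iteration, n is taken to be the number of subcategories each large category is split into, and every one of the m largest categories holds at least n samples. All names and data are hypothetical.

```python
import random
from math import dist


def assign_to_seeds(samples, k, rng):
    """Pick k random target samples and group every sample with its nearest target."""
    seeds = rng.sample(samples, k)
    groups = [[] for _ in range(k)]
    for s in samples:
        nearest = min(range(k), key=lambda i: dist(s, seeds[i]))
        groups[nearest].append(s)
    return groups


def two_stage_clusters(samples, k, m, n, seed=0):
    """Split into k categories, then re-split each of the m largest into n."""
    rng = random.Random(seed)
    groups = assign_to_seeds(samples, k, rng)
    groups.sort(key=len, reverse=True)  # largest categories first
    largest, rest = groups[:m], groups[m:]
    refined = []
    for g in largest:
        # Assumes len(g) >= n; rng.sample raises ValueError otherwise.
        refined.extend(assign_to_seeds(g, n, rng))
    return rest + refined  # k - m + m*n categories in total


# Twelve distinct hypothetical feature vectors.
samples = [(float(i), float(i % 3)) for i in range(12)]
categories = two_stage_clusters(samples, k=3, m=1, n=2)
```

With k=3, m=1 and n=2 the sketch yields 3-1+1*2 = 4 categories, matching the k-m+m*n count stated above. Re-splitting only the largest categories refines the densest regions of the library while leaving small categories untouched.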
Since the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant details, refer to the corresponding description of the method embodiments.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a system is apparent. Moreover, the present application is not directed to any particular programming language. It should be understood that the content of the present application described herein may be implemented in a variety of programming languages, and that the description of a specific language above is given to disclose the best mode of the present application.
In the specification provided here, numerous specific details are set forth. It is to be understood, however, that embodiments of the present application may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail, so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the present application are sometimes grouped together into a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, the inventive aspects lie in fewer than all features of a single embodiment disclosed above. Therefore, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present application.
Those skilled in the art will appreciate that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will appreciate that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the present application and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present application may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the shellcode detection method, the shellcode detection device, the method for generating a shellcode sample library, and the device for generating a shellcode sample library according to the embodiments of the present application. The present application may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the present application may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present application, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present application may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.
The present application discloses A1, a shellcode detection method, including:
Obtaining the instruction information of a shellcode to be detected;
Generating the feature data of the shellcode to be detected using the instruction information;
Calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library;
Determining the category of the shellcode to be detected according to the similarity.
A2, the method as described in A1, wherein the step of obtaining the instruction information of the shellcode to be detected includes:
Obtaining the text information of the shellcode to be detected, the text information including a shellcode code segment;
Extracting multiple pieces of instruction information from the shellcode code segment.
A3, the method as described in A2, wherein the step of generating the feature data of the shellcode to be detected using the instruction information includes:
Successively extracting sub-instruction information of a preset length from the multiple pieces of instruction information;
Counting the number of occurrences of each piece of sub-instruction information;
Generating the feature data of the shellcode to be detected using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
A4, the method as described in A3, wherein the step of generating the feature data of the shellcode to be detected using the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold includes:
Calculating the total number of occurrences of the multiple pieces of target sub-instruction information;
Normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of the shellcode to be detected.
A5, the method as described in A1, wherein the preset shellcode sample library is generated as follows:
Obtaining the instruction information of each of multiple shellcode samples;
Generating the feature data of each shellcode sample using the instruction information;
Clustering the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
A6, the method as described in A5, wherein the step of obtaining the instruction information of each of the multiple shellcode samples includes:
Obtaining the text information of each shellcode sample, the text information including a shellcode code segment;
Extracting multiple pieces of instruction information from the shellcode code segment.
A7, the method as described in A6, wherein the step of generating the feature data of each shellcode sample using the instruction information includes:
Successively extracting sub-instruction information of a preset length from the multiple pieces of instruction information;
Counting the number of occurrences of each piece of sub-instruction information;
Generating the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
A8, the method as described in A7, wherein the step of generating the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold includes:
Calculating the total number of occurrences of the multiple pieces of target sub-instruction information;
Normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
A9, the method as described in any one of A5-A8, wherein the step of clustering the multiple shellcode samples according to the feature data to generate the shellcode sample library includes:
Randomly selecting k target shellcode samples from the multiple shellcode samples;
Calculating, according to the feature data, the distance between each remaining shellcode sample and each of the k target shellcode samples;
Clustering the multiple shellcode samples according to the distance.
A10, the method as described in A9, wherein the step of clustering the multiple shellcode samples according to the distance includes:
Dividing the multiple shellcode samples into k shellcode categories according to the distance;
Counting the number of shellcode samples in each shellcode category;
Separately clustering each of the m shellcode categories containing the most shellcode samples, to obtain m*n shellcode categories;
Taking the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
A11, the method as described in A4 or A10, wherein the step of calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library includes:
Determining the center of each shellcode category in the preset shellcode sample library;
Calculating the distance between the shellcode to be detected and the center of each shellcode category.
A12, the method as described in A11, wherein the step of determining the center of each shellcode category in the preset shellcode sample library includes:
Determining the feature vectors of the shellcode samples in each shellcode category;
Calculating the mean of the feature vectors of all shellcode samples in each shellcode category, and taking it as the center of that shellcode category.
A13, the method as described in A12, wherein the step of determining the category of the shellcode to be detected according to the similarity includes:
Determining the minimum of the distances between the shellcode to be detected and the shellcode categories;
Taking the shellcode category corresponding to the minimum distance as the category of the shellcode to be detected.
A14, the method as described in A4 or A10, wherein the step of calculating, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in the preset shellcode sample library includes:
Calculating the distance between the shellcode to be detected and the shellcode samples in each preset shellcode category;
Calculating, using the distance, the similarity between the shellcode to be detected and each shellcode category.
A15, the method as described in A14, wherein the step of calculating, using the distance, the similarity between the shellcode to be detected and each shellcode category includes:
Summing the reciprocals of the distances between the shellcode to be detected and each shellcode sample in a shellcode category, and taking the sum as the similarity between the shellcode to be detected and that shellcode category.
A16, the method as described in A15, wherein the step of determining the category of the shellcode to be detected according to the similarity includes:
Determining the maximum of the similarities between the shellcode to be detected and the shellcode categories;
Taking the shellcode category corresponding to the maximum similarity as the category of the shellcode to be detected.
The present application also discloses B17, a method for generating a shellcode sample library, including:
Obtaining the instruction information of each of multiple shellcode samples;
Generating the feature data of each shellcode sample using the instruction information;
Clustering the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
B18, the method as described in B17, wherein the step of obtaining the instruction information of each of the multiple shellcode samples includes:
Obtaining the text information of each shellcode sample, the text information including a shellcode code segment;
Extracting multiple pieces of instruction information from the shellcode code segment.
B19, the method as described in B18, wherein the step of generating the feature data of each shellcode sample using the instruction information includes:
Successively extracting sub-instruction information of a preset length from the multiple pieces of instruction information;
Counting the number of occurrences of each piece of sub-instruction information;
Generating the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
B20, the method as described in B19, wherein the step of generating the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds the preset threshold includes:
Calculating the total number of occurrences of the multiple pieces of target sub-instruction information;
Normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
B21, the method as described in any one of B17-B20, wherein the step of clustering the multiple shellcode samples according to the feature data to generate the shellcode sample library includes:
Randomly selecting k target shellcode samples from the multiple shellcode samples;
Calculating, according to the feature data, the distance between each remaining shellcode sample and each of the k target shellcode samples;
Clustering the multiple shellcode samples according to the distance.
B22, the method as described in B21, wherein the step of clustering the multiple shellcode samples according to the distance includes:
Dividing the multiple shellcode samples into k shellcode categories according to the distance;
Counting the number of shellcode samples in each shellcode category;
Separately clustering each of the m shellcode categories containing the most shellcode samples, to obtain m*n shellcode categories;
Taking the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
The present application also discloses C23, a shellcode detection device, including:
An acquisition module, configured to obtain the instruction information of a shellcode to be detected;
A generation module, configured to generate the feature data of the shellcode to be detected using the instruction information;
A computing module, configured to calculate, according to the feature data, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library;
A determining module, configured to determine the category of the shellcode to be detected according to the similarity.
C24, the device as described in C23, wherein the acquisition module includes:
A first text information acquisition submodule, configured to obtain the text information of the shellcode to be detected, the text information including a shellcode code segment;
A first instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
C25, the device as described in C24, wherein the generation module includes:
A first sub-instruction information extraction submodule, configured to successively extract sub-instruction information of a preset length from the multiple pieces of instruction information;
A first sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A first feature data generation submodule, configured to generate the feature data of the shellcode to be detected using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
C26, the device as described in C25, wherein the first feature data generation submodule includes:
A first target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A first target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of the shellcode to be detected.
C27, the device as described in C23, wherein the preset shellcode sample library is generated by calling the following modules:
An instruction information acquisition module, configured to obtain the instruction information of each of multiple shellcode samples;
A feature data generation module, configured to generate the feature data of each shellcode sample using the instruction information;
A sample clustering module, configured to cluster the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
C28, the device as described in C27, wherein the instruction information acquisition module includes:
A second text information acquisition submodule, configured to obtain the text information of each shellcode sample, the text information including a shellcode code segment;
A second instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
C29, the device as described in C28, wherein the feature data generation module includes:
A second sub-instruction information extraction submodule, configured to successively extract sub-instruction information of a preset length from the multiple pieces of instruction information;
A second sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A second feature data generation submodule, configured to generate the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
C30, the device as described in C29, wherein the second feature data generation submodule includes:
A second target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A second target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
C31, the device as described in any one of C27-C30, wherein the sample clustering module includes:
A selection submodule, configured to randomly select k target shellcode samples from the multiple shellcode samples;
A distance calculation submodule, configured to calculate, according to the feature data, the distance between each remaining shellcode sample and each of the k target shellcode samples;
A sample clustering submodule, configured to cluster the multiple shellcode samples according to the distance.
C32, the device as described in C31, wherein the sample clustering submodule includes:
A category division unit, configured to divide the multiple shellcode samples into k shellcode categories according to the distance;
A sample count statistics unit, configured to count the number of shellcode samples in each shellcode category;
A sample clustering unit, configured to separately cluster each of the m shellcode categories containing the most shellcode samples, to obtain m*n shellcode categories, and to take the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
C33, the device as described in C26 or C32, wherein the computing module includes:
A category center determination submodule, configured to determine the center of each shellcode category in the preset shellcode sample library;
A category center distance calculation submodule, configured to calculate the distance between the shellcode to be detected and the center of each shellcode category.
C34, the device as described in C33, wherein the category center determination submodule includes:
A sample feature vector determination unit, configured to determine the feature vectors of the shellcode samples in each shellcode category;
A sample feature vector mean calculation unit, configured to calculate the mean of the feature vectors of all shellcode samples in each shellcode category and take it as the center of that shellcode category.
C35, the device as described in C34, wherein the determining module includes:
A minimum distance determination submodule, configured to determine the minimum of the distances between the shellcode to be detected and the shellcode categories, and to take the shellcode category corresponding to the minimum distance as the category of the shellcode to be detected.
C36, the device as described in C26 or C32, wherein the computing module includes:
A sample distance calculation submodule, configured to calculate the distance between the shellcode to be detected and the shellcode samples in each preset shellcode category;
A similarity calculation submodule, configured to calculate, using the distance, the similarity between the shellcode to be detected and each shellcode category.
C37, the device as described in C36, wherein the similarity calculation submodule includes:
A summation unit, configured to sum the reciprocals of the distances between the shellcode to be detected and each shellcode sample in a shellcode category, and to take the sum as the similarity between the shellcode to be detected and that shellcode category.
C38, the device as described in C37, wherein the determining module includes:
A maximum similarity determination submodule, configured to determine the maximum of the similarities between the shellcode to be detected and the shellcode categories, and to take the shellcode category corresponding to the maximum similarity as the category of the shellcode to be detected.
The present application also discloses D39, a device for generating a shellcode sample library, including:
An instruction information acquisition module, configured to obtain the instruction information of each of multiple shellcode samples;
A feature data generation module, configured to generate the feature data of each shellcode sample using the instruction information;
A sample clustering module, configured to cluster the multiple shellcode samples according to the feature data, to generate the shellcode sample library.
D40, the device as described in D39, wherein the instruction information acquisition module includes:
A text information acquisition submodule, configured to obtain the text information of each shellcode sample, the text information including a shellcode code segment;
An instruction information extraction submodule, configured to extract multiple pieces of instruction information from the shellcode code segment.
D41, the device as described in D40, wherein the feature data generation module includes:
A sub-instruction information extraction submodule, configured to successively extract sub-instruction information of a preset length from the multiple pieces of instruction information;
A sub-instruction information statistics submodule, configured to count the number of occurrences of each piece of sub-instruction information;
A feature data generation submodule, configured to generate the feature data of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a preset threshold.
D42, the device as described in D41, wherein the feature data generation submodule includes:
A target sub-instruction information calculation unit, configured to calculate the total number of occurrences of the multiple pieces of target sub-instruction information;
A target sub-instruction information normalization unit, configured to normalize the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the feature vector of each shellcode sample.
D43, the device as described in any one of D39-D42, wherein the sample clustering module includes:
A selection submodule, configured to randomly select k target shellcode samples from the multiple shellcode samples;
A distance calculation submodule, configured to calculate, according to the feature data, the distance between each remaining shellcode sample and each of the k target shellcode samples;
A sample clustering submodule, configured to cluster the multiple shellcode samples according to the distance.
D44, the device as described in D43, wherein the sample clustering submodule includes:
A category division unit, configured to divide the multiple shellcode samples into k shellcode categories according to the distance;
A sample count statistics unit, configured to count the number of shellcode samples in each shellcode category;
A sample clustering unit, configured to separately cluster each of the m shellcode categories containing the most shellcode samples, to obtain m*n shellcode categories, and to take the resulting k-m+m*n shellcode categories as the shellcode categories in the shellcode sample library.
Claims (10)
- 1. A shellcode detection method, characterized by including: obtaining command information of a shellcode to be detected; using the command information to generate a characteristic of the shellcode to be detected; calculating, according to the characteristic, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library; and determining the category of the shellcode to be detected according to the similarity.
- 2. The method as claimed in claim 1, characterized in that the step of obtaining command information of a shellcode to be detected includes: obtaining text information of the shellcode to be detected, the text information including a shellcode code segment; and extracting multiple pieces of command information from the shellcode code segment.
- 3. The method as claimed in claim 2, characterized in that the step of using the command information to generate the characteristic of the shellcode to be detected includes: extracting sub-instruction information of a preset length successively from the multiple pieces of command information; counting the number of times each piece of sub-instruction information occurs; and generating the characteristic of the shellcode to be detected using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a predetermined threshold.
- 4. The method as claimed in claim 3, characterized in that the step of generating the characteristic of the shellcode to be detected using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a predetermined threshold includes: calculating the total number of occurrences of the multiple pieces of target sub-instruction information; and normalizing the multiple pieces of target sub-instruction information according to the total number of occurrences, to generate the characteristic vector of the shellcode to be detected.
- 5. The method as claimed in claim 1, characterized in that the preset shellcode sample library is generated as follows: obtaining the command information of multiple shellcode samples respectively; using the command information to generate the characteristic of each shellcode sample; and clustering the multiple shellcode samples according to the characteristic, to generate the shellcode sample library.
- 6. The method as claimed in claim 5, characterized in that the step of obtaining the command information of multiple shellcode samples respectively includes: obtaining the text information of each shellcode sample respectively, the text information including a shellcode code segment; and extracting multiple pieces of command information from the shellcode code segment.
- 7. The method as claimed in claim 6, characterized in that the step of using the command information to generate the characteristic of each shellcode sample includes: extracting sub-instruction information of a preset length successively from the multiple pieces of command information; counting the number of times each piece of sub-instruction information occurs; and generating the characteristic of each shellcode sample using the multiple pieces of target sub-instruction information whose number of occurrences exceeds a predetermined threshold.
- 8. A method for generating a shellcode sample library, characterized by including: obtaining the command information of multiple shellcode samples respectively; using the command information to generate the characteristic of each shellcode sample; and clustering the multiple shellcode samples according to the characteristic, to generate the shellcode sample library.
- 9. A shellcode detection device, characterized by including: an acquisition module, for obtaining command information of a shellcode to be detected; a generation module, for using the command information to generate a characteristic of the shellcode to be detected; a computing module, for calculating, according to the characteristic, the similarity between the shellcode to be detected and each shellcode category in a preset shellcode sample library; and a determining module, for determining the category of the shellcode to be detected according to the similarity.
- 10. A device for generating a shellcode sample library, characterized by including: a command information acquisition module, for obtaining the command information of multiple shellcode samples respectively; a characteristic generation module, for using the command information to generate the characteristic of each shellcode sample; and a sample clustering module, for clustering the multiple shellcode samples according to the characteristic, to generate the shellcode sample library.
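The detection flow of claims 1 to 4 can be illustrated with a short sketch. Several points are assumptions made for the example rather than statements of the claimed method: instructions are represented as mnemonic strings, each category in the sample library is represented by a single characteristic vector, and cosine similarity stands in for the unspecified similarity measure. The names `characteristic_vector`, `cosine_similarity` and `classify` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Gram = Tuple[str, ...]
FeatureVector = Dict[Gram, float]


def characteristic_vector(instructions: List[str], length: int = 2,
                          threshold: int = 1) -> FeatureVector:
    """Extract consecutive sub-sequences of `length` instructions, keep those
    that occur more than `threshold` times, and normalize their counts by
    the total number of occurrences of the kept sub-sequences."""
    grams = [tuple(instructions[i:i + length])
             for i in range(len(instructions) - length + 1)]
    counts = Counter(grams)
    kept = {g: c for g, c in counts.items() if c > threshold}
    total = sum(kept.values())
    return {g: c / total for g, c in kept.items()} if total else {}


def cosine_similarity(a: FeatureVector, b: FeatureVector) -> float:
    # Cosine similarity is an assumed measure; the claims only say "similarity".
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(vector: FeatureVector,
             library: Dict[str, FeatureVector]) -> Tuple[str, float]:
    """Return the library category most similar to `vector`, with its score."""
    scored = ((name, cosine_similarity(vector, rep))
              for name, rep in library.items())
    return max(scored, key=lambda pair: pair[1])
```

With the instruction list `["xor", "push", "xor", "push", "xor", "push", "int"]`, a length of 2 and a threshold of 1, the pair `("xor", "push")` occurs three times and `("push", "xor")` twice, so the vector holds the weights 0.6 and 0.4, while `("push", "int")` is dropped because it occurs only once. A real system might decide that a sample matches no known category when the best score is too low. The claims do not state such a cutoff, so none is included here.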
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710667103.7A CN107562618A (en) | 2017-08-07 | 2017-08-07 | A kind of shellcode detection method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710667103.7A CN107562618A (en) | 2017-08-07 | 2017-08-07 | A kind of shellcode detection method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107562618A (en) | 2018-01-09 |
Family
ID=60975105
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710667103.7A Pending CN107562618A (en) | 2017-08-07 | 2017-08-07 | A kind of shellcode detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107562618A (en) |
2017-08-07: Application CN201710667103.7A filed in CN (publication CN107562618A, status: Pending)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140150101A1 (en) * | 2012-09-12 | 2014-05-29 | Xecure Lab Co., Ltd. | Method for recognizing malicious file |
CN103607391A (en) * | 2013-11-19 | 2014-02-26 | 北京航空航天大学 | SQL injection attack detection method based on K-means |
CN104978526A (en) * | 2015-06-30 | 2015-10-14 | 北京奇虎科技有限公司 | Virus signature extraction method and apparatus |
CN106960153A (en) * | 2016-01-12 | 2017-07-18 | 阿里巴巴集团控股有限公司 | The kind identification method and device of virus |
CN106817248A (en) * | 2016-12-19 | 2017-06-09 | 西安电子科技大学 | A kind of APT attack detection methods |
CN106778278A (en) * | 2017-02-15 | 2017-05-31 | 中国科学院信息工程研究所 | A kind of malice document detection method and device |
Non-Patent Citations (1)
Title |
---|
Liu Yuyang (刘宇扬) et al.: "Shellcode detection based on the bag-of-words model" (基于词袋模型的shellcode检测), Sciencepaper Online (中国科技论文在线), HTTP://WWW.PAPER.EDU.CN/RELEASEPAPER/CONTENT/201512-302 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866249A (en) * | 2018-12-11 | 2020-03-06 | 北京安天网络安全技术有限公司 | Method and device for dynamically detecting malicious code and electronic equipment |
CN113486354A (en) * | 2021-08-20 | 2021-10-08 | 国网山东省电力公司电力科学研究院 | Firmware safety evaluation method, system, medium and electronic equipment |
CN113486354B (en) * | 2021-08-20 | 2024-08-02 | 国网山东省电力公司电力科学研究院 | Firmware security assessment method, system, medium and electronic equipment |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN103761476B (en) | The method and device of feature extraction | |
CN105915555A (en) | Method and system for detecting network anomalous behavior | |
CN106503558B (en) | A kind of Android malicious code detecting method based on community structure analysis | |
CN111639337B (en) | Unknown malicious code detection method and system for massive Windows software | |
CN109271788B (en) | Android malicious software detection method based on deep learning | |
US11030312B2 (en) | System and method for machine based detection of a malicious executable file | |
AU2009302657A1 (en) | Detection of confidential information | |
CN109063478A (en) | Method for detecting virus, device, equipment and the medium of transplantable executable file | |
CN105205397A (en) | Rogue program sample classification method and device | |
CN109614795B (en) | Event-aware android malicious software detection method | |
CN112005532A (en) | Malware classification of executable files over convolutional networks | |
KR20190070702A (en) | System and method for automatically verifying security events based on text mining | |
CN111651768B (en) | Method and device for identifying link library function name of computer binary program | |
CN116361801A (en) | Malicious software detection method and system based on semantic information of application program interface | |
KR20200109677A (en) | An apparatus and method for detecting malicious codes using ai based machine running cross validation techniques | |
CN111400713B (en) | Malicious software population classification method based on operation code adjacency graph characteristics | |
CN114595451A (en) | Graph convolution-based android malicious application classification method | |
Paranthaman et al. | Malware collection and analysis | |
CN107562618A (en) | A kind of shellcode detection method and device | |
Ugarte-Pedrero et al. | On the adoption of anomaly detection for packed executable filtering | |
CN110458239A (en) | Malware classification method and system based on binary channels convolutional neural networks | |
CN109753794A (en) | A kind of recognition methods of malicious application, system, training method, equipment and medium | |
CN111667018A (en) | Object clustering method and device, computer readable medium and electronic equipment | |
CN110210215B (en) | Virus detection method and related device | |
CN113626817B (en) | Malicious code family classification method |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 100015 Jiuxianqiao Chaoyang District Beijing Road No. 10, building 15, floor 17, layer 1701-26, 3. Applicant after: QAX Technology Group Inc. Address before: 100015 Jiuxianqiao Chaoyang District Beijing Road No. 10, building 15, floor 17, layer 1701-26, 3. Applicant before: BEIJING QIANXIN TECHNOLOGY Co.,Ltd. |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180109 |