CN114189585B - Harassment call abnormality detection method and device and computing equipment - Google Patents
- Publication number
- CN114189585B (application CN202010961602.9A)
- Authority
- CN
- China
- Prior art keywords
- call
- data
- early warning
- decision tree
- historical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/22—Arrangements for supervision, monitoring or testing
- H04M3/2281—Call monitoring, e.g. for law enforcement purposes; Call tracing; Detection or prevention of malicious calls
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/22—Arrangements for supervision, monitoring or testing
- H04M3/2218—Call detail recording
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/436—Arrangements for screening incoming calls, i.e. evaluating the characteristics of a call before deciding whether to answer it
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Technology Law (AREA)
- Telephonic Communication Services (AREA)
Abstract
The embodiment of the invention relates to the technical field of communication, and discloses a method, a device, a computing device and a storage medium for detecting nuisance call anomalies. The method comprises the following steps: acquiring call record ticket data (call detail records) and internet crawler data; analyzing the call record ticket data with a pre-built nuisance call early warning model to obtain early warning data; performing a secondary review of the early warning data against the internet crawler data to obtain confirmed nuisance telephone numbers; and issuing the confirmed nuisance telephone numbers so that they are shut down. In this way, detection draws on multiple data sources, better matches real-world scenarios, and can improve detection accuracy and completeness.
Description
Technical Field
The embodiment of the invention relates to the technical field of communication, in particular to a method, a device, a computing device and a storage medium for detecting nuisance call abnormality.
Background
As nuisance calls have surged in recent years, vendors have introduced nuisance call analysis and interception products on both the network side and the terminal side. On the network side, operators have mainly built nuisance call analysis models and interception systems based on call signaling data. On the terminal side, products are mainly applications (APPs) provided by internet vendors, such as 360 Mobile Guard and the Hunting Network platform. Terminal-side products rely mainly on end users tapping to report numbers; the reports feed a nuisance call database, which is downloaded to the terminal so that a reminder is shown when a call comes in.
Existing detection schemes for nuisance call governance have the following main limitations. On the network side, nuisance calls are classified coarsely: they are generally divided by signaling attributes alone, without subdividing by nuisance call attributes such as the caller's industry or the caller's credit standing. The data dimensions and extracted feature data are therefore relatively limited, and processing precision is insufficient. On the terminal side, the current APP approach requires users to grant permissions, which involves user privacy; it currently covers only some smart terminal users and cannot cover users of non-smart terminals. In practice it depends on existing data, so new nuisance numbers cannot be handled promptly and there is a lag; at the same time, numbers that an operator has reclaimed may be intercepted by mistake. Moreover, the data all come from end users, so omissions and malicious marking occur.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a method, an apparatus, a computing device, and a storage medium for detecting a nuisance call anomaly, which overcome or at least partially solve the above problems.
According to one aspect of the embodiments of the invention, a method for detecting nuisance call anomalies is provided, comprising: acquiring call record ticket data and internet crawler data; analyzing the call record ticket data with a pre-built nuisance call early warning model to obtain early warning data; performing a secondary review of the early warning data against the internet crawler data to obtain confirmed nuisance telephone numbers; and issuing the confirmed nuisance telephone numbers so that they are shut down.
In an optional manner, acquiring the call record ticket data includes: collecting the call record ticket data from the service domain and the operation domain, and computing behavior feature statistics from the call record ticket data.
In an optional manner, the nuisance call early warning model includes: a high-frequency call early warning model, a cat pool (modem pool) early warning model, a high-risk user monitoring model, a silent card-activation monitoring model, and an hour model.
In an optional manner, before the analysis with the nuisance call early warning model, the method includes: acquiring historical call record ticket data and historical internet crawler data, and computing historical behavior feature statistics from the historical call record ticket data; constructing a decision tree from the historical behavior features and the historical internet crawler data; verifying the decision tree with a random subsampling method or a bootstrap (self-help) sampling method, chosen according to the number of samples, adjusting the parameters of the decision tree, and determining the final decision tree; and establishing the nuisance call early warning model from the decision tree.
In an optional manner, constructing the decision tree from the historical behavior features and the historical internet crawler data includes: computing the information gain of each historical behavior feature; building branches in turn in descending order of information gain, starting from the feature with the largest gain, to construct the decision tree; and pruning the decision tree according to the historical internet crawler data, stopping construction when the information gain falls below a preset threshold.
In an optional manner, computing the information gain includes: obtaining the empirical entropy and the conditional entropy corresponding to the historical behavior features; and calculating the information gain from the empirical entropy and the conditional entropy.
In an optional manner, issuing the confirmed nuisance telephone numbers so that they are shut down includes: automatically issuing the nuisance telephone number to the company it belongs to, which notifies the relevant organization to shut it down; or associating the nuisance telephone number with an existing automatic disposal system so that it is shut down automatically.
According to another aspect of the embodiments of the present invention, a device for detecting nuisance call anomalies is provided, comprising: a data acquisition unit for acquiring call record ticket data and internet crawler data; a model analysis unit for analyzing the call record ticket data with a pre-built nuisance call early warning model to obtain early warning data; a secondary review unit for performing a secondary review of the early warning data against the internet crawler data to obtain confirmed nuisance telephone numbers; and a number shut-down unit for issuing the confirmed nuisance telephone numbers so that they are shut down.
According to another aspect of an embodiment of the present invention, there is provided a computing device including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
The memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the steps of the harassment call abnormality detection method.
According to still another aspect of the embodiments of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing the processor to execute the steps of the method for detecting a nuisance call anomaly as described above.
According to the embodiment of the invention, call record ticket data and internet crawler data are acquired; the call record ticket data is analyzed with a pre-built nuisance call early warning model to obtain early warning data; a secondary review of the early warning data is performed against the internet crawler data to obtain confirmed nuisance telephone numbers; and the confirmed nuisance telephone numbers are issued so that they are shut down. Detection can therefore draw on multiple data sources, better matches real-world scenarios, and can improve detection accuracy and completeness.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, specific embodiments of the present invention are set out below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 shows a schematic diagram of a nuisance call anomaly detection system provided by an embodiment of the present invention;
FIG. 2 shows a schematic flow chart of a nuisance call anomaly detection method provided by an embodiment of the present invention;
FIG. 3 shows a schematic flow chart of the construction of the nuisance call early warning model in the nuisance call anomaly detection method provided by an embodiment of the present invention;
FIG. 4 shows a diagram of the behavior features of the nuisance call early warning model in the nuisance call anomaly detection method provided by an embodiment of the present invention;
FIG. 5 shows a schematic flow chart of another nuisance call anomaly detection method provided by an embodiment of the present invention;
FIG. 6 shows a schematic structural diagram of a nuisance call anomaly detection device provided by an embodiment of the present invention;
FIG. 7 illustrates a schematic diagram of a computing device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
To meet the need for anomaly detection in nuisance call governance, the embodiment of the invention provides a comprehensive nuisance call anomaly detection system. Built on an existing general-purpose big data architecture, it uses the MapReduce, Hive and Spark distributed technologies within the Hadoop distributed computing framework to analyze and raise early warnings for scenarios such as silent card activation within the network, card issuance by high-risk agents, calls from suspicious base stations, high-risk users, and cat pool nuisance data governance.
The architecture of the nuisance call anomaly detection system is shown in Fig. 1 and is divided into three parts: a security data center, a security analysis sub-platform and a security situation management sub-platform. Collected external data is first organized and stored in the security data center, then analyzed further by the security analysis sub-platform, and the analysis results are output to the security situation management sub-platform for display and automatic disposal.
The security data center collects raw data, performs preliminary classification and sorting, and provides resources and interfaces for upper-layer analysis. After operations such as data cleaning, normalization, padding and labeling in the security data center, the raw data is stored in the Hadoop Distributed File System (HDFS) in preparation for subsequent analysis. The security data center also provides the corresponding interfaces (API, JDBC, FTP) and resources for analysis, including SQL, Spark, HDFS, Elasticsearch (ES) and the like. This layer provides metadata management, component management and operation monitoring functions. Metadata management flexibly manages the form of the data, supports resumable transfer for data collection and verification of data transmission, and provides a unified interface so that new data can be connected easily and future capacity expansion is prepared for. Component management flexibly manages the collection, analysis and storage components, and provides functions such as component upgrade and component restart. Operation monitoring tracks in real time the running state of each component and the utilization of system resources.
The security analysis sub-platform analyzes nuisance call service anomalies and comprises a basic algorithm library and various service anomaly analysis models. The analysis models include a high-frequency call monitoring model, a high-risk user monitoring model, an hour detection model, other anomaly detection models and the like. This layer also provides engine management and operation monitoring functions. Engine management allows new detection engines to be added flexibly to cover more detection scenarios, and allows the algorithm of each engine to be updated and debugged flexibly. Operation monitoring tracks the running state of each engine, including whether the engine is running correctly and whether it has crashed.
The security situation management sub-platform provides a situation presentation function in the form of visual threat early warnings and risk notifications, and can automatically dispose of analysis results as required. The situation presentation function gives a chart-based visual display of the current state, history and development trend of the various security risks along dimensions such as time and space.
Based on statistics over a large volume of historical call record ticket data, the embodiment of the invention identifies the feature differences between nuisance call behavior and the call behavior of ordinary users, and then judges call records according to these features. On this basis, internet crawler data is additionally imported. The crawler data serves as an additional dimension for a decision tree, a machine learning technique, which is used to learn the features of telecom nuisance behavior.
Fig. 2 shows a flow chart of a nuisance call anomaly detection method provided by the embodiment of the invention. The method is applied to an operator server and, as shown in Fig. 2, comprises the following steps:
Step S11: Acquire call record ticket data and internet crawler data.
The embodiment of the invention uses a purpose-built big data storage platform together with existing distributed frameworks such as MapReduce, Hive and Spark. Raw data is obtained from the service domain and the operation domain, including:
Basic data of the service operation support system (Business and Operation Support System, BOSS) comprises card unit data, card issuing unit data and card issuing information data.
BOSS business data comprises voice ticket, short message ticket, flow ticket and other ticket data.
User network access data including network access information, channel information of network access location, etc.
User profile data including billing data, internet surfing data, etc.
In the embodiment of the invention, the call record ticket data is collected from the service domain and the operation domain, and behavior feature statistics are computed from it. The collected call record ticket data is divided into behavior features of different categories according to the category to which it belongs.
The crawler data in the embodiment of the invention comes from a wide range of sources: telephone number marking data from multiple online platforms, such as internet security vendors including Baidu and 360; and multidimensional data including user industry information, business information, credit information, property and advertising promotion information, and the like.
Step S12: Analyze the call record ticket data with the pre-built nuisance call early warning model to obtain early warning data.
Before step S12, a nuisance call early warning model needs to be constructed. As shown in Fig. 3, the construction includes:
Step S121: Acquire historical call record ticket data and historical internet crawler data, and compute historical behavior feature statistics from the historical call record ticket data.
Specifically, the data is processed in the same way as in step S11: the collected data is divided into historical behavior features of different categories according to the category to which it belongs.
Step S122: Construct a decision tree from the historical behavior features and the historical internet crawler data.
First, the information gain is obtained from the historical behavior features. Specifically, the empirical entropy and the conditional entropy corresponding to the historical behavior features are obtained, and the information gain is calculated from them.

The expected value of the information contained in all possible values of a feature (its entropy) satisfies the following relation:

$$H = -\sum_{i=1}^{n} p(x_i) \log p(x_i)$$

where $n$ is the number of classes and $p(x_i)$ is the probability of the $i$-th value $x_i$.

In the embodiment of the invention, the data in the sample data table is defined as the training data set $D$; $H(D)$ is the empirical entropy of $D$, and $|D|$ is the sample size, i.e. the number of samples. Suppose there are $K$ classes $C_k$, $k = 1, 2, \ldots, K$, and $|C_k|$ is the number of samples belonging to class $C_k$. The empirical entropy corresponding to the historical behavior features can then be calculated as:

$$H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log \frac{|C_k|}{|D|}$$

The conditional entropy $H(Y \mid X)$ represents the uncertainty of the random variable $Y$ given the random variable $X$. It is the mathematical expectation, over $X$, of the entropy of the conditional probability distribution of $Y$ given $X$. The conditional entropy corresponding to the historical behavior features is calculated as:

$$H(Y \mid X) = \sum_{i=1}^{n} p_i \, H(Y \mid X = x_i)$$

where $p_i = P(X = x_i)$, $i = 1, 2, \ldots, n$.

For each feature, the information gain is the difference between the empirical entropy and the conditional entropy. The information gain is therefore calculated as:

$$g(D, A) = H(D) - H(D \mid A)$$

where $A$ is the feature.
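The entropy and information gain calculation described above can be sketched in a few lines of Python. This is only an illustration: the feature names and the toy samples are hypothetical, and the base-2 logarithm is an assumption (the choice of base rescales every gain by the same factor and does not change which feature ranks highest).

```python
import math
from collections import Counter


def empirical_entropy(labels):
    """H(D) = -sum_k (|C_k| / |D|) * log2(|C_k| / |D|)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """H(D|A) = sum_i (|D_i| / |D|) * H(D_i), D_i being the samples with A = a_i."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / total * empirical_entropy(g) for g in groups.values())


def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A)."""
    return empirical_entropy(labels) - conditional_entropy(feature_values, labels)


# Hypothetical toy samples: one discretised behavior feature per list.
is_nuisance = [True, True, True, False, False, False]
calling_frequency = ["high", "high", "high", "low", "low", "low"]
called_dispersion = ["high", "low", "high", "low", "high", "low"]

gain_frequency = information_gain(calling_frequency, is_nuisance)
gain_dispersion = information_gain(called_dispersion, is_nuisance)
print(gain_frequency, gain_dispersion)
```

In this toy data the calling frequency separates the two classes perfectly, so its gain equals the full empirical entropy, while the dispersion feature carries very little information and would be split on later, if at all.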
Then branches are built in turn in descending order of information gain, starting with the feature that has the largest gain, to construct the decision tree.
Because decision trees are prone to overfitting, the parameter values used when constructing the tree must be chosen carefully in order to improve the accuracy of the telecom nuisance behavior classifier and its ability to recognize new data. In the embodiment of the invention, the decision tree is pruned according to the historical internet crawler data, and construction stops when the information gain is smaller than a preset threshold. In other words, the attribute dimensions of the internet data are used to prune the tree, and construction stops once the information gain falls below the preset threshold, which yields a suitable decision tree.
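A minimal sketch of this construction, assuming an ID3-style tree over discretised features, is shown below. The patent does not specify the exact pruning procedure, so the sketch only illustrates the stopping rule (stop splitting when the best gain is below `min_gain`) and includes a crawler-derived attribute, `marked_online`, as one of the candidate features. All names, data and the threshold value are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain(rows, labels, feature):
    """Information gain of splitting the samples on one feature."""
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


def build_tree(rows, labels, features, min_gain=0.05):
    """Split on the highest-gain feature first; stop when gain < min_gain."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: gain(rows, labels, f))
    if gain(rows, labels, best) < min_gain:
        return majority  # gain below the preset threshold: stop growing
    node = {"feature": best, "default": majority, "children": {}}
    remaining = [f for f in features if f != best]
    for value in {row[best] for row in rows}:
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining, min_gain)
    return node


def predict(tree, row):
    """Walk the tree; fall back to the node's majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical samples: a call-behavior feature plus a crawler-derived mark.
rows = [
    {"calling_frequency": "high", "marked_online": "yes"},
    {"calling_frequency": "high", "marked_online": "yes"},
    {"calling_frequency": "high", "marked_online": "no"},
    {"calling_frequency": "low", "marked_online": "no"},
    {"calling_frequency": "low", "marked_online": "no"},
    {"calling_frequency": "low", "marked_online": "yes"},
]
labels = ["nuisance", "nuisance", "normal", "normal", "normal", "normal"]
features = ["calling_frequency", "marked_online"]

tree = build_tree(rows, labels, features)
print(predict(tree, {"calling_frequency": "high", "marked_online": "yes"}))
```

Raising `min_gain` makes the tree shallower; with a threshold above the best available gain the function returns a single leaf holding the majority class.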
Step S123: Verify the decision tree with a random subsampling method or a bootstrap (self-help) sampling method, chosen according to the number of samples, adjust the parameters of the decision tree, and determine the final decision tree.
After the decision tree is constructed, data must be fed into it for verification, and the quality of the tree is judged by the resulting evaluation index values. The tree is verified with random subsampling or bootstrap sampling depending on the number of samples, the parameters used to construct the tree are adjusted according to the verification results, and the final usable decision tree is obtained. Random subsampling suits larger data volumes, and bootstrap sampling suits smaller ones. Four evaluation indexes are used: classification accuracy, recall, false alarm rate, and precision.
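The two verification schemes and the four evaluation indexes can be sketched as follows. The sample-count cut-off, the number of rounds, the test ratio and the definition of the false alarm rate as FP / (FP + TN) are all assumptions made for illustration; the patent does not give these values.

```python
import random


def evaluation_indexes(y_true, y_pred):
    """Accuracy, recall, false alarm rate and precision for boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    def ratio(a, b):
        return a / b if b else 0.0

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "recall": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),  # assumed definition
        "precision": ratio(tp, tp + fp),
    }


def random_subsampling_splits(n, rounds=5, test_ratio=0.3, seed=0):
    """Repeated random hold-out: shuffle, then cut into train and test indices."""
    rng = random.Random(seed)
    for _ in range(rounds):
        idx = list(range(n))
        rng.shuffle(idx)
        cut = n - int(n * test_ratio)
        yield idx[:cut], idx[cut:]


def bootstrap_splits(n, rounds=5, seed=0):
    """Bootstrap: draw n indices with replacement; untouched ones form the test set."""
    rng = random.Random(seed)
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        drawn = set(train)
        yield train, [i for i in range(n) if i not in drawn]


def choose_splits(n, large_sample_cutoff=10_000):
    """Random subsampling for larger data sets, bootstrap for smaller ones."""
    if n >= large_sample_cutoff:
        return random_subsampling_splits(n)
    return bootstrap_splits(n)


print(evaluation_indexes([True, True, False, False], [True, False, True, False]))
```

Each generated pair of index lists would be used to retrain and score the tree, and the averaged indexes guide the parameter adjustment.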
Step S124: Establish the nuisance call early warning model from the decision tree.
In the embodiment of the invention, the decision tree is used to establish the nuisance call early warning model, which is the model used in the actual production environment. The behavior features of the nuisance call early warning model are shown in Fig. 4 and comprise calling statistics and called statistics. The calling statistics include calling frequency, calling idle rate, called region dispersion rate, time distribution, called number dispersion rate, average call duration and the like. The called statistics include called frequency, time distribution, calling number dispersion rate, calling number region dispersion rate and the like. The nuisance call early warning models include: a high-frequency call early warning model, a cat pool early warning model, a high-risk user monitoring model, a silent card-activation monitoring model, and an hour model. Different early warning models suit the detection of different types of nuisance calls; each is built with the same decision tree method described above but uses different categories of behavior features.
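As an illustration of how some of the calling statistics could be derived from call record ticket data, the sketch below aggregates records per calling number. The record field names are hypothetical, and the dispersion rate is assumed here to be the number of distinct values divided by the number of calls; the patent does not define the statistic precisely.

```python
from collections import defaultdict


def calling_statistics(records):
    """Aggregate per-caller behavior features from call records."""
    by_caller = defaultdict(list)
    for record in records:
        by_caller[record["caller"]].append(record)

    features = {}
    for caller, calls in by_caller.items():
        n = len(calls)
        features[caller] = {
            "calling_frequency": n,
            # Assumed definition: distinct values / number of calls.
            "called_number_dispersion": len({c["callee"] for c in calls}) / n,
            "called_region_dispersion": len({c["callee_region"] for c in calls}) / n,
            "avg_call_duration_s": sum(c["duration_s"] for c in calls) / n,
        }
    return features


# Hypothetical call records (field names are illustrative only).
records = [
    {"caller": "A", "callee": "1", "callee_region": "X", "duration_s": 10},
    {"caller": "A", "callee": "2", "callee_region": "Y", "duration_s": 20},
    {"caller": "A", "callee": "3", "callee_region": "Z", "duration_s": 30},
    {"caller": "B", "callee": "9", "callee_region": "X", "duration_s": 300},
    {"caller": "B", "callee": "9", "callee_region": "X", "duration_s": 100},
]

features = calling_statistics(records)
print(features["A"])
```

Caller A, who dials many different numbers in different regions with short calls, gets high dispersion values, while caller B, who repeatedly calls one number, does not.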
In step S12, the call record ticket data is analyzed with the established nuisance call early warning model. Specifically, one or more nuisance call early warning models can be applied to analyze and filter the raw data along each dimension, and the portion of the data that meets the criteria of a nuisance call early warning model is marked as early warning data.
Step S13: Perform a secondary review of the early warning data against the internet crawler data to obtain confirmed nuisance telephone numbers.
Internet crawler data is introduced for verification. It serves as an additional dimension for checking the early warning data in the secondary review, which makes it easier to tell whether a number is a genuine nuisance telephone number and so yields the confirmed nuisance telephone numbers.
Step S14: Issue the confirmed nuisance telephone numbers so that they are shut down.
In the embodiment of the invention, numbers that match the nuisance call early warning model and are confirmed by the review as exhibiting nuisance characteristics are automatically issued to the company they belong to, which notifies the relevant organization to shut them down. Alternatively, the nuisance telephone number can be associated with an existing automatic disposal system so that it is shut down automatically, which improves the timeliness of the shut-down.
According to the embodiment of the invention, the feature differences between the call behavior of suspected nuisance callers and that of ordinary users are found through analysis; a decision tree, a machine learning technique, is used to learn telecom nuisance behavior and establish a nuisance call early warning model; and the model is then applied to find high-risk suspected nuisance numbers and carry out early warning disposal. The complete nuisance call anomaly detection method is shown in Fig. 5 and comprises the following steps:
step S201: raw data is acquired.
Specifically, raw data is acquired from the service domain and the operation domain, and internet crawler data is also acquired from the internet. The original data includes call record ticket data, user profile data, and the like.
Step S202: and (5) analyzing by using a harassment call early warning model.
The harassing call early warning model comprises the following steps: high-frequency telephone early warning model, cat pool early warning model, high-risk user monitoring model, silence card-opening monitoring model, hour model. One or more of the harassing call early warning models can be applied to analyze and filter the original data in each dimension, and the data of which part accords with the harassing call early warning model standard is marked as early warning data.
Step S203: and (5) performing secondary review by using internet crawler data.
The internet crawler data is specifically used as an extra dimension to check the data of the secondary review, so that whether the data are real harassment telephone numbers can be better distinguished.
Step S204: Judge whether the shutdown condition is met. If not, step S205 is performed; if so, the process proceeds to step S206.
Specifically, if the data matches the harassment call early warning model and passes the internet crawler data verification, the shutdown condition is met and the number is a real harassment telephone number. Otherwise, the shutdown condition is not met.
Step S205: No disposal.
If the data does not match the harassment call early warning model and/or fails the internet crawler data verification, the shutdown condition is not met; no processing is performed and the original state is kept unchanged.
Step S206: Automatically issue the harassment telephone number.
If the data matches the harassment call early warning model and the internet crawler data check confirms a real harassment telephone number, the number is issued to the affiliated company or passed to an existing automatic disposal system associated with it.
Step S207: Shut down the harassment telephone number.
The affiliated company to which the number is issued notifies the relevant organization to shut the number down, or the existing automatic disposal system associated with the number shuts it down automatically.
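The decision logic of steps S202 to S205 can be sketched as follows. This is only an illustration under stated assumptions: the `NumberStats` fields, the `high_frequency_model` predicate and its threshold of 200 calls per day, and the representation of crawler data as a set of flagged numbers are all hypothetical and are not specified in the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class NumberStats:
    number: str
    calls_per_day: int     # hypothetical behavior feature
    distinct_callees: int  # hypothetical behavior feature


# An early warning "model" is represented here as a predicate over the statistics.
Model = Callable[[NumberStats], bool]


def high_frequency_model(s: NumberStats) -> bool:
    # Hypothetical threshold, chosen only for illustration.
    return s.calls_per_day >= 200


def detect(stats: Iterable[NumberStats], models: List[Model],
           crawler_flagged: Set[str]) -> List[str]:
    """Flag numbers matched by any model (S202), then keep only those
    confirmed by the crawler data (S203/S204)."""
    early_warning = [s.number for s in stats if any(m(s) for m in models)]
    return [n for n in early_warning if n in crawler_flagged]


stats = [
    NumberStats("13800000001", 350, 340),  # matches model, confirmed by crawler
    NumberStats("13800000002", 260, 12),   # matches model, not confirmed
    NumberStats("13800000003", 8, 5),      # crawler hit only, no model match
]
crawler_flagged = {"13800000001", "13800000003"}
confirmed = detect(stats, [high_frequency_model], crawler_flagged)
print(confirmed)  # ['13800000001']
```

Only the number that satisfies both checks reaches the issuing step; the other two fall under step S205 and are left unchanged.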
The harassment call abnormality detection method of this embodiment uses more data sources: in addition to the calling party's signaling data, related information such as billing tickets, network access information, and traffic information is used to construct the harassment call early warning model. A model based on multiple data sources can improve the accuracy and completeness of the analysis results. In addition to analysis by the conventional harassment call early warning model, the embodiment introduces internet crawler data. Fusing the two data sources better distinguishes normal calls from actual harassment calls that have similar characteristics, so that the analysis results better match practical scenarios and are more accurate and reliable. This analysis approach can also guide the construction of future harassment call analysis systems. The resulting harassment call analysis data can be further processed according to actual service requirements, which improves the flexibility and breadth of adaptation of the data and improves production efficiency.
In this embodiment of the invention, call record ticket data and internet crawler data are acquired; a pre-constructed harassment call early warning model is applied to analyze the call record ticket data and obtain early warning data; a secondary review of the early warning data is performed according to the internet crawler data to obtain real harassment telephone numbers; and the real harassment telephone numbers are issued so that the numbers are shut down. Harassment telephone numbers can thus be handled based on multiple data sources, which better matches practical scenarios and can improve detection accuracy and completeness.
Fig. 6 shows a schematic structural diagram of a harassment call abnormality detection device according to an embodiment of the present invention. As shown in Fig. 6, the harassment call abnormality detection device includes: a data acquisition unit 601, a model analysis unit 602, a secondary review unit 603, a number shutdown unit 604, and a model construction unit 605. Wherein:
The data acquisition unit 601 is configured to acquire call record ticket data and internet crawler data; the model analysis unit 602 is configured to apply a pre-constructed harassment call early warning model to analyze the call record ticket data and obtain early warning data; the secondary review unit 603 is configured to perform a secondary review of the early warning data according to the internet crawler data to obtain real harassment telephone numbers; the number shutdown unit 604 is configured to issue the real harassment telephone numbers so that the numbers are shut down.
In an alternative manner, the data acquisition unit 601 is configured to: collect the call record ticket data from the service domain and the operation domain, and compute behavior feature statistics from the call record ticket data.
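As a rough illustration of computing behavior feature statistics from call record ticket data, the sketch below aggregates per-caller counts. The record layout `(caller, callee, duration_seconds)` and the three feature names are assumptions made for the example; the text does not list which features are computed.

```python
from collections import defaultdict

# Each record is (caller, callee, duration_seconds); this layout is assumed.
records = [
    ("A", "x", 5), ("A", "y", 3), ("A", "z", 4),
    ("B", "x", 120), ("B", "x", 300),
]


def behavior_features(records):
    """Aggregate simple per-caller statistics from call records."""
    calls = defaultdict(int)
    callees = defaultdict(set)
    total_duration = defaultdict(int)
    for caller, callee, duration in records:
        calls[caller] += 1
        callees[caller].add(callee)
        total_duration[caller] += duration
    return {
        caller: {
            "call_count": calls[caller],
            "distinct_callees": len(callees[caller]),
            "avg_duration": total_duration[caller] / calls[caller],
        }
        for caller in calls
    }


features = behavior_features(records)
print(features["A"])
```

In this toy data, caller A makes many short calls to different numbers while caller B makes long calls to one number, which is the kind of behavioral difference the early warning models are meant to pick up.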
In an alternative manner, the harassment call early warning model includes: a high-frequency telephone early warning model, a cat pool early warning model, a high-risk user monitoring model, a silence card-opening monitoring model, and an hour model.
In an alternative manner, the model construction unit 605 is configured to: acquire historical call record ticket data and historical internet crawler data, and compute historical behavior feature statistics from the historical call record ticket data; construct a decision tree according to the historical behavior features and the historical internet crawler data; verify the decision tree by a random subsampling method or a bootstrap sampling method selected according to the number of samples, adjust the parameters of the decision tree, and determine the final decision tree; and establish the harassment call early warning model using the decision tree.
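The choice between random subsampling and bootstrap sampling can be sketched as below. The text only says the method is chosen according to the number of samples; the rule used here (bootstrap for small sample sets, repeated random hold-out otherwise), the limit of 1000 samples, the 30% test share, and all function names are assumptions for illustration.

```python
import random


def random_subsampling_split(n, test_percent, rng):
    """One random hold-out split: shuffle indices, reserve a share for testing."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = max(1, n * test_percent // 100)
    return idx[n_test:], idx[:n_test]  # (train, test)


def bootstrap_split(n, rng):
    """Draw n training indices with replacement; unused indices form the test set."""
    train = [rng.randrange(n) for _ in range(n)]
    chosen = set(train)
    test = [i for i in range(n) if i not in chosen]
    return train, test


def validation_splits(n, rounds=5, small_sample_limit=1000, seed=0):
    """Yield (train, test) index lists, choosing the method by sample count."""
    rng = random.Random(seed)
    use_bootstrap = n < small_sample_limit  # assumed selection rule
    for _ in range(rounds):
        if use_bootstrap:
            yield bootstrap_split(n, rng)
        else:
            yield random_subsampling_split(n, 30, rng)


splits_small = list(validation_splits(50))    # bootstrap path
splits_large = list(validation_splits(2000))  # random subsampling path
```

Each split would be used to train the tree on the training indices and score it on the test indices; the averaged score then guides parameter adjustment.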
In an alternative manner, the model construction unit 605 is configured to: obtain information gain according to the historical behavior features; establish branches in descending order of information gain, starting from the feature with the maximum information gain, to construct the decision tree; perform a pruning operation on the decision tree according to the historical internet crawler data, and stop construction of the decision tree when the information gain is smaller than a preset threshold.
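A minimal ID3-style sketch of this construction is given below, assuming discrete features and class labels. It implements only the gain-based branching and the stop-below-threshold rule; the pruning against historical internet crawler data is not shown because the text does not describe its mechanics. The feature names, labels, and `min_gain` value are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Empirical entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy of the labels minus the conditional entropy given the feature."""
    total = len(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def build_tree(rows, labels, features, min_gain=0.01):
    """Branch on the highest-gain feature; stop when the gain drops below min_gain."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    gains = {f: information_gain(rows, labels, f) for f in features}
    best = max(gains, key=gains.get)
    if gains[best] < min_gain:
        return majority
    node = {"feature": best, "children": {}, "default": majority}
    remaining = [f for f in features if f != best]
    for value in {row[best] for row in rows}:
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in subset], [y for _, y in subset], remaining, min_gain)
    return node


def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree


rows = [
    {"high_frequency": True, "short_calls": True},
    {"high_frequency": True, "short_calls": False},
    {"high_frequency": False, "short_calls": True},
    {"high_frequency": False, "short_calls": False},
]
labels = ["harassment", "harassment", "normal", "normal"]
tree = build_tree(rows, labels, ["high_frequency", "short_calls"])
print(tree["feature"])  # high_frequency
```

In this toy data `high_frequency` separates the classes perfectly (gain of 1 bit) while `short_calls` carries no information (gain of 0), so the tree branches once and stops.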
In an alternative manner, the model construction unit 605 is configured to: obtain, from the historical behavior features, the empirical entropy and the conditional entropy corresponding to those features; and calculate the information gain from the empirical entropy and the conditional entropy.
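The text does not write out the formulas. Under the standard decision tree definitions, which are assumed here, let D be the set of historical samples, C_k the subset of D belonging to class k (for example, harassment or normal), and D_i the subset of D in which behavior feature A takes its i-th value. The three quantities are then:

```latex
\begin{align}
H(D) &= -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log_2 \frac{|C_k|}{|D|} \\
H(D \mid A) &= \sum_{i=1}^{n} \frac{|D_i|}{|D|} \, H(D_i) \\
g(D, A) &= H(D) - H(D \mid A)
\end{align}
```

For example, if D holds four numbers, two of them harassment numbers, then H(D) = 1 bit. A feature A that places both harassment numbers in one subset and both normal numbers in the other gives H(D | A) = 0, so g(D, A) = 1 bit, the largest gain possible for this D.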
In an alternative manner, the number shutdown unit 604 is configured to: automatically issue the harassment telephone number to the affiliated company, which notifies the relevant organization to shut the number down; or associate the harassment telephone number with an existing automatic disposal system so that the number is shut down automatically.
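The two disposal routes can be sketched as a small dispatcher. Everything here is hypothetical: the `company_of` mapping from number to affiliated company, the `auto_disposal` callable standing in for an existing automatic disposal system, and the shape of the returned action log are assumptions for illustration, not interfaces described in the text.

```python
from collections import defaultdict


def dispatch_for_shutdown(numbers, company_of, auto_disposal=None):
    """Return a log of disposal actions for confirmed harassment numbers.

    If an automatic disposal system is available, each number is handed to it;
    otherwise numbers are batched per affiliated company for notification.
    """
    if auto_disposal is not None:
        return [("auto", n, auto_disposal(n)) for n in numbers]
    batches = defaultdict(list)
    for n in numbers:
        batches[company_of[n]].append(n)
    return [("notify", company, sorted(batch))
            for company, batch in sorted(batches.items())]


company_of = {"111": "branch_a", "222": "branch_b", "333": "branch_a"}
notify_log = dispatch_for_shutdown(["111", "222", "333"], company_of)
auto_log = dispatch_for_shutdown(["111"], company_of, auto_disposal=lambda n: True)
print(notify_log)
```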
In this embodiment of the invention, call record ticket data and internet crawler data are acquired; a pre-constructed harassment call early warning model is applied to analyze the call record ticket data and obtain early warning data; a secondary review of the early warning data is performed according to the internet crawler data to obtain real harassment telephone numbers; and the real harassment telephone numbers are issued so that the numbers are shut down. Harassment telephone numbers can thus be handled based on multiple data sources, which better matches practical scenarios and can improve detection accuracy and completeness.
An embodiment of the invention provides a non-volatile computer storage medium storing at least one executable instruction that, when executed, performs the harassment call abnormality detection method of any of the above method embodiments.
The executable instructions may specifically be used to cause a processor to perform the following operations:
acquiring call record ticket data and internet crawler data;
applying a pre-constructed harassment call early warning model to analyze the call record ticket data, to obtain early warning data;
performing a secondary review of the early warning data according to the internet crawler data, to obtain a real harassment telephone number;
and issuing the real harassment telephone number so that the number is shut down.
In an alternative manner, the executable instructions cause the processor to:
collect the call record ticket data from the service domain and the operation domain, and compute behavior feature statistics from the call record ticket data.
In an alternative manner, the harassment call early warning model includes: a high-frequency telephone early warning model, a cat pool early warning model, a high-risk user monitoring model, a silence card-opening monitoring model, and an hour model.
In an alternative manner, the executable instructions cause the processor to:
acquire historical call record ticket data and historical internet crawler data, and compute historical behavior feature statistics from the historical call record ticket data;
construct a decision tree according to the historical behavior features and the historical internet crawler data;
verify the decision tree by a random subsampling method or a bootstrap sampling method selected according to the number of samples, adjust the parameters of the decision tree, and determine the final decision tree;
and establish the harassment call early warning model using the decision tree.
In an alternative manner, the executable instructions cause the processor to:
obtain information gain according to the historical behavior features;
establish branches in descending order of information gain, starting from the feature with the maximum information gain, to construct the decision tree;
perform a pruning operation on the decision tree according to the historical internet crawler data, and stop construction of the decision tree when the information gain is smaller than a preset threshold.
In an alternative manner, the executable instructions cause the processor to:
obtain, from the historical behavior features, the empirical entropy and the conditional entropy corresponding to those features;
and calculate the information gain from the empirical entropy and the conditional entropy.
In an alternative manner, the executable instructions cause the processor to:
automatically issue the harassment telephone number to the affiliated company, which notifies the relevant organization to shut the number down; or
associate the harassment telephone number with an existing automatic disposal system so that the number is shut down automatically.
In this embodiment of the invention, call record ticket data and internet crawler data are acquired; a pre-constructed harassment call early warning model is applied to analyze the call record ticket data and obtain early warning data; a secondary review of the early warning data is performed according to the internet crawler data to obtain real harassment telephone numbers; and the real harassment telephone numbers are issued so that the numbers are shut down. Harassment telephone numbers can thus be handled based on multiple data sources, which better matches practical scenarios and can improve detection accuracy and completeness.
An embodiment of the present invention provides a computer program product comprising a computer program stored on a computer storage medium. The computer program includes program instructions which, when executed by a computer, cause the computer to execute the harassment call abnormality detection method of any of the above method embodiments.
The executable instructions may specifically be used to cause a processor to perform the following operations:
acquiring call record ticket data and internet crawler data;
applying a pre-constructed harassment call early warning model to analyze the call record ticket data, to obtain early warning data;
performing a secondary review of the early warning data according to the internet crawler data, to obtain a real harassment telephone number;
and issuing the real harassment telephone number so that the number is shut down.
In an alternative manner, the executable instructions cause the processor to:
collect the call record ticket data from the service domain and the operation domain, and compute behavior feature statistics from the call record ticket data.
In an alternative manner, the harassment call early warning model includes: a high-frequency telephone early warning model, a cat pool early warning model, a high-risk user monitoring model, a silence card-opening monitoring model, and an hour model.
In an alternative manner, the executable instructions cause the processor to:
acquire historical call record ticket data and historical internet crawler data, and compute historical behavior feature statistics from the historical call record ticket data;
construct a decision tree according to the historical behavior features and the historical internet crawler data;
verify the decision tree by a random subsampling method or a bootstrap sampling method selected according to the number of samples, adjust the parameters of the decision tree, and determine the final decision tree;
and establish the harassment call early warning model using the decision tree.
In an alternative manner, the executable instructions cause the processor to:
obtain information gain according to the historical behavior features;
establish branches in descending order of information gain, starting from the feature with the maximum information gain, to construct the decision tree;
perform a pruning operation on the decision tree according to the historical internet crawler data, and stop construction of the decision tree when the information gain is smaller than a preset threshold.
In an alternative manner, the executable instructions cause the processor to:
obtain, from the historical behavior features, the empirical entropy and the conditional entropy corresponding to those features;
and calculate the information gain from the empirical entropy and the conditional entropy.
In an alternative manner, the executable instructions cause the processor to:
automatically issue the harassment telephone number to the affiliated company, which notifies the relevant organization to shut the number down; or
associate the harassment telephone number with an existing automatic disposal system so that the number is shut down automatically.
In this embodiment of the invention, call record ticket data and internet crawler data are acquired; a pre-constructed harassment call early warning model is applied to analyze the call record ticket data and obtain early warning data; a secondary review of the early warning data is performed according to the internet crawler data to obtain real harassment telephone numbers; and the real harassment telephone numbers are issued so that the numbers are shut down. Harassment telephone numbers can thus be handled based on multiple data sources, which better matches practical scenarios and can improve detection accuracy and completeness.
Fig. 7 is a schematic structural diagram of a computing device according to an embodiment of the present invention. The embodiments of the present invention do not limit the specific implementation of the device.
As shown in Fig. 7, the computing device may include: a processor 702, a communication interface 704, a memory 706, and a communication bus 708.
Wherein: the processor 702, the communication interface 704, and the memory 706 communicate with each other via the communication bus 708. The communication interface 704 is used to communicate with network elements of other devices, such as clients or other servers. The processor 702 is configured to execute the program 710, and may specifically execute the relevant steps in the foregoing embodiments of the harassment call abnormality detection method.
In particular, the program 710 may include program code, and the program code includes computer operation instructions.
The processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 706 is used to store the program 710. The memory 706 may comprise high-speed RAM and may further comprise non-volatile memory, such as at least one disk memory.
The program 710 may specifically be used to cause the processor 702 to perform the following operations:
acquiring call record ticket data and internet crawler data;
applying a pre-constructed harassment call early warning model to analyze the call record ticket data, to obtain early warning data;
performing a secondary review of the early warning data according to the internet crawler data, to obtain a real harassment telephone number;
and issuing the real harassment telephone number so that the number is shut down.
In an alternative manner, the program 710 causes the processor 702 to:
collect the call record ticket data from the service domain and the operation domain, and compute behavior feature statistics from the call record ticket data.
In an alternative manner, the harassment call early warning model includes: a high-frequency telephone early warning model, a cat pool early warning model, a high-risk user monitoring model, a silence card-opening monitoring model, and an hour model.
In an alternative manner, the program 710 causes the processor 702 to:
acquire historical call record ticket data and historical internet crawler data, and compute historical behavior feature statistics from the historical call record ticket data;
construct a decision tree according to the historical behavior features and the historical internet crawler data;
verify the decision tree by a random subsampling method or a bootstrap sampling method selected according to the number of samples, adjust the parameters of the decision tree, and determine the final decision tree;
and establish the harassment call early warning model using the decision tree.
In an alternative manner, the program 710 causes the processor 702 to:
obtain information gain according to the historical behavior features;
establish branches in descending order of information gain, starting from the feature with the maximum information gain, to construct the decision tree;
perform a pruning operation on the decision tree according to the historical internet crawler data, and stop construction of the decision tree when the information gain is smaller than a preset threshold.
In an alternative manner, the program 710 causes the processor 702 to:
obtain, from the historical behavior features, the empirical entropy and the conditional entropy corresponding to those features;
and calculate the information gain from the empirical entropy and the conditional entropy.
In an alternative manner, the program 710 causes the processor 702 to:
automatically issue the harassment telephone number to the affiliated company, which notifies the relevant organization to shut the number down; or
associate the harassment telephone number with an existing automatic disposal system so that the number is shut down automatically.
In this embodiment of the invention, call record ticket data and internet crawler data are acquired; a pre-constructed harassment call early warning model is applied to analyze the call record ticket data and obtain early warning data; a secondary review of the early warning data is performed according to the internet crawler data to obtain real harassment telephone numbers; and the real harassment telephone numbers are issued so that the numbers are shut down. Harassment telephone numbers can thus be handled based on multiple data sources, which better matches practical scenarios and can improve detection accuracy and completeness.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specifically stated.
Claims (8)
1. A harassment call abnormality detection method, characterized by comprising the following steps:
acquiring call record ticket data and internet crawler data;
acquiring historical call record ticket data and historical internet crawler data, and computing historical behavior feature statistics from the historical call record ticket data;
constructing a decision tree according to the historical behavior features and the historical internet crawler data, wherein constructing the decision tree comprises: obtaining information gain according to the historical behavior features; establishing branches in descending order of information gain, starting from the feature with the maximum information gain, to construct the decision tree;
performing a pruning operation on the decision tree according to the historical internet crawler data, and stopping construction of the decision tree when the information gain is smaller than a preset threshold; verifying the decision tree by a random subsampling method or a bootstrap sampling method selected according to the number of samples, adjusting the parameters of the decision tree, and determining the final decision tree; establishing a harassment call early warning model using the decision tree;
applying the constructed harassment call early warning model to analyze the call record ticket data, to obtain early warning data;
performing a secondary review of the early warning data according to the internet crawler data, to obtain a real harassment telephone number;
and issuing the real harassment telephone number so that the number is shut down.
2. The method of claim 1, wherein acquiring the call record ticket data comprises:
collecting the call record ticket data from the service domain and the operation domain, and computing behavior feature statistics from the call record ticket data.
3. The method of claim 1, wherein the harassment call early warning model comprises: a high-frequency telephone early warning model, a cat pool early warning model, a high-risk user monitoring model, a silence card-opening monitoring model, and an hour model.
4. The method of claim 1, wherein obtaining information gain according to the historical behavior features comprises:
obtaining, from the historical behavior features, the empirical entropy and the conditional entropy corresponding to those features;
and calculating the information gain from the empirical entropy and the conditional entropy.
5. The method of claim 1, wherein issuing the real harassment telephone number so that the number is shut down comprises:
automatically issuing the harassment telephone number to the affiliated company, which notifies the relevant organization to shut the number down; or
associating the harassment telephone number with an existing automatic disposal system so that the number is shut down automatically.
6. A harassment call abnormality detection device, the device comprising:
a data acquisition unit, configured to acquire call record ticket data and internet crawler data;
a model construction unit, configured to acquire historical call record ticket data and historical internet crawler data, and compute historical behavior feature statistics from the historical call record ticket data; construct a decision tree according to the historical behavior features and the historical internet crawler data, wherein constructing the decision tree comprises: obtaining information gain according to the historical behavior features; establishing branches in descending order of information gain, starting from the feature with the maximum information gain, to construct the decision tree; perform a pruning operation on the decision tree according to the historical internet crawler data, and stop construction of the decision tree when the information gain is smaller than a preset threshold; verify the decision tree by a random subsampling method or a bootstrap sampling method selected according to the number of samples, adjust the parameters of the decision tree, and determine the final decision tree; and establish a harassment call early warning model using the decision tree;
a model analysis unit, configured to apply the constructed harassment call early warning model to analyze the call record ticket data, to obtain early warning data;
a secondary review unit, configured to perform a secondary review of the early warning data according to the internet crawler data, to obtain a real harassment telephone number;
and a number shutdown unit, configured to issue the real harassment telephone number so that the number is shut down.
7. A computing device, comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus;
The memory is configured to hold at least one executable instruction that causes the processor to perform the steps of a method of detecting a nuisance call anomaly as claimed in any one of claims 1 to 5.
8. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform the steps of a method of detecting a nuisance call anomaly as claimed in any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010961602.9A CN114189585B (en) | 2020-09-14 | 2020-09-14 | Harassment call abnormality detection method and device and computing equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010961602.9A CN114189585B (en) | 2020-09-14 | 2020-09-14 | Harassment call abnormality detection method and device and computing equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114189585A (en) | 2022-03-15 |
CN114189585B (en) | 2024-08-27 |
Family
ID=80539037
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010961602.9A Active CN114189585B (en) | 2020-09-14 | 2020-09-14 | Harassment call abnormality detection method and device and computing equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114189585B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115022464A (en) * | 2022-05-06 | 2022-09-06 | 中国联合网络通信集团有限公司 | Number processing method, system, computing device and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108462785A (en) * | 2017-02-21 | 2018-08-28 | 中国移动通信集团浙江有限公司 | A kind of processing method and processing device of malicious call phone |
CN110401779A (en) * | 2018-04-24 | 2019-11-01 | 中国移动通信集团有限公司 | A kind of method, apparatus and computer readable storage medium identifying telephone number |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9692885B2 (en) * | 2015-11-17 | 2017-06-27 | Microsoft Technology Licensing, Llc | Determining scam risk during a voice call |
CA3195323A1 (en) * | 2016-11-01 | 2018-05-01 | Transaction Network Services, Inc. | Systems and methods for automatically conducting risk assessments for telephony communications |
CN110147430A (en) * | 2019-04-25 | 2019-08-20 | 上海欣方智能系统有限公司 | Harassing call recognition methods and system based on random forests algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||