
CN107391551B - Web service data analysis method and system based on data mining - Google Patents

Info

Publication number
CN107391551B
Authority
CN
China
Prior art keywords
data
service
service data
unit
sorted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710417835.0A
Other languages
Chinese (zh)
Other versions
CN107391551A (en)
Inventor
王晓佳
孔祥明
贾义动
朱容虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Guangye Kaiyuan Technology Co ltd
Original Assignee
Guangdong Guangye Kaiyuan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Guangye Kaiyuan Technology Co ltd
Priority to CN201710417835.0A
Publication of CN107391551A
Application granted
Publication of CN107391551B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a web service data analysis method and system based on data mining. The method comprises: acquiring and sorting service data from a front-end decoding engine to obtain sorted service data, and establishing a service model; classifying the sorted service data; performing dial testing on preset key services according to the classified service data; and forming an analysis result from the service data and storing it in a data store. The system comprises a data sorting unit, a data classification unit, a dial testing unit and a data warehousing unit. By analyzing and modeling the service data and periodically comparing it against vendor data, the invention identifies the models of web services and supports enterprise management of those services, thereby enabling service-based operation analysis and system operation and maintenance. The invention can be widely applied to business data analysis.

Description

Web service data analysis method and system based on data mining
Technical Field
The invention relates to the technical field of data analysis, in particular to a web service data analysis method and system based on data mining.
Background
As enterprise informatization advances, more and more enterprises use electronic, online channels to offer customers personalized services anytime and anywhere. Online service handling lets an enterprise understand customer preferences more comprehensively and promptly push the latest services that fit its plans. These new business channels also bring a new challenge: how to keep the IT system running healthily, stably and efficiently. Operation and maintenance personnel and enterprise managers therefore need to share the same service view, and to discover hidden dangers and faults in the system from the perspective of customers handling services.
From a technical standpoint, however, achieving such a service view is not easy. The information traditionally available, such as network information, system logs and hardware resource loads, reflects only the system itself. How to restore this system-level information to a service view, and then monitor the whole IT architecture from that view, is a problem every enterprise urgently needs to solve.
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a web service data analysis method and system based on data mining, which have good real-time performance and can improve the accuracy.
The technical scheme adopted by the invention is as follows:
a web service data analysis method based on data mining comprises the following steps:
acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data, and establishing a service model;
classifying the sorted service data according to the sorted service data;
according to the classified service data, carrying out dial testing processing on preset key services;
and forming an analysis result according to the service data and storing the analysis result in a storage.
As a further improvement of the web service data analysis method based on data mining, the step of obtaining and sorting the service data of the front-end decoding engine to obtain the sorted service data, and establishing a service model specifically includes:
acquiring service data of a front-end decoding engine;
the obtained service data is sorted, and corresponding command words are marked;
and carrying out merging and de-duplication processing on the obtained command words, and sorting the command words according to a preset command word format to obtain the required command words.
As a further improvement of the web service data analysis method based on data mining, the step of classifying the sorted service data according to the sorted service data specifically includes:
dividing the service data into the large classes according to the service data and the command words;
and performing the fine classification and the business step classification on the business data in each large class according to the preset fine classification business characteristics.
As a further improvement of the data mining-based web service data analysis method, the step of performing dial testing processing on preset key services according to the classified service data specifically includes:
carrying out dial testing processing on the key services, and recording mark URIs of different steps of each type of services;
comparing the command word in the service data with the mark URI in the dial testing process, and marking the command word sequence number of the corresponding step;
and sequencing the command words in the service data according to time, recording the sequence number of the command word contained in each step, and dividing the command words not contained in the steps according to the latest principle.
As a further improvement of the web service data analysis method based on data mining, the analysis result comprises a service table, a service step table, a command word table, a return code dimension table and a service class dimension table.
The other technical scheme adopted by the invention is as follows:
a data mining-based web business data analysis system, comprising:
the data sorting unit is used for acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data and establishing a service model;
the data classification unit is used for classifying the sorted service data according to the sorted service data;
the dial testing unit is used for carrying out dial testing processing on the preset key services according to the classified service data;
and the data storage unit is used for forming an analysis result according to the service data and storing the analysis result in a storage.
As a further improvement of the data mining-based web service data analysis system, the data sorting unit specifically includes:
the data acquisition unit is used for acquiring the service data of the front-end decoding engine;
the command word sorting unit is used for sorting the obtained service data and marking corresponding command words;
and the command word processing unit is used for carrying out merging and de-duplication processing on the obtained command words and sorting the command words according to a preset command word format to obtain the required command words.
As a further improvement of the data mining-based web service data analysis system, the data classification unit specifically includes:
the large class dividing unit is used for dividing the service data into the large classes according to the service data and the command words;
and the fine classification unit is used for performing fine classification and service step classification on the service data in each large class according to the preset fine classification service characteristics.
As a further improvement of the data mining-based web service data analysis system, the dial testing unit specifically includes:
the step marking unit is used for carrying out dial testing processing on the key services and recording the marked URI of different steps of each type of service;
the comparison unit is used for comparing the command words in the service data with the marked URI in the dial testing process and marking the command word sequence numbers of the corresponding steps;
the step dividing unit is used for sequencing the command words in the service data according to time, recording the sequence number of the command word contained in each step, and dividing the command words which are not contained into the steps according to the latest principle.
As a further improvement of the data mining-based web service data analysis system, the analysis result comprises a service table, a service step table, a command word table, a return code dimension table and a service class dimension table.
The invention has the beneficial effects that:
The web service data analysis method and system based on data mining identify the models of web services by analyzing and modeling the service data and periodically comparing it against vendor data, and support enterprise management of the services, thereby enabling service-based operation analysis and system operation and maintenance. The invention has good real-time performance: because the system is based on real production data, it automatically recognizes changes to the business model caused by adding, deleting or modifying services, so the resulting model matches the enterprise's actual business situation at that time. After the system has learned the services automatically, official service records can be obtained periodically from the vendor log for comparison, which provides periodic correction. The comparison frequency can be set according to how often service versions are released, which safeguards the accuracy of the learned business model. In addition, the deployment uses the full data from the service system switch, so any service that actually occurs is captured and its model identified, giving a higher recall.
Drawings
The following further describes embodiments of the present invention with reference to the accompanying drawings:
FIG. 1 is a flow chart illustrating the steps of a data mining-based web service data analysis method according to the present invention;
FIG. 2 is a block diagram of a data mining-based web service data analysis system according to the present invention.
Detailed Description
Referring to fig. 1, the present invention relates to a data mining-based web service data analysis method, which includes the following steps:
acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data, and establishing a service model;
classifying the sorted service data according to the sorted service data;
according to the classified service data, carrying out dial testing processing on preset key services;
and forming an analysis result according to the service data and storing the analysis result in a storage.
Further, as a preferred embodiment, the acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data, and establishing the service model specifically includes:
acquiring service data of a front-end decoding engine;
the obtained service data is sorted, and corresponding command words are marked;
and carrying out merging and de-duplication processing on the obtained command words, and sorting the command words according to a preset command word format to obtain the required command words.
In this embodiment, when the service data is sorted, each pcap packet is decoded into a plurality of pack packets, the pack packets are parsed by Python into a plurality of messages, and the messages that carry service meaning are then paired as request and response messages by their identifiers to obtain command words.
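The pairing of request and response messages by identifier can be illustrated with a minimal Python sketch. The `Message` and `CommandWord` structures, their field names and the `pair_messages` function are hypothetical and are not taken from the patent; the sketch also assumes that the pcap decoding has already produced parsed messages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    msg_id: str                   # hypothetical identifier shared by a request and its response
    kind: str                     # "request" or "response"
    uri: str = ""
    status: Optional[int] = None
    ts: float = 0.0               # capture time in seconds


@dataclass
class CommandWord:
    msg_id: str
    uri: str
    status: Optional[int]
    req_ts: float
    resp_ts: Optional[float]


def pair_messages(messages: List[Message]) -> List[CommandWord]:
    """Pair each request with the response carrying the same identifier."""
    pending = {}
    command_words = []
    for m in sorted(messages, key=lambda m: m.ts):
        if m.kind == "request":
            pending[m.msg_id] = m
        elif m.kind == "response" and m.msg_id in pending:
            req = pending.pop(m.msg_id)
            command_words.append(
                CommandWord(req.msg_id, req.uri, m.status, req.ts, m.ts))
    # Requests that never received a response are kept without a status.
    for req in pending.values():
        command_words.append(
            CommandWord(req.msg_id, req.uri, None, req.ts, None))
    return command_words
```

Keeping unanswered requests is a design choice of this sketch, so that lost responses remain visible to later analysis.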
In this step, the service data decoded at the front end is structured and simplified according to usage requirements. On the one hand, this makes the logic clearer during data mining and learning; on the other hand, simplified data effectively improves processing efficiency, which is particularly important for mining massive data, since the daily data volume is at the TB level. In addition, the structured data must be stored in a specific format. For example, the user's mobile phone number may be referred to by phone, user, id and so on; such fields must be merged, de-duplicated and identified, and, according to the preset command word format, only the information that uniquely identifies the user is kept.
The preset command word format in this embodiment is:
phone (Mobile Phone number, user number)
The user's mobile phone number may be stored in various variables, such as Phone, Phoneid, Userid, User, cookies, currentUser, customer, ClientId and the like.
Cleaning must therefore handle several cases:
Variable detection is performed on every incoming command word.
During detection, the values of the variables are examined, mainly as follows:
(1) If the value starts with 1, check whether it has 11 digits; if it does not start with 1, check whether it has 13 to 14 digits.
(2) If the value starts with 1, check whether it begins with a legal operator prefix such as 13, 15 or 17.
(3) If several mobile phone numbers are found, the later one prevails.
When a legal mobile phone number is found, it is recorded in the variable "phone", which identifies the mobile phone number for the command word's data stream.
It is also normal for the mobile phone number to change during detection. The number may switch while services are being handled: for example, several services are handled in sequence for different mobile phone numbers; or the previous user did not log out, and the next user only notices in the last steps that the mobile phone number is wrong and has to switch it.
In the switching case, the last mobile phone number to appear is taken as authoritative, because it is the number the service system records when the service is actually submitted.
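The detection rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: `normalize_phone` and `resolve_phone` are hypothetical names, the prefix list contains only the three examples given in the text, and treating a 13- or 14-digit value as a prefixed number whose last 11 digits are the phone number is an assumption.

```python
import re

# Variable names taken from the text, compared case-insensitively.
PHONE_KEYS = {"phone", "phoneid", "userid", "user", "cookies",
              "currentuser", "customer", "clientid"}
# Only the examples from the text; a real list would be longer (assumption).
LEGAL_PREFIXES = ("13", "15", "17")


def normalize_phone(value):
    """Return an 11-digit phone number, or None if the value is not legal."""
    digits = re.sub(r"\D", "", str(value))
    if digits.startswith("1"):
        if len(digits) != 11:
            return None
    elif len(digits) in (13, 14):
        digits = digits[-11:]      # assumption: drop a leading prefix
    else:
        return None
    return digits if digits.startswith(LEGAL_PREFIXES) else None


def resolve_phone(command_words):
    """Scan command words in time order; the last legal number wins."""
    phone = None
    for variables in command_words:
        for key, value in variables.items():
            if key.lower() in PHONE_KEYS:
                candidate = normalize_phone(value)
                if candidate:
                    phone = candidate
    return phone
```

Each command word is represented here as a plain dictionary of variable names and values, which is also an assumption of the sketch.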
Menuid (Menu number of service location)
This field records the system menu used when the user handles a service; different large classes of service are generally grouped under different top-level menus.
For example, data services such as GPRS packages and traffic add-on packages are grouped under one menu, while services such as calls, incoming call alerts, and incoming call forwarding and waiting are grouped under other menus.
Menuid requires character processing, as follows:
(1) Check whether the field starts with YW, CS or Menu followed by an underscore, that is, whether it matches CS_, YW_ and so on, and extract the suffix. Different development groups of the same system use different prefixes, and only the suffix identifies the service class.
(2) Search the data stream of command words for different Menuid values; if any exist, determine whether services are interleaved.
(3) If services are interleaved, split the data stream and restore it into two separate service types.
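The Menuid processing steps can be illustrated with a short Python sketch. The function names and the dictionary representation of a command word are hypothetical, and grouping by suffix is only one plausible way to split an interleaved data stream.

```python
import re
from collections import defaultdict

# Prefixes named in the text, followed by an underscore.
PREFIX_RE = re.compile(r"^(?:YW|CS|Menu)_(.+)$", re.IGNORECASE)


def menu_suffix(menuid):
    """Strip a development-group prefix and keep the identifying suffix."""
    match = PREFIX_RE.match(menuid)
    return match.group(1) if match else menuid


def split_by_menu(stream):
    """Split a possibly interleaved stream into one list per menu suffix."""
    groups = defaultdict(list)
    for command_word in stream:
        groups[menu_suffix(command_word["menuid"])].append(command_word)
    return dict(groups)
```

If `split_by_menu` returns more than one group, the stream contained interleaved services and each group can be processed as its own service type.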
Service status (status, retcode, retId)
Several service statuses usually exist at the same time, because when computers communicate a result is reported at each protocol layer.
For example, the network layer may report TCP connection failures, zero windows, delayed responses and waiting states.
The HTTP layer may return status codes such as 200, 300, 400 and 500, some of which indicate system or network anomalies.
The private application layer may report vendor-defined errors such as database errors and user information errors.
The interface layer may present prompts to the user, such as "system busy" or "submission failed".
In operation, statuses are extracted by priority, and a higher-priority status overrides a lower-priority one: interface layer > application layer > HTTP layer > network layer.
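The priority override can be written as a simple lookup in priority order. In this sketch the layer keys and the `effective_status` function are hypothetical names; only the ordering of the layers comes from the text.

```python
# Highest priority first, following the order given in the text.
LAYER_PRIORITY = ["interface", "application", "http", "network"]


def effective_status(statuses):
    """Return (layer, status) for the highest-priority layer that reported one."""
    for layer in LAYER_PRIORITY:
        value = statuses.get(layer)
        if value is not None:
            return layer, value
    return None
```

For example, a command word with an HTTP 200 and an interface prompt of "system busy" would be recorded with the interface-layer status.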
Product number (ProductId), order number (OrderId)
The product number must be obtained from deep within the private application layer.
For example, the HTTP layer usually carries nested payloads such as XML, JSON and SOCKET data.
For a batch service, one command word contains several product numbers and order numbers, and each product cannot be recorded independently. The product numbers are therefore combined, separated by the "_" symbol; order numbers are generally unique and need no further modification.
In this embodiment, after the required command words are obtained, the type of each command word's return code is determined. The return code type (for example continue, success, bad request, temporary redirection) is determined according to the HTTP standard, and the codeType of return codes in the 2xx range is further analyzed, which includes performing text analysis on res_Header and marking the type it belongs to (for example system identification or service identification).
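Determining the return code type from the HTTP standard amounts to looking at the first digit of the status code. The sketch below uses the standard HTTP status classes; the function name and the returned labels are illustrative, and the further text analysis of res_Header for 2xx codes is not shown.

```python
# Standard HTTP status classes, keyed by the first digit of the code.
HTTP_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def return_code_type(code):
    """Map an HTTP status code to its class, or "unknown" if out of range."""
    return HTTP_CLASSES.get(code // 100, "unknown")


def needs_header_analysis(code):
    """Per the text, only 2xx return codes are analyzed further."""
    return return_code_type(code) == "success"
```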
Further as a preferred embodiment, the classifying the sorted service data according to the sorted service data specifically includes:
dividing the service data into the large classes according to the service data and the command words;
and performing the fine classification and the business step classification on the business data in each large class according to the preset fine classification business characteristics.
In this embodiment, the large class to which the service data belongs is first split out according to fields such as menuid. Systems are generally designed with services divided into classes, so that users can easily find the services they need and enterprises can easily add, delete and modify related services. For example, account balance inquiry, delivery address inquiry and frequent contact inquiry are closely tied to the user's personal information, so on a web interface they are usually placed under categories such as "user management" or "personal information management", and they carry the same large-class id, such as menuid, when the system function (command word) is sent to the back end. This information can therefore be used to preliminarily classify the incoming service data into different queues, so that different types of data are dispatched to different queues in later processing, which reduces useless data comparison and improves processing efficiency.
Then, according to preset fine-class service characteristics such as the time intervals between data, page conditions and page jump information, the steps to which the data belong are distinguished, and the fine class of the service is thereby identified. Because large classes alone cannot meet an enterprise's needs for service management, operation analysis and service operation, each large class is usually split further. Fine classes within a large class, however, share the same menuid, so distinguishing them requires additional dimensions such as service data intervals, page jumps and key command words. Taken together, these dimensions form a fingerprint unique to each fine class of service; fingerprints of different services are similar in some respects and distinct in others. Once the fingerprints are established, production data can be identified: when new data enters the business model, it is compared against the fingerprints of all services in the learning library, and the service with the highest weighted proportion is the one to which the new data belongs.
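The weighted fingerprint comparison can be illustrated with a small Python sketch. The patent does not give the weights or the scoring formula, so the dimension names, the integer weights and the exact-match scoring below are all assumptions chosen only to show the idea of picking the service with the highest weighted score.

```python
# Hypothetical weights per fingerprint dimension (not from the patent).
WEIGHTS = {"interval": 2, "jump_page": 3, "key_command_word": 5}


def score(observed, fingerprint):
    """Sum the weights of the dimensions on which the two fingerprints agree."""
    return sum(weight for dim, weight in WEIGHTS.items()
               if observed.get(dim) == fingerprint.get(dim))


def best_match(observed, library):
    """Return the name of the learned service with the highest weighted score."""
    return max(library, key=lambda name: score(observed, library[name]))
```

A real fingerprint would likely use graded similarity per dimension rather than exact matches; exact matching keeps the sketch short.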
The algorithm uses ED classification of command word sequences to learn the characteristics of the service steps. Even when command words are lost, flows are repeated, or fewer service steps are processed, the observed sequence can still be matched to the complete and correct command word sequence, so the service is correctly identified.
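The text does not expand "ED". Reading it as edit distance is an assumption, but it fits the stated tolerance for lost and repeated command words. Under that assumption, a minimal Python sketch classifies an observed URI sequence by the nearest learned sequence; the function names and sample URIs are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (lists of URIs or strings)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def classify(sequence, learned):
    """Return the learned service whose command word sequence is closest."""
    return min(learned, key=lambda name: edit_distance(sequence, learned[name]))
```

Because insertions and deletions each cost one edit, a sequence with one lost and one repeated command word stays close to its true service.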
In this embodiment, when the large class is divided, command words with the same menuid, staff and userphone are placed in one bucket by a bucketing method, and each bucket is checked once per minute for overflow. When the fine class is divided, the command words are sorted by request time, the command words in a bucket are divided into several services according to USERNUM, each service is divided into operation intervals using 3 seconds as the time interval, and command words that do not fit are divided using the similarity of their URI sequences as the criterion. The service type of a service is determined from its menuid and the weights of the transaction data of all its command words, and each service is marked with a busiID as its unique identifier.
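The bucketing and interval splitting can be sketched in Python as follows. The dictionary keys and function names are hypothetical, and interpreting the 3-second rule as "a gap of more than 3 seconds between consecutive requests starts a new operation interval" is an assumption; the per-minute overflow check and the USERNUM split are not shown.

```python
from collections import defaultdict


def bucketize(command_words):
    """Group command words that share menuid, staff and userphone."""
    buckets = defaultdict(list)
    for cw in command_words:
        buckets[(cw["menuid"], cw["staff"], cw["userphone"])].append(cw)
    return buckets


def split_intervals(bucket, gap=3.0):
    """Sort by request time and start a new interval after a gap > `gap` seconds."""
    intervals = []
    for cw in sorted(bucket, key=lambda c: c["req_time"]):
        if intervals and cw["req_time"] - intervals[-1][-1]["req_time"] <= gap:
            intervals[-1].append(cw)
        else:
            intervals.append([cw])
    return intervals
```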
When the state of a service is determined, a service without a menuid is defined as not submitted. Otherwise, if the service type is present in the dial testing data obtained by dial testing, the number of command words in the service is compared with the average number of steps of that service type; if the service type is not present, the service is defined as submitted when it contains more than 3 command words.
Further, as a preferred embodiment, the step of performing dial testing processing on the preset key service according to the classified service data specifically includes:
carrying out dial testing processing on the key services, and recording mark URIs of different steps of each type of services;
comparing the command word in the service data with the mark URI in the dial testing process, and marking the command word sequence number of the corresponding step;
and sequencing the command words in the service data according to time, recording the sequence number of the command word contained in each step, and dividing the command words not contained in the steps according to the latest principle.
In this embodiment, for the key services of particular concern, besides knowing the classification of a service, the enterprise may also care about whether users handle it smoothly, whether service loss occurs, what the loss ratio and reasons are, at which step the system fails, how large the impact of the fault is, and so on. Answering these questions requires splitting the service data accurately so that the operation steps of each service are known precisely, and even situations not anticipated when the system was designed, such as repeated operations, rollback operations and parallel handling of several services, must be identified. The fingerprint of the service is therefore identified precisely by dial-testing comparison, which improves the accuracy of machine learning. Dial testing simulates a real user handling a service and captures the data at each system link during handling, so that the behavior of the whole IT system during service handling is accurately recorded. The specific process is as follows: the key services are dial tested with recording, and the mark URIs of the different steps of each service are recorded; using the URIs of the service command words, the command words in the service data are compared with the mark URIs from the dial test, and the step sequence number is marked on each matching command word to obtain the command word step marks. The step marks are then checked: if no step mark can be found in the dial testing data, or the step sequence numbers do not start from 1, or the maximum step sequence number is larger than the total number of steps, division stops; otherwise step division is performed.
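The marking and the three stop conditions can be sketched in Python. The function names, the URI-to-step mapping and the sample URIs are hypothetical, and reading "do not start from 1" as "the smallest step number found is not 1" is an assumption.

```python
def mark_steps(uris, marker_uris):
    """Mark each command word URI with its dial-test step number, or None."""
    return [marker_uris.get(uri) for uri in uris]


def can_divide(marks, total_steps):
    """Apply the stop conditions described above before step division."""
    found = [m for m in marks if m is not None]
    if not found:                  # no step mark found in the dial testing data
        return False
    if min(found) != 1:            # step sequence numbers do not start from 1
        return False
    if max(found) > total_steps:   # step number exceeds the total number of steps
        return False
    return True
```

Command words whose mark is `None` are the ones that still have to be assigned to a step in the subsequent division.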
The command words in the service data are sorted according to time, the sequence number of the command word included in each step is recorded, the command words not included are divided into the steps according to the latest principle, and the division rule in the embodiment adopts the reverse principle.
Further as a preferred embodiment, the analysis result includes a service table, a service step table, a command word table, a return code dimension table, and a service class dimension table.
In this embodiment, the classified and identified service data must be stored in a specific data format to facilitate later use and retrieval. The return code dimension table and the service class dimension table are mainly used for indexing. For example, today the user channels may be business halls, the web and self-service terminals, and before long a WeChat channel may be added. Changing the result table at that point would be costly; the better approach is an extensible dimension table, in which channel expansion requires only adding one row of code records.
The service table records each service in the form of a service log and provides a primary key linked to the service step table, which enables indexing from service classes and fine classes down to specific service steps.
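The benefit of an extensible dimension table can be shown with Python's built-in `sqlite3` module. The table and column names below are hypothetical and the schema is far simpler than the five tables named in the patent; the point is only that adding a channel is a single inserted row rather than a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE channel_dim (
    channel_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE service (
    service_id INTEGER PRIMARY KEY,
    channel_id INTEGER REFERENCES channel_dim(channel_id),
    busi_type  TEXT
);
""")
conn.executemany(
    "INSERT INTO channel_dim VALUES (?, ?)",
    [(1, "business hall"), (2, "web"), (3, "self-service terminal")],
)

# Adding a new channel later is one row in the dimension table.
conn.execute("INSERT INTO channel_dim VALUES (?, ?)", (4, "WeChat"))
conn.execute("INSERT INTO service VALUES (?, ?, ?)", (1, 4, "package_order"))

rows = conn.execute(
    "SELECT s.service_id, c.name FROM service s "
    "JOIN channel_dim c USING (channel_id)"
).fetchall()
```

The fact table `service` stores only the channel code, so existing rows and queries are unaffected when the dimension grows.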
Referring to fig. 2, the present invention relates to a data mining-based web service data analysis system, including:
the data sorting unit is used for acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data and establishing a service model;
the data classification unit is used for classifying the sorted service data according to the sorted service data;
the dial testing unit is used for carrying out dial testing processing on the preset key services according to the classified service data;
and the data storage unit is used for forming an analysis result according to the service data and storing the analysis result in a storage.
Further as a preferred embodiment, the data sorting unit specifically includes:
the data acquisition unit is used for acquiring the service data of the front-end decoding engine;
the command word sorting unit is used for sorting the obtained service data and marking corresponding command words;
and the command word processing unit is used for carrying out merging and de-duplication processing on the obtained command words and sorting the command words according to a preset command word format to obtain the required command words.
Further as a preferred embodiment, the data classification unit specifically includes:
the large class dividing unit is used for dividing the service data into the large classes according to the service data and the command words;
and the fine classification unit is used for performing fine classification and service step classification on the service data in each large class according to the preset fine classification service characteristics.
Further as a preferred embodiment, the dial testing unit specifically includes:
the step marking unit is used for carrying out dial testing processing on the key services and recording the marked URI of different steps of each type of service;
the comparison unit is used for comparing the command words in the service data with the marked URI in the dial testing process and marking the command word sequence numbers of the corresponding steps;
and the step dividing unit is used for sorting the command words in the service data by time, recording the sequence numbers of the command words contained in each step, and assigning any command words not contained in a step to a step according to the latest principle.
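The step division performed by the three sub-units above can be sketched as follows. Two assumptions are made that the text does not confirm: the marked URIs recorded during dial testing are represented as a mapping from command word to step number, and the "latest principle" is read as assigning an unmarked command word to the step of the marked command word nearest to it in time. The function name `divide_into_steps` is hypothetical.

```python
def divide_into_steps(events, marked):
    """Group command word sequence numbers by service step.

    events: list of (timestamp, command_word) taken from the service data.
    marked: {command_word: step_no}, recorded during dial testing (assumed shape).
    """
    ordered = sorted(events)  # sort the command words by time
    # Marked command words act as anchors for their steps.
    anchors = [(t, marked[w]) for t, w in ordered if w in marked]
    if not anchors:
        return {}
    steps = {}
    for seq, (t, word) in enumerate(ordered, start=1):
        if word in marked:
            step = marked[word]
        else:
            # Assumed reading of the "latest principle": nearest anchor in time.
            step = min(anchors, key=lambda a: abs(a[0] - t))[1]
        steps.setdefault(step, []).append(seq)
    return steps


events = [
    (0.0, "GET /login"),
    (1.0, "GET /static/logo"),
    (5.0, "POST /pay/submit"),
    (4.5, "GET /pay/token"),
]
marked = {"GET /login": 1, "POST /pay/submit": 2}
steps = divide_into_steps(events, marked)
```

In this example the logo request falls into step 1 and the token request into step 2, because each is closest in time to that step's marked command word.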
As can be seen from the above, the system of the invention learns the business model from real business data around the clock. It identifies and learns the business model by mining and modeling the data and by regularly comparing the result against the factory log and correcting it. Once online, the system therefore operates and learns by itself on production data and depends very little on manual correction, so no large manpower investment is needed and maintenance costs can be effectively reduced. Because the system is based on real production data, it can automatically identify changes when services are added, deleted or modified, which ensures that the resulting business model matches the actual business situation of the enterprise at that time and gives good real-time performance. After the system has learned the services by itself, official service record items are obtained from the factory log at regular intervals for comparison, which provides periodic correction. The comparison interval can be set individually according to how frequently business versions are released, so the accuracy of the learned business model is well guaranteed. When the system is deployed, the full data from the service system switch is used, so any service that actually occurs can be captured and its service model identified, which effectively improves the recall ratio.
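The periodic comparison against the factory log can be pictured with a small sketch. The text does not describe the comparison logic, so this example simply assumes that both the learned model and the factory log reduce to collections of service names and reports their overlap and differences; the function `compare_with_factory_log` and the result keys are hypothetical.

```python
def compare_with_factory_log(learned, official):
    """Compare self-learned services with official factory log records."""
    learned, official = set(learned), set(official)
    return {
        # Learned services confirmed by the official records.
        "confirmed": sorted(learned & official),
        # Official services that the system has not learned yet.
        "missing": sorted(official - learned),
        # Learned services with no official counterpart, to be reviewed.
        "unverified": sorted(learned - official),
    }


report = compare_with_factory_log(
    learned=["balance query", "recharge", "points exchange"],
    official=["balance query", "recharge", "bill payment"],
)
```

Such a report could be produced on whatever schedule matches the release frequency of business versions.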
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A web service data analysis method based on data mining is characterized by comprising the following steps:
acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data, and establishing a service model;
classifying the sorted service data according to the sorted service data;
according to the classified service data, carrying out dial testing processing on preset key services;
forming an analysis result according to the service data processed by the dial testing and storing the analysis result in a storage;
the step of performing dial testing processing on the preset key service according to the classified service data specifically includes:
carrying out dial testing processing on the key services, and recording mark URIs of different steps of each type of services;
comparing the command word in the service data with the mark URI in the dial testing process, and marking the command word sequence number of the corresponding step;
and sequencing the command words in the service data according to time, recording the sequence number of the command word contained in each step, and dividing the command words not contained in the steps according to the latest principle.
2. The data mining-based web service data analysis method according to claim 1, characterized in that: the step of acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data and establishing a service model specifically comprises:
acquiring service data of a front-end decoding engine;
the obtained service data is sorted, and corresponding command words are marked;
and carrying out merging and de-duplication processing on the obtained command words, and sorting the command words according to a preset command word format to obtain the required command words.
3. The data mining-based web service data analysis method according to claim 2, characterized in that: the classifying the sorted service data according to the sorted service data specifically includes:
dividing the service data into the large classes according to the service data and the command words;
and performing the fine classification and the business step classification on the business data in each large class according to the preset fine classification business characteristics.
4. The data mining-based web service data analysis method according to claim 1, characterized in that: the analysis result comprises a service table, a service step table, a command word table, a return code dimension table and a service class dimension table.
5. A data mining-based web business data analysis system, comprising:
the data sorting unit is used for acquiring and sorting the service data of the front-end decoding engine to obtain the sorted service data and establishing a service model;
the data classification unit is used for classifying the sorted service data according to the sorted service data;
the dial testing unit is used for carrying out dial testing processing on the preset key services according to the classified service data;
the data storage unit is used for forming an analysis result according to the service data processed by the dial testing and storing the analysis result;
the dial testing unit specifically comprises:
the step marking unit is used for carrying out dial testing processing on the key services and recording the marked URI of different steps of each type of service;
the comparison unit is used for comparing the command words in the service data with the marked URI in the dial testing process and marking the command word sequence numbers of the corresponding steps;
the step dividing unit is used for sequencing the command words in the service data according to time, recording the sequence number of the command word contained in each step, and dividing the command words which are not contained into the steps according to the latest principle.
6. The data mining-based web business data analysis system of claim 5, wherein: the data sorting unit specifically comprises:
the data acquisition unit is used for acquiring the service data of the front-end decoding engine;
the command word sorting unit is used for sorting the obtained service data and marking corresponding command words;
and the command word processing unit is used for carrying out merging and de-duplication processing on the obtained command words and sorting the command words according to a preset command word format to obtain the required command words.
7. The data mining-based web business data analysis system of claim 6, wherein: the data classification unit specifically comprises:
the large class dividing unit is used for dividing the service data into the large classes according to the service data and the command words;
and the fine classification unit is used for performing fine classification and service step classification on the service data in each large class according to the preset fine classification service characteristics.
8. The data mining-based web business data analysis system of claim 5, wherein: the analysis result comprises a service table, a service step table, a command word table, a return code dimension table and a service class dimension table.
CN201710417835.0A 2017-06-06 2017-06-06 Web service data analysis method and system based on data mining Active CN107391551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710417835.0A CN107391551B (en) 2017-06-06 2017-06-06 Web service data analysis method and system based on data mining

Publications (2)

Publication Number Publication Date
CN107391551A CN107391551A (en) 2017-11-24
CN107391551B true CN107391551B (en) 2020-04-14

Family

ID=60333128

Country Status (1)

Country Link
CN (1) CN107391551B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant