
CN102937984A - System, client terminal and method for collecting data - Google Patents

Info

Publication number
CN102937984A
CN102937984A CN2012104049183A CN201210404918A CN102937984A CN 102937984 A CN102937984 A CN 102937984A CN 2012104049183 A CN2012104049183 A CN 2012104049183A CN 201210404918 A CN201210404918 A CN 201210404918A CN 102937984 A CN102937984 A CN 102937984A
Authority
CN
China
Prior art keywords
field
data
value
key
merger
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104049183A
Other languages
Chinese (zh)
Other versions
CN102937984B (en)
Inventor
张珂
郝国梁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201610302731.0A (patent CN105930502B)
Priority to CN201210404918.3A (patent CN102937984B)
Publication of CN102937984A
Application granted
Publication of CN102937984B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a system, a client and a method for collecting data, and belongs to the technical field of the Internet. According to the invention, clients are deployed on different production servers; each client stores the data it obtains separately according to the business category the data belongs to; and when the time period corresponding to a business category ends, the client merges those stored records of that category whose key fields have identical values into a single record and then sends it to the server end. With this technical scheme, data containing any fields can be transmitted, so data transmission is not restricted, and because merging is carried out at the client, the network congestion and delay caused by transmitting a large amount of identical or similar data are avoided.

Description

A system, client and method for collecting data
Technical field
The present invention relates to the field of Internet technology, and in particular to a system, client and method for collecting data.
Background art
In the Internet era, back-end data such as log data and statistics is both extremely important and very large in volume. This data may be the first-hand material that back-end engineers use to analyze how programs are running, and may also be the primary reference on which business decisions rely. However, a high-traffic website generally has a very large number of production servers distributed across different machine rooms. Log files and statistics are left on these scattered production servers in heterogeneous networks, which makes collecting, transmitting, aggregating and analyzing the logs very difficult. Some open-source software is currently available for collecting these logs, but many situations remain that it cannot handle.
Some commonly used open-source software, for example Scribe, can achieve the purpose of simply collecting log data.
Scribe is the open-source log collection system of a large social networking site, and is widely used inside that site. It collects logs from various log sources and stores them in a central storage system (which can be NFS, the distributed file system HDFS, etc.) for centralized statistical analysis. It provides a scalable, highly fault-tolerant solution for "distributed collection, unified processing" of logs. When the network or a machine of the central storage system fails, scribe dumps the logs to local storage or to another location; after the central storage system recovers, scribe retransmits the dumped logs to it. It is usually used together with Hadoop: scribe pushes logs to HDFS, and Hadoop processes them periodically with MapReduce jobs.
Fig. 1 is a schematic diagram of log collection with the existing Scribe. As shown in Fig. 1, Scribe collects data from the various applications that act as data sources, places it in a shared queue, and then pushes it to the central storage system at the back end. When the central storage system fails, scribe temporarily writes the logs to a local file; after the central storage system recovers, scribe uploads the local logs to it.
Each data source must send data to scribe through Thrift (because Thrift is used, clients can be written in various languages; each data record comprises a category and a message). The number of Thrift threads used to listen on the port can be set in the scribe configuration (the default is 3). At the back end, scribe can store data of different categories in different directories so that they can be processed separately. The back-end log storage can use various stores, including file, buffer (two-layer storage with a primary store and a secondary store), network (another scribe server), etc.
However, scribe has the following shortcomings:
(1) One shortcoming of scribe is that organizing data at the front end is inflexible: only two fields, category and message, can be used. In an application on a production server, each record sent through scribe can have only the category and message fields. To transmit several fields, the sender must organize the data itself and pack the fields into the message; when the data is analyzed later, the message must be parsed again to recover the original fields. This places many restrictions and inconveniences on data transmission.
(2) Another shortcoming is that scribe receives every record, faithfully stores it in a local cache, and sends the data in batches at a certain frequency, even when the category and message of two records are identical. When the data volume is large and the transmission frequency is high, this easily causes serious network congestion and delay.
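Shortcoming (1) can be illustrated with a minimal sketch. It does not use the real Scribe or Thrift API; the tab delimiter and the helper names `pack_message` and `unpack_message` are assumptions made only for illustration.

```python
# Hypothetical illustration: an entry carries only (category, message), so any
# additional fields must be packed into the message string by the sender and
# unpacked again by whoever analyzes the data later.
FIELD_SEP = "\t"  # assumed delimiter; not prescribed by scribe or by this document


def pack_message(fields):
    """Join several field values into the single message string."""
    return FIELD_SEP.join(str(value) for value in fields)


def unpack_message(message):
    """Recover the field values from a message; all values come back as strings."""
    return message.split(FIELD_SEP)


entry = ("Login", pack_message(["ZK", "Index.php", 1]))
fields = unpack_message(entry[1])
```

Note that the type information is lost in the round trip (the count comes back as a string), which is part of the inconvenience the text describes.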
Summary of the invention
In view of the above problems, the present invention is proposed to provide a system and a client for collecting data, and a corresponding method for collecting data, which overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a system for collecting data is provided, wherein the system comprises a server end and a plurality of clients deployed on different production servers;
Each client is adapted to obtain the data that its production server produces for different categories of business, and to store the obtained data separately according to the corresponding business category;
Wherein each data record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key; each business category has a corresponding timing cycle;
Each client is further adapted, when the timing cycle corresponding to a business category ends, to merge those stored records of that category whose key-field values are identical into a single record and then send it to the server end;
The server end is adapted to receive data from each client, and to store or forward it.
Optionally, the client is further adapted, when merging the stored records of a business category whose key-field values are identical into a single record at the end of the corresponding timing cycle, to apply different merge processing to the fields not marked as keys according to their types.
Optionally, the client is further adapted, when applying different merge processing to the fields not marked as keys according to their types, to use one or a combination of the following:
For a sum-type field, the values of this field in the records whose key-field values are identical are added together, and the sum is taken as the value of this field after merging;
For an average-type field, the values of this field in the records whose key-field values are identical are averaged, and the mean is taken as the value of this field after merging;
For a maximum-type field, the largest of the values of this field in the records whose key-field values are identical is taken as the value of this field after merging;
For a constant-string-type field, the value of this field in the first of the records whose key-field values are identical is taken as the value of this field after merging;
For a cumulative-string-type field, the strings in this field in the records whose key-field values are identical are concatenated in a specified order, and the result is taken as the value of this field after merging.
Optionally, the server end is adapted to forward the received data to another server, to forward it to a database device, or to save it as a local file.
According to a further aspect of the invention, a client for collecting data is provided, wherein the client comprises a data acquisition unit, a merge processing unit and a plurality of storage units; the storage units correspond to different categories of business, and each storage unit has a corresponding timing cycle;
The data acquisition unit is adapted to obtain data of the different business categories from the production server, and to distribute the obtained data to the corresponding storage units according to business category; wherein each data record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key;
Each storage unit is adapted to store the data from the data acquisition unit;
The merge processing unit is adapted, when the timing cycle corresponding to a storage unit ends, to merge those records stored in that storage unit whose key-field values are identical into a single record and then send it to the server end.
Optionally, the merge processing unit is further adapted, when merging the stored records of a business category whose key-field values are identical into a single record at the end of the corresponding timing cycle, to apply different merge processing to the fields not marked as keys according to their types.
Optionally, the merge processing unit is further adapted, when applying different merge processing to the fields not marked as keys according to their types, to use one or a combination of the following:
For a sum-type field, the values of this field in the records whose key-field values are identical are added together, and the sum is taken as the value of this field after merging;
For an average-type field, the values of this field in the records whose key-field values are identical are averaged, and the mean is taken as the value of this field after merging;
For a maximum-type field, the largest of the values of this field in the records whose key-field values are identical is taken as the value of this field after merging;
For a constant-string-type field, the value of this field in the first of the records whose key-field values are identical is taken as the value of this field after merging;
For a cumulative-string-type field, the strings in this field in the records whose key-field values are identical are concatenated in a specified order, and the result is taken as the value of this field after merging.
According to another aspect of the invention, a method for collecting data is provided, wherein the method comprises:
A client deployed on a production server obtains the data that this production server produces for different categories of business; wherein each data record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key;
The client stores the obtained data separately according to the corresponding business category; wherein each business category has a corresponding timing cycle;
For each business category, when the corresponding timing cycle ends, the client merges those stored records of that category whose key-field values are identical into a single record and then sends it to the server end.
Optionally, merging the records whose key-field values are identical into a single record comprises:
For the fields not marked as keys, applying different merge processing according to their types.
Optionally, applying different merge processing to the fields not marked as keys according to their types comprises one or a combination of the following:
For a sum-type field, the values of this field in the records whose key-field values are identical are added together, and the sum is taken as the value of this field after merging;
For an average-type field, the values of this field in the records whose key-field values are identical are averaged, and the mean is taken as the value of this field after merging;
For a maximum-type field, the largest of the values of this field in the records whose key-field values are identical is taken as the value of this field after merging;
For a constant-string-type field, the value of this field in the first of the records whose key-field values are identical is taken as the value of this field after merging;
For a cumulative-string-type field, the strings in this field in the records whose key-field values are identical are concatenated in a specified order, and the result is taken as the value of this field after merging.
According to the technical solution of the present invention, a client is deployed on each production server and sends the data it collects to the server end. The client stores the obtained data separately by business category, each data record comprises more than one field of different types, and when the timing cycle of a business category ends, the client merges the stored records of that category whose key-field values are identical into a single record before sending it to the server end. Data with any number of fields can therefore be transmitted, and merging is already done at the client. This solves the problem that the existing scribe allows each record only the two fields category and message, which restricts data transmission. It also solves the problem that the existing scribe merely records data faithfully at the front end without merging it, so that the data volume is large, the transmission frequency is high, and network congestion and delay easily result.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 is a schematic diagram of log collection with the existing Scribe;
Fig. 2 shows a block diagram of a system for collecting data according to an embodiment of the invention;
Fig. 3 shows a structural diagram of a client for collecting data according to an embodiment of the invention;
Fig. 4 shows a flowchart of a method for collecting data according to an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Fig. 2 shows a block diagram of a system for collecting data according to an embodiment of the invention. As shown in Fig. 2, the system comprises a server end 202 and a plurality of clients 201. The clients 201 are deployed on the different production servers from which data needs to be collected. Each client 201 collects the data produced by the production server it runs on and sends it to the server end 202; the server end 202 receives the data sent by each client 201 and either stores it locally or forwards it to another server. Specifically:
Each client 201 is adapted to obtain the data that its production server produces for different categories of business, and to store the obtained data separately according to the corresponding business category. Each data record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key; each business category has a corresponding timing cycle;
Each client 201, when the timing cycle corresponding to a business category ends, merges those stored records of that category whose key-field values are identical into a single record and then sends it to the server end 202;
The server end 202 is adapted to receive data from each client 201, and to store or forward it.
Here, all data of the same business category has the same data format; that is, the number of fields in a record and the type of each field are the same. The data format of each business category can be defined according to actual requirements, namely the number of fields a record contains and the type of each field. For example, fields of the following types can be defined: sum type (SUM_INT), average type (AVG_INT), maximum type (MAX_INT), constant string type (CONST_STRING), cumulative string type (CONST_STRING), and so on.
The purpose of defining these field types is to allow various optimizations for the data of different business categories, so that the data occupies less space, is faster to analyze and process, uses less memory, makes the meaning of each field easier to identify, and is prepared for the later merging.
Each client 201 maintains a number of "boxes" according to its configuration, and each box holds data of one format. In other words, each business category corresponds to one "box", and the data of that category is stored in the corresponding box. The client 201 decides which box to put collected data in according to its data format. When one cycle of a box ends, the client 201 merges the data in that box and then sends it to the server end 202.
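The "box" mechanism described above can be sketched as follows. This is only an illustrative outline under stated assumptions: the class names `Box` and `Client`, the `tick` method, routing by category name rather than by inspecting the data format, and the injected `send` callable are all hypothetical and are not taken from the patent.

```python
class Box:
    """Holds the records of one business category until its timing cycle ends."""

    def __init__(self, name, period_seconds, start_time):
        self.name = name
        self.period = period_seconds
        self.cycle_start = start_time
        self.records = []

    def add(self, record):
        self.records.append(record)

    def cycle_ended(self, now):
        return now - self.cycle_start >= self.period

    def drain(self, now):
        """Return the stored records, empty the box and start a new cycle."""
        records, self.records = self.records, []
        self.cycle_start = now
        return records


class Client:
    def __init__(self, boxes, send):
        self.boxes = {box.name: box for box in boxes}
        self.send = send  # callable standing in for transmission to the server end

    def collect(self, category, record):
        # Assumption: the category name selects the box directly.
        self.boxes[category].add(record)

    def tick(self, now, merge=lambda records: records):
        """Flush every box whose cycle has ended; `merge` defaults to a no-op."""
        for box in self.boxes.values():
            if box.cycle_ended(now):
                records = box.drain(now)
                if records:
                    self.send(box.name, merge(records))


sent = []
client = Client([Box("Login", 300, start_time=0)],
                send=lambda name, records: sent.append((name, records)))
client.collect("Login", ("ZK", "Index.php", 1))
client.tick(now=299)   # cycle not finished, nothing is sent
client.tick(now=300)   # cycle finished, the box is flushed
```

Passing the current time in explicitly, instead of reading a clock inside the class, is a design choice made here only to keep the sketch deterministic.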
In the present invention, some fields of a record are marked as keys (the "Key" attribute) to serve as the basis for merging. When data is merged, the key fields of the records are compared, and only records whose key-field values are identical can be merged into a single record.
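A minimal sketch of this key comparison, assuming records are represented as dictionaries (the representation and the function name `group_by_key` are illustrative choices, not part of the patent):

```python
from collections import defaultdict


def group_by_key(records, key_fields):
    """Group records so that each group shares identical values in all key fields."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[name] for name in key_fields)
        groups[key].append(record)
    return dict(groups)


records = [
    {"user_id": "ZK", "script": "Index.php", "number": 1},
    {"user_id": "ZK", "script": "Login.php", "number": 2},
    {"user_id": "ZK", "script": "Index.php", "number": 5},
]
groups = group_by_key(records, key_fields=("user_id", "script"))
```

Each resulting group is a candidate for being merged into a single record; how the non-key fields are combined depends on their types.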
When the timing cycle corresponding to a business category ends and the client 201 merges the stored records of that category whose key-field values are identical into a single record, it applies different merge processing to the fields not marked as keys according to their types. That is, fields of different types are merged in different ways.
When applying different merge processing to the fields not marked as keys according to their types, the client 201 can use one or a combination of the following:
(1) For a sum-type field: when merging, the values of the sum-type field in the records whose key-field values are identical are added together, and the sum is taken as the value of the sum-type field of the merged record;
(2) For an average-type field: when merging, the values of the average-type field in the records whose key-field values are identical are averaged, and the mean is taken as the value of the average-type field of the merged record;
(3) For a maximum-type field: when merging, the largest of the values of the maximum-type field in the records whose key-field values are identical is taken as the value of the maximum-type field of the merged record;
(4) For a constant-string-type field: when merging, the value of the constant-string-type field in the first of the records whose key-field values are identical is taken as the value of the constant-string-type field of the merged record;
(5) For a cumulative-string-type field: when merging, the strings in the cumulative-string-type field in the records whose key-field values are identical are concatenated in a specified order, and the result is taken as the value of the cumulative-string-type field of the merged record.
The above gives five example field types and their corresponding merge methods. The field types in the present invention are not limited to these five; further field types and their merge methods can be defined according to actual business requirements. For example, a floating-point average type (AVG_FLOAT), a minimum type (MIN_INT) and a floating-point sum type (SUM_FLOAT) can also be defined; these are not described in detail one by one here.
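The five merge methods can be sketched as a lookup table from field type to a reducing function. Several points are assumptions: the text gives no distinct identifier for the cumulative string type, so the name `CONCAT_STRING` is hypothetical; the "specified order" is taken to be arrival order; and the average is computed with true division because rounding behavior is not stated.

```python
# Each reducer takes the list of values that one field has across the records
# sharing the same key-field values, and returns the merged value.
REDUCERS = {
    "SUM_INT": sum,                            # sum type
    "AVG_INT": lambda vs: sum(vs) / len(vs),   # average type (rounding unspecified)
    "MAX_INT": max,                            # maximum type
    "CONST_STRING": lambda vs: vs[0],          # constant string: keep the first value
    "CONCAT_STRING": lambda vs: "".join(vs),   # hypothetical name: concatenate in arrival order
}


def merge_field(field_type, values):
    """Merge the values of one non-key field according to its declared type."""
    return REDUCERS[field_type](values)


total = merge_field("SUM_INT", [1, 1, 5, 3])
average = merge_field("AVG_INT", [2, 4])
largest = merge_field("MAX_INT", [7, 10, 3])
first = merge_field("CONST_STRING", ["a", "b"])
joined = merge_field("CONCAT_STRING", ["ab", "cd"])
```

Adding a further type such as MIN_INT would amount to registering one more entry, for example `min`, in the table.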
A concrete example of merging data is given below.
Define a data format for a login business. This format is used to record data of the login business "number of times a user accesses a page". Accordingly, the client maintains a "box" named "Login" whose cycle is 300 seconds. The data format is:
Login(300):user_id KEY_STR,script KEY_STR,number SUM_INT,datetime TIME_FLOOR;
This data format comprises 4 fields. The first two fields, user_id and script, are marked as keys (KEY_STR); the last two fields, number and datetime, are of sum type (SUM_INT) and floor time type (TIME_FLOOR) respectively.
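As a rough illustration, a definition written in this notation could be parsed as shown below. The grammar is inferred only from the single "Login" example, so the regular expression and the function name `parse_format` are assumptions rather than a specification of the configuration syntax.

```python
import re


def parse_format(definition):
    """Parse 'Name(period):field TYPE,field TYPE,...;' into its parts."""
    match = re.fullmatch(r"\s*(\w+)\((\d+)\)\s*:\s*(.*?)\s*;?\s*", definition)
    if match is None:
        raise ValueError(f"unrecognized format definition: {definition!r}")
    name, period, body = match.group(1), int(match.group(2)), match.group(3)
    fields = []
    for part in body.split(","):
        field_name, field_type = part.split()
        fields.append((field_name, field_type))
    return name, period, fields


name, period, fields = parse_format(
    "Login(300):user_id KEY_STR,script KEY_STR,number SUM_INT,datetime TIME_FLOOR;"
)
# The key fields are those whose type starts with "KEY_" (an assumed convention).
key_fields = [field for field, kind in fields if kind.startswith("KEY_")]
```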
Once the format is defined, data conforming to the field types can be sent on the production server, and the client deployed on that production server collects the data sent. For example, the data collected by the client between 2012-09-21 00:00:00 and 2012-09-21 00:04:59 is shown in Table 1:
No.  user_id  script     number  datetime
1    ZK       Index.php  1       2012-09-21 00:00:00
2    ZK       Index.php  1       2012-09-21 00:01:03
3    ZK       Index.php  5       2012-09-21 00:01:23
4    ZK       Login.php  2       2012-09-21 00:02:14
5    HGL      Login.php  2       2012-09-21 00:02:14
6    ZK       Index.php  3       2012-09-21 00:03:19
7    HGL      Index.php  7       2012-09-21 00:04:10
8    HGL      Index.php  10      2012-09-21 00:04:34
Table 1
The data shown in Table 1 all belongs to the login business and has the same format, so the client puts it into the "Login" box. When the 300-second cycle ends, the client merges the data in the "Login" box; the merge result is shown in Table 2:
user_id  script     number  datetime             Merged from
ZK       Index.php  10      2012-09-21 00:00:00  records 1, 2, 3 and 6
ZK       Login.php  2       2012-09-21 00:00:00  record 4
HGL      Index.php  17      2012-09-21 00:00:00  records 7 and 8
HGL      Login.php  2       2012-09-21 00:00:00  record 5
Table 2
The last column of Table 2 explains the merge. Because records 1, 2, 3 and 6 in Table 1 have identical content in the first two fields, which are marked as keys, they can be merged into a single record. In the merged record, the first two fields keep their original values; the third field is of sum type, so its value is the sum of the third field of records 1, 2, 3 and 6 in Table 1, namely 10; the fourth field is of floor time type, so its value is the start time of this cycle. Likewise, record 4 in Table 1 forms a merged record on its own, records 7 and 8 are merged together, and record 5 forms a merged record on its own. The merge result is shown in Table 2.
In this way, 8 records entered the "Login" box within one cycle (2012-09-21 00:00:00 to 2012-09-21 00:04:59), but only 4 records were sent to the server end 202.
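The merge from Table 1 to Table 2 can be reproduced with a short sketch. The function names are illustrative, and deriving the cycle start by flooring each timestamp to a multiple of the 300-second period is an assumption about how the floor time type is computed.

```python
from datetime import datetime, timedelta

PERIOD = timedelta(seconds=300)
FMT = "%Y-%m-%d %H:%M:%S"

# The eight records of Table 1: (user_id, script, number, datetime).
TABLE_1 = [
    ("ZK", "Index.php", 1, "2012-09-21 00:00:00"),
    ("ZK", "Index.php", 1, "2012-09-21 00:01:03"),
    ("ZK", "Index.php", 5, "2012-09-21 00:01:23"),
    ("ZK", "Login.php", 2, "2012-09-21 00:02:14"),
    ("HGL", "Login.php", 2, "2012-09-21 00:02:14"),
    ("ZK", "Index.php", 3, "2012-09-21 00:03:19"),
    ("HGL", "Index.php", 7, "2012-09-21 00:04:10"),
    ("HGL", "Index.php", 10, "2012-09-21 00:04:34"),
]


def time_floor(stamp, period=PERIOD):
    """Round a timestamp down to the start of its cycle (assumed TIME_FLOOR rule)."""
    moment = datetime.strptime(stamp, FMT)
    return (moment - (moment - datetime.min) % period).strftime(FMT)


def merge_login(records):
    """Merge records with equal (user_id, script): sum number, floor datetime."""
    merged = {}
    for user_id, script, number, stamp in records:
        key = (user_id, script)
        running_total, _ = merged.get(key, (0, None))
        merged[key] = (running_total + number, time_floor(stamp))
    return [(user, script, total, start)
            for (user, script), (total, start) in merged.items()]


table_2 = merge_login(TABLE_1)
```

The rows come out in order of first appearance of each key, which differs from the row order of Table 2 but contains the same four merged records.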
The server end 202 is adapted to receive the data sent by each client 201, and to forward the received data to another server, forward it to a database device (such as a MySQL server), or save it as a local file.
Thus the server end 202 receives the data sent by each client, and after receiving it can also forward it to another server or to a database device, that is, it plays the role of a "proxy". This makes the system suitable for heterogeneous network environments or machine rooms.
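The store-or-forward role of the server end might be outlined as below. This is a sketch under assumptions: the class name `ServerEnd`, the JSON-lines layout, and the use of an in-memory stream and a plain callable in place of a real local file, another server or a database are all illustrative.

```python
import io
import json


class ServerEnd:
    """Receives merged records and stores them locally and/or forwards them."""

    def __init__(self, sink=None, forward=None):
        self.sink = sink        # writable text stream standing in for a local file
        self.forward = forward  # callable standing in for another server or a database

    def receive(self, category, records):
        if self.sink is not None:
            for record in records:
                line = json.dumps({"category": category, "record": record})
                self.sink.write(line + "\n")
        if self.forward is not None:
            self.forward(category, records)


local_file = io.StringIO()
forwarded = []
server = ServerEnd(sink=local_file,
                   forward=lambda category, records: forwarded.append((category, records)))
server.receive("Login", [["ZK", "Index.php", 10, "2012-09-21 00:00:00"]])
```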
As can be seen from the above, because the client processes and merges data flexibly, the data collection system of the present invention can be used both for collecting logs and for instrumentation (event-point) statistics.
The structure of the client 201 is described below.
Fig. 3 shows a kind of according to an embodiment of the invention structural drawing of collecting the client of data.As shown in Figure 3, this client comprises: data capture unit 301, merger processing unit 303 and a plurality of storage unit 302, a plurality of storage unit 302 are the different classes of business of correspondence respectively, and each storage unit 302 has the timing cycle of a correspondence.Wherein:
Data capture unit 301 is suitable for obtaining from producing server the data of corresponding different classes of business, and the data of obtaining are preserved to corresponding storage unit 302 according to the different classes of distribution of services of correspondence; Wherein, every data comprise more than one field, and different fields has different types, and at least one field identification of every data has key;
Each storage unit 302 is suitable for preserving the data from data capture unit 301;
Merger processing unit 303 is suitable for when the timing cycle of each storage unit 302 correspondences finishes, and it is to send to server end after the data that the sign in the data that this storage unit 302 is preserved has the identical aggregation of data of value of the field of key.
Here, the data layout of the data of one species various-service is identical, and namely the type of the field number that comprises of data and each field is all identical.The form that can define according to the actual requirements data of business of all categories, the field number that namely can comprise according to data of practical business requirement definition and the type of each field.
In one embodiment of the invention, merge processing unit 303 is further adapted, when the timing period corresponding to each business category ends and the stored records of that category whose key-marked fields have identical values are merged into a single record, to apply different merge processing to the fields not marked as key according to their types.
In one embodiment of the invention, when applying different merge processing to the fields not marked as key according to their types, merge processing unit 303 is further adapted to use one or a combination of the following:
For a field of sum type, add the numeric values of this field across the records whose key-marked fields have identical values, and use the sum as the value of this field after merging;
For a field of average type, average the numeric values of this field across the records whose key-marked fields have identical values, and use the mean as the value of this field after merging;
For a field of maximum type, find the maximum among the values of this field across the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of ordinary string type, take the value of this field from the first of the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of accumulating string type, concatenate the strings of this field across the records whose key-marked fields have identical values in a specified order, and use the result as the value of this field after merging.
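The five merge rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `merge_records`, the type labels, and the sample field names are hypothetical, and the "specified order" for concatenation is assumed here to be arrival order with a comma separator.

```python
def merge_records(records, key_fields, field_types, concat_sep=","):
    """Merge records whose key fields have identical values.

    records:     list of dicts sharing one layout
    key_fields:  names of the key-marked fields
    field_types: non-key field name -> "sum" | "avg" | "max" | "string" | "concat"
    """
    # Group records by the tuple of key values, preserving arrival order.
    groups = {}
    for rec in records:
        key = tuple(rec[name] for name in key_fields)
        groups.setdefault(key, []).append(rec)

    merged = []
    for key, group in groups.items():
        out = dict(zip(key_fields, key))
        for name, ftype in field_types.items():
            values = [rec[name] for rec in group]
            if ftype == "sum":
                out[name] = sum(values)
            elif ftype == "avg":
                out[name] = sum(values) / len(values)
            elif ftype == "max":
                out[name] = max(values)
            elif ftype == "string":
                out[name] = values[0]  # value from the first record
            elif ftype == "concat":
                out[name] = concat_sep.join(values)  # assumed order: arrival
            else:
                raise ValueError(f"unknown field type: {ftype}")
        merged.append(out)
    return merged


# Hypothetical sample: three records, two of which share the key "/a".
records = [
    {"url": "/a", "hits": 1, "ms": 10, "peak": 3, "host": "h1", "tags": "x"},
    {"url": "/a", "hits": 2, "ms": 30, "peak": 7, "host": "h2", "tags": "y"},
    {"url": "/b", "hits": 5, "ms": 8, "peak": 1, "host": "h1", "tags": "z"},
]
types = {"hits": "sum", "ms": "avg", "peak": "max", "host": "string", "tags": "concat"}
merged = merge_records(records, ["url"], types)
```

With this sample, the two `/a` records collapse into one, so two records are sent instead of three.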
Fig. 4 shows a flowchart of a method for collecting data according to an embodiment of the invention. As shown in Fig. 4, the method comprises:
Step S410: a client deployed on a production server obtains the data, corresponding to different business categories, produced by that production server; wherein each record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key;
Here, all data of the same business category share the same data format, that is, the number of fields in each record and the type of each field are identical.
Step S420: the client stores the obtained data by corresponding business category; wherein each business category has one corresponding timing period;
Step S430: for each business category, when the corresponding timing period ends, the client merges the stored records of that category whose key-marked fields have identical values into a single record, and then sends it to the server end.
In step S430, merging the records whose key-marked fields have identical values into a single record comprises: applying different merge processing to the fields not marked as key according to their types. Applying different merge processing to the fields not marked as key according to field type comprises one or a combination of the following:
For a field of sum type, add the numeric values of this field across the records whose key-marked fields have identical values, and use the sum as the value of this field after merging;
For a field of average type, average the numeric values of this field across the records whose key-marked fields have identical values, and use the mean as the value of this field after merging;
For a field of maximum type, find the maximum among the values of this field across the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of ordinary string type, take the value of this field from the first of the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of accumulating string type, concatenate the strings of this field across the records whose key-marked fields have identical values in a specified order, and use the result as the value of this field after merging.
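The flow of steps S410 to S430 can be pictured with the following Python sketch. It is an illustration only: `CollectorClient`, `collect`, `tick`, and `sum_hits_by_url` are hypothetical names, the current time is passed in explicitly instead of using a real timer, and the merge function supplied here is a deliberately simple sum over one field rather than the full set of type-specific rules.

```python
class CollectorClient:
    """Buffers records per business category and flushes each category
    when its own timing period ends (hypothetical sketch)."""

    def __init__(self, periods, merge, send, now=0.0):
        self.periods = dict(periods)            # category -> period in seconds
        self.merge = merge                      # merge(category, records) -> list
        self.send = send                        # send(category, merged_records)
        self.buffers = {c: [] for c in self.periods}
        self.deadlines = {c: now + p for c, p in self.periods.items()}

    def collect(self, category, record):
        """S410/S420: store an obtained record under its business category."""
        self.buffers[category].append(record)

    def tick(self, now):
        """S430: for every category whose period has ended, merge and send."""
        for category, deadline in self.deadlines.items():
            if now < deadline:
                continue
            pending, self.buffers[category] = self.buffers[category], []
            self.deadlines[category] = now + self.periods[category]
            if pending:
                self.send(category, self.merge(category, pending))


def sum_hits_by_url(category, records):
    """Example merge: 'url' is the key field, 'hits' is a sum-type field."""
    totals = {}
    for rec in records:
        totals[rec["url"]] = totals.get(rec["url"], 0) + rec["hits"]
    return [{"url": url, "hits": hits} for url, hits in totals.items()]
```

Giving each category its own period lets a high-volume category be flushed often while a low-volume one is batched for longer, which is one way to read the per-category timing period in step S420.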
In summary, the technical solution of the present invention deploys a client on each of the different production servers, and each client sends the data it collects to the server end. The client stores the obtained data by corresponding business category, each record comprising more than one field of different types; when the timing period corresponding to each business category ends, the client merges the stored records of that category whose key-marked fields have identical values into a single record and then sends it to the server end. Records with an arbitrary number of fields can thus be transmitted, and the merge processing is already performed at the client. This solves the problem that the existing Scribe allows each record to have only two fields, category and message, which places many restrictions on data transmission. It also solves the problem that the existing Scribe merely records data faithfully at the front end without merging it, which leads to a large transmission volume and a high transmission frequency and easily causes network congestion and delay. The technical solution of the invention saves bandwidth, is simple to deploy, easy to maintain, and efficient, and, while meeting the needs of network data transmission, better satisfies the flexible and varied requirements of log transmission.
It should be noted that:
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. In addition, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
In the specification provided herein, numerous specific details are described. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and placed in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
In addition, those skilled in the art will understand that, although some embodiments described herein include certain features included in other embodiments rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the client and server end in the data collection system according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.

Claims (10)

1. A system for collecting data, wherein the system comprises a server end and a plurality of clients deployed on different production servers,
The client is adapted to obtain the data, corresponding to different business categories, produced by the production server, and to store the obtained data by corresponding business category;
Wherein each record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key; each business category has one corresponding timing period;
The client is further adapted, when the timing period corresponding to each business category ends, to merge the stored records of that category whose key-marked fields have identical values into a single record, and then to send it to the server end;
The server end is adapted to receive data from each client and to store or forward it.
2. The system of claim 1, wherein,
The client is further adapted, when the timing period corresponding to each business category ends and the stored records of that category whose key-marked fields have identical values are merged into a single record, to apply different merge processing to the fields not marked as key according to their types.
3. The system of claim 2, characterized in that,
The client is further adapted, when applying different merge processing to the fields not marked as key according to their types, to use one or a combination of the following:
For a field of sum type, add the numeric values of this field across the records whose key-marked fields have identical values, and use the sum as the value of this field after merging;
For a field of average type, average the numeric values of this field across the records whose key-marked fields have identical values, and use the mean as the value of this field after merging;
For a field of maximum type, find the maximum among the values of this field across the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of ordinary string type, take the value of this field from the first of the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of accumulating string type, concatenate the strings of this field across the records whose key-marked fields have identical values in a specified order, and use the result as the value of this field after merging.
4. The system of any one of claims 1 to 3, characterized in that,
The server end is adapted to forward the received data to another server, or to forward it to a database device, or to save it as a local file.
5. A client for collecting data, wherein the client comprises a data capture unit, a merge processing unit, and a plurality of storage units, the plurality of storage units corresponding respectively to different business categories, and each storage unit having one corresponding timing period;
The data capture unit is adapted to obtain, from the production server, data corresponding to different business categories, and to distribute the obtained data to the corresponding storage units according to business category; wherein each record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key;
Each storage unit is adapted to store the data received from the data capture unit;
The merge processing unit is adapted, when the timing period corresponding to each storage unit ends, to merge the records stored in that storage unit whose key-marked fields have identical values into a single record, and then to send it to the server end.
6. The client of claim 5, wherein,
The merge processing unit is further adapted, when the timing period corresponding to each business category ends and the stored records of that category whose key-marked fields have identical values are merged into a single record, to apply different merge processing to the fields not marked as key according to their types.
7. The client of claim 6, wherein,
The merge processing unit is further adapted, when applying different merge processing to the fields not marked as key according to their types, to use one or a combination of the following:
For a field of sum type, add the numeric values of this field across the records whose key-marked fields have identical values, and use the sum as the value of this field after merging;
For a field of average type, average the numeric values of this field across the records whose key-marked fields have identical values, and use the mean as the value of this field after merging;
For a field of maximum type, find the maximum among the values of this field across the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of ordinary string type, take the value of this field from the first of the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of accumulating string type, concatenate the strings of this field across the records whose key-marked fields have identical values in a specified order, and use the result as the value of this field after merging.
8. A method for collecting data, wherein the method comprises:
A client deployed on a production server obtains the data, corresponding to different business categories, produced by that production server; wherein each record comprises more than one field, different fields have different types, and at least one field of each record is marked as a key;
The client stores the obtained data by corresponding business category; wherein each business category has one corresponding timing period;
For each business category, when the corresponding timing period ends, the client merges the stored records of that category whose key-marked fields have identical values into a single record, and then sends it to the server end.
9. The method of claim 8, wherein merging the records whose key-marked fields have identical values into a single record comprises:
For the fields not marked as key, applying different merge processing according to their types.
10. The method of claim 9, wherein applying different merge processing to the fields not marked as key according to field type comprises one or a combination of the following:
For a field of sum type, add the numeric values of this field across the records whose key-marked fields have identical values, and use the sum as the value of this field after merging;
For a field of average type, average the numeric values of this field across the records whose key-marked fields have identical values, and use the mean as the value of this field after merging;
For a field of maximum type, find the maximum among the values of this field across the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of ordinary string type, take the value of this field from the first of the records whose key-marked fields have identical values, and use it as the value of this field after merging;
For a field of accumulating string type, concatenate the strings of this field across the records whose key-marked fields have identical values in a specified order, and use the result as the value of this field after merging.
CN201210404918.3A 2012-10-22 2012-10-22 System, client and method for collecting data Active CN102937984B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610302731.0A CN105930502B (en) 2012-10-22 2012-10-22 System, client and method for collecting data
CN201210404918.3A CN102937984B (en) 2012-10-22 2012-10-22 System, client and method for collecting data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210404918.3A CN102937984B (en) 2012-10-22 2012-10-22 System, client and method for collecting data

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201610302731.0A Division CN105930502B (en) 2012-10-22 2012-10-22 System, client and method for collecting data

Publications (2)

Publication Number Publication Date
CN102937984A (en) 2013-02-20
CN102937984B (en) 2016-06-08

Family

ID=47696881

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201610302731.0A Expired - Fee Related CN105930502B (en) 2012-10-22 2012-10-22 System, client and method for collecting data
CN201210404918.3A Active CN102937984B (en) 2012-10-22 2012-10-22 System, client and method for collecting data

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201610302731.0A Expired - Fee Related CN105930502B (en) 2012-10-22 2012-10-22 System, client and method for collecting data

Country Status (1)

Country Link
CN (2) CN105930502B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104090938A (en) * 2014-06-26 2014-10-08 广州金山网络科技有限公司 Method and device for submitting data
CN104699718A (en) * 2013-12-10 2015-06-10 阿里巴巴集团控股有限公司 Method and device for rapidly introducing business data
CN110995839A (en) * 2019-12-03 2020-04-10 北京搜狐新媒体信息技术有限公司 Method and device for analyzing performance of advertisement system and computer storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064317B (en) * 2018-08-10 2021-04-02 玖富金科控股集团有限责任公司 Data receiving and forwarding method, electronic equipment and readable storage medium
CN109491815A (en) * 2018-10-17 2019-03-19 深圳壹账通智能科技有限公司 Based on multistage data creation method, device and computer equipment
CN110826307A (en) * 2019-10-31 2020-02-21 北京字节跳动网络技术有限公司 Method and device for creating business object
CN112416972A (en) * 2020-09-25 2021-02-26 上海哔哩哔哩科技有限公司 Real-time data stream processing method, device, equipment and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5737600A (en) * 1994-09-12 1998-04-07 International Business Machines Corporation Method and system for log management in a coupled data processing system
CN1949214A (en) * 2006-09-26 2007-04-18 北京北大方正电子有限公司 Information merging method and system
CN102637142A (en) * 2012-04-13 2012-08-15 浪潮(北京)电子信息产业有限公司 Computer system and method for realizing log management

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060129415A1 (en) * 2004-12-13 2006-06-15 Rohit Thukral System for linking financial asset records with networked assets
CN101566986A (en) * 2008-04-21 2009-10-28 阿里巴巴集团控股有限公司 Method and device for processing data in online business processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
庄欣: "统一网络安全管理中数据采集代理的设计和实现", 《中国优秀硕士学位论文全文数据库信息科技辑》, no. 11, 15 November 2009 (2009-11-15), pages 16 - 61 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104699718A (en) * 2013-12-10 2015-06-10 阿里巴巴集团控股有限公司 Method and device for rapidly introducing business data
CN104699718B (en) * 2013-12-10 2019-04-12 阿里巴巴集团控股有限公司 Method and apparatus for being rapidly introduced into business datum
CN104090938A (en) * 2014-06-26 2014-10-08 广州金山网络科技有限公司 Method and device for submitting data
WO2015196983A1 (en) * 2014-06-26 2015-12-30 广州金山网络科技有限公司 Data submission method and device
CN110995839A (en) * 2019-12-03 2020-04-10 北京搜狐新媒体信息技术有限公司 Method and device for analyzing performance of advertisement system and computer storage medium
CN110995839B (en) * 2019-12-03 2022-09-20 北京搜狐新媒体信息技术有限公司 Method and device for analyzing performance of advertisement system and computer storage medium

Also Published As

Publication number Publication date
CN105930502B (en) 2020-04-10
CN105930502A (en) 2016-09-07
CN102937984B (en) 2016-06-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220714

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.