CN118427220A - Data processing method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN118427220A (application CN202310093677.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- window
- data processing
- data source
- offset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the invention discloses a data processing method, a data processing apparatus, an electronic device and a storage medium. The method comprises: obtaining initial time windows corresponding to a plurality of data source ends and the data identifier of each data source end; calculating a window offset for each data source end according to its data identifier; performing offset processing on each initial time window based on the corresponding window offset to obtain a target time window for each data source end, wherein the window start times of the target time windows of at least two data source ends are different; and performing periodic data processing on the data to be processed of each data source end based on the target time windows to obtain a data processing result for each data source end. Because the target time windows have different window start times, the time windows of different data source ends are not aligned at the same points on the timeline of the data processing process, which effectively relieves the pressure that the data processing process places on the data processing device and improves the performance of the data processing device.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, a data processing device, an electronic device, and a storage medium.
Background
With the development of big data technology, processing streaming data has become a fundamental technology in many application fields. For example, in an internet application, streaming data processing may be used to compute user-related usage statistics in real time, such as the total number of visits per minute or the number of web page views over 5 minutes; it can also be used for product recommendation, business risk control, and the like.
Currently, a common method of processing streaming data is to use time windows. The streaming data is divided into a plurality of time periods by time windows, and the data in each time period is treated as a set. In the related art, the time windows in stream processing are aligned, so when a large number of data sources need to be processed, their time windows end at the same moments and output a large number of calculation results together. This places severe data processing pressure on the data processing device, which can degrade its performance or even paralyze it.
Disclosure of Invention
The embodiment of the invention provides a data processing method, a device, electronic equipment and a storage medium, which can effectively relieve the pressure of a data processing process on the data processing equipment and improve the performance of the data processing equipment.
The embodiment of the invention provides a data processing method, which comprises the following steps:
acquiring initial time windows corresponding to a plurality of data source ends and data identifiers of the data source ends;
calculating the window offset corresponding to each data source end according to the data identifiers;
performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein the window start times of the target time windows of at least two data source ends are different;
and performing periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end.
Accordingly, an embodiment of the present invention provides a data processing apparatus, including:
a window acquisition unit, configured to acquire initial time windows corresponding to a plurality of data source ends and data identifiers of the data source ends;
an offset calculating unit, configured to calculate the window offset corresponding to each data source end according to the data identifiers;
a window offset processing unit, configured to perform offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein the window start times of the target time windows of at least two data source ends are different;
and a data processing unit, configured to perform periodic data processing on the data to be processed of each data source end based on the target time window to obtain the data processing result of each data source end.
In some optional embodiments, the offset calculating unit is configured to perform mapping calculation on the data identifiers according to a preset mapping relationship, so as to obtain mapping numerical results corresponding to the data identifiers;
and take the absolute value of each mapping numerical result, and calculate the window offset corresponding to each data source end according to the resulting absolute value and the time window length of the initial time window.
In some optional embodiments, the data processing apparatus provided by the embodiments of the present invention further includes a first re-offset unit, configured to calculate an offset difference between the window offsets based on the window offsets corresponding to the data source ends;
if a first offset difference value smaller than a preset first offset difference threshold exists, weighting calculation is carried out on the window offset corresponding to the first offset difference value, and a new window offset is obtained;
and return to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until no offset difference value is smaller than the first offset difference threshold.
In some optional embodiments, the data processing apparatus provided in the embodiments of the present invention further includes a first threshold calculating unit, configured to obtain processing resource information of a data processing device and a historical data processing result of each data source end;
calculate the maximum data processing amount of the data processing device according to the processing resource information;
and predict the first offset difference threshold according to the maximum data processing amount and each historical data processing result.
In some optional embodiments, the data processing apparatus provided in the embodiments of the present invention further includes a second re-shifting unit, configured to group the data source ends according to the maximum data processing amount and each of the historical data processing results, where the offset difference value between data source ends in the same group is smaller than a preset second offset difference threshold;
if a second offset difference value that is not smaller than the preset second offset difference threshold exists, and the data source ends corresponding to the second offset difference value belong to the same group, perform a weighted calculation on the window offsets corresponding to the second offset difference value to obtain new window offsets;
and return to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until the offset difference values of the data source ends in the same group are smaller than the second offset difference threshold.
In some optional embodiments, the data processing apparatus provided by the embodiments of the present invention further includes a real-time re-shifting unit, configured to predict, based on the target time window corresponding to each data source end and a window time length of the target time window, a data processing pressure value of each time period in a data processing timeline of the data processing device, where the time length of each time period is calculated by the window time length;
when a target time period exists in which the data processing pressure value is larger than a preset processing pressure threshold value, shifting the target time window in the target time period to obtain a new target time window;
and return to the step of predicting the data processing pressure value of each time period in the data processing timeline of the data processing device based on the target time window corresponding to each data source end and the window time length of the target time window, until no such target time period remains in the data processing timeline.
In some optional embodiments, the data processing unit is configured to perform data processing on data to be processed of each data source end once based on the target time window, so as to obtain a real-time data processing result of each data source end;
calculate the data volume per unit time of each data source end according to the real-time data processing result and the window time length of the target time window;
modify the window time length of the target time window based on the data volume per unit time to obtain a new target time window;
and return to the step of performing data processing once on the data to be processed of each data source end based on the target time window to obtain the real-time data processing result of each data source end, until the data processing timeline of the data processing device ends.
Correspondingly, the embodiment of the invention also provides electronic equipment, which comprises a memory and a processor; the memory stores an application program, and the processor is configured to run the application program in the memory, so as to execute steps in any one of the data processing methods provided by the embodiments of the present invention.
Accordingly, an embodiment of the present invention further provides a computer readable storage medium storing a plurality of instructions adapted to be loaded by a processor to perform any of the steps of the data processing method provided by the embodiment of the present invention.
In addition, the embodiment of the present invention further provides a computer program product, which includes a computer program or instructions, where the computer program or instructions implement steps in any of the data processing methods provided in the embodiments of the present invention when executed by a processor.
With the scheme of the embodiment of the invention, the initial time windows corresponding to a plurality of data source ends and the data identifier of each data source end can be obtained; the window offset corresponding to each data source end is calculated according to the data identifiers; each initial time window is shifted based on the corresponding window offset to obtain the target time window of each data source end, wherein the window start times of the target time windows of at least two data source ends are different; and periodic data processing is performed on the data to be processed of each data source end based on the target time windows to obtain the data processing result of each data source end. Because the time window of each data source end is shifted according to that data source end's own data identifier, the resulting target time windows have different window start times. The time windows of different data source ends are therefore not aligned at the same points on the timeline of the data processing process, and the data processing device does not have to compute all of the windows at the same moment, which effectively relieves the pressure that the data processing process places on the data processing device and improves its performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of a scenario of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data processing method provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of a landing page provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of an authorized login page provided by an embodiment of the present invention;
FIG. 5 is a schematic flow chart of user login using authorization provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another structure of a data processing apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The embodiment of the invention provides a data processing method, a data processing device, electronic equipment and a computer readable storage medium. Specifically, the embodiment of the invention provides a data processing method suitable for a data processing device, and the data processing device can be integrated in electronic equipment.
The electronic device may be a terminal or a similar device, including but not limited to a mobile terminal or a fixed terminal. Mobile terminals include, but are not limited to, smart phones, smart watches, tablet computers, notebook computers and smart in-vehicle devices; fixed terminals include, but are not limited to, desktop computers and smart televisions.
The electronic device may also be a server. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms, but is not limited thereto.
The data processing method of the embodiment of the invention can be implemented by a server alone, or by a terminal and a server together.
The following description takes the case in which the terminal and the server implement the data processing method together as an example.
As shown in fig. 1, a data processing system provided in an embodiment of the present invention includes a terminal 10, a server 20, and the like; the terminal 10 and the server 20 are connected through a network, for example, a wired or wireless network connection, wherein the terminal 10 may exist as a terminal on which a data source is mounted and which transmits data to be processed to the server 20.
The terminal 10 may be a terminal for collecting data to be processed, and is configured to send the data to be processed to the server 20.
The server 20 may be configured to obtain initial time windows corresponding to a plurality of data source ends, and data identifiers of the data source ends, calculate window offsets corresponding to the data source ends according to the data identifiers, and perform offset processing on each initial time window based on the window offsets corresponding to the data source ends to obtain target time windows corresponding to the data source ends, where window start times of the target time windows of at least two data source ends are different, and perform periodic data processing on data to be processed of each data source end based on the target time windows to obtain a data processing result of each data source end.
It will be appreciated that in some embodiments, the steps of the data processing performed by the server 20 may also be performed by the terminal 10, and/or the data to be processed may also originate from the server 20 and/or the terminal 10, which is not limited by the embodiment of the present invention.
Detailed descriptions follow. The order in which the embodiments are described below is not intended to limit the order of preference of the embodiments.
Embodiments of the present invention will be described in terms of a data processing apparatus, which may be integrated in a server or terminal in particular.
As shown in fig. 2, the specific flow of the data processing method of this embodiment may be as follows:
201. Acquire initial time windows corresponding to the data source ends and data identifiers of the data source ends.
A data source end is the source of the data to be processed. For example, when the data processing task is to count the number of visits to websites per minute, the data source ends may be the websites to be counted; when the data processing task is to count each user's transaction amount per minute, the data source ends may be the clients of the users to be counted, and so on.
Generally, a time window is a bounded range of time. Time windows come in different types, such as sliding windows, tumbling (rolling) windows and session windows. For a tumbling window, adjacent windows do not overlap and the window length (Size) is fixed; a sliding window also has a fixed window length but advances by a fixed step (Slide) each time, so consecutive sliding windows may overlap.
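The difference between the two fixed-length window types can be illustrated with a minimal Python sketch. The function names `tumbling_window` and `sliding_windows` are hypothetical and do not come from the patent; windows are assumed to be half-open intervals [start, end) over integer timestamps.

```python
from typing import List, Tuple


def tumbling_window(timestamp: int, size: int) -> Tuple[int, int]:
    """Return the single [start, end) tumbling window that contains timestamp."""
    start = timestamp - timestamp % size
    return (start, start + size)


def sliding_windows(timestamp: int, size: int, slide: int) -> List[Tuple[int, int]]:
    """Return every [start, end) sliding window that contains timestamp."""
    # Latest window start that is not after the timestamp.
    start = timestamp - timestamp % slide
    windows = []
    # Walk backwards one slide at a time while the window still covers the timestamp.
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= slide
    return windows
```

With a size of 10 and a slide of 5, a record falls into exactly one tumbling window but into two overlapping sliding windows.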
Specifically, the initial time window is the time window preset for each data source end before any window offset is applied. In practical applications, the initial time windows corresponding to different data source ends are generally the same. If the initial time windows are used directly for data processing, the data processing device experiences high data processing pressure at the same moments, because the initial time windows have the same window time length and the same start time.
For example, in the related art, time is commonly divided into equal-length time periods (i.e., time windows), each piece of data is assigned to the time window it falls in, and the data in each time window is then processed. For example, assuming that the time window length is 10 seconds, the time windows can be expressed as [0, 10), [10, 20), [20, 30), ….
Suppose the data processing procedure is to count the number of clicks of the user within a time window of 10 minutes, and output these data at the end of the window. If the number of users is large, statistics of large amounts of data may occur, even in the hundreds of millions. This results in the output data curve assuming a pulse pattern as shown in fig. 3. It will be appreciated that the pulse pattern data causes a greater processing pressure on the data processing apparatus at peak times and may not effectively utilize the processing resources of the data processing apparatus at valley times.
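The pulse-shaped output described above can be reproduced with a small simulation. This is a sketch under stated assumptions: every data source uses the same aligned tumbling window and emits exactly one result at each window end, and the function name `emission_times` and its parameters are hypothetical.

```python
from collections import Counter


def emission_times(num_sources: int, window_size: int, horizon: int) -> Counter:
    """Count how many sources emit a result at each instant when all windows are aligned."""
    counts = Counter()
    for _ in range(num_sources):
        # Aligned windows end at window_size, 2 * window_size, ... for every source.
        for window_end in range(window_size, horizon + 1, window_size):
            counts[window_end] += 1
    return counts
```

For 1000 sources and a 600-second window over 1800 seconds, all 1000 results land on the same three instants and nothing is emitted in between, which is the peak-and-valley pattern the text describes.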
Specifically, the data identifier of each data source may be identifier information that distinguishes each data source from other data sources. In general, the data source identification may be different depending on the data processing task.
For example, when the data processing task is to count the transaction amount of each user in a certain period of time, the data identifier may be a user identifier (such as a user ID), or it may be an identifier of the terminal used by the user (such as a communication identification code or a MAC address).
For another example, when the data processing task is to count the number of visits to websites in a certain period of time, the data identifier may be an identifier of the server of each website (such as a communication identification code or a MAC address), or may be the URL of each website, or the like.
202. Calculate the window offset corresponding to each data source end according to the data identifiers.
Specifically, the window offset may be a length of time that offsets the initial time window.
In the embodiment of the invention, the calculation of the window offset can be any calculation mode capable of mapping different data identifiers into different values. That is, the step of calculating the window offset corresponding to each data source according to the data identifier may specifically include:
performing mapping calculation on each data identifier according to a preset mapping relationship to obtain the mapping numerical result corresponding to each data identifier.
For example, the mapping relationship may be various types of hash algorithms, such as MD5, SHA-1, and so on.
In some alternative embodiments, the window offset may be calculated directly from the mapped value result.
It will be appreciated that the mapping numerical result may be negative. To avoid data processing errors caused by shifting a window directly by a negative mapping numerical result, in other embodiments the absolute value of the mapping numerical result may be taken, and the window offset corresponding to each data source end may then be calculated from that absolute value and the time window length of the initial time window.
For example, the data identifier of the data source end whose time window is to be calculated may be hashed to obtain a hash value, and the absolute value of that hash value is then taken.
Specifically, the absolute value may be calculated by the following formula:
hash = |hash_code(key)|
where key is the data identifier of the data source end, hash_code(key) inside the absolute value signs is the mapping numerical result, and hash on the left side of the formula is the result obtained after taking the absolute value.
Further, the window offset may be calculated by the following formula:
dynamicOffset = hash % windowSize
where dynamicOffset is the window offset and windowSize is the time window length of the initial time window.
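The two formulas above can be sketched in Python as follows. The sketch assumes MD5 as the mapping relationship (one of the hash algorithms the text names) and assumes that the first four bytes of the digest are read as a signed 32-bit integer, so that the mapping result can be negative as the text describes; the function name `window_offset` and the sample key are hypothetical.

```python
import hashlib


def window_offset(key: str, window_size: int) -> int:
    """Map a data identifier to a window offset in the range [0, window_size)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Assumption: treat the first 4 bytes as a signed integer, so the raw
    # mapping result may be negative, as in the patent's description.
    hash_code = int.from_bytes(digest[:4], byteorder="big", signed=True)
    hash_abs = abs(hash_code)          # hash = |hash_code(key)|
    return hash_abs % window_size      # dynamicOffset = hash % windowSize
```

Because the mapping is deterministic, the same data identifier always yields the same offset, while different identifiers are likely, though not guaranteed, to yield different offsets.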
In the embodiment of the invention, the window offset calculated by the data identifications of the same data source end can be the same, and the window offsets calculated by the data identifications of different data source ends can be different.
203. Perform offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein the window start times of the target time windows of at least two data source ends are different.
Specifically, each initial time window may be shifted to obtain a target time window as follows: the window start time of the target time window is calculated from the window start time of the initial time window, the time window length and the window offset; the window end time of the target time window is then calculated from the window start time of the target time window, the time window length and the window offset. Once both the window start time and the window end time are known, the target time window is obtained.
Specifically, the window start time of the target time window may be calculated by the following formula:
windowStart = timestamp - (timestamp + windowSize) % windowSize + dynamicOffset
where windowSize is the time window length, timestamp is the window start time of the initial time window, and windowStart is the window start time of the target time window.
Optionally, the window start time of the initial time window may be expressed as a timestamp or in a similar form.
Specifically, the window end time of the target time window may be calculated by the following formula:
windowEnd = windowStart + windowSize + dynamicOffset
where windowEnd is the window end time of the target time window.
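As a worked illustration, the sketch below transcribes the two formulas literally into Python. The function name `target_window` is hypothetical, and timestamps are assumed to be non-negative integers. Note that, read literally, the end-time formula adds the window offset a second time, so the resulting span is windowSize + dynamicOffset rather than windowSize; the sketch follows the text as written rather than guessing at a different intent.

```python
from typing import Tuple


def target_window(timestamp: int, window_size: int, dynamic_offset: int) -> Tuple[int, int]:
    """Compute (windowStart, windowEnd) of the target time window."""
    # windowStart = timestamp - (timestamp + windowSize) % windowSize + dynamicOffset
    window_start = timestamp - (timestamp + window_size) % window_size + dynamic_offset
    # windowEnd = windowStart + windowSize + dynamicOffset (literal transcription)
    window_end = window_start + window_size + dynamic_offset
    return window_start, window_end
```

For example, with a 600-second window and an offset of 137, an initial window starting at 600 yields a target window starting at 737; a timestamp of 625 that is not aligned to the window grid is first snapped back to 600 and therefore yields the same start.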
In the embodiment of the invention, the offset processing is performed on each initial time window, so that the target time windows of the data identifications of different data sources are not aligned together on a time line, and the calculation of the time windows is not concentrated together, thereby reducing the pressure on a downstream system (namely, data processing equipment such as a database or an online service equipment).
In some optional embodiments, the target time windows obtained after shifting may still be too densely packed. To avoid this, the window offsets may be further adjusted according to the differences between them. That is, before the step of "performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end", the data processing method provided by the embodiment of the present invention may further include:
calculating an offset difference value between the window offsets based on the window offsets corresponding to the data source ends;
if a first offset difference value smaller than a preset first offset difference threshold exists, performing a weighted calculation on the window offsets corresponding to the first offset difference value to obtain new window offsets;
and returning to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until no offset difference value is smaller than the first offset difference threshold.
The offset difference is the difference between window offsets. When calculating offset differences, the window offsets of the data source ends may be compared pairwise. Alternatively, offset ranges may be divided according to the size of the window offsets, and offset differences calculated only among window offsets that fall in the same offset range, and so on.
The first offset difference threshold may be a length of time, for example 3 seconds or 1 minute, and may be set by a technician according to actual data processing requirements.
Optionally, during the weighted calculation, the weights applied to the different window offsets may be determined according to the size of the offset difference. For example, if the weighting is applied to the window offset of the target time window with the later window start time, a smaller offset difference may be given a larger weight, so that the window start time of that target time window is pushed later, and so on.
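The re-offset loop described above might look like the following Python sketch. The patent does not specify the weighting, so the rule used here is an assumption: offsets are compared in sorted order, and the later offset of a too-close pair is pushed later by an amount that grows as the difference shrinks. The names `spread_offsets`, `min_gap` and `max_rounds` are hypothetical, and the sketch does not wrap offsets back into the window length.

```python
from typing import Dict


def spread_offsets(offsets: Dict[str, int], min_gap: int, max_rounds: int = 1000) -> Dict[str, int]:
    """Adjust offsets until every adjacent pair differs by at least min_gap."""
    result = dict(offsets)
    for _ in range(max_rounds):
        ordered = sorted(result, key=result.get)
        changed = False
        for earlier, later in zip(ordered, ordered[1:]):
            diff = result[later] - result[earlier]
            if diff < min_gap:
                # Assumed weighting: the smaller the difference, the larger the push.
                result[later] += min_gap - diff
                changed = True
        if not changed:
            # No difference is below the threshold any more: stop iterating.
            break
    return result
```

Starting from offsets 10, 11, 12 and 100 with a minimum gap of 5, the three crowded offsets are spread to 10, 15 and 20 while the offset of 100 is left untouched.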
Specifically, the first offset difference threshold may be determined according to the processing capability of the data processing device, the data volume of the data source ends, and the like. That is, before the step of "if a first offset difference value smaller than a preset first offset difference threshold exists, performing a weighted calculation on the window offsets corresponding to the first offset difference value to obtain new window offsets", the data processing method provided by the embodiment of the present invention may further include:
acquiring processing resource information of data processing equipment and historical data processing results of all data source ends;
Calculating the maximum data processing capacity of the data processing equipment according to the processing resource information;
Predicting the first offset difference threshold according to the maximum data processing capacity and each historical data processing result.
Wherein the processing resource information is information describing the data processing capability of the data processing device. For example, the processing resource information may be information about available computing resources of the data processing apparatus for a specified period of time, and so on.
Specifically, the historical data processing result may be the most recent data processing result, or the most recent N data processing results, of the data source end. The data processing result may indicate the amount of data per unit time at the data source end.
Based on the processing resource information and the history data processing results, the number of data sources that the data processing apparatus can support processing in a certain period of time, etc. can be estimated. Further, the minimum time interval between the target time windows of different data sources, i.e. the first offset difference threshold, may be estimated.
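The text does not give a formula for this estimate. As one hedged illustration, the sketch below assumes the maximum data processing capacity is expressed in records per second and each historical result as a per-source data rate. It then takes the time needed to process the heaviest source's window as the minimum spacing between window start times. The function name and all parameters are hypothetical.

```python
def estimate_first_threshold(max_records_per_second, history_rates, window_size_seconds):
    """Estimate a minimum gap (seconds) between target time windows.

    history_rates: dict mapping a data source identifier to its observed
    records per second (taken from historical data processing results).
    Assumption: windows are processed one after another at full capacity.
    """
    if max_records_per_second <= 0:
        raise ValueError("capacity must be positive")
    heaviest_rate = max(history_rates.values())
    records_per_window = heaviest_rate * window_size_seconds
    # Time to finish the heaviest window before the next one fires.
    return records_per_window / max_records_per_second
```

For example, with a capacity of 1000 records per second, a heaviest source at 30 records per second and 10 second windows, the estimated gap is 0.3 seconds.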
In some optional embodiments, in order to avoid wasting processing resources of the data processing device caused by an excessive time interval between target time windows, the data processing method provided by the embodiment of the present invention may further include:
Grouping the data source ends according to the maximum data processing capacity and each historical data processing result, wherein the offset difference value of the data source ends in the same group is smaller than a preset second offset difference threshold;
if a second offset difference value that is not smaller than the preset second offset difference threshold exists, and the data source ends corresponding to the second offset difference value belong to the same group, performing weighted calculation on the window offsets corresponding to the second offset difference value to obtain new window offsets;
Returning to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until the offset difference values corresponding to the data source ends of the same group are all smaller than the second offset difference threshold.
When grouping the data source ends, data source ends whose data amount per unit time, as indicated by the historical data processing results, is similar may be placed in one group, so that the data processing device can better allocate processing resources.
Alternatively, data source ends with different data volumes may be placed in one group so that the total data amount per unit time of each group is similar, which avoids excessive fluctuation in the processing resources of the data processing device.
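The second grouping strategy (groups with similar total data amount per unit time) can be illustrated with a simple greedy assignment. This is only a sketch under assumptions: the function name `group_balanced`, the fixed number of groups, and the greedy rule are not taken from the text.

```python
def group_balanced(rates, group_count):
    """Assign data sources to groups so that group totals stay close.

    rates: dict mapping a data source identifier to its data amount per
    unit time. Greedy rule (an assumption): take sources from heaviest to
    lightest and place each into the currently lightest group.
    """
    groups = [[] for _ in range(group_count)]
    totals = [0.0] * group_count
    for source, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        lightest = totals.index(min(totals))
        groups[lightest].append(source)
        totals[lightest] += rate
    return groups, totals
```

For four sources with rates 5, 4, 3 and 2 split into two groups, both groups end up with a total of 7.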
Based on the processing resource information and the history data processing results, the number of data sources that the data processing apparatus can support processing in a certain period of time, etc. can be estimated. Further, the maximum time interval between the target time windows of different data sources without idle computing resources, i.e. the second offset difference threshold, may be estimated.
204. And carrying out periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end.
In the embodiment of the present invention, the data to be processed may be streaming data, and Flink may be used for the periodic data processing. Flink is an open-source distributed stream processing framework for processing large volumes of real-time data streams. It provides a flexible programming model that can be used to process both real-time data streams and batch data.
Specifically, periodic data processing means that, for each data source end, processing is repeated cyclically using the target time window corresponding to that data source end, until a preset processing duration elapses or another processing end condition set by a technician is reached.
For example, the offset value of the window can be obtained by taking the hash value of the user ID modulo the window length. If the offset value of user 1 is 0, that user's time windows are [0, 10), [10, 20), and so on. If the offset value of user 2 is 1, that user's time windows are [1, 11), [11, 21), and so on. If the offset value of user 3 is 2, that user's time windows are [2, 12), [12, 22), and so on.
As shown in fig. 4, with this smoothed design of the time windows, the computation of the time windows is no longer concentrated at the same moments, which reduces the pressure on the downstream system and makes the data output curve smoother.
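The smoothing effect can be checked with a small counting sketch. It assumes a window length of 10 time units, as in the user example, and counts how many windows close at each instant for aligned versus staggered offsets. The function name `window_end_counts` is hypothetical.

```python
from collections import Counter

def window_end_counts(offsets, window_size, horizon):
    """Count how many windows close at each instant up to `horizon`.

    offsets: one window offset per data source. Each source's windows are
    [offset, offset + window_size), [offset + window_size, ...), and so on.
    """
    counts = Counter()
    for offset in offsets:
        end = offset + window_size
        while end <= horizon:
            counts[end] += 1
            end += window_size
    return counts

aligned = window_end_counts([0, 0, 0], window_size=10, horizon=30)
staggered = window_end_counts([0, 1, 2], window_size=10, horizon=30)
```

With aligned offsets all three windows close at the same instants (10, 20, 30), whereas with offsets 0, 1 and 2 at most one window closes at any instant.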
In some optional embodiments, in order to avoid that the processing pressure of the data processing device in a certain period of time is too high, the target time window of the data source end may be shifted in real time, and before the step of performing periodic data processing on the data to be processed of each data source end based on the target time window to obtain the data processing result of each data source end, the data processing method provided by the embodiment of the present invention may further include:
Predicting data processing pressure values of each time period in a data processing time line of the data processing equipment based on a target time window corresponding to each data source end and window time length of the target time window, wherein the time length of the time period is calculated by the window time length;
When a target time period with the data processing pressure value larger than a preset processing pressure threshold exists, shifting a target time window in the target time period to obtain a new target time window;
and returning to execute the step of predicting the data processing pressure value of each time period in the data processing time line of the data processing equipment based on the target time window corresponding to each data source end and the window time length of the target time window until the target time period does not exist in the data processing time line.
The time length of the time period may be the same as the window time length, or it may be, for example, twice the window time length; this is not limited in the embodiment of the present invention.
Specifically, the data processing pressure value may be, for example, the percentage of the processing resources of the data processing device that are occupied during the time period.
In some optional embodiments, in order to avoid that the processing pressure of the data processing device in a certain period of time is too high, the window time length of the target time window of the data source end may be adjusted in real time, and the step of performing periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end may specifically include:
Based on the target time window, carrying out primary data processing on the data to be processed of each data source end to obtain real-time data processing results of each data source end;
Calculating the data quantity of each data source end in unit time according to the real-time data processing result and the window time length of the target time window;
Based on the data quantity in unit time, modifying the window time length of the target time window to obtain a new target time window;
And returning to execute the step of carrying out data processing on the data to be processed of each data source end once based on the target time window to obtain the real-time data processing result of each data source end until the data processing time line of the data processing equipment is ended.
The real-time data processing result can indicate the amount of data of the data source end in the current period. Specifically, the data amount per unit time of each data source end can be obtained from the real-time data processing result and the window time length of the target time window.
By modifying the window time length of the target time window, the window start time and the window end time of the subsequent target time window can be correspondingly adjusted, so that the time window is smoother.
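How the window time length is modified is left open by the text. A minimal sketch, assuming the goal is to keep the number of records per window near a chosen target, might look like this. The names `adjust_window_length` and `target_records` and the clamping bounds are hypothetical.

```python
def adjust_window_length(record_count, window_length, target_records,
                         min_length=1.0, max_length=3600.0):
    """Return a new window length (seconds) from the last window's result.

    record_count: records seen in the last window (real-time result).
    Assumption: the new length keeps roughly `target_records` per window,
    clamped to [min_length, max_length].
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    rate = record_count / window_length  # data amount per unit time
    if rate == 0:
        return max_length  # no data observed: use the longest window
    return min(max_length, max(min_length, target_records / rate))
```

For instance, 1000 records in a 10 second window gives a rate of 100 records per second, so a target of 500 records shortens the next window to 5 seconds.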
In practical applications, the data processing method provided by the embodiment of the present invention can be applied to aggregate calculations in Internet applications that divide user-related usage data into windows of a specified size, for purposes such as marketing, recommendation, and risk control. Examples include counting the visits to a website per minute, counting the visits to a web page every 5 minutes, and counting the number of transactions of each user within 1 minute.
It can also be applied in the Internet of Things (IoT) to aggregate device data by window for detection, real-time analysis, and the like. Examples include computing the average temperature of a device per hour, the number of anomalies within 5 minutes, and the power consumption of a device over 1 hour.
As can be seen from the foregoing, in the embodiment of the present invention, initial time windows corresponding to a plurality of data source ends and the data identifiers of the data source ends may be obtained; the window offset corresponding to each data source end is calculated according to the data identifiers; offset processing is performed on each initial time window based on the window offset corresponding to each data source end to obtain the target time window corresponding to each data source end, where the window start times of the target time windows of at least two data source ends are different; and periodic data processing is performed on the data to be processed of each data source end based on the target time window to obtain the data processing result of each data source end. Because the time window of each data source end is offset according to its data identifier, target time windows with different window start times are obtained. The time windows of different data source ends are therefore not aligned at the same points on the timeline of the data processing process, and the data processing device does not compute the different time windows all at once, which effectively relieves the pressure that the data processing process places on the data processing device and improves its performance.
The method described in the previous examples is described in further detail below by way of example.
In this embodiment, a description will be given with reference to the system of fig. 1.
As shown in fig. 5, the specific flow of the data processing method of this embodiment may be as follows:
501. and acquiring initial time windows corresponding to the data source ends and data identifiers of the data source ends.
The data source terminal is a source terminal of data to be processed which needs to be processed. For example, when the data processing process is to count the access amount of the website per minute, the data source end may be each website needing to be counted; when the data processing process is to count the transaction amount of 1 minute for each user, the data source end can be the client end of each user needing to count, and the like.
Specifically, the data identifier of each data source may be identifier information that distinguishes each data source from other data sources. In general, the data source identification may be different depending on the data processing task.
For example, when the data processing task is to count the transaction amount of each user in a certain period of time, the data source identifier may be a user identifier (such as a user ID), or it may be an identifier of the terminal used by the user (such as a communication identification code or a MAC address).
For another example, when the data processing task is to count the access amount of websites in a certain period of time, the data source identifier may be an identifier used by the server of each website (such as a communication identification code or a MAC address), or the web address of each website, or the like.
502. And respectively carrying out mapping calculation on the data identifiers according to a preset mapping relation to obtain a mapping numerical value result corresponding to each data identifier.
For example, the data identifier of each data source end whose time window is to be calculated may be hashed to obtain a hash value, and the absolute value of the hash value is then taken.
Specifically, the absolute value of the result can be calculated by the following formula:
hash = |hash_code(key)|
Here, key is the data identifier of the data source end, hash_code(key) inside the absolute value symbols is the mapping numerical value result, and hash on the left side of the formula is the result obtained after taking the absolute value.
503. And taking an absolute value of the mapping numerical value result, and calculating window offset corresponding to each data source end according to the result obtained after taking the absolute value and the time window length of the initial time window.
Further, the window offset may be calculated by the following formula:
dynamicOffset = hash % windowSize
Here, dynamicOffset is the window offset, and windowSize is the time window length of the initial time window.
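The two formulas above can be tried out with the following sketch. The text does not say which hash function hash_code is; as an assumption, the code uses a Java-style 32-bit string hash (multiply by 31 and add each character code), chosen because it is deterministic across runs. The function names are hypothetical.

```python
def hash_code(key: str) -> int:
    """Java-style 32-bit string hash (an assumed stand-in for hash_code)."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed integer.
    return h - 0x100000000 if h >= 0x80000000 else h

def dynamic_offset(key: str, window_size: int) -> int:
    """dynamicOffset = |hash_code(key)| % windowSize."""
    return abs(hash_code(key)) % window_size
```

Under this assumed hash, the identifier "ab" hashes to 3105, so with a window length of 10 its window offset is 5. Any identifier yields an offset in the range 0 to windowSize - 1.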
504. And performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein the window starting time of the target time windows of at least two data source ends is different.
Specifically, the target time window may be obtained by shifting each initial time window as follows. The window start time of the target time window is calculated from the window start time of the initial time window, the time window length, and the window offset. The window end time of the target time window is then calculated from the window start time of the target time window, the time window length, and the window offset. Once the window start time and the window end time are known, the target time window is determined.
Specifically, the window start time of the target time window may be calculated by the following formula:
windowStart=timestamp-(timestamp+windowSize)%windowSize+dynamicOffset
Wherein windowSize is the time window length, timestamp is the window start time of the initial time window, and windowStart is the window start time of the target time window.
Optionally, the window start time of the initial time window may be in the form of a timestamp or the like.
Specifically, the window end time of the target time window may be calculated by the following formula:
windowEnd=windowStart+windowSize+dynamicOffset
Wherein windowEnd is the window end time of the target time window.
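A small sketch of the window start calculation follows. The start time uses the windowStart formula as given. The end time here is taken as windowStart + windowSize, an assumption chosen so that the result reproduces windows such as [1, 11) for an offset of 1 and a window length of 10; the end-time formula as printed additionally adds dynamicOffset. The function name `target_window` is hypothetical.

```python
def target_window(timestamp: int, window_size: int, dynamic_offset: int):
    """Return (window_start, window_end) of the shifted target window.

    window_start follows the formula in the text:
        timestamp - (timestamp + windowSize) % windowSize + dynamicOffset
    window_end = window_start + window_size is an assumption (see lead-in).
    """
    window_start = timestamp - (timestamp + window_size) % window_size + dynamic_offset
    window_end = window_start + window_size
    return window_start, window_end
```

For example, `target_window(0, 10, 1)` gives (1, 11) and `target_window(10, 10, 2)` gives (12, 22).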
505. And carrying out periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end.
In the embodiment of the present invention, the data to be processed may be streaming data, and Flink may be used for the periodic data processing. Flink is an open-source distributed stream processing framework for processing large volumes of real-time data streams. It provides a flexible programming model that can be used to process both real-time data streams and batch data.
Specifically, periodic data processing means that, for each data source end, processing is repeated cyclically using the target time window corresponding to that data source end, until a preset processing duration elapses or another processing end condition set by a technician is reached.
For example, the offset value of the window can be obtained by taking the hash value of the user ID modulo the window length. If the offset value of user 1 is 0, that user's time windows are [0, 10), [10, 20), and so on. If the offset value of user 2 is 1, that user's time windows are [1, 11), [11, 21), and so on. If the offset value of user 3 is 2, that user's time windows are [2, 12), [12, 22), and so on.
As can be seen from the foregoing, in the embodiment of the present invention, initial time windows corresponding to a plurality of data source ends and the data identifiers of the data source ends may be obtained; the window offset corresponding to each data source end is calculated according to the data identifiers; offset processing is performed on each initial time window based on the window offset corresponding to each data source end to obtain the target time window corresponding to each data source end, where the window start times of the target time windows of at least two data source ends are different; and periodic data processing is performed on the data to be processed of each data source end based on the target time window to obtain the data processing result of each data source end. Because the time window of each data source end is offset according to its data identifier, target time windows with different window start times are obtained. The time windows of different data source ends are therefore not aligned at the same points on the timeline of the data processing process, and the data processing device does not compute the different time windows all at once, which effectively relieves the pressure that the data processing process places on the data processing device and improves its performance.
In order to better implement the above method, correspondingly, the embodiment of the invention also provides a data processing device.
Referring to fig. 6, the apparatus includes:
The window obtaining unit 601 may be configured to obtain initial time windows corresponding to a plurality of data source ends, and data identifiers of the data source ends;
The offset calculating unit 602 may be configured to calculate, according to the data identifiers, window offsets corresponding to the data source ends respectively;
The window offset processing unit 603 may be configured to perform offset processing on each initial time window based on a window offset corresponding to each data source end, so as to obtain a target time window corresponding to each data source end, where window start times of target time windows of at least two data source ends are different;
the data processing unit 604 may be configured to perform periodic data processing on the data to be processed of each data source based on the target time window, to obtain a data processing result of each data source.
In some alternative embodiments, the offset calculating unit 602 may be configured to perform mapping calculation on the data identifiers according to a preset mapping relationship, so as to obtain a mapping numerical result corresponding to each data identifier;
and taking an absolute value of the mapping numerical value result, and calculating window offset corresponding to each data source end according to the result obtained after taking the absolute value and the time window length of the initial time window.
In some alternative embodiments, as shown in fig. 7, the data processing apparatus provided in this embodiment of the present invention may further include a first re-offset unit 605, configured to calculate an offset difference between the window offsets based on the window offsets corresponding to the data source ends;
If a first offset difference value smaller than a preset first offset difference threshold exists, weighting calculation is carried out on the window offset corresponding to the first offset difference value, and a new window offset is obtained;
And returning to execute the step of calculating the offset difference value between the window offsets based on the window offsets corresponding to the data source ends until the offset difference value is not smaller than the first offset difference threshold value.
In some optional embodiments, the data processing apparatus provided in the embodiments of the present invention may further include a first threshold calculation unit 606, which may be configured to obtain processing resource information of the data processing device and historical data processing results of each data source end;
Calculating the maximum data processing capacity of the data processing equipment according to the processing resource information;
Predicting the first offset difference threshold according to the maximum data processing capacity and each historical data processing result.
In some optional embodiments, the data processing apparatus provided in this embodiment of the present invention may further include a second re-offset unit 607, configured to group the data source ends according to the maximum data processing capacity and each historical data processing result, where the offset difference value of the data source ends in the same group is smaller than a preset second offset difference threshold;
if a second offset difference value which is not smaller than a preset second offset difference threshold value exists, and the data source ends corresponding to the second offset difference value belong to the same group, weighting calculation is carried out on the window offset corresponding to the second offset difference value, and a new window offset is obtained;
Returning to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until the offset difference values corresponding to the data source ends of the same group are all smaller than the second offset difference threshold.
In some optional embodiments, the data processing apparatus provided in the embodiments of the present invention may further include a real-time re-shifting unit, which may be configured to predict, based on the target time window corresponding to each data source end and the window time length of the target time window, a data processing pressure value of each time period in a data processing timeline of the data processing device, where the time length of the time period is calculated by the window time length;
When a target time period with the data processing pressure value larger than a preset processing pressure threshold exists, shifting a target time window in the target time period to obtain a new target time window;
and returning to execute the step of predicting the data processing pressure value of each time period in the data processing time line of the data processing equipment based on the target time window corresponding to each data source end and the window time length of the target time window until the target time period does not exist in the data processing time line.
In some alternative embodiments, the data processing unit 604 may be configured to perform data processing on the data to be processed of each data source end once based on the target time window, so as to obtain a real-time data processing result of each data source end;
Calculating the data quantity of each data source end in unit time according to the real-time data processing result and the window time length of the target time window;
Based on the data quantity in unit time, modifying the window time length of the target time window to obtain a new target time window;
And returning to execute the step of carrying out data processing on the data to be processed of each data source end once based on the target time window to obtain the real-time data processing result of each data source end until the data processing time line of the data processing equipment is ended.
As can be seen from the above, the data processing apparatus may obtain initial time windows corresponding to a plurality of data source ends and the data identifiers of the data source ends; calculate the window offset corresponding to each data source end according to the data identifiers; perform offset processing on each initial time window based on the window offset corresponding to each data source end to obtain the target time window corresponding to each data source end, where the window start times of the target time windows of at least two data source ends are different; and perform periodic data processing on the data to be processed of each data source end based on the target time window to obtain the data processing result of each data source end. Because the time window of each data source end is offset according to its data identifier, target time windows with different window start times are obtained. The time windows of different data source ends are therefore not aligned at the same points on the timeline of the data processing process, and the data processing device does not compute the different time windows all at once, which effectively relieves the pressure that the data processing process places on the data processing device and improves its performance.
In addition, an embodiment of the present invention further provides an electronic device, which may be a terminal or a server. Fig. 8 shows a schematic structural diagram of the electronic device according to the embodiment of the present invention. Specifically:
The electronic device may include a radio frequency (RF) circuit 801, a memory 802 including one or more computer-readable storage media, an input unit 803, a display unit 804, a sensor 805, an audio circuit 806, a wireless fidelity (WiFi) module 807, a processor 808 including one or more processing cores, and a power supply 809. It will be appreciated by those skilled in the art that the structure shown in fig. 8 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
Wherein:
The RF circuit 801 may be used to receive and transmit signals while sending and receiving messages or during a call. In particular, after downlink information from a base station is received, it is handed to one or more processors 808 for processing; in addition, uplink data is transmitted to the base station. In general, the RF circuit 801 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 801 may also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including, but not limited to, Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), and the like.
The memory 802 may be used to store software programs and modules that the processor 808 performs various functional applications and data processing by executing the software programs and modules stored in the memory 802. The memory 802 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device (such as audio data, phonebooks, etc.), and the like. In addition, memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 802 may also include a memory controller to provide access to the memory 802 by the processor 808 and the input unit 803.
The input unit 803 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, in one particular embodiment, the input unit 803 may include a touch-sensitive surface, as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations thereon or thereabout by a user (e.g., operations thereon or thereabout by a user using any suitable object or accessory such as a finger, stylus, etc.), and actuate the corresponding connection means according to a predetermined program. Alternatively, the touch-sensitive surface may comprise two parts, a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 808 and can receive commands from the processor 808 and execute them. In addition, touch sensitive surfaces may be implemented in a variety of types, such as resistive, capacitive, infrared, and surface acoustic waves. In addition to the touch-sensitive surface, the input unit 803 may also comprise other input devices. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, mouse, joystick, etc.
The display unit 804 may be used to display information entered by a user or provided to a user as well as various graphical user interfaces of the electronic device, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 804 may include a display panel, which may optionally be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch-sensitive surface may overlay a display panel, upon detection of a touch operation thereon or thereabout by the touch-sensitive surface, being communicated to the processor 808 to determine a type of touch event, and the processor 808 then provides a corresponding visual output on the display panel based on the type of touch event. Although in fig. 8 the touch sensitive surface and the display panel are implemented as two separate components for input and output functions, in some embodiments the touch sensitive surface may be integrated with the display panel to implement the input and output functions.
The electronic device may also include at least one sensor 805 such as a light sensor, a motion sensor, and other sensors. In particular, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or backlight when the electronic device is moved to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and the direction when the mobile phone is stationary, and can be used for applications of recognizing the gesture of the mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc. that may also be configured with the electronic device are not described in detail herein.
The audio circuit 806, a speaker, and a microphone may provide an audio interface between the user and the electronic device. The audio circuit 806 may convert received audio data into an electrical signal and transmit it to the speaker, which converts the electrical signal into a sound signal for output. Conversely, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 806 receives and converts into audio data; the audio data is then output to the processor 808 for processing and sent, for example, to another electronic device via the RF circuit 801, or output to the memory 802 for further processing. The audio circuit 806 may also include an earphone jack to allow a peripheral headset to communicate with the electronic device.
WiFi is a short-range wireless transmission technology. Through the WiFi module 807, the electronic device can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although fig. 8 shows the WiFi module 807, it is understood that the module is not an essential component of the electronic device and may be omitted as needed without changing the essence of the invention.
The processor 808 is the control center of the electronic device. It connects the various parts of the entire device through various interfaces and lines, and it performs the functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 802 and invoking data stored in the memory 802. Optionally, the processor 808 may include one or more processing cores; preferably, the processor 808 may integrate an application processor, which primarily handles the operating system, user interface, applications, and the like, with a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor may alternatively not be integrated into the processor 808.
The electronic device also includes a power supply 809 (e.g., a battery) for powering the various components. The power supply may be logically connected to the processor 808 through a power management system, so that charging, discharging, and power consumption management are performed by the power management system. The power supply 809 may also include any one or more components such as a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown, the electronic device may further include a camera, a Bluetooth module, and the like, which are not described herein. Specifically, in this embodiment, the processor 808 in the electronic device loads the executable files corresponding to the processes of one or more application programs into the memory 802 according to the following instructions, and executes the application programs stored in the memory 802 to implement the following functions:
acquiring initial time windows corresponding to a plurality of data source ends and data identifiers of the data source ends;
calculating the window offset corresponding to each data source end according to the data identifiers;
performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein the window starting times of the target time windows of at least two data source ends are different; and
carrying out periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end.
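The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in this application: the function names `window_offset`, `target_windows`, and `firing_times` are hypothetical, CRC32 is an assumed stand-in for the unspecified mapping from a data identifier to a number, and times are plain integer seconds.

```python
import zlib


def window_offset(data_id: str, window_len: int) -> int:
    """Map a data identifier to an offset in [0, window_len).

    Assumption: CRC32 is used only as a convenient deterministic mapping.
    """
    return zlib.crc32(data_id.encode("utf-8")) % window_len


def target_windows(initial_windows):
    """Shift each initial window by an offset derived from its identifier.

    initial_windows maps data_id -> (start, length), both in seconds.
    Returns a dict with the same keys and shifted (start, length) pairs.
    """
    result = {}
    for data_id, (start, length) in initial_windows.items():
        offset = window_offset(data_id, length)
        result[data_id] = (start + offset, length)
    return result


def firing_times(start: int, length: int, horizon: int):
    """Times before `horizon` at which periodic processing would begin."""
    return list(range(start, horizon, length))


if __name__ == "__main__":
    # Two sources that initially share the same one-minute window.
    windows = target_windows({"source-a": (0, 60), "source-b": (0, 60)})
    for data_id, (start, length) in windows.items():
        print(data_id, firing_times(start, length, 300))
```

Because the offset depends only on the identifier, each data source end keeps the same shifted window across restarts, while different identifiers tend to start at different times within the window length.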
Those of ordinary skill in the art will appreciate that all or a portion of the steps of the various methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present invention provides a computer readable storage medium having stored therein a plurality of instructions capable of being loaded by a processor to perform the steps of any of the data processing methods provided by the embodiments of the present invention. For example, the instructions may perform the steps of:
acquiring initial time windows corresponding to a plurality of data source ends and data identifiers of the data source ends;
calculating the window offset corresponding to each data source end according to the data identifiers;
performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein the window starting times of the target time windows of at least two data source ends are different; and
carrying out periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end.
For the specific implementation of each of the above operations, refer to the previous embodiments; details are not repeated here.
The computer-readable storage medium may include a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, and the like.
Because the instructions stored in the computer-readable storage medium can execute the steps of any data processing method provided by the embodiments of the present invention, they can achieve the beneficial effects of any such method. These effects are detailed in the previous embodiments and are not repeated here.
According to one aspect of the present application, there is also provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to cause the electronic device to perform the methods provided in the various alternative implementations of the embodiments described above.
The data processing method, apparatus, electronic device, and storage medium provided by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to aid understanding of the method and core idea of the present invention. Meanwhile, those skilled in the art may, in light of the ideas of the present invention, make changes to the specific implementations and the scope of application; accordingly, the content of this description should not be construed as limiting the present invention.
Claims (11)
1. A method of data processing, comprising:
Acquiring initial time windows corresponding to a plurality of data source ends and data identifiers of the data source ends;
calculating window offset corresponding to each data source end according to the data identification;
performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein window starting time of the target time windows of at least two data source ends is different;
And carrying out periodic data processing on the data to be processed of each data source end based on the target time window to obtain a data processing result of each data source end.
2. The data processing method according to claim 1, wherein the calculating the window offset corresponding to each data source end according to the data identifiers comprises:
Respectively carrying out mapping calculation on the data identifiers according to a preset mapping relation to obtain a mapping numerical value result corresponding to each data identifier;
And taking an absolute value of the mapping numerical value result, and calculating window offset corresponding to each data source end according to the result obtained after taking the absolute value and the time window length of the initial time window.
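The calculation in claim 2 can be illustrated with a short sketch. The claim does not fix the mapping relation or the way the absolute value is combined with the window length, so both are assumptions here: the mapping is a 31-based polynomial string hash wrapped to a signed 32-bit integer (which can be negative, hence the absolute value), and the combination is a modulo by the window length. The names `java_style_hash` and `window_offset` are hypothetical.

```python
def java_style_hash(s: str) -> int:
    """31-based polynomial hash wrapped to a signed 32-bit integer.

    Assumption: this stands in for the 'preset mapping relation'.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as signed.
    return h - (1 << 32) if h >= (1 << 31) else h


def window_offset(data_id: str, window_len: int) -> int:
    """Offset in [0, window_len) derived from the identifier."""
    mapped = java_style_hash(data_id)   # mapping numerical value, may be negative
    return abs(mapped) % window_len     # assumed combination: modulo window length


if __name__ == "__main__":
    for data_id in ("orders", "payments", "clickstream"):
        print(data_id, window_offset(data_id, 60))
```

Taking the absolute value before the modulo keeps the offset non-negative regardless of the sign of the mapped value, so the shifted window never starts before the initial one.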
3. The method for processing data according to claim 1, wherein before performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, the method further comprises:
calculating an offset difference value between the window offsets based on the window offsets corresponding to the data source ends;
if a first offset difference value smaller than a preset first offset difference threshold exists, weighting calculation is carried out on the window offset corresponding to the first offset difference value, and a new window offset is obtained;
and returning to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until every offset difference value is not smaller than the first offset difference threshold.
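The loop in claim 3 can be sketched as below. The claim does not say how the weighting calculation is performed, so this sketch assumes one simple choice: the later of two clashing offsets is moved to the earlier offset plus `weight * threshold`, with `weight >= 1`. The function name `separate_offsets`, the `weight` parameter, and the `max_rounds` guard are hypothetical, and wrap-around at the window length is ignored for brevity.

```python
def separate_offsets(offsets, threshold, weight=1.0, max_rounds=1000):
    """Adjust offsets until all pairwise differences are >= threshold.

    offsets maps data_id -> offset. Only adjacent pairs in sorted order
    need checking, since the smallest pairwise difference is always
    between neighbours.
    """
    offsets = dict(offsets)  # do not mutate the caller's mapping
    for _ in range(max_rounds):
        ordered = sorted(offsets.items(), key=lambda kv: kv[1])
        clash = None
        for (_, off_a), (id_b, off_b) in zip(ordered, ordered[1:]):
            if off_b - off_a < threshold:
                clash = (id_b, off_a)
                break
        if clash is None:
            return offsets
        id_b, off_a = clash
        # Assumed weighting: push the later offset past the earlier one.
        offsets[id_b] = off_a + weight * threshold
    raise RuntimeError("offsets could not be separated within max_rounds")


if __name__ == "__main__":
    print(separate_offsets({"a": 5, "b": 7, "c": 30}, threshold=10))
```

Recomputing the differences after every adjustment mirrors the "return to the step" wording of the claim: a moved offset may clash with a different neighbour, so the check restarts until no clash remains.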
4. The data processing method according to claim 3, wherein, before the weighting calculation is carried out on the window offset corresponding to the first offset difference value to obtain a new window offset, the method further comprises:
acquiring processing resource information of data processing equipment and historical data processing results of each data source end;
calculating the maximum data processing capacity of the data processing equipment according to the processing resource information;
And predicting a first offset difference threshold according to the maximum data processing amount and each historical data processing result.
5. The data processing method of claim 4, wherein the method further comprises:
Grouping the data source ends according to the maximum data processing amount and each historical data processing result, wherein the offset difference value of the data source ends in the same group is smaller than a preset second offset difference threshold value;
If a second offset difference value which is not smaller than a preset second offset difference threshold value exists, and the data source ends corresponding to the second offset difference value belong to the same group, weighting calculation is carried out on the window offset corresponding to the second offset difference value, and a new window offset is obtained;
and returning to the step of calculating the offset difference values between the window offsets based on the window offsets corresponding to the data source ends, until the offset difference value corresponding to the data source ends of the same group is larger than the second offset difference threshold.
6. The method for processing data according to claim 1, wherein the method further comprises, before performing periodic data processing on the data to be processed of each data source based on the target time window to obtain a data processing result of each data source:
predicting the data processing pressure value of each time period in a data processing timeline of a data processing device based on the target time window corresponding to each data source end and the window time length of the target time window, wherein the time length of each time period is calculated from the window time length;
when there is a target time period whose data processing pressure value is larger than a preset processing pressure threshold, shifting the target time windows in the target time period to obtain new target time windows; and
returning to the step of predicting the data processing pressure value of each time period in the data processing timeline of the data processing device based on the target time window corresponding to each data source end and the window time length of the target time window, until no target time period exists in the data processing timeline.
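A possible reading of claim 6 is sketched below. Several details are assumptions, since the claim leaves them open: the pressure value of a time period is taken as the sum of an expected load per data source end whose window starts in that period, periods are fixed-length slots on a cyclic timeline, and an overloaded slot is relieved by moving its lightest window to the next slot. The names `rebalance`, `loads`, `slot_len`, `num_slots`, and `max_pressure` are hypothetical.

```python
from collections import defaultdict


def rebalance(starts, loads, slot_len, num_slots, max_pressure, max_rounds=1000):
    """Shift window start times until no slot exceeds max_pressure.

    starts: data_id -> window start time (integer seconds)
    loads:  data_id -> expected load contributed by that source (assumed known)
    """
    starts = dict(starts)  # do not mutate the caller's mapping
    cycle = slot_len * num_slots
    for _ in range(max_rounds):
        pressure = defaultdict(float)
        members = defaultdict(list)
        for source_id, start in starts.items():
            slot = (start // slot_len) % num_slots
            pressure[slot] += loads[source_id]
            members[slot].append(source_id)
        hot = [s for s in sorted(pressure) if pressure[s] > max_pressure]
        if not hot:
            return starts
        # Assumed policy: move the lightest window of the first hot slot
        # forward by one slot, wrapping around the timeline.
        victim = min(members[hot[0]], key=lambda sid: loads[sid])
        starts[victim] = (starts[victim] + slot_len) % cycle
    raise RuntimeError("no schedule satisfies max_pressure within max_rounds")


if __name__ == "__main__":
    print(rebalance({"a": 0, "b": 2, "c": 4}, {"a": 3, "b": 2, "c": 1},
                    slot_len=10, num_slots=3, max_pressure=4))
```

The `max_rounds` guard matters because some inputs have no feasible schedule, for example a single source whose load alone exceeds the threshold.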
7. The method for processing data according to claim 1, wherein the performing periodic data processing on the data to be processed of each data source based on the target time window to obtain a data processing result of each data source includes:
performing one round of data processing on the data to be processed of each data source end based on the target time window to obtain a real-time data processing result of each data source end;
calculating the data amount per unit time of each data source end according to the real-time data processing result and the window time length of the target time window;
modifying the window time length of the target time window based on the data amount per unit time to obtain a new target time window; and
returning to the step of performing one round of data processing on the data to be processed of each data source end based on the target time window to obtain the real-time data processing result of each data source end, until the data processing timeline of the data processing device ends.
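The window-length adjustment in claim 7 can be sketched as follows. The claim does not give the rule for modifying the window time length, so this sketch assumes one: the next window is sized so that, at the observed rate, it would contain roughly `target_records` records, clamped to a minimum and maximum length. The name `next_window_length` and all parameters are hypothetical.

```python
def next_window_length(records_processed, window_len, target_records,
                       min_len=1.0, max_len=3600.0):
    """Return a new window length in seconds from the last round's volume.

    records_processed: number of records handled in the last window
    window_len:        length of that window in seconds
    target_records:    assumed desired number of records per window
    """
    rate = records_processed / window_len  # data amount per unit time
    if rate == 0:
        # No data arrived: fall back to the longest allowed window.
        return max_len
    return max(min_len, min(max_len, target_records / rate))


if __name__ == "__main__":
    window_len = 60.0
    for processed in (1200, 300, 0):
        window_len = next_window_length(processed, window_len, target_records=600)
        print(processed, window_len)
```

A busy data source end therefore gets shorter windows and a quiet one gets longer windows, which keeps the work per processing round roughly constant under this assumed rule.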
8. A data processing apparatus, comprising:
the window acquisition unit is used for acquiring initial time windows corresponding to a plurality of data source ends and data identifiers of the data source ends;
The offset calculating unit is used for calculating window offsets corresponding to the data source ends according to the data identifiers respectively;
The window offset processing unit is used for performing offset processing on each initial time window based on the window offset corresponding to each data source end to obtain a target time window corresponding to each data source end, wherein window starting time of the target time windows of at least two data source ends is different;
and the data processing unit is used for carrying out periodic data processing on the data to be processed of each data source end based on the target time window to obtain the data processing result of each data source end.
9. An electronic device comprising a memory and a processor; the memory stores an application program, and the processor is configured to execute the application program in the memory to perform the steps in the data processing method according to any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor for performing the steps in the data processing method according to any of claims 1 to 7.
11. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the data processing method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310093677.3A CN118427220A (en) | 2023-01-31 | 2023-01-31 | Data processing method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310093677.3A CN118427220A (en) | 2023-01-31 | 2023-01-31 | Data processing method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118427220A (en) | 2024-08-02 |
Family
ID=92322156
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310093677.3A Pending CN118427220A (en) | 2023-01-31 | 2023-01-31 | Data processing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118427220A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118643031A (en) * | 2024-08-07 | 2024-09-13 | 凯美瑞德(苏州)信息科技股份有限公司 | Financial data processing method and device, electronic equipment and processing medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110198275B (en) | Flow control method, system, server and storage medium | |
US10304461B2 (en) | Remote electronic service requesting and processing method, server, and terminal | |
WO2015090248A1 (en) | Server overload protection method and device | |
CN104618222A (en) | Method and device for matching expression image | |
CN104065693A (en) | Method, device and system for accessing network data in webpage applications | |
CN105227598A (en) | A kind of resource sharing method, device and system stored based on cloud | |
CN118427220A (en) | Data processing method and device, electronic equipment and storage medium | |
CN107317828B (en) | File downloading method and device | |
CN106294087B (en) | Statistical method and device for operation frequency of business execution operation | |
CN105025064B (en) | Download the method, apparatus and system of file | |
CN111371916B (en) | Data processing method and related equipment | |
CN115118636B (en) | Method and device for determining network jitter state, electronic equipment and storage medium | |
CN113593602B (en) | Audio processing method and device, electronic equipment and storage medium | |
CN112181508B (en) | Page automatic refreshing method and device and computer equipment | |
CN110209924B (en) | Recommendation parameter acquisition method, device, server and storage medium | |
CN107315623B (en) | Method and device for reporting statistical data | |
CN112131482A (en) | Aging determination method and related device | |
CN110913022A (en) | Method, device and system for downloading network file of mobile terminal and storage medium | |
CN114189436B (en) | Multi-cluster configuration deployment method and device, electronic equipment and storage medium | |
CN104125202A (en) | Weight adjustment method, device and terminal equipment | |
CN114363406B (en) | Push message processing method, device, equipment and storage medium | |
CN115134327B (en) | Message processing method and device, electronic equipment and storage medium | |
CN117112153A (en) | Task processing method, device, computer equipment and computer readable storage medium | |
CN108965358B (en) | Method and device for downloading application program applied to first terminal and server | |
CN106612315B (en) | File replacement method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||