
CN113608960A - Service monitoring method and device, electronic equipment and storage medium - Google Patents

Publication number
CN113608960A
Authority
CN
China
Prior art keywords
combustion rate
duration
service
combustion
threshold
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110780636.2A
Other languages
Chinese (zh)
Other versions
CN113608960B (en
Inventor
孙斌
史忠伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuba Co Ltd
Original Assignee
Wuba Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuba Co Ltd filed Critical Wuba Co Ltd
Priority to CN202110780636.2A priority Critical patent/CN113608960B/en
Publication of CN113608960A publication Critical patent/CN113608960A/en
Application granted granted Critical
Publication of CN113608960B publication Critical patent/CN113608960B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/32Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324Display of status information
    • G06F11/327Alarm or error message display

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Alarm Systems (AREA)

Abstract

The application provides a service monitoring method, a service monitoring apparatus, an electronic device and a storage medium, and relates to the technical field of communications. The method comprises: acquiring service data of a target service within a time window at preset time intervals; calculating a burn rate from the service data; and executing a preset alarm operation when the duration for which the calculated burn rate reaches a burn rate threshold exceeds a duration threshold. The embodiments of the application thereby reduce invalid alarms and improve alarm accuracy, so that users can take each alarm seriously instead of coping with invalid ones.

Description

Service monitoring method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a service monitoring method and apparatus, an electronic device, and a storage medium.
Background
Online service systems have become an integral part of people's lives. Because these systems are large and structurally complex, faults inevitably occur in operation, and a service monitoring system raises alarms to prompt the alarm recipient to handle them.
The existing service monitoring method calculates an error rate from the data in a 1-minute window and then decides whether to raise an alarm by comparing the error rate with a threshold. This approach evaluates the service too frequently and generates many alarms, some of which reflect negligible fluctuations that need no handling. As a result, the recipient's mailbox is flooded with alarm messages, the recipient grows weary of responding, and genuine problems are handled late.
The conventional service monitoring method therefore produces many invalid alarms and has low accuracy.
Disclosure of Invention
The embodiments of the application provide a service monitoring method and apparatus, an electronic device and a storage medium that reduce invalid alarms and improve alarm accuracy, so that users can take each alarm seriously instead of coping with invalid ones.
In order to solve the technical problem, the present application is implemented as follows:
In a first aspect, an embodiment of the present application provides a service monitoring method, where the method includes:
acquiring service data of a target service within a time window at preset time intervals;
calculating a burn rate from the service data;
and executing a preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold.
In a second aspect, an embodiment of the present application provides a service monitoring apparatus, where the apparatus includes:
a data acquisition module, configured to acquire service data of the target service within a time window at preset time intervals;
a data calculation module, configured to calculate the burn rate from the service data;
and an alarm module, configured to execute a preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold.
In a third aspect, an embodiment of the present application additionally provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the service monitoring method according to the first aspect.
In a fourth aspect, the present embodiments additionally provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the service monitoring method according to the first aspect.
In the embodiments of the application, service data of a target service within a time window is acquired at preset time intervals; a burn rate is calculated from the service data; and a preset alarm operation is executed when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold. The burn rate represents how fast the error budget is being consumed and reflects the occurrence of problem events in the target service more directly, so alarms based on the burn rate are more accurate. In addition, the alarm operation is executed only when the burn rate has stayed at the burn rate threshold for longer than the duration threshold, which filters out accidental fluctuations of the burn rate and thus avoids frequent alarms. The embodiments of the application therefore improve alarm accuracy and reduce invalid alarms, so that users can take each alarm seriously instead of coping with invalid ones.
The foregoing description is only an overview of the technical solutions of the present application, and the present application can be implemented according to the content of the description in order to make the technical means of the present application more clearly understood, and the following detailed description of the present application is given in order to make the above and other objects, features, and advantages of the present application more clearly understandable.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments of the present application will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart illustrating steps of a service monitoring method according to an embodiment of the present application;
fig. 2 is a block diagram of a service monitoring apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The service monitoring method of the embodiments of the present application may run on a terminal device or on a server. The terminal device may be a local terminal device. When the method runs on a server, it may be provided as a cloud presentation.
In an optional embodiment, the cloud presentation refers to an information presentation manner based on cloud computing. In the cloud display operation mode, an operation main body and an information picture presentation main body of an information processing program are separated, storage and operation of a display switching method are completed on a cloud display server, and a cloud display client is used for receiving and sending data and presenting an information picture, for example, the cloud display client can be a display device with a data transmission function close to a user side, such as a mobile terminal, a television, a computer, a palm computer and the like; however, the terminal device for processing the information data is a cloud display server at the cloud end. When browsing, a user operates the cloud display client to send an operation instruction to the cloud display server, the cloud display server performs coding compression on data according to operation instruction display information, returns the data to the cloud display client through a network, and finally decodes the data through the cloud display client and outputs display content.
In another alternative embodiment, the terminal device may be a local terminal device. The local terminal device stores an application program and is used for presenting an application interface. The local terminal device is used for interacting with a user through a graphical user interface, namely, downloading and installing an application program through the electronic device and running the application program conventionally. The manner in which the local terminal device provides the graphical user interface to the user may include a variety of ways, for example, it may be rendered for display on a display screen of the terminal or provided to the user by holographic projection. For example, the local terminal device may include a display screen for presenting a graphical user interface including an application screen and a processor for running the application, generating the graphical user interface, and controlling display of the graphical user interface on the display screen.
The application provides a service monitoring method, a service monitoring apparatus, an electronic device and a storage medium that reduce invalid alarms and thereby improve alarm accuracy, so that users can take each alarm seriously instead of coping with invalid ones.
For convenience of understanding the service monitoring method provided in the embodiment of the present application, the following concepts are first explained:
Service level indicator (SLI): an indicator of service quality, generally expressed as a ratio of two numbers, the number of good events divided by the total number of events (e.g., the number of successful Hypertext Transfer Protocol (HTTP) requests / the total number of HTTP requests, or the number of remote procedure call (RPC) calls completed within 100 ms / the total number of RPC calls).
Service level objective (SLO): the target that a service should meet for a given metric within a valid window, e.g., 99.9% of HTTP requests succeed within 30 days. In other words, the SLO is the target for the SLI. The valid window of the SLO is the time period over which the SLO is to be met.
Error budget: 100% minus the SLO percentage, i.e., the amount of error theoretically allowed over a period of time (e.g., 30 days).
Error rate: the proportion of problem events among all events.
Burn rate (also rendered as "combustion rate"): the speed at which the service consumes the error budget relative to the SLO.
Precision: the proportion of problem events among all detected events.
Recall: the proportion of detected problem events among all problems that objectively exist.
Reset time: how long an alarm continues after the problem has been solved.
The service monitoring method provided by the embodiment of the present application is explained in detail below.
Referring to fig. 1, a flowchart illustrating steps of a service monitoring method in an embodiment of the present application is shown, and the method may include the following steps 101 to 103.
Step 101: acquire service data of the target service within a time window at preset time intervals.
The preset time interval can be adjusted for different application scenarios; that is, different target services may use different time intervals. For example, the preset time interval may be 1 minute.
In addition, the target service may be an HTTP request, an RPC call, or the like.
Step 102: calculate the burn rate from the service data.
Optionally, the service data includes the total number of events and the number of problem events of the target service, and calculating the burn rate from the service data comprises:
calculating the ratio of the number of problem events to the total number of events as an error rate;
calculating the ratio of the error rate to a predetermined error budget as the burn rate.
That is, if the number of problem events is X, the total number of events is Y, the predetermined error budget is Z, and the error rate is C, then the error rate is C = X/Y and the burn rate is C/Z.
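The calculation in step 102 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names are hypothetical, and treating a window with no events as error-free is an assumption the text does not address.

```python
def error_rate(problem_events: int, total_events: int) -> float:
    """Error rate C = X / Y.

    Assumption: a window with no events is treated as error-free.
    """
    if total_events == 0:
        return 0.0
    return problem_events / total_events


def burn_rate(problem_events: int, total_events: int, error_budget: float) -> float:
    """Burn rate = C / Z, where Z is the error budget (1 - SLO)."""
    if error_budget <= 0:
        raise ValueError("error_budget must be positive")
    return error_rate(problem_events, total_events) / error_budget


# Example: 20 failed requests out of 1000 against a 99% SLO (error budget 0.01).
example_rate = burn_rate(20, 1000, 0.01)
```

With these example numbers the error rate is 2% and the burn rate is about 2, meaning the error budget is being consumed twice as fast as the SLO allows.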
As can be seen, the burn rate represents the speed at which the error budget is consumed. A burn rate of 1 may be defined as the consumption speed at which the error budget becomes exactly 0 at the end of the SLO's valid window. For example, if the valid window of the SLO is 30 days and the SLO is 99%, a burn rate of 1 uses up the entire error budget in exactly 30 days. The correspondence between the burn rate and the time required to exhaust the error budget may be as shown in Table 1.
TABLE 1 Burn rate and time required to exhaust the error budget
Burn rate | Error rate (SLO 99%) | Time required to exhaust the error budget
1         | 1%                   | 30 days
2         | 2%                   | 15 days
10        | 10%                  | 3 days
1000      | 1000%                | 43.2 minutes
Here, burn rate = error rate / error budget. Because the error budget is fixed, when the error rate increases by a factor of N (N > 1), the burn rate also increases by a factor of N and the time required to exhaust the error budget shrinks to 1/N of its original value.
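The last column of Table 1 follows from dividing the SLO's valid window by the burn rate. A minimal sketch, assuming a 30-day valid window as in the example above (the function name is hypothetical):

```python
from datetime import timedelta


def time_to_exhaust(burn_rate: float,
                    slo_window: timedelta = timedelta(days=30)) -> timedelta:
    """Time needed to use up the whole error budget at a constant burn rate."""
    if burn_rate <= 0:
        raise ValueError("burn_rate must be positive")
    return slo_window / burn_rate


# Burn rates taken from Table 1.
for rate in (1, 2, 10, 1000):
    print(rate, time_to_exhaust(rate))
```

A burn rate of 1000 gives 2592 seconds, which is the 43.2 minutes listed in Table 1.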
Step 103: execute a preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold.
Different burn rate thresholds may be preset for different target services. In other words, different burn rate thresholds can be set in different application scenarios, so that the service monitoring requirements of each scenario can be met.
In addition, the alarm operation may be at least one of: sending an alarm message to a predetermined electronic device (an SMS alarm), placing a call to a predetermined phone number (a phone alarm), and sending an alarm email to a predetermined mailbox address (an email alarm).
As can be seen from steps 101 to 103, in the embodiments of the present application, service data of a target service within a time window is acquired at preset time intervals; a burn rate is calculated from the service data; and a preset alarm operation is executed when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold. The burn rate represents how fast the error budget is being consumed and reflects the occurrence of problem events in the target service more directly, so alarms based on the burn rate are more accurate. In addition, the alarm operation is executed only when the burn rate has stayed at the burn rate threshold for longer than the duration threshold, which filters out accidental fluctuations of the burn rate and thus avoids frequent alarms. The embodiments of the application therefore improve alarm accuracy and reduce invalid alarms, so that users can take each alarm seriously instead of coping with invalid ones.
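The duration check of step 103 can be sketched for a single time window as below. The class and parameter names are hypothetical, times are in seconds, and "reaches the threshold" is interpreted as greater than or equal to, which is an assumption.

```python
from typing import Optional


class DurationAlert:
    """Tracks how long the burn rate has stayed at or above a threshold."""

    def __init__(self, burn_threshold: float, duration_threshold_s: float):
        self.burn_threshold = burn_threshold
        self.duration_threshold_s = duration_threshold_s
        self._breach_start: Optional[float] = None  # time the current breach began

    def observe(self, now_s: float, burn_rate: float) -> bool:
        """Feed one burn-rate sample; return True when the alarm should fire."""
        if burn_rate < self.burn_threshold:
            # The burn rate dropped below the threshold: clear the duration.
            self._breach_start = None
            return False
        if self._breach_start is None:
            self._breach_start = now_s
        return now_s - self._breach_start > self.duration_threshold_s


alert = DurationAlert(burn_threshold=14.4, duration_threshold_s=120)
samples = [(0, 20.0), (60, 20.0), (120, 20.0), (180, 20.0), (240, 1.0)]
decisions = [alert.observe(t, rate) for t, rate in samples]
```

With one sample per minute and a 2-minute duration threshold, the alarm fires only on the sample at 180 seconds; the drop at 240 seconds clears the recorded duration.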
Optionally, there are at least two time windows;
calculating the burn rate from the service data comprises:
calculating the burn rate of each time window from the service data in that time window;
and executing the preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold comprises:
executing the preset alarm operation when, for every time window, the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold.
It can be seen that, in the embodiments of the present application, at least two time windows may be stored in advance, and the preset alarm operation is executed only if the burn rate of every time window has stayed at the burn rate threshold for longer than the duration threshold.
For example, suppose two time windows A and B are preset and the burn rate threshold is S. At time t, the service data in time window A is acquired and the burn rate a1 is calculated, and the service data in time window B is acquired and the burn rate b1 is calculated. If a1 > S and b1 > S, it is recorded that the burn rates of both windows exceed S at time t, and t is taken as the start time from which the burn rates of both windows exceed S.
At time t + T1, the service data in time window A is acquired and the burn rate a2 is calculated, and the service data in time window B is acquired and the burn rate b2 is calculated. If a2 > S and b2 > S, it is recorded that the burn rates of both windows exceed S at time t + T1, and the duration for which the burn rates of both windows have exceeded S is T1.
That is, if the burn rates of windows A and B both exceed S at time t and both exceed S again at time t + T1, the duration for which the burn rates of both windows have exceeded S is determined to be T1.
This process is repeated every T1, and the preset alarm operation is executed once the duration for which the burn rates of both windows have exceeded S exceeds the duration threshold. If instead a2 > S and b2 < S, then at time t + T1 the burn rate of window A exceeds S but that of window B does not, and the recorded duration for which both windows have exceeded S is reset to zero.
In addition, T1 is smaller than a preset value; for example, T1 may be 1 minute.
Therefore, the embodiments of the application can calculate several burn rates from at least two pre-stored time windows and execute the alarm operation only when the burn rates have stayed at the burn rate threshold for longer than the duration threshold, which filters out accidental fluctuations of the burn rate and thus avoids frequent alarms.
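The A/B example can be sketched as a tracker that resets the recorded duration whenever any window falls below the threshold. The names are hypothetical, times are in seconds, and the comparison uses greater than or equal to ("reaches"), whereas the worked example uses strict inequality; either choice is an assumption about the boundary case.

```python
from typing import Dict, Optional


class MultiWindowAlert:
    """Fires only when every window has stayed at the threshold long enough."""

    def __init__(self, burn_threshold: float, duration_threshold_s: float):
        self.burn_threshold = burn_threshold
        self.duration_threshold_s = duration_threshold_s
        self._start: Optional[float] = None  # time all windows first breached

    def observe(self, now_s: float, burn_rates: Dict[str, float]) -> bool:
        """burn_rates maps a window name (e.g. "A", "B") to its burn rate."""
        if not burn_rates:
            raise ValueError("at least one window is required")
        if not all(r >= self.burn_threshold for r in burn_rates.values()):
            # One window is below S: the recorded duration is cleared.
            self._start = None
            return False
        if self._start is None:
            self._start = now_s
        return now_s - self._start > self.duration_threshold_s


monitor = MultiWindowAlert(burn_threshold=2.0, duration_threshold_s=60)
history = [
    monitor.observe(0, {"A": 3.0, "B": 3.0}),
    monitor.observe(60, {"A": 3.0, "B": 3.0}),
    monitor.observe(120, {"A": 3.0, "B": 3.0}),
    monitor.observe(180, {"A": 3.0, "B": 1.0}),  # B drops: duration cleared
    monitor.observe(240, {"A": 3.0, "B": 3.0}),
]
```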
The number of consecutive times the burn rate reaches the burn rate threshold may also be recorded, and whether to execute the preset alarm operation may be decided from this count. For example, the preset alarm operation is executed when the number of consecutive times the burn rate reaches the burn rate threshold exceeds a preset number.
Optionally, the time windows include a first-type window and a second-type window, where the absolute value of the difference between the durations of the first-type window and the second-type window is greater than a second preset time.
That is, two types of time windows whose durations differ greatly can be preset. Of the two, the longer one may be called the long window and the shorter one the short window. For example, long and short windows can be paired as shown in Table 2.
When a long window and a short window are paired, if the burn rates of the long window and the short window calculated at time t both exceed the preset burn rate threshold, and the durations both exceed the duration threshold, the alarm operation is triggered so that the responsible personnel can handle the alarm, that is, solve the problem event of the target service. If the problem event is solved within the period from t to t + T2, the reset time is T2. When the burn rates are calculated after t + T2, the service data in the long window may still include service data from the period t to t + T2, so the long-window burn rate remains high and may still reach the burn rate threshold; the service data in the short window may no longer include such data, so the short-window burn rate is low and may not reach the threshold. Therefore, with a long window paired with a short window, the burn rates calculated after the problem event is solved may not both satisfy the alarm condition, no alarm is triggered, and personnel no longer spend time on an already solved problem. Pairing long and short windows can thus shorten the reset time.
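The effect of pairing a long window with a short one can be illustrated with synthetic data. Everything in this sketch is hypothetical: per-minute samples, 100 requests per minute, a 20-minute incident, a 60-minute long window, a 5-minute short window, a 99% SLO and a threshold of 14.4.

```python
from typing import List, Tuple


def window_burn_rate(errors: List[int], totals: List[int],
                     end: int, window: int, error_budget: float) -> float:
    """Burn rate over the `window` samples ending at index `end` (inclusive)."""
    start = max(0, end - window + 1)
    total = sum(totals[start:end + 1])
    if total == 0:
        return 0.0
    return (sum(errors[start:end + 1]) / total) / error_budget


# Synthetic data: 50% of requests fail during minutes 60-79, none otherwise.
totals = [100] * 120
errors = [50 if 60 <= minute < 80 else 0 for minute in range(120)]
BUDGET = 0.01      # 99% SLO
THRESHOLD = 14.4


def conditions(minute: int) -> Tuple[bool, bool]:
    """Return (long window breached, short window breached) at `minute`."""
    long_rate = window_burn_rate(errors, totals, minute, 60, BUDGET)
    short_rate = window_burn_rate(errors, totals, minute, 5, BUDGET)
    return long_rate >= THRESHOLD, short_rate >= THRESHOLD
```

In this data, both windows are above the threshold at minute 79 while the incident is ongoing. At minute 85, after the incident has ended, the long window still contains the failed requests and stays above the threshold, but the short window does not, so a rule that requires both windows stops alarming.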
Optionally, multiple groups of configurations are stored in advance, where each group of configurations includes at least two time windows, a burn rate threshold, a duration threshold and an alarm operation;
executing the preset alarm operation when, for every time window, the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold comprises:
executing the alarm operation in the ith group configuration if, for every time window in the ith group configuration, the duration for which its burn rate reaches the burn rate threshold in the ith group configuration exceeds the duration threshold in the ith group configuration;
wherein i is an integer greater than 0.
Different burn rate thresholds can correspond to different alarm operations, so that the alarm mode can be chosen according to the severity of the problem event. For example, the severity of a service problem can be classified by burn rate threshold as fatal or general, where a fatal problem triggers SMS and phone alarms and a general problem triggers an email alarm. In Table 2, for instance, burn rate thresholds of 14.4 and 6 may be classified as fatal, and burn rate thresholds of 3 and 1 as general.
TABLE 2 prestored sets of configurations
Figure BDA0003156687140000091
For example, the pre-stored groups of configurations are shown in Table 2. That is, time windows of different durations can be combined into the first to fourth groups, and each group is associated with a burn rate threshold and an alarm operation. Which group or groups in Table 2 are used to monitor a given service can be determined by the actual situation of that service.
For example, to monitor HTTP requests, the configurations of the first to fourth groups may all be selected. The service data of each time window shown in Table 2 is then acquired, and the burn rate of each time window is calculated from it. The following process is then performed for the burn rates of the time windows in the first group:
determining whether the duration for which the burn rates of the time windows in the first group exceed the burn rate threshold of the first group (14.4) exceeds the duration threshold of the first group (2 minutes), and if so, executing the alarm operation of the first group (i.e., SMS and phone alarms).
The burn rates of the time windows in the second, third and fourth groups are processed in the same way as those of the first group, which is not repeated here.
Therefore, for monitoring different services, the embodiments of the application allow the time windows, burn rate threshold, duration threshold and alarm operation to be configured flexibly, so that service quality can be monitored more accurately.
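The group-based evaluation can be sketched as follows. Table 2 is only available as an image, so the values below are partly assumptions: the burn rate thresholds (14.4, 6, 3, 1), the long windows (1 hour, 6 hours, 24 hours, 3 days), the 2-minute duration threshold of the first group and the fatal/general alarm operations come from the text, while the short windows and the duration thresholds of groups two to four are hypothetical placeholders, as are all names.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class AlertGroup:
    windows: Tuple[str, ...]      # time windows in this group
    burn_threshold: float
    duration_threshold_s: float
    actions: Tuple[str, ...]      # alarm operations to execute


# Short windows and the 120 s value for groups 2-4 are assumptions.
GROUPS = [
    AlertGroup(("1h", "5m"), 14.4, 120, ("sms", "phone")),
    AlertGroup(("6h", "30m"), 6.0, 120, ("sms", "phone")),
    AlertGroup(("24h", "2h"), 3.0, 120, ("mail",)),
    AlertGroup(("3d", "6h"), 1.0, 120, ("mail",)),
]


def actions_to_run(groups: Sequence[AlertGroup],
                   breach_seconds: Sequence[Dict[str, float]]) -> List[str]:
    """breach_seconds[i][w] is how long window w of group i has stayed at or
    above group i's burn rate threshold. A group fires only if every one of
    its windows exceeds the group's duration threshold."""
    fired: List[str] = []
    for group, durations in zip(groups, breach_seconds):
        if all(durations.get(w, 0.0) > group.duration_threshold_s
               for w in group.windows):
            for action in group.actions:
                if action not in fired:   # avoid duplicate alarm operations
                    fired.append(action)
    return fired


example = actions_to_run(GROUPS, [
    {"1h": 180, "5m": 180},    # group 1: both windows sustained
    {"6h": 180, "30m": 0},     # group 2: short window recovered
    {"24h": 300, "2h": 300},   # group 3: both windows sustained
    {"3d": 10, "6h": 10},      # group 4: not sustained long enough
])
```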
For the configuration in Table 2, if the burn rate in the 3-day time window reaches 1, the 3-day window can detect problem events more completely, and the recall for problem events is therefore higher.
Table 2 is configured on the basis that the error budget is used up in 30 days (i.e., a burn rate of 1 over 30 days). It is generally considered that the severity of a service problem is fatal when 2% of the error budget is consumed within a 1-hour time window. The burn rate x1 at which 2% of the error budget is consumed in 1 hour follows from $x_1 = 2\% \times (30 \times 24) / 1$, i.e., x1 = 14.4;
similarly, if the severity of a service problem is considered fatal when 5% of the error budget is consumed within a 6-hour time window, the corresponding burn rate x2 follows from $x_2 = 5\% \times (30 \times 24) / 6$, i.e., x2 = 6;
similarly, if the severity of a service problem is considered general when 10% of the error budget is consumed within a 24-hour time window, the corresponding burn rate x3 follows from $x_3 = 10\% \times (30 \times 24) / 24$, i.e., x3 = 3;
similarly, if the severity of a service problem is considered general when 10% of the error budget is consumed within a 3-day time window, the corresponding burn rate x4 follows from $x_4 = 10\% \times 30 / 3$, i.e., x4 = 1.
From this, it can be seen that the burn rate threshold of each group can be derived from the "consumed error budget" in Table 2.
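The four thresholds can be reproduced with a single relation: burn rate threshold = consumed budget fraction × SLO period / window length. A minimal sketch, assuming the 30-day period stated above (the function name is hypothetical):

```python
def burn_threshold(budget_fraction: float, window_hours: float,
                   slo_period_hours: float = 30 * 24) -> float:
    """Burn rate at which `budget_fraction` of the error budget is consumed
    within `window_hours`, for an SLO period of `slo_period_hours`."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return budget_fraction * slo_period_hours / window_hours


thresholds = [
    burn_threshold(0.02, 1),       # 2% in 1 hour
    burn_threshold(0.05, 6),       # 5% in 6 hours
    burn_threshold(0.10, 24),      # 10% in 24 hours
    burn_threshold(0.10, 3 * 24),  # 10% in 3 days
]
```

Up to floating-point rounding, the results are 14.4, 6, 3 and 1.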
Optionally, multiple groups of configurations are stored in advance, where each group includes at least two time windows, at least two burn rate thresholds, a duration threshold, and alarm operations in one-to-one correspondence with the burn rate thresholds; executing the preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold comprises:
acquiring, from the ith group configuration, the burn rate thresholds that are smaller than every calculated burn rate, as first burn rate thresholds;
acquiring, among the first burn rate thresholds, the one with the smallest sum of absolute differences from the calculated burn rates, as a second burn rate threshold;
executing the alarm operation corresponding to the second burn rate threshold in the ith group configuration when, for every burn rate, the duration for which it reaches the second burn rate threshold exceeds the duration threshold in the ith group configuration;
wherein i is an integer greater than 0.
For example, suppose one of the pre-stored configurations is as shown in Table 3, the burn rate calculated for the 1-hour window is c, and the burn rate calculated for the 5-minute window is d. A burn rate threshold must then be found among 14.4, 6, 3 and 1 that is smaller than both c and d and whose sum of absolute differences from c and d is the smallest. For example, if the burn rate thresholds 3 and 1 are both smaller than c and d, |3-c| + |3-d| is compared with |1-c| + |1-d|; if |3-c| + |3-d| < |1-c| + |1-d|, the alarm operation corresponding to the burn rate threshold 3 (i.e., an email alarm) is executed.
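The threshold selection in this example can be sketched as below. The function name is hypothetical, and returning `None` when no threshold is smaller than every burn rate is an assumption, since the text does not cover that case.

```python
from typing import Optional, Sequence


def select_threshold(thresholds: Sequence[float],
                     burn_rates: Sequence[float]) -> Optional[float]:
    """Pick the threshold that is below every burn rate and has the smallest
    sum of absolute differences from the burn rates."""
    # First burn rate thresholds: those smaller than every calculated rate.
    candidates = [t for t in thresholds if all(t < r for r in burn_rates)]
    if not candidates:
        return None
    # Second burn rate threshold: minimum sum of absolute differences.
    return min(candidates, key=lambda t: sum(abs(t - r) for r in burn_rates))


# c = 4.0 (1-hour window) and d = 5.0 (5-minute window) are made-up values.
chosen = select_threshold([14.4, 6, 3, 1], [4.0, 5.0])
```

Here the candidates are 3 and 1, with sums of absolute differences 3 and 7, so 3 is chosen. Because every candidate lies below all burn rates, the selected value is always the largest candidate.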
Therefore, according to the embodiment of the application, different combustion rate thresholds and alarm operations can be flexibly combined within one set of configurations for a given group of time windows, so that service problems of different severity can be alarmed on using a single group of time windows.
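The selection of the second combustion rate threshold described above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name `select_threshold` and the sample values of c and d are assumptions, and only the thresholds 14.4, 6, 3, and 1 come from the example.

```python
from typing import Optional, Sequence


def select_threshold(thresholds: Sequence[float],
                     rates: Sequence[float]) -> Optional[float]:
    """Return the threshold below every rate with the smallest total distance."""
    # First combustion rate thresholds: smaller than each calculated rate.
    candidates = [t for t in thresholds if all(t < r for r in rates)]
    if not candidates:
        return None  # no threshold is reached, so no alarm level applies
    # Second combustion rate threshold: minimal sum of absolute differences.
    return min(candidates, key=lambda t: sum(abs(t - r) for r in rates))


# Hypothetical rates: c for the 1-hour window, d for the 5-minute window.
c, d = 4.2, 5.0
chosen = select_threshold([14.4, 6, 3, 1], [c, d])
print(chosen)  # 3, because |3-c| + |3-d| = 3.2 is less than |1-c| + |1-d| = 7.2
```

Because every candidate lies below all of the calculated rates, the sum of absolute differences shrinks as the candidate grows, so this rule always picks the largest threshold that every rate exceeds.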
Table 3. A set of pre-stored configurations (the table is provided as an image in the original publication)
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the described order of acts, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments and that the acts involved are not necessarily required by the embodiments of the application.
Referring to fig. 2, which shows a block diagram of a service monitoring apparatus in an embodiment of the present application, the service monitoring apparatus 200 may include the following modules:
a data obtaining module 201, configured to obtain service data of a target service in a time window at preset time intervals;
a data calculation module 202, configured to calculate a combustion rate according to the service data;
and an alarm module 203, configured to execute a preset alarm operation when the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold.
Optionally, the service data includes: the total number of events and the number of problem events for the target service; the data calculation module 202 includes:
the first calculation submodule is used for calculating the ratio of the number of the problem events to the total number of the events to be used as an error rate;
a second calculation submodule for calculating the ratio of the error rate to a predetermined error budget as the combustion rate.
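The two ratios computed by these submodules can be sketched as follows. This is an illustrative sketch only; the function names, the handling of an empty window, and the sample numbers are assumptions rather than details given in the application.

```python
def error_rate(problem_events: int, total_events: int) -> float:
    """Ratio of the number of problem events to the total number of events."""
    if total_events == 0:
        return 0.0  # assumption: a window without events counts as error-free
    return problem_events / total_events


def combustion_rate(problem_events: int, total_events: int,
                    error_budget: float) -> float:
    """Ratio of the error rate to the predetermined error budget."""
    if error_budget <= 0:
        raise ValueError("error_budget must be positive")
    return error_rate(problem_events, total_events) / error_budget


# Hypothetical window: 50 problem events out of 10,000, with a 0.1% error budget.
rate = combustion_rate(50, 10_000, 0.001)
print(f"{rate:.2f}")
```

A combustion rate of 1 means the error budget is being consumed exactly at the permitted pace; in the hypothetical window above it is being consumed roughly five times faster.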
Optionally, the time window includes at least two windows;
the data calculation module 202 is specifically configured to:
respectively calculating the combustion rate of each time window according to the service data in each time window;
the alarm module 203 is specifically configured to:
executing the preset alarm operation when, for each time window, the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold.
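One way to track the duration condition across several time windows is sketched below. The class name `BreachTracker`, the use of seconds as the time unit, and the window labels are assumptions made for illustration; the application does not prescribe a particular data structure.

```python
from typing import Dict, Optional


class BreachTracker:
    """Tracks how long each window's combustion rate has reached a threshold."""

    def __init__(self, rate_threshold: float, duration_threshold_s: float) -> None:
        self.rate_threshold = rate_threshold
        self.duration_threshold_s = duration_threshold_s
        self._breach_start: Dict[str, Optional[float]] = {}

    def update(self, now_s: float, rates: Dict[str, float]) -> bool:
        """Record one sampling round; return True if the alarm should fire."""
        should_alarm = bool(rates)  # nothing sampled means nothing to alarm on
        for window, rate in rates.items():
            if rate >= self.rate_threshold:
                start = self._breach_start.get(window)
                if start is None:
                    start = now_s  # the breach begins at this sample
                    self._breach_start[window] = start
                if now_s - start <= self.duration_threshold_s:
                    should_alarm = False  # breached, but not yet long enough
            else:
                self._breach_start[window] = None  # breach interrupted
                should_alarm = False
        return should_alarm


tracker = BreachTracker(rate_threshold=6, duration_threshold_s=120)
print(tracker.update(0, {"1h": 7.0, "5m": 8.0}))    # False: breach just started
print(tracker.update(121, {"1h": 7.0, "5m": 8.0}))  # True: both sustained > 120 s
```

Resetting the start time whenever a window drops below the threshold is what keeps a brief spike from triggering the alarm.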
Optionally, the time window includes a first type window and a second type window, where an absolute value of a time difference between the first type window and the second type window is greater than a third preset time.
Optionally, a plurality of sets of configurations are pre-stored, wherein a set of configurations includes at least two time windows, a combustion rate threshold, a duration threshold, and an alarm operation; the alarm module 203 is specifically configured to:
executing the alarm operation in the ith group configuration when the durations for which the combustion rates of the time windows in the ith group configuration reach the combustion rate threshold in the ith group configuration all exceed the duration threshold in the ith group configuration;
wherein i is an integer greater than 0.
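Evaluating several pre-stored group configurations can be sketched as follows. The `AlarmConfig` fields, the `alarms_to_execute` function, and all sample values other than the thresholds 14.4 and 6 and the mail alarm are hypothetical; the sketch also assumes the breach durations have already been measured for each pair of time window and threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class AlarmConfig:
    windows: Tuple[str, ...]      # at least two time windows
    rate_threshold: float         # combustion rate threshold
    duration_threshold_s: float   # duration threshold, in seconds
    alarm_operation: str          # e.g. "mail"


def alarms_to_execute(configs: Sequence[AlarmConfig],
                      breach_seconds: Dict[Tuple[str, float], float]) -> List[str]:
    """Return the alarm operation of every configuration whose windows all qualify."""
    fired = []
    for cfg in configs:
        # Seconds for which each window's rate has reached this threshold.
        durations = [breach_seconds.get((w, cfg.rate_threshold), 0.0)
                     for w in cfg.windows]
        if all(d > cfg.duration_threshold_s for d in durations):
            fired.append(cfg.alarm_operation)
    return fired


configs = [
    AlarmConfig(("1h", "5m"), 14.4, 120, "phone"),
    AlarmConfig(("6h", "30m"), 6.0, 300, "mail"),
]
measured = {("1h", 14.4): 200.0, ("5m", 14.4): 130.0,
            ("6h", 6.0): 400.0, ("30m", 6.0): 100.0}
print(alarms_to_execute(configs, measured))  # ['phone']
```

In this example the second configuration does not fire because its 30-minute window has only been above the threshold for 100 seconds, which is below its 300-second duration threshold.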
Optionally, a plurality of sets of configurations are stored in advance, where a set of configurations includes at least two time windows, at least two combustion rate thresholds, a duration threshold, and alarm operations in one-to-one correspondence with the combustion rate thresholds;
the alarm module 203 is specifically configured to:
acquire, from the ith group configuration, each combustion rate threshold that is smaller than every calculated combustion rate, as a first combustion rate threshold;
acquire, from the first combustion rate thresholds, the one whose sum of absolute differences from the calculated combustion rates is smallest, as a second combustion rate threshold;
execute the alarm operation corresponding to the second combustion rate threshold in the ith group configuration when the duration for which each combustion rate reaches the second combustion rate threshold exceeds the duration threshold in the ith group configuration;
wherein i is an integer greater than 0.
As can be seen from the above, in the embodiment of the present application, the service data of the target service in the time window is acquired at preset time intervals; a combustion rate is calculated according to the service data; and a preset alarm operation is executed when the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold. The combustion rate represents the degree to which the error budget has been consumed and reflects the occurrence of problem events of the target service more intuitively, so that more accurate alarms can be given for problem events based on the combustion rate. In addition, because the alarm operation is executed only when the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold, occasional fluctuations of the combustion rate do not trigger alarms, which avoids frequent alarming. Therefore, the embodiment of the application can improve alarm accuracy and reduce invalid alarms, so that the user can pay attention to every alarm instead of dealing with invalid ones.
Since the device embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding description of the method embodiment.
An embodiment of the present application further provides an electronic device, including:
one or more processors; and one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the electronic device to perform methods as described herein.
Embodiments of the present application also provide one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the methods of embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The service monitoring method and device provided by the application have been introduced in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the embodiments is only intended to help in understanding the method and core idea of the application. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the scope of application according to the idea of the present application. In summary, the content of this specification should not be construed as a limitation of the present application.

Claims (9)

1. A method of service monitoring, comprising:
acquiring service data of a target service in a time window at preset time intervals;
calculating a combustion rate according to the service data;
and executing preset warning operation under the condition that the calculated duration of the combustion rate reaching the combustion rate threshold exceeds the duration threshold.
2. The service monitoring method of claim 1, wherein the service data comprises: the total number of events and the number of problem events for the target service; the calculating a combustion rate according to the service data comprises:
calculating the ratio of the number of the problem events to the total number of the events to be used as an error rate;
calculating the ratio of the error rate to a predetermined error budget as the combustion rate.
3. The service monitoring method of claim 1, wherein the time window comprises at least two windows;
the calculating a combustion rate according to the service data comprises:
respectively calculating the combustion rate of each time window according to the service data in each time window;
the executing a preset alarm operation when the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold comprises:
executing the preset alarm operation when, for each time window, the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold.
4. The service monitoring method according to claim 1, wherein the time window comprises a first type window and a second type window, and an absolute value of a time difference between the first type window and the second type window is greater than a second preset time.
5. The service monitoring method according to claim 1, wherein a plurality of sets of configurations are pre-stored, a set of configurations including at least two time windows, a combustion rate threshold, a duration threshold and an alarm operation;
the executing the preset alarm operation when, for each time window, the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold comprises:
executing the alarm operation in the ith group configuration when the durations for which the combustion rates of the time windows in the ith group configuration reach the combustion rate threshold in the ith group configuration all exceed the duration threshold in the ith group configuration;
wherein i is an integer greater than 0.
6. The service monitoring method according to claim 1, wherein a plurality of sets of configurations are pre-stored, a set of configurations including at least two time windows, at least two combustion rate thresholds, a duration threshold, and alarm operations in one-to-one correspondence with the combustion rate thresholds;
the executing a preset alarm operation when the duration for which the calculated combustion rate reaches the combustion rate threshold exceeds the duration threshold comprises:
acquiring, from the ith group configuration, each combustion rate threshold that is smaller than every calculated combustion rate, as a first combustion rate threshold;
acquiring, from the first combustion rate thresholds, the one whose sum of absolute differences from the calculated combustion rates is smallest, as a second combustion rate threshold;
executing the alarm operation corresponding to the second combustion rate threshold in the ith group configuration when the duration for which each combustion rate reaches the second combustion rate threshold exceeds the duration threshold in the ith group configuration;
wherein i is an integer greater than 0.
7. A service monitoring apparatus, the apparatus comprising:
the data acquisition module is used for acquiring service data of the target service in a time window at intervals of preset time;
the data calculation module is used for calculating the combustion rate according to the service data;
and the warning module is used for executing preset warning operation under the condition that the calculated duration of the combustion rate reaching the combustion rate threshold exceeds the duration threshold.
8. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, which computer program, when executed by the processor, carries out the steps of the service monitoring method according to any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the service monitoring method according to one of the claims 1 to 6.
CN202110780636.2A 2021-07-09 2021-07-09 Service monitoring method and device, electronic equipment and storage medium Active CN113608960B (en)


Publications (2)

Publication Number Publication Date
CN113608960A true CN113608960A (en) 2021-11-05
CN113608960B CN113608960B (en) 2024-06-25

Family

ID=78304379


Country Status (1)

Country Link
CN (1) CN113608960B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024152770A1 (en) * 2023-01-17 2024-07-25 中兴通讯股份有限公司 Qos guarantee method and apparatus, device and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110035485A1 (en) * 2009-08-04 2011-02-10 Daniel Joseph Martin System And Method For Goal Driven Threshold Setting In Distributed System Management
US8156382B1 (en) * 2008-04-29 2012-04-10 Netapp, Inc. System and method for counting storage device-related errors utilizing a sliding window
US20130116976A1 (en) * 2011-11-03 2013-05-09 The Georgia Tech Research Corporation Method, computer program, and information processing apparatus for analyzing performance of computer system
US20140052841A1 (en) * 2012-08-16 2014-02-20 The Georgia Tech Research Corporation Computer program, method, and information processing apparatus for analyzing performance of computer system
US20160004475A1 (en) * 2013-02-28 2016-01-07 Hitachi, Ltd Management system and method of dynamic storage service level monitoring
US20160378583A1 (en) * 2014-07-28 2016-12-29 Hitachi, Ltd. Management computer and method for evaluating performance threshold value
CN108959025A (en) * 2018-06-27 2018-12-07 郑州云海信息技术有限公司 A kind of server alarm method, device and server
CN110008090A (en) * 2019-04-15 2019-07-12 苏州浪潮智能科技有限公司 A kind of method, apparatus and computer readable storage medium monitoring EMS memory error
CN110597688A (en) * 2019-09-09 2019-12-20 中国工商银行股份有限公司 Monitoring information acquisition method and system
CN111522719A (en) * 2020-04-27 2020-08-11 中国银行股份有限公司 Method and device for monitoring big data task state
CN112052145A (en) * 2020-09-09 2020-12-08 中国工商银行股份有限公司 Method and device for determining performance alarm threshold, electronic equipment and medium
CN112260858A (en) * 2020-09-30 2021-01-22 福建天泉教育科技有限公司 Alarm method capable of automatic detection and terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
焦振清: "第五章报警SLO", pages 1 - 24, Retrieved from the Internet <URL:https://blog.csdn.net/weixin_43947499/article/details/84941702?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522170055653716800182727923%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=170055653716800182727923&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-2-84941702-null-null.142^v96^pc_search_result_base6&utm_term=http请求%20监控%20告警%20错误率%20SLO&spm=1018.2226.3001.4187> *

Also Published As

Publication number Publication date
CN113608960B (en) 2024-06-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant