
CN110795301A - Job monitoring method, device, terminal and computer storage medium - Google Patents

Info

Publication number
CN110795301A
Authority
CN
China
Prior art keywords
threshold
job
monitoring
memory resources
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810861581.6A
Other languages
Chinese (zh)
Inventor
翁泽梁
伍应标
王能
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Consumer Finance Co Ltd filed Critical Mashang Consumer Finance Co Ltd
Priority to CN201810861581.6A priority Critical patent/CN110795301A/en
Publication of CN110795301A publication Critical patent/CN110795301A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3051Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3017Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is implementing multitasking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a job monitoring method, device, terminal and computer storage medium. The job monitoring method comprises: determining whether there is currently a job whose occupied memory resources exceed a first threshold, wherein the first threshold is smaller than a maximum threshold of the total amount of memory resources; and, if such a job exists, sending the job information of the job to a monitoring object. In this way, a job that may become abnormal is placed in an early-warning state in advance, which effectively reduces the probability of major accidents. In addition, jobs are monitored in real time at low labor cost, which ensures the stability and sustainability of job monitoring.

Description

Job monitoring method, device, terminal and computer storage medium
Technical Field
The present application relates to the field of big data monitoring, and in particular, to a method, an apparatus, a terminal, and a computer storage medium for job monitoring.
Background
With the continuous development of big data and related technologies, traditional data analysis has been continually renewed, making the analysis of large-scale data possible. As data volumes surge, data integration and the hardware resources required for computation face unprecedented challenges.
At present, there are two main ways to deal with insufficient hardware resources. The first is to add hardware resources; however, adding hardware is costly, and the approval process in a typical enterprise is long, so the shortage cannot be resolved in time. The second is to manually check the executing jobs for abnormalities; this not only incurs high labor cost, but also detects an abnormality only after it has occurred, which can lead to serious accidents and delay other jobs.
Disclosure of Invention
The technical problem mainly solved by the present application is to provide a job monitoring method, device, terminal and computer storage medium that enable early warning of job abnormalities without adding hardware resources.
In order to solve the above technical problem, the first technical solution adopted by the present application is to provide a job monitoring method, comprising: determining whether there is currently a job whose occupied memory resources exceed a first threshold, wherein the first threshold is smaller than a maximum threshold of the total amount of memory resources;
and, if there is a job whose occupied memory resources exceed the first threshold, sending the job information of the job to a monitoring object.
Wherein, the step of sending the job information of the job to the monitoring object if there is a job whose occupied memory resources exceed the first threshold includes:
if there is a job whose occupied memory resources exceed the first threshold, determining whether the duration for which the job has exceeded the first threshold exceeds a threshold time;
and, if the duration exceeds the threshold time, sending the job information of the job to the monitoring object.
Wherein, the step of sending the job information of the job to the monitoring object includes:
determining whether the format of the job information is the same as a preset format;
if the format of the job information differs from the preset format, converting the job information into the preset format;
and sending the format-converted job information to the monitoring object.
The preset format comprises a table or a text format.
The job information includes the name of the job, the amount of memory resources occupied, and the current execution state.
Wherein, the first threshold is the average amount of memory resources occupied by historical jobs.
The step of determining whether there is currently a job whose occupied memory resources exceed the first threshold specifically comprises:
determining whether there is currently a job whose number of occupied maps is greater than the first threshold;
and the step of sending the job information of the job to the monitoring object if there is a job whose occupied memory resources exceed the first threshold comprises:
if the number of occupied maps is greater than the first threshold, sending the job information of the job to the monitoring object.
In order to solve the above technical problem, the second technical solution adopted by the present application is to provide a job monitoring device, comprising a judging module and a sending module, wherein
the judging module is configured to determine whether there is currently a job whose occupied memory resources exceed a first threshold, wherein the first threshold is smaller than a maximum threshold of the total amount of memory resources;
and the sending module is configured to send the job information of the job to a monitoring object when there is a job whose occupied memory resources exceed the first threshold.
The judging module is specifically configured to determine, when there is a job whose occupied memory resources exceed the first threshold, whether the duration for which the job has exceeded the first threshold exceeds a threshold time; the sending module is specifically configured to send the job information of the job to the monitoring object when the duration exceeds the threshold time.
The judging module is further configured to determine whether the format of the job information is the same as a preset format and, if the format differs from the preset format, to convert the job information into the preset format; the sending module is further configured to send the format-converted job information to the monitoring object.
The preset format comprises a table or a text format.
The job information includes the name of the job, the amount of memory resources occupied, and the current execution state.
Wherein, the first threshold is the average amount of memory resources occupied by historical jobs.
The judging module is specifically configured to determine whether there is currently a job whose number of occupied maps is greater than the first threshold; the sending module is configured to send the job information of the job to the monitoring object when the number of occupied maps is greater than the first threshold.
In order to solve the above technical problem, the third technical solution adopted by the present application is to provide a job monitoring terminal, comprising a processor and a communication circuit coupled to each other, wherein the processor, in operation, cooperates with the communication circuit to implement any one of the above job monitoring methods.
In order to solve the above technical problem, a fourth technical solution adopted by the present application is: there is provided a computer storage medium having stored thereon program data which, when executed by a processor, implements the job monitoring method of any one of the above.
Compared with the prior art, the beneficial effects of the present application are as follows. The method determines whether there is currently a job whose occupied memory resources exceed a first threshold, the first threshold being smaller than a maximum threshold of the total amount of memory resources, and, when such a job is detected, sends the job information of the job to a monitoring object. A job that may become abnormal is thus placed in an early-warning state in advance, instead of being discovered only after it has failed or has disrupted other jobs and caused a major accident, which effectively reduces the probability of major accidents. In addition, jobs are monitored in real time at low labor cost, which ensures the stability and sustainability of job monitoring.
Drawings
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a job monitoring method of the present application;
FIG. 2 is a schematic flow chart diagram illustrating another embodiment of a job monitoring method of the present application;
FIG. 3 is a detailed flowchart of an embodiment of the present application of sending job information of a job to a monitored object;
FIG. 4 is a schematic structural diagram of an embodiment of the job monitoring device of the present application;
FIG. 5 is a schematic structural diagram of an embodiment of a job monitoring terminal according to the present application;
FIG. 6 is a schematic structural diagram of an embodiment of a computer storage medium according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As shown in fig. 1, which is a schematic flowchart of an embodiment of the job monitoring method of the present application, the job monitoring method of this embodiment comprises the following steps:
Step 101: determining whether there is currently a job whose occupied memory resources exceed a first threshold, wherein the first threshold is smaller than a maximum threshold of the total amount of memory resources.
When a terminal or a server executes a job, in order to ensure that a plurality of jobs can be normally executed, memory resources occupied by each job are monitored.
The terminal may be a device such as a computer, a smart terminal, or a PC.
In one embodiment, the terminal or server monitors the resources of the jobs executing in the big data platform.
Specifically, the terminal or the server acquires the job information of the executing jobs, for example by calling a job information acquisition program to collect, in real time, the job information of the currently executing jobs from YARN in the big data platform Cloudera Manager. The job information includes the name of the job, the amount of memory resources occupied, and the current execution state. For example, on the Hadoop software platform, the amount of memory resources occupied can be represented by the number of maps.
To give an early warning before a job accident occurs, in this embodiment, after the job information of the currently executing jobs is acquired, it is determined whether there is a job whose occupied memory resources exceed the first threshold.
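The acquisition step can be illustrated with a minimal sketch. The application does not specify the job information acquisition program, so everything here is an assumption: the payload shape is modelled on the application listing returned by the YARN ResourceManager REST API (an `apps.app` array whose entries carry `name`, `allocatedMB` and `state`), and `JobInfo` and `parse_running_jobs` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class JobInfo:
    """The three fields the description lists as job information."""
    name: str        # name of the job
    memory_mb: int   # amount of memory resources occupied
    state: str       # current execution state


def parse_running_jobs(payload: dict[str, Any]) -> list[JobInfo]:
    """Extract running jobs from a YARN-style application listing (assumed shape)."""
    # The listing may be null when no application exists, hence the `or` fallbacks.
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        JobInfo(name=a["name"], memory_mb=int(a["allocatedMB"]), state=a["state"])
        for a in apps
        if a.get("state") == "RUNNING"
    ]


# Hypothetical sample payload; job names and values are made up.
sample = {
    "apps": {
        "app": [
            {"name": "daily_etl", "allocatedMB": 350, "state": "RUNNING"},
            {"name": "report", "allocatedMB": 120, "state": "RUNNING"},
            {"name": "old_job", "allocatedMB": -1, "state": "FINISHED"},
        ]
    }
}
jobs = parse_running_jobs(sample)
```

Only jobs that are still executing are kept, since the method monitors currently executing jobs.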
Wherein, the first threshold is smaller than the maximum threshold of the total amount of memory resources. The maximum threshold is the upper limit, set by the terminal or the server, on the memory resources that a single job may occupy; exceeding it may prevent other jobs from running normally or delay them. For example, if the total amount of memory resources is 10000M, the maximum threshold may be 7000M or 8000M.
The first threshold can be set based on practical experience. In a preferred embodiment, the first threshold is the average amount of memory resources occupied by historical jobs, such as 200M. When many jobs are currently executing, the first threshold may also be the average amount of memory resources occupied by the current jobs, which is not limited herein.
On the Hadoop software platform, when the amount of memory resources occupied is represented by the number of maps, the terminal or the server determines whether any currently executing job occupies a number of maps exceeding the first threshold, such as 20 maps or 30 maps.
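Step 101 can be sketched as follows, under stated assumptions. The first threshold is taken as the mean of historical usage, as in the preferred embodiment, and is checked against the maximum threshold; the function names, job names and sample values are hypothetical. The same comparison applies whether usage is expressed in megabytes or as a number of maps.

```python
from statistics import mean


def first_threshold(history: list[float], max_threshold: float) -> float:
    """Mean historical usage, required to be smaller than the maximum threshold."""
    threshold = mean(history)
    if threshold >= max_threshold:
        raise ValueError("first threshold must be smaller than the maximum threshold")
    return threshold


def jobs_over_threshold(usage: dict[str, float], threshold: float) -> list[str]:
    """Step 101: names of jobs whose occupied resources exceed the first threshold."""
    return [name for name, used in usage.items() if used > threshold]


# Hypothetical history (in M) and current usage per job.
threshold = first_threshold([150, 200, 250], max_threshold=7000)
flagged = jobs_over_threshold({"daily_etl": 350, "report": 120}, threshold)
```

Because the first threshold sits well below the maximum threshold, a job is flagged while it is still far from the point at which it would disturb other jobs.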
Step 102: if there is a job whose occupied memory resources exceed the first threshold, sending the job information of the job to a monitoring object.
After this screening, if a currently executing job occupies memory resources exceeding the first threshold, the job information of that job is sent to the monitoring object to remind the monitoring object to monitor the job. An early warning is thus issued before a real abnormality or accident occurs. The monitoring object further determines whether the job is abnormal and, if so, takes measures such as restarting or closing the abnormal job so that the execution of other normal jobs is not affected.
The terminal or the server can send the job information to the monitoring object by e-mail, short message, or another social platform. The monitoring object may be the relevant staff.
In this way, when a job whose occupied memory resources exceed the first threshold, which is smaller than the maximum threshold, is detected, the job information of the job is sent to the monitoring object. A job that may become abnormal is thus placed in an early-warning state in advance, instead of being discovered only after it has failed or has disrupted other jobs and caused a major accident, which effectively reduces the probability of major accidents. Moreover, jobs are monitored in real time at low labor cost, which ensures the stability and sustainability of job monitoring.
During job execution, a transient abnormality sometimes occurs; for example, the occupied memory resources suddenly increase but return to normal shortly afterwards. Such a transient abnormality has no substantial effect on the job itself or on other jobs, yet the terminal or the server would still flag it. To avoid wasting processor resources on such occasional abnormalities and to save labor cost, in another embodiment of the job monitoring method of the present application, shown in the flowchart of fig. 2, step 202 is executed after step 201, in which it is determined whether there is currently a job whose occupied memory resources exceed the first threshold.
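For the e-mail channel, the notification might be composed as in the sketch below. It only builds the message and does not send it; the addresses, subject wording and the `build_alert` helper are hypothetical, and the body carries the three job information fields named in the description.

```python
from email.message import EmailMessage


def build_alert(job_name: str, memory_mb: int, state: str, recipient: str) -> EmailMessage:
    """Compose an early-warning e-mail for one job (illustrative wording)."""
    msg = EmailMessage()
    msg["Subject"] = f"[job early warning] {job_name} exceeds the first threshold"
    msg["From"] = "job-monitor@example.com"  # hypothetical sender address
    msg["To"] = recipient
    msg.set_content(
        f"Job name: {job_name}\n"
        f"Occupied memory: {memory_mb}M\n"
        f"Current state: {state}\n"
    )
    return msg


alert = build_alert("daily_etl", 350, "RUNNING", "oncall@example.com")
```

A message built this way could be handed to `smtplib.SMTP.send_message`; a short-message or chat channel would need a different sender but could reuse the same fields.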
Step 202: if there is a job whose occupied memory resources exceed the first threshold, determining whether the duration for which the memory resources occupied by the job have exceeded the first threshold exceeds a threshold time.
The threshold time may be set empirically, for example to 20 seconds. It may also be set according to the historical average duration of occasional job abnormalities, which is not limited herein.
Step 203: if the duration exceeds the threshold time, sending the job information of the job to the monitoring object.
When it is determined that the memory resources occupied by the job exceed the first threshold and that this condition has lasted longer than the threshold time, the job information of the job is sent to the monitoring object for abnormality checking.
In this way, occasional job abnormalities caused by other sudden conditions, such as network problems, are effectively filtered out, which reduces the workload of both the terminal or server and the monitoring object.
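The duration check of steps 202 and 203 can be sketched as a small state tracker, assuming the monitor polls periodically and passes in the set of jobs currently over the first threshold. `DurationFilter` is a hypothetical name, and the 20-second value follows the example in the description.

```python
class DurationFilter:
    """Report a job only after it has stayed over the first threshold long enough."""

    def __init__(self, threshold_seconds: float) -> None:
        self.threshold_seconds = threshold_seconds
        self._first_seen: dict[str, float] = {}  # job name -> time it first exceeded

    def update(self, over_threshold: set[str], now: float) -> list[str]:
        # Forget jobs that have dropped back under the first threshold.
        for name in list(self._first_seen):
            if name not in over_threshold:
                del self._first_seen[name]
        # Record the first time each job was seen over the threshold.
        for name in over_threshold:
            self._first_seen.setdefault(name, now)
        # Step 203: keep jobs whose excess has lasted longer than the threshold time.
        return sorted(
            name
            for name, start in self._first_seen.items()
            if now - start > self.threshold_seconds
        )


flt = DurationFilter(threshold_seconds=20)
r0 = flt.update({"daily_etl", "spike"}, now=0)   # both just exceeded
r1 = flt.update({"daily_etl"}, now=15)           # "spike" recovered and is dropped
r2 = flt.update({"daily_etl"}, now=25)           # "daily_etl" has persisted for 25 s
```

In this sketch a persisting job is returned on every later poll; a real monitor would probably also remember which jobs it has already reported.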
In actual operation, for convenience or because jobs are executed on different platforms, the job information acquired by the terminal or the server may come in various formats, such as JSON or list format, which are not very intuitive for an ordinary monitoring object to read. To let the monitoring object understand the possibly abnormal job more clearly and intuitively, fig. 3 shows a detailed flowchart of an embodiment of the step of sending the job information to the monitoring object.
As shown in fig. 3, the method comprises the following steps:
Step 301: determining whether the format of the job information is the same as a preset format.
After acquiring the job information to be sent to the monitoring object, the terminal or the server further determines whether the format of the job information is the same as the preset format. The preset format may be a table format, such as Excel format, or a text format, which is not limited herein.
Step 302: if the format of the job information differs from the preset format, converting the job information into the preset format.
Step 303: sending the format-converted job information to the monitoring object.
As shown in table 1, table 1 is an explanatory table of an embodiment of the job information.
The job information of this embodiment includes the job number (jobID) of the job in which an abnormality may occur, its submission time, start time, duration, job name (jobname), and number of occupied maps; in other embodiments, it may also include the end time, the status, and the like.
TABLE 1
[Table 1 appears as an image (Figure BDA0001749810910000071) in the original publication and is not reproduced here.]
In this way, job information from different platforms and in different formats can be converted into a common preset format, so that monitoring personnel can understand the job information of a possibly abnormal job more intuitively and clearly.
Different from the prior art, this embodiment first determines whether there is a job whose occupied memory resources exceed a first threshold, the first threshold being smaller than a maximum threshold of the total amount of memory resources, and, when such a job is detected, sends the job information of the job to the monitoring object. A job that may become abnormal is thus placed in an early-warning state in advance, instead of being discovered only after it has failed or has disrupted other jobs and caused a major accident, which effectively reduces the probability of major accidents. Moreover, jobs are monitored in real time at low labor cost, which ensures the stability and sustainability of job monitoring.
In addition, after a job whose occupied memory resources exceed the first threshold is found, the duration of the excess is further checked, and the job information of the job is sent to the monitoring object only if the duration exceeds the threshold time. In this way, occasional job abnormalities caused by other sudden conditions, such as network problems, are effectively filtered out, which reduces the workload of both the terminal or server and the monitoring object.
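Steps 301 to 303 might look like the following sketch. It assumes the incoming job information is tagged with its format and that the preset format is tab-separated text; the column subset, the sample record and the `ensure_preset_format` helper are hypothetical.

```python
import json

PRESET_FORMAT = "text"                   # assumed preset: tab-separated text
COLUMNS = ("jobID", "jobname", "maps")   # hypothetical subset of the described fields


def ensure_preset_format(fmt: str, payload: str) -> str:
    """Return the job information in the preset format, converting if needed."""
    # Step 301: already in the preset format, nothing to convert.
    if fmt == PRESET_FORMAT:
        return payload
    # Step 302: convert a JSON array of job records into a text table.
    if fmt == "json":
        rows = json.loads(payload)
        lines = ["\t".join(COLUMNS)]
        lines += ["\t".join(str(row.get(col, "")) for col in COLUMNS) for row in rows]
        return "\n".join(lines)
    raise ValueError(f"unsupported job information format: {fmt}")


# Hypothetical job record in JSON form.
raw = '[{"jobID": "job_0001", "jobname": "daily_etl", "maps": 35}]'
table = ensure_preset_format("json", raw)
```

The returned text is what step 303 would then send to the monitoring object.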
Referring to fig. 4, fig. 4 is a schematic structural diagram of an embodiment of the job monitoring device of the present application.
The job monitoring apparatus of the present embodiment includes a determination module 401 and a transmission module 402.
The judging module 401 is configured to judge whether there is a job in which the occupied memory resource exceeds a first threshold at present; wherein the first threshold is smaller than the maximum threshold of the total amount of the memory resources.
In this embodiment, the job monitoring device monitors the big data platform resource of the terminal.
Specifically, the judging module 401 acquires the job information of the executing jobs, for example by calling a job information acquisition program to collect, in real time, the job information of the currently executing jobs from YARN in the big data platform Cloudera Manager. The job information includes the name of the job, the amount of memory resources occupied, and the current execution state. For example, on the Hadoop software platform, the amount of memory resources occupied can be represented by the number of maps.
Further, to give an early warning before a job accident occurs, the judging module 401 determines, after acquiring the job information of the currently executing jobs, whether there is a job whose occupied memory resources exceed the first threshold.
Wherein, the first threshold is smaller than the maximum threshold of the total amount of memory resources. The maximum threshold is the upper limit, set by the terminal or the server, on the memory resources that a single job may occupy; exceeding it may prevent other jobs from running normally or delay them.
The first threshold may be set based on practical experience. In a preferred embodiment, the first threshold is the average amount of memory resources occupied by historical jobs; when many jobs are currently executing, the first threshold may also be the average amount of memory resources occupied by the currently executing jobs, which is not limited herein.
The sending module 402 is configured to send job information of a job to a monitoring object when there is a job whose occupied memory resource exceeds the first threshold.
After the screening by the determining module 401, if there is a job whose memory resource exceeds the first threshold in the currently executed jobs, the sending module sends the job information of the job whose memory resource exceeds the first threshold to the monitoring object, so as to remind the monitoring object to monitor the job.
The sending module 402 may send the job information to the monitoring object through a mail, a short message, or other social platform. The monitoring object can be related staff.
In this way, when the judging module 401 detects a job whose occupied memory resources exceed the first threshold, which is smaller than the maximum threshold, the sending module 402 sends the job information of the job to the monitoring object. A job that may become abnormal is thus placed in an early-warning state in advance, instead of being discovered only after it has failed or has disrupted other jobs and caused a major accident, which effectively reduces the probability of major accidents. Moreover, jobs are monitored in real time at low labor cost, which ensures the stability and sustainability of job monitoring.
In another embodiment, to avoid wasting processor resources on occasional abnormalities and to save labor cost, the judging module 401 is further configured to determine, upon detecting a job whose occupied memory resources exceed the first threshold, whether the duration for which the memory resources occupied by the job have exceeded the first threshold exceeds the threshold time. If the duration exceeds the threshold time, the sending module 402 sends the job information of the job to the monitoring object.
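The division into a judging module and a sending module can be illustrated with the sketch below. The class names mirror modules 401 and 402, but the interfaces, the `JobMonitoringDevice` wrapper and the in-memory outbox standing in for a real delivery channel are all assumptions.

```python
from typing import Callable


class JudgingModule:
    """Counterpart of module 401: finds jobs over the first threshold."""

    def __init__(self, first_threshold: float) -> None:
        self.first_threshold = first_threshold

    def judge(self, usage: dict[str, float]) -> list[str]:
        return [name for name, used in usage.items() if used > self.first_threshold]


class SendingModule:
    """Counterpart of module 402: delivers job information to the monitoring object."""

    def __init__(self, deliver: Callable[[str], None]) -> None:
        self.deliver = deliver  # e.g. an e-mail or short-message sender

    def send(self, job_name: str, used: float) -> None:
        self.deliver(f"{job_name} occupies {used}M")


class JobMonitoringDevice:
    """Wires the two modules together for one monitoring pass."""

    def __init__(self, judging: JudgingModule, sending: SendingModule) -> None:
        self.judging = judging
        self.sending = sending

    def run_once(self, usage: dict[str, float]) -> None:
        for name in self.judging.judge(usage):
            self.sending.send(name, usage[name])


outbox: list[str] = []  # stands in for the monitoring object's inbox
device = JobMonitoringDevice(JudgingModule(200), SendingModule(outbox.append))
device.run_once({"daily_etl": 350, "report": 120})
```

Passing the delivery function in from outside keeps the judging logic independent of the notification channel, which matches the module split described above.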
Wherein the threshold time may be set empirically, such as 20 seconds. The time may be set according to an average value of abnormal times of abnormal jobs that occasionally occur in the history, and is not limited herein.
In this way, occasional job abnormalities caused by other sudden conditions, such as network problems, are effectively filtered out, which reduces the workload of both the terminal or server and the monitoring object.
Further, to let the monitoring object understand the possibly abnormal job more clearly and intuitively, the judging module 401, after acquiring the job information to be sent to the monitoring object, further determines whether the format of the job information is the same as the preset format. The preset format may be a table format, such as Excel format, or a text format, which is not limited herein. If the format of the job information differs from the preset format, the job information is converted into the preset format. The sending module 402 sends the format-converted job information to the monitoring object.
In this way, job information from different platforms and in different formats can be converted into a common preset format, so that monitoring personnel can understand the job information of a possibly abnormal job more intuitively and clearly.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an embodiment of the job monitoring terminal of the present application. The terminal 50 of this embodiment includes a processor 501 and a communication circuit 502 coupled to each other. The communication circuit 502 is used to communicate with other devices or with the monitoring object.
The terminal 50 may be a PC, a tablet computer, or a smart device such as a smartphone.
The processor 501 is configured to determine whether there is a job occupying the memory resource exceeding a first threshold at present; wherein the first threshold is smaller than the maximum threshold of the total amount of the memory resources.
The processor 501 monitors resources of jobs executed in a big data platform.
Specifically, the processor 501 acquires the job information of the executing jobs, for example by calling a job information acquisition program to collect, in real time, the job information of the currently executing jobs from YARN in the big data platform Cloudera Manager. The job information includes the name of the job, the amount of memory resources occupied, and the current execution state. For example, on the Hadoop software platform, the amount of memory resources occupied can be represented by the number of maps.
To give an early warning before a job accident occurs, in this embodiment, after acquiring the job information of the currently executing jobs, the processor 501 determines whether there is a job whose occupied memory resources exceed the first threshold.
Wherein, the first threshold is smaller than the maximum threshold of the total amount of memory resources. The maximum threshold is the upper limit, set by the processor 501, on the memory resources that a single job may occupy; exceeding it may prevent other jobs from running normally or delay them.
The first threshold can be set based on practical experience. In a preferred embodiment, the first threshold is the average amount of memory resources occupied by historical jobs, such as 200M. When many jobs are currently executing, the first threshold may also be the average amount of memory resources occupied by the current jobs, which is not limited herein.
The communication circuit 502 is configured to send job information of a job to a monitoring object when there is a job whose occupied memory resource exceeds the first threshold.
The communication circuit 502 may send the job information to the monitoring object through a mail, a short message, or other social platform. The monitoring object can be related staff.
By the above manner, when the processor 501 monitors the operation that the occupied memory resource exceeds the first threshold value smaller than the maximum threshold value, the operation information of the operation is sent to the monitored object, so that the operation which may be abnormal can be brought into an early warning state in advance, the operation can be prevented from being found after the operation fails to be executed or other operations are influenced to cause major accidents, and the probability of the major accidents can be effectively reduced. By the mode, on the premise of low labor cost, real-time monitoring of operation is realized, and stability and sustainability of operation monitoring are guaranteed.
In order to avoid the waste of processor resources caused by the accidental exception handling and save the labor cost, after the processor 501 monitors that the memory resources occupied by the operations exceed the first threshold, it is further determined whether the duration of the memory resources occupied by the operations exceeding the first threshold exceeds the threshold time. The communication circuit 502 transmits job information of the job to the monitoring object when the duration exceeds the threshold time.
In this way, incidental job anomalies caused by transient conditions, such as network problems, are effectively screened out, which saves workload on the terminal or server and also reduces the workload of the monitoring object.
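The duration check can be sketched as a small state tracker. This is an assumption-laden illustration: the class name `DurationGate`, the use of seconds, and the decision to reset the timer as soon as usage drops back to or below the first threshold are choices made for the sketch, not details stated in the application.

```python
from typing import Dict


class DurationGate:
    """Report a job only after it has stayed above the first threshold long enough."""

    def __init__(self, first_threshold_mb: float, threshold_seconds: float) -> None:
        self.first_threshold_mb = first_threshold_mb
        self.threshold_seconds = threshold_seconds
        # Maps job name to the time at which it first exceeded the threshold.
        self._exceeded_since: Dict[str, float] = {}

    def should_alert(self, job_name: str, memory_mb: float, now: float) -> bool:
        if memory_mb <= self.first_threshold_mb:
            # Usage is back to normal: forget the job (assumed reset policy).
            self._exceeded_since.pop(job_name, None)
            return False
        start = self._exceeded_since.setdefault(job_name, now)
        return now - start > self.threshold_seconds
```

A caller would sample each job periodically and pass a monotonically increasing timestamp such as `time.monotonic()` as `now`; a brief spike that ends before the threshold time elapses never triggers a notification.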
In another embodiment, so that the monitoring object can understand a potentially abnormal job more clearly and intuitively, after acquiring the job information to be sent to the monitoring object, the processor 501 further determines whether the format of the job information is the same as a preset format. The preset format may be a table format, such as an Excel format, or a text format, and is not limited herein. If the format of the job information differs from the preset format, the processor 501 converts the job information into the preset format. The communication circuit 502 sends the format-converted job information to the monitoring object.
In this way, job information from different platforms and in different formats can be converted into a common preset format, so that monitoring personnel can understand the job information of a potentially abnormal job more intuitively and clearly.
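The format check and conversion can be illustrated with the following sketch. Everything specific here is an assumption: the format labels `"json"`, `"table"` and `"text"`, the field names, the use of CSV as a stand-in for a table format, and the function name `ensure_preset_format` are hypothetical and chosen only to make the example concrete.

```python
import csv
import io
import json

FIELDS = ("name", "memory_mb", "state")  # assumed job information fields


def _render_table(info: dict) -> str:
    """Render job information as a CSV header plus one row (table format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerow({key: info[key] for key in FIELDS})
    return buffer.getvalue()


def _render_text(info: dict) -> str:
    """Render job information as a single line of text (text format)."""
    return " ".join(f"{key}={info[key]}" for key in FIELDS)


def ensure_preset_format(payload: str, payload_format: str, preset_format: str = "table") -> str:
    """Return the payload unchanged if it is already in the preset format; otherwise convert it."""
    if payload_format == preset_format:
        return payload
    if payload_format != "json":
        raise ValueError(f"unsupported source format: {payload_format}")
    info = json.loads(payload)
    if preset_format == "table":
        return _render_table(info)
    if preset_format == "text":
        return _render_text(info)
    raise ValueError(f"unsupported preset format: {preset_format}")
```

Only JSON input is converted in this sketch; supporting further platform-specific formats would mean adding one parser per source format while keeping the two renderers unchanged.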
Referring to fig. 6, the present application further provides a structural diagram of an embodiment of a computer storage medium. In this embodiment, the computer storage medium 60 stores processor-executable program data 61, the program data 61 being for performing the method in the above-described embodiment.
The computer storage medium 60 may be a medium capable of storing the program data 61, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk. It may also be a server that stores the program data 61; the server may send the stored program data 61 to another device to be run, or may run the stored program data 61 itself.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a logical division, and an actual implementation may divide them differently; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above embodiments are merely examples and are not intended to limit the scope of the present disclosure. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present disclosure, and any direct or indirect application of them in other related technical fields, is likewise included in the scope of the present disclosure.

Claims (10)

1. A job monitoring method, characterized by comprising:
judging whether there is currently a job whose occupied memory resources exceed a first threshold; wherein the first threshold is smaller than the maximum threshold of the total amount of the memory resources;
and if there is a job whose occupied memory resources exceed the first threshold, sending job information of the job to a monitoring object.
2. The job monitoring method according to claim 1, wherein the step of sending job information of the job to a monitoring object if there is a job whose occupied memory resources exceed the first threshold includes:
if there is a job whose occupied memory resources exceed the first threshold, judging whether the duration for which the job exceeds the first threshold exceeds a threshold time;
and if the duration exceeds the threshold time, sending the job information of the job to the monitoring object.
3. The job monitoring method according to claim 1 or 2, wherein the step of sending the job information of the job to a monitoring object includes:
judging whether the format of the job information is the same as a preset format;
if the format of the job information is different from the preset format, converting the format of the job information into the preset format;
and sending the job information after the format conversion to the monitoring object.
4. The job monitoring method according to claim 3, wherein the preset format comprises a table or a text format.
5. The job monitoring method according to claim 1 or 2, wherein the job information includes a name of the job, an amount of occupied memory resources, and a current execution state.
6. The job monitoring method according to claim 1 or 2, wherein the first threshold is an average value of the memory resources occupied by the historical jobs.
7. The job monitoring method according to claim 1 or 2, wherein
the step of judging whether there is currently a job whose occupied memory resources exceed the first threshold specifically includes:
judging whether the number of occupied maps is larger than the first threshold;
and the step of sending the job information of the job to the monitoring object if there is a job whose occupied memory resources exceed the first threshold includes:
if the number of occupied maps is larger than the first threshold, sending the job information of the job to the monitoring object.
8. A job monitoring device, characterized by comprising a judging module and a sending module, wherein
the judging module is used for judging whether there is currently a job whose occupied memory resources exceed a first threshold; wherein the first threshold is smaller than the maximum threshold of the total amount of the memory resources;
the sending module is used for sending job information of the job to a monitoring object when there is a job whose occupied memory resources exceed the first threshold.
9. A job monitoring terminal, characterized in that the job monitoring terminal includes:
a processor and a communication circuit coupled to each other, the processor being operable to implement the job monitoring method of any one of claims 1-7 in cooperation with the communication circuit.
10. A computer storage medium having stored thereon program data which, when executed by a processor, implements the job monitoring method according to any one of claims 1 to 7.
CN201810861581.6A 2018-08-01 2018-08-01 Job monitoring method, device, terminal and computer storage medium Pending CN110795301A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810861581.6A CN110795301A (en) 2018-08-01 2018-08-01 Job monitoring method, device, terminal and computer storage medium

Publications (1)

Publication Number Publication Date
CN110795301A true CN110795301A (en) 2020-02-14

Family

ID=69424931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810861581.6A Pending CN110795301A (en) 2018-08-01 2018-08-01 Job monitoring method, device, terminal and computer storage medium

Country Status (1)

Country Link
CN (1) CN110795301A (en)

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2020-02-14)