CN108920283A - Server guard method based on Prometheus performance monitoring system - Google Patents

Info

Publication number
CN108920283A
Authority
CN
China
Prior art keywords
server
rate limiting
prometheus
monitoring system
performance monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810886980.8A
Other languages
Chinese (zh)
Inventor
彭涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Zhidaochuangyu Information Technology Co Ltd
Original Assignee
Chengdu Zhidaochuangyu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Zhidaochuangyu Information Technology Co Ltd
Priority to CN201810886980.8A
Publication of CN108920283A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5022 Workload threshold

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a server protection method based on the Prometheus performance monitoring system, comprising the following steps. Step 1: install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged. Step 2: store the collected monitoring data of the target server. Step 3: display the monitoring data of the target server in a corresponding data chart system. Step 4: set the trigger condition for server rate limiting in the data chart system and judge whether to trigger the rate-limiting policy. Step 5: trigger the Sentinel queue rate-limiting policy and rate-limit the server. The method limits traffic automatically, improves service stability, avoids expensive and unnecessary upgrades and capacity expansion, reduces labor cost, and saves server resources.

Description

Server protection method based on Prometheus performance monitoring system
Technical field
The present invention relates to the field of server protection, and in particular to a server protection method based on the Prometheus performance monitoring system.
Background art
When an unexpected event makes traffic surge, a website server often sees some of its services crash, which in turn can make the entire service unavailable and reduce the availability of the whole system. To keep the server running normally and improve service availability, burst traffic needs to be rate-limited.
In the prior art, servers are commonly protected by rate limiting with the token bucket algorithm. The load the server can bear is estimated in advance and written into the service code as a token bucket limit. This protects the server and saves resources to some extent. However, because the limit is estimated in advance inside the code, it is not adjusted in real time according to the actual load on the server, which causes the following problems: 1) if the estimated acceptable traffic is too high, the service crashes before traffic reaches the limiting threshold; 2) if the estimated acceptable traffic is too low, service resources are not fully used and are wasted.
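The prior-art approach described above can be illustrated with a minimal token bucket sketch. This is only an illustration of the general algorithm, not code from the patent; the class name, parameter names, and the injectable clock are assumptions made for the example. Note that the rate and capacity are fixed when the object is created, which is exactly the static estimate the background section criticizes.

```python
import time


class TokenBucket:
    """Fixed-rate token bucket: the limit is chosen up front and never adapts to load."""

    def __init__(self, rate_per_s: float, capacity: float, clock=time.monotonic):
        self.rate_per_s = rate_per_s  # tokens added per second (a static estimate)
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start with a full bucket
        self.clock = clock            # injectable clock (hypothetical, for testing)
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

If the chosen rate is higher than the server can really sustain, `allow` keeps returning True while the server is already overloaded; if it is lower, requests are refused while capacity sits idle.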
Related terms
Prometheus: an open-source monitoring system developed in Go. Its basic principle is to periodically scrape the state of monitored components over HTTP (Hypertext Transfer Protocol); any component can be monitored as long as it provides a corresponding HTTP interface.
Redis: a high-performance in-memory key-value database written in C.
Redis Sentinel: the high-availability (HA) solution officially recommended for Redis, a tool for monitoring the state of the nodes in a Redis cluster.
Queue: a special linear list in which operations are restricted. The end where insertion takes place is called the tail of the queue, and the end where deletion takes place is called the head.
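As the Prometheus entry above notes, a component only needs to offer an HTTP interface to be scraped. The following is a minimal sketch of such an interface using only the Python standard library. The metric names, the placeholder values in `read_samples`, and the port are assumptions for illustration; a real deployment would more likely use an existing exporter or client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping


def render_metrics(samples: Mapping[str, float]) -> str:
    """Render gauge samples in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def read_samples() -> Mapping[str, float]:
    """Placeholder readings (hypothetical); replace with real measurements."""
    return {"demo_cpu_usage_ratio": 0.42, "demo_api_latency_ms": 120.0}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes /metrics by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_samples()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve /metrics until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a Prometheus scrape job at `127.0.0.1:8000` would let the monitoring system collect the two demo gauges.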
Summary of the invention
The technical problem to be solved by the invention is to provide a server protection method based on the Prometheus performance monitoring system. The method monitors server data through Prometheus and then rate-limits traffic through a queue built on the Redis Sentinel high-availability architecture, increasing the stability and availability of the service.
To solve the above technical problem, the invention adopts the following technical solution:
A server protection method based on the Prometheus performance monitoring system, comprising the following steps:
Step 1: Install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged, including CPU and memory usage and API response time;
Step 2: Store the monitoring data of the target server collected in step 1;
Step 3: Display the monitoring data of the target server in a corresponding data chart system;
Step 4: Set the trigger condition for server rate limiting in the data chart system and judge whether to trigger the rate-limiting policy, and thus whether to rate-limit the server;
Step 5: Rate-limit the server, including:
1) If server traffic increases, trigger the Sentinel queue rate-limiting policy, that is, add newly arriving traffic to the queue. When server pressure returns to normal within the set time and the Sentinel queue holds no data, cancel the rate-limiting policy;
2) If server traffic increases, trigger the Sentinel queue rate-limiting policy. When server pressure does not return to normal within the set time, continue rate limiting; if the Sentinel queue exceeds its storage capacity, reject the excess requests to protect the current service.
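The queue behavior in step 5 (queue new traffic, reject what does not fit) can be sketched with an in-memory bounded queue. This is a stand-in under stated assumptions: the patent places the queue on Redis Sentinel, whereas this sketch uses `collections.deque` so it runs without a Redis server, and the class and method names are hypothetical.

```python
from collections import deque
from typing import Any, Optional


class BoundedRequestQueue:
    """FIFO holding area for requests while rate limiting is active."""

    def __init__(self, max_len: int):
        self.max_len = max_len  # storage capacity of the queue
        self._items: deque = deque()

    def admit(self, request: Any) -> bool:
        """Queue the request at the tail, or return False if the queue is full."""
        if len(self._items) >= self.max_len:
            return False  # excess request is rejected to protect the service
        self._items.append(request)
        return True

    def next_request(self) -> Optional[Any]:
        """Take the oldest request from the head, or None if the queue is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)
```

A worker would call `next_request` at whatever pace the server can sustain, while the front end calls `admit` for each new arrival and returns an error to the client when it gets False.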
Further, the data chart system is Grafana.
Further, the trigger condition for server rate limiting is that CPU and memory usage is greater than 80% and the API response time is greater than 500 ms.
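The trigger condition can be written as a small predicate. The sketch below is one possible reading, not the patent's implementation: it assumes that "CPU and memory usage greater than 80%" is satisfied when either resource exceeds the threshold, and that usage is expressed as a ratio between 0 and 1. The names `Thresholds` and `should_rate_limit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    usage_ratio: float = 0.80      # 80% CPU or memory usage
    api_latency_ms: float = 500.0  # 500 ms API response time


def should_rate_limit(
    cpu_ratio: float,
    mem_ratio: float,
    api_latency_ms: float,
    thresholds: Thresholds = Thresholds(),
) -> bool:
    """Return True when both the resource and the latency condition hold."""
    # Assumption: the busier of the two resources decides the resource condition.
    resource_hot = max(cpu_ratio, mem_ratio) > thresholds.usage_ratio
    slow_api = api_latency_ms > thresholds.api_latency_ms
    return resource_hot and slow_api
```

Keeping the thresholds in a separate object makes it easy to supply different values per business scenario without touching the predicate.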
Compared with the prior art, the beneficial effects of the invention are as follows: when short-term burst traffic occurs, traffic is limited automatically based on the monitored server pressure, which improves service stability, avoids expensive and unnecessary upgrades and capacity expansion, reduces labor cost, and saves server resources.
Brief description of the drawings
Fig. 1 shows the monitoring flow of the server protection method based on the Prometheus performance monitoring system according to the invention.
Fig. 2 shows the rate-limiting flow of the server protection method based on the Prometheus performance monitoring system according to the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and specific embodiments. The rate-limiting method of the invention addresses the following problems:
1) Real-time response: website traffic can surge at a given moment because of a trending event or a promotion. The Prometheus performance monitoring system makes it possible to judge in real time whether server pressure is too high and then decide in real time whether to turn on the traffic-limiting service;
2) Automated rate limiting: rules for judging in real time whether service traffic is too high (for example, CPU and memory usage above 80%) are set according to the business scenario, and the corresponding conditions are configured in advance to trigger rate limiting, which is fast and flexible;
3) Reduced resource consumption: the traffic level is judged in real time (the pressure the server can bear is judged from current CPU and memory usage), and rate limiting is then applied automatically to protect the server. This handles momentary traffic spikes, keeps the server running normally, and avoids adding server resources immediately, reducing wasted server resources and time.
The server protection method based on the Prometheus performance monitoring system is detailed as follows:
I. Real-time judgment of server performance
1. Install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged, including CPU and memory usage and API response time;
2. The monitoring system (Prometheus) stores the collected monitoring data;
3. Display the monitoring data of the target server in the corresponding data chart system (Grafana);
4. Set the trigger condition for server rate limiting in the data chart system (for example, CPU and memory usage above 80% and API response time greater than 500 ms), and judge whether to trigger the rate-limiting policy.
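The patent evaluates the trigger inside the data chart system. As an alternative illustration, a program can read a current value straight from Prometheus through its HTTP query API (`/api/v1/query`) and apply the threshold itself. The sketch below only builds the request URL and parses an instant-query response; the base URL and the PromQL expression passed in are assumptions, and the network call itself is left to the caller.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def first_value(payload: str) -> Optional[float]:
    """Extract the first sample value from an instant-query JSON response."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError(f"query failed: {doc.get('error', 'unknown error')}")
    result = doc["data"]["result"]
    if not result:
        return None  # the query matched no series
    # Each vector sample carries a [timestamp, "value-as-string"] pair.
    _timestamp, raw_value = result[0]["value"]
    return float(raw_value)
```

The response body could be fetched with `urllib.request.urlopen(build_query_url(...))` and the returned value compared against the 80% or 500 ms thresholds.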
II. Automated rate limiting
Under normal conditions, server pressure is low (for example, CPU and memory usage below 50%) and no rate limiting is applied (normal operation is maintained). When rate limiting is needed, the rate-limiting policy is executed as follows:
1. If server traffic increases (CPU and memory usage above 80% and API response time greater than 500 ms), trigger the Redis Sentinel queue rate-limiting policy (add newly arriving traffic to the queue). If server pressure returns to normal within the set time (for example, 30 minutes) and the Redis queue holds no data, cancel the rate-limiting policy.
2. If server traffic increases, trigger the Redis Sentinel queue rate-limiting policy. If server pressure does not return to normal within the set time (for example, 30 minutes), continue rate limiting; if the Redis Sentinel queue exceeds its storage capacity, reject the excess requests to protect the current service.
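The two cases above amount to a small state machine that switches between normal operation and rate limiting. The sketch below is one interpretation under stated assumptions: the policy is cancelled as soon as pressure is normal and the queue is empty, and the set time (30 minutes by default) is used only to report that limiting has lasted longer than expected. The names `Mode`, `LimitController`, `update`, and `overdue` are hypothetical, and time is passed in as plain seconds.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    NORMAL = "normal"
    LIMITING = "limiting"


class LimitController:
    """Decides when the queue-based rate-limiting policy is on or off."""

    def __init__(self, window_s: float = 30 * 60):
        self.window_s = window_s  # the "set time", 30 minutes in the example
        self.mode = Mode.NORMAL
        self.limiting_since: Optional[float] = None

    def update(self, now: float, overloaded: bool, queue_len: int) -> Mode:
        """Feed in the latest observation and return the resulting mode."""
        if self.mode is Mode.NORMAL:
            if overloaded:
                # Trigger condition met: start queueing new traffic.
                self.mode = Mode.LIMITING
                self.limiting_since = now
        elif not overloaded and queue_len == 0:
            # Case 1: pressure is back to normal and the queue has drained.
            self.mode = Mode.NORMAL
            self.limiting_since = None
        # Case 2: otherwise keep limiting; overflow is rejected by the queue.
        return self.mode

    def overdue(self, now: float) -> bool:
        """True if limiting has lasted longer than the set time."""
        return (
            self.mode is Mode.LIMITING
            and self.limiting_since is not None
            and now - self.limiting_since > self.window_s
        )
```

A caller would invoke `update` on every monitoring tick with the result of the trigger check and the current queue length, and route requests through the queue whenever the returned mode is `Mode.LIMITING`.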

Claims (3)

1. A server protection method based on the Prometheus performance monitoring system, characterized by comprising the following steps:
Step 1: Install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged, including CPU and memory usage and API response time;
Step 2: Store the monitoring data of the target server collected in step 1;
Step 3: Display the monitoring data of the target server in a corresponding data chart system;
Step 4: Set the trigger condition for server rate limiting in the data chart system and judge whether to trigger the rate-limiting policy, and thus whether to rate-limit the server;
Step 5: Rate-limit the server, including:
1) If server traffic increases, trigger the Sentinel queue rate-limiting policy, that is, add newly arriving traffic to the queue. When server pressure returns to normal within the set time and the Sentinel queue holds no data, cancel the rate-limiting policy;
2) If server traffic increases, trigger the Sentinel queue rate-limiting policy. When server pressure does not return to normal within the set time, continue rate limiting; if the Sentinel queue exceeds its storage capacity, reject the excess requests to protect the current service.
2. The server protection method based on the Prometheus performance monitoring system according to claim 1, characterized in that the data chart system is Grafana.
3. The server protection method based on the Prometheus performance monitoring system according to claim 1, characterized in that the trigger condition for server rate limiting is that CPU and memory usage is greater than 80% and the API response time is greater than 500 ms.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810886980.8A CN108920283A (en) 2018-08-06 2018-08-06 Server guard method based on Prometheus performance monitoring system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810886980.8A CN108920283A (en) 2018-08-06 2018-08-06 Server guard method based on Prometheus performance monitoring system

Publications (1)

Publication Number Publication Date
CN108920283A true CN108920283A (en) 2018-11-30

Family

ID=64393598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810886980.8A Pending CN108920283A (en) 2018-08-06 2018-08-06 Server guard method based on Prometheus performance monitoring system

Country Status (1)

Country Link
CN (1) CN108920283A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110266602A (en) * 2019-06-21 2019-09-20 四川新网银行股份有限公司 It is a kind of construct flow control slot method and the execution data flow after building
CN110347377A (en) * 2019-07-08 2019-10-18 紫光云技术有限公司 A kind of Prometheus exporter database monitoring system
CN111176954A (en) * 2020-01-02 2020-05-19 浪潮软件股份有限公司 Monitoring method of kudu
CN111510351A (en) * 2020-04-10 2020-08-07 星辰天合(北京)数据科技有限公司 Anomaly detection method and device based on Promissuris monitoring system
CN112380097A (en) * 2020-11-18 2021-02-19 厦门市美亚柏科信息股份有限公司 Prometheus-based method for customizing monitoring index
CN112615790A (en) * 2020-12-22 2021-04-06 苏州思必驰信息科技有限公司 Multi-server-side flow limiting and flow monitoring system and method
CN113114725A (en) * 2021-03-19 2021-07-13 中新网络信息安全股份有限公司 Multi-node data interaction system based on HTTP (hyper text transport protocol) and implementation method thereof
CN113344454A (en) * 2021-07-05 2021-09-03 湖南快乐阳光互动娱乐传媒有限公司 Pressure measurement data processing method and device
CN113765821A (en) * 2021-09-09 2021-12-07 南京优飞保科信息技术有限公司 Multi-dimensional access flow control system
CN116737514A (en) * 2023-08-15 2023-09-12 南京国睿信维软件有限公司 Automatic operation and maintenance method based on log and probe analysis

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130136253A1 (en) * 2011-11-28 2013-05-30 Hadas Liberman Ben-Ami System and method for tracking web interactions with real time analytics
US20130191621A1 (en) * 2012-01-23 2013-07-25 Phillip M. Hoffman System and method for providing multiple processor state operation in a multiprocessor processing system
CN106257456A (en) * 2016-07-08 2016-12-28 北京京东尚科信息技术有限公司 The method of data base's stability, Apparatus and system is improved under high concurrent request
CN107370684A (en) * 2017-06-15 2017-11-21 腾讯科技(深圳)有限公司 Business current-limiting method and business current-limiting apparatus
CN107592345A (en) * 2017-08-28 2018-01-16 中国工商银行股份有限公司 Transaction current-limiting apparatus, method and transaction system
CN107645456A (en) * 2016-07-20 2018-01-30 阿里巴巴集团控股有限公司 Flow control methods and flow control system
CN107908521A (en) * 2017-11-10 2018-04-13 南京邮电大学 A kind of monitoring method of container performance on the server performance and node being applied under cloud environment

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110266602A (en) * 2019-06-21 2019-09-20 四川新网银行股份有限公司 It is a kind of construct flow control slot method and the execution data flow after building
CN110347377A (en) * 2019-07-08 2019-10-18 紫光云技术有限公司 A kind of Prometheus exporter database monitoring system
CN111176954A (en) * 2020-01-02 2020-05-19 浪潮软件股份有限公司 Monitoring method of kudu
CN111510351A (en) * 2020-04-10 2020-08-07 星辰天合(北京)数据科技有限公司 Anomaly detection method and device based on Promissuris monitoring system
CN111510351B (en) * 2020-04-10 2021-09-14 星辰天合(北京)数据科技有限公司 Anomaly detection method and device based on Promissuris monitoring system
CN112380097A (en) * 2020-11-18 2021-02-19 厦门市美亚柏科信息股份有限公司 Prometheus-based method for customizing monitoring index
CN112615790A (en) * 2020-12-22 2021-04-06 苏州思必驰信息科技有限公司 Multi-server-side flow limiting and flow monitoring system and method
CN113114725A (en) * 2021-03-19 2021-07-13 中新网络信息安全股份有限公司 Multi-node data interaction system based on HTTP (hyper text transport protocol) and implementation method thereof
CN113344454A (en) * 2021-07-05 2021-09-03 湖南快乐阳光互动娱乐传媒有限公司 Pressure measurement data processing method and device
CN113765821A (en) * 2021-09-09 2021-12-07 南京优飞保科信息技术有限公司 Multi-dimensional access flow control system
CN116737514A (en) * 2023-08-15 2023-09-12 南京国睿信维软件有限公司 Automatic operation and maintenance method based on log and probe analysis
CN116737514B (en) * 2023-08-15 2023-12-22 南京国睿信维软件有限公司 Automatic operation and maintenance method based on log and probe analysis

Similar Documents

Publication Publication Date Title
CN108920283A (en) Server guard method based on Prometheus performance monitoring system
US20200133750A1 (en) Methods, apparatus and computer programs for managing persistence
US10404556B2 (en) Methods and computer program products for correlation analysis of network traffic in a network device
US9674046B2 (en) Automatic detection and prevention of network overload conditions using SDN
US9621441B2 (en) Methods and computer program products for analysis of network traffic by port level and/or protocol level filtering in a network device
CN106533805B (en) Micro-service request processing method, micro-service controller and micro-service architecture
US8645532B2 (en) Methods and computer program products for monitoring the contents of network traffic in a network device
CN109450691B (en) Service gateway monitoring method, device and computer readable storage medium
CN108572898B (en) Method, device, equipment and storage medium for controlling interface
CN112527484B (en) Workflow breakpoint continuous running method and device, computer equipment and readable storage medium
CA2604448A1 (en) Method and system for centralized memory management in wireless terminal devices
WO2019140738A1 (en) Method for avoiding excess return visits, and electronic apparatus and computer-readable storage medium
CN105264811A (en) Tuple recovery
CN111782488B (en) Message queue monitoring method, device, electronic equipment and medium
CN113656252A (en) Fault positioning method and device, electronic equipment and storage medium
CN110602331B (en) Error code expansion-based cause positioning method, intelligent terminal and storage medium
US11030184B2 (en) Systems and methods for database active monitoring
CN107025148B (en) Mass data processing method and device
CN116483663A (en) Abnormality warning method and device for platform
CN106547609A (en) A kind of event-handling method and equipment
CN112835794B (en) Method and system for positioning and monitoring code execution problem based on Swoole
CN110995694B (en) Network message detection method, device, network security equipment and storage medium
CN106844151A (en) A kind of network task method for detecting abnormality of VxWorks system
CN111694705A (en) Monitoring method, device, equipment and computer readable storage medium
CN110879774A (en) Network element performance data warning method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20181130