CN108920283A - Server guard method based on Prometheus performance monitoring system - Google Patents
- Publication number
- CN108920283A (application CN201810886980.8A)
- Authority
- CN
- China
- Prior art keywords
- server
- rate limiting
- prometheus
- monitoring system
- performance monitoring
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5022—Workload threshold
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention discloses a server protection method based on the Prometheus performance monitoring system, comprising the following steps. Step 1: install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged. Step 2: store the collected monitoring data of the target server. Step 3: display the monitoring data of the target server in a corresponding data charting system. Step 4: set the trigger condition for server rate limiting in the data charting system and determine whether to trigger the rate-limiting strategy. Step 5: trigger the Sentinel queue rate-limiting strategy to rate-limit the server. The method limits traffic automatically, improves service stability, avoids expensive and unnecessary upgrades and capacity expansion, reduces labor cost, and saves server resources.
Description
Technical field
The present invention relates to the field of server protection, and in particular to a server protection method based on the Prometheus performance monitoring system.
Background art
Website servers often face sudden traffic surges caused by unexpected events. Such a surge can crash some services, which in turn makes the entire service unavailable and affects the availability of the whole system. To keep servers running normally and improve service availability, burst traffic needs to be rate-limited.
In the prior art, rate limiting is commonly performed with the token bucket algorithm to protect the server. The traffic that the server can bear is estimated in advance and fixed inside the service code as the token bucket limit. This protects the server and saves resources to some extent. However, because the limit is estimated in advance inside the code rather than adjusted in real time according to the actual load on the server, the following problems arise: 1) if the estimated capacity is too high, the service crashes before traffic reaches the rate-limiting threshold; 2) if the estimated capacity is too low, service resources are underused and therefore wasted.
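For illustration, the prior-art approach described above can be sketched as a token bucket whose rate and capacity are hard-coded estimates. This is a minimal sketch, not code from the patent; the class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Token bucket whose rate and capacity are fixed in code ahead of time."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = float(rate_per_sec)  # tokens added per second (static estimate)
        self.capacity = float(capacity)  # maximum burst size (static estimate)
        self.tokens = float(capacity)    # start with a full bucket
        self.clock = clock               # injectable clock, useful for testing
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and consume tokens if the request may proceed."""
        now = self.clock()
        # Refill according to elapsed time, never beyond capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Because `rate_per_sec` and `capacity` never change at run time, the sketch shows both failure modes named above: values set too high admit more traffic than the server can bear, and values set too low reject traffic the server could have served.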
Terminology
Prometheus: an open-source monitoring system developed in Go. Its basic principle is to periodically scrape the state of monitored components over HTTP (Hypertext Transfer Protocol); any component can be monitored as long as it provides a corresponding HTTP interface.
Redis: a high-performance in-memory key-value database, written in C.
Redis Sentinel: the high-availability (HA) solution officially recommended by Redis, a tool for monitoring the state of the nodes in a Redis cluster.
Queue: a special linear list on which operations are restricted. The end where insertion takes place is called the tail of the queue, and the end where deletion takes place is called the head of the queue.
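The HTTP interface mentioned in the Prometheus definition can be sketched with the Python standard library alone. The example below renders gauges in the Prometheus text exposition format and serves them at `/metrics`. The metric names, the `CURRENT` values, and the port are assumptions made for illustration; the patent does not name any metrics.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render {name: value} gauges in the Prometheus text exposition format."""
    lines = []
    for name in sorted(samples):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(samples[name])}")
    return "\n".join(lines) + "\n"


# Hypothetical gauge values; a real service would measure these itself.
CURRENT = {"demo_cpu_mem_usage_percent": 42.0, "demo_api_response_ms": 120.0}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(CURRENT).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, exposing the gauges for a Prometheus server to scrape."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; a Prometheus server configured to scrape that address would then collect the two gauges periodically.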
Summary of the invention
The technical problem to be solved by the invention is to provide a server protection method based on the Prometheus performance monitoring system, in which the Prometheus performance monitoring system monitors server data and a queue built on the Redis Sentinel high-availability architecture then performs rate limiting, increasing the stability and availability of the service.
To solve the above technical problem, the invention adopts the following technical solution:
A server protection method based on the Prometheus performance monitoring system, comprising the following steps:
Step 1: install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged, including CPU and memory usage and API response time;
Step 2: store the monitoring data of the target server collected in step 1;
Step 3: display the monitoring data of the target server in a corresponding data charting system;
Step 4: set the trigger condition for server rate limiting in the data charting system, determine whether to trigger the rate-limiting strategy, and rate-limit the server accordingly;
Step 5: rate-limit the server, including:
1) If server traffic increases, trigger the Sentinel queue rate-limiting strategy, i.e., add newly arriving traffic to the queue; when server pressure returns to normal within the set time and the Sentinel queue holds no data, cancel the rate-limiting strategy;
2) If server traffic increases, trigger the Sentinel queue rate-limiting strategy; when server pressure does not return to normal within the set time, continue rate limiting; if the Sentinel queue is filled beyond its capacity, reject the excess requests to protect the current service.
Further, the data charting system is Grafana.
Further, the trigger condition for server rate limiting is that CPU and memory usage exceeds 80% and the API response time exceeds 500 ms.
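The trigger condition can be written as a simple predicate. This is a minimal sketch: the `Sample` type, its field names, and the treatment of CPU and memory usage as a single percentage are assumptions, while the 80% and 500 ms thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    cpu_mem_usage_pct: float  # CPU/memory usage in percent (assumed 0-100 scale)
    api_response_ms: float    # API response time in milliseconds


def should_rate_limit(sample, usage_threshold_pct=80.0, latency_threshold_ms=500.0):
    """True when usage AND response time both exceed their thresholds."""
    return (sample.cpu_mem_usage_pct > usage_threshold_pct
            and sample.api_response_ms > latency_threshold_ms)
```

For example, `should_rate_limit(Sample(85.0, 620.0))` is true, whereas a sample with high usage but a 300 ms response time does not trigger rate limiting because both conditions must hold.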
Compared with the prior art, the beneficial effects of the invention are as follows: when a short-term traffic burst occurs, traffic is limited automatically on the basis of the monitored server pressure, which improves service stability, avoids expensive and unnecessary upgrades and capacity expansion, reduces labor cost, and saves server resources.
Description of the drawings
Fig. 1 shows the monitoring flow of the server protection method based on the Prometheus performance monitoring system according to the invention.
Fig. 2 shows the rate-limiting flow of the server protection method based on the Prometheus performance monitoring system according to the invention.
Detailed description
The invention is described in further detail below with reference to the drawings and specific embodiments. The rate-limiting method of the invention solves the following problems:
1) Real-time response: when a trending event or promotion causes website traffic to surge at a given moment, the Prometheus performance monitoring system can determine in real time whether server pressure is excessive and thus decide in real time whether to enable the traffic-limiting service;
2) Automated rate limiting: rules for judging in real time whether service traffic is excessive (for example, CPU and memory usage above 80%) are defined per business scenario, and the corresponding conditions are set in advance to trigger the rate-limiting configuration, which is fast and flexible;
3) Reduced resource consumption: traffic volume is judged in real time (the pressure the server can bear is judged from current CPU and memory usage), and rate limiting is then applied automatically to protect the server. This handles transient traffic surges, keeps the server running normally, and avoids having to add server resources immediately, reducing wasted server resources and time.
The server protection method based on the Prometheus performance monitoring system according to the invention is detailed as follows:
I. Real-time judgment of server performance
1. Install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged, including CPU and memory usage and API response time;
2. The monitoring system (Prometheus) stores the collected monitoring data;
3. Display the monitoring data of the target server in the corresponding data charting system (Grafana);
4. Set the trigger condition for server rate limiting in the data charting system (for example, CPU and memory usage above 80% and API response time above 500 ms) and determine whether to trigger the rate-limiting strategy.
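The patent sets the trigger condition in the charting system. As an alternative illustration, the stored monitoring data can also be read programmatically through the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`). The sketch below builds the query URL and extracts the first sample from the JSON response; the base URL and any PromQL expression passed in are assumptions supplied by the caller.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_value(payload):
    """Extract the first sample value from an instant-query JSON response."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError(f"query failed: {doc.get('error', 'unknown error')}")
    result = doc["data"]["result"]
    if not result:
        return None  # the expression matched no series
    # Each vector element carries "value": [unix_timestamp, "string value"].
    return float(result[0]["value"][1])


def fetch_value(base_url, promql, timeout=5.0):
    """Run the query against a live Prometheus server and return one value."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_instant_value(resp.read().decode("utf-8"))
```

A caller would compare the returned values against the thresholds above (80% and 500 ms) to decide whether the rate-limiting strategy should be triggered.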
II. Automated rate limiting
Under normal circumstances, when server pressure is low (for example, CPU and memory usage below 50%), no rate limiting is applied and normal operation continues. When rate limiting is needed, the rate-limiting strategy is executed, including:
1. If server traffic increases (CPU and memory usage above 80% and API response time above 500 ms), the Redis Sentinel queue rate-limiting strategy is triggered (newly arriving traffic is added to the queue); if server pressure returns to normal within the set time (for example, 30 minutes) and the Redis queue holds no data, the rate-limiting strategy is cancelled.
2. If server traffic increases, the Redis Sentinel queue rate-limiting strategy is triggered; if server pressure does not return to normal within the set time (for example, 30 minutes), rate limiting continues, and if the Redis Sentinel queue is filled beyond its capacity, the excess requests are rejected to protect the current service.
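The two cases above can be sketched as a small state machine. This is an illustrative sketch under stated assumptions, not the patented implementation: an in-memory `deque` stands in for the Redis-backed queue, and the class name, method names, and `max_queue_len` parameter are hypothetical. Only the 30-minute set time is taken from the text.

```python
from collections import deque


class QueueRateLimiter:
    """In-memory stand-in for the Sentinel queue rate-limiting strategy."""

    def __init__(self, max_queue_len, set_time_s=1800.0):
        self.max_queue_len = max_queue_len  # queue capacity (hypothetical parameter)
        self.set_time_s = set_time_s        # the "set time", e.g. 30 minutes
        self.queue = deque()
        self.limiting = False
        self.started_at = None

    def update(self, overloaded, now):
        """Feed the latest monitoring verdict and update the limiting state."""
        if overloaded:
            if not self.limiting:
                self.limiting, self.started_at = True, now
        elif self.limiting and not self.queue:
            # Pressure is back to normal and the queue is drained: cancel limiting.
            self.limiting, self.started_at = False, None

    def overdue(self, now):
        """True if limiting has lasted longer than the set time without recovery."""
        return self.limiting and now - self.started_at > self.set_time_s

    def admit(self, request):
        """Decide what happens to a newly arriving request."""
        if not self.limiting:
            return "pass"
        if len(self.queue) >= self.max_queue_len:
            return "rejected"  # queue over capacity: drop the excess request
        self.queue.append(request)  # insert at the tail
        return "queued"

    def drain(self, n=1):
        """Remove up to n queued requests from the head for processing."""
        out = []
        while self.queue and len(out) < n:
            out.append(self.queue.popleft())
        return out
```

In this sketch limiting is cancelled only when both conditions of case 1 hold, normal pressure and an empty queue, while `overdue` lets an operator detect case 2, where pressure has not recovered within the set time.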
Claims (3)
1. A server protection method based on the Prometheus performance monitoring system, characterized in that it comprises the following steps:
Step 1: install the Prometheus performance monitoring software package on the target server and use it to collect monitoring data from which server performance can be judged, including CPU and memory usage and API response time;
Step 2: store the monitoring data of the target server collected in step 1;
Step 3: display the monitoring data of the target server in a corresponding data charting system;
Step 4: set the trigger condition for server rate limiting in the data charting system, determine whether to trigger the rate-limiting strategy, and rate-limit the server accordingly;
Step 5: rate-limit the server, including:
1) If server traffic increases, trigger the Sentinel queue rate-limiting strategy, i.e., add newly arriving traffic to the queue; when server pressure returns to normal within the set time and the Sentinel queue holds no data, cancel the rate-limiting strategy;
2) If server traffic increases, trigger the Sentinel queue rate-limiting strategy; when server pressure does not return to normal within the set time, continue rate limiting; if the Sentinel queue is filled beyond its capacity, reject the excess requests to protect the current service.
2. The server protection method based on the Prometheus performance monitoring system according to claim 1, characterized in that the data charting system is Grafana.
3. The server protection method based on the Prometheus performance monitoring system according to claim 1, characterized in that the trigger condition for server rate limiting is that CPU and memory usage exceeds 80% and the API response time exceeds 500 ms.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810886980.8A CN108920283A (en) | 2018-08-06 | 2018-08-06 | Server guard method based on Prometheus performance monitoring system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810886980.8A CN108920283A (en) | 2018-08-06 | 2018-08-06 | Server guard method based on Prometheus performance monitoring system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108920283A true CN108920283A (en) | 2018-11-30 |
Family
ID=64393598
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810886980.8A Pending CN108920283A (en) | 2018-08-06 | 2018-08-06 | Server guard method based on Prometheus performance monitoring system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108920283A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130136253A1 (en) * | 2011-11-28 | 2013-05-30 | Hadas Liberman Ben-Ami | System and method for tracking web interactions with real time analytics |
US20130191621A1 (en) * | 2012-01-23 | 2013-07-25 | Phillip M. Hoffman | System and method for providing multiple processor state operation in a multiprocessor processing system |
CN106257456A (en) * | 2016-07-08 | 2016-12-28 | 北京京东尚科信息技术有限公司 | The method of data base's stability, Apparatus and system is improved under high concurrent request |
CN107645456A (en) * | 2016-07-20 | 2018-01-30 | 阿里巴巴集团控股有限公司 | Flow control methods and flow control system |
CN107370684A (en) * | 2017-06-15 | 2017-11-21 | 腾讯科技(深圳)有限公司 | Business current-limiting method and business current-limiting apparatus |
CN107592345A (en) * | 2017-08-28 | 2018-01-16 | 中国工商银行股份有限公司 | Transaction current-limiting apparatus, method and transaction system |
CN107908521A (en) * | 2017-11-10 | 2018-04-13 | 南京邮电大学 | A kind of monitoring method of container performance on the server performance and node being applied under cloud environment |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110266602A (en) * | 2019-06-21 | 2019-09-20 | 四川新网银行股份有限公司 | It is a kind of construct flow control slot method and the execution data flow after building |
CN110347377A (en) * | 2019-07-08 | 2019-10-18 | 紫光云技术有限公司 | A kind of Prometheus exporter database monitoring system |
CN111176954A (en) * | 2020-01-02 | 2020-05-19 | 浪潮软件股份有限公司 | Monitoring method of kudu |
CN111510351A (en) * | 2020-04-10 | 2020-08-07 | 星辰天合(北京)数据科技有限公司 | Anomaly detection method and device based on Promissuris monitoring system |
CN111510351B (en) * | 2020-04-10 | 2021-09-14 | 星辰天合(北京)数据科技有限公司 | Anomaly detection method and device based on Promissuris monitoring system |
CN112380097A (en) * | 2020-11-18 | 2021-02-19 | 厦门市美亚柏科信息股份有限公司 | Prometheus-based method for customizing monitoring index |
CN112615790A (en) * | 2020-12-22 | 2021-04-06 | 苏州思必驰信息科技有限公司 | Multi-server-side flow limiting and flow monitoring system and method |
CN113114725A (en) * | 2021-03-19 | 2021-07-13 | 中新网络信息安全股份有限公司 | Multi-node data interaction system based on HTTP (hyper text transport protocol) and implementation method thereof |
CN113344454A (en) * | 2021-07-05 | 2021-09-03 | 湖南快乐阳光互动娱乐传媒有限公司 | Pressure measurement data processing method and device |
CN113765821A (en) * | 2021-09-09 | 2021-12-07 | 南京优飞保科信息技术有限公司 | Multi-dimensional access flow control system |
CN116737514A (en) * | 2023-08-15 | 2023-09-12 | 南京国睿信维软件有限公司 | Automatic operation and maintenance method based on log and probe analysis |
CN116737514B (en) * | 2023-08-15 | 2023-12-22 | 南京国睿信维软件有限公司 | Automatic operation and maintenance method based on log and probe analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108920283A (en) | Server guard method based on Prometheus performance monitoring system | |
US20200133750A1 (en) | Methods, apparatus and computer programs for managing persistence | |
US10404556B2 (en) | Methods and computer program products for correlation analysis of network traffic in a network device | |
US9674046B2 (en) | Automatic detection and prevention of network overload conditions using SDN | |
US9621441B2 (en) | Methods and computer program products for analysis of network traffic by port level and/or protocol level filtering in a network device | |
CN106533805B (en) | Micro-service request processing method, micro-service controller and micro-service architecture | |
US8645532B2 (en) | Methods and computer program products for monitoring the contents of network traffic in a network device | |
CN109450691B (en) | Service gateway monitoring method, device and computer readable storage medium | |
CN108572898B (en) | Method, device, equipment and storage medium for controlling interface | |
CN112527484B (en) | Workflow breakpoint continuous running method and device, computer equipment and readable storage medium | |
CA2604448A1 (en) | Method and system for centralized memory management in wireless terminal devices | |
WO2019140738A1 (en) | Method for avoiding excess return visits, and electronic apparatus and computer-readable storage medium | |
CN105264811A (en) | Tuple recovery | |
CN111782488B (en) | Message queue monitoring method, device, electronic equipment and medium | |
CN113656252A (en) | Fault positioning method and device, electronic equipment and storage medium | |
CN110602331B (en) | Error code expansion-based cause positioning method, intelligent terminal and storage medium | |
US11030184B2 (en) | Systems and methods for database active monitoring | |
CN107025148B (en) | Mass data processing method and device | |
CN116483663A (en) | Abnormality warning method and device for platform | |
CN106547609A (en) | A kind of event-handling method and equipment | |
CN112835794B (en) | Method and system for positioning and monitoring code execution problem based on Swoole | |
CN110995694B (en) | Network message detection method, device, network security equipment and storage medium | |
CN106844151A (en) | A kind of network task method for detecting abnormality of VxWorks system | |
CN111694705A (en) | Monitoring method, device, equipment and computer readable storage medium | |
CN110879774A (en) | Network element performance data warning method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20181130 |