
CN118233401A - Service mesh rate limiting method and related device - Google Patents

Service mesh rate limiting method and related device

Info

Publication number
CN118233401A
CN118233401A · Application CN202211595059.0A
Authority
CN
China
Prior art keywords
pod
container group
downstream
service requests
group pod
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211595059.0A
Other languages
Chinese (zh)
Inventor
刘冬冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Cloud Computing Technologies Co Ltd filed Critical Huawei Cloud Computing Technologies Co Ltd
Priority to CN202211595059.0A priority Critical patent/CN118233401A/en
Priority to PCT/CN2023/132091 priority patent/WO2024125201A1/en
Publication of CN118233401A publication Critical patent/CN118233401A/en
Pending legal-status Critical Current


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/00 — Traffic control in data switching networks
    • H04L 47/10 — Flow control; Congestion control
    • H04L 47/32 — Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application provides a service mesh rate limiting method and a related device. The method is applied to an upstream container group Pod having an upstream-downstream correspondence, and comprises the following steps: when the upstream container group Pod is determined to be overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod corresponding to the upstream container group Pod; and, according to the comprehensive grade of each downstream container group Pod that a plurality of first service requests need to invoke, issuing some of the first service requests to the corresponding downstream container groups Pod and discarding the remaining first service requests. The application jointly considers the service grade and the actual overload condition of each downstream Pod and selects the service requests with a low comprehensive grade for discarding, thereby ensuring the stability of the system and the services and improving the quality of service.

Description

Service mesh rate limiting method and related device
Technical Field
The application relates to the field of cloud computing, and in particular to a service mesh rate limiting method and a related device.
Background
With the development of cloud technology, more and more applications are implemented as microservices. Because the microservices are numerous and the call relationships among them are complex, a backend system facing a large number of service requests is prone to overload. Service requests therefore need to be rate limited, so that system crashes are avoided and the services are kept as stable as possible.
The current rate limiting method performs throttling based on a threshold on the service request rate: the amount of service requests within a time window is counted, and when it exceeds the set threshold, no further service requests are accepted (or new service requests are discarded) until the current time window ends. This method throttles in a one-size-fits-all manner and pays no attention to service grades, so it easily damages services. In addition, the following case may occur: the service requests at an upstream Pod are overloaded while a certain downstream Pod corresponding to that upstream Pod is idle; in this case it is unreasonable for the upstream Pod to discard new service requests indiscriminately.
Disclosure of Invention
The application provides a service mesh rate limiting method and a related device.
In a first aspect, the present application provides a service mesh rate limiting method. The method is applied to upstream container groups Pod having an upstream-downstream correspondence, where each container group Pod deploys a sidecar container and an application container: the application container is used for processing service requests, and the sidecar container is used for controlling the traffic of service requests. The method includes:
Under the current time window:
In the case that the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be invoked by a plurality of first service requests received by the upstream container group Pod, wherein the comprehensive grade of the downstream container group Pod is determined based on a preset static service grade of the downstream container group Pod and the capability of the downstream container group Pod to actually process the service requests;
And according to the comprehensive grade of each downstream container group Pod required to be called by the plurality of first service requests, partial service requests in the plurality of first service requests are issued to the corresponding downstream container group Pod, and the rest of service requests except the partial service requests in the plurality of first service requests are discarded.
It can be seen that the upstream container group Pod decides which first service requests to discard for rate limiting based on the comprehensive grade of the downstream container groups Pod, where the comprehensive grade of each downstream container group Pod is determined not by a single factor but by both its static service grade and its capability to actually process service requests. With this rate limiting method, which requests are discarded and which are processed is not decided by a single factor or in a one-size-fits-all manner, so the quality of business service is improved.
Based on the first aspect, in a possible implementation manner, the capability of the downstream container group Pod to actually process the service request includes one or more of a service request drop rate threshold of the downstream container group Pod and an overload rate of the downstream container group Pod, where the overload rate of the downstream container group Pod is positively correlated with a drop rate difference, and the drop rate difference refers to a difference between the service request drop rate of the downstream container group Pod and the service request drop rate threshold.
The service request drop rate threshold is a preset limit on the service request drop rate; it can be understood as how many service requests are allowed to be dropped, or as the tolerance of the downstream container group Pod for dropped service requests, so the preset threshold reflects the capability of the downstream container group Pod to actually process service requests. The overload rate likewise represents this capability, and it is positively correlated with the difference between the actual service request drop rate of the container group Pod and the preset service request drop rate threshold.
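As a concrete illustration, the positive correlation between the overload rate and the drop rate difference could be modeled as follows (a sketch only; the linear form and the scaling factor `k` are assumptions, the application does not specify a formula):

```python
def overload_rate(actual_drop_rate, drop_rate_threshold, k=1.0):
    """Overload rate of a downstream Pod: positively correlated with the
    difference between its actual service request drop rate and the preset
    drop rate threshold. A Pod dropping no more than its threshold allows
    is treated as not overloaded (rate 0)."""
    return max(0.0, k * (actual_drop_rate - drop_rate_threshold))
```

Any monotonically increasing function of the drop rate difference would satisfy the stated correlation equally well.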
Based on the first aspect, in a possible implementation manner, the comprehensive grade of the downstream container group Pod is positively correlated with the static service grade of the downstream container group Pod; the comprehensive grade of the downstream container group Pod is inversely related to a service request discarding rate threshold value of the downstream container group Pod; the overall level of the downstream group of containers Pod is inversely related to the overload rate of the downstream group of containers Pod.
It will be appreciated that the lower the static service grade, the lower the comprehensive grade of the container group Pod. The larger the preset service request drop rate threshold, the more service requests the container group Pod is allowed to drop and the lower its comprehensive grade; the smaller the threshold, the fewer service requests it is allowed to drop and the higher its comprehensive grade. The larger the difference between the actual service request drop rate and the preset threshold, the more overloaded the container group Pod is and the lower its comprehensive grade.
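These three monotonic relationships could be combined into a single score, for example as below (an illustrative sketch; the linear combination and the weights `w1`–`w3` are assumptions not stated by the application):

```python
def comprehensive_grade(static_grade, drop_rate_threshold, overload_rate,
                        w1=1.0, w2=1.0, w3=1.0):
    """Composite score of a downstream Pod: rises with the static service
    grade, falls as the allowed drop rate threshold grows, and falls as
    the overload rate grows (weights are hypothetical)."""
    return w1 * static_grade - w2 * drop_rate_threshold - w3 * overload_rate
```

The exact functional form is an implementation choice; only the signs of the three correlations are fixed by the text above.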
Based on the first aspect, in a possible implementation manner, the method further includes: acquiring the resource utilization rate of the upstream container group Pod under the current time window; and determining whether the upstream container group Pod is overloaded or not under the current window according to the resource utilization rate of the upstream container group Pod.
Based on the first aspect, in a possible implementation manner, the resource usage of the upstream container group Pod includes the usage of the processor and/or the usage of the memory in the upstream container group Pod; the determining whether the upstream container group Pod is overloaded under the current window according to the resource utilization rate of the upstream container group Pod comprises: when the utilization rate of the processor is greater than a set processor utilization rate overload threshold value and/or when the utilization rate of the memory is greater than a set memory utilization rate overload threshold value, determining that the upstream container group Pod is overloaded under the current window; otherwise, determining that the upstream container group Pod is not overloaded under the current window.
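The overload check described above amounts to a simple threshold comparison; a minimal sketch, where the default threshold values are examples rather than values taken from the application:

```python
def upstream_overloaded(cpu_usage, mem_usage,
                        cpu_overload_threshold=0.8,
                        mem_overload_threshold=0.8):
    """The upstream Pod is overloaded in the current window if its processor
    usage and/or its memory usage exceeds the corresponding preset
    overload threshold; otherwise it is not overloaded."""
    return (cpu_usage > cpu_overload_threshold
            or mem_usage > mem_overload_threshold)
```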
Based on the first aspect, in a possible implementation manner, the issuing, according to the comprehensive level of each downstream container group Pod required to be invoked by the plurality of first service requests, a part of service requests in the plurality of first service requests to a corresponding downstream container group Pod, and discarding remaining service requests except the part of service requests in the plurality of first service requests includes:
Determining a downstream container group Pod with the lowest comprehensive grade from all downstream container groups Pod required to be called by the plurality of first service requests as a first target container group Pod;
and sending the service requests for indicating to call the downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod in the plurality of first service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
It will be appreciated that the downstream container group Pod with the lowest comprehensive grade, whose first service requests are discarded, satisfies any one or more of the following: its static service grade is relatively low; its preset service request drop rate threshold is relatively large (in other words, it allows relatively many service requests to be dropped); its overload rate (degree of overload) is relatively high.
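The first throttling operation described above can be sketched as follows (the data shapes — a request as a (request id, target Pod) pair and a dict of grades — are assumptions for illustration only):

```python
def throttle_first_window(requests, grade):
    """requests: list of (request_id, target_pod);
    grade: target_pod -> comprehensive grade.
    Determine the first target Pod (lowest grade among the Pods the
    requests invoke), forward the requests invoking Pods with a strictly
    higher grade, and drop the rest."""
    lowest = min(grade[pod] for _, pod in requests)
    forwarded = [(rid, pod) for rid, pod in requests if grade[pod] > lowest]
    dropped = [(rid, pod) for rid, pod in requests if grade[pod] <= lowest]
    return forwarded, dropped, lowest
```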
Based on the first aspect, in a possible implementation manner, the method further includes:
Under a first time window following the current time window:
Under the condition that the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of second service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod from all downstream container groups Pod required to be called by the plurality of second service requests as a first set;
determining a downstream container group Pod with the lowest comprehensive grade in the first set as a second target container group Pod;
and sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the second target container group Pod in the plurality of second service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
It can be seen that when the upstream container group Pod is overloaded during the current time window, the service requests that invoke the downstream container group Pod with the lowest comprehensive grade are discarded; and when the upstream container group Pod is still overloaded during the first time window after the current one, the service requests that invoke container groups Pod no higher than the lowest comprehensive grade remaining from the previous window are discarded. Thus, when a certain upstream container group Pod stays overloaded over consecutive time windows, the comprehensive grade of the downstream container groups Pod invoked by the discarded service requests rises window by window, so that the overload state of the current upstream container group Pod and/or the downstream container groups Pod is eliminated as soon as possible.
Based on the first aspect, in a possible implementation manner, the method further includes:
Under a second time window following the current time window:
if the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of third service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the second target container group Pod from all downstream container groups Pod required to be called by the plurality of third service requests as a second set;
Determining a downstream container group Pod with the lowest comprehensive grade in the second set as a third target container group Pod;
And sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the third target container group Pod in the plurality of third service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
It can be seen that when an upstream container group Pod remains overloaded over consecutive time windows, the comprehensive grade of the downstream container groups Pod invoked by the discarded service requests becomes higher and higher, so that the overload state of the current upstream container group Pod and/or the downstream container groups Pod is eliminated as soon as possible.
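The escalation across consecutive overloaded windows can be sketched by carrying the previous window's cutoff grade forward (a simplified illustration; how a real sidecar would persist this state across windows is not specified by the application):

```python
def throttle_window(requests, grade, prev_cutoff=None):
    """One overloaded window: the new target grade is the lowest grade
    among invoked Pods strictly above prev_cutoff (or the overall lowest
    in the first overloaded window). Requests invoking Pods at or below
    the new cutoff are dropped; the cutoff is returned for the next window."""
    grades = [grade[pod] for _, pod in requests]
    candidates = [g for g in grades if prev_cutoff is None or g > prev_cutoff]
    cutoff = min(candidates) if candidates else max(grades)
    forwarded = [(rid, pod) for rid, pod in requests if grade[pod] > cutoff]
    return forwarded, cutoff
```

Run over three consecutive overloaded windows, the cutoff climbs one grade per window, matching the behavior described for the first and second time windows above.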
Based on the first aspect, in a possible implementation manner, the comprehensive grade of the downstream container group Pod is represented by a numerical value.
It will be appreciated that the composite level may be represented by a numerical value, in one example, the larger the numerical value, the higher the composite level, and in one example, the smaller the numerical value, the higher the composite level.
In a second aspect, the present application provides a service mesh rate limiting device. The device includes upstream container groups Pod having an upstream-downstream correspondence, each container group Pod having a sidecar container and an application container deployed in it, the application container being configured to process service requests and the sidecar container being configured to manage the traffic of service requests. The device includes:
Under the current time window:
A comprehensive grade determining module, configured to determine, in a case where it is determined that the upstream container group Pod is overloaded, a comprehensive grade of each downstream container group Pod in at least one downstream container group Pod that needs to be invoked in the plurality of first service requests received by the upstream container group Pod, where the comprehensive grade of the downstream container group Pod is determined based on a preset static service grade of the downstream container group Pod and an ability of the downstream container group Pod to actually process the service requests;
And the communication module is used for issuing part of the service requests in the plurality of first service requests to the corresponding downstream container group Pod according to the comprehensive grade of each downstream container group Pod required to be invoked by the plurality of first service requests, and discarding the rest of the service requests except the part of the service requests in the plurality of first service requests.
Based on the second aspect, in a possible implementation manner, the capability of the downstream container group Pod to actually process the service request includes one or more of a service request drop rate threshold of the downstream container group Pod and an overload rate of the downstream container group Pod, where the overload rate of the downstream container group Pod is positively correlated with a drop rate difference, and the drop rate difference refers to a difference between the service request drop rate of the downstream container group Pod and the service request drop rate threshold.
Based on the second aspect, in a possible implementation manner, the comprehensive grade of the downstream container group Pod is positively correlated with the static service grade of the downstream container group Pod; the comprehensive grade of the downstream container group Pod is inversely related to a service request discarding rate threshold value of the downstream container group Pod; the overall level of the downstream group of containers Pod is inversely related to the overload rate of the downstream group of containers Pod.
Based on the second aspect, in a possible implementation manner, the device further includes: an obtaining module, configured to obtain the resource usage rate of the upstream container group Pod under the current time window; and an overload determining module, configured to determine, according to the resource usage rate of the upstream container group Pod, whether the upstream container group Pod is overloaded under the current window.
Based on the second aspect, in a possible implementation manner, the resource usage of the upstream container group Pod includes the usage of the processor and/or the usage of the memory in the upstream container group Pod; the overload determination module is configured to: when the utilization rate of the processor is greater than a set processor utilization rate overload threshold value and/or when the utilization rate of the memory is greater than a set memory utilization rate overload threshold value, determining that the upstream container group Pod is overloaded under the current window; otherwise, determining that the upstream container group Pod is not overloaded under the current window.
Based on the second aspect, in a possible implementation manner, the comprehensive grade determining module is configured to determine, from among the downstream container groups Pod required to be invoked by the plurality of first service requests, a downstream container group Pod with a lowest comprehensive grade as a first target container group Pod;
The communication module is configured to send a service request for indicating to call a downstream container group Pod with a comprehensive level higher than the first target container group Pod in the plurality of first service requests to a corresponding downstream container group Pod, and discard the rest of service requests.
Based on the second aspect, in a possible implementation manner, the comprehensive grade determining module is configured to:
Under a first time window following the current time window:
Under the condition that the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of second service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod from all downstream container groups Pod required to be called by the plurality of second service requests as a first set;
determining a downstream container group Pod with the lowest comprehensive grade in the first set as a second target container group Pod;
The communication module is used for:
and sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the second target container group Pod in the plurality of second service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
Based on the second aspect, in a possible implementation manner, the comprehensive grade determining module is configured to:
Under a second time window following the current time window:
if the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of third service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the second target container group Pod from all downstream container groups Pod required to be called by the plurality of third service requests as a second set;
Determining a downstream container group Pod with the lowest comprehensive grade in the second set as a third target container group Pod;
The communication module is used for:
And sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the third target container group Pod in the plurality of third service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
Based on the second aspect, in a possible implementation manner, the comprehensive level of the downstream container group Pod is represented by a numerical value.
The functional modules of the second aspect are configured to implement the method of the first aspect and any one of the possible implementation manners of the first aspect.
In a third aspect, the present application provides a computing device cluster comprising at least one computing device, each of the at least one computing device comprising a memory and a processor, the processor of the at least one computing device being configured to execute instructions stored in the memory of the at least one computing device, such that the computing device cluster performs the method of any one of the above-mentioned first aspect and possible implementations of the first aspect.
In a fourth aspect, the present application provides a computer storage medium containing instructions which, when executed in a cluster of computing devices, cause the cluster of computing devices to perform the method of any one of the above-described first aspect and possible implementations of the first aspect.
In a fifth aspect, the present application provides a computer program product comprising program instructions which, when executed on a cluster of computing devices, performs the method of any one of the above-mentioned first aspect and any one of the possible implementations of the first aspect.
Drawings
FIG. 1 is a schematic diagram of a system architecture according to the present application;
FIG. 2 is a schematic flow chart of a service mesh rate limiting method according to the present application;
FIG. 3 is a schematic flow chart of a first rate limiting operation according to the present application;
FIG. 4 is an exemplary diagram provided by the present application;
FIG. 5 is a schematic structural diagram of a service mesh rate limiting device according to the present application;
FIG. 6 is a schematic diagram of a computing device according to the present application;
FIG. 7 is a schematic diagram illustrating a computing device cluster according to the present application;
FIG. 8 is a schematic structural diagram of yet another computing device cluster according to the present application.
Detailed Description
In a cloud native architecture, applications are typically designed as a distributed set of multiple microservices, each performing some discrete business function. The service mesh (Service Mesh) is an infrastructure layer for communication between the numerous services; it is responsible for describing the complex service topology of cloud native applications and for controlling network traffic between individual microservices.
Referring to fig. 1, fig. 1 is a schematic diagram of a system architecture according to the present application. As shown in fig. 1, the cloud may include a plurality of server nodes, where a server node may be a virtual machine or a physical host.
One server node may include one or more container groups Pod. Pod is the basic unit used by the container orchestration engine Kubernetes (an open-source project from Google, abbreviated K8s) to deploy, manage, and orchestrate containerized applications, and a Pod may include one or more containers.
In the cloud native architecture, a sidecar container and an application container are deployed in one Pod, as shown in fig. 1. The sidecar container manages and controls network traffic; specifically, it processes each service request received by the Pod, which includes performing rate limiting operations and distributing each service request to the corresponding Pod. The application container implements the business functions of the microservice. It will be appreciated that any service request must first pass through the sidecar container, which distributes it: the sidecar container either distributes the service request to the application container in its own Pod, so that this application container processes the request to realize the corresponding function, or it distributes the service request to the sidecar container in another Pod, which forwards the received request to the application container in that Pod for processing.
It is to be appreciated that an application can include a plurality of microservices, which can be implemented in one or more application containers; therefore, an application can be implemented in one or more application containers. There may be call relationships between the microservices of one application, and also between different applications. The service mesh management platform defines the arrangement of the sidecar containers and the call relationships among the microservices; the call relationships among the microservices may also be called the network topology.
Optionally, the system architecture shown in fig. 1 may further include a service mesh management platform (not shown in fig. 1), which is configured to manage the sidecar containers, such as adding or deleting a sidecar container, and to manage the call relationships and call frequencies between the microservices, for example the number of times microservice a (implemented in Pod a) is allowed to call microservice b (implemented in Pod b) per unit time.
It can be understood that if there is a call relationship between a plurality of microservices, the Pods in which these microservices are located have an upstream-downstream relationship. For example, if microservice a is implemented in Pod a, microservice b is implemented in Pod b, and microservice a may call microservice b, then there is an upstream-downstream correspondence between Pod a and Pod b: Pod a is upstream and Pod b is downstream.
It should be noted that the system shown in fig. 1 may include more or fewer server nodes, one server node may include more or fewer Pod, and the network topology is merely an example, and fig. 1 is not meant to limit the present application.
The application provides a method of rate limiting based on service grade, which introduces a rate limiting software development kit (software development kit, SDK) into the application container. The rate limiting SDK judges the service grade of each received service request and the overload condition, selects the service requests with a high service grade for processing, and discards the service requests with a low service grade.
However, this method needs to introduce the SDK into the application container, and the microservice and the SDK must cooperate to process service requests, so it is an intrusive container rate limiting method. With the development of cloud computing, the number of microservices grows and the call relationships among them become complex; the intrusive container rate limiting method is complex to implement and makes service development difficult.
The application also provides a non-intrusive service mesh rate limiting method, which is applied to an upstream Pod having an upstream-downstream correspondence, specifically to the sidecar container in the upstream Pod. Referring to fig. 2, fig. 2 is a schematic flow chart of the service mesh rate limiting method provided by the application; the method includes but is not limited to the following description.
S101, the Pod obtains the resource utilization rate of the Pod in the current time window.
In the method of the application, the Pod refers to an upstream Pod. The time window refers to a time interval of a preset duration, and the preset duration may be one minute, one second, or other preset durations.
The resource utilization of the Pod includes the processor utilization of the Pod and/or the memory utilization of the Pod. The processor utilization of the Pod refers to the ratio of the amount of processor resources consumed by the Pod to the total amount of processor resources of the Pod, and the memory utilization of the Pod refers to the ratio of the occupied memory capacity of the Pod to the total memory capacity of the Pod. Wherein, when establishing each Pod, a resource capacity size has been allocated for each Pod, including a processor capacity size and a memory capacity size of each Pod. When the processor utilization rate of the Pod and the memory utilization rate of the Pod are calculated, the processor utilization rate of the Pod and the memory utilization rate of the Pod may be calculated once at a certain time of the current time window or may be calculated by other methods, and the application is not limited.
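As a hedged illustration of this step, the following Python sketch computes the two utilization ratios from a Pod's allocated and consumed resources; the `PodResources` structure and its field names are illustrative assumptions, not part of the method.

```python
# Hypothetical sketch of step S101: computing a Pod's resource utilization
# for one time window. The PodResources structure and its field names are
# illustrative assumptions.
from dataclasses import dataclass

@dataclass
class PodResources:
    cpu_used: float       # processor resources consumed by the Pod in the window
    cpu_allocated: float  # processor capacity allocated when the Pod was created
    mem_used: float       # occupied memory capacity
    mem_allocated: float  # total memory capacity allocated to the Pod

def resource_usage(res: PodResources) -> tuple:
    """Return (processor utilization, memory utilization) as ratios in [0, 1]."""
    return (res.cpu_used / res.cpu_allocated, res.mem_used / res.mem_allocated)
```

For example, a Pod allocated 2 processor cores and 1024 MB of memory that consumes 1.5 cores and 512 MB in the current window has utilizations (0.75, 0.5).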
S102, determining whether overload occurs to the Pod in the current time window according to the resource utilization rate of the Pod.
And comparing the resource utilization rate of the Pod with an overload threshold of the Pod, and determining whether the Pod is overloaded or not under the current time window. The overload threshold of the Pod comprises a processor utilization rate overload threshold of the Pod and/or a memory utilization rate overload threshold of the Pod, wherein the overload threshold of the Pod is preset.
In one example, the overload threshold of the present Pod includes a processor usage overload threshold, and if the processor usage of the present Pod is greater than the processor usage overload threshold, it is determined that the present Pod is overloaded in the current time window.
In one example, the overload threshold of the present Pod includes a memory usage overload threshold, and if the memory usage of the present Pod is greater than the memory usage overload threshold, it is determined that the present Pod is overloaded in the current time window.
In one example, the overload threshold of the present Pod includes a memory usage overload threshold and a processor usage overload threshold, and if the memory usage of the present Pod is greater than the memory usage overload threshold and the processor usage is greater than the processor usage overload threshold, then it is determined that the present Pod is overloaded in the current time window.
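The three threshold variants above (processor only, memory only, or both) can be sketched as one hedged helper; treating the combined case as requiring both thresholds to be exceeded follows the last example, and the function shape itself is an assumption.

```python
def is_overloaded(cpu_usage, mem_usage, cpu_threshold=None, mem_threshold=None):
    """Overload decision of step S102: every configured threshold must be
    exceeded. With one threshold configured this reduces to the single-metric
    examples; with both configured it matches the combined example above."""
    checks = []
    if cpu_threshold is not None:
        checks.append(cpu_usage > cpu_threshold)
    if mem_threshold is not None:
        checks.append(mem_usage > mem_threshold)
    return bool(checks) and all(checks)
```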
And S103, under the condition that the current Pod is overloaded under the current time window, executing a first current limiting operation, and discarding the service request for calling the downstream Pod with the lowest comprehensive level.
In the case that it is determined that the present Pod is overloaded in the current time window, the first current limiting operation is performed, which may specifically include, but is not limited to, step S1031 and step S1032, as shown in fig. 3, and fig. 3 is a schematic flow chart of a method of the first current limiting operation provided in the present application.
S1031, determining the comprehensive grade of each downstream Pod in at least one downstream Pod required to be called by the first service requests received by the Pod.
Under the current time window, the present Pod receives a plurality of first service requests, and the present Pod can identify, according to each first service request, which micro service the request corresponds to, and therefore which downstream Pod needs to be called for the request. According to the plurality of first service requests, first, the one or more downstream Pods to be called by the plurality of first service requests can be determined, and then, the comprehensive level of the one or more downstream Pods to be called by the plurality of first service requests is calculated. Wherein the comprehensive level of each downstream Pod changes dynamically, because for any one downstream Pod, the number of discarded service requests differs across time windows; the comprehensive level of each Pod is determined based on the static service level of the Pod and the Pod's ability to actually process service requests, of which the number of discarded service requests of the downstream Pod is an indication, as specifically described below.
For example, referring to the exemplary diagram shown in fig. 4, the present Pod is Pod a, and the downstream Pods required to be invoked by the first service requests received by Pod a include Pod B, Pod C, and Pod D, where each Pod corresponds to a micro service and is used to implement a certain service function.
How to calculate the overall level of each downstream Pod is described below.
For any downstream Pod, the overall level of the downstream Pod may be determined based on the static traffic level of the downstream Pod, the traffic request drop rate threshold of the downstream Pod, the overload rate of the downstream Pod. For example, the comprehensive level of Pod B may be determined based on the static traffic level of downstream Pod B, the traffic request drop rate threshold of Pod B, and the overload rate of Pod B. Similarly, the combined level of Pod C and Pod D may be determined. Wherein, the static service level of each Pod is preset according to the service realized by the micro service in each Pod.
Taking downstream Pod B and Pod C in fig. 4 as an example, a service request drop rate threshold of the downstream Pod and a service request drop rate of the downstream Pod are explained.
Taking the downstream Pod B as an example, under a certain time window, the number of service requests sent by the upstream Pod a to the downstream Pod B is x1, but the downstream Pod B only processes y1 service requests in the x1 service requests, where y1 is smaller than x1; the upstream Pod a receives the responses of the y1 service requests returned by the downstream Pod B, and the x1-y1 service requests are discarded by the downstream Pod B, so the upstream Pod a can determine that the service request discarding rate of the downstream Pod B in the current time window is (x1-y1)/x1. The service request drop rate threshold of the downstream Pod B refers to the ratio of the number of service requests that the upstream Pod a allows the downstream Pod B to drop to the total number of service requests that the upstream Pod a sends to the downstream Pod B.
Taking the downstream Pod C as an example, under a certain time window, the number of service requests sent by the upstream Pod a to the downstream Pod C is x2, but the downstream Pod C only processes y2 service requests in the x2 service requests, where y2 is smaller than x2; the upstream Pod a receives the responses of the y2 service requests returned by the downstream Pod C, and the x2-y2 service requests are discarded by the downstream Pod C, so the upstream Pod a can determine that the service request discarding rate of the downstream Pod C in the current time window is (x2-y2)/x2. The service request drop rate threshold of the downstream Pod C refers to the ratio of the number of service requests that the upstream Pod a allows the downstream Pod C to drop to the total number of service requests that the upstream Pod a sends to the downstream Pod C. For other downstream Pods, the service request drop rate and the service request drop rate threshold are similar.
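The discarding-rate bookkeeping for Pod B and Pod C above reduces to one small formula; the following sketch (with illustrative numbers) assumes the upstream Pod counts the responses it receives back within the window.

```python
def drop_rate(sent: int, processed: int) -> float:
    """Service request discarding rate observed by the upstream Pod in one
    window: (sent - processed) / sent, where `processed` is the number of
    responses the upstream Pod received back from the downstream Pod."""
    return (sent - processed) / sent
```

For example, with x1 = 50 requests sent to Pod B and y1 = 40 responses received back, the discarding rate is (50-40)/50 = 0.2.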
Alternatively, the static service level of each Pod may be represented by a numerical value, for example, 0-100 may be used to represent the level of the static service level, and the larger the numerical value is, the lower the static service level is, 0 is the highest, and 100 is the lowest.
Alternatively, the Euclidean distance algorithm may be used to calculate the comprehensive level of the downstream Pod according to the static service level of the downstream Pod, the service request drop rate threshold of the downstream Pod, and the overload rate of the downstream Pod. For example, p is used to represent the static service level of the downstream Pod, where p has a value in the range [0,100]; L represents the service request drop rate threshold of the downstream Pod, where L has a value in the range [0,1]; and f represents the overload rate of the downstream Pod, where f has a value in the range [0,100].
Wherein d represents the service request discarding rate, the calculation mode is d=r/s, r represents the number of service requests sent by the upstream Pod that are discarded by the downstream Pod, and s represents the number of service requests sent by the upstream Pod to the downstream Pod. In one example, d may represent the service request discarding rate per unit time, r represents the number of service requests sent by the upstream Pod that are discarded by the downstream Pod per unit time, and s represents the number of service requests sent by the upstream Pod to the downstream Pod per unit time. In one example, d may represent the service request discarding rate under the previous time window, r represents the number of service requests sent by the upstream Pod under the previous time window that are discarded by the downstream Pod, and s represents the number of service requests sent by the upstream Pod to the downstream Pod under the previous time window.
When d is smaller than L, the actual service request discarding rate is smaller than the service request discarding rate threshold, which indicates that the service request in the downstream Pod is not overloaded, and the overload rate f takes a value of 0; when d is greater than or equal to L, the actual service request drop rate is greater than or equal to a service request drop rate threshold, which indicates that the service request in the downstream Pod is overloaded or is about to overload, f has a value of 100×α, α is a scaling factor, and the value range of α is [0,1], and the greater the drop rate difference, the greater the value of α, where the drop rate difference refers to the difference between the actual service request drop rate and a preset drop rate threshold. It should be noted that, the specific value of α may be specifically set according to the overload condition of the service request, for example, when overload is less, α may take a smaller value, and when overload is more, α may take a larger value. The calculation mode of the overload rate and the value of alpha in the application are only an example, and in practical application, the overload rate can be calculated in other modes, and the value of alpha can also be calculated in other modes, so the application is not limited.
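The overload rate f described above can be sketched as follows; passing in a fixed α is an assumption, since the text leaves the exact dependence of α on the discarding-rate difference open.

```python
def overload_rate(d: float, L: float, alpha: float = 1.0) -> float:
    """Overload rate f: 0 when the observed discarding rate d is below the
    threshold L, otherwise 100 * alpha with alpha in [0, 1]. How alpha grows
    with the discarding-rate difference is left open by the text; a fixed
    alpha is an illustrative assumption."""
    if d < L:
        return 0.0
    return 100.0 * alpha
```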
Calculating the comprehensive level of the downstream Pod by using the Euclidean distance algorithm, the value representing the comprehensive level of the downstream Pod is √(p² + (L×100)² + f²) (2), i.e., the Euclidean distance of the three-dimensional vector (p, L×100, f) from the origin of coordinates is calculated.
For example, in the example of fig. 4, the static service level p of the downstream Pod B is 10, the service request drop rate threshold L is 20%, and the service request discarding rate d is 0; the corresponding three-dimensional vector is (10, 20, 0), and the value of the comprehensive level of the downstream Pod B is √(10²+20²+0²)=√500≈22.4. The static service level p of the downstream Pod C is 30, the service request drop rate threshold L is 10%, and the service request discarding rate d is 20%; for convenience of calculation, α is taken as 1 here, the corresponding three-dimensional vector is (30, 10, 100), and the value of the comprehensive level of the downstream Pod C is √(30²+10²+100²)=√11000≈104.9. The static service level p of the downstream Pod D is 40, the service request drop rate threshold L is 30%, and the service request discarding rate d is 0; the corresponding three-dimensional vector is (40, 30, 0), and the value of the comprehensive level of the downstream Pod D is √(40²+30²+0²)=√2500=50. Thus, among the downstream Pods of the upstream Pod a, the downstream Pod B has the highest comprehensive level, and the downstream Pod C has the lowest comprehensive level.
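The Euclidean-distance computation and the fig. 4 numbers can be checked with a short sketch (α taken as 1 for Pod C, as in the text):

```python
import math

def composite_level(p: float, L: float, f: float) -> float:
    """Value of the comprehensive level per formula (2): the Euclidean
    distance of (p, L*100, f) from the origin. A LARGER value means a
    LOWER comprehensive level."""
    return math.sqrt(p ** 2 + (L * 100) ** 2 + f ** 2)

pod_b = composite_level(10, 0.20, 0)    # sqrt(500)   ~ 22.4
pod_c = composite_level(30, 0.10, 100)  # sqrt(11000) ~ 104.9
pod_d = composite_level(40, 0.30, 0)    # sqrt(2500)  = 50.0
```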
Analysis according to equation (2) shows that: when L and f are fixed, the larger the static service level p is, the larger the numerical value representing the comprehensive level of the downstream Pod is, the lower the comprehensive level of the downstream Pod is, and it can be understood that the larger p is, the lower the static service level representing the downstream Pod is, and the lower the service level of the micro service in the Pod is, the lower the comprehensive level obtained by calculation is; when p and f are fixed, the larger the L is, the larger the numerical value representing the comprehensive grade of the downstream Pod is, the lower the comprehensive grade of the downstream Pod is, the service request discarding rate threshold can be understood as the tolerance to the number of discarded service requests, and the larger the service request discarding rate threshold L is, namely the larger the tolerance is, the lower the comprehensive grade of the downstream Pod is; when p and L are fixed, the larger the service request discarding rate of the downstream Pod is, the larger the overload rate f is, and the lower the comprehensive level of the downstream Pod is.
It should be noted that this step of calculating the integrated level of each Pod is performed by the present Pod (upstream Pod). The static service level of each Pod is globally shared in the service grid, so that the present Pod (upstream Pod) can obtain the static service level of each downstream Pod, thereby calculating the comprehensive level of each downstream Pod.
S1032, discarding part of the service requests in the plurality of first service requests according to the comprehensive grade of each downstream Pod required to be called by the plurality of first service requests, and issuing the rest of the service requests to the corresponding downstream Pod.
After the comprehensive grade of each downstream Pod required to be called by the plurality of first service requests is determined, discarding the service request with the lowest indicated calling comprehensive grade in the plurality of first service requests, and sending other service requests to the corresponding downstream Pod. For example, in the example of fig. 4, the downstream Pod required to be invoked in each service request received by the upstream Pod a includes Pod B, pod C and Pod D, and the comprehensive level of Pod C is the lowest through calculation, so Pod a discards the service request for indicating to invoke Pod C in each service request, and sends other service requests to Pod B and Pod D correspondingly.
In one example, the value representing the comprehensive grade of Pod B is calculated to be 22, the value representing the comprehensive grade of Pod C is calculated to be 104, the value representing the comprehensive grade of Pod D is calculated to be 50, and it is determined that the comprehensive grade is Pod C with the lowest comprehensive grade according to the value, so that Pod a discards the service request for indicating to call Pod C in each service request, and other service requests are correspondingly sent to Pod B and Pod D. In the present application, the service request is discarded, and it is understood that the service request is not processed.
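The first current limiting operation (steps S1031-S1032) can be sketched as follows; representing each service request by an id mapped to the downstream Pod it calls is an illustrative assumption.

```python
def first_limit(requests: dict, levels: dict):
    """requests: request id -> downstream Pod it calls.
    levels: downstream Pod -> comprehensive level value (larger = lower level).
    Requests calling the Pod with the lowest comprehensive level (largest
    value) are discarded; the rest are forwarded."""
    lowest = max(levels, key=levels.get)
    forward = {r: pod for r, pod in requests.items() if pod != lowest}
    dropped = [r for r, pod in requests.items() if pod == lowest]
    return forward, dropped
```

With the fig. 4 values {"B": 22.4, "C": 104.9, "D": 50.0}, the requests calling Pod C are discarded and the rest are forwarded.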
Under the condition that the Pod is overloaded in the current time window, the number of service requests in the Pod is reduced by discarding part of service requests.
And S104, under the condition that the Pod is not overloaded under the current time window, a plurality of first service requests are issued to the corresponding downstream container group Pod.
Under the condition that the Pod is not overloaded under the current time window, the Pod issues a plurality of first service requests to the corresponding downstream Pod according to the normal condition, any service request is not discarded until the next time window comes, whether the Pod is overloaded under the next time window is judged, if yes, the step S105 is executed, and if no overload is generated, the step S101 is executed.
S105, under the condition that the present Pod is overloaded in the current time window, determining whether the present Pod is overloaded in the first time window after the current time window.
After the current time window is finished, entering the next time window, namely the first time window after the current time window, and judging whether overload occurs to the Pod (upstream Pod) under the first time window after the current time window. The judgment method is similar to the method described in the steps S101 and S102 above: 1) Acquiring the resource utilization rate of the Pod in a first time window after the current time window, wherein the resource utilization rate comprises the processor utilization rate and/or the memory utilization rate; 2) And determining whether the Pod is overloaded in the first time window after the current time window according to the resource utilization rate of the Pod. Specifically, the resource usage rate of the Pod may be compared with the overload threshold to determine whether the Pod is overloaded in the first time window after the current time window, and the description in steps S101 and S102 may be referred to for details, which are not further described herein for brevity of description.
And S106, under the condition that overload occurs to the Pod under the first time window after the current time window, executing a second current limiting operation, wherein the comprehensive grade of the downstream Pod called by the discarded service request in the second current limiting operation is higher than that of the downstream Pod called by the discarded service request in the first current limiting operation.
In the case where it is determined that overload occurs in the present Pod under the first time window after the current time window, a second current limiting operation is performed, specifically including but not limited to the following contents in steps S1061, S1062.
S1061, determining the comprehensive grade of each downstream Pod in at least one downstream Pod required to be invoked by the second service requests received by the Pod.
Firstly, the one or more downstream Pods required to be invoked by the plurality of second service requests received by the present Pod are determined, wherein the present Pod can identify which micro service each request corresponds to according to the second service request, and it can be understood that the present Pod can identify which downstream Pod is required to be invoked by each request according to the second service request. Then, the comprehensive level of the one or more downstream Pods to be invoked by the plurality of second service requests is determined. For any downstream Pod, the comprehensive level of the Pod is determined according to the static service level of the Pod, the service request drop rate threshold of the Pod, and the overload rate of the Pod. In one example, the comprehensive level of each downstream Pod may be obtained by numerically representing the static service level and calculating with the Euclidean distance algorithm. For details, reference is made to the description of the content in step S1031, and for brevity of description, the description will not be repeated here.
S1062, discarding part of service requests in the plurality of second service requests according to the comprehensive grade of each downstream Pod required to be invoked by the plurality of second service requests, and issuing the rest of service requests to the corresponding downstream Pod.
After determining the comprehensive level of each downstream Pod required to be invoked by the plurality of second service requests, firstly, the Pods with a comprehensive level higher than that of the Pod with the lowest comprehensive level among the downstream Pods invoked by the plurality of first service requests in the first current limiting operation are screened out as a first target set. For example, in the example of fig. 4, the Pod with the lowest comprehensive level in the first current limiting operation is Pod C, so the Pods with a comprehensive level higher than Pod C are screened from the downstream Pods required to be invoked by the plurality of second service requests as the first target set. Then, the Pod with the lowest comprehensive level is determined from the first target set and deleted, to obtain a second target set. Finally, the present Pod (upstream Pod) sends the service requests in the plurality of second service requests for indicating to call any Pod in the second target set to the corresponding downstream Pods, and discards the other service requests.
For example, in the example of fig. 4, in the first current limiting operation, the comprehensive level of Pod C is the lowest, and the value representing the comprehensive level of Pod C is about 104. Then, in the second current limiting operation, first, the value of the comprehensive level of each downstream Pod required to be invoked by the plurality of second service requests is calculated; then, the Pods with a comprehensive level value smaller than 104 (Pods with a comprehensive level higher than Pod C) are determined therefrom as the first target set; then, the Pod with the largest comprehensive level value (the Pod with the lowest comprehensive level) is determined from the first target set and deleted, to obtain the second target set; finally, the service requests in the plurality of second service requests for instructing to invoke any Pod in the second target set are issued to the corresponding Pods, and the other service requests are discarded.
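The second current limiting operation above can be sketched in the same style; the request and level dictionaries are illustrative assumptions.

```python
def second_limit(requests: dict, levels: dict, prev_lowest_value: float):
    """Keep only Pods whose comprehensive level value is below the lowest
    level value of the first operation (the first target set), then delete
    the worst Pod of that set (yielding the second target set); forward only
    requests calling Pods in the second target set."""
    first_target = {pod: v for pod, v in levels.items() if v < prev_lowest_value}
    if not first_target:
        return {}, list(requests)
    worst = max(first_target, key=first_target.get)
    second_target = set(first_target) - {worst}
    forward = {r: p for r, p in requests.items() if p in second_target}
    dropped = [r for r, p in requests.items() if p not in second_target]
    return forward, dropped
```

With the fig. 4 values and a previous lowest value of about 104.9 (Pod C), the first target set is {B, D}, Pod D is deleted, and only requests calling Pod B are forwarded.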
And S107, under the condition that the current time window is overloaded and the Pod is not overloaded in the first time window after the current time window, a plurality of second service requests are issued to the corresponding downstream container group Pod.
And under the condition that the present Pod is not overloaded in the first time window after the current time window, the plurality of second service requests are issued to the corresponding downstream container group Pod according to normal operation, and the arrival of the next time window is waited; whether overload occurs is judged again under the next time window, a third current limiting operation is executed if overload occurs, and step S101 is executed if no overload occurs.
Optionally, in the second time window after the current time window, determining whether the present Pod is overloaded, the method is similar to step S105, and specific content may refer to the description in step S105, which is not further described herein for brevity of description. In the case that the present Pod is determined to be overloaded, executing a third current limiting operation, including:
1) Determining the comprehensive level of one or more downstream Pod required to be invoked by the third service requests received by the Pod, which is described in step S1061 and will not be further described herein;
2) Determining a third target set from all downstream Pods required to be called by the plurality of third service requests, wherein the third target set comprises Pods with the comprehensive grade higher than that of the Pods with the lowest comprehensive grade in the first target set in the second current limiting operation;
3) Determining the Pod with the lowest comprehensive grade from the third target set, and deleting the Pod to obtain a fourth target set;
4) And sending the service request for indicating to call any Pod in the fourth target set in the plurality of third service requests to the corresponding downstream Pod, and discarding other service requests.
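The third current limiting operation in steps 1)-4) generalizes the pattern: each successive operation restricts the target set further. A hedged sketch follows (the Pod names and level values used in the example are hypothetical):

```python
def next_limit(requests: dict, levels: dict, prev_target_lowest: float):
    """Generalized escalation step covering the third current limiting
    operation: keep Pods with a comprehensive level higher than the lowest
    Pod of the previous target set (value strictly below prev_target_lowest),
    delete the lowest-level member of that set, and forward only requests
    calling the remaining Pods."""
    target = {p: v for p, v in levels.items() if v < prev_target_lowest}
    if target:
        del target[max(target, key=target.get)]
    forward = {r: p for r, p in requests.items() if p in target}
    dropped = [r for r, p in requests.items() if p not in target]
    return forward, dropped
```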
It should be noted that, before step S101 is performed, the present Pod (upstream Pod) is not overloaded, that is, the present Pod is not overloaded in a time window before the current time window.
Optionally, in an example, if no overload occurs in the present Pod in n consecutive time windows after the current time window, and in a case where an overload occurs in the n+1th time window, the first current limiting operation is performed in the n+1th time window, where n may be specifically set by the user according to a specific situation.
Optionally, in an example, if the present Pod has not yet gone n consecutive time windows without overload after the current time window and overload occurs again, a current limiting operation similar to the first current limiting operation or the second current limiting operation may be performed, and a portion of the service requests may be discarded.
It should be noted that, for ease of understanding, the embodiment of the present application describes a current limiting method for two consecutive time windows from step S101 to step S107, and in practical application, there may be a greater or lesser number of consecutive time windows that are overloaded. For example, if overload behavior occurs in the present Pod (upstream Pod) in three consecutive time windows, the present Pod executes a current limiting operation in the three time windows in which the overload behavior occurs, and the integrated level of the downstream Pod called by the service request discarded in the second current limiting operation is higher than the integrated level of the downstream Pod called by the service request discarded in the first current limiting operation, and the integrated level of the downstream Pod called by the service request discarded in the third current limiting operation is higher than the integrated level of the downstream Pod called by the service request discarded in the second current limiting operation, so that the overload behavior of the present Pod disappears as soon as possible.
It can be seen that the present application provides a service grid current limiting method, which determines whether an upstream Pod is overloaded according to the resource usage of the upstream Pod, and performs current limiting under the condition that the upstream Pod is overloaded; when current limiting is performed, the static service level of the downstream Pod and the real overload situation of the downstream Pod are comprehensively considered to determine the comprehensive level of the downstream Pod, and the service requests calling the Pods with a low comprehensive level are selected to be discarded, thereby ensuring the stability of the system and of the service. Where a Pod with a low comprehensive level may have a low static service level (the service level of its micro service is low) or may be in a high-load state.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a service grid current limiting device 500 provided by the present application, where the device 500 includes upstream container groups Pod having an upstream-downstream correspondence, each container group Pod is configured with a sidecar container and an application container, the application container is used for processing service requests, and the sidecar container is used for controlling the traffic of service requests, and the device 500 includes:
Under the current time window:
A comprehensive grade determining module 510, configured to determine, in a case where it is determined that the upstream container group Pod is overloaded, a comprehensive grade of each downstream container group Pod in at least one downstream container group Pod that needs to be invoked by the plurality of first service requests received by the upstream container group Pod, where the comprehensive grade of the downstream container group Pod is determined based on a preset static service grade of the downstream container group Pod and an ability of the downstream container group Pod to actually process the service requests;
And the communication module 520 is configured to issue a part of service requests in the plurality of first service requests to the corresponding downstream container group Pod according to the comprehensive level of each downstream container group Pod required to be invoked by the plurality of first service requests, and discard the rest of service requests except the part of service requests in the plurality of first service requests.
In a possible implementation manner, the capability of the downstream container group Pod to actually process the service request includes one or more of a service request drop rate threshold of the downstream container group Pod and an overload rate of the downstream container group Pod, where the overload rate of the downstream container group Pod is positively correlated with a drop rate difference, and the drop rate difference refers to a difference between the service request drop rate of the downstream container group Pod and the service request drop rate threshold.
In a possible implementation, the comprehensive level of the downstream container group Pod is positively correlated with the static traffic level of the downstream container group Pod; the comprehensive grade of the downstream container group Pod is inversely related to the service request discarding rate threshold value of the downstream container group Pod; the overall level of the downstream group Pod is inversely related to the overload rate of the downstream group Pod.
In a possible implementation manner, the obtaining module 530 is configured to obtain a resource usage rate of the upstream container group Pod in the current time window; the overload determining module 540 is configured to determine whether the upstream container group Pod is overloaded in the current window according to the resource usage rate of the upstream container group Pod.
In a possible implementation, the resource usage of the upstream container group Pod includes the usage of the processor and/or the usage of the memory in the upstream container group Pod; the overload determination module 540 is configured to: when the usage of the processor is greater than a set processor usage overload threshold and/or when the usage of the memory is greater than a set memory usage overload threshold, determine that the upstream container group Pod is overloaded in the current window; otherwise, determine that the upstream container group Pod is not overloaded in the current window.
In a possible implementation manner, the comprehensive grade determining module 510 is configured to determine, from among the downstream container groups Pod required to be invoked by the plurality of first service requests, a downstream container group Pod with a lowest comprehensive grade, as a first target container group Pod;
The communication module 520 is configured to issue, to the corresponding downstream container group Pod, a service request for indicating to invoke a downstream container group Pod with a comprehensive level higher than that of the first target container group Pod in the plurality of first service requests, and discard the remaining service requests.
In a possible implementation, the comprehensive grade determination module 510 is configured to:
under the first time window after the current time window:
In the case that the upstream container group Pod is overloaded, determining the comprehensive level of each downstream container group Pod in at least one downstream container group Pod required to be invoked by the plurality of second service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod from all downstream container groups Pod required to be called by a plurality of second service requests as a first set;
Determining a downstream container group Pod with the lowest comprehensive grade in the first set as a second target container group Pod;
the communication module 520 is configured to:
And sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the second target container group Pod in the plurality of second service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
In a possible implementation manner, the comprehensive grade determining module 510 is configured to:
Under a second time window after the current time window:
If the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of third service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the second target container group Pod from all downstream container groups Pod required to be called by a plurality of third service requests as a second set;
Determining a downstream container group Pod with the lowest comprehensive grade in the second set as a third target container group Pod;
the communication module 520 is configured to:
and sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than the third target container group Pod in the plurality of third service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
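The first and second windows after the current window follow one pattern: restrict attention to the downstream Pods graded above the previous target (the "first set"/"second set"), take the lowest-graded of those as the new target, and throttle against it. A hedged sketch of that per-window step (function name and data shapes are assumptions):

```python
# Hedged sketch of the per-window tightening step used in the first and
# second time windows after the current window.

def next_window_throttle(requests, grade_of, prev_target_grade):
    """requests: list of downstream Pod names, one per service request in
    this window. grade_of: dict mapping Pod name -> comprehensive grade.
    prev_target_grade: grade of the previous window's target Pod.
    Returns (forwarded, dropped, new_target_grade)."""
    # The set of Pods called this window whose grade exceeds the previous target.
    candidate_grades = {grade_of[p] for p in requests
                        if grade_of[p] > prev_target_grade}
    if not candidate_grades:
        # No Pod outranks the previous target: nothing qualifies for forwarding.
        return [], list(requests), prev_target_grade
    new_target = min(candidate_grades)  # lowest-graded Pod in the set
    forwarded = [p for p in requests if grade_of[p] > new_target]
    dropped = [p for p in requests if grade_of[p] <= new_target]
    return forwarded, dropped, new_target
```

Applying the same step window after window raises the target grade monotonically, so ever fewer, ever higher-graded downstream Pods keep receiving requests while the upstream Pod stays overloaded.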
In a possible implementation, the comprehensive level of the downstream group of containers Pod is represented by a numerical value.
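Since the comprehensive level is numeric, it can be sketched as a scoring function. The embodiment fixes only the correlations (positive with the static service grade, inverse with the service request drop rate threshold and with the overload rate); the weighted form below is an assumed example, not the claimed formula:

```python
# Hedged sketch of a numeric comprehensive grade. Only the directions of
# correlation come from the embodiment; the specific weights are assumptions.

def comprehensive_grade(static_grade: float,
                        drop_rate_threshold: float,
                        drop_rate: float) -> float:
    """Higher static service grade -> higher grade; higher drop-rate
    threshold or higher overload rate -> lower grade."""
    # Overload rate is positively correlated with the drop-rate difference
    # (actual drop rate minus the drop-rate threshold), floored at zero.
    overload_rate = max(0.0, drop_rate - drop_rate_threshold)
    return static_grade - 2.0 * drop_rate_threshold - overload_rate
```

Any monotone combination with the same correlation directions would serve equally well here.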
The comprehensive grade determining module 510, the communication module 520, the obtaining module 530, and the overload determining module 540 may all be implemented by software or by hardware. Illustratively, the implementation of the comprehensive grade determination module 510 is described next as an example. Similarly, the implementations of the communication module 520, the obtaining module 530, and the overload determination module 540 may refer to that of the comprehensive grade determination module 510.
Taking a module as an example of a software functional unit, the comprehensive grade determination module 510 may include code running on a computing device. The computing device may be at least one of a physical host, a virtual machine, a container, and the like, and there may be one or more computing devices. For example, the comprehensive grade determination module 510 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers for running the code may be distributed in the same region or in different regions. The multiple hosts/virtual machines/containers for running the code may be distributed in the same availability zone (availability zone, AZ) or in different AZs, each AZ including one data center or multiple geographically close data centers. Typically, one region may include multiple AZs.
Likewise, the multiple hosts/virtual machines/containers for running the code may be distributed in the same virtual private cloud (virtual private cloud, VPC) or in multiple VPCs, where typically one VPC is placed within one region. For communication between two VPCs in the same region, and for cross-region communication between VPCs in different regions, a communication gateway needs to be set in each VPC, and interconnection between the VPCs is realized through the communication gateways.
Taking a module as an example of a hardware functional unit, the comprehensive grade determination module 510 may include at least one computing device, such as a server. Alternatively, the comprehensive grade determination module 510 may be a device implemented using an application-specific integrated circuit (application-specific integrated circuit, ASIC), a programmable logic device (programmable logic device, PLD), or the like. The PLD may be a complex programmable logic device (complex programmable logic device, CPLD), a field-programmable gate array (field-programmable gate array, FPGA), generic array logic (generic array logic, GAL), or any combination thereof.
The multiple computing devices included in the comprehensive grade determination module 510 may be distributed in the same region or in different regions, and may be distributed in the same AZ or in different AZs. Likewise, the multiple computing devices included in the comprehensive grade determination module 510 may be distributed in the same VPC or in multiple VPCs. The multiple computing devices may be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a computing device 600 provided by the present application. The computing device 600 may be, for example, a bare metal server, a virtual machine, or a container. The computing device 600 may be configured as the upstream container group Pod in the method embodiments, and specifically may be configured as the sidecar container in the upstream container group Pod. The computing device 600 includes: a bus 602, a processor 604, a memory 606, and a communication interface 608. The processor 604, the memory 606, and the communication interface 608 communicate via the bus 602. It should be understood that the present application does not limit the number of processors and memories in the computing device 600.
The bus 602 may be a peripheral component interconnect (peripheral component interconnect, PCI) bus, an extended industry standard architecture (extended industry standard architecture, EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on. For ease of illustration, only one line is shown in fig. 6, but this does not mean that there is only one bus or only one type of bus. The bus 602 may include a path for transferring information between various components of the computing device 600 (e.g., the memory 606, the processor 604, and the communication interface 608).
The processor 604 may include any one or more of a central processing unit (central processing unit, CPU), a graphics processing unit (graphics processing unit, GPU), a microprocessor (microprocessor, MP), or a digital signal processor (digital signal processor, DSP).
The memory 606 may include volatile memory, such as random access memory (random access memory, RAM). The memory 606 may also include non-volatile memory (non-volatile memory), such as read-only memory (read-only memory, ROM), flash memory, a mechanical hard disk (hard disk drive, HDD), or a solid state drive (solid state drive, SSD).
The memory 606 stores executable program code, and the processor 604 executes the executable program code to implement the functions of the above-mentioned comprehensive grade determination module 510, communication module 520, obtaining module 530, and overload determination module 540, respectively, so as to implement the service grid current limiting method. That is, the memory 606 stores instructions for performing the service grid current limiting method.
The communication interface 608 uses a transceiver module, such as, but not limited to, a network interface card or a transceiver, to enable communication between the computing device 600 and other devices or communication networks. Illustratively, the communication module 520 may be located in the communication interface 608.
The embodiment of the application also provides a computing device cluster. The computing device cluster includes at least one computing device. The computing device may be a server (such as a central server or an edge server), a virtual machine, or a container (such as a sidecar container).
As shown in fig. 7, fig. 7 is a schematic structural diagram of a computing device cluster provided by the present application. The computing device cluster includes at least one computing device 600. In one scenario, at least one computing device in the computing device cluster may be configured as a sidecar container, and the memory 606 in one or more computing devices 600 in the computing device cluster may store the same instructions for performing the service grid current limiting method.
In some possible implementations, portions of the instructions for performing the service grid current limiting method may also be stored separately in the memories 606 of one or more computing devices 600 in the computing device cluster. In other words, a combination of one or more computing devices 600 may be used to collectively execute the instructions of the service grid current limiting method.
When at least one computing device in the computing device cluster is configured as the service grid current limiting apparatus 500, the memories 606 in different computing devices 600 in the computing device cluster may respectively store different instructions for performing part of the functions of the apparatus 500. That is, the instructions stored in the memories 606 of the different computing devices 600 may implement the functions of one or more of the comprehensive grade determination module 510, the communication module 520, the obtaining module 530, and the overload determination module 540.
In some possible implementations, one or more computing devices in a cluster of computing devices may be connected through a network. Wherein the network may be a wide area network or a local area network, etc. FIG. 8 illustrates one possible implementation of a cluster of computing devices. As shown in fig. 8, two computing devices 600A and 600B are connected by a network. Specifically, the connection to the network is made through a communication interface in each computing device. In this type of possible implementation, the memory 606 in the computing device 600A stores instructions for performing the functions of the acquisition module 530 and the overload determination module 540, and the memory 606 in the computing device 600B stores instructions for performing the functions of the communication module 520 and the comprehensive grade determination module 510.
It should be appreciated that the functionality of computing device 600A shown in fig. 8 may also be performed by multiple computing devices 600, or that a computing device cluster includes multiple computing devices having the same functionality as computing device 600A. Likewise, the functionality of computing device 600B may be performed by multiple computing devices 600, or a cluster of computing devices may include multiple computing devices having the same functionality as computing device 600B.
The embodiment of the application also provides another computing device cluster. The connections between computing devices in this computing device cluster may be similar to those of the computing device cluster described with reference to fig. 7 and fig. 8. The difference is that the memories 606 in one or more computing devices 600 in this computing device cluster may store different instructions for performing the service grid current limiting method. In some possible implementations, portions of the instructions for performing the service grid current limiting method may also be stored separately in the memories 606 of one or more computing devices 600 in the computing device cluster. In other words, a combination of one or more computing devices 600 may collectively execute the instructions for performing the service grid current limiting method.
Embodiments of the present application also provide a computer program product comprising instructions. The computer program product may be software or a program product containing instructions that is capable of running on a computing device or being stored in any available medium. When the computer program product runs on at least one computing device, the at least one computing device is caused to perform the service grid current limiting method.
The embodiment of the application also provides a computer-readable storage medium. The computer-readable storage medium may be any available medium that can be accessed by a computing device, or a data storage device, such as a data center, containing one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), a semiconductor medium (e.g., a solid state drive), or the like. The computer-readable storage medium includes instructions that instruct a computing device or a computing device cluster to perform the service grid current limiting method.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some technical features thereof can be replaced by equivalents; and these modifications or substitutions do not make the essence of the corresponding technical solutions depart from the protection scope of the technical solutions of the embodiments of the present invention.

Claims (18)

1. A service grid current limiting method, wherein the method is applied to an upstream container group Pod having an upstream-downstream correspondence, each container group Pod is provided with a sidecar container and an application program container, the application program container is used for processing service requests, and the sidecar container is used for controlling the traffic of the service requests, and the method comprises the following steps:
Under the current time window:
In the case that the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be invoked by a plurality of first service requests received by the upstream container group Pod, wherein the comprehensive grade of the downstream container group Pod is determined based on a preset static service grade of the downstream container group Pod and the capability of the downstream container group Pod to actually process the service requests;
And according to the comprehensive grade of each downstream container group Pod required to be called by the plurality of first service requests, partial service requests in the plurality of first service requests are issued to the corresponding downstream container group Pod, and the rest of service requests except the partial service requests in the plurality of first service requests are discarded.
2. The method of claim 1, wherein the capability of the downstream container group Pod to actually process service requests includes one or more of a service request drop rate threshold of the downstream container group Pod and an overload rate of the downstream container group Pod, the overload rate of the downstream container group Pod being positively correlated with a drop rate difference, the drop rate difference being a difference between a service request drop rate of the downstream container group Pod and the service request drop rate threshold.
3. The method of claim 2, wherein,
The comprehensive grade of the downstream container group Pod is positively correlated with the static service grade of the downstream container group Pod;
The comprehensive grade of the downstream container group Pod is inversely related to a service request discarding rate threshold value of the downstream container group Pod;
The overall level of the downstream group of containers Pod is inversely related to the overload rate of the downstream group of containers Pod.
4. A method according to any one of claims 1-3, wherein the method further comprises:
acquiring the resource utilization rate of the upstream container group Pod under the current time window;
And determining whether the upstream container group Pod is overloaded or not under the current window according to the resource utilization rate of the upstream container group Pod.
5. The method according to claim 4, wherein the resource usage of the upstream container group Pod comprises the usage of a processor and/or the usage of a memory in the upstream container group Pod;
the determining whether the upstream container group Pod is overloaded under the current window according to the resource utilization rate of the upstream container group Pod comprises:
When the utilization rate of the processor is greater than a set processor utilization rate overload threshold value and/or when the utilization rate of the memory is greater than a set memory utilization rate overload threshold value, determining that the upstream container group Pod is overloaded under the current window;
otherwise, determining that the upstream container group Pod is not overloaded under the current window.
6. The method according to any one of claims 1-5, wherein the issuing a part of service requests in the plurality of first service requests to the corresponding downstream container group Pod according to the comprehensive level of each downstream container group Pod required to be invoked by the plurality of first service requests, discarding the rest of service requests in the plurality of first service requests except the part of service requests, includes:
Determining a downstream container group Pod with the lowest comprehensive grade from all downstream container groups Pod required to be called by the plurality of first service requests as a first target container group Pod;
and sending the service requests for indicating to call the downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod in the plurality of first service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
7. The method of claim 6, wherein the method further comprises:
Under a first time window following the current time window:
Under the condition that the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of second service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod from all downstream container groups Pod required to be called by the plurality of second service requests as a first set;
determining a downstream container group Pod with the lowest comprehensive grade in the first set as a second target container group Pod;
and sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the second target container group Pod in the plurality of second service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
8. The method according to any one of claims 1 to 7, wherein the composite grade of the downstream group of containers Pod is represented by a numerical value.
9. A service grid current limiting apparatus, wherein the apparatus is applied to an upstream container group Pod having an upstream-downstream correspondence, each container group Pod is provided with a sidecar container and an application program container, the application program container is used for processing service requests, and the sidecar container is used for controlling the traffic of the service requests, and the apparatus comprises:
Under the current time window:
A comprehensive grade determining module, configured to determine, in a case where it is determined that the upstream container group Pod is overloaded, a comprehensive grade of each downstream container group Pod in at least one downstream container group Pod that needs to be invoked in the plurality of first service requests received by the upstream container group Pod, where the comprehensive grade of the downstream container group Pod is determined based on a preset static service grade of the downstream container group Pod and an ability of the downstream container group Pod to actually process the service requests;
And the communication module is used for issuing part of the service requests in the plurality of first service requests to the corresponding downstream container group Pod according to the comprehensive grade of each downstream container group Pod required to be invoked by the plurality of first service requests, and discarding the rest of the service requests except the part of the service requests in the plurality of first service requests.
10. The apparatus of claim 9, wherein the capability of the downstream container group Pod to actually process service requests includes one or more of a service request drop rate threshold of the downstream container group Pod and an overload rate of the downstream container group Pod, the overload rate of the downstream container group Pod being positively correlated with a drop rate difference, the drop rate difference being a difference between a service request drop rate of the downstream container group Pod and the service request drop rate threshold.
11. The apparatus of claim 10, wherein,
The comprehensive grade of the downstream container group Pod is positively correlated with the static service grade of the downstream container group Pod;
The comprehensive grade of the downstream container group Pod is inversely related to a service request discarding rate threshold value of the downstream container group Pod;
The overall level of the downstream group of containers Pod is inversely related to the overload rate of the downstream group of containers Pod.
12. The device according to any one of claims 9 to 11, wherein,
The acquisition module is used for acquiring the resource utilization rate of the upstream container group Pod under the current time window;
And the overload determining module is used for determining whether the upstream container group Pod is overloaded or not under the current window according to the resource utilization rate of the upstream container group Pod.
13. The apparatus according to claim 12, wherein the resource usage of the upstream container group Pod comprises usage of a processor and/or usage of a memory in the upstream container group Pod;
The overload determination module is configured to:
When the utilization rate of the processor is greater than a set processor utilization rate overload threshold value and/or when the utilization rate of the memory is greater than a set memory utilization rate overload threshold value, determining that the upstream container group Pod is overloaded under the current window;
otherwise, determining that the upstream container group Pod is not overloaded under the current window.
14. The device according to any one of claims 9 to 13, wherein,
The comprehensive grade determining module is used for determining a downstream container group Pod with the lowest comprehensive grade from all downstream container groups Pod required to be called by the plurality of first service requests as a first target container group Pod;
The communication module is configured to send a service request for indicating to call a downstream container group Pod with a comprehensive level higher than the first target container group Pod in the plurality of first service requests to a corresponding downstream container group Pod, and discard the rest of service requests.
15. The apparatus of claim 14, wherein the comprehensive grade determination module is configured to:
Under a first time window following the current time window:
Under the condition that the upstream container group Pod is overloaded, determining the comprehensive grade of each downstream container group Pod in at least one downstream container group Pod required to be called by a plurality of second service requests received by the upstream container group Pod;
Determining a downstream container group Pod with the comprehensive grade higher than that of the first target container group Pod from all downstream container groups Pod required to be called by the plurality of second service requests as a first set;
determining a downstream container group Pod with the lowest comprehensive grade in the first set as a second target container group Pod;
The communication module is used for:
and sending the service requests for indicating to call the downstream Pod with the comprehensive grade higher than that of the second target container group Pod in the plurality of second service requests to the corresponding downstream container group Pod, and discarding the rest of the service requests.
16. The apparatus according to any one of claims 9-15, wherein the composite grade of the downstream group of containers Pod is represented by a numerical value.
17. A cluster of computing devices, comprising at least one computing device, each computing device of the at least one computing device comprising a memory and a processor, the processor of the at least one computing device to execute instructions stored in the memory of the at least one computing device, such that the cluster of computing devices performs the method of any of claims 1-8.
18. A computer storage medium containing instructions which, when run in a cluster of computing devices, cause the cluster of computing devices to perform the method of any of claims 1 to 8.
Publications (1)

Publication Number Publication Date
CN118233401A true CN118233401A (en) 2024-06-21

