CN112540829A - Container group eviction method, device, node equipment and storage medium - Google Patents
Container group eviction method, device, node equipment and storage medium
- Publication number
- CN112540829A (application CN202011486963.9A)
- Authority
- CN
- China
- Prior art keywords
- container
- node
- target
- container group
- eviction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
- G06F2009/45595—Network integration; Enabling network access in virtual machine instances
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The application provides a container group eviction method, apparatus, node device, and storage medium, and relates to the technical field of containers. A target container group is determined among a plurality of evictable container groups, and when it is determined that the target container group can be scheduled to an available node device other than the target node device on which it is located, the target container group is determined as the container group to be evicted. Therefore, even after the target container group is evicted, it can be scheduled to another available node device to run and will not be evicted repeatedly, which improves the stability of the cluster system.
Description
Technical Field
The present application relates to the field of container technologies, and in particular, to a container group eviction method, an apparatus, a node device, and a storage medium.
Background
In a container cluster such as Kubernetes, container groups (pods) can be created on each node device (node) in the cluster, and services can be deployed in the pods to run, so that services are deployed rapidly.
A pod deployed on a node device consumes resources on that node device, such as memory, disk, and process identifiers (PIDs), and the resources of a single node device are generally limited. Therefore, when the remaining resources of a node device in the container cluster are insufficient, at least some of the pods on that node device can be evicted, so that the node device keeps enough resources for the remaining pods to run stably.
However, an evicted pod may not be schedulable to any other node device in the container cluster on which it can run stably. In that case the pod may be evicted repeatedly and never be successfully scheduled.
Disclosure of Invention
An object of the present application is to provide a container group eviction method, apparatus, node device, and storage medium, which can improve stability of a cluster system.
In order to achieve the purpose, the technical scheme adopted by the application is as follows:
In a first aspect, the present application provides a container group eviction method, the method comprising:
sequentially traversing a plurality of evictable container groups recorded in a container eviction list according to the eviction priority order of the container groups, and determining a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
judging whether the target container group can be scheduled to another available node device in the container cluster other than the target node device; wherein the container cluster includes a plurality of node devices, the target node device is the node device among the plurality of node devices that runs the target container group, and an available node device is a node device among the plurality of node devices that is capable of running the target container group;
and when the target container group can be scheduled to another available node device, determining the target container group as the container group to be evicted.
In a second aspect, the present application provides a container group eviction apparatus, the apparatus comprising:
a processing module, configured to sequentially traverse a plurality of evictable container groups recorded in a container eviction list according to the eviction priority order of the container groups, and determine a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
a judging module, configured to judge whether the target container group can be scheduled to another available node device in the container cluster other than the target node device; wherein the container cluster includes a plurality of node devices, the target node device is the node device among the plurality of node devices that runs the target container group, and an available node device is a node device among the plurality of node devices that is capable of running the target container group;
the processing module being further configured to determine the target container group as the container group to be evicted when the target container group can be scheduled to another available node device.
In a third aspect, the present application provides a node device comprising a memory for storing one or more programs, and a processor; the one or more programs, when executed by the processor, implement the container group eviction method described above.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the container group eviction method described above.
In the container group eviction method, apparatus, node device, and storage medium provided by the present application, a target container group is determined from a plurality of container groups that can be evicted, and when it is determined that the target container group can be scheduled to other available node devices except for a target node device where the target container group is located, the target container group is determined as a container group to be evicted; therefore, even if the target container group is evicted, the target container group can be scheduled to other available node devices to run, the target container group cannot be repeatedly evicted, and the stability of the cluster system is improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly explain the technical solutions of the present application, the drawings needed for the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also derive other related drawings from these drawings without inventive effort.
Fig. 1 shows an architectural diagram of a Kubernetes cluster.
Fig. 2 shows a schematic structural block diagram of a node device provided in the present application.
Fig. 3 shows a schematic flow chart of a container group eviction method provided in the present application.
Fig. 4 shows a schematic flow diagram of a container group eviction apparatus provided in the present application.
In the figures: 100, node device; 101, memory; 102, processor; 103, communication interface; 300, container group eviction apparatus; 301, processing module; 302, judging module.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the accompanying drawings in some embodiments of the present application, and it is obvious that the described embodiments are some, but not all embodiments of the present application. The components of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on a part of the embodiments in the present application without any creative effort belong to the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to Fig. 1, Fig. 1 is a schematic diagram of the architecture of a Kubernetes cluster. The Kubernetes cluster may include a cluster master node (master) and a plurality of node devices. The cluster master node may receive a creation request sent by a user through a client, create a new pod, and schedule the new pod to one of the node devices to run.
It should be understood that Fig. 1 is only an illustration in which the Kubernetes cluster includes three node devices. In other possible embodiments of the present application, the Kubernetes cluster may include more or fewer node devices; the present application does not limit the number of node devices included in the Kubernetes cluster.
While a pod runs on a node device, it uses hardware resources provided by the node device, such as memory, disk, and process PIDs, and the hardware resources of a single node device are generally limited. Therefore, when scheduling a newly created pod, the cluster master node may take into account the current resource usage of each node device in the Kubernetes cluster and schedule the pod to a node device that has spare hardware resources.
However, the hardware resources used by a pod generally change dynamically. As the amount of service interaction data handled in a pod increases, the hardware resources occupied by the pod may also increase, so that the remaining hardware resources of the node device where the pod is located become insufficient. The node device then cannot provide enough hardware resources for its pods, which may cause a pod or the node device to run abnormally or even crash.
Therefore, to keep node devices and pods running stably, in a scenario such as a Kubernetes cluster, the kubelet service running on a node device may enable a control unit named EvictionManager, which is responsible for monitoring the usage of hardware resources such as memory, disk, and process PIDs on the node device where the kubelet service is located.
When determining which pods need to be evicted, the EvictionManager generally sorts the pods by their respective eviction priorities and preferentially evicts the pod with the highest eviction priority.
For example, in some possible scenarios, EvictionManager's basis for ordering all pods may consider the following two dimensions:
(1) whether the pod to be evicted is a critical pod such as a management pod, a static pod, etc.;
(2) whether a pod to be evicted occupies relatively more recoverable resources than other pods on the node device.
That is, when ordering the pods running on the node device, the EvictionManager first excludes all critical pods and orders only the non-critical pods; the non-critical pods are then sorted by the hardware resources, such as memory and disk space, that each pod occupies, and pods occupying relatively more hardware resources are evicted first.
Thus, under the eviction strategy of the above example, the EvictionManager only needs to evict a small number of pods to release a large amount of hardware resources and ensure that the other pods running on the node device can run stably.
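The two-dimension ordering described above can be illustrated with a minimal sketch. The `Pod` fields, the pod names, and the choice of memory as the primary sort key with disk as a tie-breaker are assumptions made for illustration; they are not taken from the kubelet implementation.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    critical: bool    # True for critical pods such as management or static pods
    memory_mib: int   # memory currently occupied by the pod
    disk_mib: int     # disk space currently occupied by the pod


def rank_for_eviction(pods):
    """Drop critical pods, then order the rest by resources occupied, largest first."""
    candidates = [p for p in pods if not p.critical]
    # Assumed ordering key: memory first, disk space as a tie-breaker.
    return sorted(candidates, key=lambda p: (p.memory_mib, p.disk_mib), reverse=True)


pods = [
    Pod("static-agent", critical=True, memory_mib=900, disk_mib=10),
    Pod("web", critical=False, memory_mib=300, disk_mib=50),
    Pod("batch", critical=False, memory_mib=800, disk_mib=200),
]
ranked = [p.name for p in rank_for_eviction(pods)]
print(ranked)  # ['batch', 'web']
```

The critical pod never appears in the result even though it occupies the most memory, which matches the first dimension of the ordering.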
For an evicted pod, the cluster master node may invoke the scheduler to schedule the pod to another node device so that it continues to provide its service.
However, when the evicted pod is scheduled, the other node devices may be unable to provide enough hardware resources for it to run, and the pod can only be rescheduled to the node device on which it previously ran. If eviction then continues under the above policy, the pod may be evicted again, so that the same pod is repeatedly evicted from the same node device and can never be successfully scheduled.
For example, in the scenario shown in Fig. 1, when the EvictionManager running on node device 1 determines that the hardware resource usage on node device 1 exceeds a set threshold, it may evict some of the pods on node device 1, for example pod1. However, pod1 may not be schedulable to node device 2 or node device 3, for instance because it requires more node resources than they can offer, so after being evicted by the EvictionManager of node device 1, pod1 may be rescheduled to node device 1 by the cluster master node. Under the eviction policy described above, pod1 may then be evicted again by the EvictionManager of node device 1. This cycle repeats, and pod1 is repeatedly evicted by the EvictionManager of node device 1.
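The repeated-eviction cycle in this example can be reproduced with a toy simulation. The scheduler function, the memory figures, and the node names are hypothetical and only serve to show why pod1 keeps returning to node device 1.

```python
def schedule(pod_mib, free_mib_by_node):
    """Toy scheduler: return the first node with enough free memory, else None."""
    for node, free_mib in free_mib_by_node.items():
        if free_mib >= pod_mib:
            return node
    return None


# Assumed free memory per node device; pod1 needs more than node2 or node3 can offer.
free_mib_by_node = {"node1": 100, "node2": 300, "node3": 200}
pod1_mib = 1000

history = []
for _ in range(3):
    free_mib_by_node["node1"] += pod1_mib   # eviction releases pod1's memory on node1
    target = schedule(pod1_mib, free_mib_by_node)
    history.append(target)
    free_mib_by_node[target] -= pod1_mib    # pod1 runs again and consumes the memory
print(history)  # ['node1', 'node1', 'node1']
```

Each round ends in the state it started from, so node device 1 is again short of resources and the eviction is triggered once more.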
Therefore, to address at least some of the drawbacks of the above strategy, some possible embodiments provided by the present application are as follows: a target container group is determined from a plurality of evictable container groups, and when it is determined that the target container group can be scheduled to an available node device other than the target node device where it is located, the target container group is determined as the container group to be evicted. In this way, it can be ensured that the evicted target container group can be scheduled to another node device to run and is not evicted repeatedly.
Referring to fig. 2, fig. 2 shows a schematic block diagram of a node device 100 provided in the present application, and in some embodiments, the node device 100 may include a memory 101, a processor 102, and a communication interface 103, where the memory 101, the processor 102, and the communication interface 103 are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 101 may be used to store software programs and modules, such as the program instructions/modules corresponding to the container group eviction apparatus provided in the present application. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 101, thereby performing the steps of the container group eviction method provided in the present application. The communication interface 103 may be used for communicating signaling or data with other node devices.
The memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capabilities. The processor 102 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It will be appreciated that the configuration shown in fig. 2 is merely illustrative and that node apparatus 100 may also include more or fewer components than shown in fig. 2 or have a different configuration than shown in fig. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
The container group eviction method provided by the present application is exemplarily described below with the node device shown in Fig. 2 as an exemplary execution subject.
Referring to Fig. 3, Fig. 3 shows a schematic flow chart of a container group eviction method provided by the present application, which may include the following steps in some possible embodiments:
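Step 201: sequentially traverse the plurality of evictable container groups recorded in the container eviction list according to the eviction priority order of the container groups, and determine a target container group in the container eviction list.
Step 203: judge whether the target container group can be scheduled to another available node device in the container cluster other than the target node device.
Step 205: when the target container group can be scheduled to another available node device, determine the target container group as the container group to be evicted.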
In a scenario such as a Kubernetes cluster, the CRD (Custom Resource Definition) function provided by Kubernetes may be used to create a pre-eviction controller (PreEvictionController) service in advance on a node device in the Kubernetes cluster. The node device may execute the program instructions or function modules corresponding to the pre-eviction controller service to perform the steps of the container group eviction method provided by the present application and determine a pod to be evicted from among the pods run by the Kubernetes cluster.
It may be appreciated that, in some embodiments, the pre-eviction controller may be created on only one of the node devices in the Kubernetes cluster, and that controller maintains all the pods running in the Kubernetes cluster; or a pre-eviction controller may be created on every node device in the Kubernetes cluster, with the controller on each node device responsible for maintaining the pods running on that node device; or the pre-eviction controller may run on a node device outside the Kubernetes cluster and, by establishing communication with each node device in the Kubernetes cluster, maintain all the pods running in the cluster. The present application does not limit the implementation of the pre-eviction controller.
While executing the container group eviction method provided by the present application, the pre-eviction controller may receive a container eviction list sent by the kubelet's EvictionManager. The container eviction list may be generated by the EvictionManager and records all non-critical pods running on the node device, together with information such as the copy state and eviction priority of each non-critical pod.
When executing step 201, to avoid situations such as an application being unable to continue running after its single-copy pod is evicted, the pre-eviction controller may sequentially traverse the plurality of evictable pods recorded in the container eviction list according to the eviction priority order of the pods, and determine a multi-copy pod among them as the target pod.
That is, in the scheme provided in the present application, it may be specified that eviction is not performed for a single-copy pod, but only for a multi-copy pod.
Next, for a plurality of node devices included in the container cluster, the pre-eviction controller may take a node device running the target pod from the plurality of node devices as a target node device, and determine whether the target pod can be scheduled to another available node device in the container cluster other than the target node device by acquiring a remaining resource condition of each node device, where the available node device may be a node device capable of running the target pod from the plurality of node devices included in the container cluster; namely: the pre-eviction controller may first determine whether there is a node other than the target node device that can run the target pod, according to a remaining resource condition of each node device in the container cluster.
When the pre-eviction controller determines that the target pod can be scheduled to another available node device in the container cluster other than the target node device, it may determine the target pod as the pod to be evicted, and the EvictionManager then evicts that pod. Otherwise, when the pre-eviction controller determines that the target pod cannot be scheduled to any other available node device, it may return to step 201 and continue the container group eviction method provided by the present application with the next multi-copy pod as the target pod, until all multi-copy pods recorded in the container eviction list have been traversed.
For example, in the scenario of the above example, assume that pod1 and pod2 are both multi-copy pods and that pod2 requires fewer node resources than pod1, so that pod2 can run stably on node device 2 while pod1 cannot. Evicting pod1 under the eviction strategy of the earlier example may result in pod1 being evicted repeatedly. Under the eviction scheme provided by the present application, pod2 is evicted from node device 1 instead and can be stably scheduled to node device 2 to run, so repeated eviction is avoided.
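The traversal and the pod1/pod2 example can be sketched as follows. This is a simplified illustration, not the claimed implementation: the `Pod` fields, the use of memory as the only resource, and the concrete figures are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    node: str          # node device currently running the pod
    replicas: int      # number of copies of the pod in the cluster
    request_mib: int   # memory the pod needs in order to run


def select_pod_to_evict(eviction_list, free_mib_by_node):
    """Walk the list in eviction priority order (highest first) and return the
    first multi-copy pod that fits on some node device other than its own."""
    for pod in eviction_list:
        if pod.replicas < 2:
            continue  # single-copy pods are never selected
        for node, free_mib in free_mib_by_node.items():
            if node != pod.node and free_mib >= pod.request_mib:
                return pod
    return None


free_mib_by_node = {"node1": 0, "node2": 512, "node3": 256}
eviction_list = [
    Pod("pod1", node="node1", replicas=2, request_mib=1024),
    Pod("pod2", node="node1", replicas=2, request_mib=400),
]
chosen = select_pod_to_evict(eviction_list, free_mib_by_node)
print(chosen.name)  # pod2
```

Although pod1 comes first in the list, no other node device can hold it, so the sketch skips it and selects pod2, which fits on node2.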
It can be seen that, in the above-mentioned solution provided by the present application, a target container group is determined from a plurality of container groups that can be evicted, and when it is determined that the target container group can be scheduled to other available node devices except for a target node device where the target container group is located, the target container group is determined as a container group to be evicted; therefore, even if the target container group is evicted, the target container group can be scheduled to other available node devices to run, the target container group cannot be repeatedly evicted, and the stability of the cluster system is improved.
It should be noted that, in some possible embodiments, in the process of executing step 201, the pre-eviction controller may determine, according to the respective resource usage of each pod in operation, an eviction priority order corresponding to each pod; for example, for a hardware resource with a set dimension, a pod using a larger amount of the hardware resource is associated with a higher eviction priority.
In addition, in order to ensure the reliability of the target pod being dispatched to other available node devices after being evicted, the pre-eviction controller may first search for all available node devices except the target node device in the container cluster during the execution of step 203.
For example, in some embodiments, in order to flexibly find out the number of available node devices, the pre-eviction controller may create a node search policy corresponding to a target pod according to a resource condition required by the target pod; illustratively, the pre-eviction controller may take the resource condition required by the target pod as a node lookup policy; alternatively, the pre-eviction controller may superimpose a preset amount of container resources on the basis of the resources required by the target pod, and generate the node lookup policy.
The pre-eviction controller may then determine, based on the node lookup policy, a node device in the container cluster that matches the node lookup policy as an available node device.
For example, the pre-eviction controller may compare the resource condition in the node lookup policy with the remaining resource conditions of other node devices in the container cluster, and determine all node devices whose remaining resource conditions satisfy the node lookup policy as available node devices.
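A minimal sketch of building a node lookup policy and matching it against the remaining resources of each node device might look like this. The resource names (`memory_mib`, `disk_mib`), the margin value, and the dictionary layout are hypothetical.

```python
def build_lookup_policy(required, margin=None):
    """Resources required by the target pod plus an optional preset margin."""
    margin = margin or {}
    return {res: amount + margin.get(res, 0) for res, amount in required.items()}


def find_available_nodes(policy, remaining_by_node, target_node):
    """Node devices other than target_node whose remaining resources cover the policy."""
    return [
        node
        for node, remaining in remaining_by_node.items()
        if node != target_node
        and all(remaining.get(res, 0) >= need for res, need in policy.items())
    ]


policy = build_lookup_policy(
    {"memory_mib": 400, "disk_mib": 100}, margin={"memory_mib": 100}
)
remaining_by_node = {
    "node1": {"memory_mib": 2000, "disk_mib": 900},  # target node, always skipped
    "node2": {"memory_mib": 512, "disk_mib": 300},
    "node3": {"memory_mib": 450, "disk_mib": 300},
}
available = find_available_nodes(policy, remaining_by_node, target_node="node1")
print(policy)     # {'memory_mib': 500, 'disk_mib': 100}
print(available)  # ['node2']
```

With the margin applied, node3 no longer qualifies even though it could hold the pod's bare requirement, which is the effect of superimposing a preset amount of resources on the policy.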
Alternatively, the pre-eviction controller may send the node lookup policy to the cluster master node together with a matching request. In response to the matching request, the cluster master node may create a virtual mirror pod according to the resource requirements indicated by the node lookup policy, start the scheduler, and attempt to schedule the virtual mirror pod to each node device other than the target node device, thereby determining all node devices capable of stably running the virtual mirror pod as available node devices.
Of course, it is understood that all virtual mirror pods may be deleted after all available node devices are determined.
Next, the pre-eviction controller may compare the number of available node devices with a preset first threshold. When the number of available node devices reaches the first threshold, the pre-eviction controller may determine that the target pod can be scheduled to other available node devices; conversely, when the number of available node devices does not reach the first threshold, the pre-eviction controller may determine that the target pod cannot be scheduled to other available node devices.
For example, assuming that the first threshold is 3, when the pre-eviction controller determines that at least 3 available node devices other than the target node device in the container cluster are capable of running the target pod, the pre-eviction controller may determine that the target pod is capable of being scheduled to other available node devices; conversely, when the pre-eviction controller determines that there are less than 3 available node devices in the container cluster other than the target node device that can run the target pod, the pre-eviction controller may determine that the target pod cannot be scheduled to other available node devices.
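The threshold comparison in this example reduces to a single check, sketched here with a hypothetical function name and node names.

```python
def can_be_scheduled(available_nodes, first_threshold):
    """True when the number of available node devices reaches the first threshold."""
    return len(available_nodes) >= first_threshold


print(can_be_scheduled(["node2", "node3", "node4"], first_threshold=3))  # True
print(can_be_scheduled(["node2", "node3"], first_threshold=3))           # False
```

A threshold above 1 trades fewer evictions for a higher chance that the evicted pod actually lands on one of the candidate node devices.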
In addition, in implementations such as that described above, in determining available node devices using the created node lookup policy, the pre-eviction controller may traverse all node devices in the container cluster one by one based on the node lookup policy.
In some possible scenarios, some node devices in the container cluster may have no free node resources; that is, there may be node devices in the container cluster that cannot provide free node resources for the target pod to run.
Therefore, in order to increase the searching speed of the available node devices and avoid an invalid searching process, before the available node devices are matched based on the node searching policy, the pre-eviction controller may further determine all node devices currently having idle node resources in the container cluster.
For example, in some possible scenarios, assuming that a node device requests to evict a pod, it is characterized that the node device currently has no spare node resources to provide for a new pod; thus, prior to finding available node devices, the pre-eviction controller may determine, based on the node device currently requesting the eviction pod, other node devices in the container cluster than the node device currently requesting the eviction pod, as all node devices currently having free node resources.
Based on this, when matching available node devices based on the node lookup policy, the pre-eviction controller may determine, as the available node devices, those node devices that match the node lookup policy among the node devices currently having idle node resources, rather than matching against every node device in the container cluster, which speeds up the search for available node devices.
Of course, it is understood that the above is only an example, illustrating that in some possible scenarios a node device requesting to evict a pod is regarded as a node device that currently has no free node resources; in some other possible scenarios, the node devices that currently have no idle node resources may be determined from the state identification information or the resource utilization of each node device, which is not limited in the present application.
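The pre-filtering step can be illustrated with a short sketch. The function names and the `matches_policy` callback are hypothetical; the sketch only assumes, as in the example above, that a node device currently requesting an eviction has no idle node resources.

```python
def nodes_with_idle_resources(all_nodes, nodes_requesting_eviction):
    """Keep every node that is not currently requesting to evict a pod."""
    requesting = set(nodes_requesting_eviction)
    return [n for n in all_nodes if n not in requesting]


def match_available_nodes(candidates, matches_policy):
    """Apply the node lookup policy only to the pre-filtered candidates."""
    return [n for n in candidates if matches_policy(n)]


all_nodes = ["node-a", "node-b", "node-c", "node-d"]
requesting_eviction = ["node-a", "node-d"]

checked = []  # records which nodes the policy was actually evaluated against


def matches_policy(node_name):
    # Stand-in for the real policy match; here only node-b is assumed to fit.
    checked.append(node_name)
    return node_name == "node-b"


candidates = nodes_with_idle_resources(all_nodes, requesting_eviction)
available = match_available_nodes(candidates, matches_policy)
```

In this example the policy is evaluated against two node devices instead of four, which is the saving the pre-filter is meant to provide.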
In addition, in some possible scenarios, when the pre-eviction controller finishes traversing the container eviction list, it may still have found no target pod that can be scheduled to other available node devices, that is, the container eviction list contains no multi-copy pod that can be scheduled to other available node devices. In this case, the pre-eviction controller may determine the multi-copy pod with the highest eviction priority in the container eviction list as the pod to be evicted. In this way, even if the evicted pod cannot be scheduled to another node device for a while, node resources are released as far as possible while the other copies of the pod continue to provide service, thereby avoiding the service interruption that evicting a single-copy pod would cause.
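The traversal with its fallback might look like the sketch below. It assumes the eviction list is already ordered from highest to lowest eviction priority, models a multi-copy pod as one whose owning object has more than one replica, and returns `None` when the list holds no multi-copy pod at all; that last case is not addressed in the text and is an assumption of this example, as are all names used.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Pod:
    name: str
    replicas: int  # number of copies owned by the pod's target object


def select_pod_to_evict(
    eviction_list: Sequence[Pod],
    can_be_rescheduled: Callable[[Pod], bool],
) -> Optional[Pod]:
    """Pick the pod to evict from a list sorted by descending eviction priority."""
    multi_copy = [p for p in eviction_list if p.replicas > 1]
    # Prefer the first multi-copy pod that other node devices could run.
    for pod in multi_copy:
        if can_be_rescheduled(pod):
            return pod
    # Fallback: the highest-priority multi-copy pod, even if it cannot move yet.
    return multi_copy[0] if multi_copy else None


eviction_list = [Pod("db", 1), Pod("web", 3), Pod("cache", 2)]
movable = select_pod_to_evict(eviction_list, lambda p: p.name == "cache")
fallback = select_pod_to_evict(eviction_list, lambda p: False)
```

Here `movable` is the `cache` pod because it is the first multi-copy pod that can be rescheduled, while `fallback` is the `web` pod, the highest-priority multi-copy pod; the single-copy `db` pod is never selected.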
Also, in some possible scenarios, once the pre-eviction controller has determined the pod to be evicted, that pod may be evicted by the EvictionManager running on the target node device. Additionally, the pre-eviction controller may configure the state of the target node device as a prohibited scheduling state, indicating that the target node device is not to be used to create a new pod. That is, because the target node device has already evicted a pod and its node resources are currently insufficient, its state is configured as the prohibited scheduling state to prevent the evicted pod from being rescheduled to the target node device, thereby reducing repeated eviction of the pod.
In addition, in some possible scenarios, after the EvictionManager evicts the pod to be evicted, the pre-eviction controller may monitor the container state of the target object to which that pod belongs, where the container state indicates the copy state of the target object, that is, the number of copy pods the target object currently owns.
Based on this, when the pre-eviction controller determines that the container state of the target object satisfies the set multi-copy condition, for example, when the number of pods belonging to the target object reaches the second threshold, the pre-eviction controller may configure the state of the target node device as a permitted scheduling state, indicating that the target node device may again be used to create a new pod. That is, when the container state of the target object satisfies the set multi-copy condition, the previously evicted pod has already been scheduled to another node device and will not be scheduled to the target node device again, so repeated eviction is not caused.
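The switch between the prohibited and permitted scheduling states can be sketched as a small state holder. The `NodeState` class and both handler functions are hypothetical names for this example; in a Kubernetes cluster, which the pod and EvictionManager terminology suggests but the text does not mandate, the comparable mechanism would be cordoning and uncordoning a node through its `spec.unschedulable` field.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    schedulable: bool = True


def on_pod_evicted(node: NodeState) -> None:
    """After an eviction, stop new pods from being created on the node."""
    node.schedulable = False


def on_replica_count_observed(
    node: NodeState, current_replicas: int, second_threshold: int
) -> bool:
    """Re-enable scheduling once the target object has enough copies again."""
    if current_replicas >= second_threshold:
        node.schedulable = True
    return node.schedulable


target = NodeState("node-a")
on_pod_evicted(target)
still_blocked = on_replica_count_observed(target, current_replicas=2, second_threshold=3)
restored = on_replica_count_observed(target, current_replicas=3, second_threshold=3)
```

The node stays unschedulable while only 2 of the 3 required copies exist and becomes schedulable again once the third copy is observed, by which point the evicted pod has already been placed elsewhere.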
In addition, based on the same inventive concept as the container group eviction method provided in the present application, as shown in FIG. 4, the present application also provides a container group eviction device 300. In some embodiments, the container group eviction device 300 may include a processing module 301 and a determining module 302, wherein:
a processing module 301, configured to sequentially traverse a plurality of evictable container groups recorded in a container eviction list according to the eviction priority order corresponding to each container group, and determine a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
a determining module 302, configured to determine whether the target container group can be scheduled to another available node device in the container cluster except the target node device; the container cluster comprises a plurality of node devices, a target node device is a node device which runs a target container group in the plurality of node devices, and an available node device is a node device which can run the target container group in the plurality of node devices;
the processing module 301 is further configured to determine the target container group as a to-be-evicted container group when the target container group can be scheduled to other available node devices.
Optionally, in some possible embodiments, when determining whether the target container group can be scheduled to another available node device in the container cluster other than the target node device, the determining module 302 is specifically configured to:
searching for all available node devices in the container cluster other than the target node device;
when the number of available node devices reaches a first threshold, determining that the target container group can be scheduled to other available node devices;
when the number of available node devices does not reach the first threshold, determining that the target container group cannot be scheduled to other available node devices.
Optionally, in some possible embodiments, when searching for available node devices in the container cluster other than the target node device, the determining module 302 is specifically configured to:
creating a node lookup policy corresponding to the target container group;
and determining the node devices in the container cluster that match the node lookup policy as available node devices.
Optionally, in some possible embodiments, before determining the node devices in the container cluster that match the node lookup policy as available node devices, the determining module 302 is further configured to:
determining all node devices in the container cluster that currently have idle node resources;
when determining the node devices in the container cluster that match the node lookup policy as available node devices, the determining module 302 is specifically configured to:
determining, as available node devices, the node devices that match the node lookup policy among all node devices that currently have idle node resources.
Optionally, in some possible embodiments, when determining all node devices in the container cluster that currently have idle node resources, the determining module 302 is specifically configured to:
determining the node devices in the container cluster other than the node device currently requesting to evict a container group as all node devices that currently have idle node resources.
Optionally, in some possible embodiments, the processing module 301 is further configured to:
when no target container group that can be scheduled to other available node devices exists in the container eviction list, determining the multi-copy container group with the highest eviction priority in the container eviction list as the container group to be evicted.
Optionally, in some possible embodiments, the processing module 301 is further configured to:
configuring the state of the target node device as a prohibited scheduling state; wherein the prohibited scheduling state indicates that the corresponding node device is not to be used to create a new container group.
Optionally, in some possible embodiments, the processing module 301 is further configured to:
monitoring the container state of the target object to which the container group to be evicted belongs;
when the container state of the target object satisfies the set multi-copy condition, configuring the state of the target node device as a permitted scheduling state; wherein the permitted scheduling state indicates that the corresponding node device is permitted to be used to create a new container group.
Optionally, in some possible embodiments, the processing module 301, when determining that the container status of the target object satisfies the set multi-copy condition, is specifically configured to:
and when the number of the container groups belonging to the target object reaches a second threshold value, determining that the container state of the target object meets the set multi-copy condition.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to some embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in some embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the method according to some embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The above description is only a few examples of the present application and is not intended to limit the present application, and those skilled in the art will appreciate that various modifications and variations can be made in the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (12)
1. A container group eviction method, the method comprising:
sequentially traversing a plurality of evictable container groups recorded in a container eviction list according to the eviction priority order corresponding to each container group, and determining a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
determining whether the target container group can be scheduled to other available node devices in a container cluster other than a target node device; wherein the container cluster includes a plurality of node devices, the target node device is the node device, among the plurality of node devices, that runs the target container group, and an available node device is a node device, among the plurality of node devices, that can run the target container group;
and when the target container group can be scheduled to other available node devices, determining the target container group as a container group to be evicted.
2. The method of claim 1, wherein said determining whether the target container group can be scheduled to other available node devices in the container cluster other than the target node device comprises:
searching for all available node devices in the container cluster other than the target node device;
when the number of available node devices reaches a first threshold, determining that the target container group can be scheduled to other available node devices;
when the number of available node devices does not reach the first threshold, determining that the target container group cannot be scheduled to other available node devices.
3. The method of claim 2, wherein said searching for available node devices in the container cluster other than the target node device comprises:
creating a node lookup policy corresponding to the target container group;
and determining the node devices in the container cluster that match the node lookup policy as available node devices.
4. The method of claim 3, wherein prior to said determining the node devices in the container cluster that match the node lookup policy as available node devices, the method further comprises:
determining all node devices in the container cluster that currently have idle node resources;
said determining the node devices in the container cluster that match the node lookup policy as available node devices comprises:
determining, as available node devices, the node devices that match the node lookup policy among all node devices that currently have idle node resources.
5. The method of claim 4, wherein said determining all node devices in the container cluster that currently have idle node resources comprises:
determining the node devices in the container cluster other than the node device currently requesting to evict a container group as all node devices that currently have idle node resources.
6. The method of any one of claims 1-5, further comprising:
when no target container group that can be scheduled to other available node devices exists in the container eviction list, determining the multi-copy container group with the highest eviction priority in the container eviction list as the container group to be evicted.
7. The method of any one of claims 1-5, further comprising:
configuring the state of the target node device as a prohibited scheduling state; wherein the prohibited scheduling state indicates that the corresponding node device is not to be used to create a new container group.
8. The method of claim 7, wherein the method further comprises:
monitoring the container state of the target object to which the container group to be evicted belongs;
when the container state of the target object satisfies the set multi-copy condition, configuring the state of the target node device as a permitted scheduling state; wherein the permitted scheduling state indicates that the corresponding node device is permitted to be used to create a new container group.
9. The method of claim 8, wherein the container status of the target object satisfies a set multi-copy condition, comprising:
and when the number of the container groups belonging to the target object reaches a second threshold value, determining that the container state of the target object meets the set multi-copy condition.
10. A container group eviction device, the device comprising:
a processing module, configured to sequentially traverse a plurality of evictable container groups recorded in a container eviction list according to the eviction priority order corresponding to each container group, and determine a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
a determining module, configured to determine whether the target container group can be scheduled to other available node devices in a container cluster other than a target node device; wherein the container cluster includes a plurality of node devices, the target node device is the node device, among the plurality of node devices, that runs the target container group, and an available node device is a node device, among the plurality of node devices, that can run the target container group;
the processing module is further configured to determine the target container group as a container group to be evicted when the target container group can be scheduled to other available node devices.
11. A node apparatus, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-9.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011486963.9A CN112540829B (en) | 2020-12-16 | 2020-12-16 | Container group eviction method, device, node equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112540829A (en) | 2021-03-23 |
CN112540829B (en) | 2024-10-18 |
Family
ID=75018969
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011486963.9A Active CN112540829B (en) | 2020-12-16 | 2020-12-16 | Container group eviction method, device, node equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112540829B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113297031A (en) * | 2021-05-08 | 2021-08-24 | 阿里巴巴新加坡控股有限公司 | Container group protection method and device in container cluster |
CN113835840A (en) * | 2021-09-28 | 2021-12-24 | 广东浪潮智慧计算技术有限公司 | Cluster resource management method, device and equipment and readable storage medium |
CN117331650A (en) * | 2023-10-31 | 2024-01-02 | 中科驭数(北京)科技有限公司 | Container set scheduling method, device, equipment and storage medium |
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200334075A1 (en) * | 2016-04-12 | 2020-10-22 | Telefonaktiebolaget Lm Ericsson (Publ) | Process scheduling in a processing system having at least one processor and shared hardware resources |
US20180349174A1 (en) * | 2017-05-30 | 2018-12-06 | Red Hat, Inc. | Fast and greedy scheduling machine based on a distance matrix |
US20200034254A1 (en) * | 2018-07-30 | 2020-01-30 | EMC IP Holding Company LLC | Seamless mobility for kubernetes based stateful pods using moving target defense |
CN111198745A (en) * | 2018-11-16 | 2020-05-26 | 北京京东尚科信息技术有限公司 | Scheduling method, device, medium and electronic equipment for container creation |
CN109558260A (en) * | 2018-11-20 | 2019-04-02 | 北京京东尚科信息技术有限公司 | Kubernetes troubleshooting system, method, equipment and medium |
CN111767113A (en) * | 2019-04-01 | 2020-10-13 | 北京沃东天骏信息技术有限公司 | Method and device for realizing container eviction |
CN110096336A (en) * | 2019-04-29 | 2019-08-06 | 江苏满运软件科技有限公司 | Data monitoring method, device, equipment and medium |
CN110427249A (en) * | 2019-07-26 | 2019-11-08 | 重庆紫光华山智安科技有限公司 | Method for allocating tasks, pod initial method and relevant apparatus |
CN110515730A (en) * | 2019-08-22 | 2019-11-29 | 北京宝兰德软件股份有限公司 | Resource secondary dispatching method and device based on kubernetes container arranging system |
CN110727512A (en) * | 2019-09-30 | 2020-01-24 | 星环信息科技(上海)有限公司 | Cluster resource scheduling method, device, equipment and storage medium |
CN111104227A (en) * | 2019-12-28 | 2020-05-05 | 北京浪潮数据技术有限公司 | Resource control method and device of K8s platform and related components |
CN111314450A (en) * | 2020-02-06 | 2020-06-19 | 恒生电子股份有限公司 | Data transmission method and device, electronic equipment and computer storage medium |
CN111694633A (en) * | 2020-04-14 | 2020-09-22 | 新华三大数据技术有限公司 | Cluster node load balancing method and device and computer storage medium |
CN111522639A (en) * | 2020-04-16 | 2020-08-11 | 南京邮电大学 | Multidimensional resource scheduling method under Kubernetes cluster architecture system |
CN111464659A (en) * | 2020-04-27 | 2020-07-28 | 广州虎牙科技有限公司 | Node scheduling method, node pre-selection processing method, device, equipment and medium |
CN111930468A (en) * | 2020-07-13 | 2020-11-13 | 苏州浪潮智能科技有限公司 | Method, system and device for expelling container group |
Non-Patent Citations (3)
Title |
---|
JOSHUAANDREW: "Container Scheduling", Retrieved from the Internet <URL:CSDN,https://blog.csdn.net/chweiweich/article/details/53244965> * |
ZHANG D,ET AL.: "Container oriented job scheduling using linear programming model", 《IEEE》 * |
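HUANG Tao: "Research on Kubernetes-Based Container Cloud Scheduling Algorithms", 《China Master's Theses Full-text Database, Information Science and Technology》 * |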
Also Published As
Publication number | Publication date |
---|---|
CN112540829B (en) | 2024-10-18 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN112540829B (en) | Container group eviction method, device, node equipment and storage medium | |
CN105700939B (en) | The method and system of Multi-thread synchronization in a kind of distributed system | |
CN106919445B (en) | Method and device for scheduling containers in cluster in parallel | |
US7810098B2 (en) | Allocating resources across multiple nodes in a hierarchical data processing system according to a decentralized policy | |
US8458712B2 (en) | System and method for multi-level preemption scheduling in high performance processing | |
CN110231991B (en) | Task allocation method and device, electronic equipment and readable storage medium | |
CN112214288B (en) | Pod scheduling method, device, equipment and medium based on Kubernetes cluster | |
CN108616424B (en) | Resource scheduling method, computer equipment and system | |
US7920282B2 (en) | Job preempt set generation for resource management | |
KR102398076B1 (en) | Apparatus and method for distributing and storing data | |
CN108509280B (en) | Distributed computing cluster locality scheduling method based on push model | |
CN112527490B (en) | Node resource management and control method and device, electronic equipment and storage medium | |
US9384050B2 (en) | Scheduling method and scheduling system for multi-core processor system | |
US20150365474A1 (en) | Computer-readable recording medium, task assignment method, and task assignment apparatus | |
US20130326528A1 (en) | Resource starvation management in a computer system | |
CN116450328A (en) | Memory allocation method, memory allocation device, computer equipment and storage medium | |
CN113608896B (en) | Method, system, medium and terminal for dynamically switching data streams | |
CN114675954A (en) | Task scheduling method and device | |
CN114443302A (en) | Container cluster capacity expansion method, system, terminal and storage medium | |
CN109240829B (en) | Method and device for applying for exchanging chip and managing exclusive resource | |
CN112612606A (en) | Message theme processing method and device, computer equipment and readable storage medium | |
CN114077493A (en) | Resource allocation method and related equipment | |
CN113703930A (en) | Task scheduling method, device and system and computer readable storage medium | |
CN113127289A (en) | Resource management method based on YARN cluster, computer equipment and storage medium | |
CN111339132A (en) | Data query method and database agent |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||