
CN112540829A - Container group eviction method, device, node equipment and storage medium - Google Patents

Container group eviction method, device, node equipment and storage medium

Info

Publication number
CN112540829A
Authority
CN
China
Prior art keywords
container
node
target
container group
eviction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011486963.9A
Other languages
Chinese (zh)
Other versions
CN112540829B (en
Inventor
王朱珍
李俊
姜泽涛
廖林荣
吴晓云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hundsun Technologies Inc
Original Assignee
Hundsun Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hundsun Technologies Inc
Priority to CN202011486963.9A
Publication of CN112540829A
Application granted
Publication of CN112540829B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a container group eviction method, an apparatus, a node device, and a storage medium, and relates to the technical field of containers. A target container group is determined among a plurality of evictable container groups, and when it is determined that the target container group can be scheduled to an available node device other than the target node device on which it is located, the target container group is determined to be the container group to be evicted. Therefore, after the target container group is evicted, it can be scheduled to another available node device to run and will not be repeatedly evicted, which improves the stability of the cluster system.

Description

Container group eviction method, device, node equipment and storage medium
Technical Field
The present application relates to the field of container technologies, and in particular, to a container group eviction method, an apparatus, a node device, and a storage medium.
Background
In a container cluster such as Kubernetes, container groups (pods) can be created on each node device (node) in the cluster, so that services can be deployed and run in pods, achieving rapid service deployment.
A pod deployed on a node device consumes resources on that node device, such as memory, disk, and process identifiers (PIDs), and the resources of a single node device are generally limited. Therefore, when the remaining resources of a node device in the container cluster are insufficient, at least some of the pods on that node device can be evicted, so that the node device retains enough resources for pods to run stably.
However, an evicted pod may not be schedulable to any other node device in the container cluster on which it could run stably; in that case the pod may be evicted repeatedly and never be successfully scheduled.
Disclosure of Invention
An object of the present application is to provide a container group eviction method, apparatus, node device, and storage medium, which can improve stability of a cluster system.
In order to achieve the purpose, the technical scheme adopted by the application is as follows:
In a first aspect, the present application provides a container group eviction method, the method comprising:
traversing, in order of the eviction priority corresponding to each container group, a plurality of evictable container groups recorded in a container eviction list, and determining a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
determining whether the target container group can be scheduled to an available node device other than a target node device in a container cluster; wherein the container cluster includes a plurality of node devices, the target node device is the node device among the plurality of node devices that runs the target container group, and an available node device is a node device among the plurality of node devices that is capable of running the target container group;
and when the target container group can be scheduled to another available node device, determining the target container group as the container group to be evicted.
In a second aspect, the present application provides a container group eviction apparatus, the apparatus comprising:
a processing module, configured to traverse, in order of the eviction priority corresponding to each container group, a plurality of evictable container groups recorded in a container eviction list, and determine a target container group in the container eviction list; wherein the target container group is a multi-copy container group among the plurality of evictable container groups;
a judging module, configured to determine whether the target container group can be scheduled to an available node device other than a target node device in a container cluster; wherein the container cluster includes a plurality of node devices, the target node device is the node device among the plurality of node devices that runs the target container group, and an available node device is a node device among the plurality of node devices that is capable of running the target container group;
the processing module being further configured to determine the target container group as the container group to be evicted when the target container group can be scheduled to another available node device.
In a third aspect, the present application provides a node device comprising a memory for storing one or more programs, and a processor; the one or more programs, when executed by the processor, implement the container group eviction method described above.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the container group eviction method described above.
In the container group eviction method, apparatus, node device, and storage medium provided by the present application, a target container group is determined from a plurality of evictable container groups, and when it is determined that the target container group can be scheduled to an available node device other than the target node device on which it is located, the target container group is determined as the container group to be evicted. Therefore, after the target container group is evicted, it can be scheduled to another available node device to run and will not be repeatedly evicted, which improves the stability of the cluster system.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to explain the technical solutions of the present application more clearly, the drawings needed for the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be regarded as limiting its scope; those skilled in the art can derive other related drawings from these drawings without inventive effort.
Fig. 1 shows an architectural diagram of a Kubernetes cluster.
Fig. 2 shows a schematic structural block diagram of a node device provided in the present application.
Fig. 3 shows a schematic flow chart of a container group eviction method provided in the present application.
Fig. 4 shows a schematic flow diagram of a container group eviction apparatus provided in the present application.
In the figures: 100 - node device; 101 - memory; 102 - processor; 103 - communication interface; 300 - container group eviction apparatus; 301 - processing module; 302 - judging module.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the accompanying drawings in some embodiments of the present application, and it is obvious that the described embodiments are some, but not all embodiments of the present application. The components of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on a part of the embodiments in the present application without any creative effort belong to the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating the architecture of a Kubernetes cluster. The Kubernetes cluster may include a cluster master node (master) and a plurality of node devices; the cluster master node may receive a creation request sent by a user through a client, create a new pod, and schedule the new pod to one of the node devices to run.
It should be understood that fig. 1 is only an illustration in which the Kubernetes cluster includes three node devices. In other possible embodiments of the present application, the Kubernetes cluster may include more or fewer node devices; the present application does not limit the number of node devices included in the Kubernetes cluster.
While a pod runs on a node device, it uses hardware resources provided by that node device, such as memory, disk, and process PIDs, and the hardware resources of a single node device are generally limited. Therefore, when scheduling a newly created pod, the cluster master node may take into account the current resource usage of each node device in the Kubernetes cluster and schedule the pod to a node device with spare hardware resources.
However, the hardware resources used by a pod generally change dynamically. As the volume of service data handled in a pod grows, the hardware resources occupied by the pod may also grow, so that the node device hosting the pod runs short of remaining hardware resources and cannot provide enough for the pod to use, causing the pod or the node device to run abnormally or even crash.
Therefore, to keep node devices and pods running stably, in a scenario such as a Kubernetes cluster, the kubelet service running on a node device may enable a control unit named EvictionManager. The EvictionManager is responsible for monitoring the usage of hardware resources such as memory, disk, and process PIDs on the node device where the kubelet service is located, and may evict some of the pods on that node device when the usage exceeds a set threshold.
When determining which pods need to be evicted, the EvictionManager generally sorts the pods by their respective eviction priorities and evicts the pod with the highest eviction priority first.
For example, in some possible scenarios, the EvictionManager may order the pods along the following two dimensions:
(1) whether the pod to be evicted is a critical pod such as a management pod, a static pod, etc.;
(2) whether a pod to be evicted occupies relatively more recoverable resources than other pods on the node device.
That is, when ordering the pods running on the node device, the EvictionManager first excludes all critical pods running on the node device and orders only the non-critical pods; the non-critical pods are then sorted by the hardware resources, such as memory and disk space, that each of them occupies, and the pods occupying relatively more hardware resources are evicted first.
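The ordering described above can be illustrated with a minimal sketch. It is not the kubelet's actual implementation; the `PodUsage` record, its field names, and the choice to compare memory first and disk second are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodUsage:
    """Hypothetical per-pod record; field names are illustrative only."""
    name: str
    critical: bool    # e.g. a management pod or a static pod
    memory_mib: int   # memory currently occupied by the pod
    disk_mib: int     # disk space currently occupied by the pod


def rank_for_eviction(pods):
    """Return the non-critical pods, heaviest resource consumers first."""
    candidates = [p for p in pods if not p.critical]
    # Assumption: memory is compared first and disk usage breaks ties.
    return sorted(candidates, key=lambda p: (p.memory_mib, p.disk_mib), reverse=True)


pods = [
    PodUsage("kube-proxy", critical=True, memory_mib=900, disk_mib=10),
    PodUsage("web", critical=False, memory_mib=300, disk_mib=50),
    PodUsage("batch", critical=False, memory_mib=700, disk_mib=20),
]
ranked = rank_for_eviction(pods)
print([p.name for p in ranked])
```

The critical pod never appears in the result even though it uses the most memory, which mirrors the exclusion step in the description.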
Thus, under the eviction policy of the above example, the EvictionManager needs to evict only a small number of pods to release a large amount of hardware resources, ensuring that the other pods running on the node device can run stably.
For an evicted pod, the cluster master node may invoke the scheduler to schedule the pod to another node device to run, so that it continues to provide service.
However, when the evicted pod is being scheduled to other node devices, those node devices may be unable to provide enough hardware resources for it to run, so the evicted pod can only be rescheduled to the node device on which it previously ran. When eviction then continues under the above policy, the pod may be evicted again, so that the same pod is repeatedly evicted on the same node device and cannot be successfully scheduled.
For example, in the scenario shown in fig. 1, when the EvictionManager running on node device 1 determines that hardware resource usage on node device 1 exceeds a set threshold, it may evict some of the pods on node device 1, for example pod1. However, pod1 may not be schedulable to node device 2 or node device 3, for instance because it requires more node resources, so after being evicted by the EvictionManager of node device 1 it may be rescheduled to node device 1 by the cluster master node. Under the eviction policy described above, pod1 may then be evicted again by the EvictionManager of node device 1; this repeats, and pod1 is repeatedly evicted by the EvictionManager of node device 1.
Therefore, to address at least some of the drawbacks of the above policy, some possible embodiments provided by the present application determine a target container group from a plurality of evictable container groups, and determine the target container group as the container group to be evicted when it is determined that the target container group can be scheduled to an available node device other than the target node device on which it is located. In this way, it can be ensured that the evicted target container group can be scheduled to and run on another node device without being repeatedly evicted.
Referring to fig. 2, fig. 2 shows a schematic block diagram of a node device 100 provided in the present application, and in some embodiments, the node device 100 may include a memory 101, a processor 102, and a communication interface 103, where the memory 101, the processor 102, and the communication interface 103 are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 101 may be used to store software programs and modules, such as program instructions/modules corresponding to the container group eviction apparatus provided in the present application, and the processor 102 executes various functional applications and data processing by executing the software programs and modules stored in the memory 101, so as to execute the steps of the container group eviction method provided in the present application. The communication interface 103 may be used for communicating signaling or data with other node devices.
The memory 101 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 102 may be an integrated circuit chip having signal processing capabilities. The processor 102 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It will be appreciated that the configuration shown in fig. 2 is merely illustrative and that the node device 100 may also include more or fewer components than shown in fig. 2 or have a different configuration than shown in fig. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
The container group eviction method provided by the present application is exemplarily described below with the node device shown in fig. 2 as an exemplary execution subject.
Referring to fig. 3, fig. 3 shows a schematic flow chart of a container group eviction method provided by the present application, which may include the following steps in some possible embodiments:
Step 201, traversing, in order of the eviction priority corresponding to each container group, a plurality of evictable container groups recorded in the container eviction list, and determining a target container group in the container eviction list.
Step 203, determining whether the target container group can be scheduled to an available node device other than the target node device in the container cluster; if so, performing step 205; if not, returning to step 201.
Step 205, determining the target container group as the container group to be evicted.
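The three steps can be sketched as a single selection loop. This is a minimal illustration under stated assumptions rather than the patented implementation: the `EvictablePod` record, the convention that a larger `eviction_priority` value means earlier eviction, and the `can_schedule_elsewhere` callback standing in for step 203 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class EvictablePod:
    """Hypothetical entry of the container eviction list."""
    name: str
    copies: int             # number of running copies of this pod
    eviction_priority: int  # assumption: larger value = evicted sooner


def select_pod_to_evict(
    eviction_list: Iterable[EvictablePod],
    can_schedule_elsewhere: Callable[[EvictablePod], bool],
) -> Optional[EvictablePod]:
    """Return the first multi-copy pod that another node device can run."""
    ordered = sorted(eviction_list, key=lambda p: p.eviction_priority, reverse=True)
    for pod in ordered:
        if pod.copies < 2:
            continue                     # step 201: only multi-copy pods are targets
        if can_schedule_elsewhere(pod):  # step 203
            return pod                   # step 205
    return None                          # nothing on the list can be evicted safely


eviction_list = [
    EvictablePod("pod1", copies=3, eviction_priority=30),
    EvictablePod("pod2", copies=2, eviction_priority=20),
    EvictablePod("pod3", copies=1, eviction_priority=40),
]
# Stand-in for step 203: pretend only pod2 fits on another node device.
chosen = select_pod_to_evict(eviction_list, lambda pod: pod.name == "pod2")
print(chosen.name if chosen else None)
```

Passing the schedulability check in as a callback keeps the loop independent of how available node devices are actually found, which the description covers separately.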
In a scenario such as a Kubernetes cluster, the CRD (Custom Resource Definition) mechanism provided by Kubernetes may be used to create, in advance, a pre-eviction controller (PreEvictionController) service on a node device in the Kubernetes cluster. The node device may execute the program instructions or functional modules corresponding to the pre-eviction controller service to perform the steps of the container group eviction method provided by the present application, and determine the pod to be evicted from among all pods run by the Kubernetes cluster.
It may be appreciated that, in some embodiments, the pre-eviction controller may be created on only one of the node devices in the Kubernetes cluster, and that single pre-eviction controller maintains all pods running in the cluster. Alternatively, a pre-eviction controller may be created on every node device in the Kubernetes cluster, with the pre-eviction controller on each node device responsible for maintaining the pods running on that node device. Alternatively, the pre-eviction controller may run on a node device outside the Kubernetes cluster and, by establishing communication with each node device in the cluster, maintain all pods running in the cluster. The present application does not limit how the pre-eviction controller is implemented.
While executing the container group eviction method provided by the present application, the pre-eviction controller may receive a container eviction list sent by the kubelet's EvictionManager. The container eviction list may be generated by the EvictionManager, and records all non-critical pods running on the node device together with information such as the copy state and eviction priority of each non-critical pod.
In step 201, to avoid situations such as an application being unable to continue running after its single-copy pod is evicted, the pre-eviction controller may traverse the evictable pods recorded in the container eviction list in order of their eviction priorities, and determine a multi-copy pod among them as the target pod.
That is, in the scheme provided in the present application, it may be specified that eviction is not performed for a single-copy pod, but only for a multi-copy pod.
Next, among the plurality of node devices included in the container cluster, the pre-eviction controller may take the node device running the target pod as the target node device and, by obtaining the remaining resources of each node device, determine whether the target pod can be scheduled to an available node device in the container cluster other than the target node device. An available node device is a node device, among those included in the container cluster, that is capable of running the target pod. In other words, the pre-eviction controller may first determine, from the remaining resources of each node device in the container cluster, whether any node device other than the target node device can run the target pod.
When the pre-eviction controller determines that the target pod can be scheduled to an available node device other than the target node device, it may determine the target pod as the pod to be evicted, and the EvictionManager then evicts that pod. Otherwise, when the pre-eviction controller determines that the target pod cannot be scheduled to any such node device, it may return to step 201 and continue the container group eviction method provided by the present application with the next multi-copy pod as the target pod, until all multi-copy pods recorded in the container eviction list have been traversed.
For example, in the scenario above, assume that pod1 and pod2 are both multi-copy pods and that pod2 requires fewer node resources than pod1, such that pod2 can run stably on node device 2 while pod1 cannot. Under the eviction policy of the earlier example, pod1 would be evicted and might be evicted repeatedly. Under the eviction scheme provided by the present application, pod2 is evicted from node device 1 instead and can be scheduled to node device 2 to run stably, so repeated eviction is avoided.
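The pod1/pod2 example can be worked through with concrete figures. The memory values below are invented purely to reproduce the situation described (pod1 fits on no other node device, pod2 fits on node device 2); they do not come from the source, and memory is used as the only resource dimension for brevity.

```python
# Hypothetical numbers chosen only to reproduce the pod1/pod2 example.
remaining_memory_mib = {"node2": 512, "node3": 256}  # free memory per node device
required_memory_mib = {"pod1": 1024, "pod2": 384}    # memory each pod needs


def nodes_that_fit(pod):
    """Node devices other than node device 1 with enough free memory for `pod`."""
    need = required_memory_mib[pod]
    return sorted(n for n, free in remaining_memory_mib.items() if free >= need)


# pod1 is listed first because it uses more resources (higher eviction priority).
priority_order = ["pod1", "pod2"]
for pod in priority_order:
    print(pod, nodes_that_fit(pod))

# Pick the first pod in priority order that some other node device can run.
to_evict = next((p for p in priority_order if nodes_that_fit(p)), None)
print("evict:", to_evict)
```

With these figures pod1 is skipped because no other node device can host it, and pod2 is selected, matching the outcome described in the text.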
It can be seen that, in the solution provided by the present application, a target container group is determined from a plurality of evictable container groups, and when it is determined that the target container group can be scheduled to an available node device other than the target node device on which it is located, the target container group is determined as the container group to be evicted. Therefore, after the target container group is evicted, it can be scheduled to another available node device to run and will not be repeatedly evicted, which improves the stability of the cluster system.
It should be noted that, in some possible embodiments, in the process of executing step 201, the pre-eviction controller may determine, according to the respective resource usage of each pod in operation, an eviction priority order corresponding to each pod; for example, for a hardware resource with a set dimension, a pod using a larger amount of the hardware resource is associated with a higher eviction priority.
In addition, to ensure that the target pod can be reliably scheduled to another available node device after being evicted, the pre-eviction controller may, during step 203, first search for all available node devices in the container cluster other than the target node device.
For example, in some embodiments, to allow the available node devices to be searched flexibly, the pre-eviction controller may create a node lookup policy for the target pod according to the resources the target pod requires. Illustratively, the pre-eviction controller may use the resources required by the target pod directly as the node lookup policy; alternatively, it may add a preset amount of container resources on top of the resources required by the target pod and use the result as the node lookup policy.
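The two ways of building a node lookup policy can be sketched as follows. The `Resources` type, its three fields, and the `make_lookup_policy` helper are hypothetical names introduced for illustration; the source does not specify how a policy is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource vector; the three dimensions are illustrative."""
    memory_mib: int
    disk_mib: int
    pids: int

    def __add__(self, other: "Resources") -> "Resources":
        return Resources(
            self.memory_mib + other.memory_mib,
            self.disk_mib + other.disk_mib,
            self.pids + other.pids,
        )


NO_HEADROOM = Resources(0, 0, 0)


def make_lookup_policy(required: Resources, headroom: Resources = NO_HEADROOM) -> Resources:
    """Resources a node device must have free to count as available."""
    return required + headroom


pod_needs = Resources(memory_mib=512, disk_mib=1024, pids=64)
strict = make_lookup_policy(pod_needs)                            # required resources as-is
padded = make_lookup_policy(pod_needs, Resources(128, 256, 16))   # plus a preset amount
print(strict)
print(padded)
```

Adding headroom makes the policy more conservative: fewer node devices qualify, but those that do are less likely to come under resource pressure immediately after the pod arrives.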
The pre-eviction controller may then determine, based on the node lookup policy, a node device in the container cluster that matches the node lookup policy as an available node device.
For example, the pre-eviction controller may compare the resource condition in the node lookup policy with the remaining resource conditions of other node devices in the container cluster, and determine all node devices whose remaining resource conditions satisfy the node lookup policy as available node devices.
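The comparison described above might look like the following sketch, assuming the policy and each node device's remaining resources are plain dictionaries keyed by resource name. The function name, dictionary keys, and sample figures are illustrative assumptions.

```python
def find_available_nodes(policy, remaining_by_node, target_node):
    """Names of node devices, other than the target, whose free resources cover the policy."""
    available = []
    for name, remaining in remaining_by_node.items():
        if name == target_node:
            continue  # the pod must move away from the target node device
        if all(remaining.get(kind, 0) >= amount for kind, amount in policy.items()):
            available.append(name)
    return available


policy = {"memory_mib": 512, "disk_mib": 1024}
remaining_by_node = {
    "node1": {"memory_mib": 600, "disk_mib": 2048},   # target node device, always excluded
    "node2": {"memory_mib": 1024, "disk_mib": 2048},
    "node3": {"memory_mib": 256, "disk_mib": 4096},   # too little memory
}
available = find_available_nodes(policy, remaining_by_node, target_node="node1")
print(available)
```

A node device qualifies only if every resource dimension named in the policy is satisfied; a missing dimension is treated as zero remaining.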
Alternatively, the pre-eviction controller may send the node lookup policy to the cluster master node together with a matching request. In response to the matching request, the cluster master node may create a virtual mirror pod according to the resource requirement indicated by the node lookup policy, start the scheduler, and attempt to schedule the virtual mirror pod to every node device other than the target node device; all node devices capable of stably running the virtual mirror pod are then determined to be available node devices.
Of course, it is understood that all virtual mirror pods may be deleted after all available node devices have been determined.
Next, the pre-eviction controller may compare the number of available node devices with a preset first threshold. When the number of available node devices reaches the first threshold, the pre-eviction controller may determine that the target pod can be scheduled to other available node devices; conversely, when the number of available node devices does not reach the first threshold, the pre-eviction controller may determine that the target pod cannot be scheduled to other available node devices.
For example, assuming that the first threshold is 3, when the pre-eviction controller determines that at least 3 available node devices other than the target node device in the container cluster are capable of running the target pod, the pre-eviction controller may determine that the target pod can be scheduled to other available node devices; conversely, when the pre-eviction controller determines that fewer than 3 available node devices other than the target node device in the container cluster can run the target pod, the pre-eviction controller may determine that the target pod cannot be scheduled to other available node devices.
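The threshold check itself is a single comparison. A minimal sketch follows, using the threshold of 3 from the example; the function and constant names are hypothetical.

```python
def can_schedule_elsewhere(available_nodes, first_threshold):
    """True when the number of available node devices reaches the first threshold."""
    return len(available_nodes) >= first_threshold


FIRST_THRESHOLD = 3  # value taken from the example above

print(can_schedule_elsewhere(["node2", "node3", "node4"], FIRST_THRESHOLD))
print(can_schedule_elsewhere(["node2", "node3"], FIRST_THRESHOLD))
```

Requiring several available node devices rather than one leaves a margin in case some of them fill up between the check and the actual rescheduling.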
In addition, in implementations such as that described above, in determining available node devices using the created node lookup policy, the pre-eviction controller may traverse all node devices in the container cluster one by one based on the node lookup policy.
In some possible scenarios, some node devices in the container cluster may have no idle node resources, that is, there may be node devices in the container cluster that cannot provide idle node resources for the target pod to run.
Therefore, in order to increase the searching speed of the available node devices and avoid an invalid searching process, before the available node devices are matched based on the node searching policy, the pre-eviction controller may further determine all node devices currently having idle node resources in the container cluster.
For example, in some possible scenarios, a node device that requests to evict a pod is one that currently has no idle node resources to provide for a new pod; thus, before searching for available node devices, the pre-eviction controller may identify the node devices that are currently requesting to evict a pod, and determine all other node devices in the container cluster as the node devices currently having idle node resources.
Based on this, when matching available node devices against the node lookup policy, the pre-eviction controller may determine as available node devices only those node devices that match the node lookup policy among the node devices currently having idle node resources, rather than matching against every node device included in the container cluster, thereby increasing the speed of searching for available node devices.
Of course, it is understood that the above is only an example, which illustrates that in some possible scenarios a node device requesting to evict a pod is regarded as a node device that currently has no idle node resources; in some other possible scenarios, the node devices that currently have no idle node resources may be determined according to the state identification information or the resource utilization of each node device, which is not limited in the present application.
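The pre-filtering step described above can be sketched as two stages: first exclude the node devices that are currently requesting to evict a pod, then apply the node lookup policy only to the remaining candidates. The function names, the node names and the use of free CPU as the only matched resource are assumptions made for this illustration.

```python
def nodes_with_idle_resources(all_nodes, evicting_nodes):
    """Treat every node that is not currently requesting an eviction
    as a node that still has idle node resources."""
    evicting = set(evicting_nodes)
    return [name for name in all_nodes if name not in evicting]


def match_available_nodes(candidates, free_cpu, required_cpu):
    """Apply a (hypothetical) CPU-only lookup policy to the candidates."""
    return [name for name in candidates if free_cpu[name] >= required_cpu]


all_nodes = ["node-a", "node-b", "node-c", "node-d"]
evicting_nodes = ["node-a", "node-c"]  # nodes currently requesting eviction
free_cpu = {"node-a": 0, "node-b": 800, "node-c": 100, "node-d": 300}

candidates = nodes_with_idle_resources(all_nodes, evicting_nodes)
available = match_available_nodes(candidates, free_cpu, required_cpu=500)
print(candidates)  # prints ['node-b', 'node-d']
print(available)   # prints ['node-b']
```

In this sketch the lookup policy is evaluated against two node devices instead of four, which is the saving the description refers to.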
In addition, in some possible scenarios, after the pre-eviction controller has traversed the entire container eviction list, there may still be no target pod that can be scheduled to other available node devices, that is, no multi-copy pod in the container eviction list can be scheduled to other available node devices. In this case, the pre-eviction controller may determine the multi-copy pod with the highest eviction priority in the container eviction list as the pod to be evicted. In this way, even though the evicted pod cannot be scheduled to another node device for the time being, node resources are released as far as possible while the other copies of the pod continue to provide services, which avoids the service interruption that would result from evicting a single-copy pod.
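The overall selection, including the fallback just described, can be sketched as follows. The Pod type, the replicas field, the function select_pod_to_evict and the callback can_schedule_elsewhere are hypothetical names introduced for illustration. Returning None when the list contains no multi-copy pod at all is likewise an assumption of the sketch, since that case is not specified in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    replicas: int  # number of copies owned by the pod's target object


def select_pod_to_evict(eviction_list, can_schedule_elsewhere):
    """eviction_list is assumed to be ordered from highest to lowest
    eviction priority. Return the first multi-copy pod that can be
    scheduled to other node devices; otherwise fall back to the
    multi-copy pod with the highest eviction priority."""
    fallback = None
    for pod in eviction_list:
        if pod.replicas <= 1:
            continue  # single-copy pods are never chosen as target pods
        if fallback is None:
            fallback = pod  # highest-priority multi-copy pod seen so far
        if can_schedule_elsewhere(pod):
            return pod
    return fallback  # None if no multi-copy pod exists (assumption)


eviction_list = [Pod("db-0", 1), Pod("web-1", 3), Pod("api-2", 2)]

# Only api-2 can be placed on another node device.
print(select_pod_to_evict(eviction_list, lambda p: p.name == "api-2").name)
# prints api-2

# No multi-copy pod can be placed elsewhere, so the fallback applies.
print(select_pod_to_evict(eviction_list, lambda p: False).name)
# prints web-1
```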
Also, in some possible scenarios, after the pre-eviction controller determines the pod to be evicted, that pod may be evicted by the EvictionManager running within the target node device. Additionally, the pre-eviction controller may further configure the state of the target node device to a prohibited scheduling state to indicate that the target node device is not to be used to create a new pod; that is, because the target node device has performed an eviction and its current node resources are insufficient, its state is configured to the prohibited scheduling state to prevent the evicted pod from being rescheduled to the target node device, thereby reducing repeated eviction of the pod.
In addition, in some possible scenarios, for the pod to be evicted, after the EvictionManager evicts the pod to be evicted, the pre-eviction controller may monitor a container state of a target object to which the pod to be evicted belongs, where the container state may be used to indicate a copy state of the target object, that is, a number of copy pods currently owned by the target object.
Based on this, when the pre-eviction controller determines that the container state of the target object satisfies the set multi-copy condition, for example, when the number of pods belonging to the target object reaches the second threshold, the pre-eviction controller may configure the state of the target node device to a permitted scheduling state to indicate that the target node device is permitted to be used to create a new pod; that is, when the container state of the target object satisfies the set multi-copy condition, this indicates that the previously evicted pod has already been scheduled to another node device and will no longer be scheduled to the target node device, so that repeated eviction is not caused.
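The switching of the target node device between the prohibited and permitted scheduling states can be sketched as a small state holder. The class TargetNodeState and its method names are hypothetical; in a Kubernetes cluster the corresponding effect is commonly obtained by marking the node as unschedulable (cordoning) and later clearing that mark, but the sketch below deliberately avoids any cluster API and only models the decision logic.

```python
class TargetNodeState:
    """Tracks whether the target node device may receive new pods."""

    def __init__(self, second_threshold):
        self.second_threshold = second_threshold
        self.schedulable = True

    def on_pod_evicted(self):
        # After an eviction the node is placed in the prohibited
        # scheduling state so the evicted pod is not sent back to it.
        self.schedulable = False

    def on_replica_count(self, running_replicas):
        # Once the target object again owns enough copies, the evicted
        # pod has been placed elsewhere and scheduling is permitted again.
        if not self.schedulable and running_replicas >= self.second_threshold:
            self.schedulable = True
        return self.schedulable


state = TargetNodeState(second_threshold=3)
state.on_pod_evicted()
print(state.on_replica_count(2))  # prints False
print(state.on_replica_count(3))  # prints True
```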
In addition, based on the same inventive concept as the above-mentioned container set eviction method provided in the present application, as shown in fig. 4, the present application also provides a container set eviction device 300. In some embodiments, the container set eviction device 300 may include a processing module 301 and a determining module 302. Wherein:
a processing module 301, configured to sequentially traverse multiple container groups that can be evicted and are recorded in a container eviction list according to an eviction priority order corresponding to each container group, and determine a target container group in the container eviction list; wherein the target container set is a multiple copy container set of a plurality of container sets that can be evicted;
a determining module 302, configured to determine whether the target container group can be scheduled to another available node device in the container cluster except the target node device; the container cluster comprises a plurality of node devices, a target node device is a node device which runs a target container group in the plurality of node devices, and an available node device is a node device which can run the target container group in the plurality of node devices;
the processing module 301 is further configured to determine the target container group as a to-be-evicted container group when the target container group can be scheduled to other available node devices.
Optionally, in some possible embodiments, the determining module 302, when determining whether the target container group can be scheduled to another available node device in the container cluster except the target node device, is specifically configured to:
searching all available node equipment except the target node equipment in the container cluster;
when the number of the available node devices reaches a first threshold value, determining that the target container group can be scheduled to other available node devices;
when the number of available node devices does not reach the first threshold, it is determined that the target container group cannot be scheduled to other available node devices.
Optionally, in some possible embodiments, the determining module 302, when searching for available node devices other than the target node device in the container cluster, is specifically configured to:
creating a node searching strategy corresponding to the target container group;
and determining the node equipment matched with the node searching strategy in the container cluster as available node equipment.
Optionally, in some possible embodiments, before determining that the node device in the container cluster matching the node lookup policy is an available node device, the determining module 302 is further configured to:
determining all node equipment with idle node resources currently in the container cluster;
when determining the node device in the container cluster that matches the node lookup policy as an available node device, the determining module 302 is specifically configured to:
and determining the node equipment matched with the node searching strategy in all the node equipment with the current idle node resources as the available node equipment.
Optionally, in some possible embodiments, when determining all node devices currently having idle node resources in the container cluster, the determining module 302 is specifically configured to:
and determining other node devices in the container cluster except the node device which currently requests to evict the container group as all the node devices which currently have idle node resources.
Optionally, in some possible embodiments, the processing module 301 is further configured to:
and when no target container group that can be scheduled to other available node devices exists in the container eviction list, determining the multi-copy container group with the highest eviction priority in the container eviction list as the container group to be evicted.
Optionally, in some possible embodiments, the processing module 301 is further configured to:
configuring the state of the target node equipment into a scheduling prohibition state; wherein the prohibited scheduling state is used to indicate that the corresponding node device is not used to create a new container group.
Optionally, in some possible embodiments, the processing module 301 is further configured to:
monitoring the container state of a target object to which a container group to be evicted belongs;
when the container state of the target object meets the set multi-copy condition, configuring the state of the target node equipment into a scheduling permission state; wherein the permission scheduling status is used to indicate that the corresponding node device is permitted to be used to create a new container group.
Optionally, in some possible embodiments, the processing module 301, when determining that the container status of the target object satisfies the set multi-copy condition, is specifically configured to:
and when the number of the container groups belonging to the target object reaches a second threshold value, determining that the container state of the target object meets the set multi-copy condition.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to some embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in some embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the method according to some embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The above description is only a few examples of the present application and is not intended to limit the present application, and those skilled in the art will appreciate that various modifications and variations can be made in the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (12)

1. A container set eviction method, the method comprising:
sequentially traversing a plurality of container groups which can be evicted and are recorded in a container eviction list according to the respective corresponding eviction priority order of each container group, and determining a target container group in the container eviction list; wherein the target container set is a multiple-copy container set of the plurality of evictable container sets;
judging whether the target container group can be dispatched to other available node equipment except the target node equipment in the container cluster or not; wherein the container cluster includes a plurality of node devices, the target node device is a node device of the plurality of node devices that operates the target container group, and the available node device is a node device of the plurality of node devices that can operate the target container group;
and when the target container group can be dispatched to other available node devices, determining the target container group as a container group to be evicted.
2. The method of claim 1, wherein said determining whether the target group of containers can be scheduled to other available node devices in the container cluster other than the target node device comprises:
searching all available node equipment except the target node equipment in the container cluster;
when the number of the available node devices reaches a first threshold value, determining that the target container group can be scheduled to other available node devices;
when the number of available node devices does not reach the first threshold, determining that the target container group cannot be scheduled to other available node devices.
3. The method of claim 2, wherein said finding available node devices in the container cluster other than the target node device comprises:
creating a node searching strategy corresponding to the target container group;
and determining the node equipment matched with the node searching strategy in the container cluster as available node equipment.
4. The method of claim 3, wherein prior to said determining a node device in the container cluster that matches the node lookup policy as an available node device, the method further comprises:
determining all node devices currently having idle node resources in the container cluster;
the determining, as an available node device, a node device in the container cluster that matches the node lookup policy includes:
and determining the node equipment matched with the node searching strategy in all the node equipment with the current idle node resources as available node equipment.
5. The method of claim 4, wherein said determining all node devices in the container cluster that currently have free node resources comprises:
and determining other node devices in the container cluster except the node device which currently requests to evict the container group as all the node devices which currently have idle node resources.
6. The method of any one of claims 1-5, further comprising:
and when the target container group which can be dispatched to other available node devices does not exist in the container eviction list, determining the multi-copy container group with the highest eviction priority in the container eviction list as the container group to be evicted.
7. The method of any one of claims 1-5, further comprising:
configuring the state of the target node device to a scheduling prohibition state; wherein the prohibited scheduling state is used to indicate that the corresponding node device is not used to create a new container group.
8. The method of claim 7, wherein the method further comprises:
monitoring the container state of a target object to which the container group to be evicted belongs;
when the container state of the target object meets the set multi-copy condition, configuring the state of the target node equipment into a scheduling permission state; wherein the permission scheduling status is used to indicate that the corresponding node device is permitted to be used for creating a new container group.
9. The method of claim 8, wherein the container status of the target object satisfies a set multi-copy condition, comprising:
and when the number of the container groups belonging to the target object reaches a second threshold value, determining that the container state of the target object meets the set multi-copy condition.
10. A container set eviction device, the device comprising:
the processing module is used for sequentially traversing a plurality of container groups which can be evicted and are recorded in the container eviction list according to the respective corresponding eviction priority order of each container group, and determining a target container group in the container eviction list; wherein the target container set is a multiple-copy container set of the plurality of evictable container sets;
the judging module is used for judging whether the target container group can be dispatched to other available node equipment except the target node equipment in the container cluster; wherein the container cluster includes a plurality of node devices, the target node device is a node device of the plurality of node devices that operates the target container group, and the available node device is a node device of the plurality of node devices that can operate the target container group;
the processing module is further configured to determine the target container group as a to-be-evicted container group when the target container group can be scheduled to other available node devices.
11. A node apparatus, comprising:
a memory for storing one or more programs;
a processor;
the one or more programs, when executed by the processor, implement the method of any of claims 1-9.
12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-9.
CN202011486963.9A 2020-12-16 2020-12-16 Container group eviction method, device, node equipment and storage medium Active CN112540829B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011486963.9A CN112540829B (en) 2020-12-16 2020-12-16 Container group eviction method, device, node equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011486963.9A CN112540829B (en) 2020-12-16 2020-12-16 Container group eviction method, device, node equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112540829A true CN112540829A (en) 2021-03-23
CN112540829B CN112540829B (en) 2024-10-18

Family

ID=75018969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011486963.9A Active CN112540829B (en) 2020-12-16 2020-12-16 Container group eviction method, device, node equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112540829B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113297031A (en) * 2021-05-08 2021-08-24 阿里巴巴新加坡控股有限公司 Container group protection method and device in container cluster
CN113835840A (en) * 2021-09-28 2021-12-24 广东浪潮智慧计算技术有限公司 Cluster resource management method, device and equipment and readable storage medium
CN117331650A (en) * 2023-10-31 2024-01-02 中科驭数(北京)科技有限公司 Container set scheduling method, device, equipment and storage medium

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200334075A1 (en) * 2016-04-12 2020-10-22 Telefonaktiebolaget Lm Ericsson (Publ) Process scheduling in a processing system having at least one processor and shared hardware resources
US20180349174A1 (en) * 2017-05-30 2018-12-06 Red Hat, Inc. Fast and greedy scheduling machine based on a distance matrix
US20200034254A1 (en) * 2018-07-30 2020-01-30 EMC IP Holding Company LLC Seamless mobility for kubernetes based stateful pods using moving target defense
CN111198745A (en) * 2018-11-16 2020-05-26 北京京东尚科信息技术有限公司 Scheduling method, device, medium and electronic equipment for container creation
CN109558260A (en) * 2018-11-20 2019-04-02 北京京东尚科信息技术有限公司 Kubernetes troubleshooting system, method, equipment and medium
CN111767113A (en) * 2019-04-01 2020-10-13 北京沃东天骏信息技术有限公司 Method and device for realizing container eviction
CN110096336A (en) * 2019-04-29 2019-08-06 江苏满运软件科技有限公司 Data monitoring method, device, equipment and medium
CN110427249A (en) * 2019-07-26 2019-11-08 重庆紫光华山智安科技有限公司 Method for allocating tasks, pod initial method and relevant apparatus
CN110515730A (en) * 2019-08-22 2019-11-29 北京宝兰德软件股份有限公司 Resource secondary dispatching method and device based on kubernetes container arranging system
CN110727512A (en) * 2019-09-30 2020-01-24 星环信息科技(上海)有限公司 Cluster resource scheduling method, device, equipment and storage medium
CN111104227A (en) * 2019-12-28 2020-05-05 北京浪潮数据技术有限公司 Resource control method and device of K8s platform and related components
CN111314450A (en) * 2020-02-06 2020-06-19 恒生电子股份有限公司 Data transmission method and device, electronic equipment and computer storage medium
CN111694633A (en) * 2020-04-14 2020-09-22 新华三大数据技术有限公司 Cluster node load balancing method and device and computer storage medium
CN111522639A (en) * 2020-04-16 2020-08-11 南京邮电大学 Multidimensional resource scheduling method under Kubernetes cluster architecture system
CN111464659A (en) * 2020-04-27 2020-07-28 广州虎牙科技有限公司 Node scheduling method, node pre-selection processing method, device, equipment and medium
CN111930468A (en) * 2020-07-13 2020-11-13 苏州浪潮智能科技有限公司 Method, system and device for expelling container group

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JOSHUAANDREW: "容器调度", Retrieved from the Internet <URL:CSDN,https://blog.csdn.net/chweiweich/article/details/53244965> *
ZHANG D,ET AL.: "Container oriented job scheduling using linear programming model", 《IEEE》 *
黄涛: "基于Kubernetes的容器云调度算法的研究", 《中国优秀硕士学位论文全文数据库-信息科技辑》 *

Also Published As

Publication number Publication date
CN112540829B (en) 2024-10-18

Similar Documents

Publication Publication Date Title
CN112540829B (en) Container group eviction method, device, node equipment and storage medium
CN105700939B (en) The method and system of Multi-thread synchronization in a kind of distributed system
CN106919445B (en) Method and device for scheduling containers in cluster in parallel
US7810098B2 (en) Allocating resources across multiple nodes in a hierarchical data processing system according to a decentralized policy
US8458712B2 (en) System and method for multi-level preemption scheduling in high performance processing
CN110231991B (en) Task allocation method and device, electronic equipment and readable storage medium
CN112214288B (en) Pod scheduling method, device, equipment and medium based on Kubernetes cluster
CN108616424B (en) Resource scheduling method, computer equipment and system
US7920282B2 (en) Job preempt set generation for resource management
KR102398076B1 (en) Apparatus and method for distributing and storing data
CN108509280B (en) Distributed computing cluster locality scheduling method based on push model
CN112527490B (en) Node resource management and control method and device, electronic equipment and storage medium
US9384050B2 (en) Scheduling method and scheduling system for multi-core processor system
US20150365474A1 (en) Computer-readable recording medium, task assignment method, and task assignment apparatus
US20130326528A1 (en) Resource starvation management in a computer system
CN116450328A (en) Memory allocation method, memory allocation device, computer equipment and storage medium
CN113608896B (en) Method, system, medium and terminal for dynamically switching data streams
CN114675954A (en) Task scheduling method and device
CN114443302A (en) Container cluster capacity expansion method, system, terminal and storage medium
CN109240829B (en) Method and device for applying for exchanging chip and managing exclusive resource
CN112612606A (en) Message theme processing method and device, computer equipment and readable storage medium
CN114077493A (en) Resource allocation method and related equipment
CN113703930A (en) Task scheduling method, device and system and computer readable storage medium
CN113127289A (en) Resource management method based on YARN cluster, computer equipment and storage medium
CN111339132A (en) Data query method and database agent

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant