CN116541122A - Task scheduling method, device and system of distributed container system - Google Patents
- Publication number
- CN116541122A (CN202210093359.2A)
- Authority
- CN
- China
- Prior art keywords
- bandwidth
- task
- node
- bandwidth information
- scheduling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/70—Admission control; Resource allocation
- H04L47/78—Architectures of resource allocation
- H04L47/783—Distributed allocation of resources, e.g. bandwidth brokers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1074—Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/4557—Distribution of virtual machine instances; Migration and load balancing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45595—Network integration; Enabling network access in virtual machine instances
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a task scheduling method, device and system for a distributed container system, and relates to the field of computer technology. One embodiment of the method comprises: receiving a task execution request that includes the bandwidth information required for executing the task; determining, from one or more node agents in the distributed container, a target node agent for executing the task according to the residual bandwidth information corresponding to each node agent and the bandwidth information required for executing the task, and generating a corresponding scheduling task, where each node agent obtains its residual bandwidth information from the monitoring result of a bandwidth flow monitoring service; and executing the scheduling task with the target node agent in response to the task execution request. The residual bandwidth information of each node agent is acquired asynchronously, without affecting the normal operation of the node agent, so that the node agents are scheduled and used reasonably.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a task scheduling method, device, and system for a distributed container system.
Background
When a distributed container performs tasks such as online service or distributed training, the tasks need to be scheduled to a certain node in the cluster, and the node performs the tasks.
In existing distributed containers, tasks are generally allocated to nodes using only two resource metrics: memory and central processing unit (CPU).
In the course of implementing the present invention, the inventors found at least the following problems in the prior art: because a distributed container places particular demands on the network, tasks on certain nodes may be squeezed by excessive bandwidth consumption during execution, while other nodes are left with a large amount of unused bandwidth. As a result, online services and distributed training tasks suffer abnormal interruptions, long execution times, stuck service requests and similar problems, and tasks in the distributed container respond poorly.
Disclosure of Invention
In view of this, embodiments of the present invention provide a task scheduling method, apparatus, and system for a distributed container system, which determine a target node agent for executing a task according to bandwidth information required in a task execution request and remaining bandwidth information determined according to a monitoring result of a bandwidth flow monitoring service, so as to generate different scheduling tasks, thereby using the bandwidth information as a resource index of a task scheduling process, and implementing reasonable scheduling and use of the node agent, so as to avoid problems of abnormal interruption, long execution time, and stuck service request in the task execution process. In addition, the node agent and the bandwidth flow monitoring service are mutually independent, and the normal operation of the node agent is not affected by the error of the bandwidth flow monitoring service, so that the robustness and the stability of the distributed container system are improved.
To achieve the above object, according to a first aspect of an embodiment of the present invention, there is provided a bandwidth scheduling method of a distributed container system.
The bandwidth scheduling method of the distributed container system provided by the embodiment of the invention comprises the following steps: receiving a task execution request; the task execution request comprises bandwidth information required for executing a task; determining a target node agent for executing the task from the one or more node agents according to residual bandwidth information corresponding to the one or more node agents in the distributed container and bandwidth information required by the task, and correspondingly generating a scheduling task; the residual bandwidth information is obtained by the node agent according to a monitoring result of the bandwidth flow monitoring service; and executing the scheduling task by using the target node agent to respond to the task execution request.
Optionally, the method further comprises: receiving a monitoring result reported by the successfully registered bandwidth flow monitoring service by utilizing the node agent; and determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result, and reporting the residual bandwidth information.
Optionally, the receiving, by the node proxy, a monitoring result reported by the bandwidth traffic monitoring service that has been successfully registered includes:
acquiring a registration service of the bandwidth flow monitoring service by using the node proxy, wherein the registration service comprises a socket; registering the bandwidth flow service according to the socket, and returning a status code to the bandwidth flow monitoring service; the state code characterizes the success or failure of registration; and receiving a monitoring result sent by the bandwidth flow monitoring service which is successfully registered.
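The registration handshake described above can be illustrated with a minimal in-process sketch. It does not open a real socket; the class name `NodeAgentRegistry`, the status code values, and the rule that an empty or duplicate socket address fails registration are assumptions made for illustration and are not specified in the source.

```python
# Hypothetical status codes; the source only says the code indicates success or failure.
REGISTER_OK = 0
REGISTER_FAILED = 1


class NodeAgentRegistry:
    """Node-agent side of the registration handshake (illustrative only)."""

    def __init__(self):
        self._registered = set()  # socket addresses of registered monitoring services
        self.results = []         # monitoring results accepted so far

    def register(self, socket_address):
        # Assumed failure rule: empty or already-registered addresses are rejected.
        if not socket_address or socket_address in self._registered:
            return REGISTER_FAILED
        self._registered.add(socket_address)
        return REGISTER_OK

    def report(self, socket_address, monitoring_result):
        # Only services that registered successfully may report results.
        if socket_address not in self._registered:
            return False
        self.results.append(monitoring_result)
        return True
```

Keeping the registry separate from the task execution logic mirrors the independence between the node agent and the monitoring service that the description emphasizes.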
Optionally, acquiring a monitoring result reported by the registration service of the bandwidth flow monitoring service by using the node agent according to a preset time period; and updating the current residual bandwidth information according to the real-time bandwidth information in the monitoring result acquired in each time period, and reporting the updating result.
Optionally, the determining, from the one or more node agents, a target node agent for executing the task according to the remaining bandwidth information corresponding to the one or more node agents in the distributed container and the bandwidth information required for executing the task includes: judging whether the first bandwidth flow in each piece of residual bandwidth information is larger than the second bandwidth flow in the bandwidth information required for executing the task; for each node agent whose first bandwidth flow is larger than the second bandwidth flow, scoring the node agent according to its node information and a preset scoring strategy; and, according to the scoring result, selecting a node agent that meets a preset score threshold as the target node agent.
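The filter-then-score selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `NodeAgent` fields, the weights inside `score`, the default threshold, and the tie-break of taking the highest-scoring eligible agent are all hypothetical; the source only requires a bandwidth comparison followed by a preset scoring strategy and score threshold.

```python
from dataclasses import dataclass


@dataclass
class NodeAgent:
    name: str
    remaining_bandwidth: float  # first bandwidth flow, e.g. in Mbit/s
    cpu_load: float             # 0.0 (idle) to 1.0 (saturated)
    disk_free_ratio: float      # 0.0 to 1.0
    has_image: bool             # container image already present on the node


def score(agent):
    """Hypothetical scoring strategy: higher is better."""
    return (0.5 * (1.0 - agent.cpu_load)
            + 0.3 * agent.disk_free_ratio
            + 0.2 * (1.0 if agent.has_image else 0.0))


def select_target(agents, required_bandwidth, score_threshold=0.5):
    # Keep only agents whose remaining bandwidth exceeds the task's requirement.
    candidates = [a for a in agents if a.remaining_bandwidth > required_bandwidth]
    # Score the candidates and keep those that meet the preset threshold.
    eligible = [(score(a), a) for a in candidates]
    eligible = [(s, a) for s, a in eligible if s >= score_threshold]
    if not eligible:
        return None
    # Assumed tie-break: pick the highest-scoring eligible agent.
    return max(eligible, key=lambda pair: pair[0])[1]
```

Filtering on bandwidth before scoring keeps the scoring strategy free of hard constraints, so it only has to rank agents that could run the task at all.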
Optionally, the node information of the one or more node agents includes any one or more of: image distribution, central processing unit, remaining disk space, and central processing unit load.
Optionally, after the determining the target node agent for performing the task, the method further comprises: and setting the bandwidth information required by the target node agent for executing the scheduling task to be in a reserved state within a preset time, so that the bandwidth information in the reserved state cannot be called by other tasks to meet the bandwidth requirement of the scheduling task within the preset time.
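The reservation step can be pictured as a time-limited hold on part of a node agent's bandwidth. The sketch below is one possible reading, not the patented implementation: the class `BandwidthReservations`, its method names, and the injectable clock are assumptions introduced for illustration.

```python
import time


class BandwidthReservations:
    """Time-limited bandwidth holds for one node agent (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._held = {}  # task_id -> (amount, expires_at)

    def reserve(self, task_id, amount, hold_seconds):
        # Mark `amount` of bandwidth as reserved for the preset time.
        self._held[task_id] = (amount, self._clock() + hold_seconds)

    def reserved_total(self):
        now = self._clock()
        # Drop holds whose preset time has elapsed.
        self._held = {t: (a, e) for t, (a, e) in self._held.items() if e > now}
        return sum(a for a, _ in self._held.values())

    def schedulable(self, remaining_bandwidth):
        # Bandwidth that other tasks may still claim on this node agent.
        return max(0.0, remaining_bandwidth - self.reserved_total())
```

Subtracting reserved bandwidth at scheduling time means a task placed a moment ago is protected even before its traffic shows up in the monitoring results.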
Optionally, the determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result includes: and carrying out average weighted summation on the real-time bandwidth information according to a plurality of first weights and summation functions indicated by the bandwidth allocation function to obtain the current residual bandwidth information to be reported.
Optionally, the determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result includes: performing a weighted summation of the real-time bandwidth information according to a plurality of second weights and a summation function indicated by the bandwidth allocation function; wherein the closer a piece of real-time bandwidth information is to the current moment, the larger its corresponding second weight; and wherein the sum of the plurality of second weights is 1.
To achieve the above object, according to a second aspect of an embodiment of the present invention, there is provided a task scheduling device of a distributed container system.
The task scheduling device of the distributed container system in the embodiment of the invention comprises:
the receiving module is used for receiving a task execution request; the task execution request comprises bandwidth information required for executing a task;
the processing module is used for determining a target node agent for executing the task from the one or more node agents according to the residual bandwidth information corresponding to the one or more node agents in the distributed container and the bandwidth information required by the task, and correspondingly generating a scheduling task; the residual bandwidth information is obtained by the node agent according to a monitoring result of the bandwidth flow monitoring service;
and the execution module is used for executing the scheduling task by utilizing the target node agent so as to respond to the task execution request.
Optionally, the apparatus further comprises: the registration module is used for receiving the monitoring result reported by the successfully registered bandwidth flow monitoring service by using the node proxy; and determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result, and reporting the residual bandwidth information.
Optionally, the registration module is further configured to obtain a registration service of the bandwidth traffic monitoring service with the node proxy, where the registration service includes a socket; registering the bandwidth flow service according to the socket, and returning a status code to the bandwidth flow monitoring service; the state code characterizes the success or failure of registration; and receiving a monitoring result sent by the bandwidth flow monitoring service which is successfully registered.
Optionally, the registration module is further configured to obtain, by using the node proxy, a monitoring result reported by a registration service of the bandwidth flow monitoring service according to a preset time period; and updating the current residual bandwidth information according to the real-time bandwidth information in the monitoring result acquired in each time period, and reporting the updating result.
Optionally, the processing module is further configured to judge whether the first bandwidth flow in each piece of remaining bandwidth information is larger than the second bandwidth flow in the bandwidth information required for executing the task; for each node agent whose first bandwidth flow is larger than the second bandwidth flow, score the node agent according to its node information and a preset scoring strategy; and, according to the scoring result, select a node agent that meets a preset score threshold as the target node agent.
Optionally, the node information of the one or more node agents includes any one or more of: image distribution, central processing unit, remaining disk space, and central processing unit load.
Optionally, the apparatus further comprises: and the reservation module is used for setting bandwidth information required by the target node agent for executing the scheduling task into a reservation state in a preset time after the target node agent for executing the task is determined, so that the bandwidth information in the reservation state cannot be called by other tasks to meet the bandwidth requirement of the scheduling task in the preset time.
Optionally, the registration module is further configured to perform average weighted summation on the real-time bandwidth information according to a plurality of first weights and summation functions indicated by the bandwidth allocation function, so as to obtain remaining bandwidth information to be reported currently.
Optionally, the registration module is further configured to weight and sum the real-time bandwidth information according to a plurality of second weights and summation functions indicated by the allocation function; wherein, the second weight corresponding to the real-time bandwidth information which is closer to the current moment is larger; wherein the sum of the plurality of second weights is 1.
To achieve the above object, according to a third aspect of the embodiments of the present invention, there is provided a task scheduling system.
The task scheduling system provided by the embodiment of the invention comprises: the task scheduling device of the distributed container system and a monitoring device for providing the bandwidth flow monitoring service; wherein
the monitoring device is used for acquiring one or more data packets respectively corresponding to the driving programs of one or more network cards and the maximum bandwidth information respectively corresponding to the one or more network cards through the bandwidth flow monitoring service; and determining the corresponding bandwidth consumption information of the network card according to the data packet, and taking the difference between the maximum bandwidth information of the network card and the corresponding bandwidth consumption information as the real-time bandwidth information of the network card.
Optionally, the bandwidth flow monitoring service is configured to, in the case where one network card corresponds to a plurality of data packets, for that network card: determine the average transmission rate of each of the data packets corresponding to the network card; and add up the average transmission rates of the data packets to determine the consumed bandwidth information of the network card.
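The arithmetic performed by the monitoring device can be written out as a short sketch. It assumes that each data packet observation is available as a `(bytes_transferred, seconds)` pair and that bandwidth is expressed in Mbit/s; both the representation and the function names are hypothetical.

```python
def average_rate_mbps(bytes_transferred, seconds):
    """Average transmission rate of one packet stream in Mbit/s."""
    return bytes_transferred * 8 / seconds / 1e6


def consumed_bandwidth_mbps(flows):
    """Sum of the average rates of all streams on a single network card."""
    return sum(average_rate_mbps(b, s) for b, s in flows)


def realtime_bandwidth_mbps(max_bandwidth_mbps, flows):
    """Real-time (still available) bandwidth: card maximum minus consumption."""
    return max_bandwidth_mbps - consumed_bandwidth_mbps(flows)
```

For a card with a maximum of 1000 Mbit/s carrying two streams that averaged 100 Mbit/s and 200 Mbit/s, the consumed bandwidth is 300 Mbit/s and the real-time bandwidth is 700 Mbit/s.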
To achieve the above object, according to a fourth aspect of an embodiment of the present invention, there is provided a bandwidth scheduling apparatus of a distributed container system.
The bandwidth scheduling device of the distributed container system in the embodiment of the invention comprises: one or more processors; a storage system for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the task scheduling method of the distributed container system of the embodiments of the present invention.
To achieve the above object, according to a fifth aspect of the embodiments of the present invention, there is provided a computer-readable medium.
The computer readable medium of the embodiment of the present invention stores a computer program, which when executed by a processor, implements the bandwidth scheduling method of the distributed container system of the embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: in the embodiment of the invention, the target node agent for executing the task is determined according to the bandwidth information required in the task execution request and the residual bandwidth information determined from the monitoring result of the bandwidth flow monitoring service, so as to generate different scheduling tasks. Bandwidth information is thereby used as a resource index in the task scheduling process, the node agents are scheduled and used reasonably, and the problems of abnormal interruption, long execution time and stuck service requests during task execution are avoided. In addition, the node agent and the bandwidth flow monitoring service are independent of each other, and an error in the bandwidth flow monitoring service does not affect the normal operation of the node agent, so that the robustness and stability of the distributed container system are improved.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main flow of a bandwidth scheduling method of a distributed container system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the main flow before receiving a task execution request according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a main flow of receiving real-time bandwidth information reported by a successfully registered bandwidth flow monitoring service by using a node proxy according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the overall flow of registration reporting in an embodiment of the present invention;
FIG. 5 is a schematic diagram of the main flow of determining a target node agent for performing a task from among the one or more node agents according to an embodiment of the present invention;
FIG. 6 is an overall schematic diagram of a bandwidth scheduling process according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the main modules of a task scheduling system according to an embodiment of the present invention;
FIG. 8 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 9 is a schematic diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
According to a first aspect of an embodiment of the present invention, there is provided a bandwidth scheduling method applied to a distributed container system of a server.
Fig. 1 is a schematic diagram of a main flow of a bandwidth scheduling method of a distributed container system according to an embodiment of the present invention. As shown in fig. 1, the method mainly comprises:
Step S101: receiving a task execution request; the task execution request includes bandwidth information required for executing the task;
Step S102: determining a target node agent for executing the task from one or more node agents according to residual bandwidth information corresponding to the one or more node agents in the distributed container and bandwidth information required by executing the task, and correspondingly generating a scheduling task; the node agent obtains the residual bandwidth information according to the monitoring result of the bandwidth flow monitoring service;
Step S103: executing the scheduling task with the target node agent in response to the task execution request.
In the embodiment of the invention, the bandwidth flow monitoring service can be set on each node agent of the distributed container system, and other threads (such as a task execution thread, a reporting thread and the like) of the node agent are mutually independent of the bandwidth flow monitoring service, so that even if the bandwidth flow monitoring service has a problem, the normal operation of the node agent is not influenced. In addition, the bandwidth information of each node agent is detected in real time through the bandwidth flow monitoring service, and other threads (such as task execution threads and reporting threads) of the node agent and the bandwidth flow monitoring service can run asynchronously, so that the processing efficiency of the node agent is improved.
Compared with the prior-art approach, in which node agents monitor bandwidth metrics through modifications to their source code, this approach greatly reduces the amount of code that has to be written. When an error is reported, the node agent corresponding to the bandwidth flow monitoring service can be located, so that the source of the error is found and the error can be corrected in time. Furthermore, because the node agent and the bandwidth flow monitoring service are independent of each other, an error in the bandwidth flow monitoring service does not affect the normal operation of the node agent. This avoids the prior-art problem that, because bandwidth monitoring runs as an integral part of the node agent, the whole node agent becomes unavailable as soon as bandwidth monitoring fails, and it improves the robustness and stability of the distributed container system.
Nodes in the distributed container system can be divided into a master node and other nodes, and node agents run on the other nodes of the distributed container. The master node is used for receiving various requests of users, managing and scheduling other nodes according to the requests, processing various data reported by the node agents, and finally completing the scheduling of the agent tasks of the various nodes, so that the efficient operation of the distributed container is realized. The master node can be any node in the distributed container system, and node agents run on other nodes except the master node.
In the actual application process, the interaction process between the user request and the distributed container system is completed by the master node, the interaction process between the bandwidth flow monitoring service and the distributed container is completed by the node agent, and the data transmission between the master node and the node agent is completed inside the distributed container.
In the present invention, in order to distinguish between different tasks, bandwidth information required for executing the tasks may include bandwidth traffic, task names, occupied memory, and the like.
In an alternative embodiment, as shown in fig. 2, before step S101, the method may further include:
Step S201: receiving, by using the node agent, a monitoring result reported by the successfully registered bandwidth flow monitoring service;
Step S202: determining residual bandwidth information according to the real-time bandwidth information in the monitoring result and the bandwidth allocation function, and reporting the residual bandwidth information.
In practice, the bandwidth flow monitoring service can monitor the bandwidth information of the node on which it runs in real time, but it cannot upload that information in real time. To reduce the resources consumed by the reporting process, a preset upload period can be set: the node agent obtains, once per preset time period, the real-time bandwidth information reported through the registration service of the bandwidth flow monitoring service, and each report contains all data collected during the current upload period. This lowers the upload frequency without discarding any bandwidth data, which preserves the accuracy of the subsequent calculation of the node agent's residual bandwidth information.
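The periodic, batched upload can be sketched as a small buffer that is flushed once per upload period. The class name `PeriodicReporter`, the `send` callback, and the use of caller-supplied timestamps in seconds are assumptions made for illustration.

```python
class PeriodicReporter:
    """Buffers bandwidth samples and uploads them once per period (illustrative only)."""

    def __init__(self, period_seconds, send, start_time=0.0):
        self._period = period_seconds
        self._send = send            # callback that delivers a batch to the node agent
        self._last_flush = start_time
        self._buffer = []

    def record(self, timestamp, sample):
        # Every sample is kept, so no bandwidth data is lost between uploads.
        self._buffer.append(sample)
        if timestamp - self._last_flush >= self._period:
            # One upload carries all data collected in the current period.
            self._send(list(self._buffer))
            self._buffer.clear()
            self._last_flush = timestamp
```

With a 300-second period and one sample per minute, each upload carries five samples, which matches the 5-minute window used in the weighting examples.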
Different weights may be set in the bandwidth allocation function to obtain the result, and in an alternative embodiment, determining the remaining bandwidth information from the real-time bandwidth information and the bandwidth allocation function includes: and carrying out average weighted summation on the real-time bandwidth information according to a plurality of first weights and summation functions indicated by the bandwidth allocation function to obtain the current residual bandwidth information to be reported.
For example, the preset uploading period is 5 minutes, that is, the bandwidth flow monitoring service performs registration service every 5 minutes, and data of approximately 5 minutes is reported to the proxy node. The weight ratios of the data of approximately 5 minutes are 20%, as shown in Table 1.
Table 1 First weight summation table
Time (minutes before the current moment) | 5 | 4 | 3 | 2 | 1 |
Weight | 20% | 20% | 20% | 20% | 20% |
When the bandwidth data of the most recent 5 minutes is relatively even, equal weights yield accurate residual bandwidth information. If the bandwidth data fluctuates strongly, however, equal weights cause a larger deviation in the result. In that case another alternative embodiment can be used, in which obtaining the residual bandwidth information corresponding to the current registration service according to the real-time bandwidth information and the preset bandwidth allocation function includes: performing a weighted summation of the real-time bandwidth information according to a plurality of second weights and a summation function indicated by the allocation function; wherein the closer the real-time bandwidth information is to the current moment, the larger its second weight; and wherein the sum of the plurality of second weights is 1.
For example, the preset uploading period is 5 minutes, that is, the bandwidth flow monitoring service performs the registration service every 5 minutes and reports the data of the most recent 5 minutes to the node proxy. The minutes of that data are given different weights, as shown in Table 2.
Table 2 Second weight summation table
Time (minutes before the current moment) | 5 | 4 | 3 | 2 | 1 |
Weight | 5% | 10% | 15% | 25% | 45% |
In general, the bandwidth demand follows a trend during task execution: for a large task the bandwidth demand does not jump suddenly but rises gradually, and when the task is about to complete it falls gradually again. The real-time bandwidth information closest to the current moment is therefore the most indicative of the residual bandwidth information.
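The two weighting schemes can be compared with a minimal sketch. It assumes that the samples are per-minute values ordered from oldest to newest and that the weights of Tables 1 and 2 are applied in that same order; the function name `remaining_bandwidth` and the sample values are hypothetical and are not taken from the embodiment.

```python
from typing import Sequence

# Weights from Table 1 and Table 2, ordered from the oldest sample
# (5 minutes ago) to the newest sample (1 minute ago).
EQUAL_WEIGHTS = (0.20, 0.20, 0.20, 0.20, 0.20)
RECENCY_WEIGHTS = (0.05, 0.10, 0.15, 0.25, 0.45)


def remaining_bandwidth(samples: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-minute bandwidth samples (oldest first)."""
    if len(samples) != len(weights):
        raise ValueError("one weight is required per sample")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(s * w for s, w in zip(samples, weights))


# Hypothetical samples in Mbps, oldest first: the free bandwidth is shrinking.
samples = [100.0, 90.0, 80.0, 60.0, 40.0]
equal = remaining_bandwidth(samples, EQUAL_WEIGHTS)     # about 74 Mbps
recent = remaining_bandwidth(samples, RECENCY_WEIGHTS)  # about 59 Mbps
```

With a falling trend, the recency-weighted estimate is lower than the equally weighted one, so it tracks the latest state of the node more closely.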
In a further alternative embodiment, as shown in fig. 3, the specific implementation of step S201 may include:
step S301: acquiring a registration service of the bandwidth flow monitoring service by using a node proxy, wherein the registration service comprises a socket;
step S302: registering the bandwidth flow service according to the socket, and returning a status code to the bandwidth flow monitoring service; the state code characterizes the success or failure of registration;
step S303: and receiving a monitoring result sent by the bandwidth flow monitoring service which is successfully registered.
In the present invention, a socket is an abstraction of an endpoint for bi-directional communication between application processes on different hosts in a network. A socket is one end of the communication between processes over the network and provides a mechanism for application layer processes to exchange data using network protocols. In step S301, the registration service sends a socket to the node proxy.
The status code characterizes either registration success or registration failure. If the status code characterizes success, step S303 is performed after step S302. If the status code characterizes failure, in an alternative embodiment the registration may be repeated, for example 3 times, that is, the registration step is repeated immediately after the failure; this handles temporary failures with sudden causes such as network errors. If registration still fails after 3 repetitions, the failure is not caused by such a sudden event. In that case another alternative embodiment may be used, in which the real-time bandwidth information of the current preset time period is stored temporarily and reported together with the next period's data when the registration service reports in the next time period. This handles the case in which too many requests make the registration volume too large and rate limiting prevents normal registration: registration is suspended temporarily and performed again once the registration peak has passed.
In an alternative embodiment, Kubernetes may be employed as the tool for container orchestration and scheduling, where each node is a Kubernetes Node and Kubelet is the node proxy running on each node, whose primary function is to report node status. Specifically, the registration procedure between the bandwidth traffic monitoring service and the node proxy is shown in fig. 4, and may include the following steps:
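The retry-then-defer behaviour described above can be sketched as follows. This is an illustration only: the class `Reporter`, the callable `register`, and the decision to count three attempts in total are assumptions, and the real registration exchange over a socket is replaced by a function that returns `True` on success.

```python
from typing import Callable, List

MAX_ATTEMPTS = 3  # assumed reading of "repeated 3 times"


class Reporter:
    """Hypothetical helper on the monitoring-service side."""

    def __init__(self, register: Callable[[], bool]) -> None:
        self._register = register       # True when the status code signals success
        self.pending: List[float] = []  # samples held over from failed periods

    def report(self, period_samples: List[float]) -> List[float]:
        """Return the samples delivered this period (empty if registration failed)."""
        batch = self.pending + period_samples
        for _ in range(MAX_ATTEMPTS):
            if self._register():
                self.pending = []
                return batch
        # Still failing after the retries: keep the data for the next period.
        self.pending = batch
        return []


# Registration fails throughout the first period and succeeds in the second.
outcomes = iter([False, False, False, True])
reporter = Reporter(lambda: next(outcomes))
first = reporter.report([1.0, 2.0])
second = reporter.report([3.0])
```

Holding the failed period's samples and prepending them to the next report keeps the data complete, which matches the stated goal of reducing upload frequency without losing bandwidth data.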
Step S401: the bandwidth flow monitoring service sends a registration service to the node proxy, and the node proxy Kubelet judges whether the registration is passed or not;
step S402: under the condition that the registration service passes, the bandwidth flow monitoring service sends real-time bandwidth information monitored by the current node to the node proxy Kubelet;
step S403: the node proxy Kubelet invokes a bandwidth allocation function to determine the remaining bandwidth information for the current node.
After each registration service determines the remaining bandwidth information of the current node, the node proxy Kubelet may also perform steps S404 and S405:
step S404: the node proxy Kubelet sends the residual bandwidth information of the current node to a container arrangement engine master node by calling a software service;
step S405: and updating the data state of the current node through the residual bandwidth information of the current node determined in the registration process.
For the step of determining a target node proxy for performing a task, in an alternative embodiment, as shown in fig. 5, specifically may include:
step S501: judging whether the first bandwidth flow M1 in each piece of residual bandwidth information is larger than the second bandwidth flow M2 in the bandwidth information required by executing the task or not;
Step S502: for each node proxy whose first bandwidth traffic M1 is greater than the second bandwidth traffic M2, scoring the node proxy according to its node information and a preset scoring strategy;
step S503: and selecting the node agent meeting the preset score threshold as a target node agent according to the scoring result.
For a node agent whose first bandwidth traffic is not greater than the second bandwidth traffic, the flow jumps from step S501 to step S504: the node agent is not invoked for the task execution request.
In an alternative embodiment, a node agent whose first bandwidth flow is larger than the second bandwidth flow may be set to the executable-task state, and a node agent whose first bandwidth flow is not larger than the second bandwidth flow may be set to the unexecutable-task state. An executable task table may then be generated from the one or more node agents in the executable-task state, to facilitate the subsequent score calculation.
Further, in an alternative embodiment, the node information of the one or more node agents includes any one or more of: image distribution, central processing unit, remaining disk space, and central processing unit load.
Illustratively, the node information comprises the image distribution, the central processor, the remaining disk space, and the central processor load, and the preset scoring strategy assigns weights to the scores for the residual bandwidth information, the image distribution, the central processor, the remaining disk space, and the central processor load, as shown in Table 3.
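The bandwidth check of step S501 amounts to a simple partition of the node agents. A minimal sketch, assuming the remaining bandwidth of each node agent is available as a number in Mbps; the function name `split_by_bandwidth` and the node names are hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_bandwidth(
    remaining: Dict[str, float], required: float
) -> Tuple[List[str], List[str]]:
    """Partition node agents into (executable, unexecutable) by the S501 check."""
    executable = [name for name, m1 in remaining.items() if m1 > required]
    unexecutable = [name for name, m1 in remaining.items() if m1 <= required]
    return executable, unexecutable


# Hypothetical remaining bandwidth (M1) per node agent; the task needs M2 = 10 Mbps.
remaining = {"node-1": 50.0, "node-2": 8.0, "node-3": 10.0}
executable, unexecutable = split_by_bandwidth(remaining, required=10.0)
```

Because the comparison is strict, a node agent whose remaining bandwidth merely equals the requirement (here `node-3`) is placed in the unexecutable group.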
Table 3 preset scoring strategy weight table
Composite score = residual bandwidth information score × 2 + image distribution score × 2 + central processor score × 3 + central processor load score × 4 + remaining disk space score × 3. The individual scores for the residual bandwidth information, the image distribution, the central processor, the remaining disk space, and the central processor load can be defined by the user according to different calculation rules.
Illustratively, in an alternative embodiment the node proxy may calculate the score of the residual bandwidth information by invoking a software service, as follows: when the container orchestration engine schedules the software service, the software service sets a demand value and calculates the score from the demand value, the residual bandwidth information, and the maximum priority: score = maximum priority - maximum priority × (demand value / residual bandwidth information).
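The composite score is a fixed-weight linear combination, which can be sketched as below. The weights are those of the formula above; the dictionary keys, the function name `composite_score`, and the example scores are hypothetical, and "image distribution" refers to the item the translated text calls mirror distribution.

```python
from typing import Mapping

# Weights taken from the composite score formula in the text.
WEIGHTS = {
    "remaining_bandwidth": 2,
    "image_distribution": 2,
    "cpu": 3,
    "cpu_load": 4,
    "disk_remaining": 3,
}


def composite_score(scores: Mapping[str, float]) -> float:
    """Sum of each component score multiplied by its weight."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical component scores for one node agent.
node_scores = {
    "remaining_bandwidth": 8,
    "image_distribution": 6,
    "cpu": 5,
    "cpu_load": 4,
    "disk_remaining": 7,
}
total = composite_score(node_scores)  # 16 + 12 + 15 + 16 + 21 = 80
```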
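The bandwidth score can be sketched as follows, reading the formula as maximum priority minus maximum priority times the ratio of demand to remaining bandwidth. The multiplication, the default `max_priority` of 10, and the function name `bandwidth_score` are assumptions made for illustration.

```python
def bandwidth_score(demand: float, remaining: float, max_priority: float = 10.0) -> float:
    """Score a node agent: the more headroom beyond the demand, the higher the score."""
    if remaining <= 0:
        raise ValueError("remaining bandwidth must be positive")
    return max_priority - max_priority * (demand / remaining)


# A task demanding 10 Mbps scores higher on the node with more free bandwidth.
tight = bandwidth_score(demand=10.0, remaining=20.0)  # 5.0
roomy = bandwidth_score(demand=10.0, remaining=40.0)  # 7.5
```

Under this reading the score approaches the maximum priority as the remaining bandwidth grows and falls to zero when the demand equals the remaining bandwidth.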
After scoring the node agents, in an alternative embodiment, the score may be correspondingly filled into the executable task table to obtain an executable task score table, and a node agent meeting a preset score threshold may be selected from the executable task score table as the target node agent.
If several node agents meet the preset score threshold at the same time, they can be ranked from high to low and the node agent with the highest score is taken as the target node agent. If several node agents share the highest score and all meet the preset score threshold, one of them can be chosen arbitrarily to execute the task, or they can be sorted by their historical numbers of executed tasks and the node agent with the fewest historical executions selected to execute the current task. This distributes tasks evenly across node agents, avoids as far as possible the situation in which one node agent is invoked repeatedly while others sit idle, and reduces the wear caused by frequently invoking the same node agent.
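The selection rule, including the tie-break on historical execution counts, can be sketched as below. The function name `select_target`, the use of "greater than or equal to" for meeting the threshold, and the example data are assumptions.

```python
from typing import Dict, Optional


def select_target(
    scores: Dict[str, float], history: Dict[str, int], threshold: float
) -> Optional[str]:
    """Pick the highest-scoring eligible node agent; break ties by fewest past tasks."""
    eligible = {name: s for name, s in scores.items() if s >= threshold}
    if not eligible:
        return None
    best = max(eligible.values())
    tied = [name for name, s in eligible.items() if s == best]
    return min(tied, key=lambda name: history.get(name, 0))


# node-1 and node-2 tie on score; node-2 has executed fewer tasks so far.
scores = {"node-1": 80.0, "node-2": 80.0, "node-3": 60.0}
history = {"node-1": 12, "node-2": 3, "node-3": 0}
target = select_target(scores, history, threshold=70.0)
```

Choosing the least-used node agent among the tied candidates is what spreads tasks evenly; picking one at random would also satisfy the text but would not balance the load.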
In an alternative embodiment, after determining the target node agent for performing the task, the method further comprises: and in the preset time, setting the bandwidth information required by the target node agent for executing the scheduling task into a reserved state, so that the bandwidth information in the reserved state cannot be called by other tasks to meet the bandwidth requirement of the scheduling task in the preset time.
The reserved bandwidth may be the average of the bandwidth required at each moment of executing the scheduling task. For example, task A needs 3 minutes to complete and requires 3 Mbps in the first minute, 10 Mbps in the second minute, and 5 Mbps in the third minute. After the task starts, 6 Mbps is reserved for each minute; only the second minute's requirement exceeds the reserved value, and at that point the task may draw the shortfall from bandwidth that is not reserved.
The reserved bandwidth may also be the maximum of the bandwidth values required at each moment of executing the scheduling task. For example, task B needs 3 minutes to complete and requires 3 Mbps in the first minute, 10 Mbps in the second minute, and 5 Mbps in the third minute. After the task starts, 10 Mbps is reserved for each minute. This mode guarantees complete execution of the whole task. Compared with reserving the average value, however, when the bandwidth demand within a task fluctuates strongly the reserved bandwidth cannot be fully used, a large amount of bandwidth sits idle, and the execution of other tasks is easily affected.
In another alternative embodiment, after the task has been executed, the bandwidth may remain reserved for the task for a preset time period. When the same task needs to be executed again within a short time, it is assigned directly to the node agent that executed it last, without repeating the calculation, which saves resources.
In an alternative embodiment, fig. 6 shows an overall schematic of the bandwidth scheduling process after a user initiates a task request. A node agent and a bandwidth flow monitoring service are deployed on each of node 1, node 2 and node 3. After the user initiates a task request, the node agents on node 1, node 2 and node 3 each obtain the residual bandwidth information of their own node through the bandwidth flow monitoring service and report it to the master node of the container orchestration engine. The master node determines the comprehensive score of each node agent according to a bandwidth allocation function, namely score 1 for node 1, score 2 for node 2 and score 3 for node 3, and finally determines the target node agent according to score 1, score 2 and score 3; the target node agent executes the task request initiated by the user.
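The trade-off between the two reservation policies can be checked with the figures from the two examples. A minimal sketch; the function names are hypothetical and the per-minute demand list is the 3, 10, 5 Mbps profile used in the text.

```python
from typing import List


def reserve_average(demand: List[float]) -> float:
    """Reserve the mean of the per-minute bandwidth demand."""
    return sum(demand) / len(demand)


def reserve_peak(demand: List[float]) -> float:
    """Reserve the maximum of the per-minute bandwidth demand."""
    return max(demand)


def shortfall(demand: List[float], reserved: float) -> List[float]:
    """Bandwidth per minute that must come from unreserved capacity."""
    return [max(0.0, d - reserved) for d in demand]


def idle(demand: List[float], reserved: float) -> List[float]:
    """Reserved bandwidth per minute that the task does not use."""
    return [max(0.0, reserved - d) for d in demand]


demand = [3.0, 10.0, 5.0]            # Mbps per minute, as in the examples
avg = reserve_average(demand)         # 6.0
peak = reserve_peak(demand)           # 10.0
avg_shortfall = shortfall(demand, avg)  # [0.0, 4.0, 0.0]
peak_idle = idle(demand, peak)          # [7.0, 0.0, 5.0]
```

Average reservation leaves a 4 Mbps shortfall in the second minute, while peak reservation never falls short but leaves 12 Mbps-minutes idle over the task.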
In an alternative embodiment, the bandwidth flow monitoring service may capture all data packets by intercepting the network card driver, calculate the data size and the average rate, and obtain the consumed bandwidth across multiple network card drivers. The maximum bandwidth information of the current network card is obtained from the network card attributes, and the consumed bandwidth information is subtracted from it to obtain the real-time bandwidth information of the current node. The real-time bandwidth information can thus be understood as real-time residual bandwidth information.
According to the bandwidth scheduling method of the distributed container system, the node proxy obtains the residual bandwidth information from the bandwidth flow monitoring service asynchronously, without affecting the normal operation of the node proxy; a target node proxy for executing the task is determined according to the bandwidth information required by the task execution request, and different scheduling tasks are generated accordingly. The node proxies are thereby scheduled and used reasonably, which avoids abnormal interruption, long execution time, and stuck service requests during task execution.
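The arithmetic of this monitoring step can be sketched as below. Packet capture itself is out of scope here: the sketch assumes that the byte count and observation window of each captured flow are already known, and the function names and figures are hypothetical.

```python
from typing import Sequence, Tuple


def average_rate_mbps(size_bytes: int, seconds: float) -> float:
    """Average transmission rate of one flow in Mbps."""
    return size_bytes * 8 / seconds / 1_000_000


def consumed_bandwidth_mbps(flows: Sequence[Tuple[int, float]]) -> float:
    """Sum of the average rates of all flows, each given as (bytes, seconds)."""
    return sum(average_rate_mbps(size, secs) for size, secs in flows)


def real_time_bandwidth_mbps(max_mbps: float, flows: Sequence[Tuple[int, float]]) -> float:
    """Maximum bandwidth of the network card minus the consumed bandwidth."""
    return max(0.0, max_mbps - consumed_bandwidth_mbps(flows))


# Two hypothetical flows observed over 60 seconds on a 1000 Mbps network card.
flows = [(75_000_000, 60.0), (15_000_000, 60.0)]
consumed = consumed_bandwidth_mbps(flows)          # 10 + 2 = 12 Mbps
free = real_time_bandwidth_mbps(1000.0, flows)     # 988 Mbps
```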
According to a second aspect of the embodiment of the present invention, there is provided a bandwidth scheduling apparatus applied to a distributed container system of a server.
Fig. 7 is a schematic diagram of main modules of a bandwidth scheduling apparatus 700 of a distributed container system according to a second aspect of an embodiment of the present invention. As shown in fig. 7, the apparatus 700 includes:
a receiving module 701, configured to receive a task execution request; the task execution request includes bandwidth information required for executing the task;
a processing module 702, configured to determine a target node agent for executing a task from the one or more node agents according to the remaining bandwidth information corresponding to the one or more node agents in the distributed container and the bandwidth information required for executing the task, and correspondingly generate a scheduling task; the node agent obtains the residual bandwidth information according to the monitoring result of the bandwidth flow monitoring service;
an execution module 703, configured to execute the scheduled task with the target node proxy in response to the task execution request.
In one embodiment of the invention, the apparatus further comprises: a registration module 704, configured to receive, by using the node proxy, a monitoring result reported by the bandwidth flow monitoring service that has been successfully registered; and determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result, and reporting the residual bandwidth information.
The receiving module 701 and the processing module 702 are both disposed on the master node, and the executing module 703 and the registering module 704 are disposed on the node corresponding to the node proxy.
In one embodiment of the present invention, the registration module 704 is further configured to obtain, with the node proxy, a registration service of the bandwidth traffic monitoring service, where the registration service includes a socket; registering the bandwidth flow service according to the socket, and returning a status code to the bandwidth flow monitoring service; the state code characterizes the success or failure of registration; and receiving a monitoring result sent by the bandwidth flow monitoring service which is successfully registered.
In one embodiment of the present invention, the registration module 704 is further configured to obtain, by using the node proxy, a monitoring result reported by a registration service of the bandwidth flow monitoring service according to a preset time period; and updating the current residual bandwidth information according to the real-time bandwidth information in the monitoring result acquired in each time period, and reporting the updating result.
In one embodiment of the present invention, the processing module 702 is further configured to determine whether the first bandwidth traffic in the remaining bandwidth information is greater than the second bandwidth traffic in the bandwidth information required for performing the task; for each node proxy whose first bandwidth traffic is greater than the second bandwidth traffic, to score the node proxy according to its node information and a preset scoring strategy; and to select the node agent meeting a preset score threshold as the target node agent according to the scoring result.
In one embodiment of the present invention, the node information of the one or more node agents includes any one or more of the following: image distribution, central processing unit, remaining disk space, and central processing unit load.
In one embodiment of the invention, the apparatus further comprises: and the reservation module is used for setting bandwidth information required by the target node agent for executing the scheduling task into a reservation state in a preset time after the target node agent for executing the task is determined, so that the bandwidth information in the reservation state cannot be called by other tasks to meet the bandwidth requirement of the scheduling task in the preset time.
In one embodiment of the present invention, the registration module 704 is further configured to perform average weighted summation on the real-time bandwidth information according to a plurality of first weights and summation functions indicated by the bandwidth allocation function, so as to obtain remaining bandwidth information to be currently reported.
In one embodiment of the present invention, the registration module 704 is further configured to perform weighted summation on the real-time bandwidth information according to a plurality of second weights and summation functions indicated by the allocation function; wherein, the second weight corresponding to the real-time bandwidth information which is closer to the current moment is larger; wherein the sum of the plurality of second weights is 1.
According to a third aspect of the embodiments of the present invention, there is further provided a task scheduling system including the task scheduling device provided in any one of the embodiments, where the task scheduling system further includes a monitoring device for providing a bandwidth traffic monitoring service, where the monitoring device is configured to obtain, by using the bandwidth traffic monitoring service, one or more data packets corresponding to drivers of one or more network cards, and maximum bandwidth information corresponding to the one or more network cards, respectively; and determining the corresponding bandwidth consumption information of the network card according to the data packet, and taking the difference between the maximum bandwidth information of the network card and the corresponding bandwidth consumption information as the real-time bandwidth information of the network card.
The monitoring device may be provided in the task scheduling device, or may be provided independently of the task scheduling device. It can be understood that even if the monitoring device is disposed on the task scheduling device, the functions of the monitoring device and the task scheduling device are independent, that is, the error of the bandwidth flow monitoring service does not affect the normal operation of the node agent.
In one embodiment of the present invention, in a case where one of the network cards corresponds to a plurality of data packets, the bandwidth traffic monitoring service is configured to, for that single network card: determine the average transmission rate of each of the plurality of data packets corresponding to the network card; and add up the average transmission rates of the data packets to determine the consumed bandwidth information of the network card.
Fig. 8 illustrates an exemplary system architecture 800 to which the bandwidth scheduling method of the distributed container system or the bandwidth scheduling apparatus of the distributed container system of embodiments of the present invention may be applied.
As shown in fig. 8, a system architecture 800 may include terminal devices 801, 802, 803, a network 804, and a plurality of servers 805, 806, 807. The network 804 is a medium used to provide communication links between the terminal devices 801, 802, 803 and the server 805, and between the respective servers 805, 806, 807. The network 804 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user can interact with the server 805 through the network 804 using the terminal devices 801, 802, 803 to transmit a task execution request or receive response information of the request, or the like. Various communication client applications may be installed on the terminal devices 801, 802, 803, such as an online service application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 801, 802, 803 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The servers 805, 806, 807 may be servers providing various services, such as a background management server providing support for online service requests sent by users using the terminal devices 801, 802, 803, or a server scheduling tasks. The background management server may analyze and process the received data such as the task execution request, and feed back the processing result (for example, the allocated bandwidth information) to the terminal device.
It should be noted that, the bandwidth scheduling method of the distributed container provided in the first aspect of the embodiment of the present invention is generally executed by the servers 805, 806, 807, and accordingly, the bandwidth scheduling system of the distributed container system provided in the second aspect of the embodiment of the present invention is generally disposed in the servers 805, 806, 807.
It should be understood that the number of terminal devices, networks and servers in fig. 8 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 9, there is illustrated a schematic diagram of a computer system 900 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 9 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 9, the computer system 900 includes a Central Processing Unit (CPU) 901, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the system 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like; a storage section 908 including a hard disk or the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 910 so that a computer program read out therefrom is installed into the storage section 908 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from the network via the communication portion 909 and/or installed from the removable medium 911. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 901.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, which may for example be described as: a processor including a receiving module, a processing module, an executing module, and a registration module. The names of these modules do not in some cases limit the modules themselves; for example, the receiving module may also be described as "a module for receiving a task execution request".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform operations including: receiving a task execution request; the task execution request comprises bandwidth information required for executing a task; determining a target node agent for executing the task from the one or more node agents according to one or more pieces of residual bandwidth information corresponding to the one or more node agents in the distributed container and the bandwidth information required by the task, and correspondingly generating a scheduling task; the node agent operates on the nodes of the distributed container, and the residual bandwidth information is obtained by the node agent through monitoring by a bandwidth flow monitoring service; and executing the scheduling task by using the target node agent to respond to the task execution request.
According to the distributed container system and the bandwidth scheduling method thereof, the node proxy obtains the residual bandwidth information from the bandwidth flow monitoring service asynchronously, without affecting the normal operation of the node proxy; a target node proxy for executing the task is determined according to the bandwidth information required in the task execution request, and different scheduling tasks are generated accordingly. The node proxies are thereby scheduled and used reasonably, which avoids abnormal interruption, long execution time, and stuck service requests during task execution.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Claims (14)
1. A method of task scheduling for a distributed container system, the method comprising:
receiving a task execution request; the task execution request comprises bandwidth information required for executing a task;
determining a target node agent for executing the task from the one or more node agents according to residual bandwidth information corresponding to the one or more node agents in the distributed container and bandwidth information required by the task, and correspondingly generating a scheduling task; the residual bandwidth information is obtained by the node agent according to a monitoring result of the bandwidth flow monitoring service;
and executing the scheduling task by using the target node agent to respond to the task execution request.
2. The method as recited in claim 1, further comprising:
receiving, by the node agent, a monitoring result reported by the successfully registered bandwidth flow monitoring service;
and determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result, and reporting the residual bandwidth information.
3. The method according to claim 2, wherein the receiving, by the node agent, the monitoring result reported by the successfully registered bandwidth flow monitoring service comprises:
acquiring a registration service of the bandwidth flow monitoring service by using the node agent, wherein the registration service comprises a socket;
registering the bandwidth flow monitoring service according to the socket, and returning a status code to the bandwidth flow monitoring service; the status code characterizes the success or failure of registration;
and receiving the real-time bandwidth information of the monitoring result sent by the bandwidth flow monitoring service which is successfully registered.
4. The method of claim 2, further comprising:
acquiring a monitoring result reported by a registration service of the bandwidth flow monitoring service according to a preset time period by using the node agent;
and updating the current residual bandwidth information according to the real-time bandwidth information in the monitoring result acquired in each time period, and reporting the updating result.
5. The method of claim 1, wherein the determining a target node agent for performing a task from the one or more node agents based on the remaining bandwidth information corresponding to the one or more node agents in the distributed container and the bandwidth information required for performing the task, respectively, comprises:
judging whether the first bandwidth flow in each piece of residual bandwidth information is larger than the second bandwidth flow in the bandwidth information required by executing the task or not;
for each node agent whose first bandwidth flow is larger than the second bandwidth flow: scoring the node agent according to its node information and a preset scoring strategy;
and selecting the node agent meeting a preset score threshold as the target node agent according to the scoring result.
6. The method of claim 5, wherein the node information of the one or more node agents includes any one or more of: image distribution, central processing unit, remaining disk space, and central processing unit load.
7. The method of claim 5, wherein after said determining a target node agent for performing a task, the method further comprises:
setting the bandwidth information required by the target node agent for executing the scheduling task to a reserved state for a preset time, so that, within the preset time, the bandwidth information in the reserved state cannot be called by other tasks and the bandwidth requirement of the scheduling task is met.
8. The method of claim 2, wherein
the determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result comprises the following step:
carrying out average weighted summation on the real-time bandwidth information according to a plurality of first weights and a summation function indicated by the bandwidth allocation function to obtain the current residual bandwidth information to be reported.
9. The method of claim 2, wherein the determining the residual bandwidth information according to the real-time bandwidth information and the bandwidth allocation function in the monitoring result comprises:
weighting and summing the real-time bandwidth information according to a plurality of second weights and a summation function indicated by the bandwidth allocation function; wherein the closer the real-time bandwidth information is to the current moment, the larger its corresponding second weight; and wherein the sum of the plurality of second weights is 1.
10. A task scheduling device of a distributed container system, comprising:
the receiving module is used for receiving a task execution request; the task execution request comprises bandwidth information required for executing a task;
the processing module is used for determining a target node agent for executing the task from the one or more node agents according to the residual bandwidth information corresponding to the one or more node agents in the distributed container and the bandwidth information required by the task, and correspondingly generating a scheduling task; the residual bandwidth information is obtained by the node agent according to a monitoring result of the bandwidth flow monitoring service;
and the execution module is used for executing the scheduling task by utilizing the target node agent so as to respond to the task execution request.
11. A task scheduling system comprising the task scheduling device of the distributed container system of claim 10 and a monitoring device for providing a bandwidth flow monitoring service; wherein,
the monitoring device is used for acquiring one or more data packets respectively corresponding to the driving programs of one or more network cards and the maximum bandwidth information respectively corresponding to the one or more network cards through the bandwidth flow monitoring service; and determining the corresponding bandwidth consumption information of the network card according to the data packet, and taking the difference between the maximum bandwidth information of the network card and the corresponding bandwidth consumption information as the real-time bandwidth information of the network card.
12. The system of claim 11, wherein
the bandwidth flow monitoring service is configured to, in a case where one network card corresponds to a plurality of data packets, for a single network card: determine the average transmission rate corresponding to each of the plurality of data packets of the network card; and add the average transmission rates of the data packets to determine the consumed bandwidth information of the network card.
13. A bandwidth scheduling apparatus for a distributed container system, comprising: one or more processors;
a storage system for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-9.
14. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-9.
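The bandwidth calculation recited in claims 11 and 12 can be illustrated with a short sketch: the consumed bandwidth of a network card is taken as the sum of the average transmission rates of its data packets, and the real-time bandwidth is the card's maximum bandwidth minus that sum. The `(bytes, seconds)` packet record, the Mbit/s unit, and the clamp at zero are assumptions made for the example and are not stated in the source.

```python
from typing import Sequence, Tuple

# Hypothetical packet record: (bytes transferred, seconds elapsed).
Packet = Tuple[int, float]


def consumed_mbps(packets: Sequence[Packet]) -> float:
    """Sum of the average transmission rates of a card's packets, in Mbit/s."""
    return sum(
        (size_bytes * 8 / 1_000_000) / seconds
        for size_bytes, seconds in packets
        if seconds > 0  # skip records with no measurable duration
    )


def realtime_mbps(max_mbps: float, packets: Sequence[Packet]) -> float:
    """Real-time bandwidth: card maximum minus consumed bandwidth."""
    # Clamping at zero is an assumption, guarding against measurement overshoot.
    return max(0.0, max_mbps - consumed_mbps(packets))


packets = [(1_250_000, 1.0), (2_500_000, 2.0)]  # each averages 10 Mbit/s
available = realtime_mbps(1000.0, packets)
```

In this example the two packets consume 20 Mbit/s in total, so a 1000 Mbit/s card is left with 980 Mbit/s of real-time bandwidth.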
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210093359.2A CN116541122A (en) | 2022-01-26 | 2022-01-26 | Task scheduling method, device and system of distributed container system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210093359.2A CN116541122A (en) | 2022-01-26 | 2022-01-26 | Task scheduling method, device and system of distributed container system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116541122A (en) | 2023-08-04 |
Family
ID=87452994
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210093359.2A Pending CN116541122A (en) | 2022-01-26 | 2022-01-26 | Task scheduling method, device and system of distributed container system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116541122A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7930344B2 (en) | Incremental run-time session balancing in a multi-node system | |
US8719297B2 (en) | System for managing data collection processes | |
CN108667748B (en) | Method, device, equipment and storage medium for controlling bandwidth | |
US7773522B2 (en) | Methods, apparatus and computer programs for managing performance and resource utilization within cluster-based systems | |
US8015281B2 (en) | Dynamic server flow control in a hybrid peer-to-peer network | |
US7543060B2 (en) | Service managing apparatus for keeping service quality by automatically allocating servers of light load to heavy task | |
CN109873868A (en) | A kind of computing capability sharing method, system and relevant device | |
JP5617914B2 (en) | Throughput maintenance support system, apparatus, method, and program | |
US11949737B1 (en) | Allocation of server resources in remote-access computing environments | |
CN104243405A (en) | Request processing method, device and system | |
CN109428926B (en) | Method and device for scheduling task nodes | |
CN114116173A (en) | Method, device and system for dynamically adjusting task allocation | |
CN112565391A (en) | Method, apparatus, device and medium for adjusting instances in an industrial internet platform | |
JP4834622B2 (en) | Business process operation management system, method, process operation management apparatus and program thereof | |
CN112887407B (en) | Job flow control method and device for distributed cluster | |
CN110336884B (en) | Server cluster updating method and device | |
US10963305B2 (en) | Low latency distributed counters for quotas | |
CN109842665B (en) | Task processing method and device for task allocation server | |
CN116541122A (en) | Task scheduling method, device and system of distributed container system | |
CN118467140B (en) | Task scheduling method and system | |
Ezzeddine et al. | Tail-Latency Aware and Resource-Efficient Bin Pack Autoscaling for Distributed Event Queues. | |
CN118802795A (en) | Bandwidth adjustment method, device, equipment, storage medium and product | |
CN118368259A (en) | Network resource allocation method, device, electronic equipment and storage medium | |
JP6322332B2 (en) | Energy management system and business application execution method | |
CN117093439A (en) | Method, device, electronic equipment and storage medium for capacity data processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||