CN109976873A - Scheduling scheme acquisition method and scheduling method for a containerized distributed computing framework - Google Patents

Scheduling scheme acquisition method and scheduling method for a containerized distributed computing framework

Info

Publication number
CN109976873A
Authority
CN
China
Prior art keywords
containerization
computational
component
distributed computing
computational frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910137847.7A
Other languages
Chinese (zh)
Other versions
CN109976873B (en
Inventor
童薇
冯丹
刘景宁
谢乘胜
邓竣中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201910137847.7A
Publication of CN109976873A
Application granted
Publication of CN109976873B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a scheduling scheme acquisition method and a scheduling method for a containerized distributed computing framework. The acquisition method determines the compute node that will run each containerized component according to the computing resources required by each unscheduled containerized component in the framework to be scheduled and the available computing resources of each compute node, such that the containerized core component and as many containerized compute components as possible are scheduled to the same compute node, thereby obtaining the scheduling scheme of the framework. For a newly created computing framework, each containerized component is scheduled to its corresponding compute node once the scheduling scheme has been obtained. If no newly created computing framework exists and the rescheduling condition is met, scheduling schemes are obtained for one or more running computing frameworks, and rescheduling is performed only when it reduces the total network communication overhead of all computing frameworks. The invention can effectively improve the performance of containerized distributed computing frameworks.

Description

Scheduling scheme acquisition method and scheduling method for a containerized distributed computing framework
Technical field
The invention belongs to the technical field of container cluster task scheduling, and more particularly relates to a scheduling scheme acquisition method and a scheduling method for a containerized distributed computing framework.
Background
Data centers today use virtualization to raise the resource utilization of physical machines; virtualization also isolates the runtime environments of different applications. Virtualization technologies include virtual machines and containers. A virtual machine must virtualize a complete guest operating system, whereas all containers running on the same physical server share the host's operating system kernel, so a user only needs to build the minimal runtime environment that a specific application requires. Compared with virtual machine technologies such as KVM and Xen, containers therefore occupy less storage space and start faster, and an application running in a container performs close to the same application running directly on the physical server. Typical container technologies include Docker, RKT, and OpenVZ. When managing containerized applications, data center administrators do not need to consider the environment that the application inside the container depends on, which simplifies application management; more and more data centers are migrating their application runtime environments from virtual machines to containers.
With the rise of big data and artificial intelligence, users submit growing numbers of big data processing tasks and deep learning training tasks to data centers, and these tasks are run on distributed computing frameworks. Typical distributed computing frameworks today include parallel computing models (such as OpenMPI), big data processing models (such as Hadoop and Spark), and deep neural network training models (such as TensorFlow). These frameworks usually consist of a core component and compute components. The core component receives the tasks that users send to the framework, splits each original task into multiple subtasks, distributes the subtasks to the compute components, collects and processes the results computed by each compute component, and finally returns the result to the user. Each compute component receives the tasks distributed by the core component, computes locally, and sends its result to the core component when it finishes. While such a framework processes a user's task, a large volume of data is exchanged between the core component and the compute components, so data communication bandwidth easily becomes the performance bottleneck of the whole framework: low bandwidth reduces the efficiency with which the framework executes tasks. A containerized distributed computing framework containerizes each component of the framework and then provides computing services to users; the communication bandwidth between its containerized core component and containerized compute components is affected by the scheduling policy of the container cluster orchestration system.
Currently popular container cluster orchestration systems, such as Docker Swarm and Kubernetes, use fairly simple container scheduling policies that do not consider the effect of inter-component data communication bandwidth on framework performance. They may schedule containerized components that exchange large amounts of data onto different nodes in the cluster, as shown in Fig. 1. The communication rate between the framework's containerized core component and containerized compute components is then limited by the network communication rate between physical nodes, so these components take longer to synchronize each other's data, which degrades the performance of the containerized computing framework when executing tasks.
Summary of the invention
In view of the defects of the prior art and the need for improvement, the invention provides a scheduling scheme acquisition method and a scheduling method for a containerized distributed computing framework, with the aim of improving the performance of containerized distributed computing frameworks.
To achieve this aim, according to a first aspect of the invention, a scheduling scheme acquisition method for a containerized distributed computing framework is provided, comprising:
(1) obtaining all unscheduled containerized components in the containerized distributed computing framework to be scheduled, to obtain a set of components to be scheduled;
(2) according to the computing resources required by each containerized component in the set of components to be scheduled and the available computing resources of each compute node in the cluster, determining the compute node that will run each containerized component in the set, such that the containerized core component and as many containerized compute components as possible are scheduled to the same compute node, thereby obtaining the scheduling scheme of the containerized distributed computing framework to be scheduled.
By scheduling the containerized core component of a computing framework to run on the same compute node in the cluster together with as many of that framework's containerized compute components as possible, the invention raises the communication rate between the containerized core component and the containerized compute components inside the framework and shortens the time these components need to synchronize each other's data. This reduces the overall time the containerized distributed computing framework takes to execute tasks and improves its performance.
Further, step (2) comprises:
(21) sorting the containerized compute components in the set of components to be scheduled in ascending order of required computing resources, to obtain an ordered component set;
(22) if the set of components to be scheduled includes the containerized core component, inserting the containerized core component as the first element of the ordered component set and going to step (23); otherwise, going directly to step (23);
(23) obtaining the total computing resources R required by all containerized components in the ordered component set;
(24) if the available computing resources of every compute node are less than the total computing resources R, going to step (25); otherwise, obtaining all compute nodes whose available computing resources are greater than or equal to R to form a candidate node set, and going to step (27);
(25) obtaining the compute node I with the most available computing resources, and determining the first m containerized components in the ordered component set that can be scheduled to compute node I, such that $\sum_{i=1}^{m} F_i \le N_{max}$, where $N_{max}$ is the available computing resources of compute node I;
Scheduling the containerized core component and the containerized compute components with smaller resource requirements first allows the containerized core component and as many containerized compute components as possible to be scheduled to the same compute node;
(26) determining compute node I as the compute node that runs these m containerized components, updating the available computing resources of compute node I to $N_{max} - \sum_{i=1}^{m} F_i$, removing the m containerized components from the ordered component set, and going to step (23);
(27) obtaining the compute node I' with the least available computing resources in the candidate node set, and determining it as the compute node that runs every containerized component in the ordered component set;
Selecting the compute node with the least available computing resources here raises the likelihood that, in later scheduling, containerized components belonging to the same computing framework are scheduled to the same node;
where i is the index of a containerized component and $F_i$ is the computing resources required by the i-th containerized component in the ordered component set.
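Steps (21) to (27) can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the `Component` type, the function name `acquire_scheme`, and the single scalar resource are assumptions, and the sketch takes step (25) to mean the longest prefix of the ordered set that fits on node I. The error raised when nothing fits is also an addition, since the text above does not cover that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    demand: float          # F_i: resources the component needs
    is_core: bool = False  # True for the framework's core component


def acquire_scheme(components, available):
    """Return {component name: node name} following steps (21)-(27).

    `available` maps node name -> free resources; it is copied, not mutated.
    """
    free = dict(available)
    # (21) compute components in ascending order of demand
    ordered = sorted((c for c in components if not c.is_core),
                     key=lambda c: c.demand)
    # (22) the core component, if still unscheduled, goes first
    ordered = [c for c in components if c.is_core] + ordered

    scheme = {}
    while ordered:
        total = sum(c.demand for c in ordered)                        # (23) R
        candidates = [n for n, cap in free.items() if cap >= total]   # (24)
        if candidates:
            # (27) best fit: the candidate with the least free resources
            node = min(candidates, key=lambda n: free[n])
            batch = ordered
        else:
            # (25) the largest node takes the longest prefix that fits
            node = max(free, key=lambda n: free[n])
            batch, used = [], 0.0
            for c in ordered:
                if used + c.demand > free[node]:
                    break
                batch.append(c)
                used += c.demand
            if not batch:
                # Assumption (not in the source): give up if nothing fits.
                raise RuntimeError("insufficient cluster resources")
        # (26) record the placement and update the node's free resources
        for c in batch:
            scheme[c.name] = node
        free[node] -= sum(c.demand for c in batch)
        ordered = ordered[len(batch):]
    return scheme
```

Because the core component sits at the head of the ordered set, it is always placed in the first batch, together with the cheapest compute components that fit beside it.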
According to a second aspect of the invention, a scheduling method for a containerized distributed computing framework is provided, comprising:
for a containerized distributed computing framework Fr that a user has created in the cluster and that needs to be scheduled, obtaining its scheduling scheme S using the scheduling scheme acquisition method provided by the first aspect of the invention;
scheduling each containerized component of computing framework Fr to its corresponding compute node according to scheduling scheme S, to complete the scheduling of computing framework Fr.
According to a third aspect of the invention, a scheduling method for a containerized distributed computing framework is provided, comprising:
(1) determining whether the cluster currently contains a user-created containerized distributed computing framework Fr that needs to be scheduled; if so, going to step (6); if not, going to step (2);
(2) obtaining the difference $\Delta t$ between the cluster's current timestamp $t_p$ and the timestamp $t_l$ at which the cluster last performed rescheduling; if $\Delta t > T$, going to step (3); otherwise, going to step (1);
(3) taking one or more containerized distributed computing frameworks running in the cluster as rescheduling objects, and obtaining a new scheduling scheme for each rescheduling object using the scheduling scheme acquisition method provided by the first aspect of the invention;
(4) computing the total network communication overhead of all computing frameworks in the cluster before and after scheduling according to the new scheduling schemes, denoted V and V' respectively; if V' < V, rescheduling all or some of the containerized components of each rescheduling object to the corresponding compute nodes according to the new scheduling schemes, and going to step (5) once rescheduling is complete; otherwise, performing no rescheduling and going to step (1);
When no newly created computing framework needs scheduling, rescheduling the containerized distributed computing frameworks already running in the cluster to reduce the total network communication overhead of all computing frameworks further improves the performance of the containerized distributed computing frameworks;
(5) updating the value of timestamp $t_l$ to the value of timestamp $t_p$, and going to step (1);
(6) obtaining the scheduling scheme S of computing framework Fr using the scheduling scheme acquisition method provided by the first aspect of the invention, and scheduling each containerized component of Fr to its corresponding compute node according to S, to complete the scheduling of Fr;
Scheduling the newly created computing framework Fr according to scheduling scheme S ensures that the containerized core component of Fr and as many of its containerized compute components as possible are scheduled to the same compute node, which raises the communication rate between components and improves the performance of the containerized distributed computing framework;
(7) after scheduling is complete, going to step (1);
where T is a preset time interval threshold.
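One pass through this loop can be sketched as a single function. The sketch is illustrative only: `scheduler_tick` and all of its parameters are hypothetical names, and the planning and overhead functions are passed in as callables because the text above does not fix how V is computed.

```python
def scheduler_tick(pending, now, last_resched, interval,
                   placement, plan_new, plan_resched, overhead):
    """One pass through steps (1)-(7); returns (placement, last_resched).

    pending      -- list of newly created frameworks awaiting scheduling
    placement    -- dict: component name -> node name for the running cluster
    plan_new     -- callable(framework, placement) -> dict of new placements
    plan_resched -- callable(placement) -> full candidate placement
    overhead     -- callable(placement) -> total network overhead (assumed)
    """
    if pending:                                      # (1) a new framework exists
        framework = pending.pop(0)
        updated = dict(placement)
        updated.update(plan_new(framework, placement))   # (6) schedule it
        return updated, last_resched                 # (7) back to step (1)
    if now - last_resched <= interval:               # (2) delta t > T not met
        return placement, last_resched
    candidate = plan_resched(placement)              # (3) new schemes
    if overhead(candidate) < overhead(placement):    # (4) V' < V
        return candidate, now                        # (5) t_l := t_p
    return placement, last_resched                   # (4) no rescheduling
```

Note that the timestamp of the last rescheduling only advances when a rescheduling is actually applied, matching steps (4) and (5).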
Further, step (3) comprises:
(31) obtaining the computing framework M with the largest network communication overhead in the cluster;
(32) taking the scale B of computing framework M as a threshold, selecting all computing frameworks in the cluster whose scale is less than B, to obtain a computing framework set H;
(33) taking computing framework M and the computing frameworks in set H as the rescheduling objects, and computing the available resources of each compute node in the cluster as if none of the rescheduling objects had been scheduled, $N_j' = N_j + F_{M,j} + F_{H,j}$, thereby obtaining the node p that had the most available computing resources before the rescheduling objects were scheduled;
Rescheduling the computing framework with the largest network communication overhead has a high probability of reducing the total network communication overhead of all computing frameworks in the cluster; smaller computing frameworks tend to need fewer computing resources, so rescheduling them makes effective use of resource fragments;
(34) for each computing framework h ∈ H, taking all of its containerized components that run on node p as a new computing framework h', thereby obtaining the computing framework set K formed by all the new computing frameworks;
(35) sorting the computing frameworks in set K in descending order of scale to obtain a computing framework queue Q, and inserting computing framework M at the head of Q;
The larger a containerized distributed computing framework, the larger its communication overhead tends to be, so obtaining the scheduling schemes of larger frameworks first effectively improves overall cluster performance;
(36) updating the available computing resources of compute node p to $W_j' = N_j + F_{M,j} + F_{H,j}$ and those of every other compute node to $W_j' = N_j + F_{M,j}$, then using the scheduling scheme acquisition method provided by the first aspect of the invention to obtain, in queue order, the scheduling scheme of each computing framework in Q;
where the scale of a computing framework is the number of containerized components it contains, j is the index of a compute node, $N_j$ is the available computing resources of the j-th compute node before rescheduling, $F_{M,j}$ is the total computing resources required by the containerized components of computing framework M that run on the j-th compute node, and $F_{H,j}$ is the total computing resources required by the containerized components of computing framework set H that run on the j-th compute node.
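The selection of rescheduling objects in steps (31) to (36) can be sketched as below. The data layout (a dict of frameworks, each with an `overhead` value and a list of `(component, node, demand)` tuples) and the function name are assumptions made for illustration. Dropping frameworks of H that have no component on node p is also an assumption; the text does not say how empty h' are handled.

```python
def select_reschedule_targets(frameworks, available):
    """Steps (31)-(36): return (queue Q, resources W_j to plan against).

    frameworks -- {name: {"overhead": number,
                          "components": [(component, node, demand), ...]}}
    available  -- {node: N_j}, free resources before rescheduling
    """
    def used_on(names, node):
        # Resources that the named frameworks currently consume on `node`.
        return sum(d for f in names
                   for _, n, d in frameworks[f]["components"] if n == node)

    # (31) framework M with the largest network communication overhead
    m = max(frameworks, key=lambda f: frameworks[f]["overhead"])
    # (32) H: frameworks whose scale (component count) is below B
    b = len(frameworks[m]["components"])
    h = [f for f in frameworks if len(frameworks[f]["components"]) < b]
    # (33) N'_j = N_j + F_{M,j} + F_{H,j}; p has the largest N'_j
    n_prime = {j: available[j] + used_on([m], j) + used_on(h, j)
               for j in available}
    p = max(n_prime, key=n_prime.get)
    # (34) K: for each framework in H, only its components on node p
    k = {f: [c for c in frameworks[f]["components"] if c[1] == p] for f in h}
    k = {f: comps for f, comps in k.items() if comps}  # assumption: skip empty
    # (35) Q: M first, then K by scale, largest first
    queue = [(m, frameworks[m]["components"])]
    queue += sorted(k.items(), key=lambda item: len(item[1]), reverse=True)
    # (36) W_j: node p also regains what H used there; others only M's share
    w = {j: n_prime[j] if j == p else available[j] + used_on([m], j)
         for j in available}
    return queue, w
```

Each entry of the returned queue can then be passed, in order, to the scheduling scheme acquisition method together with the resource map `w`.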
In general, the above technical solutions conceived by the invention achieve the following beneficial effects:
(1) By scheduling the containerized core component of a computing framework to run on the same compute node in the cluster together with as many of that framework's containerized compute components as possible, the invention raises the communication rate between the containerized core component and the containerized compute components inside the framework and shortens the time these components need to synchronize each other's data. This reduces the overall time the containerized distributed computing framework takes to execute tasks and improves its performance.
(2) The scheduling scheme obtained by the invention does not depend on any specific container cluster orchestration system, so it is highly portable.
(3) The invention can schedule newly created containerized distributed computing frameworks, and can also reschedule computing frameworks already running in the cluster according to the network communication overhead between the containerized components inside each framework. This can minimize the total network communication overhead of the computing frameworks running in the cluster and makes scheduling more flexible.
(4) During rescheduling, after determining the rescheduling objects and obtaining new scheduling schemes, the invention ensures that the total network communication overhead of all computing frameworks in the cluster after rescheduling is smaller than before rescheduling, so scheduling is highly stable.
Brief description of the drawings
Fig. 1 is a schematic diagram of the scheduling result of an existing container cluster orchestration system;
Fig. 2 is a flowchart of the scheduling scheme acquisition method for a containerized distributed computing framework provided by Embodiment 1 of the invention;
Fig. 3 is a schematic diagram of the result of scheduling according to the scheduling scheme obtained by the scheduling scheme acquisition method provided by an embodiment of the invention;
Fig. 4 is a flowchart of the scheduling method for a containerized distributed computing framework provided by Embodiment 2 of the invention;
Fig. 5 is a flowchart of the scheduling method for a containerized distributed computing framework provided by Embodiment 3 of the invention;
Fig. 6 is a flowchart of the method, provided by an embodiment of the invention, for rescheduling computing frameworks running in the cluster.
Detailed description of the embodiments
To make the aims, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here only explain the invention and do not limit it. In addition, the technical features of the embodiments described below may be combined with one another as long as they do not conflict.
Fig. 1 illustrates the problem that communication overhead between components is large when the containerized core component and the containerized compute components of the same containerized distributed computing framework are scheduled to different compute nodes. To address this problem, in Embodiment 1 of the invention, the scheduling scheme acquisition method for a containerized distributed computing framework is as shown in Fig. 2 and comprises:
(1) obtaining all unscheduled containerized components in the containerized distributed computing framework to be scheduled, to obtain a set of components to be scheduled;
(2) according to the computing resources required by each containerized component in the set of components to be scheduled and the available computing resources of each compute node in the cluster, determining the compute node that will run each containerized component in the set, such that the containerized core component and as many containerized compute components as possible are scheduled to the same compute node, thereby obtaining the scheduling scheme of the containerized distributed computing framework to be scheduled;
In an optional implementation, step (2) specifically comprises:
(21) sorting the containerized compute components in the set of components to be scheduled in ascending order of required computing resources, to obtain an ordered component set;
(22) if the set of components to be scheduled includes the containerized core component, inserting the containerized core component as the first element of the ordered component set and going to step (23); otherwise, going directly to step (23);
(23) obtaining the total computing resources required by all containerized components in the ordered component set, $R = \sum_{i=1}^{n} F_i$,
where i is the index of a containerized component, $F_i$ is the computing resources required by the i-th containerized component in the ordered component set, $1 \le i \le n$, and n is the number of containerized components in the ordered component set;
(24) if the available computing resources of every compute node are less than the total computing resources R, going to step (25); otherwise, obtaining all compute nodes whose available computing resources are greater than or equal to R to form a candidate node set, and going to step (27);
(25) obtaining the compute node I with the most available computing resources, and determining the first m containerized components in the ordered component set that can be scheduled to compute node I, such that $\sum_{i=1}^{m} F_i \le N_{max}$, where $N_{max}$ is the available computing resources of compute node I;
Scheduling the containerized core component and the containerized compute components with smaller resource requirements first allows the containerized core component and as many containerized compute components as possible to be scheduled to the same compute node;
(26) determining compute node I as the compute node that runs these m containerized components, updating the available computing resources of compute node I to $N_{max} - \sum_{i=1}^{m} F_i$, removing the m containerized components from the ordered component set, and going to step (23);
(27) obtaining the compute node I' with the least available computing resources in the candidate node set, and determining it as the compute node that runs every containerized component in the ordered component set;
Selecting the compute node with the least available computing resources here raises the likelihood that, in later scheduling, containerized components belonging to the same computing framework are scheduled to the same node.
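A short worked trace may make the loop between steps (23) and (27) concrete. All numbers are hypothetical and do not come from the patent: two compute nodes with 5 and 4 units available, a core component needing 2 units, and three compute components needing 1, 2, and 3 units. The trace also assumes that step (25) picks the largest m whose prefix sum still fits.

```latex
\begin{align*}
&\text{Ordered set (steps 21 and 22): } F_1 = 2\ (\text{core}),\ F_2 = 1,\ F_3 = 2,\ F_4 = 3 \\
&\text{Step (23): } R = \sum_{i=1}^{4} F_i = 8 \\
&\text{Step (24): } 5 < 8 \text{ and } 4 < 8, \text{ so no candidate node exists} \\
&\text{Step (25): } N_{max} = 5,\quad F_1 + F_2 + F_3 = 5 \le 5 < F_1 + F_2 + F_3 + F_4 = 8 \;\Rightarrow\; m = 3 \\
&\text{Step (26): } N_{max} \leftarrow 5 - 5 = 0, \text{ and the ordered set shrinks to the 3-unit component} \\
&\text{Step (23): } R = 3 \\
&\text{Steps (24), (27): only the node with 4 units satisfies } 4 \ge 3, \text{ so it runs the last component}
\end{align*}
```

In this trace the core component ends up on the same node as two of its three compute components, and only the largest compute component is placed elsewhere.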
The invention schedules the containerized core component of a computing framework to run on the same compute node in the cluster together with as many of that framework's containerized compute components as possible. The result of scheduling a computing framework according to the scheduling scheme obtained by the method of this embodiment is shown in Fig. 3. This raises the communication rate between the containerized core component and the containerized compute components inside the framework and shortens the time these components need to synchronize each other's data, which reduces the overall time the containerized distributed computing framework takes to execute tasks and improves its performance.
Embodiment 2 of the invention further provides a scheduling method for a containerized distributed computing framework, as shown in Fig. 4, comprising:
for a containerized distributed computing framework Fr that a user has created in the cluster and that needs to be scheduled, obtaining its scheduling scheme S using the scheduling scheme acquisition method described above;
scheduling each containerized component of computing framework Fr to its corresponding compute node according to scheduling scheme S, to complete the scheduling of computing framework Fr.
This scheduling method ensures that the containerized core component of a user-created containerized distributed computing framework runs on the same compute node as as many of its containerized compute components as possible, which reduces communication overhead between components and improves the performance of the computing framework.
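The resource bookkeeping behind this dispatch step can be sketched as follows. This is only an illustration: `dispatch`, `demands`, and `available` are hypothetical names, and an actual deployment would place the containers through the orchestration system's own interface, which is not modeled here.

```python
def dispatch(scheme, demands, available):
    """Apply scheduling scheme S and return each node's remaining resources.

    scheme    -- {component name: node name}, the scheduling scheme S
    demands   -- {component name: resources the component needs}
    available -- {node name: free resources before dispatch}
    """
    free = dict(available)  # work on a copy so the caller's view is unchanged
    for component, node in scheme.items():
        if demands[component] > free[node]:
            # A valid scheme never overcommits a node; fail loudly if it does.
            raise ValueError(f"{component} does not fit on {node}")
        free[node] -= demands[component]
    return free
```

Keeping the remaining resources up to date matters because later calls to the scheduling scheme acquisition method rely on each node's current available computing resources.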
Embodiment 3 of the invention further provides a scheduling method for a containerized distributed computing framework, as shown in Fig. 5, comprising:
(1) determining whether the cluster currently contains a user-created containerized distributed computing framework Fr that needs to be scheduled; if so, going to step (6); if not, going to step (2);
(2) obtaining the difference $\Delta t$ between the cluster's current timestamp $t_p$ and the timestamp $t_l$ at which the cluster last performed rescheduling; if $\Delta t > T$, going to step (3); otherwise, going to step (1);
where T is a preset time interval threshold. The value of T can be set according to the actual cluster environment and application characteristics: if T is too large, tasks may finish before rescheduling has even started, and if T is too small, the computing overhead becomes excessive. In this embodiment, T is set to 10 seconds;
(3) taking one or more containerized distributed computing frameworks running in the cluster as rescheduling objects, and obtaining a new scheduling scheme for each rescheduling object using the scheduling scheme acquisition method described above;
In an optional embodiment, as shown in fig. 6, step (3) specifically includes:
(31) the maximum Computational frame M of cost on network communication in cluster is obtained;
(32) using the scale B of Computational frame M as threshold value, the scale in cluster that filters out is less than all calculation blocks of threshold value B Frame, to obtain Computational frame collection H;
(33) Computational frame in Computational frame M and Computational frame collection H is calculated into each readjustment degree as weight scheduler object When object is not scheduled to calculate node, the available resources N of each calculate node in clusterj'=Nj+FM,j+FH,j, thus obtain Node p with most available computational resources before weight scheduler object is scheduled;
Readjustment degree is carried out to the maximum Computational frame of cost on network communication, can be reduced in cluster and be owned with maximum probability The total cost on network communication of Computational frame;Computing resource needed for the lesser Computational frame of scale is often smaller, smaller to scale Computational frame carry out readjustment degree, being capable of efficent use of resources fragment;
(34) For each computing framework h ∈ H, take all of its containerized components running on node p as a new computing framework h′, thereby obtaining the computing framework set K consisting of all the new computing frameworks;
(35) Sort the computing frameworks in set K by scale in descending order to obtain the computing framework queue Q, and insert computing framework M at the head of queue Q;
The larger a containerized distributed computing framework is, the larger its communication overhead tends to be, so obtaining the scheduling schemes of larger computing frameworks first effectively improves the performance of the cluster as a whole;
(36) Update the available computing resources of compute node p to $W_j' = N_j + F_{M,j} + F_{H,j}$ and those of every other compute node to $W_j' = N_j + F_{M,j}$, then use the scheduling scheme acquisition method for containerized distributed computing frameworks described above to obtain, in order, the scheduling scheme of each computing framework in queue Q;
Here, the scale of a computing framework is the number of containerized components it contains; j is the index of a compute node; $N_j$ is the available computing resources of the j-th compute node in the cluster before rescheduling; $F_{M,j}$ is the total computing resources required by the containerized components of computing framework M that run on the j-th compute node; $F_{H,j}$ is the total computing resources required by the containerized components of the frameworks in set H that run on the j-th compute node;
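Steps (31) to (36) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `select_rescheduling_objects`, the dictionary layout of `placements`, `overhead` and `free`, and the choice to drop frameworks of H that have no component on node p are all assumptions made for the example.

```python
def select_rescheduling_objects(placements, overhead, free):
    """Pick rescheduling objects and the resources available to them.

    placements: framework name -> list of (node, demand) for each component
    overhead:   framework name -> current network communication overhead
    free:       node name -> available computing resources N_j
    All three structures are hypothetical inputs for this sketch.
    """
    def scale(f):
        # Scale of a framework = number of containerized components.
        return len(placements[f])

    def held(frameworks, node):
        # Resources that the given frameworks currently occupy on `node`.
        return sum(d for f in frameworks for n, d in placements[f] if n == node)

    # (31) framework M with the largest network communication overhead
    m_fw = max(overhead, key=overhead.get)
    # (32) set H: frameworks whose scale is below the scale B of M
    h_set = [f for f in placements if scale(f) < scale(m_fw)]
    # (33) N_j' = N_j + F_{M,j} + F_{H,j}; node p maximizes it
    released = {j: free[j] + held([m_fw], j) + held(h_set, j) for j in free}
    p = max(released, key=released.get)
    # (34) set K: for each h in H, its components on node p form h'
    k_set = {h: [c for c in placements[h] if c[0] == p] for h in h_set}
    k_set = {h: comps for h, comps in k_set.items() if comps}  # assumption: skip empty h'
    # (35) queue Q: K sorted by scale, largest first, with M at the head
    queue = [m_fw] + sorted(k_set, key=lambda h: len(k_set[h]), reverse=True)
    # (36) W_j': node p gets back M's and H's resources, other nodes only M's
    w = {j: free[j] + held([m_fw], j) + (held(h_set, j) if j == p else 0)
         for j in free}
    return m_fw, p, queue, w
```

The returned queue and resource map would then be passed, framework by framework, to the scheduling scheme acquisition method.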
(4) Compute the total network communication overhead of all computing frameworks in the cluster before and after scheduling according to the new scheduling schemes, V and V′ respectively. If V′ < V, reschedule all or some of the containerized components of each rescheduling object to the corresponding compute nodes according to the new scheduling schemes, and go to step (5) once rescheduling is complete; otherwise, do not reschedule, and go to step (1);
When the core component and the computing components of a computing framework are scheduled to, and run on, different nodes, network communication overhead arises between the containerized components. For the l-th containerized distributed computing framework, the network communication overhead between the containerized core component and the i-th containerized component is determined by the following quantities:
$k_l$ is a constant expressing that the network communication overhead between a computing component and the core component is positively correlated with the resources the computing component needs, and its value differs between types of computing framework; $G_i$ is the computing resources required by the i-th containerized component. The larger $G_i$ is, the more resources the containerized computing component needs, the more data it must exchange with the containerized core component, and the larger the network communication overhead between that component and the core component;
Before rescheduling, the total network communication overhead of all computing frameworks in the cluster is $V = \sum_{l=1}^{r} C_l$;
In this formula, l is the index of a containerized distributed computing framework running in the cluster, $C_l$ is the network communication overhead of the l-th containerized distributed computing framework, and r is the total number of containerized computing frameworks running in the cluster;
After rescheduling, the total network communication overhead V′ of all computing frameworks in the cluster is computed in the same way as V, so the calculation is not repeated here;
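The comparison of V and V′ in step (4) can be illustrated with a short sketch. The text states only that the per-component overhead is positively correlated with $G_i$ through the constant $k_l$; the linear form `k * g` used below, the zero cost for components co-located with the core component, and all function and field names are assumptions made for this example.

```python
def framework_overhead(k, core_node, components):
    """Overhead C_l of one framework.

    components: list of (node, G_i) for the computing components.
    Assumption: a component on the core component's node costs nothing,
    and a remote component costs k * G_i (linear form assumed).
    """
    return sum(k * g for node, g in components if node != core_node)


def total_overhead(frameworks):
    """Total overhead V: the sum of C_l over all running frameworks."""
    return sum(
        framework_overhead(f["k"], f["core_node"], f["components"])
        for f in frameworks
    )


def should_reschedule(before, after):
    """Step (4): reschedule only when V' < V."""
    return total_overhead(after) < total_overhead(before)
```

With one framework whose remote component needs 4 units and `k = 2`, V is 8; moving that component next to the core component gives V′ = 0, so the new scheme would be applied.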
When no newly created computing framework is waiting to be scheduled, rescheduling the containerized distributed computing frameworks already running in the cluster reduces the total network communication overhead of all computing frameworks in the cluster and further improves the performance of the containerized distributed computing frameworks;
(5) Update the value of timestamp $t_l$ to the value of timestamp $t_p$, and go to step (1);
(6) Use the scheduling scheme acquisition method for containerized distributed computing frameworks described above to obtain the scheduling scheme S of computing framework Fr, and schedule each containerized component of computing framework Fr to the corresponding compute node according to scheduling scheme S, thereby completing the scheduling of computing framework Fr;
Scheduling the newly created computing framework Fr according to scheduling scheme S ensures that the containerized core component of Fr and as many of its containerized computing components as possible are scheduled to the same compute node, which raises the communication rate between components and improves the performance of the containerized distributed computing framework;
(7) After scheduling is complete, go to step (1).
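The control flow of steps (1) to (7) can be summarized as one pass of a loop. The sketch below is illustrative only: `scheduling_step`, `schedule_new` and `try_reschedule` are hypothetical names, and the two callables stand in for step (6) and for steps (3) and (4) respectively.

```python
def scheduling_step(now, last_reschedule, pending, T, schedule_new, try_reschedule):
    """Run one pass through steps (1)-(7) and return the updated t_l.

    now:             current cluster timestamp t_p
    last_reschedule: timestamp t_l of the last rescheduling
    pending:         list of newly created frameworks waiting to be scheduled
    T:               preset time interval threshold
    schedule_new:    callable implementing step (6) for one framework
    try_reschedule:  callable implementing steps (3)-(4); returns True
                     when V' < V and the rescheduling was carried out
    """
    # (1) a newly created framework takes priority over rescheduling
    if pending:
        schedule_new(pending.pop(0))  # (6), then (7): back to step (1)
        return last_reschedule
    # (2) reschedule only when more than T has elapsed since t_l
    if now - last_reschedule > T:
        if try_reschedule():          # (3) and (4)
            return now                # (5) t_l takes the value of t_p
    return last_reschedule
```

A caller would invoke this function repeatedly, feeding the returned timestamp back in as `last_reschedule`.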
The present invention takes full account of the communication overhead between the containerized core component and the containerized computing components of the same containerized distributed computing framework. By scheduling the containerized core component and as many containerized computing components as possible to the same compute node, it effectively reduces the communication overhead between the containerized components inside a computing framework, which shortens the overall time the framework needs to execute its tasks and improves its performance.
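The co-location strategy of the scheduling scheme acquisition method, steps (21) to (27), can be sketched as below. This is a sketch under stated assumptions, not the definitive implementation: the `Component` class, the `acquire_scheme` name, the dictionary of node resources, and the error raised when no node can host the next component are all introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    demand: float          # computing resources the component needs (F_i)
    is_core: bool = False  # True for the containerized core component


def acquire_scheme(components, available):
    """Return {component name: node} following steps (21)-(27)."""
    free = dict(available)  # node -> available computing resources
    # (21) computing components sorted by demand, smallest first
    ordered = sorted((c for c in components if not c.is_core),
                     key=lambda c: c.demand)
    # (22) the core component, if present, becomes the first element
    ordered = [c for c in components if c.is_core] + ordered
    scheme = {}
    while ordered:
        total = sum(c.demand for c in ordered)                   # (23)
        candidates = [n for n, r in free.items() if r >= total]  # (24)
        if candidates:
            # (27) tightest node that still fits every remaining component
            node = min(candidates, key=lambda n: free[n])
            for c in ordered:
                scheme[c.name] = node
            free[node] -= total
            break
        # (25) node with the most resources takes the longest fitting prefix
        node = max(free, key=lambda n: free[n])
        used, m = 0.0, 0
        while m < len(ordered) and used + ordered[m].demand <= free[node]:
            used += ordered[m].demand
            m += 1
        if m == 0:
            # Assumption: the source does not cover this case.
            raise RuntimeError("no node can host the next component")
        # (26) assign the prefix, update the node, and repeat from (23)
        for c in ordered[:m]:
            scheme[c.name] = node
        free[node] -= used
        ordered = ordered[m:]
    return scheme
```

For a core component needing 2 units, computing components needing 1, 2 and 3 units, and nodes with 5 and 4 free units, the sketch places the core component and the two smaller computing components on the first node and the remaining component on the second.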
As those skilled in the art will readily appreciate, the foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (5)

1. A scheduling scheme acquisition method for a containerized distributed computing framework, characterized by comprising:
(1) obtaining all unscheduled containerized components in the containerized distributed computing framework to be scheduled, thereby obtaining a set of components to be scheduled;
(2) according to the computing resources required by each containerized component in the set of components to be scheduled and the available computing resources of each compute node in the cluster, determining the compute node for running each containerized component in the set of components to be scheduled, such that the containerized core component and as many containerized computing components as possible are scheduled to the same compute node, thereby obtaining the scheduling scheme of the containerized distributed computing framework to be scheduled.
2. The scheduling scheme acquisition method for a containerized distributed computing framework according to claim 1, characterized in that step (2) comprises:
(21) sorting the containerized computing components in the set of components to be scheduled in ascending order of the computing resources they require, to obtain an ordered component set;
(22) if the set of components to be scheduled includes a containerized core component, inserting the containerized core component as the first element of the ordered component set, and going to step (23); otherwise, going directly to step (23);
(23) obtaining the total computing resources R required by all containerized components in the ordered component set;
(24) if the available computing resources of every compute node are less than the total computing resources R, going to step (25); otherwise, obtaining all compute nodes whose available computing resources are greater than or equal to the total computing resources R to form a candidate node set, and going to step (27);
(25) obtaining the compute node I with the most available computing resources, and determining the first m containerized components in the ordered component set to be run on compute node I, such that $\sum_{i=1}^{m} F_i \le N_{max} < \sum_{i=1}^{m+1} F_i$, where $N_{max}$ is the available computing resources of compute node I;
(26) determining compute node I as the compute node for running the m containerized components, updating the available computing resources of compute node I to $N_{max} - \sum_{i=1}^{m} F_i$, removing the m containerized components from the ordered component set, and then going to step (23);
(27) obtaining the compute node I′ with the least available computing resources in the candidate node set, and determining it as the compute node for running each containerized component in the ordered component set;
wherein i is the index of a containerized component, and $F_i$ is the computing resources required by the i-th containerized component in the ordered component set.
3. A scheduling method for a containerized distributed computing framework, characterized by comprising:
for a containerized distributed computing framework Fr that has been created by a user in the cluster and needs to be scheduled, obtaining its scheduling scheme S using the scheduling scheme acquisition method for a containerized distributed computing framework according to any one of claims 1-2;
scheduling each containerized component of the computing framework Fr to the corresponding compute node according to the scheduling scheme S, thereby completing the scheduling of the computing framework Fr.
4. A scheduling method for a containerized distributed computing framework, characterized by comprising:
(1) determining whether the current cluster contains a user-created containerized distributed computing framework Fr that needs to be scheduled; if so, going to step (6); if not, going to step (2);
(2) obtaining the difference $\Delta t$ between the current timestamp $t_p$ of the cluster and the timestamp $t_l$ at which the cluster last executed the rescheduling process; if $\Delta t > T$, going to step (3); otherwise, going to step (1);
(3) taking one or more containerized distributed computing frameworks currently running in the cluster as rescheduling objects, and using the scheduling scheme acquisition method for a containerized distributed computing framework according to any one of claims 1-2 to obtain a new scheduling scheme for each rescheduling object;
(4) computing the total network communication overhead of all computing frameworks in the cluster before and after scheduling according to the new scheduling schemes, V and V′ respectively; if V′ < V, rescheduling all or some of the containerized components of each rescheduling object to the corresponding compute nodes according to the new scheduling schemes, and going to step (5) once rescheduling is complete; otherwise, not rescheduling, and going to step (1);
(5) updating the value of the timestamp $t_l$ to the value of the timestamp $t_p$, and going to step (1);
(6) using the scheduling scheme acquisition method for a containerized distributed computing framework according to any one of claims 1-2 to obtain the scheduling scheme S of the computing framework Fr, and scheduling each containerized component of the computing framework Fr to the corresponding compute node according to the scheduling scheme S, thereby completing the scheduling of the computing framework Fr;
(7) after scheduling is complete, going to step (1);
wherein T is a preset time interval threshold.
5. The scheduling method for a containerized distributed computing framework according to claim 4, characterized in that step (3) comprises:
(31) obtaining the computing framework M with the largest network communication overhead in the cluster;
(32) taking the scale B of the computing framework M as a threshold, selecting all computing frameworks in the cluster whose scale is smaller than the threshold B, thereby obtaining a computing framework set H;
(33) taking the computing framework M and the computing frameworks in the computing framework set H as rescheduling objects, and computing the available resources each compute node in the cluster would have if none of the rescheduling objects had been scheduled to it, $N_j' = N_j + F_{M,j} + F_{H,j}$, thereby obtaining the node p that had the most available computing resources before the rescheduling objects were scheduled;
(34) for each computing framework h ∈ H, taking all of its containerized components running on the node p as a new computing framework h′, thereby obtaining a computing framework set K consisting of all the new computing frameworks;
(35) sorting the computing frameworks in the computing framework set K by scale in descending order to obtain a computing framework queue Q, and inserting the computing framework M at the head of the computing framework queue Q;
(36) updating the available computing resources of the compute node p to $W_j' = N_j + F_{M,j} + F_{H,j}$ and those of every other compute node to $W_j' = N_j + F_{M,j}$, and using the scheduling scheme acquisition method for a containerized distributed computing framework according to any one of claims 1-2 to obtain, in order, the scheduling scheme of each computing framework in the computing framework queue Q;
wherein the scale of a computing framework is the number of containerized components it contains; j is the index of a compute node; $N_j$ is the available computing resources of the j-th compute node in the cluster before rescheduling; $F_{M,j}$ is the total computing resources required by the containerized components of the computing framework M that run on the j-th compute node; and $F_{H,j}$ is the total computing resources required by the containerized components of the frameworks in the computing framework set H that run on the j-th compute node.
CN201910137847.7A 2019-02-25 2019-02-25 Scheduling scheme obtaining method and scheduling method of containerized distributed computing framework Active CN109976873B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910137847.7A CN109976873B (en) 2019-02-25 2019-02-25 Scheduling scheme obtaining method and scheduling method of containerized distributed computing framework

Publications (2)

Publication Number Publication Date
CN109976873A true CN109976873A (en) 2019-07-05
CN109976873B CN109976873B (en) 2020-12-18

Family

ID=67077367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910137847.7A Active CN109976873B (en) 2019-02-25 2019-02-25 Scheduling scheme obtaining method and scheduling method of containerized distributed computing framework

Country Status (1)

Country Link
CN (1) CN109976873B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant