
CN111708627B - Task scheduling method and device based on distributed scheduling framework - Google Patents


Info

Publication number
CN111708627B
CN111708627B (granted from application CN202010575887.2A)
Authority
CN
China
Prior art keywords
task
tasks
slicing
execution
node
Prior art date
Legal status
Active
Application number
CN202010575887.2A
Other languages
Chinese (zh)
Other versions
CN111708627A (en)
Inventor
吴永晖 (Wu Yonghui)
Current Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202010575887.2A
Publication of CN111708627A
Application granted
Publication of CN111708627B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/48Indexing scheme relating to G06F9/48
    • G06F2209/484Precedence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5021Priority
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

The application relates to the technical field of big data, and in particular to a task scheduling method and device based on a distributed scheduling framework. The method comprises the following steps: receiving a task scheduling request, wherein the request carries an identifier of the task to be scheduled; obtaining the task size of the task to be scheduled corresponding to that identifier; slicing the task to be scheduled according to its size to obtain fragmented tasks, and obtaining the priority of each fragmented task according to preset logic; calculating the task load rate of each execution node in the distributed scheduling framework; and distributing the fragmented tasks to the execution nodes according to the task load rates and priorities, so as to instruct the execution nodes to schedule their assigned fragmented tasks. Adopting this method improves task scheduling efficiency. In addition, the invention also relates to blockchain technology: the working state of each execution node is stored on a blockchain.

Description

Task scheduling method and device based on distributed scheduling framework
Technical Field
The application relates to the technical field of big data, in particular to a task scheduling method and device based on a distributed scheduling framework.
Background
In the field of asynchronous database scheduling, an asynchronous scheduling framework supports scheduling driven by a time expression: a job fires whenever the expression matches, even when there is no task to run, which wastes the system's computing resources.
In the traditional approach, task scheduling can be performed in a distributed environment, but the same task must be restricted to run on a single node during scheduling, which seriously wastes the computing capacity of the distributed environment.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a task scheduling method and apparatus based on a distributed scheduling framework, which can improve task scheduling efficiency.
A task scheduling method based on a distributed scheduling framework comprises the following steps:
receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled;
acquiring the task size of a task to be scheduled corresponding to the task identification to be scheduled;
performing slicing processing on a task to be scheduled according to the size of the task to obtain slicing tasks, and acquiring the priority of each slicing task according to preset logic;
calculating the task load rate of each execution node in the distributed scheduling framework;
and distributing the fragmented tasks to the execution nodes according to the task load rates and the priorities, so as to instruct the execution nodes to schedule their assigned fragmented tasks.
In one embodiment, calculating a task load rate for each execution node in a distributed scheduling framework includes:
acquiring the task state of each fragmented task on each execution node in the distributed scheduling framework, wherein the task states comprise a completed state and an incomplete state;
acquiring a first number of fragmented tasks in the completed state and a second number of fragmented tasks in the incomplete state;
calculating a ratio of the first number to the second number;
and obtaining the task load rate of each execution node according to the ratio.
In one embodiment, assigning the fragmented tasks to the execution nodes according to the task load rates and priorities includes:
and sequentially distributing the slicing tasks to the execution nodes with the task load rates from low to high according to the order of the priorities from high to low.
In one embodiment, after calculating the task load rate of each execution node in the distributed scheduling framework, the method further includes:
obtaining the calculation performance index of each execution node according to the task load rate;
when the computing performance index cannot meet the requirement of processing all the fragmented tasks, obtaining the number of fragmented tasks that the computing performance index can handle, distributing that number of fragmented tasks to the execution nodes, and storing the remaining fragmented tasks in a message queue; whenever a fragmented task on an execution node reaches the completed state, extracting a fragmented task from the message queue and distributing it to that execution node, until all fragmented tasks have been distributed to the execution nodes.
In one embodiment, after obtaining the calculation performance index of the execution node according to the task load rate, the method further includes:
when the calculation performance index cannot meet the requirement of processing all the slicing tasks, a preset number of execution nodes are newly added according to the calculation performance index;
distributing each slicing task to each execution node according to each task load rate, including:
and distributing each slicing task to each execution node and the newly added execution node according to each task load rate.
In one embodiment, the allocation of the slicing tasks to the execution nodes according to the task load rates includes:
acquiring the working state of each execution node, wherein the working state comprises a normal state and a fault state;
And distributing the fragmented tasks to the execution nodes in a normal state according to the task load rates.
In one embodiment, the method further comprises:
constructing a proportional relationship according to the task load rate of each execution machine;
and slicing the task to be scheduled according to the proportional relationship to obtain fragmented tasks, which are distributed to the execution machines for task scheduling.
A task scheduling device based on a distributed scheduling framework, the device comprising:
the request receiving module is used for receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled;
the task size acquisition module is used for acquiring the task size of the task to be scheduled corresponding to the task identifier to be scheduled;
the slicing task module is used for carrying out slicing processing on the task to be scheduled according to the size of the task to obtain slicing tasks, and obtaining the priority of each slicing task according to preset logic;
the load rate calculation module is used for calculating the task load rate of each execution node in the distributed scheduling framework;
and the distribution module is used for distributing the segmented tasks to the execution nodes according to the task load rate and the priority so as to instruct the execution nodes to schedule the distributed segmented tasks.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the above method when the processor executes the computer program.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.
With the task scheduling method and device based on the distributed scheduling framework, the master node receives a task scheduling request carrying an identifier of the task to be scheduled, obtains the size of the corresponding task, and slices the task according to its size to obtain fragmented tasks, so that a single scheduled task is decomposed by the slicing process. The master node then calculates the task load rate of each execution node in the distributed scheduling framework and the priority of each fragmented task, and distributes the decomposed fragmented tasks to the execution nodes according to those load rates and priorities, instructing each execution node to schedule its assigned fragmented tasks according to preset rules. Because the task to be scheduled is decomposed and distributed to multiple execution nodes that execute it simultaneously, and the fragments are executed according to their priorities, task scheduling efficiency is improved.
Drawings
FIG. 1 is an application scenario diagram of a task scheduling method based on a distributed scheduling framework in one embodiment;
FIG. 2 is a flow diagram of a task scheduling method based on a distributed scheduling framework in one embodiment;
FIG. 3 is a flow diagram of a method for calculating a task load rate for each execution node in a distributed scheduling framework, according to one embodiment;
FIG. 4 is a block diagram of a task scheduler based on a distributed scheduling framework in one embodiment;
fig. 5 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The task scheduling method based on the distributed scheduling framework can be applied to the application environment shown in fig. 1, in which the master node 102 communicates with the execution nodes 103 via a network. The master node 102 receives a task scheduling request carrying an identifier of the task to be scheduled, obtains the task size of the task to be scheduled corresponding to that identifier, slices the task according to its size to obtain fragmented tasks, and obtains the priority of each fragmented task according to preset logic. The master node 102 then calculates the task load rate of each execution node in the distributed scheduling framework and distributes the fragmented tasks to the execution nodes 103 according to the task load rates and priorities, so as to instruct each execution node 103 to perform task scheduling on its assigned fragmented tasks.
In one embodiment, as shown in fig. 2, a task scheduling method based on a distributed scheduling framework is provided, and the method is applied to the master node 102 in fig. 1 for illustration, and the method includes the following steps:
step 210, a task scheduling request is received, where the task scheduling request carries a task identifier to be scheduled.
Task scheduling refers to performing a specific operation at a specific time using computing resources, for example an enterprise application that computes the points ranking of forum users in the early morning every day.
In particular, the scheduling framework may be a distributed execution framework: a distributed application runs on multiple systems of a network at a given time, coordinating them so as to accomplish a particular task quickly and efficiently. The group of systems on which the distributed application runs is called a cluster, and each machine in the cluster is called a node. Nodes are divided into a master node and execution nodes (worker nodes), and may further include a monitoring node (slave node). More specifically, the distributed scheduling system is based on the ZooKeeper framework, and the ZooKeeper it depends on must run as a cluster of at least three nodes, preventing a single point of failure in the ZooKeeper cluster from crashing the whole scheduling system. The ZooKeeper cluster is responsible for electing the master node within the distributed scheduling cluster; once a node is elected master, the remaining nodes serve as worker nodes, i.e., execution nodes. The execution nodes monitor the health of the master node, and when the master node fails, a new master node is elected according to a preset rule.
Specifically, the master node receives a task scheduling request sent by the user terminal, and the task scheduling request may include information such as a task to be scheduled, execution time of the task to be scheduled, and task size of the task to be scheduled, so that the master node executes scheduling of the task according to the task scheduling request.
Step 220, obtaining the task size of the task to be scheduled corresponding to the task identifier to be scheduled.
Specifically, the master node obtains a task attribute of the task to be scheduled, where the task attribute may include information such as a task size and a task priority of the task to be scheduled.
And 230, performing slicing processing on the task to be scheduled according to the size of the task to obtain slicing tasks, and acquiring the priority of each slicing task according to preset logic.
Specifically, the master node extracts the task size from the task attributes and slices the task to be scheduled according to that size, obtaining several fragmented tasks. Slicing divides the task to be scheduled into multiple subtasks: when the task is large, it can be split into several subtasks that are then distributed to different execution nodes for parallel processing, improving the server's processing efficiency for the task to be scheduled.
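As an illustration of this slicing step (a sketch under an assumed unit of work, not the patent's concrete implementation), a task of a given size can be split into fixed-size subtask ranges:

```python
def slice_task(task_size, shard_size):
    """Split a task of task_size work units into shards of at most shard_size.

    Returns (start, length) ranges covering the whole task. The unit of
    "task size" (rows, records, bytes) is an assumption for illustration.
    """
    shards = []
    start = 0
    while start < task_size:
        length = min(shard_size, task_size - start)
        shards.append((start, length))
        start += length
    return shards
```

For example, a 10-unit task with a shard size of 4 yields the ranges (0, 4), (4, 4) and (8, 2), which can then be handed to different execution nodes.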
Step 240, calculating the task load rate of each execution node in the distributed scheduling framework.
The master node obtains status information for each execution node, which may include the node's task load rate. Specifically, a high task load rate indicates that the execution node has little spare capacity for executing further tasks, while a low task load rate indicates that its task execution capability is largely free.
And step 250, distributing the segmented tasks to the execution nodes according to the task load rates and the priorities so as to instruct the execution nodes to perform task scheduling on the distributed segmented tasks.
The master node distributes the fragmented tasks to the execution nodes according to the priority of each fragmented task, so as to instruct the execution nodes to schedule their assigned fragmented tasks according to a preset rule. For example, the master node preferentially distributes a fragmented task to the execution node with the lowest current task load rate, balancing the load rates of the execution nodes and distributing computer resources evenly.
Specifically, the master node is responsible for slicing the tasks that need distributed scheduling. It decides the number of slices according to the task size, publishes the fragmented-task information and priority information as temporary ZooKeeper nodes once slicing is complete, and decides how the fragments are distributed across the execution nodes according to the number of tasks already running on each.
In this embodiment, when the framework is integrated, asynchronous scheduling is not confined to a single node of the distributed environment: a task is fragmented according to its size, and the resulting fragmented tasks are distributed evenly across all execution nodes for execution. All execution nodes can therefore participate in executing a task at the same time, computer resources are used reasonably, and task execution efficiency is greatly improved. Furthermore, because distribution is performed by a master node, the system is smarter than a distributed scheduling system without one: no task contention is needed, so the fragmented tasks are executed more evenly. Tasks are divided into different fragments according to their size so that multiple execution nodes can process them in parallel, improving processing efficiency, and the execution nodes are notified according to the fragments' priorities so that high-priority tasks are executed first.
In one embodiment, the task scheduling framework further comprises a monitoring node that monitors the task state of each execution node; when a newly added fragmented task is detected on an execution node, the priority of that fragmented task is obtained, and the execution node is instructed to schedule the newly added fragmented task according to that priority.
Further, the execution node schedules the newly added fragmented task according to the priority indication as follows: when the newly added fragmented task has the highest priority, the execution node is instructed to execute it promptly, marking its execution state as in progress, and marking it as completed once execution finishes, so that the execution node does not execute the newly added fragmented task repeatedly.
In this embodiment, task scheduling according to task priority is supported, so that tasks can be processed in order of urgency and importance, improving task execution capability.
In one embodiment, as shown in fig. 3, a method for calculating a task load rate of each execution node in a distributed scheduling framework is provided, which includes:
in step 310, task states corresponding to the fragmented tasks in the executing node are obtained, where the task states include a completed state and an incomplete state.
Specifically, the execution node executes the fragmented tasks allocated by the master node, while the monitoring node monitors their execution state in real time and reports it to the master node, so the master node always knows the state of the tasks on every execution node. Note that the incomplete state covers both tasks currently executing and tasks that have not yet started executing.
Step 320 obtains a first number of fragmented tasks in a completed state and a second number of fragmented tasks in an incomplete state.
For example, the master node records the number of tasks distributed to each execution node: the first number is the count of fragmented tasks on each node in the completed state, and the second number is the count of fragmented tasks in the incomplete state.
In step 330, a ratio of the first number to the second number is calculated.
And step 340, obtaining the task load rate of each execution node according to the ratio.
The load rate of each execution node is then obtained from the ratio of the first number to the second number.
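The ratio-based computation in steps 310 to 340 can be sketched as follows. The exact mapping from the completed/incomplete counts to a load rate is not specified in the text, so this sketch assumes load rate = incomplete / total:

```python
def task_load_rate(task_states):
    """Derive an execution node's task load rate from its shard states.

    task_states: list of "completed" / "incomplete" markers for the shards
    on one node. The formula here, load rate = incomplete / (completed +
    incomplete), is an illustrative assumption: a node that has finished
    most of its shards reports a low load.
    """
    completed = sum(1 for state in task_states if state == "completed")
    incomplete = len(task_states) - completed
    total = completed + incomplete
    if total == 0:
        return 0.0  # an idle node carries no load
    return incomplete / total
```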
In one embodiment, assigning the fragmented tasks to the execution nodes according to the task load rates and priorities includes: and sequentially distributing the slicing tasks to the execution nodes with the task load rates from low to high according to the order of the priorities from high to low.
In this embodiment, the fragmented tasks with higher priorities are preferentially allocated to the execution nodes with lower task load rates, so that not only is the computing capability of the execution nodes fully utilized, but also the tasks with higher priorities are guaranteed to be preferentially executed, and the effective execution of the tasks is guaranteed.
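A minimal sketch of this assignment rule, highest priority first to the currently least-loaded node. The node names and the fixed load increment added per assigned fragment are illustrative assumptions:

```python
def assign_shards(shards, nodes):
    """Assign shards to nodes: highest priority first, least-loaded node first.

    shards: list of (shard_id, priority) pairs; nodes: list of
    (node_name, load_rate) pairs. Assigning a shard is assumed to raise
    the chosen node's load by a fixed 0.25; that increment, like the
    names, is purely illustrative.
    """
    assignment = {}
    loads = dict(nodes)  # node name -> current task load rate
    for shard_id, _priority in sorted(shards, key=lambda s: -s[1]):
        target = min(loads, key=loads.get)  # least-loaded node right now
        assignment[shard_id] = target
        loads[target] += 0.25
    return assignment
```

With nodes A (load 0.2) and B (load 0.6), the two highest-priority shards land on A before its rising load pushes the lowest-priority shard onto B.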
In one embodiment, after calculating the task load rate of each execution node in the distributed scheduling framework, the method further includes: obtaining the calculation performance index of each execution node according to the task load rate; when the computing performance index cannot meet the requirement of processing all the sliced tasks, the processing quantity of the sliced tasks corresponding to the computing performance index is obtained, the sliced tasks corresponding to the processing quantity are distributed to each execution node, the rest sliced tasks are stored in a message queue, and when the task state of the sliced task in the execution node corresponds to the completed state, the sliced task is extracted from the message queue and distributed to the execution node until all the sliced tasks are distributed to the execution node.
The computing performance index represents each execution node's capacity for processing fragmented tasks: the higher the index, the stronger the node's execution capability, and the index is inversely proportional to the task load rate. Further, when the load rates of the execution nodes are all high, the master node can throttle task distribution, for example by first storing tasks in its message queue and distributing them slightly later. Specifically, when the master node judges that the execution nodes' task load rates cannot accommodate all the fragmented tasks, it stores the fragments that cannot yet be processed in the message queue and monitors task execution on each node in real time; whenever a fragmented task reaches the completed state, it extracts an appropriate number of fragments from the queue and distributes them to that execution node, continuing until every fragment in the queue has been executed. Note that the task states on the execution nodes are monitored by the monitoring node, which reports them to the master node so that the master node can better distribute tasks to the execution nodes according to the received state information.
Further, when the master node determines that the calculation performance index of the execution node cannot meet the requirement of processing all the sliced tasks, the sliced tasks with higher priority can be preferentially distributed to the execution node for execution, and the sliced tasks with lower priority are stored in the message queue.
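The message-queue fallback described above can be sketched as follows; `capacity` stands in for the shard count derived from the computing performance index, which the text does not quantify:

```python
from collections import deque

def dispatch_with_backpressure(shards, capacity):
    """Dispatch up to `capacity` shards now; park the rest in a queue.

    `capacity` is an illustrative simplification of the number of shards
    the nodes' computing performance index allows. Returns the shards
    dispatched immediately and the queue of deferred shards.
    """
    dispatched = list(shards[:capacity])
    queue = deque(shards[capacity:])
    return dispatched, queue

def on_shard_completed(queue):
    """When a node reports a shard completed, release the next queued shard."""
    return queue.popleft() if queue else None
```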
In one embodiment, the method further comprises: constructing a proportional relation according to the task load rate of each executive machine; and performing slicing processing on the task to be scheduled according to the proportion relation to obtain slicing tasks, and distributing the slicing tasks to the execution machine for task scheduling.
Further, the master node may construct a proportional relationship from the task load rate of each execution machine and then slice the task to be scheduled according to that relationship, so that the size of each fragmented task corresponds to the proportion. The fragmented tasks are then distributed to the corresponding execution machines for task scheduling: larger fragments go to execution machines with lower task load rates, and smaller fragments to execution machines with higher task load rates.
In this embodiment, the master node may further perform slicing processing on the task to be scheduled according to the task load rate of each execution node to obtain a plurality of sliced tasks that conform to the computational performance indexes of each execution machine, so that the sliced tasks allocated to each execution machine exactly conform to the computational performance indexes of each execution machine, thereby implementing reasonable allocation of the sliced tasks and improving the processing capability of the task to be scheduled.
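A sketch of this proportional slicing, assuming spare capacity is 1 minus the load rate (the text states only that shard sizes should correspond to the proportional relationship):

```python
def proportional_shard_sizes(task_size, load_rates):
    """Split task_size across nodes in proportion to their spare capacity.

    Assumes spare capacity = 1 - load_rate, an illustrative reading of
    the proportional relationship. Integer rounding drift is assigned to
    the first node so the shard sizes always sum to task_size.
    """
    spare = [1.0 - rate for rate in load_rates]
    total = sum(spare)
    sizes = [int(task_size * s / total) for s in spare]
    sizes[0] += task_size - sum(sizes)  # absorb rounding drift
    return sizes
```

For two nodes with load rates 0.5 and 0.75, a 100-unit task splits roughly 2:1, so the lighter node receives the larger fragment.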
In this embodiment, data interaction is performed among the master node, the monitoring node and the executing node, so that task scheduling is completed together, and the monitoring node timely sends the task execution state to the master node by monitoring the task execution state of each executing node so as to help the master node to reasonably allocate the slicing tasks according to the task load rate.
In one embodiment, after the computing performance index of an execution node is obtained according to the task load rate, the method further includes: when the computing performance indexes cannot meet the requirement of processing all the slicing tasks, adding a preset number of execution nodes according to the computing performance indexes. Distributing the slicing tasks to the execution nodes according to the task load rates then includes: distributing the slicing tasks to the existing execution nodes and the newly added execution nodes according to the task load rates.
Specifically, after the master node distributes the slicing tasks, the method further includes calculating the task load rate corresponding to each execution node. When a task load rate exceeds the capacity for processing the task to be scheduled, the number of execution nodes can be increased until, after the slicing tasks are redistributed to the execution nodes, the task load rate of every execution node falls within a preset range, so that the distributed tasks can be executed.
In this embodiment, the scheduling framework is executed in a distributed manner: all nodes participate in the computation, and the framework is easy to scale horizontally, so that when computing capacity is insufficient it can be raised by increasing the number of execution nodes. Ensuring that every execution node can execute its slicing tasks normally and efficiently improves task execution efficiency.
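A minimal sketch of the horizontal-scaling decision follows. The notion of a fixed per-node shard capacity is an assumption for illustration; the patent only says nodes are added until load rates fall within a preset range:

```python
import math

def nodes_to_add(shard_count, capacity_per_node, current_nodes):
    """Return how many execution nodes must be added so that every
    slicing task can be assigned without exceeding per-node capacity."""
    needed = math.ceil(shard_count / capacity_per_node)
    return max(0, needed - current_nodes)
```

With 10 slicing tasks, a capacity of 3 shards per node, and 2 existing nodes, two more nodes would be added; if the existing nodes already suffice, the count is zero.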
In one embodiment, the allocation of the slicing tasks to the execution nodes according to the task load rates includes: acquiring the working state of each execution node, wherein the working state comprises a normal state and a fault state; distributing each slicing task to each executing node in a normal state according to each task load rate; the working states of all the execution nodes are stored in a blockchain.
To ensure that the slicing tasks are executed correctly, the master node monitors the working state of each execution node, where the working state represents the node's health. When an execution node is in the normal state, it is healthy and can execute the task to be scheduled according to the preset task allocation rule. When an execution node is in the fault state, it is unhealthy, and the slicing tasks on it need to be redistributed according to a preset rule. Specifically, when the health state of an execution node is faulty, the slicing tasks allocated to that node are extracted and redistributed to execution nodes in a healthy state.
It should be emphasized that, to further ensure the security of the state of the execution machine, the working states of the execution nodes may also be stored in a node of a blockchain.
In this embodiment, monitoring the health status of the execution nodes in real time ensures that each execution node can execute its tasks normally; in particular, a failed server can be discovered in time so that it does not affect the normal execution of tasks. Specifically, when an execution node fails (for example, its temporary node disappears), the master node redistributes its unfinished slicing tasks to the execution nodes that have not failed, so that the slicing tasks are not left unprocessed because of the node failure.
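The failover behaviour described here can be sketched as follows. The data shapes (dicts keyed by node name) are assumptions for illustration, not the patent's data model:

```python
def reassign_failed(assignments, states, load):
    """Move the slicing tasks of every faulty execution node onto the
    healthy execution node with the lowest task load rate."""
    healthy = [n for n, s in states.items() if s == "normal"]
    for node, shards in list(assignments.items()):
        if states[node] == "fault":
            # Pick the least-loaded healthy node as the new owner.
            target = min(healthy, key=lambda n: load[n])
            assignments.setdefault(target, []).extend(shards)
            assignments[node] = []
    return assignments
```

Here node "a" has failed, so its two slicing tasks move to "c", the healthy node with the lower load rate, while "b" keeps its own work.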
In one embodiment, the execution nodes listen for the health state of the master node; when the master node's health state is faulty, a new master node is elected according to a preset rule. In this embodiment there is no single point of failure: the scheduling system is based on ZooKeeper, so a failure of the master node or of an execution node is detected immediately. When the master node fails, a new master is elected; when an execution node fails, the master node redistributes its slicing tasks to the execution nodes that have not failed.
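The re-election rule is not spelled out here. In ZooKeeper-based systems it is commonly implemented with ephemeral sequential znodes, where the node holding the lowest live sequence number is the leader; when its session dies, the next-lowest takes over. A toy in-memory model of that convention (not actual ZooKeeper client code):

```python
class Election:
    """Toy model of ZooKeeper-style leader election: each member holds an
    ephemeral sequence number, and the lowest live sequence is the leader."""

    def __init__(self):
        self.seq = 0
        self.members = {}  # member name -> sequence number

    def join(self, name):
        self.members[name] = self.seq
        self.seq += 1

    def fail(self, name):
        # An ephemeral znode disappears when its session dies.
        self.members.pop(name, None)

    def leader(self):
        return min(self.members, key=self.members.get) if self.members else None
```

Joining in the order a, b, c makes "a" the leader; when "a" fails, leadership passes automatically to "b" without any central coordinator.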
In summary, in the field of asynchronous scheduling based on traditional databases, the best-known framework at present is Quartz. Quartz supports asynchronous scheduling and allows a cron time expression to be set for scheduling, which is very convenient and popular in enterprise-level development. However, Quartz has the following disadvantages: it is only suitable for a single-node computing environment, and even in a distributed environment the same task must be restricted to run on a single node, otherwise the task is executed repeatedly; such a computing mode severely wastes the computing power of the distributed environment. Because it is a single-node scheduling framework, a failure of that node inevitably brings down the whole environment. Its scheduling mode is limited to timed or periodic scheduling. It does not support scheduling by task priority, a fixed interval exists between scheduling runs, and system computing power is wasted. Scheduling is driven by a simple time expression, so even when there is no task to run, the scheduler still fires whenever the cron expression matches, wasting system computing resources.
In the task scheduling method based on a distributed scheduling framework provided by this application, the scheduling framework is executed in a distributed manner: all nodes participate in the computation, which makes execution efficient, and the framework is easy to scale horizontally, so computing capacity can be raised by increasing the number of execution nodes when it is insufficient. The scheduling system is based on ZooKeeper, which immediately detects failures of the master node and the execution nodes; when the master node fails a new master is elected, and when an execution node fails the master node redistributes its slicing tasks to the execution nodes that have not failed. The distribution of the slicing tasks is performed entirely by the master node, which is smarter than a distributed scheduling system without a master node: there is no contention for tasks, the slicing tasks can be executed more evenly, and the execution nodes can be notified according to the priority of the slicing tasks, so that high-priority tasks are executed first.
It should be understood that, although the steps in the flowcharts of fig. 2-3 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution order of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 2-3 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and whose execution order is not necessarily sequential; they may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a task scheduling device based on a distributed scheduling framework, including:
the request receiving module 410 is configured to receive a task scheduling request, where the task scheduling request carries a task scheduling identifier to be scheduled.
The task size obtaining module 420 is configured to obtain a task size of a task to be scheduled corresponding to the task identifier to be scheduled.
The slicing task module 430 is configured to perform slicing processing on a task to be scheduled according to a task size to obtain slicing tasks, and obtain priorities of the slicing tasks according to preset logic.
The load factor calculation module 440 is configured to calculate a task load factor of each execution node in the distributed scheduling framework.
And the allocation module 450 is configured to allocate each sliced task to each execution node according to the task load rate and the priority, so as to instruct each execution node to perform task scheduling on the allocated sliced task.
In one embodiment, the load factor calculation module 440 includes:
the task state acquisition unit is used for acquiring task states corresponding to the slicing tasks in the execution node, wherein the task states comprise completed states and unfinished states.
And the quantity acquisition unit is used for acquiring the first quantity of the slicing tasks in the completed state and the second quantity of the slicing tasks in the unfinished state.
And the ratio calculating unit is used for calculating the ratio of the first quantity to the second quantity.
And the load rate calculation unit is used for obtaining the task load rate of each execution node according to the ratio.
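The module chain above leaves the exact mapping from the completed/unfinished ratio to the task load rate open. One plausible reading, taking the fraction of unfinished slicing tasks as the load rate (an assumption for illustration), is:

```python
def task_load_rate(task_states):
    """Derive a task load rate from the counts of completed and unfinished
    slicing tasks: more unfinished work relative to completed work means a
    higher load."""
    done = sum(1 for s in task_states if s == "completed")
    pending = sum(1 for s in task_states if s == "unfinished")
    if done + pending == 0:
        return 0.0  # an idle node carries no load
    return pending / (done + pending)
```

A node with two completed and one unfinished slicing task would report a load rate of one third, while a node whose every shard is unfinished reports 1.0.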
In one embodiment, the allocation module 450 includes:
the first allocation unit is used for sequentially allocating the slicing tasks to the execution nodes with the task load rates from low to high according to the order of the priorities from high to low.
In one embodiment, the task scheduling device based on the distributed scheduling framework further includes:
and the index calculation module is used for obtaining the calculation performance index of each execution node according to the task load rate.
The task extraction module is used for, when the computing performance indexes cannot meet the requirement of processing all the slicing tasks, obtaining the processing quantity of slicing tasks corresponding to the computing performance indexes, distributing that quantity of slicing tasks to the execution nodes, and storing the remaining slicing tasks in a message queue; when the task state of a slicing task in an execution node becomes the completed state, a slicing task is extracted from the message queue and distributed to the execution node, until all the slicing tasks have been distributed to the execution nodes.
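The queue-and-release behaviour of the task extraction module can be modelled as follows; completion events are simulated in FIFO order purely for illustration:

```python
from collections import deque

def dispatch_with_queue(shards, capacity):
    """Dispatch up to `capacity` slicing tasks immediately; hold the rest
    in a message queue and release one queued task each time a running
    task completes. Returns the completion order."""
    queue = deque(shards)
    running, order = [], []
    # Fill the executors up to their combined processing quantity.
    while queue and len(running) < capacity:
        running.append(queue.popleft())
    while running:
        done = running.pop(0)          # simulate a completion event
        order.append(done)
        if queue:
            running.append(queue.popleft())  # backfill from the queue
    return order
```

With five slicing tasks and a capacity of two, only two tasks are ever in flight, yet all five eventually complete in submission order.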
In one embodiment, the task scheduling device based on the distributed scheduling framework further includes:
and the node newly-adding module is used for adding a preset number of execution nodes according to the calculation performance index when the calculation performance index cannot meet the requirement of processing all the slicing tasks.
A slicing task module, comprising:
and the second allocation unit is used for allocating the slicing tasks to the execution nodes and the newly added execution nodes according to the task load rates.
In one embodiment, the allocation module comprises:
the working state acquisition unit is used for acquiring the working state of each execution node, wherein the working state comprises a normal state and a fault state.
The third distribution unit is used for distributing each slicing task to each execution node in a normal state according to each task load rate; the working states of all the execution nodes are stored in a blockchain.
For specific limitations on the task scheduling device based on the distributed scheduling framework, reference may be made to the above limitations on the task scheduling method based on the distributed scheduling framework, which are not repeated here. The modules in the task scheduling device described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in the computer device in hardware form, or stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 5. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing the relevant data of the tasks to be scheduled. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a task scheduling method based on a distributed scheduling framework.
It will be appreciated by those skilled in the art that the structure shown in fig. 5 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided comprising a memory storing a computer program and a processor that when executing the computer program performs the steps of: receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled; acquiring the task size of a task to be scheduled corresponding to the task identification to be scheduled; performing slicing processing on a task to be scheduled according to the size of the task to obtain slicing tasks, and acquiring the priority of each slicing task according to preset logic; calculating the task load rate of each execution node in the distributed scheduling framework; and distributing the partitioned tasks to the execution nodes according to the task load rates and the priorities so as to instruct the execution nodes to schedule the distributed partitioned tasks.
In one embodiment, the processor, when executing the computer program, is further configured to implement the step of calculating a task load rate for each execution node in the distributed scheduling framework: acquiring task states corresponding to each slicing task in an execution node, wherein the task states comprise completed states and unfinished states; acquiring a first number of slicing tasks in a completed state and a second number of slicing tasks in an unfinished state; calculating a ratio of the first number to the second number; and obtaining the task load rate of each execution node according to the ratio.
In one embodiment, the processor, when executing the computer program, further implements the step of distributing the slicing tasks to the execution nodes according to the task load rates and the priorities as: sequentially distributing the slicing tasks, in descending order of priority, to the execution nodes in ascending order of task load rate.
In one embodiment, the processor, when executing the computer program, further performs the steps of: obtaining the calculation performance index of each execution node according to the task load rate; when the computing performance index cannot meet the requirement of processing all the sliced tasks, the processing quantity of the sliced tasks corresponding to the computing performance index is obtained, the sliced tasks corresponding to the processing quantity are distributed to each execution node, the rest sliced tasks are stored in a message queue, and when the task state of the sliced task in the execution node corresponds to the completed state, the sliced task is extracted from the message queue and distributed to the execution node until all the sliced tasks are distributed to the execution node.
In one embodiment, the processor, when executing the computer program, further implements the following step after the computing performance index of an execution node is obtained according to the task load rate: when the computing performance indexes cannot meet the requirement of processing all the slicing tasks, adding a preset number of execution nodes according to the computing performance indexes. The processor, when executing the computer program, further implements the step of distributing the slicing tasks to the execution nodes according to the task load rates as: distributing the slicing tasks to the existing execution nodes and the newly added execution nodes according to the task load rates.
In one embodiment, the processor, when executing the computer program, further implements the step of distributing the slicing tasks to the execution nodes according to the task load rates as: acquiring the working state of each execution node, wherein the working states include a normal state and a fault state; and distributing the slicing tasks to the execution nodes in the normal state according to the task load rates; wherein the working states of the execution nodes are stored in a blockchain.
In one embodiment, the processor, when executing the computer program, is further configured to: constructing a proportional relation according to the task load rate of each executive machine; and performing slicing processing on the task to be scheduled according to the proportion relation to obtain slicing tasks, and distributing the slicing tasks to the execution machine for task scheduling.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of: receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled; acquiring the task size of a task to be scheduled corresponding to the task identification to be scheduled; performing slicing processing on a task to be scheduled according to the size of the task to obtain slicing tasks, and acquiring the priority of each slicing task according to preset logic; calculating the task load rate of each execution node in the distributed scheduling framework; and distributing the partitioned tasks to the execution nodes according to the task load rates and the priorities so as to instruct the execution nodes to schedule the distributed partitioned tasks.
In one embodiment, the computer program, when executed by the processor, further implements the step of calculating the task load rate of each execution node in the distributed scheduling framework as: acquiring the task state corresponding to each slicing task in the execution node, wherein the task states include a completed state and an unfinished state; acquiring a first number of slicing tasks in the completed state and a second number of slicing tasks in the unfinished state; calculating the ratio of the first number to the second number; and obtaining the task load rate of each execution node according to the ratio.
In one embodiment, the computer program, when executed by the processor, further implements the step of distributing the slicing tasks to the execution nodes according to the task load rates and the priorities as: sequentially distributing the slicing tasks, in descending order of priority, to the execution nodes in ascending order of task load rate.
In one embodiment, the computer program, when executed by the processor, further implements the following steps: obtaining the computing performance index of each execution node according to the task load rate; when the computing performance indexes cannot meet the requirement of processing all the slicing tasks, obtaining the processing quantity of slicing tasks corresponding to the computing performance indexes, distributing that quantity of slicing tasks to the execution nodes, and storing the remaining slicing tasks in a message queue; and when the task state of a slicing task in an execution node becomes the completed state, extracting a slicing task from the message queue and distributing it to the execution node, until all the slicing tasks have been distributed to the execution nodes.
In one embodiment, the computer program, when executed by the processor, further implements the following step after the computing performance index of an execution node is obtained according to the task load rate: when the computing performance indexes cannot meet the requirement of processing all the slicing tasks, adding a preset number of execution nodes according to the computing performance indexes. The computer program, when executed by the processor, further implements the step of distributing the slicing tasks to the execution nodes according to the task load rates as: distributing the slicing tasks to the existing execution nodes and the newly added execution nodes according to the task load rates.
In one embodiment, the computer program, when executed by the processor, further implements the step of distributing the slicing tasks to the execution nodes according to the task load rates as: acquiring the working state of each execution node, wherein the working states include a normal state and a fault state; and distributing the slicing tasks to the execution nodes in the normal state according to the task load rates; wherein the working states of the execution nodes are stored in a blockchain.
In one embodiment, the computer program when executed by the processor is further configured to: constructing a proportional relation according to the task load rate of each executive machine; and performing slicing processing on the task to be scheduled according to the proportion relation to obtain slicing tasks, and distributing the slicing tasks to the execution machine for task scheduling.
The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Those skilled in the art will appreciate that implementing all or part of the above-described methods in accordance with the embodiments may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples represent only a few embodiments of the present application, which are described in detail but are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art may make various modifications and improvements without departing from the concept of the present application, and such modifications and improvements fall within the protection scope of the present application. Accordingly, the protection scope of the present application shall be determined by the appended claims.

Claims (10)

1. A task scheduling method based on a distributed scheduling framework, the method comprising:
receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled;
acquiring the task size of the task to be scheduled corresponding to the task identification to be scheduled;
determining the number of fragments according to the task size, performing fragment processing on the task to be scheduled according to the number of fragments to obtain fragment tasks, and obtaining the priority of each fragment task according to preset logic;
Calculating the task load rate of each execution node in the distributed scheduling framework; distributing each of the slicing tasks to each of the execution nodes according to the task load rate and the priority, including: sequentially distributing the slicing tasks to the execution nodes with the task load rates from low to high according to the order of the priority from high to low so as to instruct each execution node to schedule the distributed slicing tasks;
the method for performing slicing processing on the task to be scheduled according to the number of the slices to obtain the sliced task comprises the following steps:
and constructing a proportional relation according to the task load rate of each execution machine, and performing slicing processing on the tasks to be scheduled according to the constructed proportional relation to obtain slicing tasks, wherein the task size of each slicing task corresponds to the proportional relation.
2. The method of claim 1, wherein calculating the task load rate for each execution node in the distributed scheduling framework comprises:
acquiring task states corresponding to the fragmented tasks in the execution nodes in a distributed scheduling framework, wherein the task states comprise completed states and unfinished states;
Acquiring a first number of the slicing tasks in the completed state and a second number of the slicing tasks in the unfinished state;
calculating a ratio of the first number to the second number;
and obtaining the task load rate of each execution node according to the ratio.
3. The method of claim 1, wherein after calculating the task load rates of the execution nodes in the distributed scheduling framework, further comprising:
obtaining the calculation performance index of each execution node according to the task load rate;
and when the task state of the slicing task is the completed state, extracting the slicing task from the message queue and distributing the slicing task to the execution node until all the slicing tasks are distributed to the execution node.
4. The method according to claim 1, wherein after obtaining the calculation performance index of the execution node according to the task load rate, the method further comprises:
when the calculation performance index cannot meet the requirement of processing all the slicing tasks, a preset number of execution nodes are newly added according to the calculation performance index;
the distributing each of the slicing tasks to each of the executing nodes according to each of the task load rates includes:
And distributing the slicing tasks to the execution nodes and the newly added execution nodes according to the task load rates.
5. The method of claim 1, wherein said assigning each of said fragmented tasks to each of said executing nodes according to each of said task load rates comprises:
acquiring the working state of each execution node, wherein the working state comprises a normal state and a fault state;
distributing the slicing tasks to the execution nodes in the normal state according to the task load rates; wherein the operating state of the executing node is stored in the blockchain.
6. The method according to any one of claims 1 to 5, further comprising:
constructing a proportional relation according to the task load rate of each executive machine;
and performing slicing processing on the task to be scheduled according to the proportion relation to obtain slicing tasks, and distributing the slicing tasks to the execution machine for task scheduling.
7. A task scheduling device based on a distributed scheduling framework, the device comprising:
the request receiving module is used for receiving a task scheduling request, wherein the task scheduling request carries a task identifier to be scheduled;
Performing slicing processing on the tasks to be scheduled to obtain slicing tasks, and acquiring the priority of each slicing task according to preset logic;
the load rate calculation module is used for calculating the task load rate of each execution node in the distributed scheduling framework;
the allocation module is configured to allocate each of the sliced tasks to each of the execution nodes according to the task load rate and the priority, and includes: sequentially distributing the slicing tasks to the execution nodes with the task load rates from low to high according to the order of the priority from high to low so as to instruct each execution node to schedule the distributed slicing tasks;
the slicing task module is used for constructing a proportion relation according to the task load rate of each execution machine, carrying out slicing processing on the tasks to be scheduled according to the constructed proportion relation to obtain slicing tasks, and the task size of each slicing task corresponds to the proportion relation.
8. The apparatus of claim 7, wherein the load factor calculation module comprises:
the task state acquisition unit is used for acquiring task states corresponding to the slicing tasks in the execution nodes in the distributed scheduling framework, wherein the task states comprise completed states and unfinished states;
A number acquisition unit configured to acquire a first number of the fragmented tasks in the completed state and a second number of the fragmented tasks in the unfinished state;
a ratio calculating unit for calculating a ratio of the first number to the second number;
and the load rate calculation unit is used for obtaining the task load rate of each execution node according to the ratio.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 6.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
CN202010575887.2A 2020-06-22 2020-06-22 Task scheduling method and device based on distributed scheduling framework Active CN111708627B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010575887.2A CN111708627B (en) 2020-06-22 2020-06-22 Task scheduling method and device based on distributed scheduling framework

Publications (2)

Publication Number Publication Date
CN111708627A CN111708627A (en) 2020-09-25
CN111708627B true CN111708627B (en) 2023-06-20

Family

ID=72541930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010575887.2A Active CN111708627B (en) 2020-06-22 2020-06-22 Task scheduling method and device based on distributed scheduling framework

Country Status (1)

Country Link
CN (1) CN111708627B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559565B (en) * 2020-12-01 2024-07-23 北京三快在线科技有限公司 Abnormality detection method, system and device
CN112463390A (en) * 2020-12-11 2021-03-09 厦门市美亚柏科信息股份有限公司 Distributed task scheduling method and device, terminal equipment and storage medium
CN112559179A (en) * 2020-12-15 2021-03-26 建信金融科技有限责任公司 Job processing method and device
CN112631805A (en) * 2020-12-28 2021-04-09 深圳壹账通智能科技有限公司 Data processing method and device, terminal equipment and storage medium
CN112559159A (en) * 2021-01-05 2021-03-26 广州华资软件技术有限公司 Task scheduling method based on distributed deployment
CN113342508B (en) * 2021-07-07 2024-08-23 湖南快乐阳光互动娱乐传媒有限公司 Task scheduling method and device
CN114356511B (en) * 2021-08-16 2023-06-27 中电长城网际系统应用有限公司 Task allocation method and task allocation system
CN114143326B (en) * 2021-12-08 2024-07-26 深圳前海微众银行股份有限公司 Load adjustment method, management node, and storage medium
CN116471627A (en) * 2022-01-11 2023-07-21 中兴通讯股份有限公司 Stream data processing method, system, node, electronic device and storage medium
CN114579278A (en) * 2022-03-08 2022-06-03 阿维塔科技(重庆)有限公司 Distributed scheduling method, device and system and computer readable storage medium
CN117348999B (en) * 2023-12-06 2024-02-23 之江实验室 Service execution system and service execution method

Citations (5)

Publication number Priority date Publication date Assignee Title
CN107291545A (en) * 2017-08-07 2017-10-24 星环信息科技(上海)有限公司 The method for scheduling task and equipment of multi-user in computing cluster
CN108304255A (en) * 2017-12-29 2018-07-20 北京城市网邻信息技术有限公司 Distributed task dispatching method and device, electronic equipment and readable storage medium storing program for executing
CN110458468A (en) * 2019-08-16 2019-11-15 北京百度网讯科技有限公司 Task processing method, device, electronic device and storage medium
CN110609749A (en) * 2019-09-06 2019-12-24 阿里巴巴集团控股有限公司 Distributed task operation method, system and equipment
CN110968420A (en) * 2018-09-30 2020-04-07 北京国双科技有限公司 Scheduling method and device for multi-crawler platform, storage medium and processor

Similar Documents

Publication Publication Date Title
CN111708627B (en) Task scheduling method and device based on distributed scheduling framework
CN108845884B (en) Physical resource allocation method, device, computer equipment and storage medium
CN110297711B (en) Batch data processing method, device, computer equipment and storage medium
CN108632365B (en) Service resource adjusting method, related device and equipment
CN110597858A (en) Task data processing method and device, computer equipment and storage medium
CN103761146B Method for dynamically setting slot quantity in MapReduce
CN109189572B (en) Resource estimation method and system, electronic equipment and storage medium
CN110955516B (en) Batch task processing method and device, computer equipment and storage medium
CN109271447A (en) Method of data synchronization, device, computer equipment and storage medium
CN110677459A (en) Resource adjusting method and device, computer equipment and computer storage medium
CN109614227A (en) Task resource concocting method, device, electronic equipment and computer-readable medium
CN112579304A (en) Resource scheduling method, device, equipment and medium based on distributed platform
CN106528065B Thread acquisition method and device
CN110765162A (en) Data comparison method and device, computer equipment and storage medium
CN111932257A (en) Block chain parallelization processing method and device
CN111459641A (en) Cross-machine-room task scheduling and task processing method and device
CN112162839A (en) Task scheduling method and device, computer equipment and storage medium
CN111045811A (en) Task allocation method and device, electronic equipment and storage medium
CN112698952A (en) Unified management method and device for computing resources, computer equipment and storage medium
CN111124673A (en) Data acquisition system and method
CN104281636A (en) Concurrent distributed processing method for mass report data
CN103383654B Method and device for adjusting mapper execution on multi-core processors
CN111258741A (en) Warehouse task execution method, distributed server cluster and computer equipment
CN113342526A (en) Dynamic management and control method, system, terminal and medium for cloud computing mobile network resources
CN112860763B (en) Real-time streaming data processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant