CN111176840B

CN111176840B - Distribution optimization method and device for distributed tasks, storage medium and electronic device

Info

Publication number: CN111176840B
Application number: CN201911330857.9A
Authority: CN
Inventors: 尹德帅; 徐志方; 刘超; 沈顺厚; 王守峰
Original assignee: Qingdao Haier Technology Co Ltd
Current assignee: Qingdao Haier Technology Co Ltd
Priority date: 2019-12-20
Filing date: 2019-12-20
Publication date: 2023-11-28
Anticipated expiration: 2039-12-20
Also published as: CN111176840A

Abstract

The invention provides a distributed task allocation optimization method and device, wherein the method comprises the following steps: determining a second node set in a first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system, wherein each node in the second node set is a node allowing tasks to be distributed in the distributed system to be executed, and the number of the nodes included in the second node set is smaller than or equal to that of the nodes included in the first node set; determining a correlation degree parameter of the current task to be allocated and each node in the second node set, wherein the correlation degree parameter is used for indicating the correlation degree of the current task to be allocated and the task received by each node in the second node set; and distributing the current task to be distributed to the target node with the highest association degree in the second node set for execution.

Description

Distribution optimization method and device for distributed tasks, storage medium and electronic device

Technical Field

The present invention relates to the field of computers, and in particular, to a distributed task allocation optimization method and apparatus, a storage medium, and an electronic device.

Background

In the related art, when a scheduling system of a distributed system performs node scheduling, resource load conditions of all nodes are calculated mainly through a load balancing algorithm, then nodes with low load utilization rate are added into a scheduling queue in a priority mode, and tasks are randomly distributed to the nodes added into the queue.

In the node allocation mode in the related art, the problem of different efficiency of executing tasks still exists for the nodes added into the scheduling queue due to different load utilization rates, and the problem of good or bad task execution effect can exist simply by adopting a random allocation mode, so that the optimization of allocation optimization of distributed tasks can not be realized.

Aiming at the problems that in the related art, when a dispatching system of a distributed system carries out node dispatching, the distributed task allocation optimization is unreasonable due to the difference of the load utilization rates of nodes, and the like, an effective technical scheme is not proposed yet.

Disclosure of Invention

The embodiment of the invention provides a distributed task allocation optimization method and device, which at least solve the problems that in the related art, when a dispatching system of a distributed system dispatches nodes, the distributed task allocation optimization is unreasonable due to different load utilization rates of the nodes.

According to an embodiment of the present invention, there is provided a distributed task allocation optimization method, including: determining a second node set in a first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system, wherein each node in the second node set is a node which is allowed to execute tasks to be distributed in the distributed system, and the number of the nodes included in the second node set is smaller than or equal to the number of the nodes included in the first node set; determining a degree of association parameter of a current task to be allocated and each node in the second node set; the association degree parameter is used for indicating the association degree of the task to be allocated currently and the task received by each node in the second node set; and optimizing the distribution of the distributed task to be distributed currently to the target node with the highest association degree in the second node set for execution.

In an embodiment of the present invention, the state information of each node in the distributed system includes: the number of incomplete tasks, the average load parameter in the first preset time period, and the estimated completion time of the incomplete tasks; the determining a second node set in the first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system comprises the following steps: acquiring state information of each node in the distributed system; determining a performance evaluation value of each node in the distributed system according to the state information; and determining the second node set in the first node set in the distributed system according to the performance evaluation value.

In an embodiment of the present invention, the determining, according to the performance evaluation value, the second node set from the first node set in the distributed system includes: comparing the performance evaluation value of each node in the distributed system with a preset threshold value; and determining a set formed by nodes with performance evaluation values smaller than the preset threshold value as the second node set.

In an embodiment of the present invention, the determining a performance evaluation value of each node in the distributed system includes: normalizing the number of the unfinished tasks to obtain the number of the unfinished tasks after normalization; determining a first product of a first weight and the number of incomplete tasks after normalization, determining a second product of a second weight and the average load parameter, and determining a third product of a third weight and the estimated completion time; wherein the first weight is greater than the second weight, the second weight is greater than the third weight; and determining the sum of the first product, the second product and the third product as the performance evaluation value.

In the embodiment of the present invention, the determining the association degree parameter between the task to be allocated and each node in the second node set includes: obtaining the maximum task number of each node in the second node set, which is allocated with tasks in a second preset time, determining the maximum task number as a first task number, obtaining the average value of the task numbers of each node in the second node set, which is allocated with tasks in the second preset time, determining the average value as a second task number, obtaining the task number included in a task set to be allocated in the distributed system, and determining the task number included in the task set to be allocated as a third task number; wherein the task set to be allocated comprises the current task to be allocated; determining a ratio of the third number of tasks to the second number of tasks; and determining a covariance value between the ratio and the first task number, and determining the covariance value as the association degree parameter.

In the embodiment of the present invention, the task to be allocated currently is allocated to the target node with the highest association degree in the second node set for execution, including: arranging each node in the second node set according to a first preset sequence according to the magnitude of the association degree parameter to obtain a target node set after the first preset sequence is arranged; determining a node corresponding to the maximum value of the association degree parameter in the target node set as the target node; and distributing the current task to be distributed to the target node for execution.

In the embodiment of the invention, each task to be allocated in the task set to be allocated in the distributed system is arranged in a task pool in the distributed system according to a second preset sequence, the current task to be allocated is any one task to be allocated in the task set to be allocated, and the task pool is used for storing the task set to be allocated in the distributed system.

According to another embodiment of the present invention, there is also provided an allocation optimization device of distributed tasks, including: a first determining module, configured to determine, according to status information of each node in the distributed system, a second node set from a first node set formed by all nodes in the distributed system, where each node in the second node set is a node that allows execution of a task to be allocated in the distributed system, and the number of nodes included in the second node set is less than or equal to the number of nodes included in the first node set; the second determining module is used for determining a correlation degree parameter of a current task to be distributed and each node in the second node set, wherein the correlation degree parameter is used for indicating the correlation degree of the current task to be distributed and the task received by each node in the second node set; and the distribution module is used for distributing the current task to be distributed to the target node with the highest association degree in the second node set for execution.

According to another embodiment of the present invention, there is also provided a computer-readable storage medium including a stored program, wherein the program, when run, performs the allocation optimization method of the distributed task of any one of the above.

According to another embodiment of the present invention, there is also provided an electronic device, the storage medium including a stored program, wherein the program, when run, performs the allocation optimization method of the distributed task according to any one of the above.

According to the state information of each node in the distributed system, a second node set is determined in a first node set formed by all nodes of the distributed system, wherein each node in the second node set is a node allowed to execute tasks to be distributed in the distributed system, and the number of the nodes included in the second node set is smaller than or equal to the number of the nodes included in the first node set; determining a degree of association parameter of a current task to be allocated and each node in the second node set; the association degree parameter is used for indicating the association degree of the task to be allocated currently and the task received by each node in the second node set; and optimizing the distribution of the distributed task to be distributed currently to the target node with the highest association degree in the second node set for execution. By adopting the technical scheme, the problems that in the related art, when a dispatching system of a distributed system dispatches nodes, the distributed task allocation optimization is unreasonable due to different load utilization rates of the nodes are still solved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:

FIG. 1 is a block diagram of a hardware architecture of a distributed task allocation optimization method according to an embodiment of the present application;

FIG. 2 is a flow chart of an alternative distributed task allocation optimization method according to an embodiment of the present application;

FIG. 3 is a flow chart of another alternative distributed task allocation optimization method according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an alternative distributed system according to an embodiment of the present application;

FIG. 5 is a block diagram of an alternative distributed task allocation optimization method according to an embodiment of the present application;

fig. 6 is a block diagram of another alternative first determination module according to an embodiment of the present application.

Detailed Description

The application will be described in detail hereinafter with reference to the drawings in conjunction with embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.

The method embodiment provided by the embodiment of the application can be applied to a distributed system for execution. FIG. 1 is a block diagram of the hardware architecture of a distributed task allocation optimization device according to an embodiment of the present application. As shown in fig. 1, the distributed task allocation optimization device 10 may include one or more processors 102 (only one is shown in fig. 1), which processors 102 may include, but are not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA, and a memory 104 for storing data, and optionally the smart door lock may further include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the distributed task allocation optimization device described above. For example, the distributed task allocation optimization device 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than the equivalent function shown in FIG. 1 or more than the function shown in FIG. 1.

The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to a distributed task allocation optimization method in an embodiment of the present application, and the processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, implement the above-mentioned method. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the mobile terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is configured to communicate with the internet wirelessly.

In this embodiment, a method for optimizing the allocation of distributed tasks running on the intelligent door lock is provided, and fig. 2 is a flowchart of an alternative method for optimizing the allocation of distributed tasks according to an embodiment of the present invention, as shown in fig. 2, where the method includes the following steps:

step S202, determining a second node set in a first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system, wherein each node in the second node set is a node allowed to execute tasks to be distributed in the distributed system, and the number of nodes included in the second node set is smaller than or equal to the number of nodes included in the first node set;

Step S204, determining a correlation degree parameter of a current task to be allocated and each node in the second node set, wherein the correlation degree parameter is used for indicating the correlation degree of the current task to be allocated and the task received by each node in the second node set;

and step S206, the task to be distributed currently is distributed to the target node with the highest association degree in the second node set for execution.

In the step S202, the state information of each node in the distributed system includes: the number of incomplete tasks, the average load parameter in the first preset time period, and the estimated completion time of the incomplete tasks; the determining a second node set in the first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system comprises the following steps: acquiring state information of each node in the distributed system; determining a performance evaluation value of each node in the distributed system according to the state information; and determining the second node set in the first node set in the distributed system according to the performance evaluation value.

Optionally, the determining the second node set from the first node set in the distributed system according to the performance evaluation value includes: comparing the performance evaluation value of each node in the distributed system with a preset threshold value; and determining a set formed by nodes with performance evaluation values smaller than the preset threshold value as the second node set.

Various implementations of the determining the performance evaluation value may be determined, and in an alternative embodiment, the determining the performance evaluation value of each node in the distributed system includes: normalizing the number of the unfinished tasks to obtain the number of the unfinished tasks after normalization; determining a first product of a first weight and the number of incomplete tasks after normalization, determining a second product of a second weight and the average load parameter, and determining a third product of a third weight and the estimated completion time; wherein the first weight is greater than the second weight, the second weight is greater than the third weight; and determining the sum of the first product, the second product and the third product as the performance evaluation value.

The implementation manner of the step S204 is various, and in an optional embodiment, the determining the association degree parameter of the task to be allocated and each node in the second node set includes: obtaining the maximum task number of each node in the second node set, which is allocated with tasks in a second preset time, determining the maximum task number as a first task number, obtaining the average value of the task numbers of each node in the second node set, which is allocated with tasks in the second preset time, determining the average value as a second task number, obtaining the task number included in a task set to be allocated in the distributed system, and determining the task number included in the task set to be allocated as a third task number; wherein the task set to be allocated comprises the current task to be allocated; determining a ratio of the third number of tasks to the second number of tasks; and determining a covariance value between the ratio and the first task number, and determining the covariance value as the association degree parameter.

Optionally, each task to be allocated in the task set to be allocated in the distributed system is arranged in a task pool in the distributed system according to a second preset sequence, the current task to be allocated is any one task to be allocated in the task set to be allocated, and the task pool is used for storing the task set to be allocated in the distributed system.

The flow of the task allocation method is described below in conjunction with an alternative example, as shown in fig. 3, which includes the steps of:

step S302, acquiring real-time task execution data of each node in the system;

when new tasks in the system task pool are distributed, the task execution conditions of all nodes are analyzed in consideration of the load balancing condition of the nodes to be distributed, the task execution conditions of all nodes are comprehensively considered, and the tasks are dynamically distributed according to the actual conditions of all the nodes. The phenomenon of overload of the nodes caused by too many tasks which are not finished at present in one node is avoided.

As shown in fig. 4, an average load parameter L of each node in a first node set formed by all nodes in the distributed system may be obtained, where the average load parameter L is an average number of processes in a first preset time period, so that the distributed system can preferentially assign tasks to nodes with low average load; the time parameter index of each node in the distributed system can be obtained, for example, the estimated completion time sum of the incomplete task of each node is obtained, the estimated completion time sum of the incomplete task can accurately reflect the capability of the node to execute a new task, if the time for processing the incomplete task of the node is long, the new task is not distributed to the node, and the parameter index can be determined based on the comprehensive calculation of the incomplete task data, the CPU usage data and the content usage data of the node; the number M of outstanding tasks per node in all nodes in the distributed system may also be obtained.

Step S304, calculating the comprehensive performance index of the node based on the number M of the node incomplete tasks, the average load parameter L of the node and the estimated completion time sum T of the node incomplete tasks;

in an alternative embodiment, the overall performance index of the node (corresponding to the performance assessment values described above) is represented by Q, and the threshold for schedulable tasks is determined at the same time(corresponding to the above-mentioned preset threshold value) only when the Q value is smaller than the defined threshold value +.>As shown in FIG. 4, Q is smaller than the defined threshold +.>The corresponding nodes are added into a scheduling queue (corresponding to the second node set), the node comprehensive performance index exceeds a threshold value, the average load of the identified nodes is higher, the number of unfinished tasks is more, and the estimated completion time sum of the unfinished tasks of the nodes is larger.

According to the method and the device, the calculation parameters of the estimated completion time sum parameters of the incomplete tasks of the nodes are increased, and the actual task execution capacity of the nodes is comprehensively considered. The nodes in the distributed system all have different CPU frequencies, memory sizes and the like, the processing capacities of the different nodes for the same task are different, the actual task execution capacity of the nodes is comprehensively considered, the task dynamic allocation can be carried out according to the actual conditions of the nodes, the phenomenon of overload node load caused by too many tasks which are not completed at present when one node appears is avoided, and the rationality of the task allocation in the distributed system is improved.

Alternatively, the following is a performance evaluation value calculation process, specifically described below:

（1）

wherein,the result obtained after normalization processing is carried out on the number M of incomplete tasks;

m is the number of incomplete tasks of the node;

t is the estimated completion time sum of the incomplete tasks;

l is an average load parameter of each node in the distributed system in a first preset time period, and only the nodes with low average load rate are allocated with tasks under limited consideration;

weight values set according to the importance of the index, respectively, which should satisfy +.>. Wherein (1)>Corresponding to the first weight, +.>Corresponding to the second weight, +.>Corresponding to the third weight described above.

Step S306, based on the comprehensive performance index of all the nodes calculated in step S304, the task scheduling method is realized by the set threshold value of the schedulable taskComparing, determining nodes capable of distributing tasks, and adding the screened nodes into a scheduling queue (corresponding to the second node set);

in an alternative embodiment, the task threshold may be scheduledIs settable according to the computing power of the computing devices of the nodes in the system, when the schedulable task threshold is +.>When the definition is too high, some idle nodes cannot be scheduled; when the schedulable task threshold +. >When the definition is too low, the nodes can receive too many tasks, and load imbalance among the nodes is caused. The comprehensive determination is needed according to the ratio of the number of nodes and the number of tasks of the system.

Step S308, performing secondary optimization on the nodes added into the scheduling queue, calculating a correlation degree index value between the nodes in the queue and the tasks to be distributed, grouping according to the tasks to be distributed, determining a node set corresponding to each task to be distributed, and determining an optimal execution node (corresponding to the target node) of the task to be distributed based on the determined node set;

by executing step S306, the node that can perform task allocation is added to the scheduling queue, but the node included in the scheduling queue is not necessarily the optimal node to perform the task to be allocated.

In an alternative embodiment, the process of determining the optimal execution node specifically includes the following steps:

step 1, determining a historical load parameter X (corresponding to the first task number) of each node in a scheduling queue;

the historical load parameter may be a maximum value of task amount allocated by the node in a historical period of time;

X=max（Xi）（2）

step 2, calculating a mean value ratio Y of the tasks to be distributed and the historical distribution task quantity of the node;

E(X)=Calculating a historical allocation task quantity average value (corresponding to the second task quantity) of the node;

y=x0/E (X), calculating a ratio of the number of tasks to be allocated to the average of the historic allocation task amounts of the nodes (corresponding to a ratio of the third number of tasks to the second number of tasks);

step 3, calculating a covariance value between the historical load parameter X and the ratio Y (the covariance value between the ratio and the first task number);

covariance is defined as:

（3）

the equivalent calculation formula is:。

the cooperative variance value between each task to be allocated and the nodes in the scheduling queue can be obtained through the formula (3), and the correlation degree between the historical allocation task quantity of the nodes and the task quantity of the tasks to be allocated can be reflected by the variance value.

Step 4, grouping the calculated covariance values by taking the tasks to be distributed as units; obtaining a set of covariance values, such as Vc= { C1, C2, C3, … … }, wherein C1, C2, C3 are the covariance values between each node in the scheduling queue and the task to be allocated;

then, in each combination, respectively selecting nodes corresponding to S values with the former covariance values to form a node combination Vj; such as vj= { N1, N2, N3, … …, ns }.

Assuming that S is 10 and the number of tasks to be allocated is 10, the node combination formed by each task to be allocated is 1x10 matrix size; the 10 tasks to be allocated may be formed as a matrix of 10 x 10.

5. Selecting the node with the smallest sorting j from the combination Vj as the optimal execution node of the task to be distributed;

in this way, each task to be allocated uniquely identifies an optimal node to which the corresponding task to be allocated is allocated.

Step S310, a scheduling module distributes tasks to the optimal nodes determined in the step S308, and deletes the distributed tasks from the task pool;

and managing all tasks to be allocated in the task pool, and automatically deleting the task by the task pool when the scheduling module allocates a certain task to be allocated from the task pool to the determined optimal node for execution, so that the task is prevented from being repeatedly allocated.

According to the method and the device, the estimated completion time sum of the incomplete tasks of the nodes can be calculated by introducing the time parameter index, the capability of the nodes to execute the new tasks can be accurately reflected, if the time for the nodes to process the incomplete tasks is long, the new tasks are not distributed to the nodes, the association degree parameters of the nodes and the tasks to be distributed can be determined through covariance value calculation, the association degree of the nodes and the tasks to be distributed is improved, and the rationality of task distribution in the distributed system is improved.

From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.

The embodiment also provides a distributed task allocation optimization device, which is applied to a distributed system, and the device is used for realizing the above embodiment and the preferred implementation manner, and is not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.

FIG. 5 is a block diagram of an alternative distributed task allocation optimization device, according to an embodiment of the present invention, as shown in FIG. 5, comprising:

a first determining module 502, configured to determine, according to status information of each node in the distributed system, a second node set from a first node set formed by all nodes in the distributed system, where each node in the second node set is a node that is allowed to execute a task to be allocated in the distributed system, and the number of nodes included in the second node set is less than or equal to the number of nodes included in the first node set;

a second determining module 504, configured to determine a degree of association parameter of a task to be currently allocated and each node in the second node set, where the degree of association parameter is used to indicate a degree of association of the task to be currently allocated and a task received by each node in the second node set;

and the allocation module 506 is configured to allocate the current task to be allocated to a target node with the highest association degree in the second node set for execution.

In an embodiment of the present invention, as shown in fig. 6, the state information of each node in the distributed system includes: the number of incomplete tasks, the average load parameter in the first preset time period, and the estimated completion time of the incomplete tasks; the first determining module 502 includes: an obtaining unit 5021, configured to obtain status information of each node in the distributed system; a determining unit 5022, configured to determine a performance evaluation value of each node in the distributed system according to the state information; the determining unit is further configured to determine the second node set from the first node set in the distributed system according to the performance evaluation value.

In the embodiment of the present invention, the determining unit 5022 is further configured to compare the performance evaluation value of each node in the distributed system with a preset threshold; and determining a set formed by nodes with performance evaluation values smaller than the preset threshold value as the second node set.

In the embodiment of the present invention, the determining unit 5022 is further configured to normalize the number of incomplete tasks to obtain the number of incomplete tasks after normalization; determining a first product of a first weight and the number of incomplete tasks after normalization, determining a second product of a second weight and the average load parameter, and determining a third product of a third weight and the estimated completion time; wherein the first weight is greater than the second weight, the second weight is greater than the third weight; and determining the sum of the first product, the second product and the third product as the performance evaluation value.

In the embodiment of the present invention, the second determining module 504 is further configured to obtain a maximum number of tasks allocated to each node in the second node set within a second predetermined time, determine the maximum number of tasks as a first number of tasks, obtain a mean value of the number of tasks allocated to each node in the second node set within the second predetermined time, determine the mean value as a second number of tasks, and obtain a number of tasks included in a task set to be allocated in the distributed system, and determine a number of tasks included in the task set to be allocated as a third number of tasks; wherein the task set to be allocated comprises the current task to be allocated; determining a ratio of the third number of tasks to the second number of tasks; and determining a covariance value between the ratio and the first task number, and determining the covariance value as the association degree parameter.

In this embodiment of the present invention, the allocation module 506 is further configured to arrange each node in the second node set according to a first preset order according to the magnitude of the association degree parameter, so as to obtain a target node set after the first preset order is arranged; determining a node corresponding to the maximum value of the association degree parameter in the target node set as the target node; and distributing the current task to be distributed to the target node for execution.

In the embodiment of the invention, each task to be allocated in a task set to be allocated in the distributed system is arranged in a task pool in the distributed system according to a second preset sequence, the current task to be allocated is any one task to be allocated in the task set to be allocated, and the task pool is used for storing the task set to be allocated in the distributed system.

Embodiments of the present invention also provide a computer-readable storage medium including a stored program, wherein the program, when run, performs the method of any one of the above.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store program code for performing the steps of:

s1, determining a second node set in a first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system, wherein each node in the second node set is a node allowed to execute tasks to be distributed in the distributed system, and the number of the nodes included in the second node set is smaller than or equal to the number of the nodes included in the first node set;

S2, determining a correlation degree parameter of a current task to be distributed and each node in the second node set, wherein the correlation degree parameter is used for indicating the correlation degree of the current task to be distributed and the task received by each node in the second node set;

and S3, distributing the current task to be distributed to a target node with the highest association degree in the second node set for execution.

Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing program codes.

An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.

Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:

Alternatively, specific examples in this embodiment may refer to examples described in the foregoing embodiments and optional implementations, and this embodiment is not described herein.

It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may alternatively be implemented in program code executable by computing devices, so that they may be stored in a memory device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than that shown or described, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps within them may be fabricated into a single integrated circuit module for implementation. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A distributed task allocation optimization method applied to a distributed system, comprising the following steps:

Determining a second node set in a first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system, wherein each node in the second node set is a node which is allowed to execute tasks to be distributed in the distributed system, and the number of the nodes included in the second node set is smaller than or equal to the number of the nodes included in the first node set;

determining a degree of association parameter of a current task to be allocated and each node in the second node set, wherein the degree of association parameter is used for indicating the degree of association between the number of tasks of the current task to be allocated and the number of tasks of the tasks received by each node in the second node set;

distributing the current task to be distributed to a target node with highest association degree in the second node set for execution;

the determining the association degree parameter of the current task to be allocated and each node in the second node set includes: obtaining the maximum task number of each node in the second node set, which is allocated with tasks in a second preset time, determining the maximum task number as a first task number, obtaining the average value of the task numbers of each node in the second node set, which is allocated with tasks in the second preset time, determining the average value as a second task number, obtaining the task number included in a task set to be allocated in the distributed system, and determining the task number included in the task set to be allocated as a third task number; wherein the task set to be allocated comprises the current task to be allocated; determining a ratio of the third number of tasks to the second number of tasks; and determining a covariance value between the ratio and the first task number, and determining the covariance value as the association degree parameter.

2. The method of claim 1, wherein the state information for each node in the distributed system comprises: the number of incomplete tasks, the average load parameter in the first preset time period, and the estimated completion time of the incomplete tasks;

the determining a second node set in the first node set formed by all nodes of the distributed system according to the state information of each node in the distributed system comprises the following steps:

acquiring state information of each node in the distributed system;

determining a performance evaluation value of each node in the distributed system according to the state information;

and determining the second node set in the first node set in the distributed system according to the performance evaluation value.

3. The method of claim 2, wherein said determining the second set of nodes from the first set of nodes in the distributed system based on the performance assessment value comprises:

comparing the performance evaluation value of each node in the distributed system with a preset threshold value;

and determining a set formed by nodes with performance evaluation values smaller than the preset threshold value as the second node set.

4. The method of claim 2, wherein said determining a performance assessment value for each node in the distributed system comprises:

normalizing the number of the unfinished tasks to obtain the number of the unfinished tasks after normalization;

determining a first product of a first weight and the number of incomplete tasks after normalization, determining a second product of a second weight and the average load parameter, and determining a third product of a third weight and the estimated completion time; wherein the first weight is greater than the second weight, the second weight is greater than the third weight;

and determining the sum of the first product, the second product and the third product as the performance evaluation value.

5. The method of claim 1, wherein assigning the current task to be assigned to the target node of the second set of nodes that is most highly associated with the target node comprises:

arranging each node in the second node set according to a first preset sequence according to the magnitude of the association degree parameter to obtain a target node set after the first preset sequence is arranged;

determining a node corresponding to the maximum value of the association degree parameter in the target node set as the target node;

And distributing the current task to be distributed to the target node for execution.

6. The method according to any one of claim 1 to 5, wherein,

arranging each task to be distributed in a task set to be distributed in the distributed system in a task pool in the distributed system according to a second preset sequence, wherein the current task to be distributed is any one task to be distributed in the task set to be distributed, and the task pool is used for storing the task set to be distributed in the distributed system.

7. A distributed task allocation optimization device applied to a distributed system, comprising:

a first determining module, configured to determine, according to status information of each node in the distributed system, a second node set from a first node set formed by all nodes in the distributed system, where each node in the second node set is a node that allows execution of a task to be allocated in the distributed system, and the number of nodes included in the second node set is less than or equal to the number of nodes included in the first node set;

the second determining module is used for determining a correlation degree parameter of a current task to be distributed and each node in the second node set, wherein the correlation degree parameter is used for indicating the correlation degree between the task number of the current task to be distributed and the task number of the received task of each node in the second node set;

The allocation module is used for allocating the current task to be allocated to the target node with the highest association degree in the second node set for execution;

the second determining module is further configured to determine a degree of association parameter between the current task to be allocated and each node in the second node set by executing the following steps: obtaining the maximum task number of each node in the second node set, which is allocated with tasks in a second preset time, determining the maximum task number as a first task number, obtaining the average value of the task numbers of each node in the second node set, which is allocated with tasks in the second preset time, determining the average value as a second task number, obtaining the task number included in a task set to be allocated in the distributed system, and determining the task number included in the task set to be allocated as a third task number; wherein the task set to be allocated comprises the current task to be allocated; determining a ratio of the third number of tasks to the second number of tasks; and determining a covariance value between the ratio and the first task number, and determining the covariance value as the association degree parameter.

8. A computer-readable storage medium, characterized in that the storage medium has stored therein a computer program, wherein the computer program is arranged to execute the method of any of the claims 1 to 6 when run.

9. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 1 to 6.