US10733026B2 - Automated workflow selection - Google Patents
- Publication number
- US10733026B2 (application US15/965,550)
- Authority
- US
- United States
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
Definitions
- This disclosure relates to a method and apparatus for controlling the workload of individual computing systems in an information handling system in which incoming work requests are placed in a queue for processing by one or more computing systems.
- Integrated circuits (ICs) are becoming more complex as density increases. Consequently, IC testing is also becoming more complex in step with the complexity and density of ICs.
- An automated workload selection system (workload management system) submits jobs (compute-intensive workloads, such as chip design) to the batch processing system to keep the batch processing system continually full of useful work.
- A batch processing system, such as Load Leveler or LSF, has the ability to receive submitted jobs and run them on many computing systems simultaneously. When a machine is available, a job executes. If all machines in the batch system are running jobs, extra jobs are placed in a queue for future execution. Batch systems with a large number of computers represent a significant capital investment. Keeping the batch system full of jobs (such that none of the computers in the processing system is idle) offers advantages such as increased efficiency and utilization.
- Embodiments of the present invention provide a system operable to allocate workload within a batch processing system.
- This system includes one or more servers and a batch processing system communicatively coupled to the one or more servers.
- The one or more servers are operable to receive a work request and execute work management processing software.
- The work management processing software validates individual workloads, determines the number of workloads to select, and then selects a set of valid workloads to be processed. The system may then submit the set of valid workloads to a batch processing system.
- The batch processing system has a number of computing resources, and the work management processing module directs workloads to individual computing resources within the batch processing system based on a variety of criteria.
- The workloads may include one or more parent workloads and one or more child workloads.
- Embodiments provide an N-level tree hierarchy such that one child workload may be a parent to several other child workloads.
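The n-level parent/child hierarchy described above can be sketched minimally as follows. This is an illustrative assumption of how such a tree might be represented; the class and attribute names are not from the patent.

```python
# A minimal sketch of the n-level workload tree described above. The class
# and attribute names are illustrative assumptions, not the patent's API.
class Workload:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []   # child workloads; empty for a leaf

    def is_group(self):
        return bool(self.children)

# One child workload may itself be a parent to other child workloads:
root = Workload("all-projects", [
    Workload("chip-A", [Workload("unit-tests"), Workload("full-chip-sim")]),
    Workload("chip-B"),
])
```

Because any node may hold further children, the same validation and selection logic can be applied uniformly at every level of the tree.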
- Another embodiment of the present invention provides a computer-implemented method of allocating workloads within a batch processing system.
- This method begins by receiving a work request at a work management software module executed on a server.
- This work request may involve many different workloads.
- The work management processing software determines the number of individual validated workloads to select.
- The batch processing system, having a number of computing resources, may then receive the submitted set of valid workloads to be processed. Individual workloads may be directed to individual computing resources.
- Yet another embodiment of the present invention provides another method of automatically allocating workloads to a batch processing system.
- This method involves receiving a work request and validating individual workloads.
- The workloads may be one or more parent workloads having one or more child workloads.
- Individual child workloads may serve as parents to sub child workloads.
- The batch processing system receives a selected set of valid workloads for processing after the number of workloads to be selected has been determined.
- The work management software may further allocate and reallocate computing resources to child workloads within a first parent workload without impacting the allocation of computing resources within a second parent workload.
- Further embodiments of the present invention may halt processing within a first parent workload when the number of failures associated with child workloads within the first parent workload exceeds a threshold value.
- The computing resources assigned to the halted first parent workload may be reassigned to the remaining parent workloads.
- Embodiments of the present invention also facilitate intelligently executing portion recovery of the computing resources assigned to the remaining parent workloads when processing within the first parent workload resumes. These workloads may be associated with complex testing, such as simulation testing of an integrated device or circuit.
- FIG. 1 illustrates the environment and the key features of the present disclosure for an exemplary embodiment
- FIG. 2 is a flow diagram of a client work request 202 from a network (see FIG. 1 ) in accordance with embodiments of the present disclosure
- FIG. 3 is a logic flow diagram for the submission of jobs to the batch processing system for compute intensive workloads, such as chip design in accordance with embodiments of the present disclosure
- FIG. 4 is a logic flow diagram associated with an algorithm to validate and select workloads to be submitted to a batch processing system in accordance with embodiments of the present disclosure
- FIG. 5 provides a logic flow diagram associated with Update Job Counts block of FIG. 4 in further detail in accordance with embodiments of the present disclosure
- FIGS. 6A and 6B provide a logic flow diagram that relates to selecting workloads in more detail than that of block 412 of FIG. 4 in accordance with embodiments of the present disclosure
- FIGS. 7A and 7B provide a logic flow diagram in accordance with embodiments of the present disclosure that employs a round-robin strategy as the pluggable strategy identified in FIG. 6A
- FIG. 8 provides a logic flow diagram in accordance with embodiments of the present disclosure associated with an algorithm to minimize batch processing depth
- FIG. 9 provides a logic flow diagram in accordance with embodiments of the present disclosure associated with an algorithm to address a highly variable completion rate and job submission rate
- FIG. 10 provides a logic flow diagram associated with an algorithm to address workload portion recovery in accordance with an embodiment of the present disclosure.
- FIG. 11 provides a logic flow diagram associated with an algorithm for workload portion recovery in accordance with an embodiment of the present disclosure.
- The figures illustrate embodiments of the present disclosure, with like numerals used to refer to like and corresponding parts of the various drawings.
- Embodiments of the present disclosure provide a server-based software application that can continuously allocate computing resources between projects, groups, and various workloads. These embodiments allow the allocation not only of computing resources but also of the manpower associated with groups. This results in both high CPU utilization and high manpower utilization.
- Embodiments of the present disclosure provide an automated allocation of computing resources in a batch processing system that considers not only projects and their workloads but also the results of simulations. By identifying errors or failures and comparing their count to a threshold level, further simulation on the batch processing system for a given project or group may be stopped when the number of errors or failures exceeds that threshold. This allows both the manpower required to address those failures and the computing resources that would otherwise identify additional related failures to be reallocated, resulting in more effective utilization of manpower and processing power.
- Workload management is a concept whereby units of work (jobs, processes, threads, etc.) are managed and provided system resources. Reassigning resources from one group to another may produce a net positive effect in performance, as determined by predefined performance criteria. Workload management of this type differs from the run-of-the-mill resource management performed by most operating systems in that the assignment of resources is determined not only by the effect on the work units to which the resources are reassigned, but also by the effect on the work units from which they are taken.
- FIG. 1 illustrates the environment and the key features of the present disclosure for an exemplary embodiment comprising a batch processing system or cluster 102 of interconnected, cooperating computing resources such as computer systems 104, 106, 108, and 110, and servers 112 and 118.
- Various clients 114 provide workload definitions to server 112.
- The server continually executes a workload management selection processing module 116.
- The workload management processing module selects workloads based on the batch processing status, previous workload results, hierarchical priorities, and workload configuration settings.
- The selected workloads are submitted to the batch processing system or cluster 102 that includes cooperating computer systems 104, 106, 108, and 110, and optional central server 118.
- The environment of the disclosure is that of a queue of work requests and a pool of servers distributed across the cluster that service the work requests.
- The disclosure allows management of the number of cooperating computer systems based on the performance goal classes of the queued work and the performance goal classes of competing work in the systems 100.
- Those skilled in the art will recognize that any number of cooperating computer systems and any number of such queues and groups of servers 112 may be used without departing from the spirit or scope of the disclosure.
- FIG. 2 is a flow diagram of a client work request 202 from a network (see FIG. 1 ) to which system 100 is connected, to an address space or individual computing system 208 managed by the workload manager 204 .
- Work management software 204 is a component inside or outside of the operating system of the individual computing systems, which uses operating system services to define one or more queues 206 to a workload manager (WLM) 204 and to insert work requests 202 onto those queues.
- The workload manager 204 ensures that requests provided to those individual computing systems have affinity to the tests to be performed.
- Testing infrastructure associated with complex integrated systems is no longer a matter of merely testing a set of instructions. Rather, testing, particularly in the hardware environment, is now based on directed random templates, wherein random seeds may be tested repeatedly and yield different results.
- Embodiments of the present disclosure provide algorithm(s) to validate and select workloads to submit to a batch processing system.
- The algorithm uses a combination of: (1) an n-level tree hierarchy to represent the workload portions; (2) selectable workload validation algorithms at any level of the tree; (3) isolation of workload validation, such that validation impact may be limited to a single workload group's children; (4) giving the resources assigned to an invalid workload to the workload's peers; (5) maximum limits on workloads and workload groups, for a user who does not want a peer overtaking too much of the batch processing system; (6) selectable workload selection algorithms at any level of the tree; (7) isolation of the portion distribution of a workload or workload group to its parent's children; (8) portion alteration limits and algorithms, so that external events like coverage can modify the portion distribution of a workload group's children; and (9) a high-priority workload group that can be added anywhere in the tree and is isolated to only steal selections from its siblings' portions. Together, these achieve a selection system that enables many large and small teams to dynamically share a common batch processing system.
- FIG. 3 is a logic flow diagram for the submission of jobs to the batch processing system for compute intensive workloads, such as chip design.
- The jobs or workloads may be organized into an n-level tree hierarchy. This method may be executed using an automated workload selection system software layer to submit jobs to the batch processing system so as to keep the batch processing system continually full of useful work.
- Operations 300 begin with block 302 .
- The processes of block 302 compute updated job counts.
- The workload management server application may then determine how many workloads to select. Processes of block 306 validate these workloads. Validation of workloads may include identifying those workloads that have already reached a predetermined threshold level of failures.
- Block 308 selects workloads from the set of validated workloads.
- The workload management application may submit jobs to the batch processing system.
- The submission of jobs to the batch processing system may include a wrapper or header associated with the jobs that specifies the processing requirements associated with each job. Specifying the processing requirements ensures the matching of systems having the proper computing resources to jobs that require those resources.
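The FIG. 3 cycle might be sketched as below. This is a simplifying assumption: `is_valid` and `submit` are injected callables, and selection here is plain truncation rather than one of the pluggable strategies discussed for FIG. 6A.

```python
# An illustrative sketch of the FIG. 3 submission cycle. The function
# names and the dict-shaped job wrapper are assumptions, not the patent's API.
def submission_cycle(workloads, capacity, is_valid, submit):
    """Validate workloads, select up to `capacity`, and submit them as jobs."""
    valid = [w for w in workloads if is_valid(w)]   # block 306: validate
    selected = valid[:capacity]                     # block 308: select
    for w in selected:
        # A wrapper/header stating processing requirements lets the batch
        # system match each job to machines with the proper resources.
        submit({"workload": w, "requirements": {}})
    return selected
```

Running the cycle repeatedly, with `capacity` derived from the batch system's free slots, keeps the farm continually fed with useful work.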
- FIG. 4 provides a logic flow diagram associated with an algorithm to validate and select workloads for submission to a batch processing system in accordance with embodiments of the present disclosure.
- A workload or parent workload could be designated invalid because it had too many fails or because the time fell during weekend hours.
- The disclosure provides that one or more validator algorithms can be associated with any workload or parent workload in the workload hierarchy.
- Operations 400 begin with block 402 .
- Processes within block 402 update validator status.
- Decision point 404 determines whether the pool of workloads is valid. The process then ends when the pool is not valid. Otherwise, block 406 computes an update for expected results when the pool is valid.
- Block 408 updates the job counts. Block 410 then determines how many workloads to select.
- Block 412 selects workloads for submission as jobs in block 414 .
- This algorithm uses an N-level tree hierarchy to represent workload portions.
- Previous hierarchy systems may have been limited to a predetermined number of hierarchies such as high priority and low priority workloads.
- The N-level tree hierarchy used to represent workload portions allows embodiments of the present disclosure to address the inadvertent impact associated with killing one group of workloads when new workloads for that group are presented.
- Prior solutions allowed the workloads of other groups to subsume all the resources when an interruption of one group's work occurred. This resulted in no resources being available when a group that had previously experienced an interruption recovered.
- Updating validator status, as shown in FIG. 4, simply walks the tree. This can be an expensive operation, so it executes first, and the following steps can then query the results. The algorithm can handle submitting to multiple batch processing systems (pools), which improves decision making with remote batch processing systems.
- Update Expected Status runs through a calculation of what the user should expect to see based on the current portion ratios and validation status. However, since this workload selection algorithm must wait for jobs to complete on the batch processing system before submitting new jobs, the select workloads algorithm extends this Update Expected Status algorithm and accounts for current jobs in the batch processing system.
- Various selection algorithms can select the workloads of a parent workload's children. In addition, users may assign the various selection algorithms to any parent workload in the workload hierarchy. For example, the selection could be random, round robin, or weighted based on how each workload compares against arbitrary criteria.
- Update Job Counts uses the following algorithm to count jobs for workloads and workload groups.
- FIG. 5 describes one embodiment of this algorithm in further detail.
- First, block 410 determines how many workloads to select; this can be done by a variety of algorithms and could be as simple as comparing the maximum jobs allowed on the pool to the current jobs on the pool.
- Embodiments of the present disclosure provide the benefit of updating estimates of when expected results may occur.
- A user may receive an update relating to when simulation results will be available, based on dynamic changes in the jobs submitted to and potentially removed from the queue.
- The question "is the pool valid?" may be better stated as "is the batch system to which the jobs will be submitted up and available?"
- Embodiments of the present disclosure may provide a heuristic process wherein jobs may be related. If the simulation of a particular job results in too many failures, unfinished jobs in the queue related to those failing results may be ended so that processing time is not wasted on a faulty design. Ending these unfinished jobs allows reallocation of resources to other jobs unrelated to the failing simulation. These jobs may be part of different workloads or projects, but the relationship between jobs may be defined such that it is possible to end one job based on failure results in one or more related jobs. Thus, all testing related to the simulation of one particular revision or model may be concluded.
- FIG. 5 provides a logic flow diagram associated with Update Job Counts block 408 of FIG. 4 in further detail in accordance with embodiments of the present disclosure.
- Operations 500 begin with block 502, which determines whether the workload is a leaf or a group. If the workload is a group, as determined at decision point 504, several process iterations identify all the individual jobs associated with the group. If the workload is not a group at decision point 504, block 514 determines the updated job count based on queued and running jobs as well as the current workload's group. It should be noted that a leaf node may contain more than one job; for example, a leaf node may contain 20,000 jobs. Similarly, a group may contain more than one leaf node.
- Block 506 examines the workload and determines the number of jobs associated with that group workload.
- Decision point 510 identifies whether or not additional workloads exist. If no more group workloads exist, the instant workload's count results from summing the children's counts in block 512. However, if additional workloads are present, the process iterates through all of the children workloads until all of the children's counts have been identified and summed in block 512.
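The FIG. 5 roll-up can be sketched recursively: leaves count their queued and running jobs, and groups sum their children's counts. The dict-based node shape is an assumption for illustration.

```python
# A sketch of the FIG. 5 job-count update: leaves count queued plus running
# jobs (block 514); groups sum their children's counts (blocks 506-512).
def update_job_count(node):
    if node.get("children"):                 # decision point 504: a group
        node["count"] = sum(update_job_count(c) for c in node["children"])
    else:                                    # a leaf: queued + running jobs
        node["count"] = node["queued"] + node["running"]
    return node["count"]
```

Because the recursion writes a `count` onto every node it visits, later steps (such as Select Workloads) can read the cached totals without re-walking the tree.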
- The key section of the algorithm selects workloads and dynamically rebalances portions based on validators.
- The algorithm gives the user optional maximum-limit definitions to prevent workload groups from completely overtaking the portions of a temporarily invalid workload sibling.
- FIGS. 6A and 6B describe one select workloads recursive algorithm used by embodiments of the present disclosure.
- FIGS. 6A and 6B provide a logic flow diagram that relates to selecting workloads in more detail than that of block 412 of FIG. 4 in accordance with embodiments of the present disclosure.
- Operations 600 begin by navigating through a workload group of available jobs in block 602. If the group is not valid in block 604, block 624 returns this number of jobs as unused or available jobs. If the group is valid, block 606 checks the available jobs against an optional maximum limit; this prevents a team from subsuming the queue. Block 608 then finds the valid child workloads. The current jobs plus the previously allocated jobs should be less than (<) the maximum number of jobs.
- Block 610 introduces the concept of a pluggable strategy that selects the workloads and returns the unused jobs. This pluggable strategy allows a workload to use a particular set of rules of how a workload and child workloads or related workloads would be handled. These rules may employ a round robin, a weighted round robin or other strategy.
- FIG. 6A continues with FIG. 6B .
- Block 626 again calls the recursive algorithm to determine whether other children can use the job slots. The process then returns any unused jobs.
- When the unused jobs count is not equal to zero and/or the unused child jobs count is greater than (>) zero, the unused jobs count is set equal to the unused child jobs count.
- Operations 600 identify unused jobs and return the unused jobs such that the unused jobs can be reallocated. Selected workloads translate into jobs for submission to the batch processing system.
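The recursion of FIGS. 6A and 6B might be sketched as follows. The node shapes, key names, and the per-leaf `want` field are illustrative assumptions; the essential behavior is that invalid groups return all their slots unused, optional maximum limits cap a group's share, and leftover slots bubble back up for reallocation.

```python
# Hedged sketch of the recursive selection in FIGS. 6A and 6B.
def select_workloads(group, available_jobs, selected):
    """Select child workloads; return the number of unused job slots."""
    if not group["valid"]:                       # block 604 -> block 624
        return available_jobs                    # everything comes back unused
    cap = group.get("max_jobs")                  # block 606: optional maximum
    usable = min(available_jobs, cap) if cap is not None else available_jobs
    unused = usable
    for child in group.get("children", []):      # block 608: valid children
        if not child["valid"]:
            continue
        if child.get("children"):                # recurse into a subgroup
            unused = select_workloads(child, unused, selected)
        else:                                    # a leaf takes up to its want
            take = min(unused, child.get("want", 0))
            selected.extend([child["name"]] * take)
            unused -= take
    return (available_jobs - usable) + unused    # slots free for reallocation
```

Because every level returns its unused slots, a sibling of an invalid or capped group can pick them up without any global rebalancing pass.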
- FIGS. 7A and 7B provide a logic flow diagram in accordance with embodiments of the present disclosure that employ a round-robin strategy as the pluggable strategy identified in FIG. 6A .
- This allows priority differentiation of workloads to be addressed.
- A high-priority workload group can be inserted anywhere in the workload hierarchy.
- The work management processing module only allows the high-priority workload to take a configurable amount from peer workloads under the same parent workload. This allows high-priority workloads quick service, but within fairness across all workload groups, which could, for example, represent multiple projects competing for the same batch system's resources.
- A high-priority group can take a parameterized ratio of its children groups.
- One parent can have a high-priority workload take priority over related and child workloads up to a certain percentage.
- This strategy allows processing the high-priority workload without affecting unrelated workloads.
- A high-priority workload will not impact the work of other workgroups.
- The ratios may also prevent a child workload from being starved.
- The pluggable strategy allows a parent workload to determine the rules applied to all of its children workloads. If there is a high-priority group, special handling is required to determine how many jobs the high-priority group can take from its siblings. The total number of jobs available to the siblings does not include the jobs removed from the siblings and reallocated to the high-priority group. After removing the reallocated jobs, the processes use the new total number of jobs to reallocate jobs among the siblings.
- Block 702 allocates selected workload jobs.
- Block 704 identifies the selected workloads.
- Block 706 sums the workload ratios.
- Block 708 determines whether there is a high-priority group. If there is, block 710 determines the number of high-priority jobs as a function of the number of jobs to allocate and the high-priority ratio.
- Block 712 removes the assigned high-priority ratio from the groups' ratio sum.
- Block 714 removes the jobs for high priority from the jobs available for allocation.
- Block 720 determines whether or not there are more jobs (jobs, job slots, or jobs to allocate/count) and workloads to be allocated. If there are not any more, block 730 returns unused jobs to the pool of available jobs. Otherwise, block 722 retrieves the next selected workload.
- Block 724 sets the number of jobs for the workload as a function of the jobs to allocate and the workload's ratio, wherein the workload's ratio may have been adjusted previously by removing the high-priority component of the ratio.
- Block 726 shows that the jobs for the workload must be greater than one (>1) and less than (<) the workload's max jobs. Block 728 then determines whether all the selected workloads have been processed. If they have not, the process returns to block 720.
- Block 732 retrieves a first-selected workload.
- Decision point 734 determines whether the workload can have additional jobs. If the workload cannot have more jobs, block 736 removes the workload from the selected list, and block 738 removes the workload's ratio from the ratio sum. Decision point 744 then determines whether more workloads are to be processed. If so, block 742 retrieves the next selected workload and the process returns to block 734. Otherwise, block 740 computes the jobs to allocate as the difference between the jobs available for allocation and the jobs previously allocated.
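The ratio arithmetic of FIGS. 7A and 7B can be approximated in a short sketch: the high-priority group's share is carved out first, and the remaining jobs are split among the siblings by weight. The function name, dict shapes, and integer truncation are assumptions for illustration.

```python
# A simplified weighted allocation following FIGS. 7A-7B: the high-priority
# carve-out happens first (blocks 708-714), then the remaining jobs are
# split by ratio (blocks 706, 720-728). Names here are assumptions.
def allocate(jobs, ratios, high_name=None, high_ratio=0.0):
    """ratios: {workload name: weight}. Returns {name: job count}."""
    alloc = {}
    if high_name is not None:
        alloc[high_name] = int(jobs * high_ratio)   # high-priority share
        jobs -= alloc[high_name]                    # block 714: remove it
    total = sum(ratios.values())                    # block 706: ratio sum
    for name, weight in ratios.items():
        alloc[name] = int(jobs * weight / total)    # block 724: per-workload
    return alloc
```

Because the siblings divide only the jobs left after the carve-out, the high-priority group is served quickly without touching workloads under other parents.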
- Embodiments of the present disclosure also provide an algorithm to minimize batch processing queue depth.
- The automated workload selection system algorithm takes a finite amount of time to pick the next job(s) to submit, and an additional amount of time to submit the job to the batch farm. This execution time limits the rate at which jobs may be submitted to the batch farm, which is defined as the submit rate. If the submit rate is greater than (>) or equal to the job completion rate, then all computers in the batch farm will be utilized. In addition, customers may change the workload configuration, and the system must be able to respond to those changes in a reasonable amount of time.
- The work management processing module detects the job completion rate and dynamically increases or decreases the time between re-evaluations of the entire tree or set of workloads and parent workloads, thereby keeping the number of queued-but-not-running jobs in the batch system very low and avoiding the high latency penalties of changing priorities or reacting to invalid workloads.
- The automated workload selection system continuously monitors the completion rate and automatically adjusts the submit rate in order to keep the batch system full.
- The automated workload selection system can adjust how often the algorithm to pick the next job(s) runs, and how many jobs to submit in that run. For example, if jobs are completing at a rate of one per hour, the automated workload selection system need only run once per hour and submit one job. If the automated workload selection system detects that the completion rate is increasing, it can either reduce the time between runs or increase the number of jobs submitted per run.
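The pacing relationship described above might be sketched as below. The hourly baseline for an idle farm is an assumption, not a value from the patent.

```python
# Illustrative pacing of selection runs: sleep long enough that roughly
# `jobs_per_run` jobs complete between runs, matching submit rate to
# completion rate. The hourly idle-farm default is an assumption.
def next_sleep_time(completion_rate_per_hr, jobs_per_run=1):
    """Seconds to sleep before the next selection/submission run."""
    if completion_rate_per_hr <= 0:
        return 3600.0                 # nothing completing: check hourly
    return 3600.0 * jobs_per_run / completion_rate_per_hr
```

At one completion per hour this yields one run per hour; at sixty per hour the interval drops to a minute, or, equivalently, `jobs_per_run` can be raised while keeping the interval fixed.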
- the automated workload selection system may need to submit upwards of several hundred jobs per second at times to keep the batch system fully utilized. The time required to interact with the batch system to submit a job may prevent the automated workload selection system from achieving the required submit rate.
- The automated workload selection system may submit one or more helper job(s) to the batch system that, in turn, submit jobs to the batch system on behalf of the automated workload selection system. These helper jobs may automatically terminate when the completion rate falls, or additional helper jobs may automatically start when the completion rate increases.
- FIG. 9 provides a logic flow diagram in accordance with embodiments of the present disclosure associated with an algorithm to address a highly variable completion rate and job submission rate.
- The work management processing module may detect the job completion rate and the job submission rate and create or terminate job submission helpers (a new process or another server, for example) to increase or decrease the rate of submissions to match the completion rate, thereby ensuring that the work management processing module always keeps the batch systems busy with work.
- Operations 900 begin with block 902 where the submit rate is retrieved.
- Block 904 retrieves the completion rate.
- Decision point 906 determines whether the completion rate exceeds the submission rate. If it does, a submission helper processing module activates in block 910 and may be supplied work in block 912. Otherwise, if a submission helper processing module is active, it may terminate in block 908.
- Embodiments of the present disclosure also provide an algorithm to address workload portion recovery. Based on configuration changes or other events such as a workload completing, other workloads can take advantage of the portion of the batch processing system that is now unused. The problem occurs when the configuration changes or the completed workload starts up again. Historically, the workload starting up must wait for jobs to finish on the batch processing system before new workloads have a chance to be selected. This causes delays in starting the workload. The severity of this problem increases if the batch processing system's job completion rate is low. An alternative solution of killing all excess jobs of a workload is rarely acceptable, as certain workloads require protection from termination due to job importance or job length.
- Embodiments of the present disclosure provide a number of portion recovery algorithms that offer varying degrees of aggressiveness. Examples include: 1) Killing queued, but not running, jobs in the batch processing system from workloads that exceed their portion (Variations on this algorithm include: killing the oldest queued jobs because the newly queued ones would have the most recent workload configuration data; and killing the newest queued jobs); 2) Killing both queued and running jobs in the batch processing system from workloads that exceed their portion; 3) Waiting a programmable amount of time before killing jobs via method 1 or 2 in the batch processing system from workloads that exceed their portion; and 4) Killing up to a certain number of jobs per unit of time in the batch processing system from workloads that exceed their portion.
- FIG. 10 provides a logic flow diagram associated with an algorithm to address workload portion recovery in accordance with embodiments of the present disclosure.
- Workload portion recovery may occur when a previously terminated workload returns to service and the available workload allocation associated with the previously terminated workload had been returned to the batch processing system as being unused.
- the present disclosure provides a process to reclaim that workload's allocation. Prior to the configuration change, this allowed other workloads to take advantage of the portion of the batch processing system that was unused.
- Operations 1000 begin with block 1002 querying the batch processing system for the status of a workload's jobs. These operations allow other jobs associated with siblings to terminate if necessary.
- Block 1004 picks the newest job for evaluation.
- Decision Point 1006 determines whether a time limit has been reached to start killing jobs.
- If the time limit has not been reached, block 1018 questions whether a start time for killing jobs has been set. If not, block 1020 sets the time to start killing jobs. If the time limit has expired at Decision Point 1006, then Decision Point 1008 determines whether the job kill rate has exceeded the job kill rate limit. If the job kill rate limit has been exceeded, then this recovery algorithm is complete. Otherwise the process continues. Decision Point 1010 determines whether the job status is queued. If the job status is not queued, then the process completes. Otherwise, the kill request for the job may be sent to the batch processing system in block 1012. Block 1014 updates the kill rate. This process continues iteratively at Decision Point 1016, where, if the entire group of workload jobs has not been processed, the process selects the next newest job by returning to block 1004. Otherwise, the recovery algorithm is complete. This ensures that short-running jobs may be completed while longer-running jobs may be removed to facilitate the reallocation of jobs to an existing workload that is being recovered.
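One pass of the FIG. 10 flow could look like the sketch below. The job-status strings, the `(job_id, status)` tuples, and the `send_kill` callback are hypothetical stand-ins for the real batch-system interface:

```python
def recover_portion(excess_jobs, now, kill_start_time, wait_seconds,
                    kills_so_far, kill_rate_limit, send_kill):
    """One pass over a workload's excess jobs, newest first. Returns
    the (possibly newly set) kill start time and the updated kill
    count used for rate accounting."""
    if kill_start_time is None:
        return now, kills_so_far              # blocks 1018/1020: start the clock
    if now - kill_start_time < wait_seconds:
        return kill_start_time, kills_so_far  # decision 1006: keep waiting
    for job_id, status in excess_jobs:        # blocks 1004/1016: next newest job
        if kills_so_far >= kill_rate_limit:
            break                             # decision 1008: kill-rate limit hit
        if status != "QUEUED":
            break                             # decision 1010: job not queued; stop
        send_kill(job_id)                     # block 1012: ask batch system to kill
        kills_so_far += 1                     # block 1014: update the kill rate
    return kill_start_time, kills_so_far
```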
- Embodiments of the present disclosure also provide a user interface (UI) for users to associate one of the portion recovery algorithms with a workload. This allows users to be aggressive in killing jobs in some workloads while protecting certain workloads from being killed due to the importance of the job or concern about killing long-running jobs where too much compute resource would be wasted.
- FIG. 11 provides a logic-flow diagram associated with an algorithm for workload portion recovery in accordance with embodiments of the present disclosure.
- Operations 1100 begin with block 1102 querying the batch processing system for the status across all workloads. Then block 1104 evaluates the current portion for all workloads and workload groups. Block 1106 retrieves the next workload. Decision Point 1108 examines this workload and decides whether the workload has more jobs in the batch processing system than its current portion and ratio allow. If the workload does not have more than its share, then the process continues and determines whether more workloads exist for examination at Decision Point 1116. Otherwise, the workload configuration is read in block 1110 and the selected portion recovery algorithm is read in block 1112. A portion recovery algorithm such as the exemplary portion recovery algorithm of FIG. 10 may then be applied in block 1114.
- the automated workload selection system determines if a workload or workload group has more jobs in the batch processing system than its current portion. The calculation must take into account the status of all workloads and workload groups to determine a workload or workload group's current portion. If a workload exceeds its portion, the automated workload selection system reads the associated recovery algorithm and executes it on the number of jobs in excess of the workload's current portion.
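That portion calculation might be sketched as below; the dictionary shape (`name -> (ratio, jobs_in_system)`) and the notion of a portion as a ratio share of total slots are illustrative assumptions:

```python
def find_excess(workloads, total_slots):
    """Return, per workload, how many jobs exceed its current portion.
    `workloads` maps a name to (ratio, jobs currently in the batch
    system); a portion is the workload's ratio share of total_slots."""
    total_ratio = sum(ratio for ratio, _ in workloads.values())
    excess = {}
    for name, (ratio, jobs) in workloads.items():
        portion = total_slots * ratio / total_ratio
        if jobs > portion:
            # Jobs the selected recovery algorithm may act upon.
            excess[name] = jobs - int(portion)
    return excess
```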
- Embodiments of the present disclosure provide an algorithm for communicating workload change impacts to users.
- Embodiments allow the Automated Workload Selection System to provide intuitive control and viewing of system status through the combination of: 1) Algorithms to automatically decrease workload portions when a user increases a sibling's workload (including a feature to lock workloads that should not be altered); 2) Algorithms to translate portion allocations into various definitions of the number of jobs expected on the queue: a) Lower bound expectations (assume validators are true for all workloads); b) Historically bounded expectations (use the history of validator and portion distribution status); c) Current expectations (using the current Automated Workload Selection Algorithm); and d) Actual number of jobs (lags current expectations because jobs must complete before the Automated Workload Selection can submit more work); and 3) Algorithms to predict the time for a workload's completion.
- the algorithms may predict the time for a workload's completion based on: 1) the history of validator and portion distribution status; or 2) the history of validator status and a new portion distribution status. Alternatively, given a history of the validator status and a desired time of completion, the algorithm can provide a recommended portion distribution to achieve those goals.
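As a toy illustration of both directions, assume a workload's throughput is its portion (in slots) times a historical per-slot completion rate; the function names and units are assumptions, not the patent's formulas:

```python
import math

def predict_completion_hours(jobs_remaining, portion_slots, jobs_per_slot_hour):
    """Forward direction: predict time to completion from a portion
    distribution and a historically observed per-slot completion rate."""
    return jobs_remaining / (portion_slots * jobs_per_slot_hour)

def recommend_portion(jobs_remaining, desired_hours, jobs_per_slot_hour):
    """Reverse direction: given a desired completion time, recommend
    the portion (in slots) needed to achieve it."""
    return math.ceil(jobs_remaining / (desired_hours * jobs_per_slot_hour))
```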
- a batch processing system such as LoadLeveler or LSF has the ability to run jobs on many computing resources simultaneously.
- the batch system receives jobs, and when a machine is available, directs a computing resource to execute the job.
- when all machines (computing resources) associated with the batch system are running jobs, a queue of extra jobs for future execution results.
- an automated workload selection system software layer may submit jobs to the batch processing system to keep the batch processing system continually full of useful work.
- the job submission system provides for: organizing workloads; assigning relative ratios between workloads; associating arbitrary workload validation algorithms with a workload or parent workload to ensure work is useful; associating arbitrary selection algorithms with a workload or workload group; defining high priority workloads that preserve the fairness of the overall system; and constantly balancing the workload selection based on current status of the batch system, validation status, and the workload ratios.
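The validate-then-balance step in that list can be sketched as follows; the workload records, the `validator` callables, and proportional slot splitting are illustrative assumptions about one possible selection algorithm:

```python
def select_workloads(workloads, slots_free):
    """Drop workloads whose validation algorithm reports them as not
    useful, then split the free batch-system slots across the rest in
    proportion to their configured ratios."""
    valid = {name: w for name, w in workloads.items() if w["validator"]()}
    total_ratio = sum(w["ratio"] for w in valid.values())
    return {name: slots_free * w["ratio"] // total_ratio
            for name, w in valid.items()}
```

For example, with two valid workloads in a 3:1 ratio and a third failing validation, eight free slots split six and two.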
- the system provides for minimizing the batch system's queue depth to reduce the latency to respond to quickly changing priorities and workload validation results.
- embodiments of the present disclosure provide a system operable to allocate workload within a batch processing system.
- This system includes one or more servers and a batch processing system communicatively coupled to the one or more servers.
- the one or more servers are operable to receive a work request and execute work management processing software.
- the work management processing software validates individual workloads, determines the number of workloads to select and then selects a set of valid workloads to be processed.
- a batch processing system then receives this set of valid workloads.
- the batch processing system includes a number of computing resources wherein the work management processing module directs workloads to individual computing resources based on a variety of criteria within the batch processing system.
- the workloads may include one or more parent workloads and one or more child workloads.
- Embodiments provide an N-level tree hierarchy such that one child workload may be a parent to several other child workloads.
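The N-level hierarchy might be modeled with a simple recursive structure; the class shape and field names are illustrative, not the patent's data model:

```python
class Workload:
    """A workload node: any node may carry a ratio of its own and child
    workloads, so a child workload can itself be a parent."""

    def __init__(self, name, ratio=1, children=None):
        self.name = name
        self.ratio = ratio
        self.children = children or []

    def depth(self):
        """Depth of the tree rooted here (1 for a leaf workload)."""
        return 1 + max((c.depth() for c in self.children), default=0)

# A three-level example: "regression" is both a child of the root and
# a parent of two leaf workloads.
root = Workload("all", children=[
    Workload("regression", children=[Workload("unit"), Workload("timing")]),
    Workload("simulation"),
])
```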
- each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
- the functions noted in the block may occur out of the order noted in the FIGs. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- the disclosure can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements.
- a preferred embodiment implements the disclosure in software, which includes but is not limited to firmware, resident software, microcode, etc.
- the disclosure can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
- a computer-usable or computer readable medium can be any tangible apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- the medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium.
- Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk.
- Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
- a data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
- the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
- I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can couple to the system either directly or through intervening I/O controllers.
- Network adapters may also couple to the system to enable the data processing system to couple to other data processing systems or remote printers or storage devices through intervening private or public networks.
- Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
- Hardware Redundancy (AREA)
Abstract
Description
Claims (19)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/965,550 US10733026B2 (en) | 2009-04-13 | 2018-04-27 | Automated workflow selection |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/422,701 US9965333B2 (en) | 2009-04-13 | 2009-04-13 | Automated workload selection |
US15/965,550 US10733026B2 (en) | 2009-04-13 | 2018-04-27 | Automated workflow selection |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/422,701 Continuation US9965333B2 (en) | 2009-04-13 | 2009-04-13 | Automated workload selection |
Publications (2)
Publication Number | Publication Date |
---|---|
US20180246771A1 US20180246771A1 (en) | 2018-08-30 |
US10733026B2 true US10733026B2 (en) | 2020-08-04 |
Family
ID=42935367
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/422,701 Expired - Fee Related US9965333B2 (en) | 2009-04-13 | 2009-04-13 | Automated workload selection |
US15/965,550 Expired - Fee Related US10733026B2 (en) | 2009-04-13 | 2018-04-27 | Automated workflow selection |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/422,701 Expired - Fee Related US9965333B2 (en) | 2009-04-13 | 2009-04-13 | Automated workload selection |
Country Status (1)
Country | Link |
---|---|
US (2) | US9965333B2 (en) |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100325024A1 (en) * | 2009-06-22 | 2010-12-23 | Alberth Jr William P | System and Method for Obligation Management in Wireless Communication Devices |
US9128771B1 (en) * | 2009-12-08 | 2015-09-08 | Broadcom Corporation | System, method, and computer program product to distribute workload |
US9268601B2 (en) * | 2010-04-05 | 2016-02-23 | Nvidia Corporation | API for launching work on a processor |
US8930896B1 (en) * | 2010-07-23 | 2015-01-06 | Amazon Technologies, Inc. | Data anonymity and separation for user computation |
US8397099B2 (en) | 2010-09-10 | 2013-03-12 | Microsoft Corporation | Using pulses to control work ingress |
US8572614B2 (en) * | 2011-06-30 | 2013-10-29 | International Business Machines Corporation | Processing workloads using a processor hierarchy system |
EP2787438A4 (en) * | 2011-08-26 | 2016-06-29 | Hitachi Ltd | Predictive sequential calculation device |
US9223630B2 (en) * | 2011-12-22 | 2015-12-29 | Alcatel Lucent | Method and apparatus for energy efficient distributed and elastic load balancing |
US8832176B1 (en) * | 2012-05-09 | 2014-09-09 | Google Inc. | Method and system for processing a large collection of documents |
US9524009B2 (en) * | 2012-05-14 | 2016-12-20 | Intel Corporation | Managing the operation of a computing device by determining performance-power states |
US8978034B1 (en) * | 2013-03-15 | 2015-03-10 | Natero, Inc. | System for dynamic batching at varying granularities using micro-batching to achieve both near real-time and batch processing characteristics |
DE102015207543A1 (en) * | 2015-04-24 | 2016-10-27 | Conti Temic Microelectronic Gmbh | Device and method for controlling a vehicle headlight of a motor vehicle |
US9678806B2 (en) * | 2015-06-26 | 2017-06-13 | Advanced Micro Devices, Inc. | Method and apparatus for distributing processing core workloads among processing cores |
US9894165B2 (en) | 2015-09-16 | 2018-02-13 | Telefonaktiebolaget Lm Ericsson (Publ) | Systems and methods for decentralized service placement in a resource pool |
US9794370B2 (en) * | 2015-11-09 | 2017-10-17 | Telefonaktiebolaget Lm Ericsson (Publ) | Systems and methods for distributed network-aware service placement |
US10649808B2 (en) * | 2016-09-16 | 2020-05-12 | Oracle International Corporation | Outcome-based job rescheduling in software configuration automation |
US10691501B1 (en) * | 2016-10-25 | 2020-06-23 | Amazon Technologies, Inc. | Command invocations for target computing resources |
CN111432741B (en) * | 2017-09-01 | 2023-05-30 | 斯皮诺洛吉克斯公司 | Spinal rod implant manufacturing process |
US11037080B2 (en) * | 2017-10-05 | 2021-06-15 | Aconex Limited | Operational process anomaly detection |
US20190042142A1 (en) * | 2017-12-29 | 2019-02-07 | Intel Corporation | Statistics and priority-based control for storage device |
US11360804B2 (en) | 2018-06-29 | 2022-06-14 | International Business Machines Corporation | Resource management for parent child workload |
US10761915B2 (en) | 2018-09-26 | 2020-09-01 | International Business Machines Corporation | Preemptive deep diagnostics and health checking of resources in disaggregated data centers |
US10831580B2 (en) | 2018-09-26 | 2020-11-10 | International Business Machines Corporation | Diagnostic health checking and replacement of resources in disaggregated data centers |
US11188408B2 (en) | 2018-09-26 | 2021-11-30 | International Business Machines Corporation | Preemptive resource replacement according to failure pattern analysis in disaggregated data centers |
US11050637B2 (en) | 2018-09-26 | 2021-06-29 | International Business Machines Corporation | Resource lifecycle optimization in disaggregated data centers |
US10838803B2 (en) | 2018-09-26 | 2020-11-17 | International Business Machines Corporation | Resource provisioning and replacement according to a resource failure analysis in disaggregated data centers |
US10754720B2 (en) * | 2018-09-26 | 2020-08-25 | International Business Machines Corporation | Health check diagnostics of resources by instantiating workloads in disaggregated data centers |
US12026381B2 (en) | 2018-10-26 | 2024-07-02 | Pure Storage, Inc. | Preserving identities and policies across replication |
US10671302B1 (en) | 2018-10-26 | 2020-06-02 | Pure Storage, Inc. | Applying a rate limit across a plurality of storage systems |
CN111125241B (en) * | 2018-10-31 | 2023-10-20 | 伊姆西Ip控股有限责任公司 | Method, apparatus and computer storage medium for data synchronization |
JP7159952B2 (en) * | 2019-04-04 | 2022-10-25 | 日本電信電話株式会社 | Resource allocation device, resource allocation method and resource allocation program |
US20210089534A1 (en) * | 2019-09-19 | 2021-03-25 | Teradata Us, Inc. | System and method for dynamically reallocating resources among multiple task groups in a database system |
CN112995579B (en) * | 2019-12-12 | 2023-03-07 | 杭州海康威视系统技术有限公司 | Video stream distribution method and device, management server and video monitoring system |
US20210406147A1 (en) * | 2020-06-27 | 2021-12-30 | Intel Corporation | Apparatus and method for a closed-loop dynamic resource allocation control framework |
US11989587B2 (en) | 2020-06-27 | 2024-05-21 | Intel Corporation | Apparatus and method for a resource allocation control framework using performance markers |
CN112486687B (en) * | 2020-12-03 | 2022-09-27 | 重庆邮电大学 | Cloud platform workload prediction method based on multitask learning time sequence |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6230183B1 (en) | 1998-03-11 | 2001-05-08 | International Business Machines Corporation | Method and apparatus for controlling the number of servers in a multisystem cluster |
US6609227B2 (en) | 2000-12-15 | 2003-08-19 | International Business Machines Corporation | Scheduler for schematic related jobs |
US20050050480A1 (en) | 2003-09-02 | 2005-03-03 | Texas Instruments Incorporated | Notifying status of execution of jobs used to characterize cells in an integrated circuit |
US20070021998A1 (en) | 2005-06-27 | 2007-01-25 | Road Ltd. | Resource scheduling method and system |
US20080062886A1 (en) | 2006-09-12 | 2008-03-13 | Tang Ao Kevin | Method and apparatus for resource allocation for stream data processing |
US7434185B2 (en) | 2006-09-27 | 2008-10-07 | International Business Machines Corporation | Method and apparatus for parallel data preparation and processing of integrated circuit graphical design data |
-
2009
- 2009-04-13 US US12/422,701 patent/US9965333B2/en not_active Expired - Fee Related
-
2018
- 2018-04-27 US US15/965,550 patent/US10733026B2/en not_active Expired - Fee Related
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6230183B1 (en) | 1998-03-11 | 2001-05-08 | International Business Machines Corporation | Method and apparatus for controlling the number of servers in a multisystem cluster |
US6609227B2 (en) | 2000-12-15 | 2003-08-19 | International Business Machines Corporation | Scheduler for schematic related jobs |
US20050050480A1 (en) | 2003-09-02 | 2005-03-03 | Texas Instruments Incorporated | Notifying status of execution of jobs used to characterize cells in an integrated circuit |
US20070021998A1 (en) | 2005-06-27 | 2007-01-25 | Road Ltd. | Resource scheduling method and system |
US20080062886A1 (en) | 2006-09-12 | 2008-03-13 | Tang Ao Kevin | Method and apparatus for resource allocation for stream data processing |
US7434185B2 (en) | 2006-09-27 | 2008-10-07 | International Business Machines Corporation | Method and apparatus for parallel data preparation and processing of integrated circuit graphical design data |
Non-Patent Citations (1)
Title |
---|
Byun et al..; Scheduling Scheme based on Dedication Rate in Volunteer Computing Environment; Proceedings of the 4th International Symposium on Parallel and Distributed Computing (ISPDC '05); 2005; pp. 1-8. |
Also Published As
Publication number | Publication date |
---|---|
US20180246771A1 (en) | 2018-08-30 |
US9965333B2 (en) | 2018-05-08 |
US20100262975A1 (en) | 2010-10-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10733026B2 (en) | Automated workflow selection | |
US11275609B2 (en) | Job distribution within a grid environment | |
US10623481B2 (en) | Balancing resources in distributed computing environments | |
US20190324819A1 (en) | Distributed-system task assignment method and apparatus | |
US7257635B2 (en) | System and method for describing and automatically managing resources | |
US9141432B2 (en) | Dynamic pending job queue length for job distribution within a grid environment | |
US6587938B1 (en) | Method, system and program products for managing central processing unit resources of a computing environment | |
US8458714B2 (en) | Method, system and program products for managing logical processors of a computing environment | |
US6651125B2 (en) | Processing channel subsystem pending I/O work queues based on priorities | |
US6519660B1 (en) | Method, system and program products for determining I/O configuration entropy | |
Chatzistergiou et al. | Fast heuristics for near-optimal task allocation in data stream processing over clusters | |
US7467291B1 (en) | System and method for calibrating headroom margin | |
US20070016907A1 (en) | Method, system and computer program for automatic provisioning of resources to scheduled jobs | |
US20070180453A1 (en) | On demand application scheduling in a heterogeneous workload environment | |
CN111344688A (en) | Method and system for providing resources in cloud computing | |
KR20170029263A (en) | Apparatus and method for load balancing | |
WO2015101091A1 (en) | Distributed resource scheduling method and device | |
BRPI0014350B1 (en) | workload management in a computing environment | |
Weng et al. | Beware of Fragmentation: Scheduling {GPU-Sharing} Workloads with Fragmentation Gradient Descent | |
US10771982B2 (en) | Resource utilization of heterogeneous compute units in electronic design automation | |
CN106528288A (en) | Resource management method, device and system | |
CN113032102B (en) | Resource rescheduling method, device, equipment and medium | |
US7568052B1 (en) | Method, system and program products for managing I/O configurations of a computing environment | |
US8819239B2 (en) | Distributed resource management systems and methods for resource management thereof | |
US20080235705A1 (en) | Methods and Apparatus for Global Systems Management |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:REYSA, JOHN RICHARD;REEL/FRAME:045661/0413 Effective date: 20180427 Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MCCAUGHRIN, TIERNEY BRUCE;REEL/FRAME:045661/0658 Effective date: 20180427 Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MCCAUGHRIN, TIERNEY BRUCE;REEL/FRAME:045661/0658 Effective date: 20180427 Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:REYSA, JOHN RICHARD;REEL/FRAME:045661/0413 Effective date: 20180427 |
|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
AS | Assignment |
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HUNT, BRYAN RONALD;MCCANTS, STEPHEN;SIGNING DATES FROM 20180502 TO 20180505;REEL/FRAME:045935/0356 Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HUNT, BRYAN RONALD;MCCANTS, STEPHEN;SIGNING DATES FROM 20180502 TO 20180505;REEL/FRAME:045935/0356 |
|
AS | Assignment |
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KOZITZA, BRIAN LEE;REEL/FRAME:045990/0533 Effective date: 20180530 Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KOZITZA, BRIAN LEE;REEL/FRAME:045990/0533 Effective date: 20180530 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20240804 |