Research Article

Online Task Scheduling and Termination With Throughput Constraint

Published: 19 July 2024

Abstract

We consider a task scheduling scenario in which a controller activates one of K task types at each time step. Each task induces a random completion time, and a reward is obtained only after the task is completed. The completion-time and reward distributions of all task types are unknown to the controller, which must learn to schedule tasks so as to maximize the accumulated reward within a given time horizon T. Motivated by practical scenarios, we require the designed policy to satisfy a system throughput constraint. In addition, we introduce an interruption mechanism that terminates ongoing tasks lasting longer than certain deadlines. To address this scheduling problem, we model it as an online learning problem with deadline and throughput constraints. We then characterize the optimal offline policy and develop efficient online learning algorithms based on the Lyapunov method. We prove that our online learning algorithm achieves $O(\sqrt{T})$ regret and zero constraint violation. We also conduct simulations to evaluate the performance of the developed learning algorithms.
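The drift-plus-penalty idea sketched in the abstract can be made concrete with a short simulation. The following is a minimal Python sketch, not the paper's exact algorithm: it pairs optimistic (UCB-style) estimates of each type's per-task reward, completion probability, and duration with a virtual queue that tracks the throughput constraint, and it interrupts any task that exceeds a fixed deadline. All parameter values, the exponential distributions, and the exact form of the scheduling index are illustrative assumptions made for this demo.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 3                 # number of task types (assumed for the demo)
T = 50_000.0          # time horizon
DEADLINE = 8.0        # interrupt any task running longer than this
TARGET_RATE = 0.10    # required completions per unit time (throughput constraint)
V = np.sqrt(T)        # Lyapunov trade-off weight; V ~ sqrt(T) matches O(sqrt(T)) regret

# Hidden ground truth, unknown to the controller (illustrative values).
MEAN_DURATION = np.array([3.0, 5.0, 9.0])
MEAN_REWARD = np.array([1.0, 2.0, 2.5])

pulls = np.zeros(K)        # times each type was scheduled
completions = np.zeros(K)  # tasks finished before the deadline
sum_reward = np.zeros(K)   # total reward collected per type
sum_time = np.zeros(K)     # total time spent per type (capped at the deadline)

Q = 0.0  # virtual queue measuring throughput-constraint debt
t = 0.0
while t < T:
    if pulls.min() == 0:   # initialization: schedule every type once
        k = int(np.argmin(pulls))
    else:
        conf = np.sqrt(2.0 * np.log(max(t, 2.0)) / pulls)
        r_ucb = sum_reward / pulls + conf                    # optimistic reward
        c_ucb = np.minimum(completions / pulls + conf, 1.0)  # optimistic completion prob.
        d_lcb = np.maximum(sum_time / pulls - conf, 1e-3)    # optimistic (short) duration
        # Drift-plus-penalty index: reward plus Q-weighted completions, per unit time.
        k = int(np.argmax((V * r_ucb + Q * c_ucb) / d_lcb))

    duration = rng.exponential(MEAN_DURATION[k])
    done = duration <= DEADLINE
    elapsed = duration if done else DEADLINE      # interrupted tasks yield no reward
    reward = rng.exponential(MEAN_REWARD[k]) if done else 0.0

    pulls[k] += 1
    completions[k] += done
    sum_time[k] += elapsed
    sum_reward[k] += reward

    # Virtual queue update: grows whenever completions lag the target throughput.
    Q = max(Q + TARGET_RATE * elapsed - float(done), 0.0)
    t += elapsed

print(f"reward rate {sum_reward.sum() / t:.3f}, "
      f"completion rate {completions.sum() / t:.3f} (target {TARGET_RATE}), "
      f"final Q {Q:.1f}")
```

A bounded final Q indicates the throughput constraint is (approximately) met; scaling V with sqrt(T) is the standard Lyapunov trade-off between reward optimality and constraint slack.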

Published In

IEEE/ACM Transactions on Networking, Volume 32, Issue 6 (Dec. 2024), 985 pages

Publisher

IEEE Press
