Markov Decision Processes: Discrete Stochastic Dynamic Programming
January 1994
Publisher:
  • John Wiley & Sons, Inc.
  • 605 Third Ave. New York, NY
  • United States
ISBN: 978-0-471-61977-2
Published: 01 January 1994
Pages: 672
Abstract

From the Publisher:

The past decade has seen considerable theoretical and applied research on Markov decision processes, as well as the growing use of these models in ecology, economics, communications engineering, and other fields where outcomes are uncertain and sequential decision-making processes are needed. A timely response to this increased activity, Martin L. Puterman's new work provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models. It discusses all major research directions in the field, highlights many significant applications of Markov decision process models, and explores numerous important topics that have previously been neglected or given cursory coverage in the literature. Markov Decision Processes focuses primarily on infinite horizon discrete time models and models with discrete state spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous-time discrete state models. The book is organized around optimality criteria, using a common framework centered on the optimality (Bellman) equation for presenting results. The results are presented in a "theorem-proof" format and elaborated on through both discussion and examples, including results that are not available in any other book. A two-state Markov decision process model, presented in Chapter 3, is analyzed repeatedly throughout the book and demonstrates many results and algorithms. Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria. It also explores several topics that have received little or no attention in other books, including modified policy iteration, multichain models with average reward criterion, and sensitive optimality. In addition, a Bibliographic Remarks section in each chapter comments on relevant historic

Cited By

  1. Li Y, Lan G and Zhao T (2024). Homotopic policy mirror descent: policy convergence, algorithmic regularization, and improved sample complexity, Mathematical Programming: Series A and B, 207:1-2, (457-513), Online publication date: 1-Sep-2024.
  2. Winterer L, Wimmer R, Becker B and Jansen N (2024). Strong Simple Policies for POMDPs, International Journal on Software Tools for Technology Transfer (STTT), 26:3, (269-299), Online publication date: 1-Jun-2024.
  3. Oesterle M, Grams T, Bartelt C and Stuckenschmidt H RAISE the Bar: Restriction of Action Spaces for Improved Social Welfare and Equity in Traffic Management Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, (1492-1500)
  4. Mendoza D, Romero F and Trippel C Model Selection for Latency-Critical Inference Serving Proceedings of the Nineteenth European Conference on Computer Systems, (1016-1038)
  5. Wu Q, Wang S, Ge H, Fan P, Fan Q and Letaief K (2023). Delay-Sensitive Task Offloading in Vehicular Fog Computing-Assisted Platoons, IEEE Transactions on Network and Service Management, 21:2, (2012-2026), Online publication date: 1-Apr-2024.
  6. Tournaire T, Castel-Taleb H and Hyon E (2023). Efficient Computation of Optimal Thresholds in Cloud Auto-scaling Systems, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 8:4, (1-31), Online publication date: 31-Dec-2024.
  7. Hau J, Delage E, Ghavamzadeh M and Petrik M On dynamic programming decompositions of static risk measures in Markov decision processes Proceedings of the 37th International Conference on Neural Information Processing Systems, (51734-51757)
  8. Hong Y, Xie Q, Chen Y and Wang W Restless bandits with average reward Proceedings of the 37th International Conference on Neural Information Processing Systems, (12810-12844)
  9. Lobo E, Cousins C, Zick Y and Petrik M Percentile criterion optimization in offline reinforcement learning Proceedings of the 37th International Conference on Neural Information Processing Systems, (9322-9352)
  10. Vallat G, Wang J, Maddux A, Kamgarpour M and Parascho S Reinforcement learning for scaffold-free construction of spanning structures Proceedings of the 8th ACM Symposium on Computational Fabrication, (1-12)
  11. Shinde S and Tarchi D (2023). Joint Air-Ground Distributed Federated Learning for Intelligent Transportation Systems, IEEE Transactions on Intelligent Transportation Systems, 24:9, (9996-10011), Online publication date: 1-Sep-2023.
  12. Hibbard M, Tanaka T and Topcu U (2023). Simultaneous perception–action design via invariant finite belief sets, Automatica (Journal of IFAC), 155:C, Online publication date: 1-Sep-2023.
  13. Hasanbeig H, Kroening D and Abate A (2023). Certified reinforcement learning with logic guidance, Artificial Intelligence, 322:C, Online publication date: 1-Sep-2023.
  14. Wan R, Liu Y, McQueen J, Hains D and Song R Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, (5016-5027)
  15. Zychlinski N, Chan C and Dong J (2023). Managing Queues with Different Resource Requirements, Operations Research, 71:4, (1387-1413), Online publication date: 1-Jul-2023.
  16. Rodriguez C, Jenkins P and Robbins M (2023). Solving the joint military medical evacuation problem via a random forest approximate dynamic programming approach, Expert Systems with Applications: An International Journal, 221:C, Online publication date: 1-Jul-2023.
  17. Augello A, Gaglio S, Infantino I, Maniscalco U, Pilato G and Vella F (2023). Roboception and adaptation in a cognitive robot, Robotics and Autonomous Systems, 164:C, Online publication date: 1-Jun-2023.
  18. Zhang H, Sun J, Xu Z and Shi J (2023). Learning unified mutation operator for differential evolution by natural evolution strategies, Information Sciences: an International Journal, 632:C, (594-616), Online publication date: 1-Jun-2023.
  19. Sengadu Suresh P, Gui Y and Doshi P Dec-AIRL: Decentralized Adversarial IRL for Human-Robot Teaming Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1116-1124)
  20. Gu T, Feng K, Cong G, Long C, Wang Z and Wang S (2023). The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data, Proceedings of the ACM on Management of Data, 1:1, (1-26), Online publication date: 26-May-2023.
  21. Li S, Yu Y, Miguel N, Calderone D, Ratliff L and Açıkmeşe B (2023). Adaptive constraint satisfaction for Markov decision process congestion games, Automatica (Journal of IFAC), 151:C, Online publication date: 1-May-2023.
  22. Xia L, Guo X and Cao X (2023). A note on the existence of optimal stationary policies for average Markov decision processes with countable states, Automatica (Journal of IFAC), 151:C, Online publication date: 1-May-2023.
  23. Disser Y, Friedmann O and Hopp A (2023). An exponential lower bound for Zadeh’s pivot rule, Mathematical Programming: Series A and B, 199:1-2, (865-936), Online publication date: 1-May-2023.
  24. Luo H, Bao Z, Culpepper J, Li M and Zhao Y Facility Relocation Search For Good: When Facility Exposure Meets User Convenience Proceedings of the ACM Web Conference 2023, (3937-3947)
  25. Malekzadeh P, Hou M and Plataniotis K (2023). Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning, Neurocomputing, 530:C, (165-187), Online publication date: 14-Apr-2023.
  26. Yang L, Tao J, Liu Y, Xu Y and Su C (2023). Energy scheduling for DoS attack over multi-hop networks, Neural Networks, 161:C, (735-745), Online publication date: 1-Apr-2023.
  27. Zhao B, Dong H, Wang Y and Pan T (2023). PPO-TA, Knowledge-Based Systems, 264:C, Online publication date: 15-Mar-2023.
  28. Crispino G, Freire V and Delgado K (2023). GUBS criterion, Artificial Intelligence, 316:C, Online publication date: 1-Mar-2023.
  29. Bonetti M, Bisi L and Restelli M (2023). Risk-averse optimization of reward-based coherent risk measures, Artificial Intelligence, 316:C, Online publication date: 1-Mar-2023.
  30. Simão T, Suilen M and Jansen N Safe policy improvement for POMDPs via finite-state controllers Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (15109-15117)
  31. Fu J, Huang C, Li Y, Mei J, Xu M and Zhang L (2023). Quantitative controller synthesis for consumption Markov decision processes, Information Processing Letters, 180:C, Online publication date: 1-Feb-2023.
  32. Quan J and Wang N (2023). An optimized task assignment framework based on crowdsourcing knowledge graph and prediction, Knowledge-Based Systems, 260:C, Online publication date: 25-Jan-2023.
  33. Fuchs A, Passarella A and Conti M (2023). Modeling, Replicating, and Predicting Human Behavior: A Survey, ACM Transactions on Autonomous and Adaptive Systems, 0:0
  34. Batz K, Kaminski B, Katoen J, Matheja C and Verscht L (2023). A Calculus for Amortized Expected Runtimes, Proceedings of the ACM on Programming Languages, 7:POPL, (1957-1986), Online publication date: 9-Jan-2023.
  35. Drent C, Drent M, Arts J and Kapodistria S (2023). Real-Time Integrated Learning and Decision Making for Cumulative Shock Degradation, Manufacturing & Service Operations Management, 25:1, (235-253), Online publication date: 1-Jan-2023.
  36. Yerudkar A, Chatzaroulas E, Del Vecchio C and Moschoyiannis S (2023). Sampled-data Control of Probabilistic Boolean Control Networks, Information Sciences: an International Journal, 619:C, (374-389), Online publication date: 1-Jan-2023.
  37. Wu T, Wen P and Tang S (2023). Optimal scheduling strategy of AUV based on importance and age of information, Wireless Networks, 29:1, (87-95), Online publication date: 1-Jan-2023.
  38. Marrero W Simulation-Based Sets of Similar-Performing Actions in Finite Markov Decision Process Models Proceedings of the Winter Simulation Conference, (3217-3228)
  39. Kong N, Paz J and Gao X EMS Operations Management Proceedings of the Winter Simulation Conference, (222-237)
  40. Wei J and Ye D (2022). Transmission schedule for jointly optimizing remote state estimation and wireless sensor network lifetime, Neurocomputing, 514:C, (374-384), Online publication date: 1-Dec-2022.
  41. Grimm C, Barreto A and Singh S Approximate value equivalence Proceedings of the 36th International Conference on Neural Information Processing Systems, (33029-33040)
  42. Ho C, Petrik M and Wiesemann W Robust ϕ-divergence MDPs Proceedings of the 36th International Conference on Neural Information Processing Systems, (32680-32693)
  43. Fu H, Yu S, Littman M and Konidaris G Model-based lifelong reinforcement learning with Bayesian exploration Proceedings of the 36th International Conference on Neural Information Processing Systems, (32369-32382)
  44. Pesquerel F and Maillard O IMED-RL Proceedings of the 36th International Conference on Neural Information Processing Systems, (26363-26374)
  45. Arumugam D and Singh S Planning to the information horizon of BAMDPs via epistemic state abstraction Proceedings of the 36th International Conference on Neural Information Processing Systems, (20482-20497)
  46. Terpin A, Lanzetti N, Yardim B, Dörfler F and Ramponi G Trust region policy optimization with optimal transport discrepancies Proceedings of the 36th International Conference on Neural Information Processing Systems, (19786-19797)
  47. Arumugam D and Van Roy B Deciding what to model Proceedings of the 36th International Conference on Neural Information Processing Systems, (9024-9044)
  48. Dwarakanath K, Dervovic D, Tavallali P, Vyetrenko S and Balch T Optimal Stopping with Gaussian Processes Proceedings of the Third ACM International Conference on AI in Finance, (497-505)
  49. Xia Y, Liu S, Chen X, Xu Z, Zheng K and Su H RISE: A Velocity Control Framework with Minimal Impacts based on Reinforcement Learning Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (2210-2219)
  50. Xiong G, Qin X, Li B, Singh R and Li J Index-aware reinforcement learning for adaptive video streaming at the wireless edge Proceedings of the Twenty-Third International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing, (81-90)
  51. Shirmohammadi M (2022). A Beginner's Tutorial on Strategy Complexity in Stochastic Games, ACM SIGLOG News, 9:4, (27-43), Online publication date: 1-Oct-2022.
  52. Epperlein J, Overko R, Zhuk S, King C, Bouneffouf D, Cullen A and Shorten R (2022). Reinforcement learning with algorithms from probabilistic structure estimation, Automatica (Journal of IFAC), 144:C, Online publication date: 1-Oct-2022.
  53. Bisi L, Santambrogio D, Sandrelli F, Tirinzoni A, Ziebart B and Restelli M (2022). Risk-averse policy optimization via risk-neutral policy optimization, Artificial Intelligence, 311:C, Online publication date: 1-Oct-2022.
  54. Efrosinin D and Stepanova N Average Cost Minimization in a Multi-server Retrial Queueing System with a Controllable Reserve Group of Servers Distributed Computer and Communication Networks: Control, Computation, Communications, (284-296)
  55. Skitsas K, Papageorgiou I, Talebi M, Kantere V, Katehakis M and Karras P (2022). SIFTER, Proceedings of the VLDB Endowment, 16:1, (90-98), Online publication date: 1-Sep-2022.
  56. Wu Q (2022). Dynamic matching with teams, Operations Research Letters, 50:5, (618-622), Online publication date: 1-Sep-2022.
  57. Gautron R, Maillard O, Preux P, Corbeels M and Sabbadin R (2022). Reinforcement learning for crop management support, Computers and Electronics in Agriculture, 200:C, Online publication date: 1-Sep-2022.
  58. Pan W, Liu Y and Yang C (2022). Adaptive task offloading of rechargeable UAV edge computing network based on double decision value iteration, Computer Communications, 193:C, (136-145), Online publication date: 1-Sep-2022.
  59. Fu C, Ding X and Chang W (2022). A stable multi-criteria decision model based on Markov chain, Computers and Industrial Engineering, 171:C, Online publication date: 1-Sep-2022.
  60. Tahir A, Cui K and Koeppl H Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems Proceedings of the 51st International Conference on Parallel Processing, (1-11)
  61. Wu Y and De Loera J Geometric Policy Iteration for Markov Decision Processes Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, (2070-2078)
  62. Antonelli M, Dal Lago U and Pistone P Curry and Howard Meet Borel Proceedings of the 37th Annual ACM/IEEE Symposium on Logic in Computer Science, (1-13)
  63. Sami H, Bentahar J, Mourad A, Otrok H and Damiani E (2022). Graph convolutional recurrent networks for reward shaping in reinforcement learning, Information Sciences: an International Journal, 608:C, (63-80), Online publication date: 1-Aug-2022.
  64. Torrico A and Toriello A (2022). Dynamic Relaxations for Online Bipartite Matching, INFORMS Journal on Computing, 34:4, (1871-1884), Online publication date: 1-Jul-2022.
  65. Jeon S, Kwon S, Hwang J, Cho Y, Kim H, Park J and Lee I (2022). Dynamic optimal space partitioning for redirected walking in multi-user environment, ACM Transactions on Graphics, 41:4, (1-14), Online publication date: 1-Jul-2022.
  66. Nan Z, Jia Y, Ren Z, Chen Z and Liang L (2022). Delay-Aware Content Delivery With Deep Reinforcement Learning in Internet of Vehicles, IEEE Transactions on Intelligent Transportation Systems, 23:7, (8918-8929), Online publication date: 1-Jul-2022.
  67. Roy M, Biswas D, Aslam N and Chowdhury C (2022). Reinforcement learning based effective communication strategies for energy harvested WBAN, Ad Hoc Networks, 132:C, Online publication date: 1-Jul-2022.
  68. Nikookar S, Sakharkar P, Somasunder S, Basu Roy S, Bienkowski A, Macesker M, Pattipati K and Sidoti D Cooperative Route Planning Framework for Multiple Distributed Assets in Maritime Applications Proceedings of the 2022 International Conference on Management of Data, (1518-1527)
  69. Yu Y, Calderone D, Li S, Ratliff L and Açıkmeşe B (2022). Variable demand and multi-commodity flow in Markovian network equilibrium, Automatica (Journal of IFAC), 140:C, Online publication date: 1-Jun-2022.
  70. Street C, Lacerda B, Staniaszek M, Mühlig M and Hawes N Context-Aware Modelling for Multi-Robot Systems Under Uncertainty Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1228-1236)
  71. Senadeera M, Karimpanal T, Gupta S and Rana S Sympathy-based Reinforcement Learning Agents Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1164-1172)
  72. Qian P and Unhelkar V Evaluating the Role of Interactivity on Improving Transparency in Autonomous Agents Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1083-1091)
  73. Karabag M, Neary C and Topcu U Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (705-713)
  74. Ghalme G, Nair V, Patil V and Zhou Y Long-Term Resource Allocation Fairness in Average Markov Decision Process (AMDP) Environment Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (525-533)
  75. Geist M, Pérolat J, Laurière M, Elie R, Perrin S, Bachem O, Munos R and Pietquin O Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (489-497)
  76. Agarwal M, Aggarwal V and Lan T Multi-Objective Reinforcement Learning with Non-Linear Scalarization Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (9-17)
  77. Avrachenkov K and Borkar V (2022). Whittle index based Q-learning for restless bandits with average reward, Automatica (Journal of IFAC), 139:C, Online publication date: 1-May-2022.
  78. Gao S, Shi H, Wang F, Wang Z, Zhang S, Li Y and Sun Y (2022). Deterministic policy optimization with clipped value expansion and long-horizon planning, Neurocomputing, 483:C, (299-310), Online publication date: 28-Apr-2022.
  79. Zilic J, De Maio V, Aral A and Brandic I Edge offloading for microservice architectures Proceedings of the 5th International Workshop on Edge Systems, Analytics and Networking, (1-6)
  80. Haseeb J, Malik S, Mansoori M and Welch I (2022). Probabilistic modelling of deception-based security framework using markov decision process, Computers and Security, 115:C, Online publication date: 1-Apr-2022.
  81. Lomuscio A and Pirovano E (2022). A counter abstraction technique for verifying properties of probabilistic swarm systems, Artificial Intelligence, 305:C, Online publication date: 1-Apr-2022.
  82. Antar A, Kratz A and Banovic N (2022). Behavior Modeling Approach for Forecasting Physical Functioning of People with Multiple Sclerosis, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 7:1, (1-29), Online publication date: 27-Mar-2022.
  83. Humayoun S, Abbas G and Al-Tarawneh R Touch-behavioral Authentication on Smartphones using Machine Learning 27th International Conference on Intelligent User Interfaces, (105-108)
  84. Matez-Bandera J, Monroy J and Gonzalez-Jimenez J (2022). Efficient semantic place categorization by a robot through active line-of-sight selection, Knowledge-Based Systems, 240:C, Online publication date: 15-Mar-2022.
  85. V Varagapriya, Singh V and Lisser A (2022). Constrained Markov decision processes with uncertain costs, Operations Research Letters, 50:2, (218-223), Online publication date: 1-Mar-2022.
  86. Archer C, Banerjee S, Cortez M, Rucker C, Sinclair S, Solberg M, Xie Q and Lee Yu C (2022). ORSuite, ACM SIGMETRICS Performance Evaluation Review, 49:2, (57-61), Online publication date: 17-Jan-2022.
  87. Atia G, Beckus A, Alkhouri I and Velasquez A (2021). Steady-State Planning in Expected Reward Multichain MDPs, Journal of Artificial Intelligence Research, 72, (1029-1082), Online publication date: 4-Jan-2022.
  88. Su Y and Li J (2022). On the optimality of a maintenance queueing system, Operations Research Letters, 50:1, (32-39), Online publication date: 1-Jan-2022.
  89. Graves E, Jenkins P and Robbins M Analyzing the impact of triage classification errors on military medical evacuation dispatching policies Proceedings of the Winter Simulation Conference, (1-12)
  90. Arumugam D and Van Roy B The value of information when deciding what to learn Proceedings of the 35th International Conference on Neural Information Processing Systems, (9816-9827)
  91. Grimm C, Barreto A, Farquhar G, Silver D and Singh S Proper value equivalence Proceedings of the 35th International Conference on Neural Information Processing Systems, (7773-7786)
  92. Anselmi J, Gaujal B and Rebuffi L (2021). Optimal speed profile of a DVFS processor under soft deadlines, Performance Evaluation, 152:C, Online publication date: 1-Dec-2021.
  93. Liu T, Wu B, Xu W, Cao X, Peng J and Wu H (2021). RLC: A Reinforcement Learning-Based Charging Algorithm for Mobile Devices, ACM Transactions on Sensor Networks, 17:4, (1-23), Online publication date: 30-Nov-2021.
  94. Dai Y, Yoshikawa M and Sugiyama K (2022). Prerequisite-aware course ordering towards getting relevant job opportunities, Expert Systems with Applications: An International Journal, 183:C, Online publication date: 30-Nov-2021.
  95. Glatt R, Silva F, Soper B, Dawson W, Rusu E and Goldhahn R Collaborative energy demand response with decentralized actor and centralized critic Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, (333-337)
  96. Schillinger P, García S, Makris A, Roditakis K, Logothetis M, Alevizos K, Ren W, Tajvar P, Pelliccione P, Argyros A, Kyriakopoulos K and Dimarogonas D (2021). Adaptive heterogeneous multi-robot collaboration from formal task specifications, Robotics and Autonomous Systems, 145:C, Online publication date: 1-Nov-2021.
  97. Tang Y, Jiang H, Xie J and Zheng Z (2021). A queueing model for customer rescheduling and no-shows in service systems, Operations Research Letters, 49:6, (821-828), Online publication date: 1-Nov-2021.
  98. Wu C, Chen W and Chiu H (2021). Dynamic expansion of flexible capacity with and without pricing coordination, Computers and Industrial Engineering, 161:C, Online publication date: 1-Nov-2021.
  99. Shea-Blymyer C and Abbas H (2021). Algorithmic Ethics: Formalization and Verification of Autonomous Vehicle Obligations, ACM Transactions on Cyber-Physical Systems, 5:4, (1-25), Online publication date: 31-Oct-2021.
  100. Ganzfried S Computing Nash Equilibria in Multiplayer DAG-Structured Stochastic Games with Persistent Imperfect Information Decision and Game Theory for Security, (3-16)
  101. Zhu H, Ouahada K and Abu-Mahfouz A A Lyapunov-based Real Time Energy Management System for Smart IoT Homes IECON 2021 – 47th Annual Conference of the IEEE Industrial Electronics Society, (1-7)
  102. Roohi S, Guckelsberger C, Relas A, Heiskanen H, Takatalo J and Hämäläinen P (2021). Predicting Game Difficulty and Engagement Using AI Players, Proceedings of the ACM on Human-Computer Interaction, 5:CHI PLAY, (1-17), Online publication date: 5-Oct-2021.
  103. Simard F, Desharnais J and Laviolette F (2021). General Cops and Robbers games with randomness, Theoretical Computer Science, 887:C, (30-50), Online publication date: 2-Oct-2021.
  104. Bedewy A, Sun Y, Singh R and Shroff N (2021). Low-Power Status Updates via Sleep-Wake Scheduling, IEEE/ACM Transactions on Networking, 29:5, (2129-2141), Online publication date: 1-Oct-2021.
  105. Agarwal M and Aggarwal V (2021). Blind decision making, Pattern Recognition Letters, 150:C, (176-182), Online publication date: 1-Oct-2021.
  106. Kicki P, Gawron T, Ćwian K, Ozay M and Skrzypczyński P (2021). Learning from experience for rapid generation of local car maneuvers, Engineering Applications of Artificial Intelligence, 105:C, Online publication date: 1-Oct-2021.
  107. Huang L, Xu T, Chen X, Xu Y, Zhang X and Fang G (2021). Joint relay and channel selection in relay‐aided anti‐jamming system, Transactions on Emerging Telecommunications Technologies, 32:9, Online publication date: 8-Sep-2021.
  108. Ma A, Ouimet M and Cortés J (2021). Temporal sampling annealing schemes for receding horizon multi-agent planning, Robotics and Autonomous Systems, 143:C, Online publication date: 1-Sep-2021.
  109. Borkar V, Choudhary S, Gupta V and Kasbekar G (2021). Scheduling in wireless networks with spatial reuse of spectrum as restless bandits, Performance Evaluation, 149:C, Online publication date: 1-Sep-2021.
  110. Killian J, Biswas A, Shah S and Tambe M Q-Learning Lagrange Policies for Multi-Action Restless Bandits Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (871-881)
  111. Wan R, Zhang X and Song R Multi-Objective Model-based Reinforcement Learning for Infectious Disease Control Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (1634-1644)
  112. Butkova Y, Hartmanns A and Hermanns H (2021). A Modest Approach to Markov Automata, ACM Transactions on Modeling and Computer Simulation, 31:3, (1-34), Online publication date: 31-Jul-2021.
  113. Regnier-Coudert O and Povéda G An empirical evaluation of permutation-based policies for stochastic RCPSP Proceedings of the Genetic and Evolutionary Computation Conference Companion, (1451-1458)
  114. Wu B, Zhang X and Lin H (2021). Supervisor synthesis of POMDP via automata learning, Automatica (Journal of IFAC), 129:C, Online publication date: 1-Jul-2021.
  115. Mardare R, Panangaden P and Plotkin G Fixed-points for quantitative equational logics Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science, (1-13)
  116. Chatterjee K and Doyen L Stochastic processes with expected stopping time Proceedings of the 36th Annual ACM/IEEE Symposium on Logic in Computer Science, (1-13)
  117. Templier P, Rachelson E and Wilson D A geometric encoding for neural network evolution Proceedings of the Genetic and Evolutionary Computation Conference, (919-927)
  118. Wang D, Hoffmann J and Reps T Central moment analysis for cost accumulators in probabilistic programs Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, (559-573)
  119. Castiglioni V, Loreti M and Tini S How Adaptive and Reliable is Your Program? Formal Techniques for Distributed Objects, Components, and Systems, (60-79)
  120. Ying M, Feng Y and Ying S (2021). Optimal Policies for Quantum Markov Decision Processes, International Journal of Automation and Computing, 18:3, (410-421), Online publication date: 1-Jun-2021.
  121. Salaht F, Desprez F and Lebre A (2020). An Overview of Service Placement Problem in Fog and Edge Computing, ACM Computing Surveys, 53:3, (1-35), Online publication date: 31-May-2021.
  122. Robinson T, Su G and Zhang M Multiagent Task Allocation and Planning with Multi-Objective Requirements Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (1628-1630)
  123. Simão T, Jansen N and Spaan M AlwaysSafe: Reinforcement Learning without Safety Constraint Violations during Training Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (1226-1235)
  124. Mate A, Perrault A and Tambe M Risk-Aware Interventions in Public Health: Planning with Restless Multi-Armed Bandits Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (880-888)
  125. Liu Z, Yang Y, Miller T and Masters P Deceptive Reinforcement Learning for Privacy-Preserving Planning Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (818-826)
  126. Killian J, Perrault A and Tambe M Beyond "To Act or Not to Act": Fast Lagrangian Approaches to General Multi-Action Restless Bandits Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (710-718)
  127. Hussenot L, Dadashi R, Geist M and Pietquin O Show Me the Way: Intrinsic Motivation from Demonstrations Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (620-628)
  128. Dong Z, Das S, Fowler P and Ho C Efficient Nonmyopic Online Allocation of Scarce Reusable Resources Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (447-455)
  129. Cui J, Macke W, Yedidsion H, Goyal A, Urieli D and Stone P Scalable Multiagent Driving Policies for Reducing Traffic Congestion Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, (386-394)
  130. de Nijs F, Walraven E, De Weerdt M and Spaan M (2021). Constrained Multiagent Markov Decision Processes, Journal of Artificial Intelligence Research, 70, (955-1001), Online publication date: 1-May-2021.
  131. Oliehoek F, Witwicki S and Kaelbling L (2021). A Sufficient Statistic for Influence in Structured Multiagent Environments, Journal of Artificial Intelligence Research, 70, (789-870), Online publication date: 1-May-2021.
  132. Iwaki T, Wu J, Wu Y, Sandberg H and Johansson K (2022). Multi-hop sensor network scheduling for optimal remote estimation, Automatica (Journal of IFAC), 127:C, Online publication date: 1-May-2021.
  133. Skrynnik A, Staroverov A, Aitygulov E, Aksenov K, Davydov V and Panov A (2021). Forgetful experience replay in hierarchical reinforcement learning from expert demonstrations, Knowledge-Based Systems, 218:C, Online publication date: 22-Apr-2021.
  134. Hahn E and Hartmanns A Symblicit exploration and elimination for probabilistic model checking Proceedings of the 36th Annual ACM Symposium on Applied Computing, (1798-1806)
  135. Pan Y, Li S, Chen Q, Zhang N, Cheng T, Li Z, Guo B, Han Q and Zhu T (2021). Efficient Schedule of Energy-Constrained UAV Using Crowdsourced Buses in Last-Mile Parcel Delivery, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 5:1, (1-23), Online publication date: 19-Mar-2021.
  136. Ntemos K, Kolokotronis N and Kalouptsidis N Using trust to mitigate malicious and selfish behavior of autonomous agents in CRNs 2016 IEEE 27th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), (1-7)
  137. Che Y, Lai Y, Luo S, Wu K and Duan L (2021). UAV-Aided Information and Energy Transmissions for Cognitive and Sustainable 5G Networks, IEEE Transactions on Wireless Communications, 20:3, (1668-1683), Online publication date: 1-Mar-2021.
  138. Wu Y and Pan L (2021). SG-PAC, Computers and Security, 102:C, Online publication date: 1-Mar-2021.
  139. Wang G, Fang Z, Xie X, Wang S, Sun H, Zhang F, Liu Y and Zhang D (2020). Pricing-aware Real-time Charging Scheduling and Charging Station Expansion for Large-scale Electric Buses, ACM Transactions on Intelligent Systems and Technology, 12:1, (1-26), Online publication date: 28-Feb-2021.
  140. Berthon R, Fijalkow N, Filiot E, Guha S, Maubert B, Murano A, Pinault L, Pinchinat S, Rubin S and Serre O (2020). Alternating Tree Automata with Qualitative Semantics, ACM Transactions on Computational Logic, 22:1, (1-24), Online publication date: 31-Jan-2021.
  141. Fu X, Cai H, Li W and Li L (2020). SEADS, ACM Transactions on Software Engineering and Methodology, 30:1, (1-45), Online publication date: 31-Jan-2021.
  142. Vajjha K, Shinnar A, Trager B, Pestun V and Fulton N CertRL: formalizing convergence proofs for value and policy iteration in Coq Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs, (18-31)
  143. Pananjady A and Wainwright M (2020). Instance-Dependent ℓ∞-Bounds for Policy Evaluation in Tabular Reinforcement Learning, IEEE Transactions on Information Theory, 67:1, (566-585), Online publication date: 1-Jan-2021.
  144. Du X, Li Y, Xie X, Ma L, Liu Y and Zhao J Marble Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, (423-435)
  145. Zychlinski N, Chan C and Dong J Scheduling queues with simultaneous and heterogeneous requirements from multiple types of servers Proceedings of the Winter Simulation Conference, (2365-2376)
  146. Sharma H and Jain R Finite Time Guarantees for Continuous State MDPs with Generative Model 2020 59th IEEE Conference on Decision and Control (CDC), (3617-3622)
  147. Li C and Chen W Energy Efficient Joint Pushing and On-demand Transmission over Shared Spectrum GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  148. Guan W, Zhang H and Victor Leung C Slice Reconfiguration Based on Demand Prediction with Dueling Deep Reinforcement Learning GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  149. Yang Z, Nguyen L, Zhu J, Pan Z, Li J and Jin F Coordinating disaster emergency response with heuristic reinforcement learning Proceedings of the 12th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, (565-572)
  150. Brown D, Niekum S and Petrik M Bayesian robust optimization for imitation learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (2479-2491)
  151. Shifrin M, Menasché D, Cohen A, Goeckel D and Gurewitz O (2020). Optimal PHY Configuration in Wireless Networks, IEEE/ACM Transactions on Networking, 28:6, (2601-2614), Online publication date: 1-Dec-2020.
  152. Stahlbuhk T, Shrader B and Modiano E (2020). Throughput Maximization in Uncooperative Spectrum Sharing Networks, IEEE/ACM Transactions on Networking, 28:6, (2517-2530), Online publication date: 1-Dec-2020.
  153. Hemachandra N, Patil K and Tripathi S (2020). Equilibrium points and equilibrium sets of some queues, Queueing Systems: Theory and Applications, 96:3-4, (245-284), Online publication date: 1-Dec-2020.
  154. Feng W, Huang C, Turrini A and Li Y Modelling and Implementation of Unmanned Aircraft Collision Avoidance Dependable Software Engineering. Theories, Tools, and Applications, (52-69)
  155. Xie H, Li Y and Lui J (2020). A Reinforcement Learning Approach to Optimize Discount and Reputation Tradeoffs in E-commerce Systems, ACM Transactions on Internet Technology, 20:4, (1-26), Online publication date: 8-Nov-2020.
  156. Ganzfried S, Laughlin C and Morefield C Parallel Algorithm for Nash Equilibrium in Multiplayer Stochastic Games with Application to Naval Strategic Planning Distributed Artificial Intelligence, (1-13)
  157. Könighofer B, Lorber F, Jansen N and Bloem R Shield Synthesis for Reinforcement Learning Leveraging Applications of Formal Methods, Verification and Validation: Verification Principles, (290-306)
  158. Miralles-Pechuán L, Jiménez F, Ponce H and Martínez-Villaseñor L A Methodology Based on Deep Q-Learning/Genetic Algorithms for Optimizing COVID-19 Pandemic Government Actions Proceedings of the 29th ACM International Conference on Information & Knowledge Management, (1135-1144)
  159. Dias Pastor H, Oliveira Borges I, Freire V, Valdivia Delgado K and Nunes de Barros L Risk-Sensitive Piecewise-Linear Policy Iteration for Stochastic Shortest Path Markov Decision Processes Advances in Soft Computing, (383-395)
  160. Neto E, Freire V and Delgado K Risk Sensitive Markov Decision Process for Portfolio Management Advances in Soft Computing, (370-382)
  161. Kuinchtner D, Meneguzzi F and Sales A A Tensor-Based Markov Decision Process Representation Advances in Soft Computing, (313-324)
  162. Zugarová E and Guy T Similarity-based transfer learning of decision policies 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (37-44)
  163. Chen J, Dong J and Shi P (2020). A survey on skill-based routing with applications to service operations management, Queueing Systems: Theory and Applications, 96:1-2, (53-82), Online publication date: 1-Oct-2020.
  164. Quach H, Yeom S and Kim K Survey on Reinforcement Learning based Efficient Routing in SDN The 9th International Conference on Smart Media and Applications, (196-200)
  165. Camilli M and Russo B Model-Based Testing Under Parametric Variability of Uncertain Beliefs Software Engineering and Formal Methods, (175-192)
  166. Cui J, Boussetta K and Valois F (2021). Classification of data aggregation functions in wireless sensor networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 178:C, Online publication date: 4-Sep-2020.
  167. Mohammed H, Wei Z, Wu E and Netravali R (2020). Continuous prefetch for interactive data applications, Proceedings of the VLDB Endowment, 13:12, (2297-2311), Online publication date: 1-Aug-2020.
  168. Magirou E, Vassalos P and Barakitis N (2020). A policy iteration algorithm for the American put option and free boundary control problems, Journal of Computational and Applied Mathematics, 373:C, Online publication date: 1-Aug-2020.
  169. Wideł W, Audinot M, Fila B and Pinchinat S (2019). Beyond 2014, ACM Computing Surveys, 52:4, (1-36), Online publication date: 31-Jul-2020.
  170. Zhang H, Sun J and Xu Z Adaptive Structural Hyper-Parameter Configuration by Q-Learning 2020 IEEE Congress on Evolutionary Computation (CEC), (1-8)
  171. Boodaghians S, Fusco F, Lazos P and Leonardi S Pandora's Box Problem with Order Constraints Proceedings of the 21st ACM Conference on Economics and Computation, (439-458)
  172. Ashok P, Chatterjee K, Křetínský J, Weininger M and Winkler T Approximating Values of Generalized-Reachability Stochastic Games Proceedings of the 35th Annual ACM/IEEE Symposium on Logic in Computer Science, (102-115)
  173. Stevens C and Bagheri H Reducing run-time adaptation space via analysis of possible utility bounds Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, (1522-1534)
  174. Sikdar S and Jermaine C MONSOON: Multi-Step Optimization and Execution of Queries with Partially Obscured Predicates Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, (225-240)
  175. Salem F, Chahed T, Altman E, Gati A and Altman Z Scalable Markov Decision Process Model for Advanced Sleep Modes Management in 5G Networks Proceedings of the 13th EAI International Conference on Performance Evaluation Methodologies and Tools, (136-141)
  176. Song J and Guérin R (2020). Pricing (and Bidding) Strategies for Delay Differentiated Cloud Services, ACM Transactions on Economics and Computation, 8:2, (1-58), Online publication date: 14-May-2020.
  177. Nemati S, Icten Z, Maillart L and Schaefer A (2019). Mitigating Information Asymmetry in Liver Allocation, INFORMS Journal on Computing, 32:2, (234-248), Online publication date: 1-Apr-2020.
  178. Hyytiä E, Righter R, Virtamo J and Viitasaari L (2020). On value functions for FCFS queues with batch arrivals and general cost structures, Performance Evaluation, 138:C, Online publication date: 1-Apr-2020.
  179. Liang M, Wang D and Liu D (2020). Improved value iteration for neural-network-based stochastic optimal control design, Neural Networks, 124:C, (280-295), Online publication date: 1-Apr-2020.
  180. Rizk Y, Awad M and Tunstel E (2019). Cooperative Heterogeneous Multi-Robot Systems, ACM Computing Surveys, 52:2, (1-31), Online publication date: 31-Mar-2020.
  181. Mendonça M, Ziviani A and Barreto A (2019). Graph-Based Skill Acquisition For Reinforcement Learning, ACM Computing Surveys, 52:1, (1-26), Online publication date: 31-Jan-2020.
  182. Watanabe T and Sakuragawa T A Sublinear-Regret Reinforcement Learning Algorithm on Constrained Markov Decision Processes with reset action Proceedings of the 4th International Conference on Machine Learning and Soft Computing, (51-55)
  183. Chen Y and Ryzhov I (2020). Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference, Operations Research, 68:1, (295-307), Online publication date: 1-Jan-2020.
  184. Wu Q, Ge H, Fan Q, Yin W, Chang B, Wu G and Lin F (2020). Efficient Task Offloading for 802.11p-Based Cloud-Aware Mobile Fog Computing System in Vehicular Networks, Wireless Communications & Mobile Computing, 2020, Online publication date: 1-Jan-2020.
  185. Shah A, Ganesan R, Jajodia S, Samarati P and Cam H (2019). Adaptive Alert Management for Balancing Optimal Performance among Distributed CSOCs using Reinforcement Learning, IEEE Transactions on Parallel and Distributed Systems, 31:1, (16-33), Online publication date: 1-Jan-2020.
  186. Su J, Cheng H, Guo H and Peng Z (2019). Robust Quadratic Programming for MDPs with uncertain observation noise, Neurocomputing, 370:C, (28-38), Online publication date: 22-Dec-2019.
  187. Feng Y, Li L and Liu Q A kernel loss for solving the Bellman equation Proceedings of the 33rd International Conference on Neural Information Processing Systems, (15456-15467)
  188. Seijen H, Fatemi M and Tavakoli A Using a logarithmic mapping to enable lower discount factors in reinforcement learning Proceedings of the 33rd International Conference on Neural Information Processing Systems, (14134-14144)
  189. Pike-Burke C and Grünewälder S Recovering bandits Proceedings of the 33rd International Conference on Neural Information Processing Systems, (14122-14131)
  190. Ortner R, Pirotta M, Fruit R, Lazaric A and Maillard O Regret bounds for learning state representations in reinforcement learning Proceedings of the 33rd International Conference on Neural Information Processing Systems, (12738-12748)
  191. Harutyunyan A, Dabney W, Mesnard T, Heess N, Azar M, Piot B, van Hasselt H, Singh S, Wayne G, Precup D and Munos R Hindsight credit assignment Proceedings of the 33rd International Conference on Neural Information Processing Systems, (12498-12507)
  192. Grill J, Domingues O, Ménard P, Munos R and Valko M Planning in entropy-regularized Markov decision processes and games Proceedings of the 33rd International Conference on Neural Information Processing Systems, (12404-12413)
  193. Cui H and Khardon R Sampling networks and aggregate simulation for online POMDP planning Proceedings of the 33rd International Conference on Neural Information Processing Systems, (9222-9232)
  194. Bojchevski A and Günnemann S Certifiable robustness to graph perturbations Proceedings of the 33rd International Conference on Neural Information Processing Systems, (8319-8330)
  195. Dai F and Walter M Maximum expected hitting cost of a Markov decision process and informativeness of rewards Proceedings of the 33rd International Conference on Neural Information Processing Systems, (7679-7687)
  196. Zhang L, Tang K and Yao X Explicit planning for efficient exploration in reinforcement learning Proceedings of the 33rd International Conference on Neural Information Processing Systems, (7488-7497)
  197. Russel R and Petrik M Beyond confidence regions Proceedings of the 33rd International Conference on Neural Information Processing Systems, (7049-7058)
  198. Yang D, Zhao L, Lin Z, Qin T, Bian J and Liu T Fully parameterized quantile function for distributional reinforcement learning Proceedings of the 33rd International Conference on Neural Information Processing Systems, (6193-6202)
  199. Qian J, Fruit R, Pirotta M and Lazaric A Exploration bonus for regret minimization in discrete and continuous average reward MDPs Proceedings of the 33rd International Conference on Neural Information Processing Systems, (4890-4899)
  200. Bellemare M, Dabney W, Dadashi R, Taiga A, Castro P, Roux N, Schuurmans D, Lattimore T and Lyle C A geometric perspective on optimal representations for reinforcement learning Proceedings of the 33rd International Conference on Neural Information Processing Systems, (4358-4369)
  201. Zhang Z and Ji X Regret minimization for reinforcement learning by evaluating the optimal bias function Proceedings of the 33rd International Conference on Neural Information Processing Systems, (2827-2836)
  202. Nachum O, Chow Y, Dai B and Li L DualDICE Proceedings of the 33rd International Conference on Neural Information Processing Systems, (2318-2328)
  203. Rosenberg A and Mansour Y Online stochastic shortest path with bandit feedback and unknown transition function Proceedings of the 33rd International Conference on Neural Information Processing Systems, (2212-2221)
  204. Tessler C, Tennenholtz G and Mannor S Distributional policy optimization Proceedings of the 33rd International Conference on Neural Information Processing Systems, (1352-1362)
  205. Cheung W Regret minimization for reinforcement learning with vectorial feedback and complex objectives Proceedings of the 33rd International Conference on Neural Information Processing Systems, (726-736)
  206. Yang C, Ma X, Huang W, Sun F, Liu H, Huang J and Gan C Imitation learning from observations by minimizing inverse dynamics disagreement Proceedings of the 33rd International Conference on Neural Information Processing Systems, (239-249)
  207. Clemens J, Kluth T and Reineking T (2019). β-SLAM, Information Fusion, 52:C, (62-75), Online publication date: 1-Dec-2019.
  208. Adelman D and Uçkun C (2019). Dynamic Electricity Pricing to Smart Homes, Operations Research, 67:6, (1520-1542), Online publication date: 1-Nov-2019.
  209. Ning J and Sobel M (2019). Easy Affine Markov Decision Processes, Operations Research, 67:6, (1719-1737), Online publication date: 1-Nov-2019.
  210. Aalto S and Lassila P (2019). Near-optimal dispatching policy for energy-aware server clusters, Performance Evaluation, 135:C, Online publication date: 1-Nov-2019.
  211. Qiao G, Zhou H, Kapadia M, Yoon S and Pavlovic V Scenario Generalization of Data-driven Imitation Models in Crowd Simulation Proceedings of the 12th ACM SIGGRAPH Conference on Motion, Interaction and Games, (1-11)
  212. Efrosinin D, Kochetkova I, Samouylov K and Stepanova N Algorithmic Analysis of a Two-Class Multi-server Heterogeneous Queueing System with a Controllable Cross-connectivity Analytical and Stochastic Modelling Techniques and Applications, (1-17)
  213. Pandya P and Wakankar A Logical specification and uniform synthesis of robust controllers Proceedings of the 17th ACM-IEEE International Conference on Formal Methods and Models for System Design, (1-11)
  214. Bray R (2019). Markov Decision Processes with Exogenous Variables, Management Science, 65:10, (4598-4606), Online publication date: 1-Oct-2019.
  215. Abbou A and Makis V (2019). Group Maintenance, INFORMS Journal on Computing, 31:4, (719-731), Online publication date: 1-Oct-2019.
  216. Lopes Silva M, de Souza S, Freitas Souza M and Bazzan A (2019). A reinforcement learning-based multi-agent framework applied for solving routing and scheduling problems, Expert Systems with Applications: An International Journal, 131:C, (148-171), Online publication date: 1-Oct-2019.
  217. Taheri Javan N, Sabaei M and Dehghan M (2019). To overhear or not to overhear: a dilemma between network coding gain and energy consumption in multi-hop wireless networks, Wireless Networks, 25:7, (4097-4113), Online publication date: 1-Oct-2019.
  218. Efrosinin D and Stepanova N On Optimal Control Policy of MAP(t)/M/2 Queueing System with Heterogeneous Servers and Periodic Arrival Process Distributed Computer and Communication Networks, (179-194)
  219. Avni G, Henzinger T, Ibsen-Jensen R and Novotný P Bidding Games on Markov Decision Processes Reachability Problems, (1-12)
  220. Ibrahim M and Reveliotis S Throughput maximization of complex resource allocation systems through timed-continuous-Petri-net modeling (Extended Abstract) 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), (1457-1460)
  221. Sai A, Buckley J and Le Gear A (2019). Assessing the security implication of Bitcoin exchange rates, Computers and Security, 86:C, (206-222), Online publication date: 1-Sep-2019.
  222. Wang J, Zhuang Z, Qi Q, Li T and Liao J (2019). Deep reinforcement learning-based cooperative interactions among heterogeneous vehicular networks, Applied Soft Computing, 82:C, Online publication date: 1-Sep-2019.
  223. Avni G, Henzinger T and Chonev V (2019). Infinite-duration Bidding Games, Journal of the ACM, 66:4, (1-29), Online publication date: 26-Aug-2019.
  224. Simão T and Spaan M Structure learning for safe policy improvement Proceedings of the 28th International Joint Conference on Artificial Intelligence, (3453-3459)
  225. Ramesh R, Tomar M and Ravindran B Successor options Proceedings of the 28th International Joint Conference on Artificial Intelligence, (3304-3310)
  226. Ie E, Jain V, Wang J, Narvekar S, Agarwal R, Wu R, Cheng H, Chandra T and Boutilier C SLATEQ Proceedings of the 28th International Joint Conference on Artificial Intelligence, (2592-2599)
  227. Brafman R and De Giacomo G Planning for LTLf/LDLf goals in non-Markovian fully observable nondeterministic domains Proceedings of the 28th International Joint Conference on Artificial Intelligence, (1602-1608)
  228. Mansouri M, Lacerda B, Hawes N and Pecora F Multi-robot planning under uncertain travel times and safety constraints Proceedings of the 28th International Joint Conference on Artificial Intelligence, (478-484)
  229. Lassila P, Gebrehiwot M and Aalto S (2019). Optimal energy-aware load balancing and base station switch-off control in 5G HetNets, Computer Networks: The International Journal of Computer and Telecommunications Networking, 159:C, (10-22), Online publication date: 4-Aug-2019.
  230. Lacerda B, Faruq F, Parker D and Hawes N (2020). Probabilistic planning with formal performance guarantees for mobile service robots, International Journal of Robotics Research, 38:9, (1098-1123), Online publication date: 1-Aug-2019.
  231. Friedrich S, Schreibauer M and Buss M (2019). Least-squares policy iteration algorithms for robotics, Engineering Applications of Artificial Intelligence, 83:C, (72-84), Online publication date: 1-Aug-2019.
  232. Povéda G, Regnier-Coudert O, Teichteil-Königsbuch F, Dupont G, Arnold A, Guerra J and Picard M Evolutionary approaches to dynamic earth observation satellites mission planning under uncertainty Proceedings of the Genetic and Evolutionary Computation Conference, (1302-1310)
  233. Gabor T, Sedlmeier A, Kiermeier M, Phan T, Henrich M, Pichlmair M, Kempter B, Klein C, Sauer H, AG R and Wieghardt J Scenario co-evolution for reinforcement learning on a grid world smart factory domain Proceedings of the Genetic and Evolutionary Computation Conference, (898-906)
  234. Bedewy A, Sun Y, Kompella S and Shroff N Age-optimal Sampling and Transmission Scheduling in Multi-Source Systems Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, (121-130)
  235. Aviv Y, Wei M and Zhang F (2019). Responsive Pricing of Fashion Products, Management Science, 65:7, (2982-3000), Online publication date: 1-Jul-2019.
  236. Zadorojniy A, Wasserkrug S, Zeltyn S and Lipets V (2019). Unleashing Analytics to Reduce Costs and Improve Quality in Wastewater Treatment, Interfaces, 49:4, (262-268), Online publication date: 1-Jul-2019.
  237. Dehghanian A and Kharoufeh J (2019). Optimal stopping with a capacity constraint, Operations Research Letters, 47:4, (311-316), Online publication date: 1-Jul-2019.
  238. Baier C, Bertrand N, Piribauer J and Sankur O Long-run satisfaction of path properties Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science, (1-14)
  239. Russo G, Cardellini V and Presti F Reinforcement Learning Based Policies for Elastic Stream Processing on Heterogeneous Resources Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, (31-42)
  240. Choudhury S, Knickerbocker J and Kochenderfer M Dynamic Real-time Multimodal Routing with Hierarchical Hybrid Planning 2019 IEEE Intelligent Vehicles Symposium (IV), (2397-2404)
  241. Zhou H, Khatri S, Hu J and Liu F A Memory-Efficient Markov Decision Process Computation Framework Using BDD-based Sampling Representation Proceedings of the 56th Annual Design Automation Conference 2019, (1-6)
  242. Moran M and Gordon G (2019). Curious Feature Selection, Information Sciences: an International Journal, 485:C, (42-54), Online publication date: 1-Jun-2019.
  243. Aouadhi M, Delahaye B and Lanoix A (2019). Introducing probabilistic reasoning within Event-B, Software and Systems Modeling (SoSyM), 18:3, (1953-1984), Online publication date: 1-Jun-2019.
  244. Asiain E, Clempner J and Poznyak A (2019). Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 23:11, (3591-3604), Online publication date: 1-Jun-2019.
  245. Mallozzi P, Castellano E, Pelliccione P, Schneider G and Tei K A runtime monitoring framework to enforce invariants on reinforcement learning agents exploring complex environments Proceedings of the 2nd International Workshop on Robotics Software Engineering, (5-12)
  246. Disser Y and Hopp A On Friedmann’s Subexponential Lower Bound for Zadeh’s Pivot Rule Integer Programming and Combinatorial Optimization, (168-180)
  247. Huang Y and Chen X (2018). A Sensitivity‐Based Construction Approach to Variance Minimization of Markov Decision Processes, Asian Journal of Control, 21:3, (1166-1178), Online publication date: 22-May-2019.
  248. Wray K and Zilberstein S Policy Networks Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (2270-2272)
  249. Perkins T Optimal Risk in Multiagent Blind Tournaments Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (2159-2161)
  250. Elkholy A, Yang F and Gustafson S Interpretable Automated Machine Learning in Maana™ Knowledge Platform Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (1937-1939)
  251. Hanna J and Stone P Reducing Sampling Error in Policy Gradient Learning Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (1016-1024)
  252. Pineda L and Zilberstein S Soft Labeling in Stochastic Shortest Path Problems Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (467-475)
  253. He M and Guo H Interleaved Q-Learning with Partially Coupled Training Process Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (449-457)
  254. Perrault A and Boutilier C Experiential Preference Elicitation for Autonomous Heating and Cooling Systems Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (431-439)
  255. Lomuscio A and Pirovano E A Counter Abstraction Technique for the Verification of Probabilistic Swarm Systems Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, (161-169)
  256. Pineda L and Zilberstein S (2019). Probabilistic planning with reduced models, Journal of Artificial Intelligence Research, 65:1, (271-306), Online publication date: 1-May-2019.
  257. Berkhout J and Heidergott B (2019). Analysis of Markov Influence Graphs, Operations Research, 67:3, (892-904), Online publication date: 1-May-2019.
  258. Su Y, Li J and Li Y (2019). Optimality of admission control in a repairable queue, Operations Research Letters, 47:3, (202-207), Online publication date: 1-May-2019.
  259. Kochovski P, Drobintsev P and Stankovski V (2019). Formal Quality of Service assurances, ranking and verification of cloud deployment options with a probabilistic model checking method, Information and Software Technology, 109:C, (14-25), Online publication date: 1-May-2019.
  260. Anwar A, Kelly J, Atia G and Guirguis M (2019). Pinball attacks against Dynamic Channel assignment in wireless networks, Computer Communications, 140:C, (23-37), Online publication date: 1-May-2019.
  261. Lee K, Kim G, Ortega P, Lee D and Kim K (2019). Bayesian optimistic Kullback–Leibler exploration, Machine Learning, 108:5, (765-783), Online publication date: 1-May-2019.
  262. Oraby S, Bhuiyan M, Gundecha P, Mahmud J and Akkiraju R (2019). Modeling and Computational Characterization of Twitter Customer Service Conversations, ACM Transactions on Interactive Intelligent Systems, 9:2-3, (1-28), Online publication date: 25-Apr-2019.
  263. Zhao E and Sukkerd R Interactive explanation for planning-based systems Proceedings of the 10th ACM/IEEE International Conference on Cyber-Physical Systems, (322-323)
  264. Zhang C, Kuppannagari S, Xiong C, Kannan R and Prasanna V A cooperative multi-agent deep reinforcement learning framework for real-time residential load scheduling Proceedings of the International Conference on Internet of Things Design and Implementation, (59-69)
  265. Pilanawithana B, Atapattu S and Evans J Average Transmission Success Probability Bound for SWIPT Relay Networks 2019 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  266. Zhang H, Wu W, Wang C, Li M and Yang R Deep Reinforcement Learning-Based Offloading Decision Optimization in Mobile Edge Computing 2019 IEEE Wireless Communications and Networking Conference (WCNC), (1-7)
  267. Kaminski B, Katoen J and Matheja C (2019). On the hardness of analyzing probabilistic programs, Acta Informatica, 56:3, (255-285), Online publication date: 1-Apr-2019.
  268. Aalto S, Lassila P and Taboada I Indexability of an opportunistic scheduling problem with partial channel information Proceedings of the 12th EAI International Conference on Performance Evaluation Methodologies and Tools, (95-102)
  269. Kumar M U, Bhat S, Kavitha V and Hemachandra N Ultimately Stationary Policies to Approximate Risk-Sensitive Discounted MDPs Proceedings of the 12th EAI International Conference on Performance Evaluation Methodologies and Tools, (63-70)
  270. Cadas A, Bušić A and Doncel J Optimal Control of Dynamic Bipartite Matching Models Proceedings of the 12th EAI International Conference on Performance Evaluation Methodologies and Tools, (39-46)
  271. Balseiro S and Brown D (2019). Approximations to Stochastic Dynamic Programs via Information Relaxation Duality, Operations Research, 67:2, (577-597), Online publication date: 1-Mar-2019.
  272. Ko H, Lee J and Pack S (2022). CG-E2S2, Future Generation Computer Systems, 92:C, (1093-1102), Online publication date: 1-Mar-2019.
  273. Chen W and Wu C (2022). A trading decision support model to maximize the sustainability of a self-financed guaranteed farmgate price program, Computers and Electronics in Agriculture, 158:C, (303-312), Online publication date: 1-Mar-2019.
  274. Thomas E, Sharma R and Nazarathy Y (2019). Towards demand side management control using household specific Markovian models, Automatica (Journal of IFAC), 101:C, (450-457), Online publication date: 1-Mar-2019.
  275. Ortega G, Hendrix E and García I (2019). A CUDA approach to compute perishable inventory control policies using value iteration, The Journal of Supercomputing, 75:3, (1580-1593), Online publication date: 1-Mar-2019.
  276. Berkhout J and Heidergott B (2019). The Jump Start Power Method, Journal of Scientific Computing, 78:3, (1691-1723), Online publication date: 1-Mar-2019.
  277. Lefebvre D (2019). Approximated timed reachability graphs for the robust control of discrete event systems, Discrete Event Dynamic Systems, 29:1, (31-56), Online publication date: 1-Mar-2019.
  278. Gonzalez-Fernandez Y, Hamidi S, Chen S and Liaskos S (2019). Efficient elicitation of software configurations using crowd preferences and domain knowledge, Automated Software Engineering, 26:1, (87-123), Online publication date: 1-Mar-2019.
  279. Ni Z and Motani M (2019). Online Policies for Energy Harvesting Receivers With Time-Switching Architectures, IEEE Transactions on Wireless Communications, 18:2, (1233-1246), Online publication date: 1-Feb-2019.
  280. Le Thi H, Ho V and Pham Dinh T (2019). A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning, Journal of Global Optimization, 73:2, (279-310), Online publication date: 1-Feb-2019.
  281. Yu X, Zhou X and Zhang Y (2019). Collision-Free Trajectory Generation and Tracking for UAVs Using Markov Decision Process in a Cluttered Environment, Journal of Intelligent and Robotic Systems, 93:1-2, (17-32), Online publication date: 1-Feb-2019.
  282. Qu C, Ji F, Qiu M, Yang L, Min Z, Chen H, Huang J and Croft W Learning to Selectively Transfer Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, (699-707)
  283. Zhao X, Robu V, Flynn D, Dinmohammadi F, Fisher M and Webster M Probabilistic model checking of robots deployed in extreme environments Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (8066-8074)
  284. Xie H, Li Y and Lui J Optimizing discount & reputation trade-offs in E-Commerce systems Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (7992-7999)
  285. Pitis S Rethinking the discount factor in reinforcement learning Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (7949-7956)
  286. Majeed S and Hutter M Performance guarantees for homomorphisms beyond Markov Decision Processes Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (7659-7666)
  287. Bueno T, de Barros L, Mauá D and Sanner S Deep reactive policies for planning in stochastic nonlinear domains Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (7530-7537)
  288. Simão T and Spaan M Safe policy improvement with baseline bootstrapping in factored environments Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (4967-4974)
  289. Ma S and Yu J State-augmentation transformations for risk-sensitive reinforcement learning Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (4512-4519)
  290. Gelada C and Bellemare M Off-policy deep reinforcement learning by bootstrapping the covariate shift Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (3647-3655)
  291. Efroni Y, Dalal G, Scherrer B and Mannor S How to combine tree-search methods in reinforcement learning Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (3494-3501)
  292. Cohen A, Qiao X, Yu L, Way E and Tong X Diverse exploration via conjugate policies for policy gradient methods Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (3404-3411)
  293. Abel D, Arumugam D, Asadi K, Jinnai Y, Littman M and Wong L State abstraction as compression in apprenticeship learning Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (3134-3142)
  294. Unhelkar V and Shah J Learning models of sequential decision-making with partial specification of agent behavior Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (2522-2530)
  295. Xie H and Lui J (2019). A Markov Decision Process Approach to Analyze Discount & Reputation Trade-offs in E-commerce Systems, ACM SIGMETRICS Performance Evaluation Review, 46:2, (3-5), Online publication date: 18-Jan-2019.
  296. Batz K, Kaminski B, Katoen J, Matheja C and Noll T (2019). Quantitative separation logic: a logic for reasoning about probabilistic pointer programs, Proceedings of the ACM on Programming Languages, 3:POPL, (1-29), Online publication date: 2-Jan-2019.
  297. Amato C, Konidaris G, Kaelbling L and How J (2019). Modeling and planning with macro-actions in decentralized POMDPs, Journal of Artificial Intelligence Research, 64:1, (817-859), Online publication date: 1-Jan-2019.
  298. Li J, Xia B, Geng X, Ming H, Shakkottai S, Subramanian V and Xie L (2018). Mean Field Games in Nudge Systems for Societal Networks, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 3:4, (1-31), Online publication date: 31-Dec-2018.
  299. Bharadwaj S, Ahmadi M, Tanaka T and Topcu U Transfer Entropy in MDPs with Temporal Logic Specifications 2018 IEEE Conference on Decision and Control (CDC), (4173-4180)
  300. Avni G, Henzinger T and Ibsen-Jensen R Infinite-Duration Poorman-Bidding Games Web and Internet Economics, (21-36)
  301. Charles J, Chanel C, Chauffaut C, Chauvin P and Drougard N Human-Agent Interaction Model Learning based on Crowdsourcing Proceedings of the 6th International Conference on Human-Agent Interaction, (20-28)
  302. Riemer M, Liu M and Tesauro G Learning abstract options Proceedings of the 32nd International Conference on Neural Information Processing Systems, (10445-10455)
  303. Tirinzoni A, Chen X, Petrik M and Ziebart B Policy-conditioned uncertainty sets for robust Markov decision processes Proceedings of the 32nd International Conference on Neural Information Processing Systems, (8953-8963)
  304. Tirinzoni A, Sanchez R and Restelli M Transfer of value functions via variational methods Proceedings of the 32nd International Conference on Neural Information Processing Systems, (6182-6192)
  305. Metelli A, Papini M, Faccio F and Restelli M Policy optimization via importance sampling Proceedings of the 32nd International Conference on Neural Information Processing Systems, (5447-5459)
  306. Liu Q, Li L, Tang Z and Zhou D Breaking the curse of horizon Proceedings of the 32nd International Conference on Neural Information Processing Systems, (5361-5371)
  307. Efroni Y, Dalal G, Scherrer B and Mannor S Multiple-step greedy policies in online and approximate reinforcement learning Proceedings of the 32nd International Conference on Neural Information Processing Systems, (5244-5253)
  308. Lee K, Choi S and Oh S Maximum causal Tsallis entropy imitation learning Proceedings of the 32nd International Conference on Neural Information Processing Systems, (4408-4418)
  309. Cui H, Marinescu R and Khardon R From stochastic planning to marginal MAP Proceedings of the 32nd International Conference on Neural Information Processing Systems, (3085-3095)
  310. Fruit R, Pirotta M and Lazaric A Near optimal exploration-exploitation in non-communicating Markov decision processes Proceedings of the 32nd International Conference on Neural Information Processing Systems, (2998-3008)
  311. Thodoroff P, Durand A, Pineau J and Precup D Temporal regularization in Markov decision process Proceedings of the 32nd International Conference on Neural Information Processing Systems, (1784-1794)
  312. Raghothaman M, Kulkarni S, Heo K and Naik M (2018). User-guided program reasoning using Bayesian inference, ACM SIGPLAN Notices, 53:4, (722-735), Online publication date: 2-Dec-2018.
  313. Wang D, Hoffmann J and Reps T (2018). PMAF: an algebraic framework for static analysis of probabilistic programs, ACM SIGPLAN Notices, 53:4, (513-528), Online publication date: 2-Dec-2018.
  314. Cheung M, Hou F and Huang J (2018). Delay-Sensitive Mobile Crowdsensing: Algorithm Design and Economics, IEEE Transactions on Mobile Computing, 17:12, (2761-2774), Online publication date: 1-Dec-2018.
  315. Xiao Y and Krunz M (2018). Dynamic Network Slicing for Scalable Fog Computing Systems With Energy Harvesting, IEEE Journal on Selected Areas in Communications, 36:12, (2640-2654), Online publication date: 1-Dec-2018.
  316. Liu Q, Dong X, Chen H and Wang Y (2018). IncPregel, Frontiers of Computer Science: Selected Publications from Chinese Universities, 12:6, (1076-1089), Online publication date: 1-Dec-2018.
  317. Bai A, Wu F and Chen X (2018). Posterior sampling for Monte Carlo planning under uncertainty, Applied Intelligence, 48:12, (4998-5018), Online publication date: 1-Dec-2018.
  318. Baier C and Dubslaff C (2018). From verification to synthesis under cost-utility constraints, ACM SIGLOG News, 5:4, (26-46), Online publication date: 12-Nov-2018.
  319. Jiang Q, Leung V, Tang H and Xi H (2018). Energy-Efficient Traffic Rate Adaptation for Wireless Streaming Media Transmission, IEEE Transactions on Circuits and Systems for Video Technology, 28:11, (3313-3319), Online publication date: 1-Nov-2018.
  320. Xin B, Tang K, Wang L and Chen C Knowledge Transfer between Multi-granularity Models for Reinforcement Learning 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (2881-2886)
  321. Ornik M and Topcu U Deception in Optimal Control 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), (821-828)
  322. Ahmadi M, Cubuktepe M, Jansen N and Topcu U Verification of Uncertain POMDPs Using Barrier Certificates 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), (115-122)
  323. Jones D, Hollinger G, Kuhlman M, Sofge D and Gupta S Stochastic Optimization for Autonomous Vehicles with Limited Control Authority 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2395-2401)
  324. J.V. B and Dharma D (2018). HAS, Journal of Parallel and Distributed Computing, 120:C, (1-15), Online publication date: 1-Oct-2018.
  325. Wu S, Ren X, Dey S and Shi L (2018). Optimal scheduling of multiple sensors over shared channels with packet transmission constraint, Automatica (Journal of IFAC), 96:C, (22-31), Online publication date: 1-Oct-2018.
  326. Silva D, Zhang B and Ayhan H (2018). Admission control strategies for tandem Markovian loss systems, Queueing Systems: Theory and Applications, 90:1-2, (35-63), Online publication date: 1-Oct-2018.
  327. Ceran E, Gündüz D and György A A Reinforcement Learning Approach to Age of Information in Multi-User Networks 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), (1967-1971)
  328. Batun S, Schaefer A, Bhandari A and Roberts M (2018). Optimal Liver Acceptance for Risk-Sensitive Patients, Service Science, 10:3, (320-333), Online publication date: 1-Sep-2018.
  329. Martín A, Rodríguez-Fernández V and Camacho D (2018). CANDYMAN, Engineering Applications of Artificial Intelligence, 74:C, (121-133), Online publication date: 1-Sep-2018.
  330. Krishnamurthy V, Aprem A and Bhatt S (2018). Multiple stopping time POMDPs, Automatica (Journal of IFAC), 95:C, (385-398), Online publication date: 1-Sep-2018.
  331. Cavazos-Cadena R (2018). Characterization of the Optimal Risk-Sensitive Average Cost in Denumerable Markov Decision Chains, Mathematics of Operations Research, 43:3, (1025-1050), Online publication date: 1-Aug-2018.
  332. Shaviv D and Özgür A (2018). Online Power Control for Block i.i.d. Energy Harvesting Channels, IEEE Transactions on Information Theory, 64:8, (5920-5937), Online publication date: 1-Aug-2018.
  333. Sehr M and Bitmead R (2018). Stochastic output-feedback model predictive control, Automatica (Journal of IFAC), 94:C, (315-323), Online publication date: 1-Aug-2018.
  334. Rattaro C and Belzarena P (2018). Cognitive Radio Networks, Wireless Personal Communications: An International Journal, 101:4, (2053-2083), Online publication date: 1-Aug-2018.
  335. Cobb A, Everett R, Markham A and Roberts S Identifying Sources and Sinks in the Presence of Multiple Agents with Gaussian Process Vector Calculus Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (1254-1262)
  336. Li H, Xia Y and Zhang W Finite sample analysis of LSTD with random projections and eligibility traces Proceedings of the 27th International Joint Conference on Artificial Intelligence, (2390-2396)
  337. Lecarpentier E, Infantes G, Lesire C and Rachelson E Open loop execution of tree-search algorithms Proceedings of the 27th International Joint Conference on Artificial Intelligence, (2362-2368)
  338. Yadav A, Wilder B, Rice E, Petering R, Craddock J, Yoshioka-Maxwell A, Hemler M, Onasch-Vera L, Tambe M and Woo D Bridging the gap between theory and practice in influence maximization Proceedings of the 27th International Joint Conference on Artificial Intelligence, (5399-5403)
  339. Schmoll S and Schubert M Dynamic resource routing using real-time dynamic programming Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4822-4828)
  340. Panda S and Vorobeychik Y Scalable initial state interdiction for factored MDPs Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4801-4807)
  341. Horák K, Bošanský B and Chatterjee K Goal-HSVI Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4764-4770)
  342. De Giacomo G and Rubin S Automata-theoretic foundations of FOND planning for LTLf and LDLf goals Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4729-4735)
  343. Chatterjee K, Fu H, Goharshady A and Okati N Computational approaches for stochastic shortest path on succinct MDPs Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4700-4707)
  344. Boutilier C, Cohen A, Hassidim A, Mansour Y, Meshi O, Mladenov M and Schuurmans D Planning and learning with stochastic action sets Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4674-4682)
  345. Amiri S, Wei S, Zhang S, Sinapov J, Thomason J and Stone P Multi-modal predicate identification using dynamically learned robot controllers Proceedings of the 27th International Joint Conference on Artificial Intelligence, (4638-4645)
  346. Grau-Moya J, Leibfried F and Bou-Ammar H Balancing two-player stochastic games with soft Q-learning Proceedings of the 27th International Joint Conference on Artificial Intelligence, (268-274)
  347. Zhang J and Bareinboim E Characterizing the Limits of Autonomous Systems Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (2165-2167)
  348. Schwab D, Zhu Y and Veloso M Zero Shot Transfer Learning for Robot Soccer Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (2070-2072)
  349. Yadav A, Noothigattu R, Rice E, Onasch-Vera L, Soriano Marcolino L and Tambe M Please be an Influencer? Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1423-1431)
  350. Belle V On Plans With Loops and Noise Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1310-1317)
  351. Wulfe B, Chintakindi S, Choi S, Hartong-Redden R, Kodali A and Kochenderfer M Real-Time Prediction of Intermediate-Horizon Automotive Collision Risk Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1087-1096)
  352. Jain A and Precup D Eligibility Traces for Options Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1008-1016)
  353. Barlier M, Laroche R and Pietquin O Training Dialogue Systems With Human Advice Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (999-1007)
  354. Phan T, Belzner L, Gabor T and Schmid K Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (730-738)
  355. Mukhopadhyay A, Wang Z and Vorobeychik Y A Decision Theoretic Framework for Emergency Responder Dispatch Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (588-596)
  356. Akshay S, Genest B and Vyas N Distribution-based objectives for Markov Decision Processes Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, (36-45)
  357. Baier C, Bertrand N, Dubslaff C, Gburek D and Sankur O Stochastic Shortest Paths and Weight-Bounded Properties in Markov Decision Processes Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, (86-94)
  358. Li Y, Wu X, Lou Y, Chen H and Li J (2018). Coupling based estimation approaches for the average reward performance potential in Markov chains, Automatica (Journal of IFAC), 93:C, (172-182), Online publication date: 1-Jul-2018.
  359. Xu J, Guo C, Zhang H and Yang J (2018). Resource allocation for real-time traffic in unreliable wireless cellular networks, Wireless Networks, 24:5, (1405-1418), Online publication date: 1-Jul-2018.
  360. Péron M, Bartlett P, Becker K, Helmstedt K and Chadès I Two Approximate Dynamic Programming Algorithms for Managing Complete SIS Networks Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies, (1-10)
  361. Hsu Y Age of Information: Whittle Index for Scheduling Stochastic Arrivals 2018 IEEE International Symposium on Information Theory (ISIT), (2634-2638)
  362. Raghothaman M, Kulkarni S, Heo K and Naik M User-guided program reasoning using Bayesian inference Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, (722-735)
  363. Wang D, Hoffmann J and Reps T PMAF: an algebraic framework for static analysis of probabilistic programs Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, (513-528)
  364. Candelieri A, Perego R and Archetti F Intelligent Pump Scheduling Optimization in Water Distribution Networks Learning and Intelligent Optimization, (352-369)
  365. Sisikoglu Sir E, Pariazar M and Sir M (2018). Capacitated inspection scheduling of multi-unit systems, Computers and Industrial Engineering, 120:C, (471-479), Online publication date: 1-Jun-2018.
  366. Varakantham P, Kumar A, Lau H and Yeoh W (2018). Risk-Sensitive Stochastic Orienteering Problems for Trip Optimization in Urban Environments, ACM Transactions on Intelligent Systems and Technology, 9:3, (1-25), Online publication date: 31-May-2018.
  367. Fissaa T, Guermah H, EL Hamlaoui M, Hafiddi H and Nassar M An Intelligent Approach for Context-Aware Service Selection using Machine Learning Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications, (1-6)
  368. Walraven E and Spaan M (2018). Column generation algorithms for constrained POMDPs, Journal of Artificial Intelligence Research, 62:1, (489-533), Online publication date: 1-May-2018.
  369. Morris N, Stewart C, Chen L, Birke R and Kelley J Model-driven computational sprinting Proceedings of the Thirteenth EuroSys Conference, (1-13)
  370. Ceran E, Gündüz D and György A Average age of information with hybrid ARQ under a resource constraint 2018 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  371. Taksande P, Roy A and Karandikar A Optimal traffic splitting policy in LTE-based heterogeneous network 2018 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  372. Vinod A and Oishi M Scalable Underapproximative Verification of Stochastic LTI Systems using Convexity and Compactness Proceedings of the 21st International Conference on Hybrid Systems: Computation and Control (part of CPS Week), (1-10)
  373. Duran S and Verloop I (2018). Asymptotic Optimal Control of Markov-Modulated Restless Bandits, Proceedings of the ACM on Measurement and Analysis of Computing Systems, 2:1, (1-25), Online publication date: 3-Apr-2018.
  374. Guerreiro S (2018). Using Markov Theory to Deliver Informed Decisions in Partially Observable Business Processes Operation, International Journal of Operations Research and Information Systems, 9:2, (53-72), Online publication date: 1-Apr-2018.
  375. Chafik S, Larach A and Daoui C (2018). Parallel Hierarchical Pre-Gauss-Seidel Value Iteration Algorithm, International Journal of Decision Support System Technology, 10:2, (1-22), Online publication date: 1-Apr-2018.
  376. Cheng Y, Li H and Thorstenson A (2018). Advance selling with double marketing efforts in a newsvendor framework, Computers and Industrial Engineering, 118:C, (352-365), Online publication date: 1-Apr-2018.
  377. Hou L, Zheng K, Chatzimisios P and Feng Y (2018). A Continuous-Time Markov decision process-based resource allocation scheme in vehicular cloud for mobile video services, Computer Communications, 118:C, (140-147), Online publication date: 1-Mar-2018.
  378. Abel D, Williams E, Brawner S, Reif E and Littman M Bandit-based solar panel control Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (7713-7718)
  379. Gupta T, Kumar A and Paruchuri P Planning and learning for decentralized MDPs with event driven rewards Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (6186-6194)
  380. de Nijs F, Spaan M and de Weerdt M Preallocation and planning under stochastic resource constraints Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (4662-4669)
  381. Maliah S and Shani G MDP-based cost sensitive classification using decision trees Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (3746-3753)
  382. Kim K and Park H Imitation learning via kernel mean embedding Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (3415-3422)
  383. Harutyunyan A, Vrancx P, Bacon P, Precup D and Nowe A Learning with options that terminate off-policy Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (3173-3182)
  384. Harb J, Bacon P, Klissarov M and Precup D When waiting is not an option Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (3165-3172)
  385. Dabney W, Rowland M, Bellemare M and Munos R Distributional reinforcement learning with quantile regression Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (2892-2901)
  386. Cohen A, Yu L and Wright R Diverse exploration for fast and safe policy improvement Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (2876-2883)
  387. Cheng W, Erfani S, Zhang R and Ramamohanarao K Learning datum-wise sampling frequency for energy-efficient human activity recognition Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (2143-2150)
  388. Crawford D, Levit A, Ghadermarzy N, Oberoi J and Ronagh P (2018). Reinforcement learning using quantum Boltzmann machines, Quantum Information & Computation, 18:1-2, (51-74), Online publication date: 1-Feb-2018.
  389. Šošić A, Zoubir A, Rueckert E, Peters J and Koeppl H (2018). Inverse reinforcement learning via nonparametric spatio-temporal subgoal modeling, The Journal of Machine Learning Research, 19:1, (2777-2821), Online publication date: 1-Jan-2018.
  390. Yu H, Mahmood A and Sutton R (2018). On generalized Bellman equations and temporal-difference learning, The Journal of Machine Learning Research, 19:1, (1864-1912), Online publication date: 1-Jan-2018.
  391. Bumblauskas D, Gemmill D, Igou A and Anzengruber J (2017). Smart Maintenance Decision Support Systems (SMDSS) based on corporate big data analytics, Expert Systems with Applications: An International Journal, 90:C, (303-317), Online publication date: 30-Dec-2017.
  392. Nishtala R, Carpenter P, Petrucci V and Martorell X (2017). The Hipster Approach for Improving Cloud System Efficiency, ACM Transactions on Computer Systems, 35:3, (1-28), Online publication date: 29-Dec-2017.
  393. Berg B, Dorsman J and Harchol-Balter M (2017). Towards Optimality in Parallel Scheduling, Proceedings of the ACM on Measurement and Analysis of Computing Systems, 1:2, (1-30), Online publication date: 19-Dec-2017.
  394. Iwaki T, Wu Y, Wu J, Sandberg H and Johansson K Wireless sensor network scheduling for remote estimation under energy constraints 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (3362-3367)
  395. Haskell W, Yu P, Sharma H and Jain R Randomized function fitting-based empirical value iteration 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2467-2472)
  396. Scheftelowitsch D, Buchholz P, Hashemi V and Hermanns H Multi-Objective Approaches to Markov Decision Processes with Uncertain Transition Parameters Proceedings of the 11th EAI International Conference on Performance Evaluation Methodologies and Tools, (44-51)
  397. Hadfield-Menell D, Milli S, Abbeel P, Russell S and Dragan A Inverse reward design Proceedings of the 31st International Conference on Neural Information Processing Systems, (6768-6777)
  398. Greenewald K, Tewari A, Klasnja P and Murphy S Action centered contextual bandits Proceedings of the 31st International Conference on Neural Information Processing Systems, (5979-5987)
  399. Barreto A, Dabney W, Munos R, Hunt J, Schaul T, van Hasselt H and Silver D Successor features for transfer in reinforcement learning Proceedings of the 31st International Conference on Neural Information Processing Systems, (4058-4068)
  400. Fruit R, Pirotta M, Lazaric A and Brunskill E Regret minimization in MDPs with options without prior knowledge Proceedings of the 31st International Conference on Neural Information Processing Systems, (3169-3179)
  401. Roy A, Xu H and Pokutta S Reinforcement learning under model mismatch Proceedings of the 31st International Conference on Neural Information Processing Systems, (3046-3055)
  402. Lin J, Rao Y, Lu J and Zhou J Runtime neural pruning Proceedings of the 31st International Conference on Neural Information Processing Systems, (2178-2188)
  403. Metelli A, Pirotta M and Restelli M Compatible reward inverse reinforcement learning Proceedings of the 31st International Conference on Neural Information Processing Systems, (2047-2056)
  404. Zhang L, Tang K and Yao X Log-normality and skewness of estimated state/action values in reinforcement learning Proceedings of the 31st International Conference on Neural Information Processing Systems, (1802-1812)
  405. Agrawal S and Jia R Optimistic posterior sampling for reinforcement learning Proceedings of the 31st International Conference on Neural Information Processing Systems, (1184-1194)
  406. Shaviv D and Özgür A Online Power Control for Block i.i.d. Energy Harvesting Channels GLOBECOM 2017 - 2017 IEEE Global Communications Conference, (1-6)
  407. Leeuwen D and Núñez Queija R (2017). Optimal dispatching in a tandem queue, Queueing Systems: Theory and Applications, 87:3-4, (269-291), Online publication date: 1-Dec-2017.
  408. Cheng W, Erfani S, Zhang R and Ramamohanarao K Markov Dynamic Subsequence Ensemble for Energy-Efficient Activity Recognition Proceedings of the 14th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, (282-291)
  409. Elfar M, Zhong Z, Li Z, Chakrabarty K and Pajic M (2017). Synthesis of Error-Recovery Protocols for Micro-Electrode-Dot-Array Digital Microfluidic Biochips, ACM Transactions on Embedded Computing Systems, 16:5s, (1-22), Online publication date: 31-Oct-2017.
  410. Brown D and Haugh M (2017). Information Relaxation Bounds for Infinite Horizon Markov Decision Processes, Operations Research, 65:5, (1355-1379), Online publication date: 1-Oct-2017.
  411. Chaudhuri S, Baig I and Das D (2017). Self organizing method for handover performance optimization in LTE-advanced network, Computer Communications, 110:C, (151-163), Online publication date: 15-Sep-2017.
  412. Li W, Ku M, Chen Y, Liu K and Zhu S (2017). Performance Analysis for Two-Way Network-Coded Dual-Relay Networks With Stochastic Energy Harvesting, IEEE Transactions on Wireless Communications, 16:9, (5747-5761), Online publication date: 1-Sep-2017.
  413. Hou X, Haijema R and Liu D (2017). Display, disposal, and order policies for fresh produce with a back storage at a wholesale market, Computers and Industrial Engineering, 111:C, (18-28), Online publication date: 1-Sep-2017.
  414. Naskos A, Gounaris A and Katsaros P (2017). Cost-aware horizontal scaling of NoSQL databases using probabilistic model checking, Cluster Computing, 20:3, (2687-2701), Online publication date: 1-Sep-2017.
  415. Hernandez-Leal P, Zhan Y, Taylor M, Sucar L and Munoz De Cote E (2017). An exploration strategy for non-stationary opponents, Autonomous Agents and Multi-Agent Systems, 31:5, (971-1002), Online publication date: 1-Sep-2017.
  416. Bohy A, Bruyère V, Raskin J and Bertrand N (2017). Symblicit algorithms for mean-payoff and shortest path in monotonic Markov decision processes, Acta Informatica, 54:6, (545-587), Online publication date: 1-Sep-2017.
  417. Zhou F, Wu C and Yu C Dynamic dispatching for re-entrant production lines — A deep learning approach 2017 13th IEEE Conference on Automation Science and Engineering (CASE), (1026-1031)
  418. Mladenov M, Boutilier C, Schuurmans D, Meshi O, Elidan G and Lu T Logistic Markov decision processes Proceedings of the 26th International Joint Conference on Artificial Intelligence, (2486-2493)
  419. Mann T, Mannor S and Precup D Approximate value iteration with temporally extended actions Proceedings of the 26th International Joint Conference on Artificial Intelligence, (5035-5039)
  420. Yadav A, Chan H, Jiang A, Xu H, Rice E and Tambe M Maximizing awareness about HIV in social networks of homeless youth with limited information Proceedings of the 26th International Joint Conference on Artificial Intelligence, (4959-4963)
  421. Imaizumi M and Fujimaki R Factorized asymptotic Bayesian policy search for POMDPs Proceedings of the 26th International Joint Conference on Artificial Intelligence, (4346-4352)
  422. Amor N, Fargier H and Sabbadin R Equilibria in ordinal games Proceedings of the 26th International Joint Conference on Artificial Intelligence, (105-111)
  423. Agrawal P and Varakantham P Proactive and reactive coordination of non-dedicated agent teams operating in uncertain environments Proceedings of the 26th International Joint Conference on Artificial Intelligence, (28-34)
  424. Tosatto S, Pirotta M, D'Eramo C and Restelli M Boosted fitted Q-iteration Proceedings of the 34th International Conference on Machine Learning - Volume 70, (3434-3443)
  425. Machado M, Bellemare M and Bowling M A Laplacian Framework for option discovery in reinforcement learning Proceedings of the 34th International Conference on Machine Learning - Volume 70, (2295-2304)
  426. MacGlashan J, Ho M, Loftin R, Peng B, Wang G, Roberts D, Taylor M and Littman M Interactive learning from policy-dependent human feedback Proceedings of the 34th International Conference on Machine Learning - Volume 70, (2285-2294)
  427. Jiang N, Krishnamurthy A, Agarwal A, Langford J and Schapire R Contextual decision processes with low Bellman rank are PAC-learnable Proceedings of the 34th International Conference on Machine Learning - Volume 70, (1704-1713)
  428. Higgins I, Pal A, Rusu A, Matthey L, Burgess C, Pritzel A, Botvinick M, Blundell C and Lerchner A DARLA Proceedings of the 34th International Conference on Machine Learning - Volume 70, (1480-1490)
  429. Du S, Chen J, Li L, Xiao L and Zhou D Stochastic variance reduction methods for policy evaluation Proceedings of the 34th International Conference on Machine Learning - Volume 70, (1049-1058)
  430. Bellemare M, Dabney W and Munos R A Distributional Perspective on Reinforcement Learning Proceedings of the 34th International Conference on Machine Learning - Volume 70, (449-458)
  431. Asadi K and Littman M An alternative softmax operator for reinforcement learning Proceedings of the 34th International Conference on Machine Learning - Volume 70, (243-252)
  432. (2017). On probabilistic snap-stabilization, Theoretical Computer Science, 688:C, (49-76), Online publication date: 6-Aug-2017.
  433. Zheng Y, Wu D, Ke Y, Yang C, Chen M and Zhang G (2017). Online Cloud Transcoding and Distribution for Crowdsourced Live Game Video Streaming, IEEE Transactions on Circuits and Systems for Video Technology, 27:8, (1777-1789), Online publication date: 1-Aug-2017.
  434. Hassler M (2017). Heuristic decision rules for short-term trading of renewable energy with co-located energy storage, Computers and Operations Research, 83:C, (199-213), Online publication date: 1-Jul-2017.
  435. Hernandez-Leal P, Zhan Y, Taylor M, Sucar L and Munoz De Cote E (2017). Efficiently detecting switches against non-stationary opponents, Autonomous Agents and Multi-Agent Systems, 31:4, (767-789), Online publication date: 1-Jul-2017.
  436. Ko H, Lee J and Pack S (2017). An Opportunistic Push Scheme for Online Social Networking Services in Heterogeneous Wireless Networks, IEEE Transactions on Network and Service Management, 14:2, (416-428), Online publication date: 1-Jun-2017.
  437. (2017). Opportunistic scheduling with flow size information for Markovian time-varying channels, Performance Evaluation, 112:C, (27-52), Online publication date: 1-Jun-2017.
  438. Randour M, Raskin J and Sankur O (2017). Percentile queries in multi-dimensional Markov decision processes, Formal Methods in System Design, 50:2-3, (207-248), Online publication date: 1-Jun-2017.
  439. Moreno G, Strichman O, Chaki S and Vaisman R Decision-making with cross-entropy for self-adaptation Proceedings of the 12th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, (90-101)
  440. da Silva F, Glatt R and Costa A Simultaneously Learning and Advising in Multiagent Reinforcement Learning Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (1100-1108)
  441. Philipp P and Rettinger A Reinforcement Learning for Multi-Step Expert Advice Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (962-971)
  442. Kumar R and Varakantham P Exploiting Anonymity and Homogeneity in Factored Dec-MDPs through Precomputed Binomial Distributions Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (732-740)
  443. Grześ M Reward Shaping in Episodic Reinforcement Learning Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (565-573)
  444. Bogert K and Doshi P Scaling Expectation-Maximization for Inverse Reinforcement Learning to Multiple Robots under Occlusion Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (522-529)
  445. Claes D, Oliehoek F, Baier H and Tuyls K Decentralised Online Planning for Multi-Robot Warehouse Commissioning Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (492-500)
  446. Vouros G Learning Conventions via Social Reinforcement Learning in Complex and Open Settings Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (455-463)
  447. Yadav A, Wilder B, Rice E, Petering R, Craddock J, Yoshioka-Maxwell A, Hemler M, Onasch-Vera L, Tambe M and Woo D Influence Maximization in the Field Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, (150-158)
  448. Banovic N, Wang A, Jin Y, Chang C, Ramos J, Dey A and Mankoff J Leveraging Human Routine Models to Detect and Generate Human Behaviors Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, (6683-6694)
  449. Budde C, Dehnert C, Hahn E, Hartmanns A, Junges S and Turrini A JANI Proceedings, Part II, of the 23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 10206, (151-168)
  450. Baier C, Klein J, Klüppelholz S and Wunderlich S Maximizing the Conditional Expected Reward for Reaching the Goal Proceedings, Part II, of the 23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 10206, (269-285)
  451. Butkova Y, Wimmer R and Hermanns H Long-Run Rewards for Markov Automata Proceedings, Part II, of the 23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 10206, (188-203)
  452. McGregor S, Buckingham H, Dietterich T, Houtman R, Montgomery C and Metoyer R (2017). Interactive visualization for testing Markov Decision Processes, Journal of Visual Languages and Computing, 39:C, (93-106), Online publication date: 1-Apr-2017.
  453. Lauri M, Ropponen A and Ritala R (2017). Meeting a deadline, Annals of Mathematics and Artificial Intelligence, 79:4, (337-370), Online publication date: 1-Apr-2017.
  454. Oraby S, Gundecha P, Mahmud J, Bhuiyan M and Akkiraju R "How May I Help You?" Proceedings of the 22nd International Conference on Intelligent User Interfaces, (343-355)
  455. Jeong S and Breazeal C Toward Robotic Companions that Enhance Psychological Wellbeing with Smartphone Technology Proceedings of the Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, (345-346)
  456. Hayes B and Shah J Improving Robot Controller Transparency Through Autonomous Policy Explanation Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, (303-312)
  457. Yang J, Zhu K, Ran Y, Cai W and Yang E (2017). Joint Admission Control and Routing Via Approximate Dynamic Programming for Streaming Video Over Software-Defined Networking, IEEE Transactions on Multimedia, 19:3, (619-631), Online publication date: 1-Mar-2017.
  458. Wang X, Zhang M and Ren F (2017). A hybrid-learning based broker model for strategic power trading in smart grid markets, Knowledge-Based Systems, 119:C, (142-151), Online publication date: 1-Mar-2017.
  459. Cao Z, Guo H, Zhang J, Oliehoek F and Fastenrath U Maximizing the probability of arriving on time Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, (4481-4487)
  460. Zheng Z, Shroff N and Mohapatra P When to reset your keys Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, (3679-3685)
  461. Li Z, Narayan A and Leong T An efficient approach to model-based hierarchical reinforcement learning Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, (3583-3589)
  462. Gilbert H, Weng P and Xu Y Optimizing quantiles in preference-based Markov decision processes Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, (3569-3575)
  463. Filieri A, Maggio M, Angelopoulos K, D’ippolito N, Gerostathopoulos I, Hempel A, Hoffmann H, Jamshidi P, Kalyvianaki E, Klein C, Krikava F, Misailovic S, Papadopoulos A, Ray S, Sharifloo A, Shevtsov S, Ujma M and Vogel T (2017). Control Strategies for Self-Adaptive Software Systems, ACM Transactions on Autonomous and Adaptive Systems, 11:4, (1-31), Online publication date: 3-Feb-2017.
  464. Omidshafiei S, Agha-Mohammadi A, Amato C, Liu S, How J and Vian J (2017). Decentralized control of multi-robot partially observable Markov decision processes using belief space macro-actions, International Journal of Robotics Research, 36:2, (231-258), Online publication date: 1-Feb-2017.
  465. Wu Y, Hu F, Kumar S, Matyjas J, Sun Q and Zhu Y (2017). Apprenticeship Learning Based Spectrum Decision in Multi-Channel Wireless Mesh Networks with Multi-Beam Antennas, IEEE Transactions on Mobile Computing, 16:2, (314-325), Online publication date: 1-Feb-2017.
  466. Song X, Zhang Q, Sekimoto Y, Shibasaki R, Yuan N and Xie X (2016). Prediction and Simulation of Human Mobility Following Natural Disasters, ACM Transactions on Intelligent Systems and Technology, 8:2, (1-23), Online publication date: 18-Jan-2017.
  467. Pérez J, Silva D, Góez J, Sarmiento A, Sarmiento-Romero A, Akhavan-Tabatabaei R and Riaño G (2017). Algorithm 972, ACM Transactions on Mathematical Software, 43:3, (1-22), Online publication date: 16-Jan-2017.
  468. Han D, Wu J, Zhang H and Shi L (2017). Optimal sensor scheduling for multiple linear dynamical systems, Automatica (Journal of IFAC), 75:C, (260-270), Online publication date: 1-Jan-2017.
  469. Ho J and Ermon S Generative adversarial imitation learning Proceedings of the 30th International Conference on Neural Information Processing Systems, (4572-4580)
  470. Pazis J, Parr R and How J Improving PAC exploration using the median of means Proceedings of the 30th International Conference on Neural Information Processing Systems, (3898-3906)
  471. Munos R, Stepleton T, Harutyunyan A and Bellemare M Safe and efficient off-policy reinforcement learning Proceedings of the 30th International Conference on Neural Information Processing Systems, (1054-1062)
  472. Arts J, Basten R and Van Houtum G (2016). Repairable Stocking and Expediting in a Fluctuating Demand Environment, Operations Research, 64:6, (1285-1301), Online publication date: 1-Dec-2016.
  473. Munoz de Cote E, Garcia E and Morales E (2016). Transfer learning by prototype generation in continuous spaces, Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems, 24:6, (464-478), Online publication date: 1-Dec-2016.
  474. Fagundes M, Ossowski S, Cerquides J and Noriega P (2016). Design and evaluation of norm-aware agents based on Normative Markov Decision Processes, International Journal of Approximate Reasoning, 78:C, (33-61), Online publication date: 1-Nov-2016.
  475. Legrain A and Jaillet P (2016). A stochastic algorithm for online bipartite resource allocation problems, Computers and Operations Research, 75:C, (28-37), Online publication date: 1-Nov-2016.
  476. Xia L (2016). Optimization of Markov decision processes under the variance criterion, Automatica (Journal of IFAC), 73:C, (269-278), Online publication date: 1-Nov-2016.
  477. Jeong S and Breazeal C Improving Smartphone Users' Affect and Wellbeing with Personalized Positive Psychology Interventions Proceedings of the Fourth International Conference on Human Agent Interaction, (131-137)
  478. Hartmanns A, Hermanns H and Bungert M Flexible support for time and costs in scenario-aware dataflow Proceedings of the 13th International Conference on Embedded Software, (1-10)
  479. Qiao J, He Y and Shen X (2016). Proactive Caching for Mobile Video Streaming in Millimeter Wave 5G Networks, IEEE Transactions on Wireless Communications, 15:10, (7187-7198), Online publication date: 1-Oct-2016.
  480. Wu H, Lin X, Liu X, Tan K and Zhang Y (2016). CoSchd, IEEE/ACM Transactions on Networking, 24:5, (2579-2592), Online publication date: 1-Oct-2016.
  481. Liu Y, Lee M and Zheng Y (2016). Adaptive Multi-Resource Allocation for Cloudlet-Based Mobile Cloud Computing System, IEEE Transactions on Mobile Computing, 15:10, (2398-2410), Online publication date: 1-Oct-2016.
  482. Moreira D, Delgado K and Nunes De Barros L (2016). Robust probabilistic planning with ilao, Applied Intelligence, 45:3, (662-672), Online publication date: 1-Oct-2016.
  483. Ghaderi J, Shakkottai S and Srikant R (2016). Scheduling Storms and Streams in the Cloud, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 1:4, (1-28), Online publication date: 21-Sep-2016.
  484. (2016). Planning for robotic exploration based on forward simulation, Robotics and Autonomous Systems, 83:C, (15-31), Online publication date: 1-Sep-2016.
  485. Wu C, Chien W, Chuang Y and Cheng Y (2016). Multiple product admission control in semiconductor manufacturing systems with process queue time (PQT) constraints, Computers and Industrial Engineering, 99:C, (347-363), Online publication date: 1-Sep-2016.
  486. Bian T and Jiang Z (2016). Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design, Automatica (Journal of IFAC), 71:C, (348-360), Online publication date: 1-Sep-2016.
  487. Anagnostopoulos A, Grandoni F, Leonardi S and Sankowski P (2016). Online Network Design with Outliers, Algorithmica, 76:1, (88-109), Online publication date: 1-Sep-2016.
  488. Walraven E and Spaan M Planning under uncertainty for aggregated electric vehicle charging with renewable energy supply Proceedings of the Twenty-second European Conference on Artificial Intelligence, (904-912)
  489. Yu Q, Wan H, Xu J, Lécué F and Chang L Explanatory diagnosis of an ontology stream via reasoning about actions Proceedings of the Twenty-second European Conference on Artificial Intelligence, (1596-1597)
  490. Higuera-Chan C, Jasso-Fuentes H and Minjárez-Sosa J (2016). Discrete-Time Control for Systems of Interacting Objects with Unknown Random Disturbance Distributions, Applied Mathematics and Optimization, 74:1, (197-227), Online publication date: 1-Aug-2016.
  491. Varakantham P Sequential decision making for improving efficiency in urban environments Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (4090-4093)
  492. Da Silva F and Costa A Transfer learning for multiagent reinforcement learning systems Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (3982-3983)
  493. Zhang Q, Durfee E, Singh S, Chen A and Witwicki S Commitment semantics for sequential decision making under reward uncertainty Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (3315-3321)
  494. Cui H and Khardon R Online symbolic gradient-based optimization for factored action MDPs Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (3075-3081)
  495. Zhan Y, Ammar H and Taylor M Theoretically-grounded policy advice from multiple teachers in reinforcement learning settings with applications to negative transfer Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (2315-2321)
  496. Gombolay M, Jensen R, Stigile J, Son S and Shah J Apprenticeship scheduling Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (826-833)
  497. Zhang R, Yu Y, Chamie M, Açikmeşe B and Ballard D Decision-making policies for heterogeneous autonomous multi-agent systems with safety constraints Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, (546-552)
  498. Katoen J The Probabilistic Model Checking Landscape Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, (31-45)
  499. Chatterjee K, Henzinger T and Otop J Quantitative Automata under Probabilistic Semantics Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, (76-85)
  500. Wei Q and Chen X (2016). Continuous-time Markov decision processes under the risk-sensitive average cost criterion, Operations Research Letters, 44:4, (457-462), Online publication date: 1-Jul-2016.
  501. Ayesta U, Erausquin M, Ferreira E and Jacko P (2016). Optimal dynamic resource allocation to prevent defaults, Operations Research Letters, 44:4, (451-456), Online publication date: 1-Jul-2016.
  502. Pleşca C, Charvillat V and Ooi W (2016). Multimedia prefetching with optimal Markovian policies, Journal of Network and Computer Applications, 69:C, (40-53), Online publication date: 1-Jul-2016.
  503. Lu X, Yin B, Zhang X, Cao J and Kang Y (2016). Event-based optimization for admission control in distributed service system, Telecommunications Systems, 62:3, (553-567), Online publication date: 1-Jul-2016.
  504. Petrik M and Luss R Interpretable policies for dynamic product recommendations Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, (607-616)
  505. Gilbert H, Zanuttini B, Viappiani P, Weng P and Nicart E Model-free reinforcement learning with Skew-Symmetric Bilinear utilities Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, (252-261)
  506. Boutilier C and Lu T Budget allocation using weakly coupled, constrained Markov decision processes Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, (52-61)
  507. Zhou Y, Guo X and Sun X (2016). Acquisition Pricing and Inventory Decisions on Dual-Source Spare-Part System with Final Production and Remanufacturing, Scientific Programming, 2016, (9), Online publication date: 1-Jun-2016.
  508. Clemens J, Reineking T and Kluth T (2016). An evidential approach to SLAM, path planning, and active exploration, International Journal of Approximate Reasoning, 73:C, (1-26), Online publication date: 1-Jun-2016.
  509. Walraven E, Spaan M and Bakker B (2016). Traffic flow optimization, Engineering Applications of Artificial Intelligence, 52:C, (203-212), Online publication date: 1-Jun-2016.
  510. Essen C, Jobstmann B, Parker D and Varshneya R (2016). Synthesizing efficient systems in probabilistic environments, Acta Informatica, 53:4, (425-457), Online publication date: 1-Jun-2016.
  511. Yadav A, Kamar E, Grosz B and Tambe M HEALER Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, (1504-1506)
  512. K.J. P, Bodas T and Tulabandhula T Reinforcement Learning Algorithms for Regret Minimization in Structured Markov Decision Processes Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, (1289-1290)
  513. Urieli D and Stone P An MDP-Based Winning Approach to Autonomous Power Trading Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, (827-835)
  514. Yadav A, Chan H, Xin Jiang A, Xu H, Rice E and Tambe M Using Social Networks to Aid Homeless Shelters Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, (740-748)
  515. Suay H, Brys T, Taylor M and Chernova S Learning from Demonstration for Shaping through Inverse Reinforcement Learning Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, (429-437)
  516. Banovic N, Buzali T, Chevalier F, Mankoff J and Dey A Modeling and Understanding Human Routine Behavior Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, (248-260)
  517. Avrachenkov K, Filar J, Gaitsgory V and Stillman A (2016). Singularly perturbed linear programs and Markov decision processes, Operations Research Letters, 44:3, (297-301), Online publication date: 1-May-2016.
  518. Li Y and Wu X (2016). A unified approach to time-aggregated Markov decision processes, Automatica (Journal of IFAC), 67:C, (77-84), Online publication date: 1-May-2016.
  519. Barnat J, Černá I, Ročkai P, Štill V and Zákopčanová K On verifying C++ programs with probabilities Proceedings of the 31st Annual ACM Symposium on Applied Computing, (1238-1243)
  520. Shapiro A (2016). Rectangular Sets of Probability Measures, Operations Research, 64:2, (528-541), Online publication date: 1-Apr-2016.
  521. Fearnley J, Rabe M, Schewe S and Zhang L (2016). Efficient approximation of optimal control for continuous-time Markov games, Information and Computation, 247:C, (106-129), Online publication date: 1-Apr-2016.
  522. Hyytiä E and Aalto S (2016). On Round-Robin routing with FCFS and LCFS scheduling, Performance Evaluation, 97:C, (83-103), Online publication date: 1-Mar-2016.
  523. Salemi Parizi M and Ghate A (2016). Multi-class, multi-resource advance scheduling with no-shows, cancellations and overbooking, Computers and Operations Research, 67:C, (90-101), Online publication date: 1-Mar-2016.
  524. Robbel P, Oliehoek F and Kochenderfer M Exploiting anonymity in approximate linear programming Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, (2537-2573)
  525. Kawaguchi K Bounded optimal exploration in MDP Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, (1758-1764)
  526. Barreto A, Beirigo R, Pineau J and Precup D Incremental stochastic factorization for online reinforcement learning Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, (1468-1475)
  527. Fearnley J and Savani R The complexity of all-switches strategy improvement Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, (130-139)
  528. Lin J and Weitnauer M (2016). Modeling of multihop wireless sensor networks with MAC, queuing, and cooperation, International Journal of Distributed Sensor Networks, 2016, (3-3), Online publication date: 1-Jan-2016.
  529. Ding N, Sadeghi P and Kennedy R (2016). Discrete Convexity and Stochastic Approximation for Cross-layer Onoff Transmission Control, IEEE Transactions on Wireless Communications, 15:1, (389-400), Online publication date: 1-Jan-2016.
  530. Hosseini A (2016). A non-penalty recurrent neural network for solving a class of constrained optimization problems, Neural Networks, 73:C, (10-25), Online publication date: 1-Jan-2016.
  531. Hoey J, Schröder T and Alhothali A (2016). Affect control processes, Artificial Intelligence, 230:C, (134-172), Online publication date: 1-Jan-2016.
  532. Jansen N, Kaminski B, Katoen J, Olmedo F, Gretz F and McIver A (2015). Conditioning in Probabilistic Programming, Electronic Notes in Theoretical Computer Science (ENTCS), 319:C, (199-216), Online publication date: 21-Dec-2015.
  533. Hsu Y, Abedini N, Gautam N, Sprintson A and Shakkottai S (2015). Opportunities for network coding, IEEE/ACM Transactions on Networking, 23:6, (1876-1889), Online publication date: 1-Dec-2015.
  534. Liang Xiao, Jinliang Liu, Qiangda Li, Mandayam N and Poor H (2015). User-Centric View of Jamming Games in Cognitive Radio Networks, IEEE Transactions on Information Forensics and Security, 10:12, (2578-2590), Online publication date: 1-Dec-2015.
  535. El-Hajj A, Niyato D and Dawy Z (2015). A DEC-MDP model for joint uplink/downlink resource management in OFDMA-based networks, Physical Communication, 17:C, (107-117), Online publication date: 1-Dec-2015.
  536. Lavi N and Levy H (2015). Admit or Reject? Preserve or Drop?, ACM SIGMETRICS Performance Evaluation Review, 43:3, (25-29), Online publication date: 19-Nov-2015.
  537. Hartmanns A and Hermanns H (2015). In the quantitative automata zoo, Science of Computer Programming, 112:P1, (3-23), Online publication date: 15-Nov-2015.
  538. Ghasemi M, Mohaqeqi M and Kargahi M Joint management of processing and cooling power based on inaccurate thermal information in a stochastic real-time system Proceedings of the 23rd International Conference on Real Time and Networks Systems, (45-54)
  539. Xia W, Kantarcioglu M, Wan Z, Heatherly R, Vorobeychik Y and Malin B Process-Driven Data Privacy Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, (1021-1030)
  540. Khademi A, Saure D, Schaefer A, Braithwaite R and Roberts M (2015). The Price of Nonabandonment, Manufacturing & Service Operations Management, 17:4, (554-570), Online publication date: 1-Oct-2015.
  541. Jian Qiao, Shen X, Mark J and Lei Lei (2015). Video Quality Provisioning for Millimeter Wave 5G Cellular Networks With Link Outage, IEEE Transactions on Wireless Communications, 14:10, (5692-5703), Online publication date: 1-Oct-2015.
  542. Dinh Thai Hoang, Xiao Lu, Dusit Niyato, Ping Wang, Dong In Kim and Zhu Han (2015). Applications of Repeated Games in Wireless Networks: A Survey, IEEE Communications Surveys & Tutorials, 17:4, (2102-2135), Online publication date: 1-Oct-2015.
  543. Gilbert H, Spanjaard O, Viappiani P and Weng P Reducing the Number of Queries in Interactive Value Iteration Proceedings of the 4th International Conference on Algorithmic Decision Theory - Volume 9346, (139-152)
  544. Gilbert H Sequential Decision Making Under Uncertainty Using Ordinal Preferential Information Proceedings of the 4th International Conference on Algorithmic Decision Theory - Volume 9346, (573-577)
  545. Naskos A, Gounaris A, Mouratidis H and Katsaros P Security-Aware Elasticity for NoSQL Databases Proceedings of the 5th International Conference on Model and Data Engineering - Volume 9344, (181-197)
  546. Wang J, Ding X, Lahijanian M, Paschalidis I and Belta C (2015). Temporal logic motion control using actor–critic methods, International Journal of Robotics Research, 34:10, (1329-1344), Online publication date: 1-Sep-2015.
  547. Urgaonkar R, Wang S, He T, Zafer M, Chan K and Leung K (2015). Dynamic service migration and workload scheduling in edge-clouds, Performance Evaluation, 91:C, (205-228), Online publication date: 1-Sep-2015.
  548. Alagoz O, Ayvaci M and Linderoth J (2015). Optimally solving Markov decision processes with total expected discounted reward function, Computers and Industrial Engineering, 87:C, (311-316), Online publication date: 1-Sep-2015.
  549. Bai A, Wu F and Chen X (2015). Online Planning for Large Markov Decision Processes with Hierarchical Decomposition, ACM Transactions on Intelligent Systems and Technology, 6:4, (1-28), Online publication date: 13-Aug-2015.
  550. Moradian M and Ashtiani F (2015). Optimal Relaying in a Slotted Aloha Wireless Network With Energy Harvesting Nodes, IEEE Journal on Selected Areas in Communications, 33:8, (1680-1692), Online publication date: 1-Aug-2015.
  551. Chang H (2015). Random search for constrained Markov decision processes with multi-policy improvement, Automatica (Journal of IFAC), 58:C, (127-130), Online publication date: 1-Aug-2015.
  552. Zhang L, Tang K and Yao X Increasingly cautious optimism for practical PAC-MDP exploration Proceedings of the 24th International Conference on Artificial Intelligence, (4033-4040)
  553. Munzer T, Piot B, Geist M, Pietquin O and Lopes M Inverse reinforcement learning in relational domains Proceedings of the 24th International Conference on Artificial Intelligence, (3735-3741)
  554. Cropper A and Muggleton S Learning efficient logical robot strategies involving composable objects Proceedings of the 24th International Conference on Artificial Intelligence, (3423-3429)
  555. Belle V and Levesque H ALLEGRO Proceedings of the 24th International Conference on Artificial Intelligence, (2762-2769)
  556. Berlink H and Costa A Batch reinforcement learning for smart home energy management Proceedings of the 24th International Conference on Artificial Intelligence, (2561-2567)
  557. Hadoux E, Beynier A, Maudet N, Weng P and Hunter A Optimization of probabilistic argumentation with Markov decision models Proceedings of the 24th International Conference on Artificial Intelligence, (2004-2010)
  558. Gilbert H, Spanjaard O, Viappiani P and Weng P Solving MDPs with skew symmetric bilinear utility functions Proceedings of the 24th International Conference on Artificial Intelligence, (1989-1995)
  559. Song L, Feng Y and Zhang L Planning for stochastic games with co-safe objectives Proceedings of the 24th International Conference on Artificial Intelligence, (1682-1688)
  560. De Chamisso F, Soulier L and Aupetit M Exploratory digraph navigation using A Proceedings of the 24th International Conference on Artificial Intelligence, (1624-1630)
  561. Lacerda B, Parker D and Hawes N Optimal policy generation for partially satisfiable co-safe LTL specifications Proceedings of the 24th International Conference on Artificial Intelligence, (1587-1593)
  562. Dibangoye J, Buffet O and Simonin O Structural results for cooperative decentralized control models Proceedings of the 24th International Conference on Artificial Intelligence, (46-52)
  563. Kratochvil V and Vomlel J Influence diagrams for the optimization of a vehicle speed profile Proceedings of the Twelfth UAI Conference on Bayesian Modeling Applications Workshop - Volume 1565, (44-53)
  564. Walraven E and Spaan M Planning under uncertainty with weighted state scenarios Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, (912-921)
  565. Petrik M and Wu X Optimal threshold control for energy arbitrage with degradable battery storage Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, (692-701)
  566. Bacon P, Balle B and Precup D Learning and planning with timing information in Markov decision processes Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, (111-120)
  567. Clemente L and Raskin J Multidimensional beyond Worst-Case and Almost-Sure Problems for Mean-Payoff Objectives Proceedings of the 2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), (257-268)
  568. Abu Alsheikh M, Dinh Thai Hoang, Niyato D, Hwee-Pink Tan and Shaowei Lin (2015). Markov Decision Processes With Applications in Wireless Sensor Networks: A Survey, IEEE Communications Surveys & Tutorials, 17:3, (1239-1267), Online publication date: 1-Jul-2015.
  569. Guo X, Singh R, Kumar P and Niu Z A High Reliability Asymptotic Approach for Packet Inter-Delivery Time Optimization in Cyber-Physical Systems Proceedings of the 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing, (197-206)
  570. Fearnley J and Savani R The Complexity of the Simplex Method Proceedings of the forty-seventh annual ACM symposium on Theory of Computing, (201-208)
  571. Kobayashi T, Aoyama T, Sekiyama K and Fukuda T (2015). Selection Algorithm for Locomotion Based on the Evaluation of Falling Risk, IEEE Transactions on Robotics, 31:3, (750-765), Online publication date: 1-Jun-2015.
  572. Cutler M, Walsh T and How J (2015). Real-World Reinforcement Learning via Multifidelity Simulators, IEEE Transactions on Robotics, 31:3, (655-671), Online publication date: 1-Jun-2015.
  573. Man Hon Cheung and Jianwei Huang (2015). DAWN: Delay-Aware Wi-Fi Offloading and Network Selection, IEEE Journal on Selected Areas in Communications, 33:6, (1214-1223), Online publication date: 1-Jun-2015.
  574. El Helou M, Ibrahim M, Lahoud S, Khawam K, Mezher D and Cousin B (2015). A Network-Assisted Approach for RAT Selection in Heterogeneous Cellular Networks, IEEE Journal on Selected Areas in Communications, 33:6, (1055-1067), Online publication date: 1-Jun-2015.
  575. Araghi S, Khosravi A and Creighton D (2015). Intelligent cuckoo search optimized traffic signal controllers for multi-intersection network, Expert Systems with Applications: An International Journal, 42:9, (4422-4431), Online publication date: 1-Jun-2015.
  576. Lin J, Jung H, Chang Y, Jung J and Weitnauer M (2015). On cooperative transmission range extension in multi-hop wireless ad-hoc and sensor networks, Ad Hoc Networks, 29:C, (117-134), Online publication date: 1-Jun-2015.
  577. Naveen K and Kumar A (2015). Relay Selection with Channel Probing in Sleep-Wake Cycling Wireless Sensor Networks, ACM Transactions on Sensor Networks, 11:3, (1-38), Online publication date: 28-May-2015.
  578. Sleight J and Durfee E Effective Influence Abstractions for Organizational Design Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, (1267-1274)
  579. Zhang Y, Sreedharan S and Kambhampati S Capability Models and Their Applications in Planning Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, (1151-1159)
  580. Claes D, Robbel P, Oliehoek F, Tuyls K, Hennes D and van der Hoek W Effective Approximations for Multi-Robot Coordination in Spatially Distributed Tasks Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, (881-890)
  581. Efthymiadis K and Kudenko D Knowledge Revision for Reinforcement Learning with Abstract MDPs Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, (763-770)
  582. Naskos A, Stachtiari E, Gounaris A, Katsaros P, Tsoumakos D, Konstantinou I and Sioutas S Dependable horizontal scaling based on probabilistic model checking Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (31-40)
  583. Chih-Yu Wang, Yan Chen, Hung-Yu Wei and Liu K (2015). Scalable Video Multicasting: A Stochastic Game Approach With Optimal Pricing, IEEE Transactions on Wireless Communications, 14:5, (2353-2367), Online publication date: 1-May-2015.
  584. Ghate A (2015). Inverse optimization in countably infinite linear programs, Operations Research Letters, 43:3, (231-235), Online publication date: 1-May-2015.
  585. Feng L, Wiltsche C, Humphrey L and Topcu U Controller synthesis for autonomous systems interacting with human operators Proceedings of the ACM/IEEE Sixth International Conference on Cyber-Physical Systems, (70-79)
  586. Kanoun K and van der Schaar M Big-data streaming applications scheduling with online learning and concept drift detection Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, (1547-1550)
  587. Chatterjee K, Henzinger T, Jobstmann B and Singh R (2015). Measuring and Synthesizing Systems in Probabilistic Environments, Journal of the ACM, 62:1, (1-34), Online publication date: 2-Mar-2015.
  588. Zhe Wang, Aggarwal V and Xiaodong Wang (2015). Iterative Dynamic Water-Filling for Fading Multiple-Access Channels With Energy Harvesting, IEEE Journal on Selected Areas in Communications, 33:3, (382-395), Online publication date: 1-Mar-2015.
  589. Combes R, Elayoubi S, Ali A, Saker L and Chahed T (2015). Optimal online control for sleep mode in green base stations, Computer Networks: The International Journal of Computer and Telecommunications Networking, 78:C, (140-151), Online publication date: 26-Feb-2015.
  590. Khamfroush H, Lucani D, Pahlevani P and Barros J (2015). On Optimal Policies for Network-Coded Cooperation: Theory and Implementation, IEEE Journal on Selected Areas in Communications, 33:2, (199-212), Online publication date: 1-Feb-2015.
  591. Randour M, Raskin J and Sankur O Variations on the Stochastic Shortest Path Problem Proceedings of the 16th International Conference on Verification, Model Checking, and Abstract Interpretation - Volume 8931, (1-18)
  592. Piot B, Geist M and Pietquin O Imitation learning applied to embodied conversational agents Proceedings of the 4th International Conference on Machine Learning for Interactive Systems - Volume 43, (1-5)
  593. Taleghan M, Dietterich T, Crowley M, Hall K and Albers H (2015). PAC optimal MDP planning with application to invasive species management, The Journal of Machine Learning Research, 16:1, (3877-3903), Online publication date: 1-Jan-2015.
  594. García J and Fernández F (2015). A comprehensive survey on safe reinforcement learning, The Journal of Machine Learning Research, 16:1, (1437-1480), Online publication date: 1-Jan-2015.
  595. Chen X, Lin Q and Zhou D (2015). Statistical decision making for optimal budget allocation in crowd labeling, The Journal of Machine Learning Research, 16:1, (1-46), Online publication date: 1-Jan-2015.
  596. Su P, Wu C and Lee L (2015). A recursive dialogue game for personalized computer-aided pronunciation training, IEEE/ACM Transactions on Audio, Speech and Language Processing, 23:1, (127-141), Online publication date: 1-Jan-2015.
  597. Lorenzo J, Hernández-Noriega I and Prieto-Rumeau T (2015). Approximation of two-person zero-sum continuous-time Markov games with average payoff criterion, Operations Research Letters, 43:1, (110-116), Online publication date: 1-Jan-2015.
  598. Abginehchi S, Larsen C and Thorstenson A (2015). Exploring the economic consequences of letting a supplier hold reserve storage, Computers and Operations Research, 53:C, (223-233), Online publication date: 1-Jan-2015.
  599. Kuhn J, Mandjes M and Nazarathy Y Exploration vs exploitation with partially observable Gaussian autoregressive arms Proceedings of the 8th International Conference on Performance Evaluation Methodologies and Tools, (209-216)
  600. Prashanth L, Chatterjee A and Bhatnagar S (2014). Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks, Wireless Networks, 20:8, (2589-2604), Online publication date: 1-Nov-2014.
  601. Puggelli A, Sangiovanni-Vincentelli A and Seshia S Robust strategy synthesis for probabilistic systems applied to risk-limiting renewable-energy pricing Proceedings of the 14th International Conference on Embedded Software, (1-10)
  602. Ghoshdastidar D, Dukkipati A and Bhatnagar S (2014). Newton-based stochastic optimization using q-Gaussian smoothed functional algorithms, Automatica (Journal of IFAC), 50:10, (2606-2614), Online publication date: 1-Oct-2014.
  603. Hadoux E, Beynier A and Weng P Solving Hidden-Semi-Markov-Mode Markov Decision Problems Proceedings of the 8th International Conference on Scalable Uncertainty Management - Volume 8720, (176-189)
  604. Feinberg E, Huang J and Scherrer B (2014). Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming, Operations Research Letters, 42:6, (429-431), Online publication date: 1-Sep-2014.
  605. Herrera J, Hendrix E, Casado L and Haijema R Data Parallelism in Traffic Control Tables with Arrival Information Revised Selected Papers, Part I, of the Euro-Par 2014 International Workshops on Parallel Processing - Volume 8805, (60-70)
  606. Song X, Zhang Q, Sekimoto Y and Shibasaki R Prediction of human emergency behavior and their mobility following large-scale disaster Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, (5-14)
  607. Giovanidis A, Liao Q and Stańczak S (2014). Measurement-adaptive cellular random access protocols, Wireless Networks, 20:6, (1495-1514), Online publication date: 1-Aug-2014.
  608. Ferns N and Precup D Bisimulation metrics are optimal value functions Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, (210-219)
  609. Baier C, Klein J, Klüppelholz S and Wunderlich S Weight monitoring with linear temporal logic Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), (1-10)
  610. Baier C, Dubslaff C and Klüppelholz S Trade-off analysis meets probabilistic model checking Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), (1-10)
  611. Handoko S, Nguyen D, Yuan Z and Lau H Reinforcement learning for adaptive operator selection in memetic search applied to quadratic assignment problem Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation, (193-194)
  612. Chanel C, Lesire C and Teichteil-Königsbuch F A robotic execution framework for online probabilistic (Re)planning Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling, (454-462)
  613. Ramirez M and Sardina S Directed fixed-point regression-based planning for non-deterministic domains Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling, (235-243)
  614. Larrañaga M, Ayesta U and Verloop I (2014). Index policies for a multi-class queue with convex holding cost and abandonments, ACM SIGMETRICS Performance Evaluation Review, 42:1, (125-137), Online publication date: 20-Jun-2014.
  615. Larrañaga M, Ayesta U and Verloop I Index policies for a multi-class queue with convex holding cost and abandonments The 2014 ACM international conference on Measurement and modeling of computer systems, (125-137)
  616. Belzner L Verifiable Decisions in Autonomous Concurrent Systems Proceedings of the 16th IFIP WG 6.1 International Conference on Coordination Models and Languages - Volume 8459, (17-32)
  617. Zeng K, Nielson F and Nielson H The Stochastic Quality Calculus Proceedings of the 16th IFIP WG 6.1 International Conference on Coordination Models and Languages - Volume 8459, (179-193)
  618. Dibangoye J, Amato C, Buffet O and Charpillet F Exploiting separability in multiagent planning with continuous-state MDPs Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, (1281-1288)
  619. Piot B, Geist M and Pietquin O Boosted and reward-regularized classification for apprenticeship learning Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, (1249-1256)
  620. Rochlin I and Sarne D Constraining information sharing to improve cooperative information gathering Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, (237-244)
  621. Bogert K and Doshi P Multi-robot inverse reinforcement learning under occlusion with interactions Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, (173-180)
  622. Adelman D and Barz C (2014). A Unifying Approximate Dynamic Programming Model for the Economic Lot Scheduling Problem, Mathematics of Operations Research, 39:2, (374-402), Online publication date: 1-May-2014.
  623. Baier C, Daum M, Dubslaff C, Klein J and Klüppelholz S Energy-Utility Quantiles Proceedings of the 6th International Symposium on NASA Formal Methods - Volume 8430, (285-299)
  624. Dubslaff C, Klüppelholz S and Baier C Probabilistic model checking for energy analysis in software product lines Proceedings of the 13th international conference on Modularity, (169-180)
  625. Klein J, Müller D, Baier C and Klüppelholz S Are Good-for-Games Automata Good for Probabilistic Model Checking? Proceedings of the 8th International Conference on Language and Automata Theory and Applications - Volume 8370, (453-465)
  626. Yahyaa S and Manderick B Knowledge Gradient for Online Reinforcement Learning Revised Selected Papers of the 6th International Conference on Agents and Artificial Intelligence - Volume 8946, (103-118)
  627. Rens G, Meyer T and Lakemeyer G A Logic for Specifying Stochastic Actions and Observations Proceedings of the 8th International Symposium on Foundations of Information and Knowledge Systems - Volume 8367, (305-323)
  628. Altisen K and Devismes S On Probabilistic Snap-Stabilization Proceedings of the 15th International Conference on Distributed Computing and Networking - Volume 8314, (272-286)
  629. Rochlin I and Sarne D (2014). Utilizing costly coordination in multi-agent joint exploration, Multiagent and Grid Systems, 10:1, (23-49), Online publication date: 1-Jan-2014.
  630. Geramifard A, Walsh T, Tellex S, Chowdhary G, Roy N and How J (2013). A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning, Foundations and Trends® in Machine Learning, 6:4, (375-451), Online publication date: 19-Dec-2013.
  631. Hyytiä E and Aalto S Round-robin routing policy Proceedings of the 7th International Conference on Performance Evaluation Methodologies and Tools, (69-78)
  632. Ahner D and Parson C Weapon tradeoff analysis using dynamic programming for a dynamic weapon target assignment problem within a simulation Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, (2831-2841)
  633. Xu D and Son Y An integrated simulation, Markov decision processes and game theoretic framework for analysis of supply chain competitions Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, (3930-3931)
  634. Devadasan P, Zhong H and Nof S (2013). Collaborative intelligence in knowledge based service planning, Expert Systems with Applications: An International Journal, 40:17, (6778-6787), Online publication date: 1-Dec-2013.
  635. Lozenguez G, Mouaddib A, Beynier A, Adouane L and Martinet P Simultaneous Auctions for "Rendez-Vous" Coordination Phases in Multi-robot Multi-task Mission Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) - Volume 02, (67-74)
  636. Synthesizing distributed scheduling implementation for probabilistic component-based systems Proceedings of the Eleventh ACM/IEEE International Conference on Formal Methods and Models for Codesign, (87-96)
  637. Nunez-Varela J and Wyatt J (2013). Models of gaze control for manipulation tasks, ACM Transactions on Applied Perception, 10:4, (1-22), Online publication date: 1-Oct-2013.
  638. Araghi S, Khosravi A, Johnstone M and Creighton D (2013). A novel modular Q-learning architecture to improve performance under incomplete learning in a grid soccer game, Engineering Applications of Artificial Intelligence, 26:9, (2164-2171), Online publication date: 1-Oct-2013.
  639. Alagoz O, Chhatwal J and Burnside E (2013). Optimal Policies for Reducing Unnecessary Follow-Up Mammography Exams in Breast Cancer Diagnosis, Decision Analysis, 10:3, (200-224), Online publication date: 1-Sep-2013.
  640. Akshay S, Bertrand N, Haddad S and Hélouët L The steady-state control problem for markov decision processes Proceedings of the 10th international conference on Quantitative Evaluation of Systems, (290-304)
  641. Song X, Zhang Q, Sekimoto Y, Horanont T, Ueyama S and Shibasaki R Modeling and probabilistic reasoning of population evacuation during large-scale disaster Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, (1231-1239)
  642. Hallak A, Di-Castro D and Mannor S Model selection in markovian processes Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, (374-382)
  643. Veatch M (2013). Approximate Linear Programming for Average Cost MDPs, Mathematics of Operations Research, 38:3, (535-544), Online publication date: 1-Aug-2013.
  644. Ko A, Sabourin R and Gagnon F (2013). Performance of distributed multi-agent multi-state reinforcement spectrum management using different exploration schemes, Expert Systems with Applications: An International Journal, 40:10, (4115-4126), Online publication date: 1-Aug-2013.
  645. Puggelli A, Li W, Sangiovanni-Vincentelli A and Seshia S Polynomial-Time Verification of PCTL Properties of MDPs with Convex Uncertainties Proceedings of the 25th International Conference on Computer Aided Verification - Volume 8044, (527-542)
  646. Kaufman D and Schaefer A (2013). Robust Modified Policy Iteration, INFORMS Journal on Computing, 25:3, (396-410), Online publication date: 1-Jul-2013.
  647. van Hee K and Sidorova N The right timing Proceedings of the 34th international conference on Application and Theory of Petri Nets and Concurrency, (1-20)
  648. Rastegari B, Condon A, Immorlica N and Leyton-Brown K Two-sided matching with partial information Proceedings of the fourteenth ACM conference on Electronic commerce, (733-750)
  650. Simari G, Dickerson J, Sliva A and Subrahmanian V (2013). Parallel Abductive Query Answering in Probabilistic Logic Programs, ACM Transactions on Computational Logic, 14:2, (1-39), Online publication date: 1-Jun-2013.
  651. Tsoumakos D, Konstantinou I, Boumpouka C, Sioutas S and Koziris N Automated, elastic resource provisioning for NoSQL clusters using TIRAMOLA Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (34-41)
  652. Mahmud M and Ramamoorthy S Learning in non-stationary MDPs as transfer learning Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, (1259-1260)
  653. Hernandez-Leal P, Munoz de Cote E and Sucar L Modeling non-stationary opponents Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, (1135-1136)
  654. Urieli D and Stone P A learning agent for heat-pump thermostat control Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, (1093-1100)
  655. Chakraborty D and Stone P Cooperating with a markovian ad hoc teammate Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, (1085-1092)
  656. Trevizan F and Veloso M Finding objects through stochastic shortest path problems Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, (547-554)
  657. Koga M, Silva V, Cozman F and Costa A Speeding-up reinforcement learning through abstraction and transfer learning Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, (119-126)
  658. Lewis B, Erera A, Nowak M and Chelsea C. W (2013). Managing Inventory in Global Supply Chains Facing Port-of-Entry Disruption Risks, Transportation Science, 47:2, (162-180), Online publication date: 1-May-2013.
  659. Yu H and Bertsekas D (2013). On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems, Mathematics of Operations Research, 38:2, (209-227), Online publication date: 1-May-2013.
  660. Hahn E and Hermanns H Rewarding probabilistic hybrid automata Proceedings of the 16th international conference on Hybrid systems: computation and control, (313-322)
  661. McLay L and Mayorga M (2013). A Dispatching Model for Server-to-Customer Systems That Balances Efficiency and Equity, Manufacturing & Service Operations Management, 15:2, (205-220), Online publication date: 1-Apr-2013.
  662. Chen T, Han T and Kwiatkowska M (2013). On the complexity of model checking interval-valued discrete time Markov chains, Information Processing Letters, 113:7, (210-216), Online publication date: 1-Apr-2013.
  663. Bernardo M, De Nicola R and Loreti M (2013). A uniform framework for modeling nondeterministic, probabilistic, stochastic, or mixed processes and their behavioral equivalences, Information and Computation, 225, (29-82), Online publication date: 1-Apr-2013.
  664. Tran V, Nguyen K, Son T and Pontelli E (2013). A conformant planner based on approximation, ACM Transactions on Intelligent Systems and Technology, 4:2, (1-38), Online publication date: 1-Mar-2013.
  665. Hazon N, Aumann Y, Kraus S and Sarne D (2013). Physical search problems with probabilistic knowledge, Artificial Intelligence, 196, (26-52), Online publication date: 1-Mar-2013.
  666. Wiesemann W, Kuhn D and Rustem B (2013). Robust Markov Decision Processes, Mathematics of Operations Research, 38:1, (153-183), Online publication date: 1-Feb-2013.
  667. Stranders R, Munoz De Cote E, Rogers A and Jennings N (2013). Near-optimal continuous patrolling with teams of mobile information gathering agents, Artificial Intelligence, 195, (63-105), Online publication date: 1-Feb-2013.
  668. Post I and Ye Y The simplex method is strongly polynomial for deterministic Markov decision processes Proceedings of the twenty-fourth annual ACM-SIAM symposium on Discrete algorithms, (1465-1473)
  669. Uc-Cetina V (2013). A novel reinforcement learning architecture for continuous state and action spaces, Advances in Artificial Intelligence, 2013, (7-7), Online publication date: 1-Jan-2013.
  670. Fernando N, Loke S and Rahayu W (2013). Mobile cloud computing, Future Generation Computer Systems, 29:1, (84-106), Online publication date: 1-Jan-2013.
  671. Buyukkaramikli N, Bertrand J and Ooijen H (2013). Periodic capacity management under a lead-time performance constraint, OR Spectrum, 35:1, (221-249), Online publication date: 1-Jan-2013.
  672. Chen J, Ghosh A, Magutt J and Chiang M QAVA Proceedings of the 8th international conference on Emerging networking experiments and technologies, (121-132)
  673. Wu C, Cheng Y, Tang P and Yu J Optimal batch process admission control in tandem queueing systems with queue time constraint considerations Proceedings of the Winter Simulation Conference, (1-6)
  674. Chen X, Fernandez E and Kelton W Optimization model selection for simulation-based approximate dynamic programming approaches in semiconductor manufacturing operations Proceedings of the Winter Simulation Conference, (1-12)
  675. Haijema R, van Dijk D, Hendrix E and van der Wal J Simulation to discover structure in optimal dynamic control policies Proceedings of the Winter Simulation Conference, (1-12)
  676. Ramirez-Nafarrate A, Hafizoglu A, Gel E and Fowler J Comparison of ambulance diversion policies via simulation Proceedings of the Winter Simulation Conference, (1-12)
  677. Hibbard B Decision support for safe AI design Proceedings of the 5th international conference on Artificial General Intelligence, (117-125)
  678. Gast N, Tomozei D and Le Boudec J (2012). Optimal storage policies with wind forecast uncertainties, ACM SIGMETRICS Performance Evaluation Review, 40:3, (28-32), Online publication date: 4-Dec-2012.
  679. Cote N, Canu A, Bouzid M and Mouaddib A Humans-Robots Sliding Collaboration Control in Complex Environments with Adjustable Autonomy Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02, (146-153)
  680. Zhang Y, Xu Y, Sun T and Liu P Green-Waved Cooperative Coordination Algorithm for Decentralized Traffic Control Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02, (75-79)
  681. Hengst B On-Line model-based continuous state reinforcement learning using background knowledge Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence, (851-862)
  682. Vlassis N, Littman M and Barber D (2012). On the Computational Complexity of Stochastic Controller Optimization in POMDPs, ACM Transactions on Computation Theory, 4:4, (1-8), Online publication date: 1-Nov-2012.
  683. Bogdan K and da Silva V Forward and backward feature selection in gradient-based MDP algorithms Proceedings of the 11th Mexican international conference on Advances in Artificial Intelligence - Volume Part I, (383-394)
  684. Minami R and da Silva V Shortest stochastic path with risk sensitive evaluation Proceedings of the 11th Mexican international conference on Advances in Artificial Intelligence - Volume Part I, (371-382)
  685. Gouberman A and Siegle M Markov Reward Models and Markov Decision Processes in Discrete and Continuous Time Advanced Lectures of the International Autumn School on Stochastic Model Checking. Rigorous Dependability Analysis Using Model Checking Techniques for Stochastic Systems - Volume 8453, (156-241)
  686. Gebler D, Hashemi V and Turrini A Computing Behavioral Relations for Probabilistic Concurrent Systems Advanced Lectures of the International Autumn School on Stochastic Model Checking. Rigorous Dependability Analysis Using Model Checking Techniques for Stochastic Systems - Volume 8453, (117-155)
  687. Budde C, D'Argenio P, Sánchez Terraf P and Wolovick N A Theory for the Semantics of Stochastic and Non-deterministic Continuous Systems Advanced Lectures of the International Autumn School on Stochastic Model Checking. Rigorous Dependability Analysis Using Model Checking Techniques for Stochastic Systems - Volume 8453, (67-86)
  688. Forejt V, Kwiatkowska M and Parker D Pareto curves for probabilistic model checking Proceedings of the 10th international conference on Automated Technology for Verification and Analysis, (317-332)
  689. Ayvaci M, Alagoz O and Burnside E (2012). The Effect of Budgetary Restrictions on Breast Cancer Diagnostic Decisions, Manufacturing & Service Operations Management, 14:4, (600-617), Online publication date: 1-Oct-2012.
  690. Peyronnet S, De Rougemont M and Strozecki Y Approximate verification and enumeration problems Proceedings of the 9th international conference on Theoretical Aspects of Computing, (228-242)
  691. Chen T, Forejt V, Kwiatkowska M, Simaitis A, Trivedi A and Ummels M Playing stochastic games precisely Proceedings of the 23rd international conference on Concurrency Theory, (348-363)
  692. Cooper W and Rangarajan B (2012). Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models, Operations Research, 60:5, (1267-1281), Online publication date: 1-Sep-2012.
  693. Walsh T and Goschin S Dynamic teaching in sequential decision making environments Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, (863-872)
  694. Petrik M and Subramanian D An approximate solution method for large risk-averse Markov decision processes Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, (805-814)
  695. Hay N, Russell S, Tolpin D and Shimony S Selecting computations Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, (346-355)
  696. Dibangoye J, Amato C and Doniec A Scaling up decentralized MDPs through heuristic search Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, (217-226)
  697. Stern R, Kalech M and Felner A (2012). Finding patterns in an unknown graph, AI Communications, 25:3, (229-256), Online publication date: 1-Aug-2012.
  698. Wu C, Lin J and Chien W (2012). Dynamic production control in parallel processing systems under process queue time constraints, Computers and Industrial Engineering, 63:1, (192-203), Online publication date: 1-Aug-2012.
  699. Piunovskiy A and Zhang Y (2012). The Transformation Method for Continuous-Time Markov Decision Processes, Journal of Optimization Theory and Applications, 154:2, (691-712), Online publication date: 1-Aug-2012.
  700. Liu C, Huang Z, Xu X, Zuo L and Wu J A rapid sparsification method for kernel machines in approximate policy iteration Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I, (533-544)
  701. Zhang H (2012). Solving an Infinite Horizon Adverse Selection Model Through Finite Policy Graphs, Operations Research, 60:4, (850-864), Online publication date: 1-Jul-2012.
  702. Trevizan F and Veloso M Short-sighted stochastic shortest path problems Proceedings of the Twenty-Second International Conference on Automated Planning and Scheduling, (288-296)
  703. Seegebarth B, Müller F, Schattenberg B and Biundo S Making hybrid plans more clear to human users — a formal approach for generating sound explanations Proceedings of the Twenty-Second International Conference on Automated Planning and Scheduling, (225-233)
  704. Keller T and Eyerich P PROST Proceedings of the Twenty-Second International Conference on Automated Planning and Scheduling, (119-127)
  705. Bai A, Chen X, MacAlpine P, Urieli D, Barrett S and Stone P WrightEagle and UT Austin villa Robot Soccer World Cup XV, (1-12)
  706. Grześ M and Hoey J Analysis of methods for solving MDPs Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 3, (1237-1238)
  707. Devlin S and Kudenko D Dynamic potential-based reward shaping Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (433-440)
  708. Kong J, Chung S and Skadron K (2012). Recent thermal management techniques for microprocessors, ACM Computing Surveys, 44:3, (1-42), Online publication date: 1-Jun-2012.
  709. Wei Q and Guo X (2012). New Average Optimality Conditions for Semi-Markov Decision Processes in Borel Spaces, Journal of Optimization Theory and Applications, 153:3, (709-732), Online publication date: 1-Jun-2012.
  710. Hardman N and Colombi J (2012). An empirical methodology for human integration in the SE technical processes, Systems Engineering, 15:2, (172-190), Online publication date: 1-Jun-2012.
  711. Lozenguez G, Adouane L, Beynier A, Mouaddib A and Martinet P (2012). Map partitioning to approximate an exploration strategy in mobile robotics, Multiagent and Grid Systems, 8:3, (275-288), Online publication date: 1-May-2012.
  712. Lassaigne R and Peyronnet S Approximate planning and verification for large markov decision processes Proceedings of the 27th Annual ACM Symposium on Applied Computing, (1314-1319)
  713. Abundo M, Cardellini V and Lo Presti F (2012). Admission control policies for a multi-class QoS-aware service oriented architecture, ACM SIGMETRICS Performance Evaluation Review, 39:4, (89-98), Online publication date: 9-Mar-2012.
  714. Bertsekas D and Yu H (2012). Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming, Mathematics of Operations Research, 37:1, (66-94), Online publication date: 1-Feb-2012.
  715. von Essen C and Jobstmann B Synthesizing efficient controllers Proceedings of the 13th international conference on Verification, Model Checking, and Abstract Interpretation, (428-444)
  716. Reddyvari V and Jagannatham A (2012). Optimal H.264 scalable video scheduling policies for 3G/4G wireless cellular and video sensor networks, Advances in Multimedia, 2012, (17-17), Online publication date: 1-Jan-2012.
  717. Sisikoglu E, Epelman M and Smith R A sampled fictitious play based learning algorithm for infinite horizon Markov decision processes Proceedings of the Winter Simulation Conference, (4091-4102)
  718. Feldman Z, Masin M, Tantawi A, Arroyo D and Steinder M Using approximate dynamic programming to optimize admission control in cloud computing environment Proceedings of the Winter Simulation Conference, (3158-3169)
  719. Epelman M, Ghate A and Smith R (2011). Sampled fictitious play for approximate dynamic programming, Computers and Operations Research, 38:12, (1705-1718), Online publication date: 1-Dec-2011.
  720. Bougeret M, Casanova H, Rabie M, Robert Y and Vivien F Checkpointing strategies for parallel jobs Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, (1-11)
  721. Kim J and Powell W (2011). Optimal Energy Commitments with Storage and Intermittent Supply, Operations Research, 59:6, (1347-1360), Online publication date: 1-Nov-2011.
  722. Feng J, Liu L and Liu X (2011). TECHNICAL NOTE---An Optimal Policy for Joint Dynamic Price and Lead-Time Quotation, Operations Research, 59:6, (1523-1527), Online publication date: 1-Nov-2011.
  723. Ye Y (2011). The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate, Mathematics of Operations Research, 36:4, (593-603), Online publication date: 1-Nov-2011.
  724. Verhoef C, Bhulai S and van der Mei R (2011). Optimal resource allocation in synchronized multi-tier Internet services, Performance Evaluation, 68:11, (1072-1084), Online publication date: 1-Nov-2011.
  725. Abundo M, Cardellini V and Lo Presti F Optimal admission control for a QoS-aware service-oriented system Proceedings of the 4th European conference on Towards a service-based internet, (179-190)
  726. Ogryczak W, Perny P and Weng P On minimizing ordered weighted regrets in multiobjective Markov decision processes Proceedings of the Second international conference on Algorithmic decision theory, (190-204)
  727. Combes R, Altman Z and Altman E Self-organizing relays in LTE networks Proceedings of the 7th International Conference on Network and Services Management, (99-106)
  728. Barnat J, Černá I and Tůmová J Timed automata approach to verification of systems with degradation Proceedings of the 7th international conference on Mathematical and Engineering Methods in Computer Science, (84-93)
  729. Alur R and Trivedi A Relating average and discounted costs for quantitative analysis of timed systems Proceedings of the ninth ACM international conference on Embedded software, (165-174)
  730. Müller F and Biundo S HTN-style planning in relational POMDPs using first-order FSCs Proceedings of the 34th Annual German conference on Advances in artificial intelligence, (216-227)
  731. Delgado K, De Barros L, Cozman F and Sanner S (2011). Using mathematical programming to solve Factored Markov Decision Processes with Imprecise Probabilities, International Journal of Approximate Reasoning, 52:7, (1000-1017), Online publication date: 1-Oct-2011.
  732. Dinculescu M, Hundt C, Panangaden P, Pineau J and Precup D The duality of state and observation in probabilistic transition systems Proceedings of the 9th international conference on Logic, Language, and Computation, (206-230)
  733. Haijema R Optimal issuing of perishables with a short fixed shelf life Proceedings of the Second international conference on Computational logistics, (160-169)
  734. Esparza J and Gaiser A Probabilistic abstractions with arbitrary domains Proceedings of the 18th international conference on Static analysis, (334-350)
  735. Araya-López M, Buffet O, Thomas V and Charpillet F Active learning of MDP models Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning, (42-53)
  736. Robards M and Sunehag P Gradient based algorithms with loss functions and kernels for improved on-policy control Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning, (30-41)
  737. Li Y and Schuurmans D MapReduce for parallel reinforcement learning Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning, (309-320)
  738. Ummels M and Wojtczak D The complexity of nash equilibria in limit-average games Proceedings of the 22nd international conference on Concurrency theory, (482-496)
  739. Robards M, Sunehag P, Sanner S and Marthi B Sparse Kernel-SARSA(λ) with an eligibility trace Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III, (1-17)
  740. Dulac-Arnold G, Denoyer L, Preux P and Gallinari P Datum-wise classification Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I, (375-390)
  743. Bigus J, Campbell M, Carmeli B, Cefkin M, Chang H, Chen-Ritzo C, Cody W, Ebadollahi S, Evfimievski A, Farkash A, Glissmann S, Gotz D, Grandison T, Gruhl D, Haas P, Hsiao M, Hsueh P, Hu J, Jasinski J, Kaufman J, Kieliszewski C, Kohn M, Knoop S, Maglio P, Mak R, Nelken H, Neti C, Neuvirth H, Pan Y, Peres Y, Ramakrishnan S, Rosen-Zvi M, Renly S, Selinger P, Shabo A, Sorrentino R, Sun J, Syeda-Mahmood T, Tan W, Tao Y, Yaesoubi R and Zhu X (2011). Information technology for healthcare transformation, IBM Journal of Research and Development, 55:5, (492-505), Online publication date: 1-Sep-2011.
  744. Huang Y, Guo X and Song X (2011). Performance Analysis for Controlled Semi-Markov Systems with Application to Maintenance, Journal of Optimization Theory and Applications, 150:2, (395-415), Online publication date: 1-Aug-2011.
  745. Oh E and Kim K A geometric traversal algorithm for reward-uncertain MDPs Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, (565-572)
  746. Nath S, Zoeter O, Narahari Y and Dance C Dynamic mechanism design for markets with strategic resources Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, (539-546)
  747. Asmuth J and Littman M Learning is planning Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, (19-26)
  748. Levina T, Levin Y, McGill J and Nediak M (2011). Network Cargo Capacity Management, Operations Research, 59:4, (1008-1023), Online publication date: 1-Jul-2011.
  749. Jungmann A, Lutterbeck J, Werdehausen B, Kleinjohann B and Kleinjohann L Towards a real-world scenario for investigating organic computing principles in heterogeneous societies of robots Proceedings of the 2011 workshop on Organic computing, (41-50)
  750. Jakab H and Csató L Improving Gaussian process value function approximation in policy gradient algorithms Proceedings of the 21st international conference on Artificial neural networks - Volume Part II, (221-228)
  751. Weng P Markov decision processes with ordinal rewards Proceedings of the Twenty-First International Conference on Automated Planning and Scheduling, (282-289)
  752. Abundo M, Cardellini V and Lo Presti F An MDP-based admission control for a QoS-aware service-oriented system Proceedings of the Nineteenth International Workshop on Quality of Service, (1-3)
  753. Grześ M and Hoey J Efficient planning in R-max The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3, (963-970)
  754. Dibangoye J, Mouaddib F and Chaib-draa B Toward error-bounded algorithms for infinite-horizon DEC-POMDPs The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3, (947-954)
  755. Taylor M, Suay H and Chernova S Integrating reinforcement learning with human demonstrations of varying ability The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2, (617-624)
  756. Yadav N and Sardina S Decision theoretic behavior composition The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2, (575-582)
  757. Devlin S and Kudenko D Theoretical considerations of potential-based reward shaping for multi-agent systems The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (225-232)
  758. Comanici G and Precup D Basis function discovery using spectral clustering and bisimulation metrics Proceedings of the 11th international conference on Adaptive and Learning Agents, (85-99)
  759. Elwany A, Gebraeel N and Maillart L (2011). Structured Replacement Policies for Components with Complex Degradation Processes and Dedicated Sensors, Operations Research, 59:3, (684-695), Online publication date: 1-May-2011.
  760. Eshragh A and Filar J (2011). Hamiltonian Cycles, Random Walks, and Discounted Occupational Measures, Mathematics of Operations Research, 36:2, (258-270), Online publication date: 1-May-2011.
  761. Pietquin O, Geist M, Chandramohan S and Frezza-Buet H (2011). Sample-efficient batch reinforcement learning for dialogue management optimization, ACM Transactions on Speech and Language Processing, 7:3, (1-21), Online publication date: 1-May-2011.
  762. Yang R, Bhulai S, van der Mei R and Seinstra F (2011). Optimal resource allocation for time-reservation systems, Performance Evaluation, 68:5, (414-428), Online publication date: 1-May-2011.
  763. Bienvenu M, Fritz C and McIlraith S (2011). Specifying and computing preferred plans, Artificial Intelligence, 175:7-8, (1308-1345), Online publication date: 1-May-2011.
  764. Sun C, Stevens-Navarro E, Shah-Mansouri V and Wong V (2011). A constrained MDP-based vertical handoff decision algorithm for 4G heterogeneous wireless networks, Wireless Networks, 17:4, (1063-1081), Online publication date: 1-May-2011.
  765. Hahn E, Han T and Zhang L Synthesis for PCTL in parametric Markov decision processes Proceedings of the Third international conference on NASA Formal methods, (146-161)
  766. Gawlitza T and Seidl H (2011). Solving systems of rational equations through strategy iteration, ACM Transactions on Programming Languages and Systems, 33:3, (1-48), Online publication date: 1-Apr-2011.
  767. Konovalov M, Malashenko Y and Nazarova I (2011). Job control in heterogeneous computing systems, Journal of Computer and Systems Sciences International, 50:2, (220-237), Online publication date: 1-Apr-2011.
  768. Hodge D and Glazebrook K (2011). Dynamic resource allocation in a multi-product make-to-stock production system, Queueing Systems: Theory and Applications, 67:4, (333-364), Online publication date: 1-Apr-2011.
  769. Çil E, Karaesmen F and Örmeci E (2011). Dynamic pricing and scheduling in a multi-class single-server queueing system, Queueing Systems: Theory and Applications, 67:4, (305-331), Online publication date: 1-Apr-2011.
  770. Kang Y and Prabhu V An approach for dynamic optimization of prevention program implementation in stochastic environments Proceedings of the 4th international conference on Social computing, behavioral-cultural modeling and prediction, (260-267)
  771. Gawlitza T and Monniaux D Improving strategies via SMT solving Proceedings of the 20th European conference on Programming languages and systems: part of the joint European conferences on theory and practice of software, (236-255)
  772. Balbach F and Zeugmann T (2011). Teaching randomized learners with feedback, Information and Computation, 209:3, (296-319), Online publication date: 1-Mar-2011.
  773. Wachs A, Schochetman I and Smith R (2011). Average Optimality in Nonhomogeneous Infinite Horizon Markov Decision Processes, Mathematics of Operations Research, 36:1, (147-164), Online publication date: 1-Feb-2011.
  774. Luo C, Yu F, Ji H and Leung V (2011). Optimal channel access for TCP performance improvement in cognitive radio networks, Wireless Networks, 17:2, (479-492), Online publication date: 1-Feb-2011.
  775. Sharna S and Murshed M Performance improvement of vertical handoff algorithms for QoS support over heterogeneous wireless networks Proceedings of the Thirty-Fourth Australasian Computer Science Conference - Volume 113, (17-24)
  776. Lam C and Ip W (2011). A customer satisfaction inventory model for supply chain integration, Expert Systems with Applications: An International Journal, 38:1, (875-883), Online publication date: 1-Jan-2011.
  777. Hannah L, Powell W and Blei D Nonparametric density estimation for stochastic optimization with an observable state variable Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 1, (820-828)
  778. Xu H and Mannor S Distributionally robust Markov decision processes Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 2, (2505-2513)
  779. Neu G, György A, Szepesvári C and Antos A Online Markov decision processes under bandit feedback Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 2, (1804-1812)
  780. Al-Zubaidy H, Lambadaris I and Talim J (2010). Optimal scheduling in high-speed downlink packet access networks, ACM Transactions on Modeling and Computer Simulation, 21:1, (1-27), Online publication date: 1-Dec-2010.
  781. Niyato D, Wang P, Hossain E, Saad W and Hjorungnes A (2010). Exploiting mobility diversity in sharing wireless access, IEEE Transactions on Wireless Communications, 9:12, (3866-3877), Online publication date: 1-Dec-2010.
  782. Hartmanns A Model-checking and simulation for stochastic timed systems Proceedings of the 9th international conference on Formal Methods for Components and Objects, (372-391)
  783. Cai C, Wang Y and Geers G Adaptive traffic signal control using vehicle-to-infrastructure communication Proceedings of the Third International Workshop on Computational Transportation Science, (43-47)
  784. Chhatwal J, Alagoz O and Burnside E (2010). Optimal Breast Biopsy Decision-Making Based on Mammographic Features and Demographic Factors, Operations Research, 58:6, (1577-1591), Online publication date: 1-Nov-2010.
  785. Cirillo M, Karlsson L and Saffiotti A (2010). Human-aware task planning, ACM Transactions on Intelligent Systems and Technology, 1:2, (1-26), Online publication date: 1-Nov-2010.
  786. Delgado K, Fang C, Sanner S and De Barros L Symbolic bounded real-time dynamic programming Proceedings of the 20th Brazilian conference on Advances in artificial intelligence, (193-202)
  787. Eboli M and Cozman F Markov decision processes from colored Petri nets Proceedings of the 20th Brazilian conference on Advances in artificial intelligence, (72-81)
  788. Chen T and Lu J Towards analysis of semi-Markov decision processes Proceedings of the 2010 international conference on Artificial intelligence and computational intelligence: Part I, (41-48)
  789. Baier C On model checking techniques for randomized distributed systems Proceedings of the 8th international conference on Integrated formal methods, (1-11)
  790. Simari G, Dickerson J and Subrahmanian V Cost-based query answering in action probabilistic logic programs Proceedings of the 4th international conference on Scalable uncertainty management, (319-332)
  791. Cardon S, Chetcuti-Sperandio N, Delorme F and Lagrue S A Markovian process modeling for Pickomino Proceedings of the 7th international conference on Computers and games, (199-210)
  792. Chandramohan S, Geist M and Pietquin O Sparse approximate dynamic programming for dialog management Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, (107-115)
  793. Smith M Compositional abstraction of PEPA models for transient analysis Proceedings of the 7th European performance engineering conference on Computer performance engineering, (252-267)
  794. Mateescu R and Serwe W A study of shared-memory mutual exclusion protocols using CADP Proceedings of the 15th international conference on Formal methods for industrial critical systems, (180-197)
  795. Kwiatkowska M, Norman G and Parker D A framework for verification of software with time and probabilities Proceedings of the 8th international conference on Formal modeling and analysis of timed systems, (25-45)
  796. Kunnumkal S and Topaloglu H (2010). A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm, ACM Transactions on Modeling and Computer Simulation, 20:3, (1-26), Online publication date: 1-Sep-2010.
  797. Perny P and Weng P On Finding Compromise Solutions in Multiobjective Markov Decision Processes Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence, (969-970)
  798. Pralet C, Verfaillie G, Lemaître M and Infantes G Constraint-Based Controller Synthesis in Non-Deterministic and Partially Observable Domains Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence, (681-686)
  799. Hans A and Udluft S Uncertainty Propagation for Efficient Exploration in Reinforcement Learning Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence, (361-366)
  800. Li R, Wang P and James G Multiscale Adaptive Agent-Based Management of Storage-Enabled Photovoltaic Facilities Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence, (151-156)
  801. McLay L, Lee A and Jacobson S (2010). Risk-Based Policies for Airport Security Checkpoint Screening, Transportation Science, 44:3, (333-349), Online publication date: 1-Aug-2010.
  802. Li B and Si J (2010). Approximate robust policy iteration using multilayer perceptron neural networks for discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices, IEEE Transactions on Neural Networks, 21:8, (1270-1280), Online publication date: 1-Aug-2010.
  803. Abe N, Melville P, Pendus C, Reddy C, Jensen D, Thomas V, Bennett J, Anderson G, Cooley B, Kowalczyk M, Domick M and Gardinier T Optimizing debt collections using constrained reinforcement learning Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, (75-84)
  804. Fearnley J Exponential lower bounds for policy iteration Proceedings of the 37th international colloquium conference on Automata, languages and programming: Part II, (551-562)
  805. Armony M and Gurvich I (2010). When Promotions Meet Operations, Manufacturing & Service Operations Management, 12:3, (470-488), Online publication date: 1-Jul-2010.
  806. Downey C and Sanner S Temporal difference Bayesian model averaging Proceedings of the 27th International Conference on Machine Learning, (311-318)
  807. Wang J, Venkatesha Prasad R and Niemegeers I (2010). Solving the uncertainty of vertical handovers in multi-radio home networks, Computer Communications, 33:9, (1122-1132), Online publication date: 1-Jun-2010.
  808. Shiang H and van der Schaar M (2010). Online learning in autonomic multi-hop wireless networks for transmitting mission-critical applications, IEEE Journal on Selected Areas in Communications, 28:5, (728-741), Online publication date: 1-Jun-2010.
  809. Min D and Yih Y (2010). An elective surgery scheduling problem considering patient priority, Computers and Operations Research, 37:6, (1091-1099), Online publication date: 1-Jun-2010.
  810. Sharma R and Gopal M (2010). Review article, Applied Soft Computing, 10:3, (675-688), Online publication date: 1-Jun-2010.
  811. Joshi S, Kersting K and Khardon R Self-taught decision theoretic planning with First Order Decision Diagrams Proceedings of the Twentieth International Conference on Automated Planning and Scheduling, (89-96)
  812. Sanner S, Uther W and Delgado K Approximate dynamic programming with affine ADDs Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (1349-1356)
  813. Rabinovich Z, Dufton L, Larson K and Jennings N Cultivating desired behaviour Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (1097-1104)
  814. de Cote E and Jennings N Planning against fictitious players in repeated normal form games Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (1073-1080)
  815. Comanici G and Precup D Optimal policy switching algorithms for reinforcement learning Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (709-714)
  816. Grześ M and Kudenko D PAC-MDP learning with knowledge-based admissible models Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (349-358)
  817. Amato C and Shani G High-level reinforcement learning in strategy games Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (75-82)
  818. Schurr N, Picciano P and Marecki J Function allocation for NextGen airspace via agents Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Industry track, (1731-1738)
  819. Armony M and Ward A (2010). Fair Dynamic Routing in Large-Scale Heterogeneous-Server Systems, Operations Research, 58:3, (624-637), Online publication date: 1-May-2010.
  820. Lai G, Margot F and Secomandi N (2010). An Approximate Dynamic Programming Approach to Benchmark Practice-Based Heuristics for Natural Gas Storage Valuation, Operations Research, 58:3, (564-582), Online publication date: 1-May-2010.
  821. Vrancx P, Verbeeck K and Nowé A (2010). Analyzing the dynamics of stigmergetic interactions through pheromone games, Theoretical Computer Science, 411:21, (2116-2126), Online publication date: 1-May-2010.
  822. Grześ M and Kudenko D (2010). 2010 Special Issue, Neural Networks, 23:4, (541-550), Online publication date: 1-May-2010.
  823. Akuiyibo E and Boyd S (2010). Adaptive modulation with smoothed flow utility, EURASIP Journal on Wireless Communications and Networking, 2010, (1-9), Online publication date: 1-Apr-2010.
  824. Tan M, Alhajj R and Polat F (2010). Automated large-scale control of gene regulatory networks, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40:2, (286-297), Online publication date: 1-Apr-2010.
  825. Seneviratne L and Izquierdo E An interactive framework for image annotation through gaming Proceedings of the international conference on Multimedia information retrieval, (517-526)
  826. Baffa A and Ciarlini A Modeling POMDPs for generating and simulating stock investment policies Proceedings of the 2010 ACM Symposium on Applied Computing, (2394-2399)
  827. Jung H and Pedram M Optimizing the power delivery network in dynamically voltage scaled systems with uncertain power mode transition times Proceedings of the Conference on Design, Automation and Test in Europe, (351-356)
  828. Montes-De-Oca R and Lemus-Rodríguez E (2010). When are the value iteration maximizers close to an optimal stationary policy of a discounted Markov decision process?, WSEAS Transactions on Mathematics, 9:3, (151-160), Online publication date: 1-Mar-2010.
  829. Verbancsics P and Stanley K (2010). Evolving Static Representations for Task Transfer, The Journal of Machine Learning Research, 11, (1737-1769), Online publication date: 1-Mar-2010.
  830. Jaksch T, Ortner R and Auer P (2010). Near-optimal Regret Bounds for Reinforcement Learning, The Journal of Machine Learning Research, 11, (1563-1600), Online publication date: 1-Mar-2010.
  831. Castro D and Meir R (2010). A Convergent Online Single Time Scale Actor Critic Algorithm, The Journal of Machine Learning Research, 11, (367-410), Online publication date: 1-Mar-2010.
  832. Savaşaneril S, Griffin P and Keskinocak P (2010). Dynamic Lead-Time Quotation for an M/M/1 Base-Stock Inventory Queue, Operations Research, 58:2, (383-395), Online publication date: 1-Mar-2010.
  833. Secomandi N (2010). Optimal Commodity Trading with a Capacitated Storage Asset, Management Science, 56:3, (449-467), Online publication date: 1-Mar-2010.
  834. Peng H and Lin Y (2010). An optimal warning-zone-length assignment algorithm for real-time and multiple-QoS on-chip bus arbitration, ACM Transactions on Embedded Computing Systems, 9:4, (1-39), Online publication date: 1-Mar-2010.
  835. Martínez Ortuno F, Harder U and Harrison P A markovian futures market for computing power Proceedings of the first joint WOSP/SIPEW international conference on Performance engineering, (177-182)
  836. Osais Y, Yu F and St-Hilaire M Thermal management of biosensor networks Proceedings of the 7th IEEE conference on Consumer communications and networking conference, (249-253)
  837. Shlakhter O, Lee C, Khmelev D and Jaber N (2010). Acceleration Operators in the Value Iteration Algorithms for Markov Decision Processes, Operations Research, 58:1, (193-202), Online publication date: 1-Jan-2010.
  838. Zhang H (2010). Partially Observable Markov Decision Processes, Operations Research, 58:1, (214-228), Online publication date: 1-Jan-2010.
  839. Delage E and Mannor S (2010). Percentile Optimization for Markov Decision Processes with Parameter Uncertainty, Operations Research, 58:1, (203-213), Online publication date: 1-Jan-2010.
  840. Powell W (2010). Rejoinder---The Languages of Stochastic Optimization, INFORMS Journal on Computing, 22:1, (23-25), Online publication date: 1-Jan-2010.
  841. Huang J and Krishnamurthy V (2010). Transmission control in cognitive radio as a Markovian dynamic game, IEEE Transactions on Communications, 58:1, (301-310), Online publication date: 1-Jan-2010.
  842. Gloannec S and Mouaddib A Navigation Method Selector for an Autonomous Explorer Rover with a Markov Decision Process Proceedings of the 2nd International Conference on Intelligent Robotics and Applications, (147-156)
  843. Montes-De-Oca R and Lemus-Rodríguez E Value iteration and action ε-approximation of optimal policies in discounted Markov decision processes Proceedings of the 14th WSEAS International Conference on Applied mathematics, (213-218)
  844. Ramírez-Hernández J and Fernandez E A simulation-based approximate dynamic programming approach for the control of the Intel Mini-Fab benchmark model Winter Simulation Conference, (1634-1645)
  845. Strehl A, Li L and Littman M (2009). Reinforcement Learning in Finite MDPs: PAC Analysis, The Journal of Machine Learning Research, 10, (2413-2444), Online publication date: 1-Dec-2009.
  846. Taylor M and Stone P (2009). Transfer Learning for Reinforcement Learning Domains: A Survey, The Journal of Machine Learning Research, 10, (1633-1685), Online publication date: 1-Dec-2009.
  847. Ji G and Liang B Stochastic rate control for scalable VBR video streaming over wireless networks Proceedings of the 28th IEEE conference on Global telecommunications, (5924-5929)
  848. Zadorojniy A, Even G and Shwartz A (2009). A Strongly Polynomial Algorithm for Controlled Queues, Mathematics of Operations Research, 34:4, (992-1007), Online publication date: 1-Nov-2009.
  849. Huang H and Lau V (2009). Delay-sensitive distributed power and transmission threshold control for S-ALOHA network with finite state Markov fading channels, IEEE Transactions on Wireless Communications, 8:11, (5632-5638), Online publication date: 1-Nov-2009.
  850. Bhatnagar S, Sutton R, Ghavamzadeh M and Lee M (2009). Natural actor-critic algorithms, Automatica (Journal of IFAC), 45:11, (2471-2482), Online publication date: 1-Nov-2009.
  851. Barreto A, Augusto D and Barbosa H On the characteristics of sequential decision problems and their impact on evolutionary computation and reinforcement learning Proceedings of the 9th international conference on Artificial evolution, (194-205)
  852. Defourny B, Ernst D and Wehenkel L Bounds for multistage stochastic programs using supervised learning strategies Proceedings of the 5th international conference on Stochastic algorithms: foundations and applications, (61-73)
  853. Coucheney P, Hyon E, Touati C and Gaujal B Myopic versus clairvoyant admission policies in wireless networks Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools, (1-10)
  854. Beccuti M, Codetta-Raiteri D and Franceschinis G Multiple abstraction levels in performance analysis of WSN monitoring systems Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools, (1-10)
  855. Gast N and Gaujal B A mean field approach for optimization in particle systems and applications Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools, (1-10)
  856. Hardman N, Colombi J, Jacques D, Hill R and Miller J Application of a seeded hybrid genetic algorithm for user interface design Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics, (462-467)
  857. Munir A and Gordon-Ross A An MDP-based application oriented optimal policy for wireless sensor networks Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis, (183-192)
  858. Ye Z, Abouzeid A and Ai J (2009). Optimal stochastic policies for distributed data aggregation in wireless sensor networks, IEEE/ACM Transactions on Networking, 17:5, (1494-1507), Online publication date: 1-Oct-2009.
  859. Ho Z, Lau V and Cheng R (2009). Cross-layer design of FDD-OFDM systems based on ACK/NAK feedbacks, IEEE Transactions on Information Theory, 55:10, (4568-4584), Online publication date: 1-Oct-2009.
  860. van Otterlo M (2009). Intensional dynamic programming. A Rosetta stone for structured dynamic programming, Journal of Algorithms, 64:4, (169-191), Online publication date: 1-Oct-2009.
  861. Heris S, Sistani M and Pariz N Using control theory for analysis of reinforcement learning and optimal policy properties in grid-world problems Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications, (276-285)
  862. Devlin S, Grzes M and Kudenko D Reinforcement Learning in RoboCup KeepAway with Partial Observability Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02, (201-208)
  863. Chaput P, Danos V, Panangaden P and Plotkin G Approximating labelled Markov processes again! Proceedings of the 3rd international conference on Algebra and coalgebra in computer science, (145-156)
  864. Ummels M and Wojtczak D Decision problems for Nash equilibria in stochastic games Proceedings of the 23rd CSL international conference and 18th EACSL Annual conference on Computer science logic, (515-529)
  865. Niyato D, Hossain E and Kim D (2009). Joint admission control and antenna assignment for multiclass QoS in spatial multiplexing MIMO wireless networks, IEEE Transactions on Wireless Communications, 8:9, (4855-4865), Online publication date: 1-Sep-2009.
  866. Kong L, Wong G and Tsang D (2009). Performance study and system optimization on sleep mode operation in IEEE 802.16e, IEEE Transactions on Wireless Communications, 8:9, (4518-4528), Online publication date: 1-Sep-2009.
  867. Ahmad S, Liu M, Javidi T, Zhao Q and Krishnamachari B (2009). Optimality of myopic sensing in multichannel opportunistic access, IEEE Transactions on Information Theory, 55:9, (4040-4050), Online publication date: 1-Sep-2009.
  868. Zhang D and Adelman D (2009). An Approximate Dynamic Programming Approach to Network Revenue Management with Customer Choice, Transportation Science, 43:3, (381-394), Online publication date: 1-Aug-2009.
  869. Jung H, Hwang A and Pedram M (2009). Predictive-flow-queue-based energy optimization for gigabit ethernet controllers, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 17:8, (1113-1126), Online publication date: 1-Aug-2009.
  870. Iglesias A, Martínez P, Aler R and Fernández F (2009). Learning teaching strategies in an Adaptive and Intelligent Educational System through Reinforcement Learning, Applied Intelligence, 31:1, (89-106), Online publication date: 1-Aug-2009.
  871. Joshi S, Kersting K and Khardon R Generalized first order decision diagrams for first order Markov decision processes Proceedings of the 21st International Joint Conference on Artificial Intelligence, (1916-1921)
  872. Sanner S, Goetschalckx R, Driessens K and Shani G Bayesian real-time dynamic programming Proceedings of the 21st International Joint Conference on Artificial Intelligence, (1784-1789)
  873. Ganzfried S and Sandholm T Computing equilibria in multiplayer stochastic games of imperfect information Proceedings of the 21st International Joint Conference on Artificial Intelligence, (140-146)
  874. Zhang H, Parkes D and Chen Y Policy teaching through reward function learning Proceedings of the 10th ACM conference on Electronic commerce, (295-304)
  875. Glazebrook K, Kirkbride C and Ouenniche J (2009). Index Policies for the Admission Control and Routing of Impatient Customers to Heterogeneous Service Stations, Operations Research, 57:4, (975-989), Online publication date: 1-Jul-2009.
  876. Giovanidis A, Wunder G and Bühler J (2009). Optimal control of a single queue with retransmissions, IEEE Transactions on Wireless Communications, 8:7, (3736-3746), Online publication date: 1-Jul-2009.
  877. Jung H and Pedram M (2009). Uncertainty-aware dynamic power management in partially observable domains, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 17:7, (929-942), Online publication date: 1-Jul-2009.
  878. Zhang J and Cao X (2009). Continuous-time Markov decision processes with nth-bias optimality criteria, Automatica (Journal of IFAC), 45:7, (1628-1638), Online publication date: 1-Jul-2009.
  879. Xia L, Chen X and Cao X (2009). Policy iteration for customer-average performance optimization of closed queueing systems, Automatica (Journal of IFAC), 45:7, (1639-1648), Online publication date: 1-Jul-2009.
  880. Ratliff N, Silver D and Bagnell J (2009). Learning to search, Autonomous Robots, 27:1, (25-53), Online publication date: 1-Jul-2009.
  881. Riedmiller M, Gabel T, Hafner R and Lange S (2009). Reinforcement learning for robot soccer, Autonomous Robots, 27:1, (55-73), Online publication date: 1-Jul-2009.
  882. Hu Z and Tham C SI-CCMAC Proceedings of the 7th international conference on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, (131-140)
  883. Kim H and Shin K Optimal admission and eviction control of secondary users at cognitive radio HotSpots Proceedings of the 6th Annual IEEE communications society conference on Sensor, Mesh and Ad Hoc Communications and Networks, (198-206)
  884. Regan K and Boutilier C Regret-based reward elicitation for Markov decision processes Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, (444-451)
  885. Bartlett P and Tewari A REGAL Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, (35-42)
  886. Yuan L, Huayong Z and Lincheng S The Lagrangian relaxation based resources allocation methods for air-to-ground operations under uncertainty circumstances Proceedings of the 21st annual international conference on Chinese control and decision conference, (5645-5650)
  887. Seiffertt J, Mulder S, Dua R and Wunsch D Neural networks and Markov models for the iterated prisoner's dilemma Proceedings of the 2009 international joint conference on Neural Networks, (1544-1550)
  888. Kolter J and Ng A Near-Bayesian exploration in polynomial time Proceedings of the 26th Annual International Conference on Machine Learning, (513-520)
  889. Diuk C, Li L and Leffler B The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning Proceedings of the 26th Annual International Conference on Machine Learning, (249-256)
  890. Bertuccelli L, Bethke B and How J Robust adaptive Markov decision processes in multi-vehicle applications Proceedings of the 2009 conference on American Control Conference, (1304-1309)
  891. Bidot J, Vidal T, Laborie P and Beck J (2009). A theoretic and practical framework for scheduling in a stochastic environment, Journal of Scheduling, 12:3, (315-344), Online publication date: 1-Jun-2009.
  892. Niyato D, Chaisiri S and Sung L Optimal Power Management for Server Farm to Support Green Computing Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, (84-91)
  893. Ponsen M, Taylor M and Tuyls K Abstraction and generalization in reinforcement learning Proceedings of the Second international conference on Adaptive and Learning Agents, (1-32)
  894. Li L, Littman M and Mansley C Online exploration in least-squares policy iteration Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2, (733-739)
  895. Schurr N, Marecki J and Tambe M Improving adjustable autonomy strategies for time-critical domains Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1, (353-360)
  896. Petrik M and Zilberstein S (2009). A bilinear programming approach for multiagent planning, Journal of Artificial Intelligence Research, 35:1, (235-274), Online publication date: 1-May-2009.
  897. Niyato D, Hossain E and Camorlinga S (2009). Remote patient monitoring service using heterogeneous wireless access networks, IEEE Journal on Selected Areas in Communications, 27:4, (412-423), Online publication date: 1-May-2009.
  898. Vermeulen I, Bohte S, Elkhuizen S, Lameris H, Bakker P and Poutré H (2009). Adaptive resource allocation for efficient patient scheduling, Artificial Intelligence in Medicine, 46:1, (67-80), Online publication date: 1-May-2009.
  899. Gayon J, Talay-değirmenci I, Karaesmen F and Örmeci E (2009). Optimal pricing and production policies of a make-to-stock system with fluctuating demand, Probability in the Engineering and Informational Sciences, 23:2, (205-230), Online publication date: 1-Apr-2009.
  900. Neuhäußer M, Stoelinga M and Katoen J Delayed Nondeterminism in Continuous-Time Markov Decision Processes Proceedings of the 12th International Conference on Foundations of Software Science and Computational Structures - Volume 5504, (364-379)
  901. Archibald T, Black D and Glazebrook K (2009). Indexability and Index Heuristics for a Simple Class of Inventory Routing Problems, Operations Research, 57:2, (314-326), Online publication date: 1-Mar-2009.
  902. Wang R and Lau V (2009). Closed-loop cross-layer SDMA designs with outdated CSIT, IEEE Transactions on Wireless Communications, 8:3, (1322-1328), Online publication date: 1-Mar-2009.
  903. Ghasemi N and Dey S (2009). A constrained MDP approach to dynamic quantizer design for HMM state estimation, IEEE Transactions on Signal Processing, 57:3, (1203-1209), Online publication date: 1-Mar-2009.
  904. Glazebrook K and Minty R (2009). A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements, Mathematics of Operations Research, 34:1, (26-44), Online publication date: 1-Feb-2009.
  905. Ni W, Li W and Alam M (2009). Determination of optimal call admission control policy in wireless networks, IEEE Transactions on Wireless Communications, 8:2, (1038-1044), Online publication date: 1-Feb-2009.
  906. Walsh T, Nouri A, Li L and Littman M (2009). Learning and planning in environments with delayed feedback, Autonomous Agents and Multi-Agent Systems, 18:1, (83-105), Online publication date: 1-Feb-2009.
  907. Meuleau N, Benazera E, Brafman R, Hansen E and Mausam (2009). A heuristic search approach to planning with continuous resources in stochastic domains, Journal of Artificial Intelligence Research, 34:1, (27-59), Online publication date: 1-Jan-2009.
  908. Secomandi N and Margot F (2009). Reoptimization Approaches for the Vehicle-Routing Problem with Stochastic Demands, Operations Research, 57:1, (214-230), Online publication date: 1-Jan-2009.
  909. Gayon J, Benjaafar S and de Véricourt F (2009). Using Imperfect Advance Demand Information in Production-Inventory Systems with Multiple Customer Classes, Manufacturing & Service Operations Management, 11:1, (128-143), Online publication date: 1-Jan-2009.
  910. Gupta D and Wang L (2009). A Stochastic Inventory Model with Trade Credit, Manufacturing & Service Operations Management, 11:1, (4-18), Online publication date: 1-Jan-2009.
  911. Foo B and Van Der Schaar M (2009). A rules-based approach for configuring chains of classifiers in real-time stream mining systems, EURASIP Journal on Advances in Signal Processing, 2009, (1-17), Online publication date: 1-Jan-2009.
  912. Iocchi L, Lukasiewicz T, Nardi D and Rosati R (2009). Reasoning about actions with sensing under qualitative and probabilistic uncertainty, ACM Transactions on Computational Logic, 10:1, (1-41), Online publication date: 1-Jan-2009.
  913. Ghate A and Smith R (2009). Characterizing extreme points as basic feasible solutions in infinite linear programs, Operations Research Letters, 37:1, (7-10), Online publication date: 1-Jan-2009.
  914. Xia L, Xie M, Yin W and Dong J Max-min optimality of service rates in queueing systems with customer-average performance criterion Proceedings of the 40th Conference on Winter Simulation, (509-515)
  915. Brut M, Sedes F, Grigoras R and Charvillat V (2008). An ontology-based approach for providing multimedia personalised recommendations, International Journal of Web and Grid Services, 4:3, (314-329), Online publication date: 1-Nov-2008.
  916. Hu Z and Tham C CCMAC Proceedings of the 11th international symposium on Modeling, analysis and simulation of wireless and mobile systems, (60-69)
  917. Beccuti M, Codetta-Raiteri D, Franceschinis G and Haddad S Non deterministic repairable fault trees for computing optimal repair strategy Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, (1-10)
  918. Carmon Y and Shwartz A Eventually-stationary policies for Markov decision models with non-constant discounting Proceedings of the 3rd International Conference on Performance Evaluation Methodologies and Tools, (1-6)
  919. Ziebart B, Maas A, Dey A and Bagnell J Navigate like a cabbie Proceedings of the 10th international conference on Ubiquitous computing, (322-331)
  920. de Saint-Cyr F Scenario update applied to causal reasoning Proceedings of the Eleventh International Conference on Principles of Knowledge Representation and Reasoning, (188-197)
  921. Thon I, Landwehr N and De Raedt L A simple model for sequences of relational state descriptions Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II, (506-521)
  922. Taylor M, Jong N and Stone P Transferring instances for model-based reinforcement learning Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II, (488-505)
  923. Melo F and Lopes M Fitted natural actor-critic Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II, (66-81)
  924. Joshi S and Khardon R Stochastic planning with first order decision diagrams Proceedings of the Eighteenth International Conference on Automated Planning and Scheduling, (156-163)
  925. Jaśkiewicz A (2008). A note on negative dynamic programming for risk-sensitive control, Operations Research Letters, 36:5, (531-534), Online publication date: 1-Sep-2008.
  926. Ciesinski F, Baier C, Größer M and Parker D Generating Compact MTBDD-Representations from Probmela Specifications Proceedings of the 15th international workshop on Model Checking Software, (60-76)
  927. Cai K, Jiang C, Hu H and Bai C (2008). An experimental study of adaptive testing for software reliability assessment, Journal of Systems and Software, 81:8, (1406-1429), Online publication date: 1-Aug-2008.
  928. Asmuth J, Littman M and Zinkov R Potential-based shaping in model-based reinforcement learning Proceedings of the 23rd national conference on Artificial intelligence - Volume 2, (604-609)
  929. Isom J, Meyn S and Braatz R Piecewise linear dynamic programming for constrained POMDPs Proceedings of the 23rd national conference on Artificial intelligence - Volume 1, (291-296)
  930. Zhang H and Parkes D Value-based policy teaching with active indirect elicitation Proceedings of the 23rd national conference on Artificial intelligence - Volume 1, (208-214)
  931. Aumann Y, Hazon N, Kraus S and Sarne D Physical search problems applying economic search models Proceedings of the 23rd national conference on Artificial intelligence - Volume 1, (9-16)
  932. Hermanns H, Wachter B and Zhang L Probabilistic CEGAR Proceedings of the 20th international conference on Computer Aided Verification, (162-175)
  933. Syed U, Bowling M and Schapire R Apprenticeship learning using linear programming Proceedings of the 25th international conference on Machine learning, (1032-1039)
  934. Li L A worst-case comparison between temporal difference and residual gradient with linear function approximation Proceedings of the 25th international conference on Machine learning, (560-567)
  935. Kolter J, Coates A, Ng A, Gu Y and DuHadway C Space-indexed dynamic programming Proceedings of the 25th international conference on Machine learning, (488-495)
  936. Epshteyn A, Vogel A and DeJong G Active reinforcement learning Proceedings of the 25th international conference on Machine learning, (296-303)
  937. Diuk C, Cohen A and Littman M An object-oriented representation for efficient reinforcement learning Proceedings of the 25th international conference on Machine learning, (240-247)
  938. Goetschalckx R, Sanner S and Driessens K Reinforcement Learning with the Use of Costly Features Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence, (779-780)
  939. Jung H, Rong P and Pedram M Stochastic modeling of a thermally-managed multi-core system Proceedings of the 45th annual Design Automation Conference, (728-733)
  940. Munos R and Szepesvári C (2008). Finite-Time Bounds for Fitted Value Iteration, The Journal of Machine Learning Research, 9, (815-857), Online publication date: 1-Jun-2008.
  941. Hui B, Gustafson S, Irani P and Boutilier C The need for an interaction cost model in adaptive interfaces Proceedings of the working conference on Advanced visual interfaces, (458-461)
  942. Witwicki S and Durfee E Commitment-based service coordination Proceedings of the 2008 AAMAS international conference on Service-oriented computing: agents, semantics, and engineering, (134-148)
  943. Babes M, de Cote E and Littman M Social reward shaping in the prisoner's dilemma Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3, (1389-1392)
  944. Liu Y and Koenig S An exact algorithm for solving MDPs under risk-sensitive planning objectives with one-switch utility functions Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1, (453-460)
  945. Jong N, Hester T and Stone P The utility of temporal abstraction in reinforcement learning Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1, (299-306)
  946. Ross S, Pineau J, Paquet S and Chaib-draa B (2008). Online planning algorithms for POMDPs, Journal of Artificial Intelligence Research, 32:1, (663-704), Online publication date: 1-May-2008.
  947. Oliehoek F, Spaan M and Vlassis N (2008). Optimal and approximate Q-value functions for decentralized POMDPs, Journal of Artificial Intelligence Research, 32:1, (289-353), Online publication date: 1-May-2008.
  948. Goldman C and Zilberstein S (2008). Communication-based decomposition mechanisms for decentralized MDPs, Journal of Artificial Intelligence Research, 32:1, (169-202), Online publication date: 1-May-2008.
  949. Wu D and Koutsoukos X (2008). Reachability analysis of uncertain systems using bounded-parameter Markov decision processes, Artificial Intelligence, 172:8-9, (945-954), Online publication date: 1-May-2008.
  950. An X, Xiang Y and Cercone N (2008). Dynamic multiagent probabilistic inference, International Journal of Approximate Reasoning, 48:1, (185-213), Online publication date: 1-Apr-2008.
  951. Falowo O and Chan H (2008). Joint call admission control algorithms, Computer Communications, 31:6, (1200-1217), Online publication date: 1-Apr-2008.
  952. Zhang K, Xu Y, Chen X and Cao X (2008). Policy iteration based feedback control, Automatica (Journal of IFAC), 44:4, (1055-1061), Online publication date: 1-Apr-2008.
  953. Bryce D, Kambhampati S and Smith D (2008). Sequential Monte Carlo in reachability heuristics for probabilistic planning, Artificial Intelligence, 172:6-7, (685-715), Online publication date: 1-Apr-2008.
  954. Baier C, Bertrand N and Größer M On decision problems for probabilistic Büchi automata Proceedings of the Theory and practice of software, 11th international conference on Foundations of software science and computational structures, (287-301)
  955. Jung H and Pedram M Resilient dynamic power management under uncertainty Proceedings of the conference on Design, automation and test in Europe, (224-229)
  956. da Motta Salles Barreto A and Anderson C (2008). Restricted gradient-descent algorithm for value-function approximation in reinforcement learning, Artificial Intelligence, 172:4-5, (454-482), Online publication date: 1-Mar-2008.
  957. Courtemanche F, Najjar M, Paccoud B and Mayers A Assisting elders via dynamic multi-tasks planning Proceedings of the 1st international conference on Ambient media and systems, (1-8)
  958. Jung H and Pedram M A stochastic local hot spot alerting technique Proceedings of the 2008 Asia and South Pacific Design Automation Conference, (468-473)
  959. Veanes M, Campbell C, Grieskamp W, Schulte W, Tillmann N and Nachmanson L Model-based testing of object-oriented reactive systems with spec explorer Formal methods and testing, (39-76)
  960. Wang C, Joshi S and Khardon R (2008). First order decision diagrams for relational MDPs, Journal of Artificial Intelligence Research, 31:1, (431-472), Online publication date: 1-Jan-2008.
  961. Luo Z, Bell D and McCollum B Skill combination for reinforcement learning Proceedings of the 8th international conference on Intelligent data engineering and automated learning, (87-96)
  962. Baier C, Bertrand N and Schnoebelen P (2007). Verifying nondeterministic probabilistic channel systems against ω-regular linear-time properties, ACM Transactions on Computational Logic, 9:1, (5-es), Online publication date: 1-Dec-2007.
  963. Zhang L and Hermanns H Deciding simulations on probabilistic automata Proceedings of the 5th international conference on Automated technology for verification and analysis, (207-222)
  964. Gawlitza T and Seidl H Computing game values for crash games Proceedings of the 5th international conference on Automated technology for verification and analysis, (177-191)
  965. Niño-Mora J Characterization and computation of restless bandit marginal productivity indices Proceedings of the 2nd international conference on Performance evaluation methodologies and tools, (1-10)
  966. Bourenane M, Mellouk A and Benhamamouch D Reinforcement learning in multi-agent environment and ant colony for packet scheduling in routers Proceedings of the 5th ACM international workshop on Mobility management and wireless access, (137-143)
  967. Giro S and D'Argenio P Quantitative model checking revisited Proceedings of the 5th international conference on Formal modeling and analysis of timed systems, (179-194)
  968. Bakhshi R, Bonnet F, Fokkink W and Haverkort B (2007). Formal analysis techniques for gossiping protocols, ACM SIGOPS Operating Systems Review, 41:5, (28-36), Online publication date: 1-Oct-2007.
  969. Wilson N, Grimes D and Freuder E A cost-based model and algorithms for interleaving solving and elicitation of CSPs Proceedings of the 13th international conference on Principles and practice of constraint programming, (666-680)
  970. Walsh T, Nouri A, Li L and Littman M Planning and Learning in Environments with Delayed Feedback Proceedings of the 18th European conference on Machine Learning, (442-453)
  971. Gawlitza T and Seidl H Precise relational invariants through strategy iteration Proceedings of the 21st international conference, and Proceedings of the 16th annual conference on Computer Science Logic, (23-40)
  972. Böhnstedt L, Ferrein A and Lakemeyer G Options in Readylog Reloaded --- Generating Decision-Theoretic Plan Libraries in Golog Proceedings of the 30th annual German conference on Advances in Artificial Intelligence, (352-366)
  973. Paletta L, Fritz G, Kintzler F, Irran J and Dorffner G Perception and Developmental Learning of Affordances in Autonomous Robots Proceedings of the 30th annual German conference on Advances in Artificial Intelligence, (235-250)
  974. Kwiatkowska M Quantitative verification Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering, (449-458)
  975. Neuhäußer M and Katoen J Bisimulation and logical preservation for continuous-time markov decision processes Proceedings of the 18th international conference on Concurrency Theory, (412-427)
  976. Kwiatkowska M Quantitative verification The 6th Joint Meeting on European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering: companion papers, (449-458)
  977. Ang C and Tham C (2007). Analysis and optimization of service availability in a HA cluster with load-dependent machine availability, IEEE Transactions on Parallel and Distributed Systems, 18:9, (1307-1319), Online publication date: 1-Sep-2007.
  978. Hao T, Baoqun Y and Hongsheng X (2007). Error bounds of optimization algorithms for semi-Markov decision processes, International Journal of Systems Science, 38:9, (725-736), Online publication date: 1-Sep-2007.
  979. Van Phan C, Baek K and Kim J Opportunistic transmission for wireless sensor networks under delay constraints Proceedings of the 2007 international conference on Computational science and its applications - Volume Part III, (858-871)
  980. Farbod A and Liang B Optimal admission control policies for heterogeneous wireless networks The Fourth International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness & Workshops, (1-7)
  981. Charvillat V and Grigoraş R (2007). Reinforcement learning for dynamic multimedia adaptation, Journal of Network and Computer Applications, 30:3, (1034-1058), Online publication date: 1-Aug-2007.
  982. Jong N and Stone P Model-based exploration in continuous state spaces Proceedings of the 7th International conference on Abstraction, reformulation, and approximation, (258-272)
  983. Kuter U and Hu J Computing and using lower and upper bounds for action elimination in MDP planning Proceedings of the 7th International conference on Abstraction, reformulation, and approximation, (243-257)
  984. Jurdziński M and Trivedi A Reachability-time games on timed automata Proceedings of the 34th international conference on Automata, Languages and Programming, (838-849)
  985. Hallerstede S and Hoang T Qualitative probabilistic modelling in event-B Proceedings of the 6th international conference on Integrated formal methods, (293-312)
  986. Mihailidis A, Boger J, Canido M and Hoey J (2007). The use of an intelligent prompting system for people with dementia, Interactions, 14:4, (34-37), Online publication date: 1-Jul-2007.
  987. Beccuti M, Franceschinis G and Haddad S Markov decision Petri net and Markov decision well-formed net formalisms Proceedings of the 28th international conference on Applications and theory of Petri nets and other models of concurrency, (43-62)
  988. Pandey S, Chakrabarti D and Agarwal D Multi-armed bandit problems with dependent arms Proceedings of the 24th international conference on Machine learning, (721-728)
  989. Delage E and Mannor S Percentile optimization in uncertain Markov decision processes with application to efficient exploration Proceedings of the 24th international conference on Machine learning, (225-232)
  990. Kumar D, Altman E and Kelif J Globally optimal user-network association in an 802.11 WLAN & 3G UMTS hybrid cell Proceedings of the 20th international teletraffic conference on Managing traffic performance in converged networks, (1173-1187)
  991. Berten V and Gaujal B Grid brokering for batch allocation using indexes Proceedings of the 1st EuroFGI international conference on Network control and optimization, (215-225)
  992. Oliehoek F and Vlassis N Q-value functions for decentralized POMDPs Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  993. Dolgov D, James M and Samples M Combinatorial resource scheduling for multiagent MDPs Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  994. Rabinovich Z, Rosenschein J and Kaminka G Dynamics based control with an application to area-sweeping problems Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  995. Wu J and Durfee E Sequential resource allocation in multiagent systems with uncertainties Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  996. Jong N and Stone P Model-based function approximation in reinforcement learning Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  997. Witwicki S and Durfee E Commitment-driven distributed joint policy search Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  998. Wu J and Durfee E Solving large TÆMS problems efficiently by selective exploration and decomposition Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  999. Taylor M, Whiteson S and Stone P Transfer via inter-task mappings in policy search reinforcement learning Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, (1-8)
  1000. Pralet C, Verfaillie G and Schiex T (2007). An algebraic graphical model for decision with uncertainties, feasibilities, and utilities, Journal of Artificial Intelligence Research, 29:1, (421-489), Online publication date: 1-May-2007.
  1001. Zhenzhen Ye, Abouzeid A and Jing Ai Optimal Policies for Distributed Data Aggregation in Wireless Sensor Networks Proceedings of the IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications, (1676-1684)
  1002. Tsitsiklis J (2007). NP-Hardness of checking the unichain condition in average cost MDPs, Operations Research Letters, 35:3, (319-323), Online publication date: 1-May-2007.
  1003. Jung H and Pedram M Dynamic power management under uncertain information Proceedings of the conference on Design, automation and test in Europe, (1060-1065)
  1004. Riedmiller M and Gabel T On Experiences in a Complex and Competitive Gaming Domain Proceedings of the 2007 IEEE Symposium on Computational Intelligence and Games, (17-23)
  1005. Gawlitza T and Seidl H Precise fixpoint computation through strategy iteration Proceedings of the 16th European Symposium on Programming, (300-315)
  1006. Ca Van Phan and Jeong Geun Kim An Energy-Efficient Transmission Strategy for Wireless Sensor Networks Proceedings of the 2007 IEEE Wireless Communications and Networking Conference, (3406-3411)
  1007. Stevens-Navarro E, Wong V and Yuxia Lin A Vertical Handoff Decision Algorithm for Heterogeneous Wireless Networks Proceedings of the 2007 IEEE Wireless Communications and Networking Conference, (3199-3204)
  1008. Hui Chen, Chan H and Leung V Two Cross-Layer Optimization Methods for Transporting Multimedia Traffic Over Multicode CDMA Networks Proceedings of the 2007 IEEE Wireless Communications and Networking Conference, (288-293)
  1009. Yu O, Saric E and Anfei Li Dynamic Control of Open Spectrum Management Proceedings of the 2007 IEEE Wireless Communications and Networking Conference, (127-132)
  1010. Qianchuan Zhao, Geirhofer S, Lang Tong and Sadler B Optimal Dynamic Spectrum Access via Periodic Channel Sensing Proceedings of the 2007 IEEE Wireless Communications and Networking Conference, (33-37)
  1011. Gimbert H Pure stationary optimal strategies in Markov decision processes Proceedings of the 24th annual conference on Theoretical aspects of computer science, (200-211)
  1012. Fiat A, Mansour Y and Nadav U Efficient contention resolution protocols for selfish agents Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, (179-188)
  1013. Petrik M An analysis of Laplacian methods for value function approximation in MDPs Proceedings of the 20th international joint conference on Artificial intelligence, (2574-2579)
  1014. Meuleau N and Brafman R Hierarchical heuristic forward search in Stochastic domains Proceedings of the 20th international joint conference on Artificial intelligence, (2542-2549)
  1015. Perny P, Spanjaard O and Storme L State space search for risk-averse agents Proceedings of the 20th international joint conference on Artificial intelligence, (2353-2358)
  1016. Farinelli A, Finzi A and Lukasiewicz T Team programming in Golog under partial observability Proceedings of the 20th international joint conference on Artificial intelligence, (2097-2102)
  1017. Petrik M and Zilberstein S Average-reward decentralized Markov decision processes Proceedings of the 20th international joint conference on Artificial intelligence, (1997-2002)
  1018. Dai P and Goldsmith J Topological value iteration algorithm for Markov decision processes Proceedings of the 20th international joint conference on Artificial intelligence, (1860-1865)
  1019. Wang C, Joshi S and Khardon R First order decision diagrams for relational MDPs Proceedings of the 20th international joint conference on Artificial intelligence, (1095-1100)
  1020. György A, Kocsis L, Szabó I and Szepesvári C Continuous time associative bandit problems Proceedings of the 20th international joint conference on Artificial intelligence, (830-835)
  1021. Glazebrook K and Kirkbride C (2007). Dynamic routing to heterogeneous collections of unreliable servers, Queueing Systems: Theory and Applications, 55:1, (9-25), Online publication date: 1-Jan-2007.
  1022. Alexander G and Raja A The Role of Problem Classification in Online Meta-cognition Proceedings of the IEEE/WIC/ACM international conference on Intelligent Agent Technology, (218-225)
  1023. Immorlica N, Jain K and Mahdian M Game-theoretic aspects of designing hyperlink structures Proceedings of the Second international conference on Internet and Network Economics, (150-161)
  1024. Größer M, Norman G, Baier C, Ciesinski F, Kwiatkowska M and Parker D On reduction criteria for probabilistic reward models Proceedings of the 26th international conference on Foundations of Software Technology and Theoretical Computer Science, (309-320)
  1025. Zhao H and Doshi P A hierarchical framework for composing nested web processes Proceedings of the 4th international conference on Service-Oriented Computing, (116-128)
  1026. Porta J, Vlassis N, Spaan M and Poupart P (2006). Point-Based Value Iteration for Continuous POMDPs, The Journal of Machine Learning Research, 7, (2329-2367), Online publication date: 1-Dec-2006.
  1027. Kok J and Vlassis N (2006). Collaborative Multiagent Reinforcement Learning by Payoff Propagation, The Journal of Machine Learning Research, 7, (1789-1828), Online publication date: 1-Dec-2006.
  1028. Brydon M (2006). Economic metaphors for solving intrafirm allocation problems, Decision Support Systems, 42:3, (1657-1672), Online publication date: 1-Dec-2006.
  1029. Poole D and Mackworth A Dimensions of complexity of intelligent agents Proceedings of the 2006 international symposium on Practical cognitive agents and robots, (81-92)
  1030. Jonker J, Piersma N and Potharst R (2006). A decision support system for direct mailing decisions, Decision Support Systems, 42:2, (915-925), Online publication date: 1-Nov-2006.
  1031. Wolovick N and Johr S A characterization of meaningful schedulers for continuous-time markov decision processes Proceedings of the 4th international conference on Formal Modeling and Analysis of Timed Systems, (352-367)
  1032. Proper S and Tadepalli P Scaling model-based average-reward reinforcement learning for product delivery Proceedings of the 17th European conference on Machine Learning, (735-742)
  1033. Hölldobler S, Karabaev E and Skvortsova O (2006). FLUCAP, Journal of Artificial Intelligence Research, 27:1, (419-439), Online publication date: 1-Sep-2006.
  1034. Kveton B, Hauskrecht M and Guestrin C (2006). Solving factored MDPs with hybrid state and action variables, Journal of Artificial Intelligence Research, 27:1, (153-201), Online publication date: 1-Sep-2006.
  1035. Tripathi A and Nair S (2006). Mobile Advertising in Capacitated Wireless Networks, IEEE Transactions on Knowledge and Data Engineering, 18:9, (1284-1296), Online publication date: 1-Sep-2006.
  1036. Bournez O and Garnier F Proving positive almost sure termination under strategies Proceedings of the 17th international conference on Term Rewriting and Applications, (357-371)
  1037. Yeow W, Tham C and Wong W Hard constrained semi-Markov decision processes Proceedings of the 21st national conference on Artificial intelligence - Volume 1, (549-554)
  1038. Liu Y and Koenig S Functional value iteration for decision-theoretic planning with general utility functions Proceedings of the 21st national conference on Artificial intelligence - Volume 2, (1186-1186)
  1039. Kveton B and Hauskrecht M Learning basis functions in hybrid domains Proceedings of the 21st national conference on Artificial intelligence - Volume 2, (1161-1166)
  1040. Ferns N, Castro P, Precup D and Panangaden P Methods for computing state similarity in Markov decision processes Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence, (174-181)
  1041. Taylor M, Whiteson S and Stone P Comparing evolutionary and temporal difference methods in a reinforcement learning domain Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1321-1328)
  1042. Oliehoek F, de Jong E and Vlassis N The parallel Nash Memory for asymmetric games Proceedings of the 8th annual conference on Genetic and evolutionary computation, (337-344)
  1043. Ratliff N, Bagnell J and Zinkevich M Maximum margin planning Proceedings of the 23rd international conference on Machine learning, (729-736)
  1044. Maggioni M and Mahadevan S Fast direct policy evaluation using multiscale analysis of Markov diffusion processes Proceedings of the 23rd international conference on Machine learning, (601-608)
  1045. Finzi A and Lukasiewicz T Adaptive multi-agent programming in GTGolog Proceedings of the 29th annual German conference on Artificial intelligence, (389-403)
  1046. Finzi A and Lukasiewicz T Game-theoretic agent programming in Golog under partial observability Proceedings of the 29th annual German conference on Artificial intelligence, (113-127)
  1047. Baier C and Wolf V Stochastic reasoning about channel-based component connectors Proceedings of the 8th international conference on Coordination Models and Languages, (1-15)
  1048. Liu Y and Koenig S Probabilistic planning with nonlinear utility functions Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, (410-413)
  1049. Meuleau N, Brafman R and Benazera E Stochastic over-subscription planning using hierarchies of MDPs Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, (121-130)
  1050. Kveton B and Hauskrecht M Solving factored MDPs with exponential-family transition models Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, (114-120)
  1051. Bonet B Bounded branching and modalities in non-deterministic planning Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, (42-51)
  1052. Paletta L and Fritz G Reinforcement learning of predictive features in affordance perception Proceedings of the 2006 international conference on Towards affordance-based robot control, (77-90)
  1053. Van Hentenryck P, Bent R and Vergados Y Online stochastic reservation systems Proceedings of the Third international conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, (212-227)
  1054. Peyrard N and Sabbadin R Mean Field Approximation of the Policy Iteration Algorithm for Graph-based Markov Decision Processes Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy, (595-599)
  1055. Forsell N and Sabbadin R Approximate linear-programming algorithms for graph-based Markov decision processes Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy, (590-594)
  1056. Pralet C, Verfaillie G and Schiex T Decision with uncertainties, feasibilities, and utilities Proceedings of the 2006 conference on ECAI 2006: 17th European Conference on Artificial Intelligence August 29 -- September 1, 2006, Riva del Garda, Italy, (427-431)
  1057. Simari G and Parsons S On the relationship between MDPs and the BDI architecture Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, (1041-1048)
  1058. Dolgov D and Durfee E Resource allocation among agents with preferences induced by factored MDPs Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, (297-304)
  1059. Agha G, Meseguer J and Sen K (2006). PMaude, Electronic Notes in Theoretical Computer Science (ENTCS), 153:2, (213-239), Online publication date: 1-May-2006.
  1060. Baier C, D'Argenio P and Groesser M (2006). Partial Order Reduction for Probabilistic Branching Time, Electronic Notes in Theoretical Computer Science (ENTCS), 153:2, (97-116), Online publication date: 1-May-2006.
  1061. Wu D and Koutsoukos X Probabilistic verification of uncertain systems using bounded-parameter Markov decision processes Proceedings of the Third international conference on Modeling Decisions for Artificial Intelligence, (283-294)
  1062. Danos V, Desharnais J, Laviolette F and Panangaden P (2006). Bisimulation and cocongruence for probabilistic systems, Information and Computation, 204:4, (503-523), Online publication date: 1-Apr-2006.
  1063. Koutsoukos X and Riley D Computational methods for reachability analysis of stochastic hybrid systems Proceedings of the 9th international conference on Hybrid Systems: computation and control, (377-391)
  1064. Sen K, Viswanathan M and Agha G Model-Checking Markov chains in the presence of uncertainties Proceedings of the 12th international conference on Tools and Algorithms for the Construction and Analysis of Systems, (394-410)
  1065. Fei Y, Wong V and Leung V (2006). Efficient QoS provisioning for adaptive multimedia in mobile communication networks by reinforcement learning, Mobile Networks and Applications, 11:1, (101-110), Online publication date: 1-Feb-2006.
  1066. Dix J, Kraus S and Subrahmanian V (2006). Heterogeneous temporal probabilistic agents, ACM Transactions on Computational Logic, 7:1, (151-198), Online publication date: 1-Jan-2006.
  1067. Brázdil T and Kučera A Computing the expected accumulated reward and gain for a subclass of infinite Markov chains Proceedings of the 25th international conference on Foundations of Software Technology and Theoretical Computer Science, (372-383)
  1068. El Falou S and Bourdon F Mobile agent migration Proceedings of the 18th Australian Joint conference on Advances in Artificial Intelligence, (1204-1208)
  1069. Bagherjeiran A, Eick C, Chen C and Vilalta R Adaptive Clustering Proceedings of the Fifth IEEE International Conference on Data Mining, (565-568)
  1070. Baier C, Hermanns H, Katoen J and Haverkort B (2005). Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes, Theoretical Computer Science, 345:1, (2-26), Online publication date: 21-Nov-2005.
  1071. de Véricourt F and Zhou Y (2005). Managing Response Time in a Call-Routing Problem with Service Failure, Operations Research, 53:6, (968-981), Online publication date: 1-Nov-2005.
  1072. Groesser M and Baier C Partial order reduction for Markov decision processes Proceedings of the 4th international conference on Formal Methods for Components and Objects, (408-427)
  1073. Gimenez-Guzman J, Martinez-Bauset J and Pla V Performance bounds for mobile cellular networks with handover prediction Proceedings of the 8th international conference on Management of Multimedia Networks and Services, (35-46)
  1074. Hao T, Hongsheng X and Baoqun Y (2005). The optimal robust control policy for uncertain semi-Markov control processes, International Journal of Systems Science, 36:13, (791-800), Online publication date: 20-Oct-2005.
  1075. Szer D and Charpillet F An optimal best-first search algorithm for solving infinite horizon DEC-POMDPs Proceedings of the 16th European conference on Machine Learning, (389-399)
  1076. Finzi A and Lukasiewicz T Game-theoretic reasoning about actions in nonmonotonic causal theories Proceedings of the 8th international conference on Logic Programming and Nonmonotonic Reasoning, (185-197)
  1077. van Hee K, Serebrenik A, Sidorova N, Voorhoeve M and van der Wal J The price of coordination in resource management Proceedings of the 3rd international conference on Business Process Management, (96-108)
  1078. Nilim A and El Ghaoui L (2005). Robust Control of Markov Decision Processes with Uncertain Transition Matrices, Operations Research, 53:5, (780-798), Online publication date: 1-Sep-2005.
  1079. Strehl A and Littman M A theoretical analysis of Model-Based Interval Estimation Proceedings of the 22nd international conference on Machine learning, (856-863)
  1080. Perny P, Spanjaard O and Weng P Algebraic Markov decision processes Proceedings of the 19th international joint conference on Artificial intelligence, (1372-1377)
  1081. Kveton B and Hauskrecht M An MCMC approach to solving hybrid factored MDPs Proceedings of the 19th international joint conference on Artificial intelligence, (1346-1351)
  1082. Schaffer S, Clement B and Chien S Probabilistic reasoning for plan robustness Proceedings of the 19th international joint conference on Artificial intelligence, (1266-1271)
  1083. Rintanen J Conditional planning in the discrete belief space Proceedings of the 19th international joint conference on Artificial intelligence, (1260-1265)
  1084. Wilkinson D, Bowling M and Ghodsi A Learning subjective representations for planning Proceedings of the 19th international joint conference on Artificial intelligence, (889-894)
  1085. Dolgov D and Durfee E Computationally-efficient combinatorial auctions for resource allocation in weakly-coupled MDPs Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems, (657-664)
  1086. Wu J and Durfee E Automated resource-driven mission phasing techniques for constrained agents Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems, (331-338)
  1087. Taylor M and Stone P Behavior transfer for value-function-based reinforcement learning Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems, (53-59)
  1088. Kurano M, Yasuda M, Nakagami J and Yoshida Y Perceptive evaluation for the optimal discounted reward in Markov decision processes Proceedings of the Second international conference on Modeling Decisions for Artificial Intelligence, (283-293)
  1089. Giménez-Guzmán J, Martínez-Bauset J and Pla V An afterstates reinforcement learning approach to optimize admission control in mobile cellular networks Proceedings of the Second international conference on Wireless Systems and Network Architectures in Next Generation Internet, (115-129)
  1090. Blass A, Gurevich Y, Nachmanson L and Veanes M Play to test Proceedings of the 5th international conference on Formal Approaches to Software Testing, (32-46)
  1091. Younes H Planning and execution with phase transitions Proceedings of the 20th national conference on Artificial intelligence - Volume 2, (1030-1035)
  1092. Munos R Error bounds for approximate value iteration Proceedings of the 20th national conference on Artificial intelligence - Volume 2, (1006-1011)
  1093. Liu Y and Koenig S Risk-sensitive planning with one-switch utility functions Proceedings of the 20th national conference on Artificial intelligence - Volume 2, (993-999)
  1094. Taylor M, Stone P and Liu Y Value functions for RL-based behavior transfer Proceedings of the 20th national conference on Artificial intelligence - Volume 2, (880-885)
  1095. Costan A, Gaubert S, Goubault E, Martel M and Putot S A policy iteration algorithm for computing fixed points in static analysis of programs Proceedings of the 17th international conference on Computer Aided Verification, (462-475)
  1096. Younes H, Littman M, Weissman D and Asmuth J (2005). The first probabilistic track of the international planning competition, Journal of Artificial Intelligence Research, 24:1, (851-887), Online publication date: 1-Jul-2005.
  1097. Bayer-Zubek V and Dietterich T (2005). Integrating learning from examples into the search for diagnostic policies, Journal of Artificial Intelligence Research, 24:1, (263-303), Online publication date: 1-Jul-2005.
  1098. Spaan M and Vlassis N (2005). Perseus, Journal of Artificial Intelligence Research, 24:1, (195-220), Online publication date: 1-Jul-2005.
  1099. Verfaillie G and Jussien N (2005). Constraint Solving in Uncertain and Dynamic Environments, Constraints, 10:3, (253-281), Online publication date: 1-Jul-2005.
  1100. Bhagwan R, Douglis F, Hildrum K, Kephart J and Walsh W Time-varying management of data storage Proceedings of the First conference on Hot topics in system dependability, (14-14)
  1101. Paletta L, Fritz G and Seifert C Perception-Action based object detection from local descriptor combination and reinforcement learning Proceedings of the 14th Scandinavian conference on Image Analysis, (639-648)
  1102. Jacobs S, Ferrein A and Lakemeyer G Controlling Unreal Tournament 2004 bots with the logic-based action language GOLOG Proceedings of the First AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, (151-152)
  1103. Kersting K An Inductive Logic Programming Approach to Statistical Relational Learning Proceedings of the 2005 conference on An Inductive Logic Programming Approach to Statistical Relational Learning, (1-228)
  1104. Bournez O and Garnier F Proving positive almost-sure termination Proceedings of the 16th international conference on Term Rewriting and Applications, (323-337)
  1105. Ren Z, Krogh B and Marculescu R (2005). Hierarchical Adaptive Dynamic Power Management, IEEE Transactions on Computers, 54:4, (409-420), Online publication date: 1-Apr-2005.
  1106. Tisgaonkar S, Hung C and Bing B Next generation wireless systems using Markov decision process model Proceedings of the 43rd annual ACM Southeast Conference - Volume 2, (94-95)
  1107. Tisgaonkar S, Hung C and Bing B Intelligent handoff management with interference control for next generation wireless systems Proceedings of the 43rd annual ACM Southeast Conference - Volume 2, (1-6)
  1108. Diaz G, Larsen K, Pardo J, Cuartero F and Valero V An approach to handle real time and probabilistic behaviors in e-commerce Proceedings of the 2005 ACM symposium on Applied computing, (815-820)
  1109. Mariano-Romero C, Alcocer-Yamanaka V and Morales E Multiobjective water pinch analysis of the Cuernavaca city water distribution network Proceedings of the Third international conference on Evolutionary Multi-Criterion Optimization, (870-884)
  1110. Jansen D and Hermanns H (2005). QoS modelling and analysis with UML-statecharts, ACM SIGMETRICS Performance Evaluation Review, 32:4, (28-33), Online publication date: 1-Mar-2005.
  1111. Baier C, Ciesinski F and Größer M (2005). ProbMela and verification of Markov decision processes, ACM SIGMETRICS Performance Evaluation Review, 32:4, (22-27), Online publication date: 1-Mar-2005.
  1112. Zhang W and Zhang N (2005). Restricted value iteration, Journal of Artificial Intelligence Research, 23:1, (123-165), Online publication date: 1-Jan-2005.
  1113. Greensmith E, Bartlett P and Baxter J (2004). Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning, The Journal of Machine Learning Research, 5, (1471-1530), Online publication date: 1-Dec-2004.
  1114. Even-Dar E and Mansour Y (2004). Learning Rates for Q-learning, The Journal of Machine Learning Research, 5, (1-25), Online publication date: 1-Dec-2004.
  1115. del Angel G and Fine T (2004). Optimal power and retransmission control policies for random access systems, IEEE/ACM Transactions on Networking, 12:6, (1156-1166), Online publication date: 1-Dec-2004.
  1116. Li Q and Liu L (2004). An Algorithmic Approach for Sensitivity Analysis of Perturbed Quasi-Birth-and-Death Processes, Queueing Systems: Theory and Applications, 48:3-4, (365-397), Online publication date: 1-Nov-2004.
  1117. Joshi K, Hiltunen M, Schlichting R, Sanders W and Agbaria A Online model-based adaptation for optimizing performance and dependability Proceedings of the 1st ACM SIGSOFT workshop on Self-managed systems, (85-89)
  1118. Yu F, Wong V and Leung V Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning Proceedings of the First International Conference on Broadband Networks, (579-588)
  1119. Li C and Pyeatt L A short tutorial on reinforcement learning Intelligent information processing II, (509-513)
  1120. Parker R and Kapuscinski R (2004). Optimal Policies for a Capacitated Two-Echelon Inventory System, Operations Research, 52:5, (739-755), Online publication date: 1-Oct-2004.
  1121. Bäuerle N, Engelhardt-Funke O and Kolonko M (2004). Routing Of Airplanes To Two Runways: Monotonicity Of Optimal Controls, Probability in the Engineering and Informational Sciences, 18:4, (533-560), Online publication date: 1-Oct-2004.
  1122. Adelman D (2004). A Price-Directed Approach to Stochastic Inventory/Routing, Operations Research, 52:4, (499-514), Online publication date: 1-Aug-2004.
  1123. Pynadath D and Marsella S Fitting and Compilation of Multiagent Models through Piecewise Linear Functions Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3, (1197-1204)
  1124. Fischer F, Rovatsos M and Weiss G Hierarchical Reinforcement Learning in Communication-Mediated Multiagent Coordination Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3, (1334-1335)
  1125. Dolgov D and Durfee E Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2, (956-963)
  1126. Psounis K, Zhu A, Prabhakar B and Motwani R (2004). Modeling correlations in web traces and implications for designing replacement policies, Computer Networks: The International Journal of Computer and Telecommunications Networking, 45:4, (379-398), Online publication date: 15-Jul-2004.
  1127. Nachmanson L, Veanes M, Schulte W, Tillmann N and Grieskamp W Optimal strategies for testing nondeterministic systems Proceedings of the 2004 ACM SIGSOFT international symposium on Software testing and analysis, (55-64)
  1128. Yu H and Bertsekas D Discretized approximations for POMDP with average cost Proceedings of the 20th conference on Uncertainty in artificial intelligence, (619-627)
  1129. Ferns N, Panangaden P and Precup D Metrics for finite Markov decision processes Proceedings of the 20th conference on Uncertainty in artificial intelligence, (162-169)
  1130. Wingate D and Seppi K P3VI Proceedings of the twenty-first international conference on Machine learning
  1131. Nachmanson L, Veanes M, Schulte W, Tillmann N and Grieskamp W (2004). Optimal strategies for testing nondeterministic systems, ACM SIGSOFT Software Engineering Notes, 29:4, (55-64), Online publication date: 1-Jul-2004.
  1132. Yoon S and Lewis M (2004). Optimal Pricing and Admission Control in a Queueing System with Periodically Varying Parameters, Queueing Systems: Theory and Applications, 47:3, (177-199), Online publication date: 1-Jul-2004.
  1133. Chang H (2004). Technical Note, Journal of Optimization Theory and Applications, 122:1, (207-217), Online publication date: 1-Jul-2004.
  1134. López M, Barea R, Bergasa L and Escudero M (2004). A Human–Robot Cooperative Learning System for Easy Installation of Assistant Robots in New Working Environments, Journal of Intelligent and Robotic Systems, 40:3, (233-265), Online publication date: 1-Jul-2004.
  1135. Chang H, Givan R and Chong E (2004). Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes, Discrete Event Dynamic Systems, 14:3, (309-341), Online publication date: 1-Jul-2004.
  1136. Dimuro G and Costa A Interval-based Markov decision processes for regulating interactions between two agents in multi-agent systems Proceedings of the 7th international conference on Applied Parallel Computing: state of the Art in Scientific Computing, (102-111)
  1137. Savagaonkar U, Chong E and Givan R (2004). Online pricing for bandwidth provisioning in multi-class networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 44:6, (835-853), Online publication date: 22-Apr-2004.
  1138. Karabudak D, Hung C and Bing B A call admission control scheme using genetic algorithms Proceedings of the 2004 ACM symposium on Applied computing, (1151-1158)
  1139. Rykov V and Efrosinin D (2004). Optimal Control of Queueing Systems with Heterogeneous Servers, Queueing Systems: Theory and Applications, 46:3/4, (389-407), Online publication date: 1-Mar-2004.
  1140. Wüst C and Verhaegh W (2004). Quality Control for Scalable Media Processing Applications, Journal of Scheduling, 7:2, (105-117), Online publication date: 1-Mar-2004.
  1141. Baier C, Hermanns H and Katoen J (2004). Probabilistic weak simulation is decidable in polynomial time, Information Processing Letters, 89:3, (123-130), Online publication date: 14-Feb-2004.
  1142. Chatterjee K, Jurdziński M and Henzinger T Quantitative stochastic parity games Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, (121-130)
  1143. Liberatore P (2004). On polynomial sized MDP succinct policies, Journal of Artificial Intelligence Research, 21:1, (551-577), Online publication date: 1-Jan-2004.
  1144. PROBMELA Proceedings of the Second ACM/IEEE International Conference on Formal Methods and Models for Co-Design, (57-66)
  1145. Glazebrook K, Lumley R and Ansell P (2003). Index Heuristics for Multiclass M/G/1 Systems with Nonpreemptive Service and Convex Holding Costs, Queueing Systems: Theory and Applications, 45:2, (81-111), Online publication date: 2-Oct-2003.
  1146. Munos R Error bounds for approximate policy iteration Proceedings of the Twentieth International Conference on Machine Learning, (560-567)
  1147. McMahan H, Gordon G and Blum A Planning in the presence of cost functions controlled by an adversary Proceedings of the Twentieth International Conference on Machine Learning, (536-543)
  1148. Ferrein A, Fritz C and Lakemeyer G Extending DTGOLOG with options Proceedings of the 18th international joint conference on Artificial intelligence, (1394-1395)
  1149. Bonet B and Geffner H Faster heuristic search algorithms for planning with uncertainty and full feedback Proceedings of the 18th international joint conference on Artificial intelligence, (1233-1238)
  1150. Manandhar S, Tarim A and Walsh T Scenario-based stochastic constraint programming Proceedings of the 18th international joint conference on Artificial intelligence, (257-262)
  1151. Price B and Boutilier C (2003). Accelerating reinforcement learning through implicit imitation, Journal of Artificial Intelligence Research, 19:1, (569-629), Online publication date: 1-Jul-2003.
  1152. Guestrin C, Koller D, Parr R and Venkataraman S (2003). Efficient solution algorithms for factored MDPs, Journal of Artificial Intelligence Research, 19:1, (399-468), Online publication date: 1-Jul-2003.
  1153. Shumsky R and Pinker E (2003). Gatekeepers and Referrals in Services, Management Science, 49:7, (839-856), Online publication date: 1-Jul-2003.
  1154. Kim K and Dean T (2003). Solving factored MDPs using non-homogeneous partitions, Artificial Intelligence, 147:1-2, (225-251), Online publication date: 1-Jul-2003.
  1155. Kim S and Chang H Parallelizing parallel rollout algorithm for solving Markov decision processes Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming, (122-136)
  1156. Monniaux D Abstract interpretation of programs as Markov decision processes Proceedings of the 10th international conference on Static analysis, (237-254)
  1157. Bonet B and Geffner H Labeled RTDP Proceedings of the Thirteenth International Conference on Automated Planning and Scheduling, (12-21)
  1158. Bournez O and Hoyrup M Rewriting logic and probabilities Proceedings of the 14th international conference on Rewriting techniques and applications, (61-75)
  1159. Uther W and Veloso M TTree Adaptive agents and multi-agent systems, (260-290)
  1160. Cao X (2003). From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning, Discrete Event Dynamic Systems, 13:1-2, (9-39), Online publication date: 1-Jan-2003.
  1161. Grigoras R, Charvillat V and Douze M Optimizing hypervideo navigation using a Markov decision process approach Proceedings of the tenth ACM international conference on Multimedia, (39-48)
  1162. Munos R and Moore A (2002). Variable Resolution Discretization in Optimal Control, Machine Learning, 49:2-3, (291-323), Online publication date: 1-Nov-2002.
  1163. Kearns M and Singh S (2002). Near-Optimal Reinforcement Learning in Polynomial Time, Machine Learning, 49:2-3, (209-232), Online publication date: 1-Nov-2002.
  1164. Ormoneit D and Sen Ś (2002). Kernel-Based Reinforcement Learning, Machine Learning, 49:2-3, (161-178), Online publication date: 1-Nov-2002.
  1165. Debouk R, Lafortune S and Teneketzis D (2002). On an Optimization Problem in Sensor Selection, Discrete Event Dynamic Systems, 12:4, (417-445), Online publication date: 1-Oct-2002.
  1166. Parsons S and Wooldridge M (2002). Game Theory and Decision Theory in Multi-Agent Systems, Autonomous Agents and Multi-Agent Systems, 5:3, (243-254), Online publication date: 1-Sep-2002.
  1167. Meuleau N and Smith D Optimal limited contingency planning Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence, (417-426)
  1168. Lizotte D, Madani O and Greiner R Budgeted learning of naive-Bayes classifiers Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence, (378-385)
  1169. Yoon S, Fern A and Givan R Inductive policy selection for first-order MDPs Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence, (568-576)
  1170. Guestrin C and Gordon G Distributed planning in hierarchical factored MDPs Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence, (197-206)
  1171. Bresina J, Dearden R, Meuleau N, Ramakrishnan S, Smith D and Washington R Planning under continuous time and resource uncertainty Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence, (77-84)
  1172. Poupart P, Boutilier C, Patrascu R and Schuurmans D Piecewise linear value function approximation for factored MDPs Eighteenth national conference on Artificial intelligence, (292-299)
  1173. Patrascu R, Poupart P, Schuurmans D, Boutilier C and Guestrin C Greedy linear value-approximation for factored Markov decision processes Eighteenth national conference on Artificial intelligence, (285-291)
  1174. Lane T and Kaelbling L Nearly deterministic abstractions of Markov decision processes Eighteenth national conference on Artificial intelligence, (260-266)
  1175. Broder A and Mitzenmacher M Optimal plans for aggregation Proceedings of the twenty-first annual symposium on Principles of distributed computing, (144-152)
  1176. Das S, Grosz B and Pfeffer A Learning and decision Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 3, (1121-1128)
  1177. Xuan P and Lesser V Multi-agent policies Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 3, (1098-1105)
  1178. Suematsu N and Hayashi A A multiagent reinforcement learning algorithm using extended optimal response Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 1, (370-377)
  1179. Jha S, Sheyner O and Wing J Two Formal Analyses of Attack Graphs Proceedings of the 15th IEEE workshop on Computer Security Foundations
  1180. Anjali T, Scoglio C, de Oliveira J, Akyildiz I and Uhl G (2002). Optimal policy for label switched path setup in MPLS Networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 39:2, (165-183), Online publication date: 5-Jun-2002.
  1181. Daw N, Kakade S and Dayan P (2002). Opponent interactions between serotonin and dopamine, Neural Networks, 15:4, (603-616), Online publication date: 1-Jun-2002.
  1182. Kenyon C and Mitzenmacher M (2002). Linear waste of best fit bin packing on skewed distributions, Random Structures & Algorithms, 20:3, (441-464), Online publication date: 1-May-2002.
  1183. Kalyanasundaram S, Chong E and Shroff N (2002). Optimal resource allocation in multi-class networks with user-specified utility functions, Computer Networks: The International Journal of Computer and Telecommunications Networking, 38:5, (613-630), Online publication date: 5-Apr-2002.
  1184. Jouffe L Reinforcement learning for fuzzy agents New learning paradigms in soft computing, (181-230)
  1185. Hermanns H (2002). Interactive Markov chains, 10.5555/1744274, Online publication date: 1-Jan-2002.
  1186. Boutilier C Planning and programming with first-order Markov decision processes Proceedings of the 8th conference on Theoretical aspects of rationality and knowledge, (99-110)
  1187. Choi S and Liu J A dynamic mechanism for time-constrained trading Proceedings of the fifth international conference on Autonomous agents, (568-575)
  1188. Minut S and Mahadevan S A reinforcement learning model of selective visual attention Proceedings of the fifth international conference on Autonomous agents, (457-464)
  1189. Bonet B and Geffner H (2001). Planning and Control in Artificial Intelligence, Applied Intelligence, 14:3, (237-252), Online publication date: 9-May-2001.
  1190. Boyan J and Mitzenmacher M Improved results for route planning in stochastic transportation Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, (895-902)
  1191. Munos R (2000). A Study of Reinforcement Learning in the Continuous Case by the Means of Viscosity Solutions, Machine Learning, 40:3, (265-299), Online publication date: 1-Sep-2000.
  1192. Simunic T, Benini L, Glynn P and De Micheli G Dynamic power management for portable systems Proceedings of the 6th annual international conference on Mobile computing and networking, (11-19)
  1193. Haas Z, Halpern J, Li L and Wicker S A decision-theoretic approach to resource allocation in wireless multimedia networks Proceedings of the 4th international workshop on Discrete algorithms and methods for mobile computing and communications, (86-95)
  1194. Peshkin L, Kim K, Meuleau N and Kaelbling L Learning to cooperate via policy search Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence, (489-496)
  1195. Pendrith M Distributed reinforcement learning for a traffic engineering application Proceedings of the fourth international conference on Autonomous agents, (404-411)
  1196. Tambe M and Zhang W (2000). Towards Flexible Teamwork in Persistent Teams, Autonomous Agents and Multi-Agent Systems, 3:2, (159-183), Online publication date: 1-Jun-2000.
  1197. Benini L and Micheli G (2000). System-level power optimization, ACM Transactions on Design Automation of Electronic Systems, 5:2, (115-192), Online publication date: 1-Apr-2000.
  1198. Singh S, Jaakkola T, Littman M and Szepesvári C (2000). Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms, Machine Learning, 38:3, (287-308), Online publication date: 1-Mar-2000.
  1199. Zobel C and Scherer W SMG Proceedings of the 31st conference on Winter simulation: Simulation---a bridge to the future - Volume 1, (569-572)
  1200. Laroche P, Charpillet F and Schott R Mobile Robotics Planning Using Abstract Markov Decision Processes Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence
  1201. Szepesvári C and Littman M (1999). A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms, Neural Computation, 11:8, (2017-2060), Online publication date: 1-Nov-1999.
  1202. Hauskrecht M, Pandurangan G and Upfal E Computing near optimal strategies for stochastic investment planning problems Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2, (1310-1315)
  1203. Boutilier C Sequential optimality and coordination in multiagent systems Proceedings of the 16th international joint conference on Artificial intelligence - Volume 1, (478-485)
  1204. Sabbadin R A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, (567-574)
  1205. Meuleau N, Peshkin L, Kim K and Kaelbling L Learning finite-state controllers for partially observable environments Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, (427-436)
  1206. Meuleau N, Kim K, Kaelbling L and Cassandra A Solving POMDPs by searching the space of finite policies Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, (417-426)
  1207. Hoey J, St-Aubin R, Hu A and Boutilier C SPUDD Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, (279-288)
  1208. Boutilier C, Goldszmidt M and Sabata B Continuous value function approximation for sequential bidding policies Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence, (81-90)
  1209. Ho C and Lea C (1999). Improving call admission policies in wireless networks, Wireless Networks, 5:4, (257-265), Online publication date: 1-Jul-1999.
  1210. Aviv Y and Federgruen A (1999). The value iteration method for countable state Markov decision processes, Operations Research Letters, 24:5, (223-234), Online publication date: 1-Jun-1999.
  1211. Meuleau N and Bourgine P (1999). Exploration of Multi-State Environments, Machine Learning, 35:2, (117-154), Online publication date: 1-May-1999.
  1212. Boutilier C Knowledge representation for stochastic decision processes Artificial intelligence today, (111-152)
  1213. Blythe J An overview of planning under uncertainty Artificial intelligence today, (85-110)
  1214. Hauskrecht M, Meuleau N, Kaelbling L, Dean T and Boutilier C Hierarchical solution of Markov decision processes using macro-actions Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence, (220-229)
  1215. Cao X (1998). The Relations Among Potentials, Perturbation Analysis, and Markov Decision Processes, Discrete Event Dynamic Systems, 8:1, (71-87), Online publication date: 1-Mar-1998.
  1216. Horvitz E and Seiver A Time-critical action Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, (250-257)
  1217. Boutilier C Correlated action effects in decision theoretic regression Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, (30-37)
  1218. Bacchus F, Boutilier C and Grove A Structured solution methods for non-Markovian decision processes Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence, (112-117)
  1219. Barbuceanu M and Fox M Integrating communicative action, conversations and decision theory to coordinate agents Proceedings of the first international conference on Autonomous agents, (49-58)
  1220. Bacchus F, Boutilier C and Grove A Rewarding behaviors Proceedings of the thirteenth national conference on Artificial intelligence - Volume 2, (1160-1167)
  1221. Boutilier C Learning conventions in multiagent stochastic domains using likelihood estimates Proceedings of the Twelfth international conference on Uncertainty in artificial intelligence, (106-114)
  1222. Boutilier C Planning, learning and coordination in multiagent decision processes Proceedings of the 6th conference on Theoretical aspects of rationality and knowledge, (195-210)
  1223. Hahn E, Perez M, Schewe S, Somenzi F, Trivedi A and Wojtczak D Multi-Objective Omega-Regular Reinforcement Learning, Formal Aspects of Computing, 0:0
  1224. Tong J, Shi D, Liu Y and Fan W GLDAP: Global Dynamic Action Persistence Adaptation for Deep Reinforcement Learning, ACM Transactions on Autonomous and Adaptive Systems, 0:0
  1225. Singh R, Miller T, Lyons H, Sonenberg L, Velloso E, Vetere F, Howe P and Dourish P Directive Explanations for Actionable Explainability in Machine Learning Applications, ACM Transactions on Interactive Intelligent Systems, 0:0
  1226. Malviya S, Kumar P, Namasudra S and Tiwary U Experience Replay-based Deep Reinforcement Learning for Dialogue Management Optimisation, ACM Transactions on Asian and Low-Resource Language Information Processing, 0:0
  1227. Li Y and Jiang F A Gradient Learning Optimization for Dynamic Power Management 2015 IEEE International Conference on Systems, Man, and Cybernetics, (2061-2066)
  1228. Meng Y, Munroe C, Wu Y and Begum M A learning from demonstration framework to promote home-based neuromotor rehabilitation 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), (1126-1131)
  1229. Li H and Zheng Z Optimal Timing of Moving Target Defense: A Stackelberg Game Model MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM), (1-6)
  1230. Larrañaga M, Assaad M, Destounis A and Paschos G Dynamic pilot allocation over Markovian fading channels: A restless bandit approach 2016 IEEE Information Theory Workshop (ITW), (290-294)
  1231. Shaviv D, Özgür A and Permuter H Capacity of remotely powered communication 2016 IEEE International Symposium on Information Theory (ISIT), (1979-1983)
  1232. Stahlbuhk T, Shrader B and Modiano E Throughput maximization in uncooperative spectrum sharing networks 2016 IEEE International Symposium on Information Theory (ISIT), (1242-1246)
  1233. Geske J and Green R Optimal storage investment and management under uncertainty 2016 IEEE 8th International Power Electronics and Motion Control Conference (IPEMC-ECCE Asia), (524-529)
  1234. Deng L, Wang C, Chen M and Zhao S Timely wireless flows with arbitrary traffic patterns: Capacity region and scheduling algorithms IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, (1-9)
  1235. Hanawal M, Nguyen D and Krunz M Jamming attack on in-band full-duplex communications: Detection and countermeasures IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications, (1-9)
  1236. Karasev V, Ayvaci A, Heisele B and Soatto S Intent-aware long-term prediction of pedestrian motion 2016 IEEE International Conference on Robotics and Automation (ICRA), (2543-2549)
  1237. Pineda L, Takahashi T, Jung H, Zilberstein S and Grupen R Continual planning for search and rescue robots 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), (243-248)
  1238. Nan Z, Jia Y, Chen Z and Liang L Reinforcement-Learning-Based Optimization for Content Delivery Policy in Cache-Enabled HetNets 2019 IEEE Global Communications Conference (GLOBECOM), (1-6)
  1239. Jungmann A and Kleinjohann B A holistic and adaptive approach for automated prototyping of image processing functionality 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA), (1-8)
Contributors
  • UBC Sauder School of Business

Recommendations