Ronald Parr

Professor of Computer Science
Office: D209 LSRC
Department of Computer Science
Duke University
LSRC / Box 90129
Durham, NC 27708
phone: (919) 660-6537
fax: (919) 660-6519
email: parr-at-cs.duke.edu

Are you interested in studying or working in my research group? Before you send me an email that I probably won't be able to answer, please look here.

I am a professor at the Duke University Department of Computer Science. From July 2015 through June 2017, I was department chair and delivered three graduation speeches: 2015, 2016, and 2017.

Here is my CV (updated 1/2024) and a brief bio.

Classes

CompSci 570 (Fall 2024)

Journal Papers

Counting Objects with a Combination of Horizontal and Overhead Sensors, Erik Halvorson and Ronald Parr, The International Journal of Robotics Research, June 2010, 29(7), pp. 840-854.

Non-Myopic Multi-Aspect Sensing with Partially Observable Markov Decision Processes, Shihao Ji, Ronald Parr, and Lawrence Carin, IEEE Transactions on Signal Processing, June 2007, Volume 55, Issue 6, Part 1, pp. 2720-2730.

Least-Squares Policy Iteration, Michail Lagoudakis and Ronald Parr, Journal of Machine Learning Research (JMLR), Vol. 4, 2003, pp. 1107-1149.

Efficient Solution Algorithms for Factored MDPs, Carlos Guestrin, Daphne Koller, Ronald Parr and Shobha Venkataraman, Journal of Artificial Intelligence Research (JAIR), Vol. 19, 2003, pp. 399-468. [Winner of IJCAI-JAIR best paper award]

Highly Refereed Papers (also includes WAFR, IROS and ICRA)

A Path to Simpler Models Starts With Noise, Lesia Semenova, Harry Chen, Ronald Parr, and Cynthia Rudin, Neural Information Processing Systems 2023 (NeurIPS 2023).

On the Existence of Simpler Machine Learning Models, Lesia Semenova, Cynthia Rudin, and Ronald Parr, ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), 2022.

Policy Caches with Successor Features, Mark Nemecek and Ronald Parr, Proceedings of the 38th International Conference on Machine Learning (ICML-21).

Deep Radial-Basis Value Functions for Continuous Control, Kavosh Asadi, Neev Parikh, Ronald Parr, George Konidaris, and Michael Littman, Proceedings of the Thirty Fifth AAAI Conference on Artificial Intelligence (AAAI-21).

Revisiting the Softmax Bellman Operator: New Benefits and New Perspective, Zhao Song, Ronald Parr, and Lawrence Carin, Proceedings of the Thirty-sixth International Conference on Machine Learning (ICML 2019). supplemental material

Improving PAC Exploration Using the Median Of Means, Jason Pazis, Ronald Parr, and Jonathan How, Neural Information Processing Systems 2016 (NIPS 2016).

Linear Feature Encoding for Reinforcement Learning, Zhao Song, Ronald Parr, Xuejun Liao, and Lawrence Carin, Neural Information Processing Systems 2016 (NIPS 2016). supplemental material

Efficient PAC-optimal Exploration in Concurrent, Continuous State MDPs with Delayed Updates, Jason Pazis and Ronald Parr, Proceedings of the Thirtieth AAAI Conference (AAAI 2016).

Distance Minimization for Reward Learning from Scored Trajectories, Benjamin Burchfiel, Carlo Tomasi, and Ronald Parr, Proceedings of the Thirtieth AAAI Conference (AAAI 2016).

Unsupervised Discovery of Object Classes with a Mobile Robot, Julian Mason, Bhaskara Marthi, and Ronald Parr, International Conference on Robotics and Automation (ICRA 2014).

Sample Complexity and Performance Bounds for Non-parametric Approximate Linear Programming, Jason Pazis and Ronald Parr, Proceedings of the Twenty Seventh Association for Advancement of Artificial Intelligence Conference (AAAI 2013).

PAC Optimal Exploration in Continuous Space Markov Decision Processes, Jason Pazis and Ronald Parr, Proceedings of the Twenty Seventh Association for Advancement of Artificial Intelligence Conference (AAAI 2013). [AAAI 2013 Outstanding Paper Honorable Mention] (Our AAAI 2016 paper supersedes this.)

Object Disappearance for Object Discovery, Julian Mason, Bhaskara Marthi, and Ronald Parr, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2012).

Value Function Approximation in Noisy Environments Using Locally Smoothed Regularized Approximate Linear Programs, Gavin Taylor and Ronald Parr, Proceedings of the Twenty-Eighth International Conference on Uncertainty in Artificial Intelligence (UAI-2012).

Greedy Algorithms for Sparse Reinforcement Learning, Christopher Painter-Wakefield and Ronald Parr, Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML-2012). (long version)

Computing Optimal Strategies to Commit to in Stochastic Games, Joshua Letchford, Liam MacDermed, Vincent Conitzer, Ronald Parr, and Charles Isbell, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-2012).

Generalized Value Functions for Large Action Sets, Jason Pazis and Ronald Parr, Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML-2011).

Non-parametric Approximate Linear Programming for MDPs, Jason Pazis and Ronald Parr, Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI-2011). (Note: This version differs from the published version in that it contains a small correction to the statement of Theorem 5.4.)

Security Games with Multiple Attacker Resources, Dmytro Korzhyk, Vincent Conitzer, Ronald Parr, Proceedings of the Twenty-second International Joint Conference on Artificial Intelligence (IJCAI-2011).

Textured Occupancy Grids for Monocular Localization Without Features, Julian Mason, Susanna Ricco, Ronald Parr, 2011 IEEE International Conference on Robotics and Automation (ICRA 2011).

Solving Stackelberg Games with Uncertain Observability, Dmytro Korzhyk, Vincent Conitzer, Ronald Parr, The Tenth International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011).

Linear Complementarity for Regularized Policy Evaluation and Improvement, Jeff Johns, Christopher Painter-Wakefield, Ronald Parr, Advances in Neural Information Processing Systems 23 (NIPS 23). (Note: This is the full version of the paper, which includes the appendix. The printed proceedings do not include the appendix.)

Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes, Marek Petrik, Gavin Taylor, Ronald Parr, and Shlomo Zilberstein, International Conference on Machine Learning (ICML-2010), pp. 871-878. (Full Version)

Complexity of Computing Optimal Stackelberg Strategies in Security Resource Allocation Games, Dmytro Korzhyk, Vincent Conitzer, and Ronald Parr, Proceedings of the 24th National Conference on Artificial Intelligence (AAAI-10).

Kernelized Value Function Approximation for Reinforcement Learning, Gavin Taylor and Ronald Parr, International Conference on Machine Learning (ICML-2009).

Multi-step Multi-sensor Hider-seeker Games, Erik Halvorson, Vincent Conitzer and Ronald Parr, Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-2009).

Planning Aims for a Network of Horizontal and Overhead Sensors, Erik Halvorson and Ronald Parr, Workshop on the Algorithmic Foundations of Robotics 2008 (WAFR-2008).

An Analysis of Linear Models, Linear Value-Function Approximation, and Feature Selection for Reinforcement Learning, Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield, and Michael L. Littman, International Conference on Machine Learning (ICML-2008), pp. 752-759. Note: Please see this important addendum.

Point-Based Policy Iteration, Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, and Lawrence Carin, Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI-2007), pp. 1243-1249.

Analyzing Feature Generation for Value-Function Approximation, Ronald Parr, Christopher Painter-Wakefield, Lihong Li, and Michael Littman, International Conference on Machine Learning (ICML-2007).

Efficient Selection of Disambiguating Actions for Stereo Vision, Monika Schaeffer and Ronald Parr, Uncertainty in Artificial Intelligence (UAI-2006).

Hierarchical Linear/Constant Time SLAM using Particle Filters for Dense Maps, Austin I. Eliazar and Ronald Parr, Advances in Neural Information Processing Systems (NIPS-19) 2005.

Learning Probabilistic Motion Models for Mobile Robots, Austin I. Eliazar and Ronald Parr, Proceedings of the Twenty First International Conference on Machine Learning (ICML-2004).

DP-SLAM 2.0, Austin Eliazar and Ronald Parr, IEEE 2004 International Conference on Robotics and Automation (ICRA 2004).

Reinforcement Learning as Classification: Leveraging Modern Classifiers, Michail Lagoudakis and Ronald Parr, Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003).

DP-SLAM: Fast, Robust Simultaneous Localization and Mapping without Predetermined Landmarks, Austin Eliazar and Ronald Parr, Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI 03).

Learning in Zero-Sum Team Markov Games using Factored Value Functions, Michail Lagoudakis and Ronald Parr, To appear in Advances in Neural Information Processing Systems (NIPS-15) 2002.

Value Function Approximation in Zero-Sum Markov Games, Michail Lagoudakis and Ronald Parr, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI 2002).

Coordinated Reinforcement Learning, Carlos Guestrin, Michail Lagoudakis, and Ronald Parr. Proceedings of the Nineteenth International Conference on Machine Learning (ICML-2002).

XPathLearner: An On-Line Self-Tuning Markov Histogram for XML Path Selectivity Estimation, Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey Scott Vitter, Ronald Parr. VLDB 2002.

Multiagent Planning with Factored MDPs, Carlos Guestrin, Daphne Koller and Ronald Parr, Advances in Neural Information Processing Systems (NIPS-14) 2001.

Model-Free Least-Squares Policy Iteration, Michail Lagoudakis and Ronald Parr, To appear in Advances in Neural Information Processing Systems (NIPS-14) 2001. Longer tech report version available.

Inference in Hybrid Networks: Theoretical Limits and Practical Algorithms, Uri Lerner, Ronald Parr, Uncertainty in Artificial Intelligence, Proceedings of the Seventeenth Conference (UAI 2001). [Joint winner of best student paper award (student first author).]

Max-norm Projections for Factored MDPs, Carlos Guestrin, Daphne Koller and Ronald Parr, IJCAI 2001.

Policy Iteration for Factored MDPs, Daphne Koller and Ronald Parr, Uncertainty in Artificial Intelligence, Proceedings of the Sixteenth Conference (UAI 2000).

Making Rational Decisions Using Adaptive Utility Elicitation, Urszula Chajewska, Daphne Koller, and Ronald Parr, Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI 2000).

Bayesian Fault Detection and Diagnosis in Dynamic Systems, Uri Lerner, Ronald Parr, Daphne Koller and Gautam Biswas, Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI 2000).

Policy Search Via Density Estimation, Andrew Ng, Ronald Parr and Daphne Koller, NIPS 99.

Reinforcement Learning Using Approximate Belief States, Andrés Rodríguez, Ronald Parr and Daphne Koller, NIPS 99.

Computing Factored Value Functions for Policies in Structured MDPs, Daphne Koller and Ronald Parr, IJCAI 1999.

Flexible Decomposition Algorithms for Weakly Coupled Markov Decision Problems, Ronald Parr. UAI 98.

Reinforcement Learning with Hierarchies of Machines, Ronald Parr and Stuart Russell. NIPS 97.

Generalized Prioritized Sweeping, David Andre, Nir Friedman, and Ronald Parr. NIPS 97.

Approximating Optimal Policies for Partially Observable Stochastic Domains, Ronald Parr, Stuart Russell, Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95). [Note: This is a slightly revised version with a few corrections.]

Provably Bounded Optimal Agents, Stuart Russell, Devika Subramanian, Ronald Parr, in Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93).

Other Papers

Fitted Q-Learning for Relational Domains, Srijita Das, Sriraam Natarajan, Kaushik Roy, Ronald Parr and Kristian Kersting, 17th International Conference on Knowledge Representation and Reasoning 2020 (KR2020) poster. Full paper available on arXiv.

L1 Regularized Linear Temporal Difference Learning, Duke CS Technical Report TR-2012-01, Christopher Painter-Wakefield and Ronald Parr, 2012.

Planning Aims for a Network of Horizontal and Overhead Sensors, Erik Halvorson and Ronald Parr. International Symposium on Artificial Intelligence and Mathematics, 2008.

Least-Squares Methods in Reinforcement Learning for Control, Michail Lagoudakis, Ronald Parr and Michael L. Littman. Second Hellenic Conference on Artificial Intelligence (SETN-02), Thessaloniki, Greece, April 11-12, 2002.

Model-Free Least-Squares Policy Iteration, Michail Lagoudakis and Ronald Parr, Duke University Technical Report. (Longer version of NIPS 2001 paper above.)

Coordinated Reinforcement Learning, Carlos Guestrin, Michail Lagoudakis, and Ronald Parr. To appear in the Proceedings of the 2002 AAAI Spring Symposium Series: Collaborative Learning Agents.

Selecting the Right Algorithm, Michail Lagoudakis, Michael L. Littman and Ronald Parr, Proceedings of the 2001 AAAI Fall Symposium Series: Using Uncertainty within Computation, Cape Cod, MA, November 2001.

Solving Factored POMDPs with Linear Value Functions, Carlos Guestrin, Daphne Koller and Ronald Parr, In the IJCAI-01 workshop on Planning under Uncertainty and Incomplete Information.

Max-norm Projections for Factored MDPs, Carlos Guestrin, Daphne Koller, and Ronald Parr, AAAI Spring Symposium, Stanford, California, March 2001.

Adaptive Utility Elicitation using Value of Information (abstract), Urszula Chajewska, Miriam Kuppermann, Ronald Parr and Daphne Koller, Presented at the 22nd Annual Meeting of the Society for Medical Decision Making (MDM '00).

A Unifying Framework for Temporal Abstraction in Stochastic Processes, Ronald Parr, Symposium on Abstraction Reformulation and Approximation, 1998 (SARA-98).

Feasibility Study of Fully Automated Vehicles Using Decision-theoretic Control, Jeffrey Forbes, Nikunj Oza, Ronald Parr, Stuart Russell, California PATH Research Report, UCB-ITS-PRR-97-18.

Some Slides

These are slides from some talks. Ask if you want me to add more. If you adapt content from these slides for your own talks, I'd appreciate it if you let me know and gave me a little acknowledgment somewhere.

Slides on linear value functions and linear models from the 2006 NSF workshop on Approximate Dynamic Programming, joint work with Christopher Painter-Wakefield and Michael Littman.
Slides from my overview of hierarchical reinforcement learning for the ICML 2005 Workshop on Rich Representations for Reinforcement Learning.

My Dissertation

You can view the abstract or download the whole thing as gzipped postscript. Please read some comments about the version of my thesis posted on this page.

Workshop

I was one of the organizers of the NIPS 98 Workshop on Abstraction and Hierarchy in Reinforcement Learning.

Random Stuff

I have no relation to Jack Paar, who spells his name differently. There is a web page about people named Parr, but I am not related to any of them. I also am not related to Katherine Parr, one of the wives of Henry VIII. According to Webster, a parr is a small fish. The relationship is quite distant.

I'm into digital photography.