A Bayesian Approach for Quantifying Data Scarcity when Modeling Human Behavior via Inverse Reinforcement Learning

Published: 07 March 2023

Abstract

Computational models that formalize complex human behaviors enable study and understanding of such behaviors. However, collecting behavior data required to estimate the parameters of such models is often tedious and resource intensive. Thus, estimating dataset size as part of data collection planning (also known as Sample Size Determination) is important to reduce the time and effort of behavior data collection while maintaining an accurate estimate of model parameters. In this article, we present a sample size determination method based on Uncertainty Quantification (UQ) for a specific Inverse Reinforcement Learning (IRL) model of human behavior, in two cases: (1) pre-hoc experiment design—conducted in the planning stage before any data is collected, to guide the estimation of how many samples to collect; and (2) post-hoc dataset analysis—performed after data is collected, to decide if the existing dataset has sufficient samples and whether more data is needed. We validate our approach in experiments with a realistic model of behaviors of people with Multiple Sclerosis (MS) and illustrate how to pick a reasonable sample size target. Our work enables model designers to perform a deeper, principled investigation of the effects of dataset size on IRL model parameters.
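
To make the pre-hoc/post-hoc distinction concrete, the following is a minimal, hypothetical sketch of a post-hoc style check: given the choices observed so far, compute a Bayesian posterior over a single reward parameter of a Boltzmann-rational choice model and stop collecting data once the 95% credible interval is narrow enough. Every detail below (the sigmoid choice model, the Gaussian prior, the grid-based posterior, the 0.5 interval-width target) is an illustrative assumption, not the article's MS behavior model, its IRL formulation, or its uncertainty metric.

```python
# Toy post-hoc sample size check: is the posterior over a single reward
# parameter already narrow enough, or should more behavior data be collected?
# This is a deliberately simplified stand-in (one parameter, two actions,
# grid-based posterior), NOT the paper's model.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "behavior model": the agent picks action 1 over action 0 with
# probability sigmoid(w * (r1 - r0)), where w is the unknown reward weight.
TRUE_W, R_DIFF = 1.5, 1.0

def simulate_choices(n, w=TRUE_W):
    """Simulate n binary choices from the Boltzmann-rational agent."""
    p1 = 1.0 / (1.0 + np.exp(-w * R_DIFF))
    return rng.random(n) < p1

def posterior_interval_width(choices, grid=np.linspace(-5, 5, 2001), prior_sd=2.0):
    """Width of the 95% credible interval of the grid-based posterior over w."""
    k, n = choices.sum(), len(choices)
    p1 = 1.0 / (1.0 + np.exp(-grid * R_DIFF))
    log_lik = k * np.log(p1) + (n - k) * np.log1p(-p1)   # Bernoulli likelihood
    log_prior = -0.5 * (grid / prior_sd) ** 2            # Gaussian prior on w
    post = np.exp(log_lik + log_prior - (log_lik + log_prior).max())
    post /= post.sum()
    cdf = np.cumsum(post)
    lo = grid[np.searchsorted(cdf, 0.025)]
    hi = grid[np.searchsorted(cdf, 0.975)]
    return hi - lo

# Post-hoc check: is the current dataset large enough for a target interval width?
TARGET_WIDTH = 0.5
for n in (50, 200, 800, 3200):
    width = posterior_interval_width(simulate_choices(n))
    verdict = "enough" if width < TARGET_WIDTH else "collect more"
    print(f"N={n:5d}  95% CI width={width:.2f}  -> {verdict}")
```

Running the loop shows the interval narrowing roughly as 1/sqrt(N); a pre-hoc variant of the same check would apply it to data simulated from a prior guess of the parameters before any real collection begins.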


Cited By

  • (2024) Latent-Maximum-Entropy-Based Cognitive Radar Reward Function Estimation With Nonideal Observations. IEEE Transactions on Aerospace and Electronic Systems 60, 5 (Oct. 2024), 6656–6670. DOI: 10.1109/TAES.2024.3406671
  • (2023) Behavior Modeling Approach for Forecasting Physical Functioning of People with Multiple Sclerosis. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 1 (Mar. 2023), 1–29. DOI: 10.1145/3580887
  • (2023) Three Ways to Improve Arm Function in the Chronic Phase After Stroke by Robotic Priming Combined With Mirror Therapy, Arm Training, and Movement-Oriented Therapy. Archives of Physical Medicine and Rehabilitation 104, 8 (Aug. 2023), 1195–1202. DOI: 10.1016/j.apmr.2023.02.015


    Published In

    ACM Transactions on Computer-Human Interaction, Volume 30, Issue 1
    February 2023, 537 pages
    ISSN: 1073-0516
    EISSN: 1557-7325
    DOI: 10.1145/3585399

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 March 2023
    Online AM: 27 July 2022
    Accepted: 30 May 2022
    Revised: 30 April 2022
    Received: 20 January 2021
    Published in TOCHI Volume 30, Issue 1


    Author Tags

    1. Sample size determination
    2. Behavior modeling
    3. Inverse reinforcement learning
    4. Bayesian inference

    Qualifiers

    • Research-article

    Funding Sources

    • U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research
    • National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory

