DOI: 10.1145/3511808.3557711
short-paper

The SimIIR 2.0 Framework: User Types, Markov Model-Based Interaction Simulation, and Advanced Query Generation

Published: 17 October 2022

Abstract

Simulated user retrieval system interactions enable studies with controlled user behavior. To this end, the SimIIR framework offers static, rule-based methods. We present an extended SimIIR 2.0 version with new components for dynamic user type-specific Markov model-based interactions and more realistic query generation. A flexible modularization ensures that the SimIIR 2.0 framework can serve as a platform to implement, combine, and run the growing number of proposed search behavior and query simulation ideas.
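
To make the idea of user type-specific Markov model-based interaction concrete, the following is a minimal sketch of a first-order Markov chain over search actions. It is not SimIIR 2.0 code: the state names, the two user types, the transition probabilities, and the function `simulate_session` are all assumptions chosen for illustration.

```python
import random

# Hypothetical transition probabilities for two illustrative user types.
# States and numbers are assumptions for this sketch, not values from SimIIR 2.0.
TRANSITIONS = {
    "exploratory": {
        "query": {"snippet": 0.9, "stop": 0.1},
        "snippet": {"document": 0.5, "snippet": 0.3, "query": 0.2},
        "document": {"snippet": 0.5, "query": 0.4, "stop": 0.1},
    },
    "lookup": {
        "query": {"snippet": 1.0},
        "snippet": {"document": 0.6, "snippet": 0.2, "stop": 0.2},
        "document": {"stop": 0.7, "snippet": 0.3},
    },
}


def simulate_session(user_type, max_steps=50, seed=None):
    """Sample one action sequence from the Markov chain of a user type."""
    rng = random.Random(seed)  # local generator keeps runs reproducible
    matrix = TRANSITIONS[user_type]
    state = "query"  # every simulated session starts by issuing a query
    session = [state]
    # Walk the chain until the absorbing "stop" state or the step limit.
    while state != "stop" and len(session) < max_steps:
        options = matrix[state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
        session.append(state)
    return session


if __name__ == "__main__":
    for user_type in TRANSITIONS:
        print(user_type, simulate_session(user_type, seed=42))
```

In this sketch, swapping the transition table is all it takes to change the simulated user type, which is one plausible way to read the "dynamic" behavior the abstract contrasts with static, rule-based methods.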

      Published In

      CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
      October 2022
      5274 pages
      ISBN: 9781450392365
      DOI: 10.1145/3511808
      General Chairs: Mohammad Al Hasan, Li Xiong

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. search behavior
      2. simulation
      3. software framework
      4. user modeling

      Qualifiers

      • Short-paper

      Acceptance Rates

      CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
