Abstract
Recollecting details from lifelog data involves a higher level of granularity and reasoning than a conventional lifelog retrieval task. Investigating Question Answering (QA) on lifelog data could aid human memory recollection and improve traditional lifelog retrieval systems. However, there is not yet a standardised benchmark dataset for lifelog-based QA. To provide a first dataset and baseline benchmark for QA on lifelog data, we present a novel dataset, LLQA, an augmented 85-day lifelog collection that includes over 15,000 multiple-choice questions. We also provide several baselines against which future work can be evaluated. The results show that lifelog QA is a challenging task that requires further exploration. The dataset is publicly available at https://github.com/allie-tran/LLQA.
Acknowledgements
This work was conducted with the financial support of the Science Foundation Ireland under grant agreement 13/RC/2106_P2 and the Centre for Research Training in Digitally-Enhanced Reality (d-real) under Grant No. 18/CRT/6224. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Tran, LD., Ho, T.C., Pham, L.A., Nguyen, B., Gurrin, C., Zhou, L. (2022). LLQA - Lifelog Question Answering Dataset. In: Þór Jónsson, B., et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13141. Springer, Cham. https://doi.org/10.1007/978-3-030-98358-1_18
DOI: https://doi.org/10.1007/978-3-030-98358-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98357-4
Online ISBN: 978-3-030-98358-1
eBook Packages: Computer Science (R0)