MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities

Authors:

Jason Armitage,

Golsa Tahmasebzadeh,

Maria Maleshkova,

Jens Lehmann

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management

Pages 2967 - 2974

https://doi.org/10.1145/3340531.3412783

Published: 19 October 2020

Abstract

In this paper, we introduce the MLM (Multiple Languages and Modalities) dataset, a new resource to train and evaluate multitask systems on samples in multiple modalities and three languages. The generation process and inclusion of semantic data provide a resource that further tests the ability of multitask systems to learn relationships between entities. The dataset is designed for researchers and developers who build applications that perform multiple tasks on data encountered on the web and in digital archives. A second version of MLM provides a geo-representative subset of the data with weighted samples for countries of the European Union. We demonstrate the value of the resource in developing novel applications in the digital humanities with a motivating use case and specify a benchmark set of tasks to retrieve modalities and locate entities in the dataset. Evaluation of baseline multitask and single-task systems on the full and geo-representative versions of MLM demonstrates the challenges of generalising on diverse data. In addition to the digital humanities, we expect the resource to contribute to research in multimodal representation learning, location estimation, and scene understanding.

Supplementary Material

MP4 File (3340531.3412783.mp4)

This video is a presentation of the paper "MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities". MLM is a resource for training and evaluating multitask systems on diverse data. We also present the generation process for the dataset, a set of benchmark evaluation tasks, and a multitask machine learning framework. Please find more information on the resource and project at http://cleopatra.ijs.si/goal-mlm/.

Index Terms

MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Multi-task learning
2. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval
      2. Structure and multilingual text search
        1. Multilingual and cross-lingual retrieval

Published In

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management

October 2020

3619 pages

ISBN:9781450368599

DOI:10.1145/3340531

General Chairs:
Mathieu d'Aquin, DSI, Insight, NUI Galway, Ireland
Stefan Dietze, GESIS, Cologne, Germany; Heinrich-Heine-University Düsseldorf, Germany; L3S Research Center, Germany

Program Chairs:
Claudia Hauff, TU Delft, The Netherlands
Edward Curry, DSI, Insight, NUI Galway, Ireland
Philippe Cudre Mauroux, eXascale, University of Fribourg, Switzerland

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Horizon 2020

Conference

CIKM '20

CIKM '20: The 29th ACM International Conference on Information and Knowledge Management

October 19 - 23, 2020

Virtual Event, Ireland

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

Liu Z, Schaldenbrand P, Okogwu B, Peng W, Peng W, Yun Y, Hundt A, Kim J, Oh J. (2024). SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10822-10832. Online publication date: 16-Jun-2024.
https://doi.org/10.1109/CVPR52733.2024.01029

Kacupaj E, Singh K, Maleshkova M, Lehmann J. (2024). Conversational Question Answering over Knowledge Graphs. Event Analytics across Languages and Communities, 169-186. Online publication date: 17-Jun-2024.
https://doi.org/10.1007/978-3-031-64451-1_9

Gottschalk S. (2024). Collection and Integration of Event-Centric Information in Cross-Lingual Knowledge Graphs. Event Analytics across Languages and Communities, 111-122. Online publication date: 17-Jun-2024.
https://doi.org/10.1007/978-3-031-64451-1_6

Tahmasebzadeh G, Hakimov S, Ewerth R, Müller-Budack E. (2023). Multimodal Geolocation Estimation of News Photos. Advances in Information Retrieval, 204-220. Online publication date: 2-Apr-2023.
https://dl.acm.org/doi/10.1007/978-3-031-28238-6_14

Tahmasebzadeh G, Müller-Budack E, Hakimov S, Ewerth R. (2023). MM-Locate-News: Multimodal Focus Location Estimation in News. MultiMedia Modeling, 204-216. Online publication date: 9-Jan-2023.
https://dl.acm.org/doi/10.1007/978-3-031-27077-2_16

Srinivasan K, Raman K, Chen J, Bendersky M, Najork M, Diaz F, Shah C, Suel T, Castells P, Jones R, Sakai T. (2021). WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2443-2449. Online publication date: 11-Jul-2021.
https://dl.acm.org/doi/10.1145/3404835.3463257

Tahmasebzadeh G, Kacupaj E, Müller-Budack E, Hakimov S, Lehmann J, Ewerth R, Diaz F, Shah C, Suel T, Castells P, Jones R, Sakai T. (2021). GeoWINE: Geolocation based Wiki, Image, News and Event Retrieval. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2565-2569. Online publication date: 11-Jul-2021.
https://dl.acm.org/doi/10.1145/3404835.3462786
