
There is a newer version of the record available.

Published December 1, 2021 | Version v1
Dataset Open

Learning to quantify: LeQua 2022 datasets

Description

# Learning to Quantify

The aim of LeQua 2022 (the 1st edition of the CLEF “Learning to Quantify” lab) is to allow the comparative evaluation of methods for “learning to quantify” in textual datasets, i.e., methods for training predictors of the relative frequencies of the classes of interest in sets of unlabelled textual documents. These predictors (called “quantifiers”) will be required to issue predictions for several such sets, some of them characterized by class frequencies radically different from those of the training set.
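To make the notion of a quantifier concrete, the following is a minimal sketch of "classify and count", a simple baseline from the quantification literature: classify every document in a set, then report the relative frequency of each predicted class. It is not a method prescribed by the lab. The names `prevalence`, `classify_and_count`, and `toy_classifier`, as well as the sample documents, are hypothetical and serve only as illustration.

```python
from collections import Counter


def prevalence(labels, classes):
    """Return the relative frequency of each class in `labels`."""
    total = len(labels)
    if total == 0:
        raise ValueError("cannot compute prevalence of an empty sample")
    counts = Counter(labels)
    return {c: counts.get(c, 0) / total for c in classes}


def classify_and_count(classifier, documents, classes):
    """Estimate class prevalence by classifying each document and counting."""
    predicted = [classifier(doc) for doc in documents]
    return prevalence(predicted, classes)


def toy_classifier(doc):
    """Hypothetical stand-in for a trained classifier (1 = class, 0 = complement)."""
    return 1 if "good" in doc.lower() else 0


# Hypothetical unlabelled sample; a real sample would come from the dataset files.
sample = ["Good value", "Broke after a week", "good sound", "Too loud"]
estimate = classify_and_count(toy_classifier, sample, classes=[0, 1])
print(estimate)
```

The output of a quantifier is a distribution over classes for the whole sample, not a label per document, so the estimated values always sum to 1.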

## Links

https://lequa2022.github.io/

https://github.com/HLT-ISTI/LeQua2022_scripts

## Tasks

T1A: This task is concerned with evaluating binary quantifiers, i.e., quantifiers that must only predict the relative frequencies of a class and its complement. Participants in this task will be provided with documents already converted into vector form; the task is thus suitable for participants who do not wish to engage in generating representations for the textual documents, but want instead to concentrate on optimizing the methods for learning to quantify.

T1B: This task is concerned with evaluating single-label multi-class quantifiers, i.e., quantifiers that operate on documents that each belong to exactly one among a set of n>2 classes. As in Task T1A, participants will be provided with documents already converted into vector form.

T2A: Like Task T1A, this task is concerned with evaluating binary quantifiers. Unlike in Task T1A, participants will be provided with the raw text of the documents; the task is thus suitable for participants who also wish to engage in generating suitable representations for the textual documents, or to train end-to-end systems.

T2B: Like Task T1B, this task is concerned with evaluating single-label multi-class quantifiers; as in Task T2A, participants will be provided with the raw text of the documents.
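In both the binary and the multi-class tasks, a prediction is a vector of class prevalence values for a whole sample, and it can be compared against the true prevalence vector of that sample. The sketch below uses absolute error (the mean absolute difference across classes) as one common choice; this description does not state which measure the lab uses, so treat the measure and the function name `absolute_error` as assumptions. The prevalence values are made up for illustration.

```python
def absolute_error(true_prev, pred_prev):
    """Mean absolute difference between two prevalence vectors of equal length."""
    if len(true_prev) != len(pred_prev):
        raise ValueError("prevalence vectors must cover the same classes")
    diffs = [abs(t - p) for t, p in zip(true_prev, pred_prev)]
    return sum(diffs) / len(diffs)


# Binary case: prevalence of the class and of its complement.
binary_error = absolute_error([0.9, 0.1], [0.7, 0.3])  # approximately 0.2

# Multi-class case: one value per class, each vector summing to 1.
multi_error = absolute_error([0.5, 0.3, 0.2], [0.4, 0.4, 0.2])

print(binary_error, multi_error)
```

The same function covers both settings because a binary prediction is simply a prevalence vector of length two.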

Files

ReadMe.txt

Files (1.5 GB)

| MD5 checksum | Size |
| --- | --- |
| md5:033aaaa0df2fad4ff61bf21ae40d629a | 1.9 kB |
| md5:c2fbf10756baf9b6627e570d220e0845 | 230.2 MB |
| md5:0dca2e82adc97b219022d2a6cf11386f | 908.4 MB |
| md5:0bd2aba01c723e7aec534e2f15a9895d | 83.7 MB |
| md5:401114c778d992551b27a1f7d25805f4 | 299.3 MB |
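The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_stream`, `md5_of_file`, and `checksum_matches` are hypothetical, and the file path in the usage comment is a placeholder, since the file names are not shown in this listing.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path):
    """Compute the MD5 hex digest of the file at `path`."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


def checksum_matches(actual_hex, listed):
    """Compare a hex digest with a listed value such as 'md5:0123...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return actual_hex.lower() == expected


# Self-contained demonstration on in-memory bytes.
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
print(demo_digest)

# Usage with a real download (placeholder path, not a name from the record):
# ok = checksum_matches(md5_of_file("path/to/downloaded_file"),
#                       "md5:c2fbf10756baf9b6627e570d220e0845")
```

Reading in chunks keeps memory use constant, which matters for the larger archives in this record.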

Additional details

Funding

European Commission: SoBigData-PlusPlus – SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics (grant 871042)
European Commission: AI4Media – A European Excellence Centre for Media, Society and Democracy (grant 951911)