Computer Science > Computation and Language

arXiv:2212.05506 (cs)

[Submitted on 11 Dec 2022 (v1), last revised 15 Dec 2022 (this version, v2)]

Title: FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

Authors: Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang

Abstract: Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these methods not only rely on carefully crafted class descriptions to obtain class-specific keywords but also require a substantial amount of unlabeled data and take a long time to train. This paper proposes FastClass, an efficient weakly-supervised classification approach. It uses dense text representations to retrieve class-relevant documents from an external unlabeled corpus and selects an optimal subset to train a classifier. Compared to keyword-driven methods, our approach is less reliant on initial class descriptions, as it no longer needs to expand each class description into a set of class-specific keywords. Experiments on a wide range of classification tasks show that the proposed approach frequently outperforms keyword-driven models in classification accuracy and often trains orders of magnitude faster.
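
The retrieve-then-train idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not name the encoder, the subset-selection criterion, or the classifier, so everything below is an assumption made for illustration. A unit-normalized bag-of-words vector stands in for the dense text representation, a fixed top-k cutoff stands in for the optimal-subset selection, and a nearest-centroid model stands in for the trained classifier. The function names (`embed`, `retrieve`, `train_centroids`, `predict`) are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """ASSUMPTION: stand-in for a dense encoder (unit-normalized bag of words)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {w: v / norm for w, v in counts.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit-normalized sparse vectors, i.e. their cosine."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def retrieve(class_descriptions, corpus, top_k):
    """For each class, keep the top_k corpus documents closest to its description.

    ASSUMPTION: a fixed top_k replaces the paper's subset-selection step.
    """
    doc_vecs = [embed(doc) for doc in corpus]
    pseudo_labeled = {}
    for label, description in class_descriptions.items():
        query = embed(description)
        ranked = sorted(
            range(len(corpus)),
            key=lambda i: cosine(query, doc_vecs[i]),
            reverse=True,
        )
        pseudo_labeled[label] = [corpus[i] for i in ranked[:top_k]]
    return pseudo_labeled


def train_centroids(pseudo_labeled):
    """ASSUMPTION: a nearest-centroid model stands in for the trained classifier."""
    centroids = {}
    for label, docs in pseudo_labeled.items():
        total = Counter()
        for doc in docs:
            total.update(embed(doc))  # accumulate the document vectors
        norm = math.sqrt(sum(v * v for v in total.values()))
        centroids[label] = {w: v / norm for w, v in total.items()} if norm else {}
    return centroids


def predict(centroids, text):
    """Assign the class whose centroid is most similar to the input text."""
    vec = embed(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Toy data, invented for this sketch.
class_descriptions = {
    "sports": "sports game team",
    "politics": "politics government election",
}
corpus = [
    "the team won the game last night",
    "the government called an election",
    "the election results surprised the government",
    "fans cheered as the team scored in the game",
    "a recipe for tomato soup",
]

pseudo_labeled = retrieve(class_descriptions, corpus, top_k=2)
centroids = train_centroids(pseudo_labeled)
```

Note that the class descriptions are used directly as retrieval queries and are never expanded into keyword sets, which is the contrast with keyword-driven methods that the abstract draws. Calling `predict(centroids, "the team played a great game")` then classifies a new document against the centroids built from the retrieved documents.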

Comments:	EMNLP 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2212.05506 [cs.CL]
	(or arXiv:2212.05506v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2212.05506

Submission history

From: Tingyu Xia
[v1] Sun, 11 Dec 2022 13:43:22 UTC (308 KB)
[v2] Thu, 15 Dec 2022 01:07:43 UTC (308 KB)