Semi-supervised Multi-Label Topic Models for Document Classification and Sentence Labeling

Authors: Hossein Soleimani, David J. Miller

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

Pages 105 - 114

https://doi.org/10.1145/2983323.2983752

Published: 24 October 2016

Abstract

Extracting parts of a text document relevant to a class label is a critical information retrieval task. We propose a semi-supervised multi-label topic model for jointly achieving document and sentence-level class inferences. Under our model, each sentence is associated with only a subset of the document's labels (including possibly none of them), with the label set of the document the union of the labels of all of its sentences. For training, we use both labeled documents and, typically, a larger set of unlabeled documents. Our model, in a semi-supervised fashion, discovers the topics present, learns associations between topics and class labels, predicts labels for new (or unlabeled) documents, and determines label associations for each sentence in every document. For learning, our model does not require any ground-truth labels on sentences. We develop a Hamiltonian Monte Carlo based algorithm for efficiently sampling from the joint label distribution over all sentences, a very high-dimensional discrete space. Our experiments show that our approach outperforms several benchmark methods with respect to both document and sentence-level classification, as well as test set log-likelihood. All code for replicating our experiments is available from https://github.com/hsoleimani/MLTM.
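
The labeling constraint stated in the abstract (each sentence carries a possibly empty subset of the document's labels, and the document's label set is the union over its sentences) can be illustrated with a minimal sketch. The function name `document_labels` and the example labels are hypothetical and are not taken from the paper or its repository; the sketch shows only the set relationship, not the topic model or its Hamiltonian Monte Carlo inference.

```python
from typing import List, Set


def document_labels(sentence_labels: List[Set[str]]) -> Set[str]:
    """Return the union of the per-sentence label sets.

    Hypothetical helper: it mirrors the constraint described in the
    abstract, not the authors' implementation.
    """
    doc: Set[str] = set()
    for labels in sentence_labels:
        doc |= labels  # a sentence may contribute zero or more labels
    return doc


# Illustrative data: the second sentence has no label at all.
sentences = [{"sports"}, set(), {"sports", "politics"}]
doc = document_labels(sentences)

# Every sentence's label set is a subset of the document's label set.
all_subsets = all(labels <= doc for labels in sentences)
```

During training the model observes only document-level labels, so the per-sentence sets used as input here correspond to quantities the model would have to infer.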

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

October 2016

2566 pages

ISBN:9781450340731

DOI:10.1145/2983323

General Chairs:
Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA)
ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)

Program Chairs:
Elisa Bertino (Purdue University)
Fabio Crestani (University of Lugano)
Javed Mostafa (University of North Carolina)
Jie Tang (Tsinghua University)
Luo Si (Alibaba Group Inc & Purdue University)
Xiaofang Zhou (University of Queensland)
Yi Chang (Yahoo Research)
Yunyao Li (IBM Research - Almaden)
Parikshit Sondhi (WalmartLabs)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CIKM'16

CIKM'16: ACM Conference on Information and Knowledge Management

October 24 - 28, 2016

Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

Total Citations: 25
Total Downloads: 863
Downloads (Last 12 months): 41
Downloads (Last 6 weeks): 6

Reflects downloads up to 14 Nov 2024
