DOI: 10.1145/2983323.2983752
research-article

Semi-supervised Multi-Label Topic Models for Document Classification and Sentence Labeling

Published: 24 October 2016

Abstract

Extracting the parts of a text document that are relevant to a class label is a critical information retrieval task. We propose a semi-supervised multi-label topic model that jointly performs document-level and sentence-level class inference. Under our model, each sentence is associated with only a subset of the document's labels (possibly none of them), and the label set of the document is the union of the labels of all of its sentences. For training, we use labeled documents together with a typically larger set of unlabeled documents. In a semi-supervised fashion, our model discovers the topics present, learns associations between topics and class labels, predicts labels for new (or unlabeled) documents, and determines label associations for each sentence in every document. Learning does not require any ground-truth labels on sentences. We develop a Hamiltonian Monte Carlo based algorithm for efficiently sampling from the joint label distribution over all sentences, a very high-dimensional discrete space. Our experiments show that our approach outperforms several benchmark methods on both document-level and sentence-level classification, as well as on test-set log-likelihood. All code for replicating our experiments is available from https://github.com/hsoleimani/MLTM.
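
The labeling constraint stated in the abstract (each sentence carries a subset of the document's labels, and the document's label set is the union over its sentences) can be illustrated with a minimal Python sketch. This is not code from the MLTM repository; the function names and the example labels are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Set


def document_labels(sentence_labels: Iterable[Set[str]]) -> Set[str]:
    """Return the document label set as the union of all sentence label sets."""
    labels: Set[str] = set()
    for sent in sentence_labels:
        labels |= sent  # a sentence may contribute zero or more labels
    return labels


def is_consistent(doc_labels: Set[str], sentence_labels: List[Set[str]]) -> bool:
    """Check that a sentence-level labeling agrees with the document labels.

    Consistency here means the union of the sentence labels equals the
    document label set, which also implies every sentence label set is a
    subset of the document's labels.
    """
    return document_labels(sentence_labels) == set(doc_labels)


# Hypothetical three-sentence document; the second sentence has no label.
example_sentences = [{"sports"}, set(), {"sports", "politics"}]
example_doc = document_labels(example_sentences)
```

In this toy example `example_doc` contains both labels even though no single sentence is required to carry all of them, and a sentence-level labeling that omits a document label entirely would fail `is_consistent`.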

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN: 9781450340731
DOI: 10.1145/2983323
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. credit assignment
  2. multi-label classification
  3. semi-supervised learning
  4. topic models

Qualifiers

  • Research-article

Conference

CIKM'16: ACM Conference on Information and Knowledge Management
October 24-28, 2016
Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
