Few-shot learning for short text classification

Leiming Yan¹, Yuhui Zheng² & Jie Cao³

Abstract

Short text classification is difficult because short texts are limited in length and freely constructed. This paper proposes a short text classification framework based on Siamese CNNs and few-shot learning. The Siamese CNNs learn a discriminative text encoding that helps classifiers distinguish obscure or informal sentences. The different sentence structures and different descriptions of a topic are viewed as ‘prototypes’, which are learned by a few-shot learning strategy to improve the classifier’s generalization. Our experimental results show that the proposed framework achieves higher accuracy on Twitter classification tasks and outperforms some popular traditional text classification methods and a few deep network approaches.
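The two ideas in the abstract, a shared-weight (Siamese) convolutional encoder and classification against learned prototypes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the full method is not visible in this preview, so the encoder layout, the contrastive loss, the use of one mean embedding per class as its prototype, and all names (`encode`, `contrastive_loss`, `class_prototypes`, `classify`) are assumptions made for illustration. Weights are random and untrained.

```python
import numpy as np

def encode(token_ids, emb, filt):
    """Shared encoder: embed tokens, 1-D convolve, ReLU, max-over-time pool.

    Both branches of a Siamese pair call this with the same emb and filt,
    which is what makes the weights shared. Assumes len(token_ids) >= width.
    """
    x = emb[token_ids]                                   # (seq_len, emb_dim)
    n_filters, width, _ = filt.shape
    # One flattened window per valid position: (seq_len - width + 1, width * emb_dim)
    windows = np.stack([x[i:i + width].ravel()
                        for i in range(len(token_ids) - width + 1)])
    feats = np.maximum(windows @ filt.reshape(n_filters, -1).T, 0.0)
    return feats.max(axis=0)                             # (n_filters,)

def contrastive_loss(za, zb, same_class, margin=1.0):
    """Pull same-class pairs together; push different-class pairs past the margin."""
    d = np.linalg.norm(za - zb)
    if same_class:
        return float(d ** 2)
    return float(max(0.0, margin - d) ** 2)

def class_prototypes(embeddings, labels):
    """Assumption: one prototype per class, the mean of its few support embeddings."""
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    return {c.item(): embeddings[labels == c].mean(axis=0)
            for c in np.unique(labels)}

def classify(z, protos):
    """Assign the class whose prototype is nearest in Euclidean distance."""
    return min(protos, key=lambda c: np.linalg.norm(z - protos[c]))

# Hypothetical toy setup: 50-token vocabulary, 8-dim embeddings, 4 filters of width 3.
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))
filt = rng.normal(size=(4, 3, 8))

z1 = encode([3, 7, 7, 12, 5], emb, filt)   # two short "texts" as token-id lists
z2 = encode([3, 7, 9, 12, 5], emb, filt)
loss = contrastive_loss(z1, z2, same_class=True)
```

In a trained system the loss would be minimized over many labeled pairs so that embeddings of the same topic cluster together; a handful of labeled examples per class would then suffice to form the prototypes used by `classify`.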

Acknowledgements

This work is supported by the Chinese National Natural Science Foundation (NSFC) [grant numbers 61772281, 61602254]; the National Social Science Foundation of China (No. 16ZDA054); Jiangsu Provincial 333 Project (BRA2017396); Six Major Talents Peak Project of Jiangsu Province (XYDXXJS-CXTD-005); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); and Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).

Author information

Authors and Affiliations

Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing, China
Leiming Yan
School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing, China
Yuhui Zheng
School of Mathematics & Statistics, Nanjing University of Information Science & Technology, Nanjing, China
Jie Cao

Corresponding author

Correspondence to Leiming Yan.

About this article

Cite this article

Yan, L., Zheng, Y. & Cao, J. Few-shot learning for short text classification. Multimed Tools Appl 77, 29799–29810 (2018). https://doi.org/10.1007/s11042-018-5772-4

Received: 26 September 2017
Revised: 31 January 2018
Accepted: 09 February 2018
Published: 21 February 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s11042-018-5772-4
