Abstract
Short text classification is difficult because the texts are limited in length and their sentence structures are freely constructed. This paper proposes a short text classification framework based on Siamese CNNs and few-shot learning. The Siamese CNNs learn a discriminative text encoding that helps classifiers distinguish obscure or informal sentences. The different sentence structures and different descriptions of a topic are viewed as ‘prototypes’, which are learned with a few-shot learning strategy to improve the classifier’s generalization. Experimental results show that the proposed framework achieves higher accuracy on Twitter classification tasks and outperforms some popular traditional text classification methods and a few deep network approaches.
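The prototype idea can be illustrated with a minimal sketch. It is not the framework proposed in the paper: it assumes each class is summarized by a single prototype (the mean of its support encodings, in the style of prototypical networks), whereas the abstract allows several prototypes per topic. The 2-D vectors are hypothetical stand-ins for the encodings a Siamese CNN would produce, and the function names are illustrative only.

```python
import math
from collections import defaultdict


def class_prototypes(support):
    """Average the encodings of each class to get one prototype per class."""
    grouped = defaultdict(list)
    for embedding, label in support:
        grouped[label].append(embedding)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def nearest_prototype(embedding, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(embedding, prototypes[label]))


# Toy 2-D encodings (assumption: stand-ins for Siamese CNN outputs).
support_set = [
    ([0.9, 0.1], "positive"),
    ([0.7, 0.3], "positive"),
    ([0.1, 0.8], "negative"),
    ([0.3, 0.6], "negative"),
]
prototypes = class_prototypes(support_set)
prediction = nearest_prototype([0.8, 0.25], prototypes)
```

With only two labelled examples per class, a new encoding is assigned to whichever class prototype it lies closest to, which is the sense in which a few examples can be enough once the encoding is discriminative.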
Acknowledgements
This work is supported by the Chinese National Natural Science Foundation (NSFC) [grant numbers 61772281, 61602254]; the National Social Science Foundation of China (No. 16ZDA054); the Jiangsu Provincial 333 Project (BRA2017396); the Six Major Talents Peak Project of Jiangsu Province (XYDXXJS-CXTD-005); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).
Cite this article
Yan, L., Zheng, Y. & Cao, J. Few-shot learning for short text classification. Multimed Tools Appl 77, 29799–29810 (2018). https://doi.org/10.1007/s11042-018-5772-4
DOI: https://doi.org/10.1007/s11042-018-5772-4