Employing fisher discriminant analysis for Arabic text classification

Published: 01 February 2018

Abstract

Linear discriminant analysis (LDA) is proposed for Arabic text classification. LDA employs fewer dimensions, which is helpful for sizable textual feature vectors. Although LDA is a semantic-loss feature reduction method, it shows useful results.

Fisher's discriminant analysis, also called linear discriminant analysis (LDA), is a popular dimensionality reduction technique that is widely used for feature extraction. LDA aims to find an optimal linear transformation that maximizes class separability. Even though LDA shows useful results in various pattern recognition problems, such as face recognition, less attention has been devoted to employing this technique in Arabic information retrieval tasks. In particular, the sizable feature vectors in textual data call for dimensionality reduction techniques such as LDA. In this paper, we empirically investigated an LDA-based method for Arabic text classification. We used a corpus that contains 2,000 documents belonging to five categories. The experimental results showed that the performance of the semantic-loss LDA-based method was almost the same as that of the semantic-rich singular value decomposition (SVD), which indicates that LDA is a promising method for text mining applications.
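To make the idea of "a linear transformation that maximizes class separability" concrete, the following is a minimal sketch of the classical two-class Fisher discriminant, where the projection direction is w = S_W^{-1}(mu_a - mu_b). It is not the authors' implementation: the abstract describes a five-category corpus of 2,000 documents, whereas this example assumes two hypothetical categories ("sports" and "economy") with made-up three-term count vectors. The function names and the small `ridge` term, added to keep the within-class scatter matrix invertible, are likewise assumptions made for illustration.

```python
def mean(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mu):
    """Scatter matrix: sum of outer products of deviations from the mean."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fisher_direction(class_a, class_b, ridge=1e-6):
    """Two-class Fisher direction w = S_W^{-1} (mu_a - mu_b).

    `ridge` is an illustrative regularizer (an assumption, not from the
    paper) that keeps S_W invertible when features are collinear.
    """
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    d = len(mu_a)
    sw = [[sa[i][j] + sb[i][j] + (ridge if i == j else 0.0)
           for j in range(d)] for i in range(d)]
    return solve(sw, [a - b for a, b in zip(mu_a, mu_b)])


def project(w, x):
    """Project a feature vector onto the one-dimensional discriminant axis."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical term-count vectors (3 terms per document), for illustration only.
sports = [[3, 0, 1], [4, 1, 0], [5, 0, 2], [3, 1, 1]]
economy = [[0, 4, 1], [1, 3, 2], [0, 5, 0], [1, 4, 1]]

w = fisher_direction(sports, economy)

# Classify by comparing against the midpoint of the projected class means.
threshold = (project(w, mean(sports)) + project(w, mean(economy))) / 2
for doc in sports + economy:
    label = "sports" if project(w, doc) > threshold else "economy"
    print(doc, "->", label)
```

Each three-dimensional document vector is reduced to a single number along w, which is the sense in which LDA "employs fewer dimensions". In the multi-class case, the same idea generalizes to at most C - 1 discriminant directions for C categories, obtained from an eigenvalue problem involving the between-class and within-class scatter matrices.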

Published In

Computers and Electrical Engineering, Volume 66, Issue C
February 2018
530 pages

Publisher

Pergamon Press, Inc.

United States

Author Tags

  1. Arabic
  2. Classification
  3. Eigenvectors
  4. Fisher
  5. Linear discriminant analysis
  6. Text

Qualifiers

  • Research-article
