Abstract
Cloud computing is considered an integral part of every business and is expected to change the information technology (IT) landscape. It delivers services over the internet on a pay-as-you-go model, which offers several advantages: no up-front cost, a smaller IT staff, and lower operating cost. Text mining is a technology for retrieving data from huge databases, and the cloud uses it to retrieve data efficiently from cloud data centres. Text document clustering plays an important role in providing navigation and intuitive browsing mechanisms by organizing large amounts of information into a small number of clusters. The bag-of-words (BoW) representation is commonly used for such clustering, but in many cases it is unsatisfactory because it ignores relations between terms that do not co-occur. To handle this problem, concepts are integrated at both the document level and the sentence level. This, however, enlarges the feature vector space and reduces the efficiency of the clustering algorithm. To overcome this, a self-organizing feature map (SOFM) based algorithm uses concepts from the genetic algorithm (GA) and grey wolf optimization (GWO), both of which are considered popular for use with the SOFM. The goal of SOFM-GA is to find an optimal network topology (the number of neurons and their array dimensions) together with optimal training parameters, such as the learning-rate schedule and the annealing of the neighborhood width. The SOFM-GWO and the GWO-based approach to SOFM formation are compared with the standard SOM in terms of quality and of the weights and map generated. The experimental results show that the proposed method achieved better results.
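The abstract names the network topology, the learning-rate schedule, and the neighborhood-width annealing as the quantities the optimizer tunes. The following is a minimal sketch, not the authors' implementation, of where those parameters enter a plain SOFM training loop. The function names, the exponential decay with constant 3.0, the default values, and the toy document vectors are all assumptions made for illustration; no GA or GWO search is shown.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_sofm(data, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols SOFM; returns one weight vector per neuron.

    rows/cols (topology), lr0 (initial learning rate) and sigma0 (initial
    neighborhood width) are the kind of parameters an optimizer could tune.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            # Assumed schedule: exponential decay of both parameters.
            decay = math.exp(-3.0 * step / total_steps)
            lr = lr0 * decay        # learning-rate schedule
            sigma = sigma0 * decay  # neighborhood-width annealing
            x = data[idx]
            # Best-matching unit: the neuron whose weights are closest to x.
            bmu = min(range(len(weights)), key=lambda j: sq_dist(weights[j], x))
            for j, (r, c) in enumerate(grid):
                g2 = (r - grid[bmu][0]) ** 2 + (c - grid[bmu][1]) ** 2
                h = math.exp(-g2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                weights[j] = [w + lr * h * (xi - w) for w, xi in zip(weights[j], x)]
            step += 1
    return weights


def quantization_error(data, weights):
    """Mean distance from each sample to its best-matching unit."""
    return sum(min(math.sqrt(sq_dist(w, x)) for w in weights) for x in data) / len(data)


# Hypothetical toy "document" vectors forming two groups of related terms.
docs = [
    [0.9, 0.8, 0.1, 0.0], [0.8, 0.9, 0.0, 0.1], [1.0, 0.7, 0.1, 0.1],
    [0.1, 0.0, 0.9, 0.8], [0.0, 0.1, 0.8, 0.9], [0.1, 0.1, 1.0, 0.7],
]
som = train_sofm(docs)
print(quantization_error(docs, som))
```

A search procedure such as GA or GWO could, under these assumptions, treat `rows`, `cols`, `lr0`, and `sigma0` as the candidate solution and use `quantization_error` as one possible fitness measure.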
Cite this article
Thilagavathy, R., Sabitha, R. Using cloud effectively in concept based text mining using grey wolf self organizing feature map. Cluster Comput 22 (Suppl 5), 10697–10707 (2019). https://doi.org/10.1007/s10586-017-1159-y
DOI: https://doi.org/10.1007/s10586-017-1159-y