Abstract
According to statistics, over the past year the quality of education has fallen because of the pandemic, and the share of plagiarism in students’ work has increased. Modern plagiarism detection systems handle external plagiarism well: they weed out works and answers that completely copy someone else’s published ideas. This paper implements a two-step plagiarism detection algorithm that uses natural language processing methods not only to detect plagiarism but also to correctly classify students’ responses by the amount of plagiarism they contain. In the experiment, the text was converted into vector form with the GloVe method and then segmented with K-means; the final result was obtained with the unsupervised FP-Growth algorithm.
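The pipeline named in the abstract (word vectors, then K-means, then FP-Growth) can be illustrated with a minimal sketch. The abstract does not say how the three stages are connected, so everything specific here is an assumption: the hand-made 2-D vectors in TOY_VECTORS stand in for pretrained GloVe embeddings, each answer is turned into a transaction made of the cluster ids of its words, and the helper names kmeans, Node, and fp_growth are hypothetical. It is not the authors' implementation.

```python
from collections import defaultdict

# Toy 2-D vectors standing in for pretrained GloVe embeddings (assumption).
TOY_VECTORS = {
    "cat": (0.9, 0.1), "code": (0.1, 0.9), "bread": (5.0, 5.0),
    "dog": (1.0, 0.2), "python": (0.2, 1.0), "cheese": (5.1, 4.9),
}

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

class Node:
    """One FP-tree node: an item, its count, and links to parent/children."""
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}

def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) pairs; transactions are (items, count) pairs."""
    counts = defaultdict(int)
    for items, n in transactions:
        for item in set(items):
            counts[item] += n
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    # Build the FP-tree, inserting items in descending-support order.
    root, nodes_by_item = Node(None, None), defaultdict(list)
    for items, n in transactions:
        node = root
        kept = (i for i in set(items) if i in frequent)
        for item in sorted(kept, key=lambda i: (-frequent[i], i)):
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item, node)
                nodes_by_item[item].append(child)
            child.count += n
            node = child

    # Mine: for each item, recurse on its conditional pattern base.
    for item, support in frequent.items():
        itemset = suffix | {item}
        yield itemset, support
        conditional = []
        for node in nodes_by_item[item]:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_support, itemset)

# Stage 1 + 2: cluster the word vectors and map each word to a cluster id.
words = list(TOY_VECTORS)
word_cluster = dict(zip(words, kmeans([TOY_VECTORS[w] for w in words], k=3)))

# Stage 3: one transaction of cluster ids per answer, mined with FP-Growth.
answers = ["cat dog code", "dog python", "cat bread cheese", "code python"]
transactions = [
    ({word_cluster[t] for t in a.split() if t in word_cluster}, 1) for a in answers
]
patterns = dict(fp_growth(transactions, min_support=2))
print(patterns)
```

In this toy run, the frequent itemsets are sets of cluster ids shared by at least two answers. How such patterns would be mapped to a plagiarism level is not stated in the abstract and is left out of the sketch.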
Acknowledgment
This research was conducted within the framework of grant no. AP09058174, “Development of language-independent unsupervised methods of semantic analysis of large amounts of text data”.
The work was done with partial support from the Mexican Government through grant A1-S-47854 of CONACYT, Mexico, and grants 20211784, 20211884, and 20211178 of the Secretaría de Investigación y Posgrado of the Instituto Politécnico Nacional, Mexico. The authors thank CONACYT for the computing resources provided through the Plataforma de Aprendizaje Profundo para Tecnologías del Lenguaje of the Laboratorio de Supercómputo of INAOE, Mexico.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nurlybayeva, S., Akhmetov, I., Gelbukh, A., Mussabayev, R. (2021). Plagiarism Detection in Students’ Answers Using FP-Growth Algorithm. In: Batyrshin, I., Gelbukh, A., Sidorov, G. (eds) Advances in Soft Computing. MICAI 2021. Lecture Notes in Computer Science, vol 13068. Springer, Cham. https://doi.org/10.1007/978-3-030-89820-5_12
DOI: https://doi.org/10.1007/978-3-030-89820-5_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89819-9
Online ISBN: 978-3-030-89820-5
eBook Packages: Computer Science, Computer Science (R0)