Plagiarism Detection in Students’ Answers Using FP-Growth Algorithm

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 13068))

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

According to statistics, over the past year the quality of education has fallen due to the pandemic, and the percentage of plagiarism in students’ work has increased. Modern plagiarism detection systems handle external plagiarism well: they can weed out works and answers that completely copy someone else’s published ideas. Using natural language processing methods, the proposed algorithm not only detects plagiarism but also correctly classifies students’ responses by the amount of plagiarism. This paper implements a two-step plagiarism detection algorithm. In the experiment, the text was converted into vector form with the GloVe method and then segmented with K-means, and the result was obtained with the FP-Growth unsupervised learning algorithm.
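
The abstract names FP-Growth as the final stage but does not say how it is connected to the clustering step. The sketch below is only an illustration under a loudly stated assumption: each student answer has already been reduced to a set of cluster labels (the hypothetical ids "c1" to "c5", standing in for K-means clusters of GloVe vectors), and FP-Growth then finds label combinations shared by several answers. The function names, the toy data, and the `min_support` value are all assumptions made for this example, not the authors' implementation.

```python
from collections import defaultdict


class Node:
    """One node of an FP-tree: an item, a count, and links to parent/children."""
    def __init__(self, item, parent):
        self.item, self.parent, self.count, self.children = item, parent, 0, {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> list of tree nodes) and item supports.
    """
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}

    root, header = Node(None, None), defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header, freq


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, freq = build_tree(transactions, min_support)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, freq[item]
        # Conditional pattern base: the prefix path above each occurrence of item.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_support, itemset)


# Toy data (assumed): each answer is the set of cluster ids its words fell into.
answers = {
    "s1": ["c1", "c2", "c3"],
    "s2": ["c1", "c2", "c3"],
    "s3": ["c1", "c4"],
    "s4": ["c2", "c3", "c5"],
}
transactions = [(labels, 1) for labels in answers.values()]
patterns = dict(fp_growth(transactions, min_support=2))

for itemset, support in sorted(patterns.items(), key=lambda p: (-p[1], sorted(p[0]))):
    print(sorted(itemset), support)
```

In this toy setting, a large label set that recurs across answers (here the three-label set shared by s1 and s2) would be a candidate signal of copied content; how such patterns are scored or thresholded is not stated in the abstract.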

Acknowledgment

This research was conducted within the framework of grant no. AP09058174, “Development of language-independent unsupervised methods of semantic analysis of large amounts of text data”.

The work was done with partial support from the Mexican Government through grant A1-S-47854 of CONACYT, Mexico, and grants 20211784, 20211884, and 20211178 of the Secretaría de Investigación y Posgrado of the Instituto Politécnico Nacional, Mexico. The authors thank CONACYT for the computing resources provided to them through the Plataforma de Aprendizaje Profundo para Tecnologías del Lenguaje of the Laboratorio de Supercómputo of the INAOE, Mexico.

Author information

Authors and Affiliations

Faculty of Information Technology, Kazakh-British Technical University, Almaty, Kazakhstan
Sabina Nurlybayeva & Iskander Akhmetov
Instituto Politecnico Nacional, CIC, Mexico City, Mexico
Alexander Gelbukh
Institute of Information and Computational Technologies, Pushkin Street 125, Almaty, Kazakhstan
Iskander Akhmetov & Rustam Mussabayev

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Centro de Investigación en Computación, Mexico City, Mexico
Ildar Batyrshin
Instituto Politécnico Nacional, Centro de Investigación en Computación, Mexico City, Mexico
Alexander Gelbukh
Instituto Politécnico Nacional, Centro de Investigación en Computación, Mexico City, Mexico
Grigori Sidorov

Cite this paper

Nurlybayeva, S., Akhmetov, I., Gelbukh, A., Mussabayev, R. (2021). Plagiarism Detection in Students’ Answers Using FP-Growth Algorithm. In: Batyrshin, I., Gelbukh, A., Sidorov, G. (eds) Advances in Soft Computing. MICAI 2021. Lecture Notes in Computer Science(), vol 13068. Springer, Cham. https://doi.org/10.1007/978-3-030-89820-5_12

DOI: https://doi.org/10.1007/978-3-030-89820-5_12
Published: 21 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89819-9
Online ISBN: 978-3-030-89820-5
eBook Packages: Computer Science, Computer Science (R0)
