EXPLORING AUTHOR PROFILING FOR PLAGIARISM DETECTION: LEVERAGING PERSONALITY TRAITS AND ENSEMBLE METHODS
Abstract
Plagiarism poses a significant challenge in academic circles, as individuals frequently pass off internet content as their own without proper attribution. Traditional detection methods rely on established databases and falter when the source material is absent from them. Author profiling is a useful complement: it analyzes collective language patterns to infer traits such as gender, age, native language, and personality. This paper leverages personality traits for both plagiarism detection and author profiling, using machine learning and ensemble methods in particular. A dataset of 67 technical research papers, annotated with OCEAN personality traits and plagiarism percentages, underwent preprocessing, including outlier detection and normalization. Extreme Gradient Boosting (XGBoost), Bagging, Gradient Boosting, and AdaBoost regressors were applied as base models, with a Random Forest Regressor serving as the meta-model. The resulting RMSE values were 0.29 for stacking, 0.93 for averaging, and 2.39 for max voting. Comparison with non-ensemble methods underscores the effectiveness of ensemble learning, with the Random Forest Regressor achieving an RMSE of 0.29 after training. The novelty lies in integrating plagiarism detection with personality-based author profiling, which provides a more comprehensive approach to tackling academic misconduct and opens new avenues for improving detection accuracy, while the ensemble methods make the approach more robust. The findings offer insights for refining detection tools, supporting academic integrity and scholarly conduct, and informing decision-making in other domains.
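The averaging strategy and the RMSE metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rmse` and `average_ensemble` and all numbers below are hypothetical, chosen only to show how per-model predictions might be combined and scored.

```python
import math
from statistics import mean


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def average_ensemble(predictions):
    """Combine per-model prediction lists by averaging them sample by sample."""
    return [mean(column) for column in zip(*predictions)]


# Hypothetical plagiarism percentages for three papers (not data from the study).
y_true = [10.0, 20.0, 30.0]

# Hypothetical predictions from two base regressors.
model_a = [12.0, 18.0, 33.0]
model_b = [10.0, 24.0, 29.0]

averaged = average_ensemble([model_a, model_b])

print("RMSE model A: ", round(rmse(y_true, model_a), 2))
print("RMSE model B: ", round(rmse(y_true, model_b), 2))
print("RMSE averaged:", round(rmse(y_true, averaged), 2))
```

In this toy case the averaged prediction scores a lower RMSE than either base model because their errors partly cancel. Stacking differs in that a meta-model is trained on the base models' predictions instead of averaging them with fixed, equal weights.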

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.