Analysis of Various Measures of Text Similarity for Comparing Topics of Computer Science Syllabuses

© 2024 by IJETT Journal
Volume-72 Issue-7
Year of Publication : 2024
Author : Ritu Sodhi, Jitendra Choudhary, Ritu Jain, Ruby Bhatt, Ritesh Joshi, Anil Patidar
DOI : 10.14445/22315381/IJETT-V72I7P128

Ritu Sodhi, Jitendra Choudhary, Ritu Jain, Ruby Bhatt, Ritesh Joshi, Anil Patidar, "Analysis of Various Measures of Text Similarity for Comparing Topics of Computer Science Syllabuses," International Journal of Engineering Trends and Technology, vol. 72, no. 7, pp. 260-265, 2024.

Text similarity measures quantify how similar two texts are. Text comparison is needed for document comparison, text classification, text summarization, information retrieval, question answering, document clustering, and similar tasks. Computer science terms also need to be compared, for example in plagiarism checks and when comparing website content, syllabuses of the same subject, notes, and books. This research focuses on text similarity measures for comparing text related to computer science terms. Several lexical and semantic similarity measures were executed to compare topics from programming syllabuses using Python. Of the approaches executed, spaCy with a large English model combined with cos_similarity gave the better results. In the future, this research can be improved by including more similarity measures and by increasing the size of the dataset used to compare computer science terms.
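As a rough illustration of the lexical side of the comparison described above, the following minimal sketch computes two common measures, Jaccard overlap and cosine similarity over word counts, for a pair of syllabus topics. It is not the authors' implementation: the function names, the tokenizer, and the two sample topics are assumptions made for illustration, and only the Python standard library is used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard_similarity(a, b):
    """Lexical overlap: shared distinct words divided by all distinct words."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    """Cosine of the angle between the two word-count vectors."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical syllabus topics, invented for illustration only.
topic_a = "Introduction to loops and conditional statements"
topic_b = "Conditional statements and loops in Python"

print(jaccard_similarity(topic_a, topic_b))
print(cosine_similarity(topic_a, topic_b))
```

A semantic variant along the lines reported in the abstract would replace the word-count vectors with vectors from a spaCy model and apply the same cosine formula to them. That step is omitted here because it requires a separately downloaded model.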

Computer science, Python, Spacy, Syllabus, Text similarity.

