MNet-Sim: A Multi-layered Semantic Similarity Network to Evaluate Sentence Similarity

© 2021 by IJETT Journal
Volume-69 Issue-7
Year of Publication : 2021
Authors : Manuela Nayantara Jeyaraj, Dharshana Kasthurirathna
DOI :  10.14445/22315381/IJETT-V69I7P225

How to Cite?

Manuela Nayantara Jeyaraj, Dharshana Kasthurirathna, "MNet-Sim: A Multi-layered Semantic Similarity Network to Evaluate Sentence Similarity," International Journal of Engineering Trends and Technology, vol. 69, no. 7, pp. 181-189, 2021.

Similarity is a comparative, subjective measure that varies with the domain in which it is considered. In several NLP applications, such as document classification, pattern recognition, chatbot question answering, and sentiment analysis, identifying an accurate similarity score for sentence pairs has become a crucial area of research. Existing models that assess similarity have three drawbacks: they are limited in how effectively they compute similarity from contextual comparisons, they suffer from localization due to centering theory, and they lack non-semantic textual comparisons. Hence, this paper presents a multi-layered semantic similarity network model, built upon multiple similarity measures, that renders an overall sentence similarity score based on the principles of Network Science, neighboring weighted relational edges, and a proposed extended node similarity computation formula. The proposed multi-layered network model was evaluated against established state-of-the-art models and is shown to achieve better performance scores in assessing sentence similarity.

Keywords: Multi-layer network, Network science, Semantic similarity.
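
The general idea in the abstract, several similarity measures each contributing to one overall sentence score, can be illustrated with a rough sketch. This is not the paper's method: the extended node similarity formula and the network construction are not given on this page, so the sketch swaps in a plain weighted average over two simple, non-semantic measures. The function names (`tokens`, `jaccard`, `cosine_tf`, `combined_similarity`), the whitespace tokenizer, and the weights are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokens(sentence):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return sentence.lower().split()


def jaccard(a, b):
    """Layer 1 (hypothetical): overlap of the two sentences' token sets."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cosine_tf(a, b):
    """Layer 2 (hypothetical): cosine similarity of term-frequency vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def combined_similarity(a, b, layers, weights):
    """Weighted average of per-layer scores.

    A stand-in for the paper's network-based aggregation, which is not
    specified in the abstract.
    """
    total = sum(weights)
    return sum(w * layer(a, b) for layer, w in zip(layers, weights)) / total


if __name__ == "__main__":
    s1 = "the cat sat on the mat"
    s2 = "a cat sat on a mat"
    print(combined_similarity(s1, s2, [jaccard, cosine_tf], [1.0, 1.0]))
```

Each measure returns a score in [0, 1], so the weighted average stays in that range as well. Semantic layers (for example, embedding-based scores) could be added to the `layers` list in the same way.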
