Raga Net: A Novel Deep Learning Framework for Indian Raga Recognition based on Deep Convolution Neural Network and Long Short-term Memory

© 2025 by IJETT Journal
Volume 73, Issue 2
Year of Publication: 2025
Authors: Swati Pravin Aswale, Wani Patil, Shrikant Chavate
DOI: 10.14445/22315381/IJETT-V73I2P113

How to Cite?
Swati Pravin Aswale, Wani Patil, Shrikant Chavate, "Raga Net: A Novel Deep Learning Framework for Indian Raga Recognition based on Deep Convolution Neural Network and Long Short-term Memory," International Journal of Engineering Trends and Technology, vol. 73, no. 2, pp. 155-165, 2025. Crossref, https://doi.org/10.14445/22315381/IJETT-V73I2P113

Abstract
Indian Raga classification is challenging because of the vast variation in swaras, pitch, melody, style, and intonation across performances. This paper presents Indian Raga classification using a Deep Convolutional Neural Network and multiple acoustic features. The multiple acoustic features comprise spectral, temporal, and voice-quality attributes of the musical voice that characterize the uniqueness of different Ragas. A novel hybrid Archimedes Optimization Algorithm based on Multi-Attribute Utility Theory (AoA-MAUT) is used to select salient and distinctive features from the multiple acoustic features. Further, the proposed RagaNet combines a Parallel Deep Convolutional Neural Network (PDCNN), which captures the spectral properties of the musical features, with Long Short-Term Memory (LSTM), which models their temporal and long-term dependencies. The proposed AoA-MAUT-RagaNet-based Indian classical Raga recognition achieves an overall accuracy of 91.71%, a precision of 0.93, a recall of 0.90, and an F1-score of 0.91, outperforming traditional state-of-the-art methods.
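As a concrete reading of this pipeline, the minimal Python sketch below extracts frame-level acoustic features and feeds them to a parallel multi-kernel CNN whose output is reshaped into a sequence for an LSTM. The paper's exact feature set, layer configuration, and hyperparameters are not given on this page, so every choice here (the librosa feature list, N_FRAMES, N_FEATURES, N_RAGAS, kernel sizes, unit counts) is an assumption, and the AoA-MAUT feature-selection step is omitted.

```python
# Illustrative sketch only: not the authors' RagaNet. All dimensions,
# kernel sizes, and feature choices are assumptions.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, Model

N_FRAMES, N_FEATURES, N_RAGAS = 300, 17, 10  # hypothetical shapes

def extract_features(path, sr=22050):
    """Frame-level spectral, temporal, and pitch features (assumed set)."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)        # spectral envelope
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # brightness
    rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)    # spectral shape
    zcr = librosa.feature.zero_crossing_rate(y)               # temporal noisiness
    f0 = librosa.yin(y, fmin=65, fmax=1000)[None, :]          # pitch contour
    n = min(m.shape[1] for m in (mfcc, centroid, rolloff, zcr, f0))
    feats = np.vstack([m[:, :n] for m in (mfcc, centroid, rolloff, zcr, f0)])
    return feats.T.astype("float32")                          # (frames, 17)

# Parallel convolutional branches with different kernel sizes -- one common
# way to realize a "parallel deep CNN" over a time-feature map -- followed
# by an LSTM that models long-term temporal structure.
inp = layers.Input(shape=(N_FRAMES, N_FEATURES, 1))
branches = []
for k in (3, 5, 7):
    x = layers.Conv2D(32, (k, k), padding="same", activation="relu")(inp)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (k, k), padding="same", activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    branches.append(x)
merged = layers.Concatenate()(branches)              # (75, 4, 192)

seq = layers.Reshape((merged.shape[1], -1))(merged)  # time steps for the LSTM
seq = layers.LSTM(128)(seq)
out = layers.Dense(N_RAGAS, activation="softmax")(seq)

model = Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

In practice, each recording would be padded or segmented to a fixed N_FRAMES frames and the features standardized before training; AoA-MAUT-style feature selection would sit between extract_features and the network input.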

Keywords
Raga identification, Archimedes Optimization Algorithm, Multi-Attribute Utility Theory, Deep Convolutional Neural Network, Multiple acoustic features.
