A Unified Vision Transformer and Wavelet-Based Framework for Multi-Disease Brain MRI Classification and Patient Survival Prediction

Anuj Gupta; Anita; Manish Gupta

doi:https://doi.org/10.14445/22315381/IJETT-V74I1P122

Research Article | Open Access

Volume 74 | Issue 1 | Year 2026 | Article ID: IJETT-V74I1P122 | DOI: https://doi.org/10.14445/22315381/IJETT-V74I1P122

Received: 25 Nov 2025 | Revised: 02 Jan 2026 | Accepted: 06 Jan 2026 | Published: 14 Jan 2026

Citation

Anuj Gupta, Anita, Manish Gupta, "A Unified Vision Transformer and Wavelet-Based Framework for Multi-Disease Brain MRI Classification and Patient Survival Prediction," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 1, pp. 284-297, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I1P122

Abstract

Brain tumours and the neurodegenerative condition Alzheimer's Disease (AD) pose significant diagnostic challenges. This work proposes a unified deep learning framework that combines a Discrete Wavelet Transform (DWT) front-end, Vision Transformer (ViT) feature extraction, and kernel-based Extreme Learning Machine (KELM) classifiers to jointly perform multi-class tumour identification and patient survival prediction from MR imaging. The model is trained on two public datasets comprising more than 7,000 T1-weighted tumour images and 369 multi-modal glioma volumes. Wavelet decomposition augments the spatial input with multi-scale texture information, the ViT learns global context, and separate KELM heads yield the diagnosis and the prognosis. In extensive experiments, tumour classification accuracy reaches 98.02% and survival prediction accuracy reaches 94.67%, while Grad-CAM and attention rollout visualisations help identify clinically relevant regions. The central research question is whether a single unified deep learning architecture can perform effective brain tumour diagnosis and patient survival prediction simultaneously from MRI while remaining interpretable to a clinical end-user. Because it handles diagnosis and prognosis jointly in one model, the proposed framework addresses a gap in current neuroimaging research, where the two tasks are generally treated separately. The results show that the unified architecture achieves high classification accuracy, strong survival prediction, and clinically interpretable explanations, which indicates the value of a unified clinical decision-support system.
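
The abstract names the pipeline stages but not their implementation. As a rough illustration only, the Python sketch below shows two of those stages under stated assumptions: a single-level 2D Haar wavelet decomposition (the wavelet family used in the paper is not given here, so Haar is an assumption) and a kernel ELM head using the standard closed-form solution beta = (K + I/C)^-1 T with an RBF kernel. The names `haar_dwt2`, `rbf_kernel`, and `KELM`, and the values of `C` and `gamma`, are hypothetical and not taken from the paper. The ViT stage is omitted, so in the full framework the KELM inputs would be ViT embeddings rather than raw vectors.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D orthonormal Haar DWT of an array with even height and width.

    Returns the approximation band and three detail bands, each half-size.
    """
    a, b = img[0::2, 0::2], img[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]  # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0
    detail_rows = (a + b - c - d) / 2.0  # difference between the two rows
    detail_cols = (a - b + c - d) / 2.0  # difference between the two columns
    detail_diag = (a - b - c + d) / 2.0
    return approx, detail_rows, detail_cols, detail_diag


def rbf_kernel(X, Y, gamma):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Kernel ELM head: beta = (K + I/C)^-1 T, prediction = argmax of k(x, X_train) @ beta."""

    def __init__(self, C=10.0, gamma=0.5):  # illustrative values, not from the paper
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes = np.unique(y)
        T = (y[:, None] == self.classes[None, :]).astype(float)  # one-hot targets
        K = rbf_kernel(self.X, self.X, self.gamma)
        # Regularised linear system solved in closed form; no iterative training.
        self.beta = np.linalg.solve(K + np.eye(len(self.X)) / self.C, T)
        return self

    def predict(self, X):
        scores = rbf_kernel(np.asarray(X, dtype=float), self.X, self.gamma) @ self.beta
        return self.classes[np.argmax(scores, axis=1)]
```

Because the KELM weights come from a single linear solve, two separate heads (one for tumour class, one for survival group) could in principle be fitted on the same feature matrix by calling `fit` twice with different label vectors. This reflects the "separate KELM heads" wording of the abstract, not a confirmed detail of the authors' code.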

Keywords

Brain MRI, Vision Transformers, Discrete Wavelet Transform, Extreme Learning Machines, Survival Analysis.
