Research Article | Open Access
Volume 74 | Issue 1 | Year 2026 | Article Id. IJETT-V74I1P121 | DOI: https://doi.org/10.14445/22315381/IJETT-V74I1P121
Triple-Slope Linear Unit: Balancing Gradient Preservation and Activation Scaling in Deep Neural Networks
Rajaa Miftah, Mostafa Hanoune, Mohssine Bentaib
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 01 Sep 2025 | 29 Dec 2025 | 06 Jan 2026 | 14 Jan 2026 |
Citation:
Rajaa Miftah, Mostafa Hanoune, Mohssine Bentaib, "Triple-Slope Linear Unit: Balancing Gradient Preservation and Activation Scaling in Deep Neural Networks," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 1, pp. 275-283, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I1P121
Abstract
The Rectified Linear Unit (ReLU) and its variants have become the default activation functions in deep learning systems because of their low computational cost and strong empirical results. However, they have significant limitations, such as the "dying neuron" problem, uncontrolled activation growth, and reduced gradient flow in extreme regions. This paper proposes a new activation function, the Triple-Slope Linear Unit (TSLU), a simple yet effective piecewise-linear activation that addresses these problems. TSLU has three linear regions with adjustable slopes: a slight positive slope for negative inputs to keep gradients flowing, a unit slope in the central region to perform identity mapping, and a reduced slope for large positive inputs to limit activation magnitude. The function is continuous, parameter-efficient, and requires no complex mathematical operations, making it suitable for low-latency and resource-constrained applications. We provide a theoretical analysis showing that TSLU preserves non-vanishing gradients over the entire input range without causing activation explosion. Experimental results on benchmark image classification and natural language processing tasks demonstrate that TSLU achieves comparable or superior performance to ReLU, Leaky ReLU, and Parametric ReLU, with improved training stability and generalization. These findings highlight TSLU as a lightweight, interpretable, and deployable alternative for modern deep neural networks.
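The abstract describes TSLU only in words. The following is a minimal NumPy sketch of a three-slope piecewise-linear function matching that description; the parameter names `alpha`, `beta`, and `theta` and their default values are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def tslu(x, alpha=0.05, beta=0.3, theta=3.0):
    """Sketch of a Triple-Slope Linear Unit (parameterization assumed, not the authors').

    Three linear regions, kept continuous at the breakpoints 0 and theta:
      x < 0           -> alpha * x                 (small positive slope: gradient keeps flowing)
      0 <= x <= theta -> x                         (identity / unit slope)
      x > theta       -> theta + beta * (x - theta) (reduced slope: caps activation growth)
    """
    return np.where(
        x < 0,
        alpha * x,
        np.where(x <= theta, x, theta + beta * (x - theta)),
    )

def tslu_grad(x, alpha=0.05, beta=0.3, theta=3.0):
    """Piecewise-constant derivative; it is never zero, so no region 'dies'."""
    return np.where(x < 0, alpha, np.where(x <= theta, 1.0, beta))

if __name__ == "__main__":
    xs = np.array([-4.0, -0.5, 0.0, 1.5, 3.0, 10.0])
    print(tslu(xs))       # [-0.2  -0.025  0.  1.5  3.  5.1]
    print(tslu_grad(xs))  # [0.05  0.05  1.  1.  1.  0.3]
```

Because each breakpoint is matched exactly (alpha*0 = 0 and theta = theta + beta*0), the function stays continuous, while the gradient never vanishes and the growth rate beyond `theta` is bounded by `beta`.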
Keywords
Triple-Slope Linear Unit (TSLU), Neural Network Activation Functions, Gradient Flow Preservation, Activation Magnitude Control, Dead Neuron Mitigation, Deep Learning Optimization, Training Stability.