Composition of Feature Selection Methods And Oversampling Techniques For Banking Fraud Detection With Artificial Intelligence
How to Cite?
Bouzgarne Itri, Youssfi Mohamed, Bouattane Omar, Qbadou Mohamed, "Composition of Feature Selection Methods And Oversampling Techniques For Banking Fraud Detection With Artificial Intelligence," International Journal of Engineering Trends and Technology, vol. 69, no. 11, pp. 216-226, 2021. Crossref, https://doi.org/10.14445/22315381/IJETT-V69I11P228
Abstract
The digital age is accompanied by a proliferation of crimes and attacks against institutions that handle banking data, such as card fraud and electronic payment fraud. The traditional rule-based and signature-based protection systems used by banks are proving increasingly ineffective in the face of constantly evolving attack techniques. Artificial intelligence and machine learning are becoming the dominant problem-solving techniques for filling these gaps. This article therefore proposes a new approach for optimising the performance of prediction models for credit card fraud detection. Although fraud prediction algorithms have been developed to deal with the problem, they still encounter common difficulties caused by imbalanced datasets. Hence, this study proposes a new composition-based algorithm, an approach that combines oversampling and feature selection methods and searches for the best combination across several supervised classification algorithms. This work aims to maximise the performance of the fraudulent transaction detection model in the presence of an imbalanced dataset, while illustrating the impact of oversampling methods on the relevance of features. The proposed approach obtains the best performance in comparison with the previous results in the same scope.
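As an illustration only, the idea of composing oversampling, feature selection, and classification can be sketched as an exhaustive search over their combinations. This is not the authors' algorithm: the abstract does not name the specific methods, so the sketch assumes plain random oversampling, a simple mean-gap feature filter, two toy classifiers, the F1 score as the metric, and synthetic data. All function names and parameters below are hypothetical.

```python
import random
from itertools import product
from statistics import mean

def make_data(n_major, n_minor, seed):
    """Toy data (assumption): features 0-1 carry signal, features 2-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((0, n_major), (1, n_minor)):
        for _ in range(n):
            shift = 2.0 * label
            X.append([rng.gauss(shift, 1), rng.gauss(shift, 1),
                      rng.gauss(0, 1), rng.gauss(0, 1)])
            y.append(label)
    return X, y

def no_oversample(X, y):
    return X, y

def random_oversample(X, y, seed=0):
    """Duplicate minority rows (label 1) until both classes have equal counts."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == 1]
    n_extra = y.count(0) - len(minority)
    idx = list(range(len(y))) + [rng.choice(minority) for _ in range(n_extra)]
    return [X[i] for i in idx], [y[i] for i in idx]

def top_k_by_mean_gap(X, y, k):
    """Filter-style selection: keep the k features whose class means differ most."""
    def gap(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(mean(pos) - mean(neg))
    return sorted(sorted(range(len(X[0])), key=gap, reverse=True)[:k])

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nearest_centroid(X, y):
    """Fit class centroids and return a predict function."""
    cents = {c: [mean(col) for col in zip(*[r for r, l in zip(X, y) if l == c])]
             for c in (0, 1)}
    return lambda row: min(cents, key=lambda c: sq_dist(row, cents[c]))

def knn3(X, y):
    """3-nearest-neighbour majority vote; returns a predict function."""
    def predict(row):
        nearest = sorted(zip(X, y), key=lambda p: sq_dist(row, p[0]))[:3]
        return int(sum(label for _, label in nearest) >= 2)
    return predict

def f1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

OVERSAMPLERS = {"none": no_oversample, "random": random_oversample}
CLASSIFIERS = {"centroid": nearest_centroid, "knn3": knn3}

def search(X_tr, y_tr, X_te, y_te, ks=(1, 2, 4)):
    """Score every (oversampler, number of features, classifier) combination."""
    results = []
    for (o_name, over), k, (c_name, fit) in product(
            OVERSAMPLERS.items(), ks, CLASSIFIERS.items()):
        X_o, y_o = over(X_tr, y_tr)              # oversample the training set only
        cols = top_k_by_mean_gap(X_o, y_o, k)    # select features on oversampled data
        project = lambda rows: [[r[j] for j in cols] for r in rows]
        model = fit(project(X_o), y_o)
        preds = [model(r) for r in project(X_te)]
        results.append((f1(y_te, preds), o_name, k, c_name, cols))
    return max(results), results

X_train, y_train = make_data(200, 10, seed=1)
X_test, y_test = make_data(100, 10, seed=2)
best, all_results = search(X_train, y_train, X_test, y_test)
print("best (F1, oversampler, k, classifier, features):", best)
```

Because feature selection runs after oversampling in this sketch, the selected feature set can differ between oversamplers, which is one way to observe the effect of oversampling on feature relevance. The test set is left untouched so that scores reflect the original class imbalance.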
Keywords
Machine learning, Oversampling, Feature selection, Imbalanced dataset, Credit card, Fraud.