Quantile Regression and Machine Learning based hybrid approach for Outlier Detection in Multivariate Time Series data

© 2022 by IJETT Journal  
Volume 70 Issue 6

Year of Publication : 2022  
Authors : Dharmendra Patel, Pranav Vyas, Arpit Trivedi, Tushar Mehta, Kanubhai K Patel, Sanskruti Patel, Hardik Rajgor 

DOI : 10.14445/22315381/IJETTV70I6P221 
How to Cite?
Dharmendra Patel, Pranav Vyas, Arpit Trivedi, Tushar Mehta, Kanubhai K Patel, Sanskruti Patel, Hardik Rajgor, "Quantile Regression and Machine Learning based hybrid approach for Outlier Detection in Multivariate Time Series data," International Journal of Engineering Trends and Technology, vol. 70, no. 6, pp. 185-194, 2022. Crossref, https://doi.org/10.14445/22315381/IJETTV70I6P221
Abstract
Outliers in multivariate time series data can be detected with either univariate or multivariate techniques. Univariate approaches are difficult to use because they require prior adjustments. Multivariate approaches, by contrast, require no prior adjustments and find outliers directly in the original data. Most multivariate approaches rely on dimensionality reduction to find outliers or anomalies, and the most common technique is Principal Component Analysis (PCA), which is widely used in the literature. However, PCA has several disadvantages: it is less interpretable, it requires feature scaling before use, and it loses information. This research proposes a new algorithm that uses a hybrid approach of Quantile Regression and Machine Learning to find outliers in multivariate time series data. The algorithm is compared with two well-known techniques, PCA and Ordinary Least Squares Regression (OLSR). The experimental results show that the proposed algorithm is simple and effective and retains all information while detecting outliers.
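To make the quantile-regression idea concrete, the following is a minimal sketch and not the algorithm proposed in the paper, whose details are not given in this abstract. It assumes a single predictor, fits the 0.25 and 0.75 conditional quantile lines by exhaustive search, and flags points that fall outside Tukey-style fences around those lines. The function names, the 1.5 fence multiplier, and the toy data are all illustrative assumptions, and the machine learning stage of the hybrid approach is not covered.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Quantile (pinball) loss of one residual at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def fit_quantile_line(x, y, tau):
    """Fit y ~ a + b*x at quantile tau; returns (a, b).

    A two-parameter quantile-regression problem always has an optimal
    line through at least two data points, so trying every pair of
    points is exact, though only practical for small samples.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as a + b*x
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = sum(pinball_loss(yk - (a + b * xk), tau)
                   for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


def conditional_fence_outliers(x, y, k=1.5):
    """Return indices of points outside the conditional quantile fences.

    Assumption (not from the paper): fences are placed k times the
    conditional interquartile spread beyond the 0.25 and 0.75 lines.
    """
    a_lo, b_lo = fit_quantile_line(x, y, 0.25)
    a_hi, b_hi = fit_quantile_line(x, y, 0.75)
    flagged = []
    for idx, (xi, yi) in enumerate(zip(x, y)):
        lo = a_lo + b_lo * xi
        hi = a_hi + b_hi * xi
        spread = max(hi - lo, 0.0)  # guard against crossing quantile lines
        if yi < lo - k * spread or yi > hi + k * spread:
            flagged.append(idx)
    return flagged


# Toy data (hypothetical): a linear trend with small bounded noise
# and one injected spike at index 10.
noise = [0.0, 1.0, -1.0, 0.5, -0.5]
x = list(range(20))
y = [2 * xi + 1 + noise[xi % 5] for xi in x]
y[10] += 25.0

flagged = conditional_fence_outliers(x, y)
print(flagged)
```

On this toy data the injected spike at index 10 lies well above the upper fence and is flagged. Because the quantile lines are fitted with the pinball loss rather than squared error, a single extreme value pulls them far less than it would pull an ordinary least squares fit, which is the usual motivation for using quantile regression in outlier detection.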
Keywords
Outlier, Anomaly, Univariate, Multivariate, Quantile Regression, Principal Component Analysis (PCA), Ordinary Least Squares Regression (OLSR).