Research Article | Open Access
Volume 74 | Issue 6 | Year 2026 | Article Id. IJETT-V74I6P103 | DOI: https://doi.org/10.14445/22315381/IJETT-V74I6P103
Detecting System Anomalies without Labels using Workflow Patterns in Logs
Arun Kumar Bandlamudi, Sunitha Pachala
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 13 Aug 2025 | 02 Apr 2026 | 20 Apr 2026 | 27 Jun 2026 |
Citation :
Arun Kumar Bandlamudi, Sunitha Pachala, "Detecting System Anomalies without Labels using Workflow Patterns in Logs," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 6, pp. 33-46, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I6P103
Abstract
Large software systems generate large volumes of logs, which record what happens inside the system and help developers find and fix problems. Logs are usually semi-structured text, and reading them by hand is impractical in large systems. This paper proposes ADR (Anomaly Detection by workflow Relations), a method that mines numerical relations from logs. These relations capture how events in the system relate to one another, and logs that do not follow them indicate that something is wrong. The process starts by converting raw logs into event sequences. The events are then placed in a matrix that records the number of times each event occurs, and the numerical relations are extracted from this matrix. ADR has two versions. The first, sADR, is semi-supervised and needs a few labeled logs to learn. The second, uADR, is fully unsupervised and works without any labeled logs, which saves time and reduces labeling effort. Both versions were tested on four public datasets, where ADR found useful relations and detected many problems in the logs, with or without labels.
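The pipeline described in the abstract (event sequences, an event-count matrix, numerical relations, and a check against those relations) can be illustrated with a minimal sketch. It assumes the numerical relations are linear equalities among event counts, recovered as the null space of the count matrix; the abstract does not state this explicitly. The function names, event names, and tolerances are hypothetical and chosen only for illustration.

```python
import numpy as np

def count_matrix(sessions, events):
    """Rows are sessions, columns are event types, cells are occurrence counts."""
    index = {e: j for j, e in enumerate(events)}
    X = np.zeros((len(sessions), len(events)))
    for i, seq in enumerate(sessions):
        for e in seq:
            X[i, index[e]] += 1
    return X

def mine_relations(X, tol=1e-8):
    """Assumed relation form: vectors r with X @ r ~ 0 (null space of X)."""
    _, s, vt = np.linalg.svd(X, full_matrices=True)
    rank = int(np.sum(s > tol))
    return vt[rank:].T  # shape: (n_events, n_relations)

def violates(x, R, tol=1e-6):
    """True if the count vector x breaks at least one mined relation."""
    return bool(np.any(np.abs(x @ R) > tol))

# Hypothetical event types and sessions, not taken from the paper's datasets.
events = ["open", "read", "close"]
normal_sessions = [
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
    ["open", "close", "open", "read", "close"],
]

X = count_matrix(normal_sessions, events)
R = mine_relations(X)  # here: one relation, count(open) - count(close) = 0

ok_session = count_matrix([["open", "read", "read", "read", "close"]], events)[0]
bad_session = count_matrix([["open", "read"]], events)[0]  # missing "close"

print(violates(ok_session, R))
print(violates(bad_session, R))
```

In this toy data every normal session has as many "open" events as "close" events, so the only mined relation is that equality, and a session that never closes violates it. Mining relations from sessions assumed to be normal corresponds loosely to the setting where some labels are available; how uADR handles fully unlabeled logs is not described in the abstract and is not modeled here.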
Keywords
Logs, ADR, Anomalies, Detection.