CausalRCA: Causal inference based precise fine-grained root cause localization for microservice applications

Open Access
Authors
Publication date 09-2023
Journal Journal of Systems and Software
Article number 111724
Volume | Issue number 203
Number of pages 13
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Effectively localizing root causes of performance anomalies is crucial to enabling the rapid recovery and loss mitigation of microservice applications in the cloud. Depending on the granularity of the causes that can be localized, a service operator may take different actions, e.g., restarting or migrating services if only faulty services can be localized (namely, coarse-grained) or scaling resources if specific indicative metrics on the faulty service can be localized (namely, fine-grained). Prior research mainly focuses on coarse-grained faulty service localization, and there is now a growing interest in fine-grained root cause localization to identify both faulty services and metrics. Causal inference (CI) based methods have gained popularity recently for root cause localization, but currently used CI methods have limitations, such as the linear causal relations assumption and strict data distribution requirements. To tackle these challenges, we propose a framework named CausalRCA to implement fine-grained, automated, and real-time root cause localization. CausalRCA uses a gradient-based causal structure learning method to generate weighted causal graphs and a root cause inference method to localize root cause metrics. We conduct coarse- and fine-grained root cause localization to evaluate the localization performance of CausalRCA. Experimental results show that CausalRCA significantly outperforms baseline methods in localization accuracy, e.g., the average AC@3 of the fine-grained root cause metric localization in the faulty service is 0.719, and the average increase is 10% compared with baseline methods. In addition, the average Avg@5 has improved by 9.43%. Code and data are open-sourced and can be found in our GitHub repository CausalRCA.
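The abstract reports AC@3 and Avg@5 without defining them on this page. The sketch below illustrates how such scores are commonly computed in root cause localization work. It assumes that AC@k is the fraction of anomaly cases whose true root cause appears among the top k ranked candidates, that Avg@k is the mean of AC@1 through AC@k, and that each case has a single true root cause. The function names and the sample metric names are hypothetical and are not taken from the CausalRCA repository.

```python
from typing import List, Sequence


def ac_at_k(rankings: Sequence[Sequence[str]], truths: Sequence[str], k: int) -> float:
    """Fraction of cases whose true root cause is in the top-k ranked candidates.

    Assumption: one true root cause per anomaly case.
    """
    if len(rankings) != len(truths) or not truths:
        raise ValueError("rankings and truths must be non-empty and equally long")
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


def avg_at_k(rankings: Sequence[Sequence[str]], truths: Sequence[str], k: int) -> float:
    """Mean of AC@1 .. AC@k (assumed definition of Avg@k)."""
    return sum(ac_at_k(rankings, truths, j) for j in range(1, k + 1)) / k


# Hypothetical ranked metric lists for three anomaly cases.
example_rankings: List[List[str]] = [
    ["cpu", "mem", "net"],  # true cause ranked 1st
    ["mem", "cpu", "net"],  # true cause ranked 2nd
    ["net", "mem", "cpu"],  # true cause ranked 3rd
]
example_truths: List[str] = ["cpu", "cpu", "cpu"]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"AC@{k} = {ac_at_k(example_rankings, example_truths, k):.3f}")
    print(f"Avg@3 = {avg_at_k(example_rankings, example_truths, 3):.3f}")
```

Under these assumed definitions, an AC@3 of 0.719 would mean that the true root cause metric appears in the top three candidates in roughly 72% of the evaluated anomaly cases.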
Document type Article
Language English
Published at https://doi.org/10.1016/j.jss.2023.111724
Other links https://www.scopus.com/pages/publications/85159396126
Downloads
CausalRCA (Final published version)