Online root-cause performance analysis of parallel applications

Research output: Contribution to journal › Article › Research › peer-review

1 Citation (Scopus)

Abstract

© 2015 Elsevier B.V. All rights reserved.

Hardware continues to advance at an incredible rate. However, advances in parallel software have been hampered for many reasons, and developing an efficient parallel application is still not an easy task. Applications rarely achieve good performance immediately; therefore, careful performance analysis and optimization are crucial. These tasks are difficult and require a thorough understanding of the program's behavior. In this paper, we propose a systematic approach to online root-cause performance analysis. The automated analysis uses an online model to quickly identify the most important performance problems and correlates them with the application source code. Our technique is able to discover causal dependencies among the problems, infer their root causes, and explain them to developers. In all of the scenarios we evaluated, this online modelling and analysis approach allowed us to understand the behavior of the applications, evaluate their performance, and locate the causes of problems without specific knowledge of application internals.
Original language: English
Pages (from-to): 81-107
Journal: Parallel Computing
Volume: 48
Publication status: Published - 9 Jul 2015

Keywords

  • Online performance analysis
  • Online performance modelling
  • Parallel applications
  • Root-cause analysis
